Interactive deobfuscation


A thrift shop for static deobfuscation

whoami

• Security researcher
• Break stuff, reverse it, make it better and break it again
• Part of the nullsec non-profit group

How it all started

• Presumably a simple crackme
– Eventually discovered to be whitebox AES (wb AES)
• I wanted to solve it statically
– Since running things is cheating
– Goal was to solve it in less than a month
• A race I didn’t manage to win when working statically

blame this person =>

• Name is md5’ed
• Serial is transformed / permuted using an unknown function

Challenge archeology
• Overall the crackme was divided into 2 main parts
• Deobfuscation
– Opaque predicates, lookup tables, value tables and “spaghetti” code
• Cryptanalysis
– The original cipher was whitebox’ed

Deobfuscation

Deobfuscation – Layer0
• Found some jmps, decided to map them all
– find_lookuptables(“mov <register>, dword ptr [addr*4]”)
– Add xrefs, define locs
• IDA can’t map them all into graph views (due to size, more RAM == bigger graph)
• After looking a bit, there seems to be some logic and different operations inside them
• However, they all lead to the same path eventually
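A minimal sketch of the find_lookuptables idea, working on a plain-text listing instead of the IDA database. The listing format, the `[base+reg*4]` operand shape and the sample addresses are assumptions made for illustration; the original script is not shown in the slides.

```python
import re

# Matches Intel-syntax loads such as "mov eax, dword ptr [0x404000+ecx*4]".
# The exact operand layout is an assumption; adapt it to your disassembler.
TABLE_LOAD = re.compile(
    r"mov\s+(?P<reg>e[a-z]{2}),\s*dword ptr\s*"
    r"\[(?P<base>0x[0-9a-f]+)\s*\+\s*(?P<idx>e[a-z]{2})\*4\]",
    re.IGNORECASE,
)

def find_lookuptables(listing):
    """Return {table_base: [addresses of instructions that load from it]}."""
    tables = {}
    for addr, text in listing:
        match = TABLE_LOAD.search(text)
        if match:
            base = int(match.group("base"), 16)
            tables.setdefault(base, []).append(addr)
    return tables

# Hypothetical listing: (address, disassembly text) pairs.
listing = [
    (0x1000, "mov eax, dword ptr [0x404000+ecx*4]"),
    (0x1007, "jmp eax"),
    (0x2000, "add ebx, 4"),
    (0x2003, "mov edx, dword ptr [0x404000+esi*4]"),
    (0x200A, "mov edi, dword ptr [0x405400+eax*4]"),
]
tables = find_lookuptables(listing)
```

Each key in `tables` is a candidate table base, and the list behind it is the set of xrefs to add.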

Deobfuscation – Layer1
• Removal of jmps and basic block identification
– All the obfuscation was done in a manner that affects only the bb itself; after a jmp to another table occurred, everything was restored
• Follow_jmps_by_addr(addr) to find bb boundaries
– Follow jccs until a jmp / push + ret sequence is found
– Compress it, remove the jccs and make one BB
– In case of xrefs, patch them together
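The walk described above can be sketched on a toy instruction map. The `program` layout and the assumption that every jcc is obfuscation glue whose target is always taken are mine, not taken from the original script.

```python
def follow_jmps_by_addr(program, addr):
    """Walk from addr and collapse jcc-chained fragments into one block.

    program maps address -> (mnemonic, operand, fallthrough_address).
    Assumption: each jcc is obfuscation glue that is always taken, so
    following its target preserves behaviour.
    Returns (body, end_addr); end_addr is where the jmp or push+ret sits.
    """
    body, seen = [], set()
    while addr in program and addr not in seen:
        seen.add(addr)
        mnem, operand, nxt = program[addr]
        if mnem == "jcc":
            addr = operand          # drop the jcc, continue at its target
            continue
        if mnem == "jmp":
            return body, addr       # block ends at the table jmp
        if mnem == "push" and nxt in program and program[nxt][0] == "ret":
            return body, addr       # push + ret is a disguised jmp
        body.append((mnem, operand))
        addr = nxt
    return body, None

# Hypothetical fragment scattered over three locations.
program = {
    0x10: ("mov", "eax, 1", 0x15),
    0x15: ("jcc", 0x40, 0x17),
    0x17: ("nop", "", 0x18),        # junk that is never reached
    0x40: ("xor", "eax, ebx", 0x42),
    0x42: ("jcc", 0x80, 0x44),
    0x80: ("add", "eax, 4", 0x83),
    0x83: ("push", "0x404000", 0x88),
    0x88: ("ret", "", None),
}
body, end_addr = follow_jmps_by_addr(program, 0x10)
```

`body` is the compressed basic block; `end_addr` marks the boundary where the next table dispatch happens.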

Deobfuscation – Layer2

• Opaque predicates

• Ops which are used to make the bb bigger
– Simple rule: operations are per bb and do not exceed it
– Wrote a simple emulator to emulate each bb and optimize it into simple instructions
• 1 exception: do not touch lookup table values
– More on this later
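A rough sketch of the "emulate the bb and collapse it" idea, as constant folding over a toy instruction format. The tuple format and the supported ops are assumptions; flags and memory operands (including the lookup tables, which stay untouched) are not modelled.

```python
MASK = 0xFFFFFFFF
OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
}

def simplify_block(block):
    """Emulate (op, dst_reg, src) tuples and fold everything that is constant.

    src is an int immediate or a register name. Registers whose value is
    fully known are only emitted once, as a single mov at the end.
    """
    known, out = {}, []
    for op, dst, src in block:
        val = src if isinstance(src, int) else known.get(src)
        if op == "mov" and val is not None:
            known[dst] = val
        elif op in OPS and dst in known and val is not None:
            known[dst] = OPS[op](known[dst], val)
        else:
            if op != "mov" and dst in known:
                out.append(("mov", dst, known[dst]))  # materialise first
            known.pop(dst, None)                      # dst is now unknown
            out.append((op, dst, src if val is None else val))
    out.extend(("mov", reg, val) for reg, val in known.items())
    return out

# Hypothetical bloated block: the xor/add/sub/xor chain cancels out.
block = [
    ("mov", "eax", 0x1234),
    ("xor", "eax", 0xFFFF),
    ("add", "eax", 0x11),
    ("sub", "eax", 0x11),
    ("xor", "eax", 0xFFFF),
    ("add", "ebx", "eax"),
]
simplified = simplify_block(block)
```

Six instructions shrink to two: the add into ebx with an immediate, and one mov that leaves eax with its final value.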

Deobfuscation – Layer3
• Tables, and lots of them
– Apart from the jmptables which lead the way
• Tables are used as part of the cipher itself
• Key is dismantled inside them (more on this later)
• Each table has a different role and some are doubled for obfuscation
• FindTables to the rescue

Deobfuscation – Layer3

• FindTables basically taints memory and looks for reads of 16b tables

• Once it finds one, it defines an array of 0xFF at that addr

• All value tables are mapped this way; their usage, however, varies

Deobfuscation – Layer4
• Once we have all the code cleaned, we get several consecutive lookup tables
• Loops are unrolled and become normal repetitive ops (per round and state)
• All deobfuscated code was written into a new section called “deobf” to make code reading easier
• It is now time to move on to the cryptanalysis stage

Cryptanal

• Automating every process is infeasible and too time-consuming
• I decided to split the work into two main stages:
– Operation identification
– Key extraction
• Both are used interactively
– Thus the name interactive deobfuscation

Cryptanal archeology
• Discovered BGE attacks from academia
– Chow, Xiao
• sysk’s phrack article
• Eventually said FUCK YOU ALL, gonna do it myself w/o cryptic math
– Lack of algebra lessons and focus

Cryptanal – Layer0
• Actual wb code to encrypt a text
• Loops 9 times, which made me quite frustrated
– Before discovering it was wb’ed
– After counting the loops by hand I thought it might be AES
– But where’s the key?
• LOLWTF? md5(user) == wbaes.dec(serial, user_as_key)
– No, key must be *embedded*
• LOLWUT? md5(user) == wbaes.d/enc(serial, key) ??
– Output isn’t ascii so it could be both enc/dec

Cryptanal – Rijndael on a toe

• Several simple operations
– AddRoundKey, SubBytes, ShiftRows, MixColumns
• Some operations are linear and can swap places with the previous op
• The key to understanding the attack is to sniff the first round and extract the key
– Later I found out Eloi had made my life harder

rijndael evolves into => whitebox(rijndael)

whitebox(rijndael)

• 1st transformation:
– ShiftRows is linear, and thus can swap op position with AddRoundKey
– SubBytes and ShiftRows can swap op position, as SubBytes does the same op on every byte

• Let “linear” aka lin be
– lin(x) ^ lin(y) == lin(x ^ y)
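A quick check of the linearity property with ShiftRows as lin. It also shows why ShiftRows can trade places with AddRoundKey: shifting after the key XOR equals XORing the shifted state with the shifted key. The column-major state layout follows the AES spec; the random test values are arbitrary.

```python
import random

def shift_rows(state):
    """AES ShiftRows on a 16-byte column-major state (byte r + 4*c)."""
    return bytes(state[r + 4 * ((c + r) % 4)]
                 for c in range(4) for r in range(4))

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

rng = random.Random(0)
x = bytes(rng.randrange(256) for _ in range(16))   # state
k = bytes(rng.randrange(256) for _ in range(16))   # round key

# lin(x ^ k) == lin(x) ^ lin(k)
is_linear = shift_rows(xor_bytes(x, k)) == xor_bytes(shift_rows(x), shift_rows(k))
```

SubBytes does not satisfy this identity, which is why it has to be merged with the key into a table instead of being moved around freely.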

• 2nd transformation
– It is possible to transform and “compress” several ops into one
• By using XORtables and T/Tyi boxes
– T/Tyi box
• Combine AddRoundKey and SubBytes into one operation (lookup table) to emit 1 byte
• SubBytes(x ^ k[i])
• XORtable
– Transform MixColumns into a series of lookup tables; in particular, these tables are created by XORing one input byte at a time through the MixColumns vector
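A small sketch of the "MixColumns as lookup tables" step: one 256-entry table per input byte position, with the four lookups XORed together. The table name `TY` is mine, and the XOR is done directly here; in a real whitebox that XOR is itself hidden in tables.

```python
def gmul(a, b):
    """Multiply in GF(2^8) with the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

MC = [[2, 3, 1, 1],
      [1, 2, 3, 1],
      [1, 1, 2, 3],
      [3, 1, 1, 2]]

def mix_column(col):
    """Reference MixColumns on one 4-byte column."""
    return [gmul(MC[r][0], col[0]) ^ gmul(MC[r][1], col[1])
            ^ gmul(MC[r][2], col[2]) ^ gmul(MC[r][3], col[3])
            for r in range(4)]

# TY[i][x]: contribution of input byte i with value x to the output column.
TY = [[tuple(gmul(MC[r][i], x) for r in range(4)) for x in range(256)]
      for i in range(4)]

def mix_column_tables(col):
    """Same result, using only table lookups and XORs."""
    out = [0, 0, 0, 0]
    for i, x in enumerate(col):
        out = [o ^ t for o, t in zip(out, TY[i][x])]
    return out

column = [0xDB, 0x13, 0x53, 0x45]
via_tables = mix_column_tables(column)
```

Each table maps 8 bits to 32 bits, which matches the many consecutive table reads that show up in the cleaned code.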

• 3rd transformation
– Append external encoding to the keys and lookup tables
– Replace table values with random ones at each stage
– 41 => 32, 21 => 56, 12 => 4
– Let G & F be the encodings
– G() o AES() o F()
• Such that G & F cancel each other out eventually
– The external encoding is what makes the whitebox variant “attack resistant”
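A toy illustration of how random encodings hide table contents yet cancel out across stages. `T1` and `T2` are stand-in tables, not AES, and the names `F`, `M`, `G` for the input, intermediate and output encodings are assumptions made for this sketch.

```python
import random

def random_bijection(rng):
    """Return a random byte permutation and its inverse."""
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for i, v in enumerate(perm):
        inv[v] = i
    return perm, inv

rng = random.Random(1)
T1 = [(x * 7 + 3) & 0xFF for x in range(256)]   # stand-in stage 1
T2 = [x ^ 0x5A for x in range(256)]             # stand-in stage 2

F, F_inv = random_bijection(rng)   # input encoding
M, M_inv = random_bijection(rng)   # encoding between the two tables
G, G_inv = random_bijection(rng)   # output encoding

# Encoded tables: undo the incoming encoding, apply the plain table,
# then apply the outgoing encoding. Only E1 and E2 ship in the binary.
E1 = [M[T1[F_inv[x]]] for x in range(256)]
E2 = [G[T2[M_inv[x]]] for x in range(256)]

def plain(x):
    return T2[T1[x]]

def encoded(x):
    # M cancels between E1 and E2; F and G are handled by the caller.
    return G_inv[E2[E1[F[x]]]]

same = all(plain(x) == encoded(x) for x in range(256))
```

The shipped tables look random, but anyone who recovers F and G is back to the plain pipeline, which is the point of the attack that follows.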

Attaq 101

• Chow stated that his implementation doesn’t leak any information
– In reality the XORtables and T/Tyi tables still leak one nibble each time
– Not very helpful but still something
• Since the external encodings cancel each other out, it might be worth understanding them
– Hint hint

Attaq!
• If we look at the input encoding and output encoding, we know that they both cancel each other out
• Thus if we manage to find the values of the encoding, we’d only have a “naked” implementation of wbaes
• And then just sniff the first round key and extract the key
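Once a table is “naked”, pulling a key byte out of it is a one-liner. A minimal sketch, assuming the table is exactly SubBytes(x ^ k) with all encodings already stripped; the S-box is generated from its definition instead of being pasted in.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) with the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse as a^254 (0 maps to 0)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0

def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox_entry(x):
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

SBOX = [sbox_entry(x) for x in range(256)]
INV_SBOX = [0] * 256
for x, s in enumerate(SBOX):
    INV_SBOX[s] = x

def make_tbox(key_byte):
    """Naked T-box: SubBytes(x ^ k) for every input byte x."""
    return [SBOX[x ^ key_byte] for x in range(256)]

def recover_key_byte(tbox):
    """tbox[0] == SubBytes(k), so inverting the S-box yields k."""
    return INV_SBOX[tbox[0]]

recovered = recover_key_byte(make_tbox(0x2B))
```

In the real challenge the table is still wrapped in encodings and fused with MixColumns, so this only covers the last step after the encodings are known.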

Cryptbox

• Let’s try to look at MixColumns in the Tyi tables transformations

• In general, it transforms 32b to 32b values

• Let P be the input encoding and Q the output encoding

• Now let’s try to approximate the encoding values

• Billet suggests zeroing out two bits out of the 4, building up a new lookup table and performing the transformation

• Once we have that, we construct a new lookup table for the reverse operation

whitebox^whitebox

• We get 256 possible bijections, which can be used to build up output encoding approximations

• The same operation is done to the input encoding, using the approximation we acquired for Q

• Once we have the external encoding values, we can just sniff the first round key and extract the keys

• @shiftreduce
• shiftreduce@gmail.com

• Thanks to Eloi for making this challenge
• greetz @ #ecl, #nullsec, inbarr, nirizr, skier_, emdel, over, Mikae, l_inc
