microcorruption hollywood writeup

A writeup of the hollywood challenge from microcorruption.

Initial reverse engineering

This challenge is a whole lot harder than the other challenges. When freshly initialized, it contains code that seem to unpack or copy itself continuously into random memory locations and then cleans up. This prevents dumping a fully unpacked version of the code.

Nevertheless, dumping a partially unpacked at the point where it asks the user for the password is usefull.

Here is my python script to convert the Live Memory Dump from the debugger:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28 #!/usr/bin/env python

import sys

if len(sys.argv) < 2:
    print 'Usage : %s <msp430.dump> > msp430.bin' % sys.argv[0]
    quit()
    
with open(sys.argv[1], 'r') as f:
    data = f.read()
    
data = data.split('\n')

last_addr = 0x0000
def addLine(addr, line):
    return '00' * (addr - last_addr - 16) + line

bin = ''
for line in data:
    if len(line) < 1:
        continue
    l = line[8:48]
    if l[0] == '*':
        continue
    bin += addLine(int(line[:4],16),l.replace(' ',''))
    last_addr = int(line[:4],16)
    
print bin.decode('hex')

You can then open this .bin in IDA or binninja and tell it that its a ti msp430 and load it at address 0.

ida

Understanding what is happening

After a few hours of reverse engineering this program in the crappy debugger, I finally understood what it was. It’s a poor man’s virtual machine that has an encoded chain of code gadgets. The execution flow goes as follow:

copy the code to unpack a gadget into a random location
jump to that location
zero out the code from previous step
unpack a gadget into a random location
jump to that location
The gadget sets the address of the next gadget in r12 and jumps back to step 1.

most gadgets only actually executes a single instruction. The rest of its content is for managing the chained list of gadget.

Here is the exact sequence of commands I used to extract the gadgets:

> reset
> c
Enter AB when prompted to enter password.
> #define aaa s ; s ; b pc+86 ; c
> #define aa s ; s ; b pc+86 ; c
> #define bb u pc ; s ; b pc+6e ; c
> #define cc u pc ; s ; breakpoints
> aaa
> bb
> cc
Record the instruction in the Current Instruction window. (it should be mov #0x2600, r5)
> aa
> bb
> cc
Record the instruction in the Current Instruction window. (it should be clr r6)

After each execution of cc, record the instruction(s) about to be executed. This is the instruction found in the decoded gadget. For some reason, if I chained aa bb and cc into a single macro, the debugger gets confused after a few executions. The breakpoints in cc is to make sure that all breakpoints are gone before the next round… you need to do this manually if you see any. The debugger manages to forget a few every now and then. Very annoying.

At this point it becomes a normal reversing job and I ended up with the following algorithm. The algorithm essentially makes a rudimentary hash of the password and checks for two conditions. Being lazy I wrote the algo in python and tried to bruteforce the password.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44 #!/usr/bin/env python

from pwn import *
import string

log.info('searching for the password')

password = 'aaaa'

def itera(r4, r6, val):
	r4 = (r4 + val) & 0xffff
	r4 = u16(p16(r4)[::-1])	#swpb
	r6 ^= val
	r6 ^= r4
	r4 ^= r6
	r6 ^= r4
	return (r4, r6)

def testpass(paswd):
	r4 = 0
	r6 = 0
	(r4, r6) = itera(r4, r6, u16(paswd[0:2]))
	(r4, r6) = itera(r4, r6, u16(paswd[2:4]))
	if r4 == 0xfeb1:
		if r6 == 0x9298:
			log.success('r4 was right with pass: %s' % paswd)
			log.success('r6 was also right with pass: %s' % paswd)
			quit()

alnum = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
alnum = string.printable

i = 0
pwd = '    '

with log.progress('crunching...') as p:
	for a in xrange(0x00, 0xff):
		for b in xrange(0x00, 0xff):
			for c in xrange(0x00, 0xff):
				i += 1
				p.status("At %i%% : %s" % (i*100/pow(len(alnum),3), pwd))
				for d in xrange(0x00, 0xff):
					pwd = chr(a)+chr(b)+chr(c)+chr(d)
					testpass(pwd)

It failed because its only trying for passwords of up to 4 characters and the smallest possible password is 5 bytes long. It took forever for 4 bytes so I turned to z3 as this is what its made for.

Here is my z3 code inspired by https://gist.github.com/laanwj/7020e2fb9922f5c8b4e5522bbd3353fe. I felt pretty bad looking at a hollywood write-up for my z3 help but my google keyword searches gave me this as my first hit. At this point I knew what to do, I just needed the proper z3 syntax.

Here is my python script that uses z3 to solve this problem:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35 #!/usr/bin/env python

# inspired by https://gist.github.com/laanwj/7020e2fb9922f5c8b4e5522bbd3353fe for z3

from z3 import *

def itera(r4, r6, val):
	return (r6 ^ val, RotateLeft(r4 + val,8))


def hashfun(values):
	r4 = 0
	r6 = 0
	for v in values:
		(r4, r6) = itera(r4, r6, v)
	return (r4, r6)

def solve():
	for length in range(2,50, 1):
		values = [BitVec(chr(41+i), 16) for i in range(length)]
		(r4_f, r6_f) = hashfun(values)
		s = Solver()
		s.add(r4_f == 0xfeb1)
		s.add(r6_f == 0x9298)
		if s.check() == sat:
			print 'solution of len %2d: ' % length,
			m = s.model()
			final = ''
			for x in values:
				l = m[x].as_long()
				final += '%02x%02x' % (l & 0xff, l >> 8 & 0xff)
			print final

if __name__ == '__main__':
	solve()

enjoy