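The libFuzzer workshop learning path: final part
The workshop has 11 lessons, each introducing something new. This post covers the last two case studies, fuzzing re2 and pcre2. Along the way it touches on which libraries to link, which flags to use for an instrumented build, and how the max_len setting affects the final fuzzing results.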
fuzzing pcre2
pcre2: Perl Compatible Regular Expressions, version 2. It is a regular expression library written in C and used by many open source projects such as PHP, Apache, and Nmap.
The workshop ships pcre2 version 10.00. Start by building it from source:
tar xzf pcre2-10.00.tgz
cd pcre2-10.00
./autogen.sh
export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope"
CXX="clang++ $FUZZ_CXXFLAGS" CC="clang $FUZZ_CXXFLAGS" \
CCLD="clang++ $FUZZ_CXXFLAGS" ./configure --enable-never-backslash-C \
--with-match-limit=1000 --with-match-limit-recursion=1000
make -j
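The instrumentation flags are much the same as in the earlier, advanced post. The one to note is fuzzer-no-link. When you change the CFLAGS of a large project whose build also links its own executables (each with its own main), you want the instrumentation without linking in libFuzzer's runtime. -fsanitize=fuzzer-no-link requests exactly that: instrumentation only, with no effect at link time. I therefore recommend this flag when building a large open source library with instrumentation. Without it, my fuzzing run looked like this, with coverage stuck at 16: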
#2 INITED cov: 7 ft: 8 corp: 1/1b exec/s: 0 rss: 27Mb
#3 NEW cov: 9 ft: 10 corp: 2/5b lim: 4 exec/s: 0 rss: 27Mb L: 4/4 MS: 1 CrossOver-
#7 REDUCE cov: 9 ft: 10 corp: 2/3b lim: 4 exec/s: 0 rss: 28Mb L: 2/2 MS: 4 ChangeByte-CrossOver-ChangeBinInt-EraseBytes-
#35 REDUCE cov: 10 ft: 11 corp: 3/5b lim: 4 exec/s: 0 rss: 28Mb L: 2/2 MS: 3 CopyPart-ChangeByte-EraseBytes-
#146 REDUCE cov: 10 ft: 11 corp: 3/4b lim: 4 exec/s: 0 rss: 28Mb L: 1/2 MS: 1 EraseBytes-
#1491 REDUCE cov: 16 ft: 17 corp: 4/21b lim: 17 exec/s: 0 rss: 28Mb L: 17/17 MS: 5 ChangeBit-ShuffleBytes-InsertRepeatedBytes-ChangeBit-CrossOver-
#1889 REDUCE cov: 16 ft: 17 corp: 4/20b lim: 17 exec/s: 0 rss: 28Mb L: 16/16 MS: 3 ShuffleBytes-CopyPart-EraseBytes-
#524288 pulse cov: 16 ft: 17 corp: 4/20b lim: 4096 exec/s: 87381 rss: 830Mb
#1048576 pulse cov: 16 ft: 17 corp: 4/20b lim: 4096 exec/s: 104857 rss: 830Mb
#2097152 pulse cov: 16 ft: 17 corp: 4/20b lim: 4096 exec/s: 123361 rss: 830Mb
#4194304 pulse cov: 16 ft: 17 corp: 4/20b lim: 4096 exec/s: 127100 rss: 830Mb
#8388608 pulse cov: 16 ft: 17 corp: 4/20b lim: 4096 exec/s: 131072 rss: 830Mb
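In addition, the configure step passes a few pcre2-specific options:
--with-match-limit=1000: caps the resources a single match may use (the number of internal match() calls) at 1000; the default is 10000000.
--with-match-limit-recursion=1000: caps the recursion depth of a single match at 1000; the default is 10000000, which is practically unlimited.
--enable-never-backslash-C: disables the \C escape in patterns. \C matches a single code unit, which is unsafe in UTF mode.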
With the library built, the next step is the harness. The workshop provides this one:
// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include <stdint.h>
#include <stddef.h>
#include <string>
#include "pcre2posix.h"
using std::string;
extern "C" int LLVMFuzzerTestOneInput(const unsigned char *data, size_t size) {
  if (size < 1) return 0;
  regex_t preg;
  string str(reinterpret_cast<const char*>(data), size);
  string pat(str);
  int flags = data[size/2] - 'a';  // Make it 0 when the byte is 'a'.
  if (0 == regcomp(&preg, pat.c_str(), flags)) {
    regmatch_t pmatch[5];
    regexec(&preg, str.c_str(), 5, pmatch, 0);
    regfree(&preg);
  }
  return 0;
}
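The logic is as follows. The byte in the middle of the input, minus 'a', is used as the compile flags, so the flags are 0 when that byte is 'a'. regcomp() then compiles the regular expression pat.c_str() into the internal form preg, which makes matching more efficient. regexec() uses that compiled form to match against the subject string, and regfree() releases it afterwards.
By including pcre2posix.h, the harness reaches pcre2's main functionality through the POSIX wrapper functions, and the memory operations behind those functions are often where crashes are triggered.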
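To make the flag derivation concrete, here is a minimal sketch of just that one step. FlagsFromInput is a hypothetical helper name used for illustration; it is not part of the workshop harness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the harness line
//   int flags = data[size/2] - 'a';
// The middle byte of the input, offset by 'a', becomes the cflags
// argument of regcomp(), so each bit of the result toggles one option.
int FlagsFromInput(const uint8_t* data, std::size_t size) {
  assert(size >= 1);  // the harness returns early for empty input
  return data[size / 2] - 'a';
}
```

For the first crashing input shown later in this post (15 bytes, middle byte 0xac), this yields 0xac - 'a' = 75, so several option bits are set at once.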
Next, compile and link the harness:
clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope pcre2_fuzzer.cc -I pcre2-10.00/src -Wl,--whole-archive pcre2-10.00/.libs/libpcre2-8.a pcre2-10.00/.libs/libpcre2-posix.a -Wl,-no-whole-archive -fsanitize=fuzzer -o pcre2-10.00-fsanitize_fuzzer
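Compared with earlier lessons there are some new arguments. --whole-archive and --no-whole-archive are options of ld that clang++ does not interpret itself; prefixing them with -Wl, makes clang++ pass them through to the linker. --whole-archive tells the linker to include every object file from the static libraries that follow it, rather than only the objects needed to resolve undefined symbols, and --no-whole-archive switches that behaviour off again. Here this means the full contents of the two static libraries libpcre2-8.a and libpcre2-posix.a are linked into the fuzzer binary. Running it produced a crash fairly quickly: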
#538040 NEW cov: 3286 ft: 15824 corp: 6803/133Kb lim: 74 exec/s: 1775 rss: 775Mb L: 24/74 MS: 3 ChangeASCIIInt-ChangeASCIIInt-EraseBytes-
#538092 REDUCE cov: 3286 ft: 15824 corp: 6803/133Kb lim: 74 exec/s: 1775 rss: 775Mb L: 23/74 MS: 2 CopyPart-EraseBytes-
#538098 REDUCE cov: 3286 ft: 15824 corp: 6803/133Kb lim: 74 exec/s: 1758 rss: 775Mb L: 6/74 MS: 1 EraseBytes-
#538204 REDUCE cov: 3286 ft: 15824 corp: 6803/133Kb lim: 74 exec/s: 1758 rss: 775Mb L: 16/74 MS: 1 EraseBytes-
#538415 REDUCE cov: 3286 ft: 15825 corp: 6804/134Kb lim: 74 exec/s: 1759 rss: 775Mb L: 35/74 MS: 1 ShuffleBytes-
=================================================================
==17319==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe809de45f at pc 0x0000005e1518 bp 0x7ffe809dd8f0 sp 0x7ffe809dd8e8
READ of size 1 at 0x7ffe809de45f thread T0
#0 0x5e1517 in match /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_match.c:5968:11
#1 0x5a0624 in pcre2_match_8 /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_match.c:6876:8
#2 0x5f5e64 in regexec /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2posix.c:291:6
#3 0x551947 in LLVMFuzzerTestOneInput /home/admin/libfuzzer-workshop/lessons/11/pcre2_fuzzer.cc:21:5
#4 0x459661 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:553:15
#5 0x458ea5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:469:3
#6 0x45b147 in fuzzer::Fuzzer::MutateAndTestOne() /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:695:19
#7 0x45be65 in fuzzer::Fuzzer::Loop(std::Fuzzer::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerLoop.cpp:831:5
#8 0x449c28 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:825:6
#9 0x473092 in main /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/fuzzer/FuzzerMain.cpp:19:10
#10 0x7f0d3f5c3bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
#11 0x41ddb9 in _start (/home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00-fsanitize_fuzzer+0x41ddb9)
Address 0x7ffe809de45f is located in stack of thread T0 at offset 159 in frame
#0 0x55136f in LLVMFuzzerTestOneInput /home/admin/libfuzzer-workshop/lessons/11/pcre2_fuzzer.cc:13
This frame has 6 object(s):
[32, 40) '__dnew.i.i.i.i26'
[64, 72) '__dnew.i.i.i.i'
[96, 128) 'preg' (line 15)
[160, 192) 'str' (line 16) <== Memory access at offset 159 underflows this variable
[224, 256) 'pat' (line 17)
[288, 328) 'pmatch' (line 20)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_match.c:5968:11 in match
Shadow bytes around the buggy address:
0x100050133c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100050133c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100050133c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100050133c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100050133c70: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f2 f2 f2
=>0x100050133c80: f8 f2 f2 f2 00 00 00 00 f2 f2 f2[f2]00 00 00 00
0x100050133c90: f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00 00 00
0x100050133ca0: 00 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
0x100050133cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100050133cc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x100050133cd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==17319==ABORTING
MS: 1 ChangeBit-; base unit: 7a9e5264e8896a1d996088a56a315765c53c7b33
0x5c,0x43,0x2b,0x5c,0x53,0x2b,0xde,0xac,0xd4,0xa3,0x53,0x2b,0x21,0x21,0x68,
\\C+\\S+\xde\xac\xd4\xa3S+!!h
artifact_prefix='./'; Test unit written to ./crash-5ae911f7e958e646e05ebe28421183f6efc0bc88
Base64: XEMrXFMr3qzUo1MrISFo
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_match.c:5968:11 in match
The report points to a stack-buffer-overflow (an out-of-bounds read) in pcre2_match.c. To locate the bug:
pcre2posix.c calls pcre2_match:
#in pcre2posix.c
rc = pcre2_match((const pcre2_code *)preg->re_pcre2_code,(PCRE2_SPTR)string + so, (eo - so), 0, options, md, NULL);
pcre2_match is defined in pcre2_match.c, and it in turn calls match:
#in pcre2_match.c
rc = match(start_match, mb->start_code, start_match, 2, mb, NULL, 0);
The out-of-bounds read occurs at this point while match is executing:
for(;;)
  {
  if (eptr == pp) goto TAIL_RECURSE;
  RMATCH(eptr, ecode, offset_top, mb, eptrb, RM46);
  if (rrc != MATCH_NOMATCH) RRETURN(rrc);
  eptr--;
  BACKCHAR(eptr);  // the out-of-bounds read is reported here
  if (ctype == OP_ANYNL && eptr > pp && UCHAR21(eptr) == CHAR_NL &&
      UCHAR21(eptr - 1) == CHAR_CR) eptr--;
  }
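The sketch below illustrates why stepping backwards over UTF-8 bytes needs a lower bound. It assumes that in 8-bit UTF mode BACKCHAR moves the pointer back for as long as it points at a continuation byte (bit pattern 10xxxxxx); that is my reading of the macro, not something shown in this post. BackCharSteps is a hypothetical name, and unlike the macro it checks for the start of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Counts how many bytes a BACKCHAR-style scan would step back from `pos`.
// Returns -1 if the scan would have to move before the start of `subject`,
// which is the situation that turns into an out-of-bounds read when no
// lower bound is checked.
int BackCharSteps(const std::string& subject, std::size_t pos) {
  assert(pos < subject.size());
  int steps = 0;
  // A UTF-8 continuation byte has its top two bits equal to 10.
  while ((static_cast<uint8_t>(subject[pos]) & 0xC0) == 0x80) {
    if (pos == 0) return -1;  // nothing before the buffer to step onto
    --pos;
    ++steps;
  }
  return steps;
}
```

On well-formed UTF-8 the scan always stops at a lead byte. On a subject that begins with continuation bytes, which a fuzzer produces easily, only the explicit bound stops it.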
Just when I thought the fuzzing work was done, I tried changing the static libraries on the harness link line to every library in .libs:
clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope pcre2_fuzzer.cc -I pcre2-10.00/src -Wl,--whole-archive pcre2-10.00/.libs/*.a -Wl,-no-whole-archive -fsanitize=fuzzer -o pcre2-10.00-fsanitize_fuzzer
The result of fuzzing again surprised me:
#605510 REDUCE cov: 3273 ft: 15706 corp: 6963/139Kb lim: 86 exec/s: 255 rss: 597Mb L: 18/86 MS: 1 EraseBytes-
#605733 NEW cov: 3273 ft: 15707 corp: 6964/139Kb lim: 86 exec/s: 255 rss: 597Mb L: 29/86 MS: 3 ShuffleBytes-CopyPart-CMP- DE: "+n"-
#605994 REDUCE cov: 3273 ft: 15707 corp: 6964/139Kb lim: 86 exec/s: 255 rss: 597Mb L: 36/86 MS: 1 EraseBytes-
#606040 REDUCE cov: 3273 ft: 15707 corp: 6964/139Kb lim: 86 exec/s: 255 rss: 597Mb L: 19/86 MS: 1 EraseBytes-
#606121 NEW cov: 3273 ft: 15708 corp: 6965/139Kb lim: 86 exec/s: 255 rss: 597Mb L: 27/86 MS: 1 CopyPart-
#606196 NEW cov: 3273 ft: 15709 corp: 6966/139Kb lim: 86 exec/s: 255 rss: 597Mb L: 86/86 MS: 5 ChangeASCIIInt-ChangeBit-ChangeBit-ChangeASCIIInt-CrossOver-
=================================================================
==10857==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6110001625ea at pc 0x00000055d548 bp 0x7ffccf4098f0 sp 0x7ffccf4098e8
WRITE of size 1 at 0x6110001625ea thread T0
#0 0x55d547 in _pcre2_ord2utf_8 /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_ord2utf.c:92:12
#1 0x4f60f4 in add_to_class /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_compile.c:2870:20
#2 0x4f5dd0 in add_to_class /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_compile.c:2820:18
#3 0x4e03e0 in compile_branch /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_compile.c:3923:11
#4 0x4d3f2f in compile_regex /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_compile.c:6723:8
#5 0x4d136c in pcre2_compile_8 /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_compile.c:7734:7
#6 0x56c3b3 in regcomp /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2posix.c:219:23
#7 0x4c83c9 in LLVMFuzzerTestOneInput /home/admin/libfuzzer-workshop/lessons/11/pcre2_fuzzer.cc:19:12
#8 0x585632 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:556:15
#9 0x584cd5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:470:3
#10 0x58606c in fuzzer::Fuzzer::MutateAndTestOne() /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:698:19
#11 0x586c75 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:830:5
#12 0x572b8b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerDriver.cpp:824:6
#13 0x56cc20 in main /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerMain.cpp:19:10
#14 0x7f16a7ecbbf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
#15 0x41deb9 in _start (/home/admin/libfuzzer-workshop/lessons/11/pcre2_10.00_fuzzer+0x41deb9)
0x6110001625ea is located 0 bytes to the right of 234-byte region [0x611000162500,0x6110001625ea)
allocated by thread T0 here:
#0 0x495dbd in malloc /local/mnt/workspace/bcain_clang_bcain-ubuntu_23113/llvm/utils/release/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3
#1 0x4d0953 in pcre2_compile_8 /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_compile.c:7656:3
#2 0x56c3b3 in regcomp /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2posix.c:219:23
#3 0x4c83c9 in LLVMFuzzerTestOneInput /home/admin/libfuzzer-workshop/lessons/11/pcre2_fuzzer.cc:19:12
#4 0x585632 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:556:15
#5 0x584cd5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:470:3
#6 0x58606c in fuzzer::Fuzzer::MutateAndTestOne() /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:698:19
#7 0x586c75 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerLoop.cpp:830:5
#8 0x572b8b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerDriver.cpp:824:6
#9 0x56cc20 in main /home/admin/libfuzzer-workshop/libFuzzer/Fuzzer/./FuzzerMain.cpp:19:10
#10 0x7f16a7ecbbf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_ord2utf.c:92:12 in _pcre2_ord2utf_8
Shadow bytes around the buggy address:
0x0c2280024460: fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa
0x0c2280024470: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c2280024480: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c2280024490: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
0x0c22800244a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c22800244b0: 00 00 00 00 00 00 00 00 00 00 00 00 00[02]fa fa
0x0c22800244c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c22800244d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c22800244e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c22800244f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c2280024500: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==10857==ABORTING
MS: 5 InsertRepeatedBytes-CMP-CrossOver-ChangeBit-CrossOver- DE: "+\xc6"-; base unit: ce48e02587af5cb5d3e84053d6d5b4545bbb6e32
0x5b,0x2a,0x5d,0x3f,0x5b,0x3f,0x3f,0x5c,0x53,0x3f,0x5b,0x2a,0x5d,0x3f,0x5b,0x3f,0x3f,0x5c,0x53,0x2a,0x63,0x20,0x20,0x20,0x25,0xc6,0xa4,0x1a,0x2d,0x5b,0x43,0x1a,0x2d,0xc6,0xa4,0x5d,0x50,0x2a,0x5d,0x50,0x2a,0x5e,0x58,0x42,0x5c,0x5c,0x3f,0x77,0xc,0x5c,0x77,0x0,0x36,0x5c,0x20,0xa0,0xc0,0xec,0x2d,0x3f,0x5c,0x77,0x3f,0x5c,0x2d,0xac,0x3f,0x5c,
[*]?[??\\S?[*]?[??\\S*c %\xc6\xa4\x1a-[C\x1a-\xc6\xa4]P*]P*^XB\\\\?w\x0c\\w\x006\\ \xa0\xc0\xec-?\\w?\\-\xac?\\
artifact_prefix='./'; Test unit written to ./crash-849705875bb2098817f3299ee582e2207a568e63
Base64: WypdP1s/P1xTP1sqXT9bPz9cUypjICAgJcakGi1bQxotxqRdUCpdUCpeWEJcXD93DFx3ADZcIKDA7C0/XHc/XC2sP1w=
stat::number_of_executed_units: 606206
stat::average_exec_per_sec: 255
stat::new_units_added: 8960
stat::slowest_unit_time_sec: 0
stat::peak_rss_mb: 598
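This is a different crash, which makes sense: by linking different or additional static libraries, any code in those libraries that the harness logic can reach has a chance to crash. (libFuzzer's mutations are also randomized, so separate runs can surface different bugs.)
The line SUMMARY: AddressSanitizer: heap-buffer-overflow /home/admin/libfuzzer-workshop/lessons/11/pcre2-10.00/src/pcre2_ord2utf.c:92:12 in _pcre2_ord2utf_8 tells us there is a heap-buffer-overflow in pcre2_ord2utf.c. Again, to locate the bug:
The call chain is longer this time, so follow it one level at a time.
First, pcre2posix.c calls pcre2_compile: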
preg->re_pcre2_code = pcre2_compile((PCRE2_SPTR)pattern, -1, options,
&errorcode, &erroffset, NULL);
That function is defined in pcre2_compile.c, and it calls compile_regex:
(void)compile_regex(re->overall_options, &code, &ptr, &errorcode, FALSE, FALSE,
0, 0, &firstcu, &firstcuflags, &reqcu, &reqcuflags, NULL, &cb, NULL);
compile_regex then calls compile_branch:
if (!compile_branch(&options, &code, &ptr, errorcodeptr, &branchfirstcu,
      &branchfirstcuflags, &branchreqcu, &branchreqcuflags, &bc,
      cond_depth, cb, (lengthptr == NULL)? NULL : &length))
  {
  *ptrptr = ptr;
  return FALSE;
  }
compile_branch calls add_to_class:
class_has_8bitchar +=
add_to_class(classbits, &class_uchardata, options, cb, c, d);
add_to_class then calls PRIV(ord2utf), which is the _pcre2_ord2utf_8 frame in the stack trace:
else if (start == end)
{
*uchardata++ = XCL_SINGLE;
uchardata += PRIV(ord2utf)(start, uchardata);
}
}
PRIV(ord2utf) is defined in pcre2_ord2utf.c:
unsigned int
PRIV(ord2utf)(uint32_t cvalue, PCRE2_UCHAR *buffer)
{
/* Convert to UTF-8 */
#if PCRE2_CODE_UNIT_WIDTH == 8
register int i, j;
for (i = 0; i < PRIV(utf8_table1_size); i++)
  if ((int)cvalue <= PRIV(utf8_table1)[i]) break;
buffer += i;
for (j = i; j > 0; j--)
  {
  *buffer-- = 0x80 | (cvalue & 0x3f);  // heap-buffer-overflow reported here: the loop writes through buffer with no bounds check
  cvalue >>= 6;
  }
*buffer = PRIV(utf8_table2)[i] | cvalue;
return i + 1;
/* Convert to UTF-16 */
#elif PCRE2_CODE_UNIT_WIDTH == 16
if (cvalue <= 0xffff)
  {
  *buffer = (PCRE2_UCHAR)cvalue;
  return 1;
  }
cvalue -= 0x10000;
*buffer++ = 0xd800 | (cvalue >> 10);
*buffer = 0xdc00 | (cvalue & 0x3ff);
return 2;
/* Convert to UTF-32 */
#else
*buffer = (PCRE2_UCHAR)cvalue;
return 1;
#endif
}
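As a rough illustration of what the UTF-8 branch does, here is a standalone sketch of the same encoding loop that returns the bytes in a vector sized to fit, so it cannot write out of bounds. EncodeUtf8, kMax, and kLead are hypothetical names; the table values are the standard UTF-8 range limits and lead-byte prefixes, which I assume match utf8_table1 and utf8_table2 in pcre2.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes one code point the way the UTF-8 branch above does, but into a
// vector that is allocated with exactly the number of bytes required.
std::vector<uint8_t> EncodeUtf8(uint32_t cvalue) {
  assert(cvalue <= 0x7fffffffu);
  // Largest value that fits in 1, 2, ... 6 bytes (assumed utf8_table1).
  static const uint32_t kMax[] = {0x7f, 0x7ff, 0xffff,
                                  0x1fffff, 0x3ffffff, 0x7fffffff};
  // Lead-byte prefix for each length (assumed utf8_table2).
  static const uint8_t kLead[] = {0x00, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc};

  int extra = 0;  // number of continuation bytes, the `i` of the original
  while (extra < 5 && cvalue > kMax[extra]) ++extra;

  std::vector<uint8_t> out(extra + 1);
  for (int j = extra; j > 0; --j) {
    out[j] = static_cast<uint8_t>(0x80 | (cvalue & 0x3f));
    cvalue >>= 6;
  }
  out[0] = static_cast<uint8_t>(kLead[extra] | cvalue);
  return out;
}
```

The point of the sketch is that the number of bytes written (extra + 1) depends on the code point, so a caller that reserves too little space for a character class gets exactly the kind of one-past-the-end write ASan reported. As an example, U+01A4 encodes to 0xc6 0xa4, a byte pair that also occurs in the crashing input above.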
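To summarize the two crashes:
The first is a stack-buffer-overflow triggered by the matching logic of regexec in the harness, at pcre2_match.c:5968:11. The second is a heap-buffer-overflow triggered by the compilation logic of regcomp, at pcre2_ord2utf.c:92:12.
Tracing the calls layer by layer is enough to make your head spin, but that is exactly what the "digging" in bug hunting means.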
fuzzing re2
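This example shows how the choice of max_len affects fuzzing efficiency.
re2 is an efficient, principled regular expression library, implemented in C++ by two engineers at Google. Go's regexp package follows the same design and syntax as re2. The workshop provides the re2-2014-12-09 release.
Build it from source first: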
tar xzf re2.tgz
cd re2
export FUZZ_CXXFLAGS="-O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope"
make clean
CXX=clang++ CXXFLAGS="$FUZZ_CXXFLAGS" make -j
Next, the harness:
// Copyright (c) 2016 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
#include <stddef.h>
#include <stdint.h>
#include <string>
#include "re2/re2.h"
#include "util/logging.h"
using std::string;
void Test(const string& buffer, const string& pattern,
          const RE2::Options& options) {
  RE2 re(pattern, options);
  if (!re.ok())
    return;
  string m1, m2;
  int i1, i2;
  double d1;
  if (re.NumberOfCapturingGroups() == 0) {
    RE2::FullMatch(buffer, re);
    RE2::PartialMatch(buffer, re);
  } else if (re.NumberOfCapturingGroups() == 1) {
    RE2::FullMatch(buffer, re, &m1);
    RE2::PartialMatch(buffer, re, &i1);
  } else if (re.NumberOfCapturingGroups() == 2) {
    RE2::FullMatch(buffer, re, &i1, &i2);
    RE2::PartialMatch(buffer, re, &m1, &m2);
  }
  re2::StringPiece input(buffer);
  RE2::Consume(&input, re, &m1);
  RE2::FindAndConsume(&input, re, &d1);
  string tmp1(buffer);
  RE2::Replace(&tmp1, re, "zz");
  string tmp2(buffer);
  RE2::GlobalReplace(&tmp2, re, "xx");
  RE2::QuoteMeta(re2::StringPiece(pattern));
}
// Entry point for LibFuzzer.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
  if (size < 1)
    return 0;
  RE2::Options options;
  size_t options_randomizer = 0;
  for (size_t i = 0; i < size; i++)
    options_randomizer += data[i];
  if (options_randomizer & 1)
    options.set_encoding(RE2::Options::EncodingLatin1);
  options.set_posix_syntax(options_randomizer & 2);
  options.set_longest_match(options_randomizer & 4);
  options.set_literal(options_randomizer & 8);
  options.set_never_nl(options_randomizer & 16);
  options.set_dot_nl(options_randomizer & 32);
  options.set_never_capture(options_randomizer & 64);
  options.set_case_sensitive(options_randomizer & 128);
  options.set_perl_classes(options_randomizer & 256);
  options.set_word_boundary(options_randomizer & 512);
  options.set_one_line(options_randomizer & 1024);
  options.set_log_errors(false);
  const char* data_input = reinterpret_cast<const char*>(data);
  {
    string pattern(data_input, size);
    string buffer(data_input, size);
    Test(buffer, pattern, options);
  }
  if (size >= 3) {
    string pattern(data_input, size / 3);
    string buffer(data_input + size / 3, size - size / 3);
    Test(buffer, pattern, options);
  }
  return 0;
}
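The harness exercises many of re2's methods, matching buffer against re through interfaces such as FullMatch and PartialMatch. buffer is initialized from data_input and size (data_input is simply the input data after a reinterpret_cast), and re is the RE2 object built from pattern and options.
Note the conditional branches in the harness. It returns immediately when size < 1. When size >= 3 it also splits the input: pattern is initialized from the first size / 3 bytes of data_input, and buffer from the remaining size - size / 3 bytes starting at data_input + size / 3. The size of each test case therefore affects the fuzzing process. If the input is very short it may never trigger a crash; if it is very long, each run of the harness takes longer, which slows the search for new coverage. The tests below compare the effect of different max_len values: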
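The split performed in the size >= 3 branch can be sketched on its own as follows. SplitInput is a hypothetical helper written for illustration; the harness does the same thing inline with two string constructors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Splits a fuzz input the way the second Test() call does:
// the first size/3 bytes become the pattern, the rest becomes the buffer.
std::pair<std::string, std::string> SplitInput(const std::string& data) {
  assert(data.size() >= 3);  // the harness only splits inputs of 3+ bytes
  const std::size_t cut = data.size() / 3;
  return {data.substr(0, cut), data.substr(cut)};
}
```

For a 10-byte input this gives a 3-byte pattern and a 7-byte buffer. With -max_len=10 the pattern in this second call can therefore never exceed 3 bytes, which hints at why very small limits explore little of the regex syntax.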
Compile and link the harness:
clang++ -O2 -fno-omit-frame-pointer -gline-tables-only -fsanitize=address,fuzzer-no-link -fsanitize-address-use-after-scope -std=gnu++98 target.cc -I re2/ re2/obj/libre2.a -fsanitize=fuzzer -o re2_fuzzer
Because this re2 release is fairly old, it is compiled with the C++98 standard.
First set max_len to 10 and the run time to 100 seconds; -print_final_stats=1 prints the final statistics, and corpus1 holds the corpus:
./re2_fuzzer ./corpus1 -print_final_stats=1 -max_len=10 -max_total_time=100
Done 643760 runs in 101 second(s)
stat::number_of_executed_units: 643760
stat::average_exec_per_sec: 6373
stat::new_units_added: 36
stat::slowest_unit_time_sec: 0
stat::peak_rss_mb: 456
Only 36 new units were added to the corpus.
Next, set max_len to 100 with the same 100-second run time and -print_final_stats=1, using corpus2 as the corpus directory:
./re2_fuzzer ./corpus2 -print_final_stats=1 -max_len=100 -max_total_time=100
Done 233437 runs in 101 second(s)
stat::number_of_executed_units: 233437
stat::average_exec_per_sec: 2311
stat::new_units_added: 50
stat::slowest_unit_time_sec: 0
stat::peak_rss_mb: 675
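This run added 50 new units, which does not feel like much of a difference.
Then set max_len to 1000 with the same 100-second run time and -print_final_stats=1, using corpus3 as the corpus directory: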
./re2_fuzzer ./corpus3 -print_final_stats=1 -max_len=1000 -max_total_time=100
Done 105935 runs in 101 second(s)
stat::number_of_executed_units: 105935
stat::average_exec_per_sec: 1048
stat::new_units_added: 97
stat::slowest_unit_time_sec: 0
stat::peak_rss_mb: 830
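This run added 97 new units, about twice as many as the second run and nearly three times as many as the first.
Finally, set max_len to 500 with the same 100-second run time and -print_final_stats=1, using corpus4 as the corpus directory: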
./re2_fuzzer ./corpus4 -print_final_stats=1 -max_len=500 -max_total_time=100
Done 119361 runs in 101 second(s)
stat::number_of_executed_units: 119361
stat::average_exec_per_sec: 1181
stat::new_units_added: 117
stat::slowest_unit_time_sec: 0
stat::peak_rss_mb: 827
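The result is fairly clear: different max_len values affect fuzzing efficiency differently, and of course the harness you write matters too. Choosing a suitable max_len when running the fuzzer (somewhere between 100 and 1000 in this example) lets the fuzzer discover more code, which raises the chance of finding a crash.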
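One way to compare the four runs is to normalize new units by the number of executions. The sketch below does that arithmetic on the statistics printed above. UnitsPerMillionExecs is a hypothetical helper, and the metric is my own choice for illustration, not something libFuzzer reports.

```cpp
#include <cassert>
#include <cstdint>

// New corpus units found per one million executions (integer division).
// Inputs are stat::new_units_added and stat::number_of_executed_units.
uint64_t UnitsPerMillionExecs(uint64_t new_units, uint64_t executions) {
  assert(executions > 0);
  return new_units * 1000000ULL / executions;
}
```

Applied to the runs above, this gives 55 for max_len=10, 214 for max_len=100, 980 for max_len=500, and 915 for max_len=1000. Longer inputs cost throughput but find far more per execution, and in these single 100-second runs the balance came out best around 500. These are one-off measurements, so the ranking should be read as indicative only.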
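Summary
That completes the libfuzzer workshop. libFuzzer is one of the most widely used fuzzing tools, and the workshop has a lesson for each of its main usage patterns. Personally, working through it step by step, I found libFuzzer very powerful for fuzzing the API functions that open source libraries expose. That is also the hard part of learning it: designing a sound harness. Doing so requires some understanding of the methods the target library offers, plus attack surface analysis and similar work to improve the harness step by step and bring us closer to a crash.
I am new to libFuzzer, so corrections for any mistakes or oversights are welcome.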