stack-overflow at "parseFile"

Question

NotmebutWind opened this issue 3 years ago

Hi,

I found a bug while fuzzing. When I pass an input file to a program that uses toml.h with parseFile, it causes a stack overflow in the parseFile function. I suspect too much looping or some other bug is causing this. My stack size is 8192 KB.

I know we can avoid it with ulimit -s, but I don't think parsing a TOML file of about 32 KB should overflow an 8 MB stack. If I build a file that follows a certain pattern, it could probably be minimized to under 10 KB.

Here is the backtrace:

#0 __asan_memset ()
    at /src/llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:26
#1  std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__zero() () at /usr/local/bin/../include/c++/v1/string:1563
#2  std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string() () at /usr/local/bin/../include/c++/v1/string:1812
#3  toml::internal::Token::Token(toml::internal::TokenType) () at ./include/toml/toml.h:272
#4  toml::internal::Lexer::nextToken(bool) () at ./include/toml/toml.h:1022
#5  toml::internal::Lexer::nextValueToken() () at ./include/toml/toml.h:967
#6  toml::internal::Parser::nextValue() () at ./include/toml/toml.h:347
#7  toml::internal::Parser::consumeForValue(toml::internal::TokenType) ()
    at ./include/toml/toml.h:1740

(the following two lines then repeat about 20,000 times)

toml::internal::Parser::parseArray(toml::Value*) () at ./include/toml/toml.h:1978
toml::internal::Parser::parseValue(toml::Value*) () at ./include/toml/toml.h:1919

followed by these frames:

toml::internal::Parser::parseKeyValue(toml::Value*) () at ./include/toml/toml.h:1883
toml::internal::Parser::parse() () at ./include/toml/toml.h:1790
toml::parse(std::__1::basic_istream<char, std::__1::char_traits<char> >&) ()
    at ./include/toml/toml.h:385
toml::parseFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) () at ./include/toml/toml.h:401
LLVMFuzzerTestOneInput () at /src/parse_file.cc:16
fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) ()
RunOneTest () at /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:323
fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) ()
main () at /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20

The test case I use is like parse_file.cc in your project. You can compile your code as is and feed my file to parse_file, with a stack size of 8192 KB, and you will get a segmentation fault caused by the stack overflow. I have uploaded the file that causes this.
crashcase.zip
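
For readers without the attachment: the repeated parseArray / parseValue frames in the backtrace suggest input with very deeply nested arrays. The sketch below builds that kind of document. It is an assumption about the shape of the crashing input, not the contents of crashcase.zip, and `makeNestedArrayToml` and the key name `a` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of toml.h): returns a one-line TOML document
// of the form  a = [[[ ... ]]]  with `depth` nested array brackets.
// Assumption: deeply nested arrays are what drives the parseArray/parseValue
// recursion seen in the backtrace; the real crash file may differ.
std::string makeNestedArrayToml(std::size_t depth) {
    std::string doc = "a = ";
    doc.append(depth, '[');   // open `depth` arrays
    doc.append(depth, ']');   // close them again
    doc.push_back('\n');
    return doc;
}
```

Writing `makeNestedArrayToml(20000)` to a file gives a document of roughly 40 KB, the same order of magnitude as the sizes mentioned above. Whether a given depth actually overflows depends on the stack limit and on the frame sizes of the build (ASan builds use larger frames).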

mayah · Answer 1 · Sat Oct 16 2021 12:31:48 GMT+0800 (China Standard Time)

Thank you for the report, and sorry for the late response.
By design, this library doesn't take overly complex input much into account.
To solve this problem, we would have to stop using a stack-based (recursive) algorithm for parsing.
That is possible; however, I believe it would make the code complex, and it's beyond the scope of this library.
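
As a rough illustration of what moving away from a call-stack-based algorithm means, here is a minimal sketch that tracks bracket nesting in a heap-allocated `std::vector` instead of through recursive calls. It is not the library's parser and not a proposed patch: `checkNesting` is a hypothetical name, and the sketch only checks that brackets balance and reports the deepest nesting level, ignoring strings and comments.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of toml.h). Walks the text once and keeps the
// currently open brackets in a vector, so nesting depth costs heap memory
// rather than call-stack frames.
// Simplification: brackets inside strings or comments are not skipped.
bool checkNesting(const std::string& text, std::size_t* maxDepth) {
    std::vector<char> open;   // explicit stack of unclosed '[' and '{'
    std::size_t deepest = 0;
    for (char c : text) {
        if (c == '[' || c == '{') {
            open.push_back(c);
            if (open.size() > deepest) deepest = open.size();
        } else if (c == ']' || c == '}') {
            const char expected = (c == ']') ? '[' : '{';
            if (open.empty() || open.back() != expected) return false;
            open.pop_back();
        }
    }
    if (maxDepth != nullptr) *maxDepth = deepest;
    return open.empty();      // true only if every bracket was closed
}
```

A full non-recursive parser would also have to keep the partially built values on that explicit stack, which is where the extra complexity mentioned above comes from.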