Hugal31 / yara-rust

Rust bindings for VirusTotal/Yara

Memory leak

ikrivosheev opened this issue · comments

I tested my application and saw huge memory growth.
Files to reproduce the error: 16025.tar.gz.zip (these are plain text files)

Simple code to reproduce:

use std::os::unix::fs::MetadataExt;
use std::sync::Arc;
use threadpool::ThreadPool;
use yara::Compiler;

const WORKERS: usize = 4;
const RULES: &str = "..."; // path to rules
const FILE: &str = "...";  // path to files to scan

fn main() {
    let pool = ThreadPool::new(WORKERS);
    let compiler = Compiler::new().unwrap();
    let compiler = compiler.add_rules_file(RULES).unwrap();
    let rules = Arc::new(compiler.compile_rules().unwrap());
    loop {
        let mut counter = 0;
        println!("Start iter");

        for file in globwalk::GlobWalkerBuilder::new(FILE, "**")
            .build()
            .unwrap()
        {
            if let Ok(file) = file {
                counter += 1;
                if file.metadata().unwrap().size() == 0 {
                    continue;
                }
                let rules = rules.clone();
                pool.execute(move || {
                    let mut scanner = rules.scanner().unwrap();
                    scanner.set_timeout(60);
                    let _ = scanner.scan_file(file.path());
                });
            }
        };
        pool.join();
        println!("Finish iter, files={}", counter);
        std::thread::sleep(std::time::Duration::from_secs(7));

    }
}

What am I doing wrong?

Hi,

Can you provide a sample of your rules? When I test your code, I get a very slow (but still worrying) increase in memory.

@Hugal31, hi. Can you test with the changes in #57? I read about mem::transmute and I think this is the first problem...

Rules: all.zip. I got the rules from: https://github.com/Yara-Rules/rules

Just to be clear, are you running your test with #57?

@Hugal31, did you reproduce the problem?

@Hugal31, I tested with #57. It does not help...

Sorry, I could not reproduce it.

Have you double-checked the Yara version you are using? Which flags did you enable? Note that I had to disable one rule that depends on cuckoo.

My application uses:

  1. python 3.8
  2. rust bindings using pyo3 (using stable ABI3)
  3. rust-yara with features: vendored, bundled-4_1_2

And I see memory growth... I tried to write a simple example to reproduce the problem. Valgrind and heapcheck show nothing.

I don't understand what Python and pyo3 have to do with this. Your sample code neither contains nor runs Python, right? And you see memory grow with your sample code?

Some more results. I ran the process under strace: strace -k -f -e trace=%memory -o /tmp/log bin

Then I ran pmap -p <pid>:

00007fdda4000000  65324K rw---   [ anon ]
00007fdda7fcb000    212K -----   [ anon ]
00007fddac000000  65324K rw---   [ anon ]
00007fddaffcb000    212K -----   [ anon ]
00007fddb4000000  65324K rw---   [ anon ]
00007fddb7fcb000    212K -----   [ anon ]
00007fddb8000000  65324K rw---   [ anon ]
00007fddbbfcb000    212K -----   [ anon ]
00007fddbc000000  65324K rw---   [ anon ]
00007fddbffcb000    212K -----   [ anon ]
00007fddc0000000  65324K rw---   [ anon ]
00007fddc3fcb000    212K -----   [ anon ]
00007fddc4000000  65032K rw---   [ anon ]
00007fddc7f82000    504K -----   [ anon ]
00007fddcc000000  65324K rw---   [ anon ]
00007fddcffcb000    212K -----   [ anon ]
00007fddd4000000  65028K rw---   [ anon ]
00007fddd7f81000    508K -----   [ anon ]
00007fddd8000000  65024K rw---   [ anon ]
00007fdddbf80000    512K -----   [ anon ]
00007fdddc000000  65020K rw---   [ anon ]
....
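
A note on reading the pmap output above: each rw--- region is immediately followed by a ----- region, and every such pair adds up to 65536K (64 MiB), for example 65324K + 212K or 65032K + 504K. That pattern is consistent with glibc's malloc reserving a 64 MiB heap per thread arena and committing only part of it, but that interpretation is an assumption and is not confirmed anywhere in this thread. Below is a minimal sketch, using only the Rust standard library, that parses lines in this format and sums the size of each contiguous group of mappings. The names `Mapping`, `parse_pmap_line` and `contiguous_group_sizes` are hypothetical helpers made up for this illustration.

```rust
/// One mapping line from `pmap` output: start address, size in KiB, permissions.
#[derive(Debug, PartialEq)]
struct Mapping {
    start: u64,
    size_kib: u64,
    perms: String,
}

/// Parses a line such as `00007fdda4000000  65324K rw---   [ anon ]`.
/// Returns `None` for lines that do not match this shape (headers, `....`).
fn parse_pmap_line(line: &str) -> Option<Mapping> {
    let mut fields = line.split_whitespace();
    let start = u64::from_str_radix(fields.next()?, 16).ok()?;
    let size_kib = fields.next()?.strip_suffix('K')?.parse::<u64>().ok()?;
    let perms = fields.next()?.to_string();
    Some(Mapping { start, size_kib, perms })
}

/// Groups mappings that are adjacent in memory (one ends exactly where the
/// next begins) and returns the total size of each group in KiB.
fn contiguous_group_sizes(mappings: &[Mapping]) -> Vec<u64> {
    let mut groups = Vec::new();
    // (end address of the current group, accumulated size in KiB)
    let mut current: Option<(u64, u64)> = None;
    for m in mappings {
        let end = m.start + m.size_kib * 1024;
        current = match current {
            // The mapping continues the current group.
            Some((prev_end, total)) if prev_end == m.start => Some((end, total + m.size_kib)),
            // There is a gap: close the current group and start a new one.
            Some((_, total)) => {
                groups.push(total);
                Some((end, m.size_kib))
            }
            None => Some((end, m.size_kib)),
        };
    }
    if let Some((_, total)) = current {
        groups.push(total);
    }
    groups
}
```

Feeding the lines of the listing through `parse_pmap_line` and then `contiguous_group_sizes` gives one total per rw---/----- pair, which makes it easy to count how many 64 MiB reservations exist at a given moment and whether that number keeps growing between iterations.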

Then I looked up one of these addresses in the strace log and found:

11265 mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7fddcc000000
 > /lib/x86_64-linux-gnu/libc-2.31.so(mmap64+0x26) [0x11ba46]
 > /lib/x86_64-linux-gnu/libc-2.31.so(pthread_attr_setschedparam+0x5f7) [0x98527]
 > /lib/x86_64-linux-gnu/libc-2.31.so(pthread_attr_setschedparam+0x262d) [0x9a55d]
 > /lib/x86_64-linux-gnu/libc-2.31.so(pthread_attr_setschedparam+0x39e3) [0x9b913]
 > /lib/x86_64-linux-gnu/libc-2.31.so(__libc_malloc+0x1b9) [0x9d419]
 > (yr_notebook_alloc+0x49) [0xae709]
 > (_yr_scan_match_callback+0x2a4) [0xac364]
 > (yr_scan_verify_match+0x368) [0xad008]
 > (_yr_scanner_scan_mem_block.isra.0+0x1e2) [0xa9002]
 > (yr_scanner_scan_mem_blocks+0x3ac) [0xa96dc]
 > (yr_scanner_scan_mem+0x7d) [0xa9add]
 ...
 111265 mprotect(0x7fddcc000000, 716800, PROT_READ|PROT_WRITE) = 0
 > /lib/x86_64-linux-gnu/libc-2.31.so(mprotect+0xb) [0x11bb0b]
 > /lib/x86_64-linux-gnu/libc-2.31.so(pthread_attr_setschedparam+0x64a) [0x9857a]
 > /lib/x86_64-linux-gnu/libc-2.31.so(pthread_attr_setschedparam+0x262d) [0x9a55d]
 > /lib/x86_64-linux-gnu/libc-2.31.so(pthread_attr_setschedparam+0x39e3) [0x9b913]
 > /lib/x86_64-linux-gnu/libc-2.31.so(__libc_malloc+0x1b9) [0x9d419]
 > (yr_notebook_alloc+0x49) [0xae709]
 > (_yr_scan_match_callback+0x2a4) [0xac364]
 > (yr_scan_verify_match+0x368) [0xad008]
 > (_yr_scanner_scan_mem_block.isra.0+0x1e2) [0xa9002]
 > (yr_scanner_scan_mem_blocks+0x3ac) [0xa96dc]
 > (yr_scanner_scan_mem+0x7d) [0xa9add]

Why is the memory not freed... This is very strange.

Here is another example, which is OK. I removed threadpool and used std::thread directly:

use std::os::unix::fs::MetadataExt;
use std::sync::Arc;
use yara::Compiler;
use std::thread;

const RULES: &str = "/home/ikrivosheev/projects/test/src/ms_binary.yar"; // path to rules
const FILE: &str = "/home/ikrivosheev/data/16025/";  // path to files to scan

fn main() {
    let compiler = Compiler::new().unwrap();
    let compiler = compiler.add_rules_file(RULES).unwrap();
    let rules = Arc::new(compiler.compile_rules().unwrap());
    loop {
        let mut counter = 0;
        println!("Start iter");

        for file in globwalk::GlobWalkerBuilder::new(FILE, "**")
            .build()
            .unwrap()
        {
            if let Ok(file) = file {
                counter += 1;
                if file.metadata().unwrap().size() == 0 {
                    continue;
                }
                let rules = rules.clone();
                thread::spawn(move || {
                    let mut scanner = rules.scanner().unwrap();
                    scanner.set_timeout(60);
                    let _ = scanner.scan_file(file.path());
                });
            }
        };
        println!("Finish iter, files={}", counter);
        std::thread::sleep(std::time::Duration::from_secs(100));
    }
}
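
One difference between the two examples above is that the thread-based one never waits for the spawned threads, so "Finish iter" can be printed while scans are still running, whereas the threadpool example calls pool.join(). The following is a minimal sketch, using only the standard library, of collecting the `JoinHandle`s and joining them before moving on. `scan_one` is a hypothetical stand-in for the `rules.scanner()` / `scan_file` calls and does not touch yara at all; the sketch is not claimed to change the memory behaviour discussed in this issue.

```rust
use std::thread;

/// Hypothetical stand-in for the per-file scan. It only returns the length
/// of the path so that the example stays self-contained.
fn scan_one(path: &str) -> usize {
    path.len()
}

/// Spawns one thread per path, waits for all of them to finish, and returns
/// how many threads completed without panicking.
fn scan_all(paths: Vec<String>) -> usize {
    let handles: Vec<thread::JoinHandle<usize>> = paths
        .into_iter()
        .map(|path| thread::spawn(move || scan_one(&path)))
        .collect();

    // `join` blocks until the thread exits; `Err` means the thread panicked.
    handles
        .into_iter()
        .filter_map(|handle| handle.join().ok())
        .count()
}
```

With this shape, the per-iteration counter can be printed after `scan_all` returns, so the reported number reflects scans that have actually finished.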

Is this still happening?

@Hugal31 I think I can close the issue. If something changes, I will reopen it :)