nabla-c0d3 / sslyze

Fast and powerful SSL/TLS scanning library.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Memory Leak still seems to be present when running in Python

lcurtis-datto opened this issue · comments

Looking at 560, it appeared that the memory leak was resolved. However, I find after running a large number of scans, memory usage continually increases until the Python script is killed by oom-killer. I have tried using multiprocess and running the sslyze scan in a child process, but I keep running into deadlocks when scanning larger lists of hosts.

To Reproduce
The condition is reproduce-able by running the great example code in 560:

import os
import gc
import psutil
from sslyze import Scanner, ServerScanRequest, ServerNetworkLocation


def print_memory_used(msg):
    object_count = len(gc.get_objects())
    process = psutil.Process(os.getpid())
    memory_used = process.memory_info().rss
    print(f"{msg}: object count: {object_count}, memory used: {memory_used / 1024**2} MB")


def sslyze_scan(hostname, port):
    results = list()
    request = ServerScanRequest(ServerNetworkLocation(hostname=hostname, port=port))
    scanner = Scanner()
    scanner.queue_scans([request])
    results = list(scanner.get_results())

for i in range(1,5):
    print_memory_used(f"before run {i}")
    sslyze_scan("mozilla.com", 443)
    print_memory_used(f"after run {i}")

Expected behavior
Memory usage should reset on each run, but output shows:

before run 1: object count: 35980, memory used: 37.78515625 MB
after run 1: object count: 45321, memory used: 59.171875 MB
before run 2: object count: 45321, memory used: 59.171875 MB
after run 2: object count: 52064, memory used: 69.640625 MB
before run 3: object count: 52064, memory used: 69.640625 MB
after run 3: object count: 52101, memory used: 79.33984375 MB
before run 4: object count: 52101, memory used: 79.85546875 MB
after run 4: object count: 45080, memory used: 86.80078125 MB

If I do the same thing using multiprocessing, the memory is released.

before run 1: object count: 36584, memory used: 23.13671875 MB
after run 1: object count: 45749, memory used: 51.92578125 MB
before run 2: object count: 36597, memory used: 23.13671875 MB
after run 2: object count: 45759, memory used: 51.9296875 MB
before run 3: object count: 36609, memory used: 23.13671875 MB
after run 3: object count: 45768, memory used: 52.19140625 MB
before run 4: object count: 36621, memory used: 23.13671875 MB
after run 4: object count: 45777, memory used: 52.00390625 MB

Python environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Python version: Python 3.8.10
  • SSLYZE Version: 5.1.3
  • NASSL Version: 5.0.1

Additional context
I have tried using multiprocessing on main internal code, but run into deadlocks when scanning larger groups of hosts:

FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY

I am continuing to investigate other alternatives to avoid increasing memory usage on subsequent scans. Any assistance or advice is greatly appreciated.