hashjeeves

A caching VirusTotal lookup API.

But why?

VirusTotal has API limits even in the "unlimited" mode, furthermore it takes time to make those lookups. Which means that if you are enriching a larger data set, then you are spending a lot of time waiting. This microservice does two things:

Uses concurrency to look up a lot of hashes quickly
Uses a cache to see if a lookup was already done. If it is and cache isn't stale (you tell it how old is acceptable), it will return from cache. This saves time and also API limits when you look up previously analysed data. For example, if you look up all executables from multiple systems.

API spec

Running

uvicorn api:app or use docker-compose

Configuration via environ

Name	Default	Description
REDIS_URL	redis://localhost:6379/0	URL to Redis cache
VT_API_KEY	(blank)	Set default VT key (optional)
MAX_VT_REQUESTS	100	Maximum simultaneous VT requests
MAX_TOTAL_REQUESTS	500	Maximum concurrency limit (set to about 5x of VT )

How to invoke it?

import requests

with open(r"hashes.txt","r") as f:
    hashes=[x.strip() for x in f]

stats_only="true" # Only return detection stats
                  # Set to "false" to get back full VT response
                  
requests.post(f"http://localhost:8000/api/lookup?stats_only={stats_only}",
                    json=hashes, 
                    headers= {'x-apikey': 'my-vt-api-key'})

About

A caching VirusTotal lookup API

BSD 3-Clause "New" or "Revised" License

Languages

Language:Python 89.2%Language:Dockerfile 10.8%