ChatNoir's repositories
chatnoir-resiliparse
A robust web archive analytics toolkit
web-content-extraction-benchmark
Web Content Extraction Benchmark
chatnoir2-indexer
ChatNoir Indexer
chatnoir-copycat
CopyCat is a resource for deduplication in TREC-style experimental setups.
chatnoir2-webclient
ChatNoir Web Frontend
chatnoir-warc-dl
A pipeline that extracts data from WARC files on a CPU cluster and streams it to a GPU server for processing.
chatnoir-api
🔍 Simple, type-safe access to the ChatNoir search API.
chatnoir2-mapfile-generator
ChatNoir HDFS Map File Generator
chatnoir-pyterrier
🔍 Use the ChatNoir search engine in PyTerrier.
webis-uuid
Webis UUID Generation Tool
chatnoir-warc-indexer
ChatNoir Indexer