Daniel Tiarks's starred repositories
LLM_convert_receipt_image-to-json_or_xml
Fine-tune an LLM to convert an invoice or receipt image into a receipt XML or JSON object.
Brazilian-Identity-Document-Dataset
Brazilian Identity Document Dataset (BID Dataset): The first public dataset of Brazilian identification documents.
fsdp_qlora
Training LLMs with QLoRA + FSDP
streamlit-drawable-canvas
Do you like Quick, Draw? Well, what if you could train/predict doodles drawn inside Streamlit? Also draws lines, circles and boxes over background images for annotation.
streamlit-cropper
A simple image cropper for Streamlit
kernel_tuner
Kernel Tuner
polars-fuzzy-match
Polars extension for fzf-style fuzzy matching
cookiecutter-hypermodern-python
Hypermodern Python Cookiecutter
sslcontext-kickstart
🔐 A lightweight, high-level library for configuring an HTTP client or server based on SSLContext or other properties such as TrustManager, KeyManager or Trusted Certificates to communicate over SSL/TLS with one-way or two-way authentication provided by the SSLFactory. Supports Java, Scala and Kotlin based clients with examples. Available client examples are: Apache HttpClient, OkHttp, Spring RestTemplate, Spring WebFlux WebClient Jetty and Netty, the old and the new JDK HttpClient, the old and the new Jersey Client, Google HttpClient, Unirest, Retrofit, Feign, Methanol, Vertx, Scala client Finagle, Featherbed, Dispatch Reboot, AsyncHttpClient, Sttp, Akka, Requests Scala, Http4s Blaze, Kotlin client Fuel, http4k, Kohttp and Ktor. gRPC, WebSocket and ElasticSearch examples are also included.
regex-constrained-decoding
Fast, High-Fidelity LLM Decoding with Regex Constraints
haystack
🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
json_repair
A Python module to repair broken JSON, very useful with LLMs
tantivy-py
Python bindings for Tantivy
seaweedfs
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.