Daniel van Strien (davanstrien)

Company: Hugging Face

Location: United Kingdom

Home Page: https://danielvanstrien.xyz/

Twitter: @vanstriendaniel


Organizations
AI4LAM
carpentries-incubator
Hugging-Face-Supporter
huggingface
Living-with-machines

Daniel van Strien's repositories

awesome-synthetic-datasets

awesome synthetic (text) datasets

Language: Jupyter Notebook, License: CC-BY-SA-4.0, Stargazers: 72, Issues: 0, Issues: 0

haiku-dpo

Using open source LLMs to build synthetic datasets for direct preference optimization

Language: Jupyter Notebook, Stargazers: 23, Issues: 2, Issues: 1

flyswot

Command Line Interface for running 🤗 Transformers Image Classification locally

Language: Python, License: MIT, Stargazers: 18, Issues: 3, Issues: 42

huggingface-tldr

Experimental tl;dr summaries for datasets on the Hugging Face Hub!

Language: JavaScript, Stargazers: 8, Issues: 0, Issues: 0
Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 7, Issues: 2, Issues: 32

auto_dataset_card

Wouldn't it be nice to generate parts of our dataset card automagically?

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 5, Issues: 2, Issues: 0

Python-introduction-for-digital-collections

Workshop materials on Python as part of a series of Library Carpentry workshops at the British Library

Language: Jupyter Notebook, Stargazers: 5, Issues: 2, Issues: 2

hugit-cli

Push ImageFolder-style image datasets to the 🤗 Hub from the command line

Language: Python, License: MIT, Stargazers: 2, Issues: 2, Issues: 1

LLM-pubmed-query-generation-evaluation

LLM PubMed Query Generation Evaluation

Stargazers: 2, Issues: 0, Issues: 0
Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0
License: MIT, Stargazers: 0, Issues: 2, Issues: 0

argilla

Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

arxiv.py

Python wrapper for the arXiv API

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

awesome-list

Awesome AI in Libraries

License: CC0-1.0, Stargazers: 0, Issues: 1, Issues: 0

BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0
License: NOASSERTION, Stargazers: 0, Issues: 1, Issues: 0

Computer-Vision-for-the-Humanities-workshop

Computer Vision for the Humanities workshop

Language: Jupyter Notebook, License: CC-BY-4.0, Stargazers: 0, Issues: 0, Issues: 0
Stargazers: 0, Issues: 0, Issues: 0

data-preparation

Code used for sourcing and cleaning the BigScience ROOTS corpus

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

distilabel

⚗️ AI Feedback framework for scalable LLM alignment

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 1, Issues: 0

Fin-Fact

A Benchmark Dataset for Multimodal Scientific Fact Checking

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

gahd

GAHD: A German Adversarial Hate speech Dataset

License: CC-BY-4.0, Stargazers: 0, Issues: 0, Issues: 0

huggingface_hub

All the open source things related to the Hugging Face Hub.

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

iiif2annos

OCR IIIF images in a manifest and generate annotations

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

jekyll

Static site version of Programming Historian

Language: HTML, Stargazers: 0, Issues: 2, Issues: 0

kraken

OCR engine for all the languages

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 1, Issues: 0

ml-kge

Multilingual Knowledge Graph Enhancement (EMNLP 2023)

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 1, Issues: 0

monitor-prompts-hf

A Gradio app to monitor the annotation effort of users working on the prompt dataset through the Argilla HF Space

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

Website-Classification

Trying to classify web archives using metadata...

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 0, Issues: 0