Daniel Servén's starred repositories
the-algorithm
Source code for Twitter's Recommendation Algorithm
super-gradients
Easily train or fine-tune SOTA computer vision models with one open-source training library. The home of YOLO-NAS.
deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
segment-geospatial
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
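A minimal usage sketch of the package's high-level workflow, following the project's published examples; the SamGeo class, the generate() and tiff_to_vector() calls, and the checkpoint and file names should be treated as assumptions rather than a verified API reference:

    # Sketch: automatic mask generation on a GeoTIFF with samgeo.
    # Class, arguments, and method names follow the project's examples
    # but are assumptions, not verified here.
    from samgeo import SamGeo

    sam = SamGeo(
        model_type="vit_h",                # SAM backbone variant
        checkpoint="sam_vit_h_4b8939.pth", # fetched automatically if absent
    )

    # Segment everything in a georeferenced image, save the masks as a
    # raster, then vectorize them for use in GIS tools.
    sam.generate(source="satellite.tif", output="masks.tif")
    sam.tiff_to_vector("masks.tif", "masks.gpkg")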
stable-diffusion-tensorflow
Stable Diffusion in TensorFlow / Keras
deepscatter
Zoomable, animated scatterplots in the browser that scale to over a billion points
poetry-dynamic-versioning
Plugin for Poetry to enable dynamic versioning based on VCS tags
SpanMarkerNER
SpanMarker for Named Entity Recognition
gnome-shell-extension-alt-tab-scroll-workaround
Quick fix for the bug where scrolling in one application is repeated in another after switching between them with Alt+Tab (e.g., VS Code and Chrome)
social-media-tutorials
Code dumps from YouTube/Twitter tutorials
flash-genomics-model
My own attempt at a long-context genomics model, leveraging recent advances in long-context attention modeling (Flash Attention + other hierarchical methods)
sequence-learn
With sequence-learn, you can build models for named entity recognition as quickly as if you were building a scikit-learn classifier.
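To illustrate the kind of fit/predict workflow that claim alludes to, here is a self-contained toy token tagger written with plain scikit-learn; it is not sequence-learn's own API, just a sketch of the sklearn-style interface the description promises:

    # A crude token-level NER tagger built with plain scikit-learn, to
    # show the fit/predict workflow; not sequence-learn's actual API.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy training data: one feature dict per token, one BIO tag per token.
    tokens = [
        {"word": "Barack", "is_title": True},
        {"word": "Obama", "is_title": True},
        {"word": "visited", "is_title": False},
        {"word": "Paris", "is_title": True},
    ]
    tags = ["B-PER", "I-PER", "O", "B-LOC"]

    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(tokens, tags)  # same interface as any sklearn classifier
    print(model.predict([{"word": "Berlin", "is_title": True}]))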
E3C-Corpus
E3C is a freely available multilingual corpus (Italian, English, French, Spanish, and Basque) of semantically annotated clinical narratives for the linguistic analysis, benchmarking, and training of information extraction systems. It consists of two types of annotations: (i) clinical entities (pathologies, symptoms, procedures, body parts, etc.) according to standard clinical taxonomies (e.g., SNOMED-CT, ICD-10); and (ii) temporal information and factuality (events, time expressions, and temporal relations) according to the THYME standard. The corpus is organised into three layers with different purposes.
Layer 1: about 25K tokens per language with full manual annotation of clinical entities, temporal information, and factuality, for benchmarking and linguistic analysis.
Layer 2: 50-100K tokens per language with semi-automatic annotations of clinical entities, to be used to train baseline systems.
Layer 3: about 1M tokens per language of non-annotated medical documents, to be exploited by semi-supervised approaches.
Researchers can use the benchmark training and test splits of the corpus to develop and test their own models. We trained several deep-learning-based models and provide baselines on the benchmark. Both the corpus and the trained models will be available through the ELG platform.
bulk-labeling
A tool for quickly adding labels to unlabeled datasets
mlconfound
Tools for analyzing and quantifying effects of confounder variables on machine learning model predictions.