There are 1,842 repositories under the data topic.
LlamaIndex is a data framework for your LLM applications
This is a repo with links to everything you'd ever want to learn about data engineering
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
A curated list of awesome big data frameworks, resources and other awesomeness.
:orange_book: Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
A web interface to create custom vector-based visualizations on top of RAWGraphs core
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Interactive Tables and Data Grids for JavaScript
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
The open source high performance ELT framework powered by Apache Arrow
Countly is a product analytics platform that helps teams track, analyze and act on their users' actions and behaviour on mobile, web and desktop applications.
A next-generation curated knowledge sharing platform for data scientists and other technical professionals.
This repository contains compatibility data for Web technologies as displayed on MDN
Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.