Sonal (sonalgoyal)

Company: https://github.com/zinggAI/zingg

Location: India

Twitter: @sonalgoyal


Organizations
zinggAI

Sonal's starred repositories

metabase

The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:

Language: Clojure | License: NOASSERTION | Stargazers: 37450 | Issues: 641 | Issues: 19334

redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Language: Python | License: BSD-2-Clause | Stargazers: 25475 | Issues: 574 | Issues: 2481

duckdb

DuckDB is an analytical in-process SQL database management system

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Language: Python | License: Apache-2.0 | Stargazers: 9377 | Issues: 140 | Issues: 5281

cleanlab

The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.

Language: Python | License: AGPL-3.0 | Stargazers: 9166 | Issues: 89 | Issues: 361

toaruos

A completely-from-scratch hobby operating system: bootloader, kernel, drivers, C library, and userspace including a composited graphical UI, dynamic linker, syntax-highlighting text editor, network stack, etc.

snorkel

A system for quickly generating training data with weak supervision

Language: Python | License: Apache-2.0 | Stargazers: 5764 | Issues: 167 | Issues: 980

OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

Language: TypeScript | License: Apache-2.0 | Stargazers: 4895 | Issues: 46 | Issues: 6826

lakeFS

lakeFS - Data version control for your data lake | Git for data

Language: Go | License: Apache-2.0 | Stargazers: 4242 | Issues: 42 | Issues: 3203

argilla

Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.

Language: Python | License: Apache-2.0 | Stargazers: 3673 | Issues: 30 | Issues: 2060

promote-your-next-startup

🚀 Free resources you may use to promote your next startup

awesome-explainable-graph-reasoning

A collection of research papers and software related to explainability in graph machine learning.

elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

Language: HTML | License: Apache-2.0 | Stargazers: 1833 | Issues: 8 | Issues: 538

querybook

Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.

Language: TypeScript | License: Apache-2.0 | Stargazers: 1829 | Issues: 34 | Issues: 218

DataProfiler

What's in your data? Extract schema, statistics and entities from datasets

Language: Python | License: Apache-2.0 | Stargazers: 1393 | Issues: 21 | Issues: 181

zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Language: Java | License: AGPL-3.0 | Stargazers: 925 | Issues: 20 | Issues: 449

dbt-fal

Do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies, and build machine learning models.

Language: Python | License: Apache-2.0 | Stargazers: 854 | Issues: 22 | Issues: 136

kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.

Language: JavaScript | License: Apache-2.0 | Stargazers: 779 | Issues: 13 | Issues: 72

flow

🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊

Language: C++ | License: NOASSERTION | Stargazers: 544 | Issues: 11 | Issues: 345

schemata

Schema modelling framework for decentralised domain-driven ownership of data.

Language: Java | License: Apache-2.0 | Stargazers: 242 | Issues: 8 | Issues: 12

SubTab

The official implementation of the paper, "SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning"

Language: Python | License: Apache-2.0 | Stargazers: 140 | Issues: 3 | Issues: 8

spear

SPEAR: Programmatically label and build training data quickly.

Language: Jupyter Notebook | License: MIT | Stargazers: 103 | Issues: 10 | Issues: 2

analytics

Open source data models and analysis.

er-evaluation

An End-to-End Evaluation Framework for Entity Resolution Systems

Language: Python | License: AGPL-3.0 | Stargazers: 23 | Issues: 2 | Issues: 4

fal_dbt_examples

Examples showing real-life use cases for fal + dbt

Language: Jupyter Notebook | Stargazers: 22 | Issues: 3 | Issues: 1

customer-er

Translating text attributes (like name, address, phone number) into quantifiable numerical representations; training ML models to determine if these numerical labels form a match; scoring the confidence of each match.

Language: Python | License: NOASSERTION | Stargazers: 19 | Issues: 2 | Issues: 3
Stargazers: 19 | Issues: 0 | Issues: 0

snowpark-java-scala

Snowflake Snowpark Java & Scala API

Language: Scala | License: Apache-2.0 | Stargazers: 16 | Issues: 11 | Issues: 34

spark-connect-example

An example of SparkConnect extension.

Language: Java | Stargazers: 9 | Issues: 0 | Issues: 0

product-er-with-images

Zingg fuzzy matching for products using metadata and images

Language: Python | License: NOASSERTION | Stargazers: 5 | Issues: 1 | Issues: 2