Sundaram Surampudi's repositories
awesome-dataops
😎 A curated list of awesome DataOps tools
awesome-healthcare
Curated list of awesome open source healthcare software, libraries, tools and resources.
awesome-local-ai
An awesome repository of local AI tools
best-of-ml-python
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
CogVLM
A state-of-the-art open visual language model | Multimodal pre-trained model
DataEngineering_Projects
Data Engineering Portfolio
deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
dep-scan
OWASP dep-scan is a next-generation security and risk audit tool based on known vulnerabilities, advisories, and license limitations for project dependencies. Both local repositories and container images are supported as the input, and the tool is ideal for integration.
feathr
Feathr – A scalable, unified data and AI engineering platform for enterprise
gpt-engineer
Specify what you want it to build, the AI asks for clarification, and then builds it.
griptape
Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
grok-1
Grok open release
labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
lakehouse-tacklebox
This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse.
Large-Language-Model-Notebooks-Course
Practical course about Large Language Models.
mango
Parallel Hyperparameter Tuning in Python
more-itertools
More routines for operating on iterables, beyond itertools
odd-platform
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
operator-lifecycle-manager
A management framework for extending Kubernetes with Operators
pathway
Pathway is a high-throughput, low-latency data processing framework that handles live data & streaming for you. Made with ❤️ for Python & ML/AI developers.
PolarsVsPySpark
Can Polars crunch 27 GB of data faster than PySpark?
pravega
Pravega - Streaming as a new software-defined storage primitive
pyCirclize
Circular visualization in Python (Circos Plot, Chord Diagram)
Stirling-PDF
Locally hosted web application that lets you perform various operations on PDF files
swiple
Swiple enables you to easily observe, understand, validate and improve the quality of your data
ubicloud
Open, free, and portable cloud. Elastic compute, block storage (non-replicated), virtual networking, managed Postgres, and IAM services in public beta.
vectordb-recipes
High quality resources & applications for LLMs, multi-modal models and VectorDBs