There are 15 repositories under the data-transformation topic.
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Logical Replication extension for PostgreSQL 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
A block-based API for NSValueTransformer, with a growing collection of useful examples.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
A simple Spark-powered ETL framework that just works 🍺
A curated list of Clojure resources for dealing with domain-specific languages.
Data transformation and utility functions for R
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
A visual data pipeline builder with various backends
Wrangler Transform: A DMD system for transforming Big Data
A schema-aware Scala library for data transformation
Reference Architectures for Datalakes on AWS
Data transformation toolkit
breadroll 🥟 is a simple lightweight library for data processing operations written in TypeScript and powered by Bun.
Examples for working with DataWeave scripts from Apex.
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Daany - DAta ANalYtics .NET library with implementations of DataFrame, time series decomposition, and the linear algebra routines BLAS and LAPACK.
object flow treatment, data transformation
⚡️ Next-generation data transformation framework for TypeScript that puts developer experience first
Serialize PHP variables, including objects, in any format. Supports unserializing them too.
A tool to read CSV files with CSVW metadata and transform them into other formats.
Functional utilities for Common Lisp
A PHP serialization component focused on performance