avro

There are 3 repositories under the avro topic.

apache / avro
Apache Avro is a data serialization system.
csharp ruby cplusplus python php java c avro bigdata dotnet perl rust
Language:Java 2833
dflemstr / rq
Record Query - A tool for doing record analysis and transformation
rust protobuf command-line-tool json avro messagepack yaml toml javascript lodash
Language:Rust 2265
confluentinc / schema-registry
Confluent Schema Registry for Kafka
schema-registry kafka schema schemas avro avro-schema rest-api confluent protobuf protobuf-schema json json-schema
Language:Java 2172
confluentinc / examples
Apache Kafka and Confluent Platform examples and demos
avro cdc cloud confluent connect connector debezium demo docker examples jdbc kafka ksql kubernetes microservices monitoring quickstart replicator schema-registry sql
Language:Shell 1880
capitalone / DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
python privacy pii npi nlp data-science gdpr data-analysis data-labels avro dataprofiling sensitive-data security pandas csv tabular-data dataset network-data graph-data machine-learning
Language:Python 1392
mtth / avsc
Avro for JavaScript ⚡
avro big-data binary-format encoding javascript schema-evolution serialization typescript
Language:JavaScript 1257
pmacct / pmacct
pmacct is a small set of multi-purpose passive network monitoring tools [NetFlow IPFIX sFlow libpcap BGP BMP RPKI IGP Streaming Telemetry].
netflow ipfix sflow bgp bmp kafka rabbitmq libpcap nflog geoip2 ndpi mysql postgresql sqlite3 sql json avro rpki pmacct
Language:C 1029
bigdatagenomics / adam
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
avro big-data bioinformatics genomics java parquet python r scala spark
Language:Scala 968
OBenner / data-engineering-interview-questions
More than 2000 data engineer interview questions.
data-engineering interview-questions interview hadoop hadoop-hdfs spark flink sql kafka hive impala airflow aws azure cassandra flume hbase avro nifi data-structures
965
deviceinsight / kafkactl
Command Line Tool for managing Apache Kafka
apache-kafka avro cli kafka golang zsh fish
Language:Go 786
Cinchoo / ChoETL
ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
csv parser writer reader flat xml json keyvalue etl cinchoo-etl etl-framework csharp dotnet parquet parquet-files yaml avro
Language:C# 759
HariSekhon / DevOps-Python-tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
cloudformation python hbase json avro parquet spark pyspark travis-ci elasticsearch solr hadoop hdfs dockerhub docker linux aws devops gcp gcf
Language:Python 745
miguno / kafka-storm-starter
[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
apache-kafka kafka apache-storm storm spark apache-spark scala integration avro apache-avro
Language:Scala 726
thekvs / cpp-serializers
Benchmark comparing various data serialization libraries (thrift, protobuf etc.) for C++
cpp serialization protobuf capn-proto thrift flatbuffers cereal performance-testing boost msgpack avro apache-avro c-plus-plus yas
Language:C++ 717
sksamuel / avro4s
Avro schema generation and serialization / deserialization for Scala
avro avro-schema coproduct scala scala-macros schema-generation serialization
Language:Scala 714
RandomFractals / vscode-data-preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
vscode extension csv arrow data perspective viewer json array config yaml excel avro parquet
Language:TypeScript 540
Netflix / iceberg
Iceberg is a table format for large, slow-moving tabular data
spark hadoop parquet avro
Language:Java 469
zarusz / SlimMessageBus
Lightweight message bus interface for .NET (pub/sub and request-response) with transport plugins for popular message brokers.
message-bus messaging azure-event-hubs pub-sub request-response c-sharp azure-service-bus redis apache-kafka kafka bus azure avro rabbitmq ddd mqtt dotnet
Language:C# 455
lensesio / schema-registry-ui
Web tool for Avro Schema Registry
schema-registry kafka avro
Language:JavaScript 417
only-cliches / NoProto
Flexible, Fast & Compact Serialization with RPC
schemas protocol-buffers flatbuffers deserialization zero-copy serialization rpc json bson messagepack avro apache-avro flexbuffers data-buffers databases
Language:Rust 372
hamba / avro
A fast Go Avro codec
golang avro encoder-decoder avro-codec
Language:Go 355
mongodb / mongo-kafka
MongoDB Kafka Connector
mongodb kafka kafka-connect sink sink-connector source source-connector cdc bson avro confluent-hub
Language:Java 338
spotify / ratatool
A tool for data sampling, data generation, and data diffing
scala scalacheck avro parquet bigquery protobuf
Language:Scala 338
uber / storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
mysql kafka avro cdc etl json msgpack hdfs s3 postgresql clickhouse
Language:Go 336
higherkindness / mu-haskell
Mu (μ) is a purely functional framework for building micro services.
haskell avro grpc protocol-buffers defines-schemas mu monads rpc type-level-programming type-level mu-graphql mu-haskell graphql hacktoberfest
Language:Haskell 329
davidmc24 / gradle-avro-plugin
A Gradle plugin to allow easily performing Java code generation for Apache Avro. It supports JSON schema declaration files, JSON protocol declaration files, and Avro IDL files.
gradle-plugin java groovy avro
Language:Java 319
streamthoughts / kafka-connect-file-pulse
🔗 A multipurpose Kafka Connect connector that makes it easy to parse, transform and stream any file, in any format, into Apache Kafka
amazon-s3 avro azure-storage csv etl file-streaming google-cloud grok-filters kafka kafka-connect kafka-connector kafka-producer xml
Language:Java 318
FasterXML / jackson-dataformats-binary
Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
avro cbor protobuf smile jackson-backends hacktoberfest
Language:Java 306
divolte / divolte-collector
Divolte Collector
divolte-collector hdfs kafka gcs java avro pubsub clickstream analytics analytics-tracking
Language:Java 283
Eugene-Mark / bigdata-file-viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
bigdata parquet orc avro hdfs
Language:Java 282
riferrei / srclient
Golang Client for Schema Registry
schema registry apache kafka avro go codec confluent cloud
Language:Go 226
AbsaOSS / ABRiS
Avro SerDe for Apache Spark structured APIs.
avro-schema schema-registry spark kafka avro
Language:Scala 224
RumbleDB / rumble
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
spark json jsoniq data-science scale avro parquet dataframes query query-engine schemaless nested machine-learning svm csv text hdfs s3 azure yaml
Language:Java 209
marcosschroh / dataclasses-avroschema
Generate avro schemas from python classes. Code generation from avro schemas. Serialize/Deserialize python instances with avro schemas
avro apache-avro avro-schemas json-schema pydantic python3 serialization schema json code-generation model
Language:Python 208
Chabane / bigdata-playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
docker spark-sql scala kafka hbase parquet avro nodejs angular graphql mongodb machine-learning big-data hadoop apache-spark apache-flink spark-streaming twitter-api python kops
Language:TypeScript 205
Altinity / clickhouse-sink-connector
Replicate data from MySQL, Postgres and MongoDB to ClickHouse
avro kafka kafka-connect clickhouse postgresql replication cdc debezium mysql mongo sqlserver
Language:Python 202
