Repositories under the parquet-files topic:
High-performance Go package to read and write Parquet files
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Price Crawler - Tracking Price Inflation
Query and transform data with PRQL
A library for Spark DataFrame using MinIO Select API
A converter from OSM PBF files to Parquet files
High-performance data loader for OSM planet dumps. Transforms an OpenStreetMap world or region PBF dump into a lossless PostGIS pgsnapshot OSM schema representation partitioned by H3 regions, and/or into Arrow IPC/Parquet dumps
MongoDB integrations for Apache Arrow. Export MongoDB documents to numpy array, parquet files, and pandas dataframes in one line of code.
Library to read a subset of Parquet files
A lightweight Java library that facilitates reading and writing Apache Parquet files without Hadoop dependencies
Threat Detection and Visualization
Scala ZIO-powered Apache Parquet library
Converts between file formats such as CSV and Parquet
A web application for viewing Apache Parquet files, built with Python and Flask
Streams Tiki data with Kafka, processes it with Spark, and visualizes it with Elasticsearch and Kibana.
Handle Big Data for Machine Learning using Python and PySpark, Building ETL Pipelines with PySpark, MongoDB, and Bokeh
☕️ Tools to Transform and Query Data with 'Apache' 'Drill'
VS Code extension for querying Parquet files with SQL and visualizing them
Streams Kafka events in Avro format using Spark and saves them in Parquet format
A fast and simple command-line (CLI) tool to convert a Parquet file to an Apache Arrow file
ETL job with AWS Glue
Apache Spark application to get the top ten frequent routes and profitable areas
A command-line tool for inspecting Parquet files with PyArrow.
Scala code to read Parquet files as streams in Spark Streaming using Avro.
DDIA Course Project
Node-RED contrib that converts between a Parquet string and its JavaScript object representation, in either direction.
Merge Parquet Files on S3 with this AWS Lambda Function
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
Simple and small CLI to work with Parquet files
Load data from the Million Song Dataset into a final dimensional model stored in S3.
Explore factors associated with Malware Infection using Spark SQL