There are 2 repositories under the large-data topic.
C++ DataFrame for statistical, financial, and ML analysis -- in modern C++ using native types and contiguous memory storage
A Kafka Serde that reads and writes records from and to Blob storage (S3, Azure, Google) transparently.
Tabular Data Viewer: VSCode extension for viewing very large local and remote CSV and TSV data files with Tabulator Table, Perspective Datagrid and D3FC Chart Views
A Ruby on Rails app to generate Fizzbuzz numbers up to 100,000,000,000
Wrappers that adapt single-instance learning algorithms so they can be fitted to multiple-instance learning data
This repository contains an introduction to pandas, a software library written for the Python programming language for data manipulation and analysis.
Using Python to analyze large datasets