nealrichardson / feather

Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow

Feather Development is in Apache Arrow now

Feather development lives on in Apache Arrow. The arrow R package includes a much faster implementation of Feather (see arrow::read_feather). The Python package feather is now a wrapper around pyarrow.feather.

Feather: fast, interoperable data frame storage

Feather provides binary columnar serialization for data frames. It is designed to make reading and writing data frames efficient, and to make sharing data across data analysis languages easy. This initial version comes with bindings for Python (written by Wes McKinney) and R (written by Hadley Wickham).

Feather uses the Apache Arrow columnar memory specification to represent binary data on disk. This makes read and write operations very fast, which is particularly important for encoding null/NA values and variable-length types like UTF-8 strings.
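To make the null and variable-length encoding more concrete, here is a simplified pure-Python sketch of how an Arrow-style UTF-8 string column can be laid out as three buffers: a validity bitmap, an offsets buffer, and a data buffer. The function names encode_utf8_column and decode_value are hypothetical, and the sketch leaves out the padding and alignment that real Arrow buffers use. It illustrates the idea and is not the Feather implementation.

```python
import struct


def encode_utf8_column(values):
    """Encode a list of str-or-None into (validity, offsets, data) buffers."""
    n = len(values)
    validity = bytearray((n + 7) // 8)  # one bit per value, 1 means "not null"
    offsets = [0]                       # n + 1 offsets into the data buffer
    data = bytearray()
    for i, value in enumerate(values):
        if value is not None:
            validity[i // 8] |= 1 << (i % 8)  # least-significant bit first
            data += value.encode("utf-8")
        offsets.append(len(data))  # a null contributes a zero-length slot
    packed_offsets = struct.pack(f"<{n + 1}i", *offsets)  # little-endian int32
    return bytes(validity), packed_offsets, bytes(data)


def decode_value(buffers, i):
    """Read value i without scanning the values before it."""
    validity, offsets, data = buffers
    if not (validity[i // 8] >> (i % 8)) & 1:
        return None
    start, end = struct.unpack_from("<2i", offsets, 4 * i)
    return data[start:end].decode("utf-8")


buffers = encode_utf8_column(["a", None, "h\u00e9llo"])
print([decode_value(buffers, i) for i in range(3)])
```

Reading any single value needs only one bit from the bitmap and two adjacent offsets, so strings of different lengths and missing values do not force a scan of the column.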

Feather is a part of the broader Apache Arrow project. Feather defines its own simplified schemas and metadata for on-disk representation.

Feather currently supports the following column types:

  • A wide range of numeric types (int8, int16, int32, int64, uint8, uint16, uint32, uint64, float, double).
  • Logical/boolean values.
  • Dates, times, and timestamps.
  • Factors/categorical variables that have a fixed set of possible values.
  • UTF-8 encoded strings.
  • Arbitrary binary data.

All column types support NA/null values.

Installation

Python

pip install feather-format

R

install.packages("feather")

Julia

julia> using Pkg
julia> Pkg.add("Feather")

License and Copyrights

This library is released under the Apache License, Version 2.0.

See NOTICE for details about the library's copyright holders.

Getting started

Python

To get started with the Python bindings, see the Python feather documentation.

R

To get started with the R bindings, see the R feather documentation.

Julia

To get started with the Julia bindings, see Feather.jl.
