dafrenchyman / pycorr

Correlation between categorical variables

PyCorr

A simple library to calculate correlation between variables. Currently provides correlation between nominal variables.

It uses statistical measures such as Cramér's V and Tschuprow's T to gauge the correlation between categorical variables. It can also plot the correlations as a heatmap.
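To make the two measures concrete, here is a minimal sketch in plain Python of the textbook formulas, V = sqrt(chi2 / (n * min(r-1, c-1))) and T = sqrt(chi2 / (n * sqrt((r-1)(c-1)))). It is not PyCorr's implementation, which may differ (for example by applying a bias correction), and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def chi_squared(xs, ys):
    """Pearson chi-squared statistic for two equal-length categorical sequences.

    Returns (chi2, n, number of x categories, number of y categories).
    """
    n = len(xs)
    observed = Counter(zip(xs, ys))   # joint cell counts
    row_totals = Counter(xs)          # marginal counts for xs
    col_totals = Counter(ys)          # marginal counts for ys
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed.get((x, y), 0) - expected) ** 2 / expected
    return chi2, n, len(row_totals), len(col_totals)


def cramers_v(xs, ys):
    """Uncorrected Cramér's V (hypothetical helper, not part of PyCorr)."""
    chi2, n, r, c = chi_squared(xs, ys)
    k = min(r - 1, c - 1)
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def tschuprows_t(xs, ys):
    """Uncorrected Tschuprow's T (hypothetical helper, not part of PyCorr)."""
    chi2, n, r, c = chi_squared(xs, ys)
    k = sqrt((r - 1) * (c - 1))
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Same toy data as the usage example in this README.
dogs = ['a', 'a', 'c', 'e']
cats = ['b', 'd', 'b', 'd']
print(cramers_v(dogs, cats), tschuprows_t(dogs, cats))
```

Both measures range from 0 (no association) to 1. T can reach 1 only when the two variables have the same number of categories, which is why V is the more common choice for non-square tables.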

Usage example

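import pandas as pd
from pycorrcat.pycorrcat import plot_corr, corr_matrix

# Toy dataset with two categorical (nominal) columns
df = pd.DataFrame([('a', 'b'), ('a', 'd'), ('c', 'b'), ('e', 'd')],
                  columns=['dogs', 'cats'])

# Compute the pairwise correlation matrix for the selected columns
correlation_matrix = corr_matrix(df, ['dogs', 'cats'])
# Plot the same correlations as a heatmap
plot_corr(df, ['dogs', 'cats'])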

Development setup

Create a virtualenv, install the dependencies from requirements.txt, and then make your code changes.

Release History

  • 0.1.4
    • CHANGE: Changed the documentation (no code change)
  • 0.1.3
    • ADD: Ability to pass dataframe to get correlation matrix
    • ADD: Ability to plot the correlation in form of heatmap
  • 0.1.2
    • Added as first release
  • 0.1.1
    • Test release

Author and Contributor

Anurag Kumar Mishra – Connect on GitHub or send an email

Distributed under the GNU license. See LICENSE for more information.

GitHub repo: https://github.com/MavericksDS/pycorr

Contributing

  1. Fork it (https://github.com/MavericksDS/pycorr)
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Commit your changes (git commit -am 'Add some fooBar')
  4. Push to the branch (git push origin feature/fooBar)
  5. Create a new Pull Request

License: GNU General Public License v3.0