
Free Correspondence Analysis Python Software

Suitable for Users from Any Discipline

Description

Perform standard correspondence analysis of two categorical variables (code module ca.py in the folder Methods/).

The code can be used to perform correspondence analysis on any dataset that can be converted into a pandas DataFrame (see ca.py in the folder Methods/).
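For readers unfamiliar with the method, the following is a minimal sketch of standard correspondence analysis computed with a singular value decomposition. It is not the code in ca.py; the function name simple_ca, the table labels, and the counts are invented for illustration only.

```python
import numpy as np
import pandas as pd


def simple_ca(table: pd.DataFrame, n_components: int = 2):
    """Illustrative CA of a contingency table (hypothetical helper, not ca.py)."""
    N = table.to_numpy(dtype=float)
    P = N / N.sum()                  # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                # row masses
    c = P.sum(axis=0)                # column masses
    expected = np.outer(r, c)        # frequencies expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    # Principal coordinates of rows and columns on the first k dimensions
    rows = (U[:, :k] * sv[:k]) / np.sqrt(r)[:, None]
    cols = (Vt.T[:, :k] * sv[:k]) / np.sqrt(c)[:, None]
    inertia = sv ** 2                # principal inertias; their sum is chi-square / n
    names = [f"dim{i + 1}" for i in range(k)]
    return (
        pd.DataFrame(rows, index=table.index, columns=names),
        pd.DataFrame(cols, index=table.columns, columns=names),
        inertia,
    )


# Invented counts: three texts (rows) against three grammatical forms (columns).
table = pd.DataFrame(
    [[20, 5, 2], [8, 15, 6], [3, 7, 18]],
    index=["text_A", "text_B", "text_C"],
    columns=["form_1", "form_2", "form_3"],
)
row_coords, col_coords, inertia = simple_ca(table)
print(row_coords)
print(col_coords)
```

The total inertia multiplied by the number of observations equals the chi-square statistic of the table, which is one way to sanity-check such a computation.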

The method mcmca.py can be used for correspondence analysis of datasets that can be assumed to be generated by a Markov chain model.
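As a rough illustration of how sequential data can be brought into tabular form, the sketch below counts the observed transitions in a sequence of states. This is only an assumption about the general idea and does not describe what mcmca.py actually computes; the sequence and the variable names are invented.

```python
import pandas as pd

# Invented sequence of states, treated as one realisation of a Markov chain.
sequence = list("ABBACABBCA")

# Pair each state with its successor.
pairs = pd.DataFrame({"current": sequence[:-1], "next": sequence[1:]})

# Transition counts: rows are the current state, columns the next state.
transitions = pd.crosstab(pairs["current"], pairs["next"])

# Row-normalised counts give the empirical transition probabilities.
transition_probs = transitions.div(transitions.sum(axis=1), axis=0)

print(transitions)
print(transition_probs)
```

A table of counts like this has the same shape as a contingency table, so it can in principle be passed to a correspondence analysis.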

Specific Project

Project Ef5-4: "The evolution of Ancient Egyptian - Quantitative and Non-Quantitative Mathematical Linguistics".

Institutions: ZIB (Zuse Institute Berlin) & MATH+ (Berlin Mathematics Research Center).

Software requirements

Python version: 3.7 or later

packages: numpy, pandas, matplotlib, matplotlib.pyplot, matplotlib.backends.backend_pdf, scipy, scipy.stats, seaborn.

You can also install all of these with conda by creating a new environment from the spec file myPy3_spec.txt (for guidance, click here).

Usage requirement

See official publication link here

DOI: https://doi.org/10.12752/8257

Licence: Open Source Apache 2.0

Code Execution

Users with little to no background in Python

Helper.py: performs one correspondence analysis (in this specific project: text vs. grammatical form)

Please enter all the inputs by following the corresponding questions/descriptions.

implementation.py is required to obtain the CA figures.

Users with a moderate background in Python

implementation.py can be used to modify the default figure parameter settings. For further modifications, see the code in the folder Methods/.

Notes for all Users

If the dataset is already a contingency table, the parameter isCont must be set to True and the table must be converted into a pandas DataFrame (see the example cHelper.py).
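To make the distinction concrete, the sketch below builds the same contingency table in two ways: from raw categorical observations with pandas.crosstab, and directly as a DataFrame of counts. The labels and counts are invented, and the snippet does not show how isCont is passed to the project code (see cHelper.py for that).

```python
import pandas as pd

# Invented raw observations: one row per sentence (text label, grammatical form).
observations = pd.DataFrame({
    "text": ["A", "A", "B", "B", "B", "C"],
    "form": ["x", "y", "x", "x", "y", "y"],
})

# Raw categorical data turned into a contingency table of counts.
contingency = pd.crosstab(observations["text"], observations["form"])

# The same table written directly as a DataFrame: this is the
# "already a contingency table" case described above.
direct = pd.DataFrame(
    [[1, 1], [2, 1], [0, 1]],
    index=["A", "B", "C"],
    columns=["x", "y"],
)

print(contingency)
print(direct)
```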

Supported Data type (if not a contingency table)

Excel file. In our specific project, the data file contains a numerical coding of texts in Égyptien de Tradition; each data point is a ten-digit number encoding the grammatical structure of a sentence (files can be downloaded here).

You can also use your own Python function to clean your dataset instead of the function Cleaned_Data (implementation.py, line 9).