ledelaney / analyzing-ur-categorical-data

:open_file_folder: A collection of resources that may be helpful for analyzing your categorical data (including trait data).

Analysis of Categorical Data

This repo includes a collection of resources that may be helpful for learning about methods of categorical data analysis, including the specialized case of species data. Many materials are linked below, and others are included in the folders above.

For a general sense of what to do with your data and how to work with it in R, I found the vcdExtra tutorial and the Penn State course very helpful. The Agresti book is also extremely useful and comes with an associated R manual (both are in the general-materials folder). Humble word of advice: start with categorical data analysis in general, and then move on to caper or phylolm for phylogenetic methods. The main difference is that the phylogenetic methods use a phylogeny along with your data, which can complicate things.

Another humble word of advice: it matters whether your response variable is binary. If it is, you will need to use a specialized case of glm (family = "binomial") and, when a phylogeny is involved, a special function from the rr2 package (binaryPGLMM).
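As a rough illustration of the non-phylogenetic case only, here is a minimal sketch of fitting a binary response with base R's `glm()`. The data frame `toy` and its columns `habitat` and `present` are hypothetical and simulated purely for the example; they are not part of this repo.

```r
# Hypothetical toy data: a binary response and a nominal predictor.
set.seed(1)
toy <- data.frame(
  habitat = factor(rep(c("forest", "grassland", "wetland"), each = 20)),
  present = rbinom(60, size = 1, prob = rep(c(0.2, 0.5, 0.8), each = 20))
)

# Logistic regression: glm() with a binomial family (logit link by default).
fit <- glm(present ~ habitat, family = binomial, data = toy)

# Coefficients are on the log-odds scale; the first factor level is the baseline.
summary(fit)

# Fitted probabilities for each row of the data.
probs <- predict(fit, type = "response")
head(probs)
```

Passing `family = binomial` and `family = "binomial"` are equivalent here. A phylogenetic model would additionally need a tree, which this sketch deliberately leaves out.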

For visualizing the output, I cannot speak highly enough about vcdExtra (for non-phylogenetic data).

For analysis of general data

For analysis of species data

A note on data

These materials are designed specifically with nominal categorical variables in mind. Data like this has no order and is non-numeric (e.g., "married" or "divorced"). Discrete data is countable and numeric (like the number of times a coin landed on heads, or the number of customer complaints), but can be treated as categorical in some cases. Save yourself from my personal pitfalls and make sure you investigate what kind of data you have (discrete, continuous, ordinal, etc.) to ensure you are performing the proper tests.
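One way to check what kind of data you have is to look at how R stores each column. The sketch below uses a small hypothetical data frame (`toy`, with made-up columns `status`, `complaints`, and `size`) and only base R functions; treat it as an illustration, not a recipe.

```r
# Hypothetical toy data illustrating three kinds of variables.
toy <- data.frame(
  status     = c("married", "divorced", "married", "single"),  # nominal
  complaints = c(0L, 2L, 1L, 0L),                              # discrete count
  size       = c("small", "large", "medium", "small")          # ordinal
)

# Nominal: store as an unordered factor.
toy$status <- factor(toy$status)

# Ordinal: store as an ordered factor with an explicit level order.
toy$size <- factor(toy$size, levels = c("small", "medium", "large"), ordered = TRUE)

# Inspect how each column is stored, then count observations per category.
str(toy)
table(toy$status)

# A discrete count can be treated as categorical by converting it to a factor.
toy$complaints_cat <- factor(toy$complaints)
levels(toy$complaints_cat)
```

Whether a count should be treated as categorical depends on the question being asked, so the final conversion is a modelling choice rather than a default.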
