A holistic analysis of two datasets, covering the following:
- Visualizations through Excel
- Removing outliers from the data
- Data integrity/sanity checks and building metadata
- EDA (Including Dimensions of Data)
- Applying statistical tests to the data and either rejecting or failing to reject the null hypothesis
- Creating a histogram for all numerical variables
- Correlation Checks between variables
- Covariance checks between variables
- Applying ANOVA to the dataset
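
As a rough illustration of three of the steps listed above (outlier removal, covariance, and correlation), here is a minimal sketch in plain Python. It is not the project's own code, which appears to be written in R. The function names are hypothetical, and the 1.5 × IQR rule (Tukey's rule) is an assumed outlier criterion, since the method actually used in the project is not stated here.

```python
from statistics import mean, quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed Tukey rule)."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5
```

Covariance depends on the units of both variables, while correlation is unit-free and bounded between -1 and 1, which is why the two checks are listed separately.
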
Sources of Data:
1-4: https://data.world/finance/dc-purchase-orders-2014
5-9: https://data.world/finance/china-largest-companies
Refer to the 'R Project' .docx document for the sequence in which these resources should be read.