Sairaj Indroju's repositories
Rain-Prediction-DataAnalytics
Analyzed a 23-feature weather dataset to predict the 'RainTomorrow' target. Carried out data gathering, preprocessing, and feature selection. Evaluated a range of models (Logistic Regression, Random Forest, Decision Trees, K-means, K-nearest neighbors, Hierarchical clustering) and compared their performance using evaluation metrics.
TokyoOlympicDataAnalytics-Azure
Ingested Tokyo Olympic data into Azure Data Lake using Azure Data Factory. Improved data quality with Apache Spark on Azure Databricks. Optimized SQL queries on Synapse Analytics, reducing execution time. Developed Power BI dashboards with KPIs built in DAX to boost user engagement.
Ego-splitting-Framework-from-Non-Overlapping-to-Overlapping-Clusters
Created an ego-network analysis framework in C++, R, and Python that moves from non-overlapping to overlapping clusters. It features CSV parsing, node adjacency construction, and connected component analysis. Its systematic approach to persona graph construction supports network analysis in domains such as social network analysis.
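The pipeline described above (CSV parsing, node adjacency, connected components, persona graph) can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: the function names (`read_edges`, `components`, `persona_graph`, `overlapping_clusters`) are hypothetical, the input is assumed to be a two-column `u,v` edge list, and connected components are assumed as the clustering step both inside each ego-net and on the persona graph.

```python
import csv
import io
from collections import defaultdict


def read_edges(csv_text):
    """Parse 'u,v' rows (assumed format) into an undirected adjacency map."""
    adj = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        u, v = row[0].strip(), row[1].strip()
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def components(nodes, adj):
    """Connected components of the subgraph of `adj` induced by `nodes`."""
    nodes = set(nodes)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj.get(x, set()) & nodes:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def persona_graph(adj):
    """Split each node into one persona per component of its ego-net."""
    # persona_of[u][v]: index of the component of u's ego-net
    # (u itself removed) that contains neighbor v.
    persona_of = {}
    for u in adj:
        persona_of[u] = {}
        for i, comp in enumerate(components(adj[u], adj)):
            for v in comp:
                persona_of[u][v] = i
    # Each original edge (u, v) links the persona of u that "sees" v
    # to the persona of v that "sees" u.
    pg = defaultdict(set)
    for u in adj:
        for v in adj[u]:
            a = (u, persona_of[u][v])
            b = (v, persona_of[v][u])
            pg[a].add(b)
            pg[b].add(a)
    return dict(pg)


def overlapping_clusters(adj):
    """Cluster the persona graph, then map personas back to original nodes."""
    pg = persona_graph(adj)
    return [{node for node, _ in comp} for comp in components(pg.keys(), pg)]


# Two triangles sharing node "c": c should land in both clusters.
edges = "a,b\nb,c\na,c\nc,d\nd,e\nc,e\n"
clusters = overlapping_clusters(read_edges(edges))
print(sorted(sorted(c) for c in clusters))
# [['a', 'b', 'c'], ['c', 'd', 'e']]
```

Because node `c` is split into two personas, it appears in both clusters, which is how a non-overlapping clustering of the persona graph yields overlapping clusters of the original graph.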
HRAnalytics
Used Power Query for data processing in an HR analytics project. Built efficient KPIs in DAX to generate attendance insights quickly. Streamlined data loading and presented dynamic graphs for analyzing Work from Home %, Sick Leave %, and Presence %.
Membership-Inference-Attack
Developed a privacy attack that assesses whether an individual's data was used in model training. Implemented a range of shadow models (XGBoost, Random Forest, Linear Regression, Perceptron, AdaBoost, SVM, Hist Gradient Boosting) to evaluate data privacy in machine learning. Demonstrates proficiency in data manipulation and data privacy assessment methodologies.
PwCPowerBI
Completed PwC certification projects that showcase my expertise in data visualization and strategic insights. Optimized data loading with Power Query and improved DAX calculations, reducing report processing time.