Anish Shah (anishshah23)

Company: Software Engineer at Altice

Location: New York, NY

Anish Shah's repositories

Social-Network-analysis-on-Twitter-Data

Using the Twitter package in R with the Search API, collected tweets, grouped them by geolocation as in the Google Maps API, and plotted them on a map of the USA according to the number of tweets per state. Technology: R | Tools: Jupyter, RStudio | Libraries: ggplot, ggmap, geom_map.

Language: Jupyter Notebook | Stargazers: 3 | Issues: 0

IMDb-5000-Data-analysis

Analyzed the IMDb 5000 Movie Dataset from Kaggle to predict movie ratings and gain meaningful insights, using Multiple Linear Regression, Decision Tree, and Random Forest models. Technology: R | Tools: RStudio | Libraries: dplyr, ggplot2.

Language: R | Stargazers: 1 | Issues: 0

-DATA-AGGREGATION-BIG-DATA-ANALYSIS-AND-VISUALIZATION

Aggregated data from Twitter and NYTimes using the APIs exposed by the data sources, applied the classical big data analytic method of MapReduce to the unstructured data collected, and built a visualization data product.

Language: JavaScript | Stargazers: 0 | Issues: 0

anishshah23.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes.

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0

Buffalo-Sewer-Authority

Gathered data using GIS (Geographic Information Systems) tools such as ArcGIS and cleaned the gathered data. In one of the projects, built a multiple regression model in Python with 80% accuracy to determine the relationship between trees and crime in the city of Buffalo, which would help the City of Buffalo in future landscape and urban planning projects.

Stargazers: 0 | Issues: 0

Data-Analytics-pipeline-using-Spark

Processing graph data using Spark.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

Data-Science-Industry-Overview

Multiple reports detailing learnings from weekly in-person modules on application-oriented and related topics in Data Science, spanning various industries and presented by people who work as Data Scientists at Fortune 500 companies.

Stargazers: 0 | Issues: 0

Handwritten-digits-classification

Implemented multilayer perceptron neural networks for the classification of handwritten digits on the MNIST dataset. Technology: Python | Tools: Jupyter Notebook | Libraries: NumPy, SciPy.

Language: Python | Stargazers: 0 | Issues: 0

NYC-Uber-Data-Analysis

Using quantitative data analysis methods, visualized Uber’s ridership growth, characterized demand based on patterns identified in the time series, estimated the value of the NYC market for Uber and its revenue growth, and analyzed trip duration to determine its probability distribution model and to gain insights about usage of the service. Technology: Python | Tools: Jupyter Notebook | Libraries: Pandas, Matplotlib, Seaborn, SQL, NumPy.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

Regression

Implementing Regression techniques such as Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Linear regression, Ridge regression, Ridge Regression using Gradient Descent and Non-linear regression to understand how machine learning functions work.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

aws-codepipeline-s3-codedeploy-linux

Use this sample when creating a simple pipeline in AWS CodePipeline while following the Simple Pipeline Walkthrough tutorial. http://docs.aws.amazon.com/codepipeline/latest/userguide/getting-started-w.html

Language: HTML | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Boston-Housing-Data-Analysis-

Analyzed the areas with high crime rates and drew conclusions about the increased crime rates in these areas. Performed data analysis to determine the relationships between the predictors.

Language: R | Stargazers: 0 | Issues: 0

Classification-and-Regression

Implemented Logistic Regression to give the prediction results.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

Clustering-demographic-data-using-a-classification-tree

Cluster the demographic data of Table 14.1 using a classification tree. Specifically, generate a reference sample the same size as the training set, by randomly permuting the values within each feature. Build a classification tree to the training sample (class 1) and the reference sample (class 0) and describe the terminal nodes having highest estimated class 1 probability.

Language: R | Stargazers: 0 | Issues: 0
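
The reference-sample step described for Clustering-demographic-data-using-a-classification-tree can be sketched as follows. This is a minimal illustration, not the repository's code (which is in R): the function names `make_reference_sample` and `build_labeled_set` and the toy data are assumptions made here, and fitting the classification tree itself is left to whatever tree library is at hand.

```python
import random


def make_reference_sample(rows, seed=0):
    """Return a sample the same size as `rows`, with the values of each
    feature (column) randomly permuted independently of the others."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*rows)]  # transpose to columns
    for col in columns:
        rng.shuffle(col)  # permute values within this feature only
    return [list(row) for row in zip(*columns)]  # transpose back to rows


def build_labeled_set(rows, seed=0):
    """Stack the training sample (class 1) on the reference sample (class 0)."""
    reference = make_reference_sample(rows, seed)
    X = [list(r) for r in rows] + reference
    y = [1] * len(rows) + [0] * len(reference)
    return X, y


# Hypothetical toy data: 4 observations, 3 features.
training = [[1, 10, 100], [2, 20, 200], [3, 30, 300], [4, 40, 400]]
X, y = build_labeled_set(training)
```

A classification tree fitted to `X` and `y` would then separate real observations from permuted ones, and the terminal nodes with the highest estimated class 1 probability are the ones to describe.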

coding-interview-university

A complete computer science study plan to become a software engineer.

License: CC-BY-SA-4.0 | Stargazers: 0 | Issues: 0

EAS503-Programming-Fundamentals-for-Data-Scientists

All my coursework assignments related to the course.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

Hierarchical-clustering-on-the-states-data

Applying an unsupervised clustering technique (hierarchical clustering) to the states data.

Language: R | Stargazers: 0 | Issues: 0

Hierarchical-clustering-to-gene-expression-data-set

Apply hierarchical clustering to the samples using correlation-based distance.

Language: R | Stargazers: 0 | Issues: 0

leetcode

Python and Java solutions for LeetCode

License: MIT | Stargazers: 0 | Issues: 0

Online-Resume

Portfolio

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

PCA-and-K-Means-Clustering-Of-High-Dimensional-Aircraft-Data-

Ran PCA and K-Means clustering on a high-dimensional Delta Airlines dataset to obtain some interesting findings. Technology: R | Tools: RStudio | Libraries: stats, rgl.

Language: R | Stargazers: 0 | Issues: 0

PCA-and-K-means-clustering-on-the-data

Performing PCA and K-means clustering on a simulated dataset.

Language: R | Stargazers: 0 | Issues: 0

Repeating-Topical-Data-Analysis

Attempted to recreate, using R, the charts from the CDC site of flu data and analysis, flu.gov, and FluView, with data up to the week of January 27, 2018.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

STA545-

Statistical Data Mining 1

Language: R | Stargazers: 0 | Issues: 0