ashishyadav24092000

Ashish Kumar Yadav's repositories

ALL-Hypothesis-testing

Hypothesis testing using the t-test, ANOVA, and the chi-square test.

Language: Jupyter Notebook

DBSCAN-Clustering

Performing DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering. As the name suggests, it is designed to handle noisy data and outliers in a dataset.

Language: Jupyter Notebook
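
To illustrate how DBSCAN separates noise from clusters, here is a minimal pure-Python sketch. It is not the code from this repository; the function name `dbscan`, the parameters `eps` and `min_samples`, and the toy points are assumptions chosen for illustration.

```python
from math import dist

NOISE = -1  # label given to points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return labels


# Two tight groups of four points and one isolated point (toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

The isolated point has too few neighbours within `eps` to start or join a cluster, so it keeps the noise label instead of being forced into the nearest group.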

Detect_Parkinson_XGBOOSTCLASSIFIER

Detecting Parkinson's disease using the extreme gradient boosting (XGBoost) algorithm.

Language: Jupyter Notebook

EDA_on_HousePrice

In this repository I have performed exploratory data analysis on the well-known House Price Prediction dataset.

Language: Jupyter Notebook

EDA_on_onlineretails

This is another project in which I have performed exploratory data analysis on a dataset about online retailers.

Language: Jupyter Notebook

EDA_TitanicSurvivors

In this repository we have performed exploratory data analysis to visualise and clean the data. After that we built two models, a Logistic Regression model and an XGBClassifier model, to predict the survival values. Finally, we computed the accuracy of both models and the classification report for the Logistic Regression model.

Language: Jupyter Notebook

Encoding_categorical-variables

The most commonly used encoding techniques for categorical variables are performed here.

Language: Jupyter Notebook

Exploratory_data_analysis3

In this repository I have performed exploratory data analysis on the dataset student_performance.csv, in which I have tried to detect outliers, missing values, relationships among and across features, and categorical versus continuous/numerical data.

Language: Jupyter Notebook

FE_categorical_missing_values

This code shows how to handle missing values in the categorical features of any dataset.

Language: Jupyter Notebook

FULL-Feature-Transformations

In this project we have performed all types of feature transformation on the Titanic dataset and used Q-Q plots to check whether a feature is normally (Gaussian) distributed.

Language: Jupyter Notebook

GenChatAssitantBotOAI

This is a plain chatbot developed using the OpenAI API. It uses the following libraries: langchain, openai, huggingface_hub, python-dotenv, streamlit, pandas.

Language: Jupyter Notebook

Handle-missing-numerical-values

In this code the missing numerical values in any feature are handled using various techniques, which are described in the code itself.

Language: Jupyter Notebook

Hierarchical-Clustering

Performing Hierarchical clustering.

Language: Jupyter Notebook

Kmeans_Implementation

KMeans algorithm using an arbitrarily chosen K value of 2.

Language: Jupyter Notebook

KNN-Algorithm

Performing the K-Nearest-Neighbor Algorithm.

LInear-Ridge-Lasso-Regression

Performing all three regressions, i.e. Linear, Ridge, and Lasso, on a dataset.

Language: Jupyter Notebook

MAchineLearning_FeatureEngineering1

In this I have performed complete feature engineering, from handling null values and categorical features up to performing feature scaling on our test_data and train_data.

Language: Jupyter Notebook

MachineLearningE2EProject

Language: Jupyter Notebook

ML-FeatureSelection1

In this I have tried to perform feature selection on a dataset with 81 features. After feature selection, the 81 features were reduced to 21 for modelling.

Language: Jupyter Notebook

Multicollinearity-in-Regression

Showing how to identify multicollinearity in a regression problem using OLS (Ordinary Least Squares) and a correlation chart, and finally how to remove it.

Language: Jupyter Notebook

Multiple-Linear-Regression

Performing multiple linear regression on a simple dataset.

Language: Jupyter Notebook

One-hot-Encoding_AllTypes

In this I have tried to perform simple one-hot encoding for categorical features, and one-hot encoding restricted to the top ten/twenty most frequent categories of a feature.

Language: Jupyter Notebook
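
The "top most frequent categories" variant can be sketched as follows. This is an illustrative pure-Python version, not the notebook's code; the helper `one_hot_top_k`, the `is_` column prefix, and the sample city names are all assumptions.

```python
from collections import Counter


def one_hot_top_k(values, k):
    """One-hot encode only the k most frequent categories in `values`.

    Rows whose category falls outside the top k get zeros in every column.
    """
    top = [cat for cat, _ in Counter(values).most_common(k)]
    rows = [{f"is_{cat}": int(v == cat) for cat in top} for v in values]
    return top, rows


# Hypothetical categorical feature with four distinct categories.
cities = ["Delhi", "Pune", "Delhi", "Goa", "Pune", "Delhi", "Agra"]
top, encoded = one_hot_top_k(cities, k=2)
for city, row in zip(cities, encoded):
    print(city, row)
```

Limiting the columns to the most frequent categories keeps the encoded feature space small when a feature has many rare values, at the cost of treating all rare values alike.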

Optimal-threshold-for-classification

Choosing the optimal threshold value for classification algorithms in machine learning use cases.

Language: Jupyter Notebook
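
One common way to pick a threshold is to maximise Youden's J statistic (true positive rate minus false positive rate) over the candidate cut-offs. The listing does not say which criterion the notebook uses, so the sketch below is only an assumption for illustration; `best_threshold` and the toy labels and scores are hypothetical.

```python
def best_threshold(y_true, y_score):
    """Return (threshold, J) maximising Youden's J = TPR - FPR.

    A sample is predicted positive when its score is >= the threshold.
    On ties, the lowest threshold reaching the best J is kept.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = (None, -1.0)
    for t in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best[1]:
            best = (t, j)
    return best


# Toy ground-truth labels and predicted probabilities.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.8, 0.9]
threshold, j_stat = best_threshold(y_true, y_score)
print(threshold, j_stat)
```

Other criteria, such as maximising accuracy or F1, fit the same loop by swapping the expression computed for `j`.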

OptimalK-in-KMeans_Clustering

Finding the optimal k for the KMeans clustering algorithm. Here we discuss two methods for finding the optimal value of K: the elbow curve method and silhouette analysis.

Language: Jupyter Notebook

ParkinsonDetection_LogisticRegression

This is the same problem that is solved in the https://github.com/ashishyadav24092000/Detect_Parkinson_XGBOOSTCLASSIFIER project, but here we have used Logistic Regression instead of XGBClassifier to classify the status as 0 or 1 (Parkinson's positive or negative). We can clearly see that the accuracy dropped from 95% to 84% when we moved from XGBClassifier to Logistic Regression.

Language: Jupyter Notebook

EDA_TitanicSurvivors

PCA_dimension_reduction_Technique

RandomForest-Algorithm

Seaborn_visualisations

Silhouette-Score-In-Clustering

UnivariateAndBivariateAndMultivariate-Analysis