Divya171997

Divya Arora's repositories

Big-Data-for-Computational-Finance-Forex_Exchange_Rate

Time series analysis of a historical forex exchange rate dataset. The work covers cleansing and transforming the dataset into the structure required for time series analysis and machine learning, visualizing the results of that analysis, and applying an ARIMA model to forecast future values.
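
A minimal sketch of the idea behind ARIMA forecasting, assuming the simplest useful case, ARIMA(1,1,0) without a constant: difference the series once, fit one autoregressive coefficient by least squares, and add the predicted change back to the last level. The function names and the sample values are hypothetical and are not taken from the repository, which may well use a library implementation and a different model order.

```python
def difference(series):
    """First-order differencing: the 'I' (integrated) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares estimate of phi in d_t = phi * d_(t-1) + noise (no constant)."""
    numerator = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    denominator = sum(prev * prev for prev in diffs[:-1])
    return numerator / denominator


def forecast_next(series):
    """One-step-ahead forecast from an ARIMA(1,1,0) model without a constant."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    # Predict the next change, then undo the differencing by adding the last level.
    return series[-1] + phi * diffs[-1]


# Hypothetical series, made up for illustration only (not real exchange rates).
rates = [0.0, 8.0, 12.0, 14.0, 15.0]
print(forecast_next(rates))
```

Differencing removes the trend so that the autoregressive part only has to model the period-to-period changes.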

Language: Jupyter Notebook

Residential_Properties-Applied_Statistics

The dataset contains information on residential properties collected by the US Census Service over the period 2006 to 2010.

Agriculture-Industry-and-Service-Affecting-GDP-Of-Country

Exploratory data analysis and visualization. We used descriptive statistics (quartiles, range, summary, percentages, and cross-tabulation) to describe the data. Two interactive stacked bar charts show the contribution of agriculture, industry, and services to GDP per capita, by country and by continent. An interactive density plot shows the distribution, and a principal component graph shows the relationship of the sectors with GDP.

Language: HTML

Analysing-Factor-Affecting-GDP

Exploratory data analysis and visualization. GDP is one of the most important indicators of the performance of a country's economy. The factors recorded in the dataset are population, agriculture, service, industry, health, migration, urban, and obesity. The null hypothesis is that these factors do not affect the GDP per capita of countries, and the analysis of the data determines whether or not they do. The analysis uses the CIA_Factsheet dataset, which records different factors affecting the GDP of countries for the year 2016. The factors are briefly described and the stated hypothesis is checked against the data.

Language: HTML

Factors-Affecting-Officers-Injury

Exploratory data analysis and visualization. Many officers are injured every year while dealing with criminal activity in the course of serving the country. To keep track of these injuries, we analyze the factors affecting injuries within the year so that they can be taken into account and the injuries reduced over time. The dataset contains information on the incident, officer details, details of the criminal offence, the geographical area of the crime, and details of any extra force required. The null hypothesis is that none of the factors affects the rate of injury.

Information-Retrieval_Elasticsearch-Evaluation

This task focuses on Elasticsearch, an open-source search engine known for text search and analytics. The task is to convert documents into a structured index using information retrieval models and to evaluate those models.
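
As a rough illustration of what "converting documents into a structured index" means, the sketch below builds a tiny in-memory inverted index in plain Python and answers boolean AND queries against it. It does not use the Elasticsearch API; the function names and sample documents are hypothetical, and a real engine adds analysis, ranking, and persistence on top of this structure.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical documents, made up for illustration only.
docs = {
    1: "Elasticsearch is an open source search engine",
    2: "Text search and analytics",
    3: "Open source analytics",
}
index = build_inverted_index(docs)
print(sorted(search(index, "open source")))
```

Looking up a term is then a dictionary access rather than a scan over every document, which is the main reason search engines index text this way.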

Language: Python

Information-Retrieval_Indexing-for-Web-Search

The task is to transform the data on websites into a structured form in order to extract relevant information from the HTML data at each URL.
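
One way to turn raw HTML into a structured record is sketched below with Python's standard `html.parser` module. The class name, the `parse_page` helper, the choice of fields (title, visible text, links), and the sample markup are all assumptions made for illustration; the repository may extract different fields or use a different parser.

```python
from html.parser import HTMLParser


class TextAndLinkExtractor(HTMLParser):
    """Collect the page title, visible text, and link targets from HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self.links = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and stylesheet contents
        stripped = data.strip()
        if not stripped:
            return
        if self._in_title:
            self.title += stripped
        else:
            self.text_parts.append(stripped)


def parse_page(html):
    """Return a structured record for one HTML document."""
    parser = TextAndLinkExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "text": " ".join(parser.text_parts),
        "links": parser.links,
    }


# Hypothetical page, made up for illustration only.
sample = (
    "<html><head><title>Demo</title><style>p{}</style></head>"
    "<body><p>Hello <a href='/about'>about us</a></p></body></html>"
)
record = parse_page(sample)
print(record)
```

Each record can then be stored or indexed, with the title, body text, and outgoing links available as separate fields.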

Language: Jupyter Notebook

Life_Expectancy-Modelling_Experimental_Data

Investigate the response (dependent) variable, life expectancy in the year 2016, and use the other indicators (predictor variables) in the dataset to develop a linear model that explains life expectancy in 2016.
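
A minimal sketch of fitting such a linear model, assuming a single predictor and ordinary least squares. The function names and the numbers are hypothetical and are not drawn from the repository's dataset; a model with several predictors would normally be fitted with a statistics library instead.

```python
def fit_simple_linear_model(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical values, made up for illustration only.
predictor = [1.0, 2.0, 3.0, 4.0]
life_expectancy = [62.0, 64.0, 66.0, 68.0]

intercept, slope = fit_simple_linear_model(predictor, life_expectancy)
print(intercept, slope, r_squared(predictor, life_expectancy, intercept, slope))
```

The slope estimates how much the response changes per unit of the predictor, and the R-squared value indicates how much of the variation in the response the model explains.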

Machine-Learning-and-Data-Mining

Investigate the feasibility of machine learning procedures for predicting fake news and misleading information that affect the image of a social media company. The tasks are to identify machine learning techniques appropriate for a particular practical problem, undertake a comparative evaluation of several machine learning procedures applied to that problem, and produce class predictions for the records in the test set using one approach chosen from those tested in the comparative study.

Language: Jupyter Notebook

Minor-1-Bank-Asset-Auctioning-System-

Minor-2-User-Activity-Tracer-

Text-Analytics-eXtreme-Multi-Label-Classification-XMLC

Multi-label classification is one of the standard tasks in text analytics. The objective is to perform eXtreme multi-label classification (XMLC) on two datasets (https://www.kaggle.com/hsrobo/titlebased-semantic-subject-indexing): EconBiz (ZBW - Leibniz Information Centre for Economics, from July 2017) and PubMed (5th BioASQ challenge on large-scale semantic subject indexing of biomedical articles). In an XMLC setting, k labels from a large pool of n labels are assigned to each data object. The classification task is extreme in two senses. First, the number of labels n is very large, in the hundreds or thousands. Second, only very few labels k are assigned, i.e. k << n. Thus, false positives are likely.
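
The k-out-of-n assignment described above can be sketched as follows, assuming some classifier has already produced a score per label. The function names, the label names, and the scores are hypothetical; the sketch only shows how picking the top k labels and comparing them with the true labels exposes false positives.

```python
import heapq


def top_k_labels(scores, k):
    """Return the k labels with the highest scores."""
    return set(heapq.nlargest(k, scores, key=scores.get))


def precision_recall(predicted, actual):
    """Precision and recall of a predicted label set against the true labels."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical per-label scores for one document, made up for illustration only.
scores = {
    "economics": 0.9,
    "finance": 0.7,
    "trade": 0.6,
    "biology": 0.2,
    "medicine": 0.1,
}
gold = {"economics", "trade"}

predicted = top_k_labels(scores, k=3)
precision, recall = precision_recall(predicted, gold)
print(sorted(predicted), precision, recall)
```

Here one of the three predicted labels is not among the true labels, so precision drops below 1 even though every true label was found.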

Language: Jupyter Notebook