Jakob_Skelmose's starred repositories

statsmodels

Statsmodels: statistical modeling and econometrics in Python

Language: Python | License: BSD-3-Clause | Stargazers: 10046 | Issues: 282 | Issues: 5446

gs-quant

Python toolkit for quantitative finance

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7608 | Issues: 152 | Issues: 31

SDV

Synthetic data generation for tabular data

Language: Python | License: NOASSERTION | Stargazers: 2326 | Issues: 43 | Issues: 1305

synthea

Synthetic Patient Population Simulator

Language: Java | License: Apache-2.0 | Stargazers: 2146 | Issues: 75 | Issues: 578

StockPricePrediction

Stock Price Prediction using Machine Learning Techniques

Language: Jupyter Notebook | License: MIT | Stargazers: 1310 | Issues: 84 | Issues: 14

pytorch-ts

PyTorch based Probabilistic Time Series forecasting framework based on GluonTS backend

Language: Python | License: MIT | Stargazers: 1239 | Issues: 26 | Issues: 140

differential-privacy-library

Diffprivlib: The IBM Differential Privacy Library

Language: Python | License: MIT | Stargazers: 820 | Issues: 33 | Issues: 42

crypto-arbitrage

Automatic Cryptocurrency Trading Bot using Triangular or Exchange Arbitrages

Language: Python | License: MIT | Stargazers: 741 | Issues: 88 | Issues: 24

PyDP

The Python Differential Privacy Library. Built on top of: https://github.com/google/differential-privacy

Language: Python | License: Apache-2.0 | Stargazers: 505 | Issues: 20 | Issues: 158

notebooks

Analysis on systematic trading strategies (e.g., trend-following, carry and mean-reversion). The result is regularly updated.

Language: Jupyter Notebook | License: MIT | Stargazers: 470 | Issues: 16 | Issues: 5

Mastering-Python-for-Finance-Second-Edition

Mastering Python for Finance – Second Edition, published by Packt

Language: Jupyter Notebook | License: MIT | Stargazers: 443 | Issues: 25 | Issues: 7

mslive_public

Track live sentiment for stocks from Reddit and Twitter and identify growing stocks

Language: Python | License: GPL-3.0 | Stargazers: 353 | Issues: 34 | Issues: 3

pystan

PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

Language: Python | License: ISC | Stargazers: 338 | Issues: 13 | Issues: 199

xgboost-survival-embeddings

Improving XGBoost survival analysis with embeddings and debiased estimators

Language: Python | License: Apache-2.0 | Stargazers: 320 | Issues: 84 | Issues: 46

open-data-anonymizer

Python Data Anonymization & Masking Library For Data Science Tasks

Language: Python | License: BSD-3-Clause | Stargazers: 241 | Issues: 7 | Issues: 10

DeepOnto

A package for ontology engineering with deep learning and language models.

Language: Python | License: Apache-2.0 | Stargazers: 188 | Issues: 5 | Issues: 15

fintwit-bot

FinTwit-Bot is a Discord bot designed to track and analyze financial markets by pulling data from platforms like Twitter, Reddit, and Binance. It features customizable tools for sentiment analysis, market trends, and portfolio tracking to help traders stay informed and make data-driven decisions.

Language: Python | License: MIT | Stargazers: 64 | Issues: 4 | Issues: 357

pycanon

pyCANON is a Python library and CLI to assess the values of the parameters associated with the most common privacy-preserving techniques.

Language: Python | License: Apache-2.0 | Stargazers: 28 | Issues: 5 | Issues: 1

anonypy

Anonymization library for python. Protect the privacy of individuals.

Language: Python | License: MIT | Stargazers: 25 | Issues: 3 | Issues: 2

pyMultiOmics

Python toolbox for multi-omics data mapping and analysis

Language: Jupyter Notebook | License: MIT | Stargazers: 19 | Issues: 2 | Issues: 24

bnstruct

R package for Bayesian Network Structure Learning

Language: R | License: GPL-3.0 | Stargazers: 17 | Issues: 4 | Issues: 32

EHR_Incentive_Program_Analysis_Python

The Medicare Electronic Health Record (EHR) Incentive Program provides incentives to eligible clinicians and hospitals to adopt electronic health records. This dataset combines meaningful use attestations from the Medicare EHR Incentive Program and certified health IT product data from the ONC Certified Health IT Product List (CHPL) to identify the unique vendors, products, and product types of each certified health IT product used to attest to meaningful use. (data, 2017) Data set merges information about the Centers for Medicare and Medicaid Services, Medicare and Medicaid EHR Incentive Programs attestations with the Office of the National Coordinator for Health IT Certified Health IT Products List. This new dataset enables systematic analysis of the distribution of certified EHR vendors and products among those providers that have attested to meaningful use within the CMS EHR Incentive Programs. The data set can be analyzed by state, provider type, provider specialty, and practice setting. (Technology, 2017) The dataset also includes important provider-specific data, related to the provider's participation and status in the program, unique provider identifiers, and other characteristics unique to each provider, like geography and provider type. Because providers may declare more than one EHR product when attesting, this list also provides a unique ID (i.e. NPI) for each provider. The Medicare EHR Incentive Program provides incentive payments to eligible providers as they adopt, implement, upgrade, or demonstrate meaningful use of certified EHR technology. The CHPL provides the authoritative, comprehensive listing of certified health IT products that have been tested under the ONC Certification Program. (data, 2017) The complete dataset exceeds 1 million rows of data. 
This data is intended to provide the names of EHR products and their vendors, the certification classification of each product (Complete or Modular), the healthcare setting for which the product was certified (Ambulatory or Inpatient), the type of provider attesting to “meaningful use” of an EHR, the Incentive Program the provider attested in (Medicare or Medicare/Medicaid), a unique ID for each attestation, the version of the EHR product, and the Stage of Meaningful Use that the provider attested to (Stage 1/Stage 2). The dataset is 370 MB and has 23 columns. It covers April 2011 to the present, which makes it useful for finding trends.

Stargazers: 5 | Issues: 0 | Issues: 0

ml-impute

A package for synthetic data generation for imputation using single and multiple imputation methods.

Language: Python | License: MIT | Stargazers: 4 | Issues: 2 | Issues: 0

phd-course

PhD course on Knowledge Graphs

Language: Python | License: Apache-2.0 | Stargazers: 3 | Issues: 1 | Issues: 0

copula-tabular

Generate tabular synthetic data using Gaussian copulas

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0

TabuGAN

A Tabular GAN with Attention Mechanisms, Reinforcement Learning, Knowledge Graphs and Clustering

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0

Markov-Chain-Monte-Carlo

Some cool Markov Chain Monte Carlo implementations

Language: Jupyter Notebook | Stargazers: 2 | Issues: 2 | Issues: 0