rcorrero / clustering-capital-markets

We study capital as it flows through the American equity markets. Using daily returns and volume data we develop a method to identify regimes in the markets over time, regimes which characterize much of the markets' behavior. By segmenting the data we analyze the causes of market behavior associated with a given regime.

clustering-capital-markets — Richard Correro

This is a machine learning project built with scikit-learn and thermidor.

In this project we study capital as it flows through the American equity markets. Using daily returns and volume data, we develop a method to identify regimes in the markets over time. These regimes characterize much of the markets' behavior. By segmenting the data by regime, we analyze the causes of the market behavior associated with each regime.

The main intuition underpinning this project is that the financial markets, as the conduits of capital, act as "encoders" recording the behavior of capital in the face of new information. This information is encoded in the price histories of the instruments traded on the markets.
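The regime-identification idea described above can be illustrated with a small sketch. This is not the project's actual method: it assumes synthetic data, a hypothetical set of daily features (cross-sectional mean return, cross-sectional dispersion of returns, and mean log volume), and a plain scikit-learn Gaussian mixture as the clustering step. The function names `make_synthetic_market`, `daily_features`, and `label_regimes` are made up for this example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_market(n_days=400, n_assets=20, seed=0):
    """Simulate a calm first half and a stressed second half (synthetic data only)."""
    rng = np.random.default_rng(seed)
    half = n_days // 2
    returns = np.vstack([
        rng.normal(0.0005, 0.005, size=(half, n_assets)),          # calm regime
        rng.normal(-0.001, 0.03, size=(n_days - half, n_assets)),  # stressed regime
    ])
    log_volume = np.vstack([
        rng.normal(13.0, 0.3, size=(half, n_assets)),
        rng.normal(14.0, 0.3, size=(n_days - half, n_assets)),
    ])
    return returns, log_volume


def daily_features(returns, log_volume):
    """Summarize each trading day with three hypothetical features."""
    return np.column_stack([
        returns.mean(axis=1),     # cross-sectional mean return
        returns.std(axis=1),      # cross-sectional dispersion of returns
        log_volume.mean(axis=1),  # average log volume
    ])


def label_regimes(returns, log_volume, n_regimes=2, seed=0):
    """Assign each day to a regime by clustering its standardized features."""
    model = Pipeline([
        ("scale", StandardScaler()),
        ("gmm", GaussianMixture(n_components=n_regimes, random_state=seed)),
    ])
    return model.fit_predict(daily_features(returns, log_volume))


returns, log_volume = make_synthetic_market()
labels = label_regimes(returns, log_volume)
```

Each entry of `labels` is the regime assigned to one trading day, so the original data can be split by label and each segment examined separately. The cluster numbering is arbitrary, so regime 0 need not be the calm one.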

Data

In this project we use returns, price, and volume data from a universe of equities traded on exchanges in the United States. The data is provided by the Center for Research in Security Prices and obtained through Wharton Research Data Services. We access it through our institution and do not have the rights to publish it, so the data folder in our local directory is excluded from this repository.

The data required for this project is available through other sources, and if you need help obtaining data then feel free to contact Richard Correro.

thermidor

This project depends on thermidor. thermidor is a Python module containing several functions and classes which simplify the creation of machine learning projects by streamlining scikit-learn pipeline construction.
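thermidor's own API is not described here, so the following sketch uses only plain scikit-learn to show the kind of pipeline construction that such a helper would streamline. The preprocessing step `clip_outliers`, the helper `build_pipeline`, and the choice of PCA followed by k-means are assumptions made for illustration, not the project's actual configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def clip_outliers(X, limit=5.0):
    """Hypothetical preprocessing step: clip extreme standardized values."""
    return np.clip(X, -limit, limit)


def build_pipeline(n_components=2, n_clusters=3, seed=0):
    """Chain scaling, clipping, dimensionality reduction, and clustering."""
    return Pipeline([
        ("scale", StandardScaler()),
        ("clip", FunctionTransformer(clip_outliers)),
        ("pca", PCA(n_components=n_components, random_state=seed)),
        ("kmeans", KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)),
    ])


# Random placeholder data standing in for a (days x features) matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))

pipeline = build_pipeline()
labels = pipeline.fit_predict(X)
```

Naming each step makes it possible to inspect or swap a single stage later, for example through `pipeline.named_steps["pca"]`.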


Organization

.
├── LICENSE
├── README.md
├── clustering_capital_markets
│   ├── __init__.py
│   ├── classes
│   │   ├── __init__.py
│   │   ├── gmm_socket_cv.py
│   │   └── k_means_socket.py
│   └── functions
│       ├── __init__.py
│       ├── gmm_dist.py
│       ├── labeled_data_joiner.py
│       ├── pca_dist.py
│       └── transform_returns.py
├── notebooks
│   └── 0.7.3-returns-pipeline-analysis.ipynb
└── setup.py
    

Dependencies


Created by Richard Correro in 2019. Contact me at rcorrero at stanford dot edu

License: BSD 3-Clause "New" or "Revised" License


Languages

Jupyter Notebook 92.4%, Python 7.6%