jainr3 / SECret-Insights

Using unsupervised machine learning algorithms to analyze SEC 10-K filings

Cool datasets

firmai opened this issue · comments

Hi, thanks for the datasets. Do you perhaps have the code available where you break the text into these different items? I am currently using regex, but it is messy. I wonder whether you have a primary source or a cleaner solution.

https://github.com/jainr3/SECret-Insights/blob/main/sec_edgar_annual_financial_filings_2021/extracted/1001601_10K_2020_0001493152-21-008913.json

Oh no, that doesn't parse it. The parsing is happening somewhere else; it seems to be reading in already-parsed files. I am looking for a resource that I can update for a research paper. Thanks for the help, though.
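
For anyone landing here with the same question, here is a minimal sketch of the regex approach mentioned above. It is not the code used by this repository (which, per this thread, appears to read already-parsed files). It assumes the filing has already been converted to plain text and that item headings start a line in the form "Item 1.", "Item 1A.", and so on; the names `ITEM_HEADING` and `split_items` are hypothetical. Real filings vary a lot, so expect to tune the pattern.

```python
import re

# Assumed heading format: "Item 1.", "ITEM 1A.", "Item 7A:" at the start of a line.
ITEM_HEADING = re.compile(
    r"^\s*item\s+(\d{1,2}[ab]?)\s*[.:\-]",
    re.IGNORECASE | re.MULTILINE,
)


def split_items(text):
    """Split plain 10-K text into a dict of {item_label: section_text}.

    A label often appears twice (table of contents, then the body), so
    the longest section found for each label is kept.
    """
    matches = list(ITEM_HEADING.finditer(text))
    sections = {}
    for i, match in enumerate(matches):
        label = match.group(1).upper()
        # A section runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        if len(body) > len(sections.get(label, "")):
            sections[label] = body
    return sections


sample = (
    "Item 1. Business\n"
    "Item 1A. Risk Factors\n"
    "\n"
    "Item 1. Business\n"
    "We make widgets.\n"
    "Item 1A. Risk Factors\n"
    "Widgets may break.\n"
)
sections = split_items(sample)
```

Keeping the longest match per label is a simple heuristic for skipping table-of-contents entries; it will fail on filings where an item is incorporated by reference and the body is shorter than the contents line.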
