arunprsh / fdiworkshop

Financial Data Innovation Workshop

As part of the Financial Data Innovation workshop, participants will use familiar datasets and AWS tools to speed up data ingestion and analytics and to gain insights from the data.

Time Commitment Expectations: This workshop was created to be completed in approximately 2 hours.

Considerations for Each Role

As the team lead on this lean team of one, you'll need to wear multiple hats. Below are some things we'll cover from the perspective of each role:

  • Developer - You'll modify a Python script to create a data catalog in AWS Glue.
  • Data Scientist - You'll load the data into your machine learning development environment. Once it is loaded, you'll explore the data, train a model with a machine learning algorithm, and make predictions.
  • Trader - You'll apply different data-driven trading strategies to maximize profit and loss (P&L) while accounting for risk.
  • Analyst - You'll use data visualization and AI tools to analyze reports and gain insights from the data.
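
The workshop's own Glue script is not reproduced in this overview, so the following is only a minimal sketch of one common way to build a data catalog from Python: pointing a Glue crawler at the raw files in S3. The helper names, the crawler name, the IAM role ARN, the database name, and the S3 path are all hypothetical placeholders; only `create_crawler` and `start_crawler` are actual boto3 Glue client calls.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Assemble keyword arguments for the Glue create_crawler call.

    All four values are placeholders supplied by the caller; none of them
    come from the workshop itself.
    """
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def catalog_raw_data(glue, name, role_arn, database, s3_path):
    """Create a crawler over the raw files and start it.

    `glue` is expected to behave like boto3.client("glue").
    """
    glue.create_crawler(**build_crawler_request(name, role_arn, database, s3_path))
    glue.start_crawler(Name=name)
```

With real credentials, `glue` would typically be created with `boto3.client("glue", region_name="us-east-1")`. Keeping the client as a parameter means the request can be inspected or tested without touching an AWS account.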

Goals

At a minimum, by the end of this workshop you should be able to ingest data, use the AWS Glue Data Catalog, run ad hoc queries against raw files using Amazon Athena, and visualize data with Amazon QuickSight. You should also be able to load data into Amazon Redshift for complex analytics and periodic reporting, and use Amazon SageMaker to gain insights from the data.

Solution Architecture

Customers can search for and subscribe to data using AWS Data Exchange, and the data can be delivered directly to their S3 bucket. AWS Glue then maintains the data catalog, which makes the raw data queryable through Amazon Athena. Amazon QuickSight can be used for visualization. For complex analytics and periodic reporting, customers can use Amazon Redshift; the Redshift Spectrum feature lets them join data between the data warehouse (Redshift) and the data lake (S3). Amazon QuickSight also has machine learning insights built in, and further machine learning can be done using Amazon SageMaker. This lets customers across the organization realize value from their data assets. Even without ML skills, personnel can use AI services such as Amazon Translate and Amazon Comprehend to translate text and run sentiment analysis on it. Developers can also easily integrate these higher-level services into their applications.
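
To make the "queryable through Amazon Athena" step concrete, here is a hedged sketch of submitting a SQL query from Python and waiting for the result. The function name, database name, SQL text, and results location are assumptions for illustration; the three client methods used (`start_query_execution`, `get_query_execution`, `get_query_results`) are the standard boto3 Athena calls.

```python
import time


def run_athena_query(athena, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Run a SQL query through Athena and return the first page of rows.

    `athena` is expected to behave like boto3.client("athena").
    `output_location` is an S3 URI where Athena writes its result files.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a terminal state.
    for _ in range(max_polls):
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Athena query {query_id} did not finish in time")

    # Only the first page is read here; for SELECT queries the first row
    # holds the column names.
    result = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A call might look like `run_athena_query(boto3.client("athena"), "SELECT * FROM prices LIMIT 10", "marketdata", "s3://my-results-bucket/athena/")`, where the table, database, and bucket names are placeholders rather than resources created by the workshop.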

Architecture diagram

Supported regions:

  • us-east-1 (N. Virginia)
  • us-east-2 (Ohio)
  • us-west-2 (Oregon)
  • ap-southeast-1 (Singapore)
  • ap-northeast-1 (Tokyo)
  • eu-central-1 (Frankfurt)
  • eu-west-1 (Ireland)
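
If you script any part of the setup, it can help to fail fast when the chosen region is not in the list above. The snippet below is a small illustrative guard, not part of the workshop code; the names `SUPPORTED_REGIONS` and `require_supported_region` are hypothetical.

```python
# Regions copied from the supported-regions list in this README.
SUPPORTED_REGIONS = {
    "us-east-1": "N. Virginia",
    "us-east-2": "Ohio",
    "us-west-2": "Oregon",
    "ap-southeast-1": "Singapore",
    "ap-northeast-1": "Tokyo",
    "eu-central-1": "Frankfurt",
    "eu-west-1": "Ireland",
}


def require_supported_region(region):
    """Return the region unchanged, or raise if the workshop does not list it."""
    if region not in SUPPORTED_REGIONS:
        supported = ", ".join(sorted(SUPPORTED_REGIONS))
        raise ValueError(f"{region!r} is not supported; choose one of: {supported}")
    return region
```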

Modules

  1. Module 0: Setting up the environment
  2. Module 1: Get market data and catalog data using AWS Glue
  3. Module 2: Run SQL queries against raw data using Amazon Athena
  4. Module 3: Visualize data using Amazon QuickSight
  5. Module 4: Complex Analytics and Reporting using Amazon Redshift
  6. Module 5: Machine Learning using Amazon SageMaker
  7. Module 6: Cleanup
