marcoacf / starjeans

Star Jeans

Scheduled Web Scraping

Star Jeans Case

Eduardo and Marcelo are two Brazilians, friends and business partners. After several successful businesses, they plan to enter the US fashion market with an e-commerce business model. The initial idea is to enter the market with a single product for a specific audience: jeans for men. The objective is to keep operating costs low and scale as they gain customers. However, even with the entry product and audience defined, the two partners have no experience in the fashion market, so they do not know how to define basics such as the price, the types of pants, and the material used to manufacture each piece. The main competitors of Star Jeans in the US market are H&M and Macy's.

The two partners hired a Data Science consultancy to answer the following questions:

A. What is the best selling price for pants?

B. How many types of pants, and in which colors, should the initial product line include?

C. What are the raw materials needed to make the pants?

Solution Strategy

  1. Build the median (or mean) calculation step by step

    • Calculate the median price by product, type and color
  2. Define the delivery format (View, Table, Phrase)

    • Bar graph with the median of product prices, by type and color, over the last 30 days.

    • Table with the following columns: id | product_name | product_type | product_color | product_price

    • Schema definition: columns and their types

    • Definition of the storage infrastructure (SQLite3)

    • ETL design (extract, transform and load scripts)

    • Script scheduling plan (dependencies between scripts)

  3. Delivery of the final product

    • App with Streamlit
  4. Tools

    • Python 3.8
    • Web scraping libraries (BS4, Selenium)
    • PyCharm
    • Jupyter Notebook (Analysis and prototyping)
    • Task Scheduler
    • Streamlit
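
The storage and median steps of the strategy above can be sketched as follows. This is a minimal illustration only, not the project's actual code: the table name `products`, the column types, the helper function names, and the sample rows are all assumptions, and only the column names come from the table described above. It uses only the Python standard library (`sqlite3`, `statistics`). The 30-day window is left out because the listed columns do not include a scrape date.

```python
import sqlite3
from statistics import median

# Hypothetical schema: the table name and column types are assumptions;
# the column names follow the table listed in the solution strategy.
SCHEMA = """
CREATE TABLE IF NOT EXISTS products (
    id            TEXT,
    product_name  TEXT,
    product_type  TEXT,
    product_color TEXT,
    product_price REAL
)
"""


def load_rows(conn, rows):
    """Load step of the ETL: create the table if needed and insert rows."""
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def median_price_by_type_and_color(conn):
    """Return {(product_type, product_color): median price}."""
    groups = {}
    query = "SELECT product_type, product_color, product_price FROM products"
    for product_type, product_color, price in conn.execute(query):
        groups.setdefault((product_type, product_color), []).append(price)
    return {key: median(prices) for key, prices in groups.items()}


# Made-up sample rows, for illustration only (not scraped data).
SAMPLE_ROWS = [
    ("1", "Slim Jeans", "slim", "black", 29.99),
    ("2", "Slim Jeans", "slim", "black", 39.99),
    ("3", "Slim Jeans", "slim", "black", 49.99),
    ("4", "Skinny Jeans", "skinny", "blue", 19.99),
    ("5", "Skinny Jeans", "skinny", "blue", 24.99),
]

conn = sqlite3.connect(":memory:")  # in-memory database for the example
load_rows(conn, SAMPLE_ROWS)
medians = median_price_by_type_and_color(conn)
```

The resulting `medians` dictionary, keyed by type and color, is the kind of structure a bar graph in the Streamlit app could be built from. The median is used rather than the mean because it is less sensitive to a few unusually cheap or expensive items.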

Conclusion

Under development.

Author

Djalma Jr

LinkedIn

License

MIT License