Tsangares / 2023fall

CGU School of Social Science, Policy & Evaluation
Department of Economic Sciences
Causal Modeling, Big Data and Machine Learning
Fall 2023

Contact Information

Course Instructor: Greg DeAngelo

Office:
E-mail: gregory.deangelo@cgu.edu
Office Hours:

Course Instructor: Scott Cunningham

E-mail: scunning@gmail.com

Course Instructor: Minjae Yun

E-mail: minjae.yun@cgu.edu

Teaching Assistant: Anuar Assamidanov

E-mail: anuar.assamidanov@cgu.edu

Course Schedule

Semester start/end dates: 8/28/2023 – 12/16/2023
Meeting day, time: Tuesday, 10:00 AM - 11:50 AM Pacific Time
Course Location: Online

Course Description

This course will cover statistical methods based on the machine learning literature that can be used for causal inference. In economics and the social sciences more broadly, empirical analyses typically estimate the effects of counterfactual policies, such as the effect of implementing a government policy, changing a price, showing advertisements, or introducing new products. Recent advances in supervised and unsupervised machine learning provide systematic approaches to model selection and prediction, methods that are particularly well suited to datasets with many observations and/or many covariates.

Background Preparations (Prerequisites)

Econometrics, probability and statistics, basic programming

Student Learning Outcomes

By the end of this course, students will be able to:

  1. Ensure systematic, reproducible data analysis through programming
  2. Implement machine learning algorithms
  3. Develop a causal identification strategy
  4. Identify the basic assumptions of causal inference as applied to machine learning

Texts and Journal References

  • Required: James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. "An Introduction to Statistical Learning with Applications in Python." New York: Springer, 2023. (Free PDF: https://www.statlearning.com)
  • Optional: Matheus Facure. "Causal Inference in Python: Applying Causal Inference in the Tech Industry." 1st Edition. O'Reilly Media, 2023.
  • Optional: Sutton, Richard S., and Andrew G. Barto. "Reinforcement Learning: An Introduction." Second Edition. MIT Press, Cambridge, MA, 2018. (Free PDF: http://incompleteideas.net/book/the-book-2nd.html)

    Modules

    Each week, a set of required problem sets is assigned. Supplementary readings are also provided for those who wish to delve deeper.

    1. Introduction to Causal Inference and Machine Learning
    2. Data Collection 1: Working with APIs
    3. Machine Learning Fundamentals for Estimating Treatment Effects
    4. Python Programming for Estimating Treatment Effects
    5. Estimating Heterogeneous Treatment Effects
    6. Double/Debiased Machine Learning (DML)*
    7. Introduction to Causal Forests*
    8. Multi-armed Bandits and Causal Decision Making*
    9. Instrumental Variable Lasso (IV Lasso)*
    10. Synthetic Difference-in-Differences (Diff-in-Diffs)
    11. Data Collection 2: Web Scraping
    12. Process Automation and Data Visualization
    13. Introduction to Unsupervised Learning
    14. Matrix Completion Techniques for "Missing" Data
    *Weeks marked with an asterisk (*) are subject to potential changes based on the course's evolving curriculum.

    Week 1. Introduction to Causal Inference and Machine Learning

    Econometrics recap and the gist of statistical learning and supervised/unsupervised machine learning

  • Reading: Athey, Susan and Guido Imbens (2019) Machine Learning Methods That Economists Should Know About
  • Chapter 6 from An Introduction to Statistical Learning with Applications in Python
  • News article: Data labeling in supervised learning
  • Lecture Note

    Week 2. Data Collection 1: Working with APIs

    Collect and manage covariates from the US Census, FBI Uniform Crime Reporting (UCR), Twitter, Reddit, and other sources

  • Chapter 7 from An Introduction to Statistical Learning with Applications in Python
  • Basic Programming Lecture Note
  • US Census
  • FBI Crime data
  • Reddit
  • Python package for Reddit
  • Twitter
  • Python package for Twitter
  • Jacob Kaplan's Reservoir

    Week 3. Machine Learning Fundamentals for Estimating Treatment Effects

    The promise of machine learning in estimating treatment effects

  • Lecture Notes from Dr. Brigham Frandsen's workshop
  • Chapter 8 from An Introduction to Statistical Learning with Applications in Python

    Week 4. Python Programming for Estimating Treatment Effects

  • Lecture Notes from Dr. Brigham Frandsen's workshop
  • Chapter 10 from An Introduction to Statistical Learning with Applications in Python

    Week 5. Estimating Heterogeneous Treatment Effects

  • Lecture Notes from Dr. Brigham Frandsen's workshop
  • Reading: Athey, Susan, and Guido Imbens (2016) Recursive Partitioning for Heterogeneous Causal Effects
  • Reading: Chernozhukov, Victor, Mert Demirer, Esther Duflo, and Iván Fernández-Val (2020) Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments, with Application to Immunization in India

    Week 6. Double/Debiased Machine Learning (DML)

    Lecture by Dr. Scott Cunningham

  • Chapter 22 from Causal Inference for The Brave and True
  • Optional Reading: Chapter 4 from Causal Inference in Python
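
    As a rough illustration of the partialling-out idea behind DML (not taken from the course materials), the sketch below simulates a partially linear model y = theta*d + g(x) + e, predicts d and y from x on held-out folds, and regresses the outcome residuals on the treatment residuals. The simulated data, the function names, and the choice of a plain k-nearest-neighbors average as the nuisance learner are all assumptions made to keep the example dependency-free; in practice any flexible learner could take its place.

```python
import math
import random


def knn_predict(train_x, train_t, x0, k):
    """Average the targets of the k training points nearest to x0."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))[:k]
    return sum(train_t[i] for i in nearest) / k


def dml_partialling_out(x, d, y, n_folds=2, k=15):
    """Cross-fitted residual-on-residual estimate of the effect of d on y."""
    n = len(x)
    d_res = [0.0] * n
    y_res = [0.0] * n
    for fold in range(n_folds):
        # Nuisance predictions for the held-out fold use only the other folds.
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        tx = [x[i] for i in train]
        td = [d[i] for i in train]
        ty = [y[i] for i in train]
        for i in held_out:
            d_res[i] = d[i] - knn_predict(tx, td, x[i], k)
            y_res[i] = y[i] - knn_predict(tx, ty, x[i], k)
    # Final stage: no-intercept least squares of y residuals on d residuals.
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)


def simulate(n=600, theta=2.0, seed=0):
    """Hypothetical data: x confounds both the treatment d and the outcome y."""
    rng = random.Random(seed)
    x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    d = [xi ** 2 + rng.gauss(0.0, 1.0) for xi in x]
    y = [theta * di + math.cos(xi) + rng.gauss(0.0, 1.0) for xi, di in zip(x, d)]
    return x, d, y


x, d, y = simulate()
theta_hat = dml_partialling_out(x, d, y)
print(f"estimated effect: {theta_hat:.2f} (simulated truth: 2.00)")
```

    Cross-fitting keeps each observation out of the sample used to fit its own nuisance predictions, which is what guards the final-stage estimate against overfitting in the first stage.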

    Week 7. Introduction to Causal Forests

    Lecture by Dr. Scott Cunningham

  • Reading: Athey, Susan, and Guido Imbens (2016) The Econometrics of Randomized Experiments

    Week 8. Multi-armed Bandits and Causal Decision Making

    Lecture by Dr. Scott Cunningham

  • Optional Reading: Chapter 2 from Reinforcement Learning: An Introduction.
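
    For a concrete picture of the explore/exploit trade-off, here is a minimal epsilon-greedy sketch for Bernoulli arms, in the spirit of the bandit chapter listed above. The arm success probabilities, the function name, and the parameter values are hypothetical and chosen only for illustration; this is not code from the course.

```python
import random


def epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy play on Bernoulli arms with the given means."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick an arm at random
        else:
            # exploit: pick the arm with the highest estimated mean so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward


counts, estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print("pulls per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
```

    A larger epsilon spends more rounds learning about weaker arms; a smaller one commits sooner to whichever arm currently looks best.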

    Week 9. Instrumental Variable Lasso (IV Lasso)

    Lecture by Dr. Scott Cunningham

  • Reading: Belloni, Alexandre, Victor Chernozhukov, Christian Hansen (2011) LASSO Methods for Gaussian Instrumental Variables Models

    Week 10. Synthetic Difference-in-Differences (Diff-in-Diffs)

  • Reading: Arkhangelsky, Dmitry, Susan Athey, David A. Hirshberg, Guido Imbens, and Stefan Wager (2021) Synthetic Difference in Differences
  • Python package: pysynthdid
  • Data: Castle doctrine

    Week 11. Data Collection 2: Web Scraping

    Collect information from the web, including news articles, and assemble it into a flat data file

  • Lecture Note

    Week 12. Process Automation and Data Visualization

    For reproducibility and systematic management of data analysis

    Week 13. Introduction to Unsupervised Learning

  • Chapter 12 from An Introduction to Statistical Learning with Applications in Python
  • Ludwig, Jens and Mullainathan, Sendhil, Algorithmic Behavioral Science: Machine Learning as a Tool for Scientific Discovery (July 15, 2022). Chicago Booth Research Paper No. 22-15, Available at SSRN: https://ssrn.com/abstract=4164272 or http://dx.doi.org/10.2139/ssrn.4164272

    Week 14. Matrix Completion Techniques for "Missing" Data

  • Reading: Athey, Susan, Mohsen Bayati, Nikolay Doudchenko, Guido Imbens, and Khashayar Khosravi (2021) Matrix Completion Methods for Causal Panel Data Models
  • Paper example
  • Python Coding
  • R Package: gsynth and MCPanel
  • Data: California smoking data

    Grading

    Your grade will be calculated using the following scale. Grades with plus or minus designations are at the professor’s discretion.

    Letter Grade | Grade Point | Description | Learning Outcome
    -------------|-------------|-------------|-----------------
    A | 4.0 | Complete mastery of course material and additional insight beyond course material (Overall grade percent ≥ 90) | Insightful
    B | 3.0 | Complete mastery of course material (90 > Overall grade ≥ 80) | Proficient
    C | 2.0 | Gaps in mastery of course material; not at level expected by the program (80 > Overall grade ≥ 65) | Developing
    U | 0.0 | Unsatisfactory (65 > Overall grade) | Ineffective

    Continual matriculation at CGU requires a minimum grade point average (GPA) of 3.0 in all coursework taken at CGU. Students may not have more than two incompletes. Details of the policy are found on the Student Services webpage. https://mycampus.cgu.edu/web/registrar/for-current-students/student-policies#Satisfactory_Academic_Progress

    Course Policies:

    The CGU institutional policies apply to each course offered at CGU. A few are detailed in the space below. Students are encouraged to review the student handbook for the program as well as the policy documentation within the bulletin and on the Registrar’s pages. http://bulletin.cgu.edu/

    Attendance

    Students are expected to attend all classes. Students who are unable to attend class must seek permission for an excused absence from the course director or teaching assistant. Unapproved absences or late attendance for three or more classes may result in a lower grade or an “incomplete” for the course. If a student has to miss a class, he or she should arrange to get notes from a fellow student and is strongly encouraged to meet with the teaching assistant to obtain the missed material. Missed extra-credit quizzes and papers will not be available for retaking.

    Scientific and Professional Ethics

    The work you do in this course must be your own. Feel free to build on, react to, criticize, and analyze the ideas of others but, when you do, make it known whose ideas you are working with. You must explicitly acknowledge when your work builds on someone else's ideas, including ideas of classmates, professors, and authors you read. If you ever have questions about drawing the line between others' work and your own, ask the course professor who will give you guidance. Exams must be completed independently. Any collaboration on answers to exams, unless expressly permitted, may result in an automatic failing grade and possible expulsion from the Program. Additional information on CGU academic honesty is available on the Student Services webpage. https://cgu.policystat.com/policy/2194316/latest/

    Instructor Feedback and Communication

    The best way to get in touch with me is by email. I will respond to email/voice messages within two business days.

    Expectations and Logistics

    Accommodations for Students with Disabilities:

    If you would like to request academic accommodations due to a temporary or permanent disability, contact the Dean of Students and Coordinator for Student Disability Services at DisabilityServices@cgu.edu or 909-607-9448. Appropriate accommodations are considered after you have conferred with the Office of Disability Services (ODS) and presented the required documentation of your disability to the ODS.

    Mental Health Resources

    Graduate school is a context where mental health struggles can be exacerbated. If you ever find yourself struggling, please do not hesitate to ask for help. If you wish to seek out campus resources, here is some basic information about Monsour. https://www.cuc.claremont.edu/mcaps/
    “Monsour Counseling and Psychological Services (MCAPS) is committed to promoting psychological wellness for all students served by the Claremont University Consortium. Our well-trained team of psychologists, psychiatrists, and post-doctoral and intern therapists offer support for a range of psychological issues in a confidential and safe environment.”
    Phone 909-621-8202
    Fax 909-621-8482
    After hours emergency 909-607-2000
    Tranquada Student Services Center, 1st floor
    757 College Way
    Claremont, CA 91711

    Title IX:

    If I learn of any potential violation of our gender-based misconduct policy (rape, sexual assault, dating violence, domestic violence, or stalking) by any means, I am required to notify the CGU Title IX Coordinator at Deanof.Students@cgu.edu or (909) 607-9448. Students can request confidentiality from the institution, which I will communicate to the Title IX Coordinator. If students want to speak with someone confidentially, the following resources are available on and off campus: EmPOWER Center (909) 607-2689, Monsour Counseling and Psychological Services (909) 621-8202, and The Chaplains of the Claremont Colleges (909) 621-8685. Speaking with a confidential resource does not preclude students from making a formal report to the Title IX Coordinator if and when they are ready. Confidential resources can walk students through all of their reporting options. They can also provide students with information and assistance in accessing academic, medical, and other support services they may need.
