Instructor: Michael L. Nelson mln@cs.odu.edu
Office Hours: Thursdays 3-4 and by appointment
Time: Thursdays 4:20pm - 7:00pm
Place: ECS Building, r. 3102 and/or online
Class Email list: https://groups.google.com/group/cs895-f20
After a review of the protocols and mechanics of web archiving and social media, this class will focus on using web archives to establish the veracity of information that we encounter online. The class will stress student participation and presentation. Students will summarize the research work of others (chosen from a reading list) and propose their own forensics studies in areas of their own interest. Resources that came out of the prior offering of this course include:
- Michael L. Nelson, Russell Westbrook, Shane Keisel, Fake Twitter Accounts, and Web Archives, 2019.
- Nauman Siddique, TweetedAt: Finding Tweet Timestamps for Pre and Post Snowflake Tweet IDs, 2019.
- Sawood Alam, Cookie Violations Cause Archived Twitter Pages to Simultaneously Replay in Multiple Languages, 2019.
Other forensics examples:
- The Conservative Party Speeches and Why We Need Multiple Web Archives.
- Links to abovethelaw.com broken on the live web and blocked from the archive.
- Dominic Cummings claiming he warned about a coronavirus in 2019.
- The interaction between search engine caches and web archives.
- GOP candidate Marjorie Taylor Greene spread conspiracies about Charlottesville and 'Pizzagate'.
- Right-Wing Media Outlets Duped by a Middle East Propaganda Campaign.
- The Internet Archive Is Being Used As A Disinformation Mule.
- Week 1 - September 3 - The W3C Web Architecture, Memento Protocol, and Research Issues With Web Archiving
  - Background: Memento 101, UTC, ISO 8601, robots.txt, The Missing Semester of Your CS Education
- Week 2 - September 10 - Continued: The W3C Web Architecture, Memento Protocol, and Research Issues With Web Archiving
- Week 3 - September 17 - Web Archives at the Nexus of Good Fakes and Flawed Originals; Russell Westbrook, Shane Keisel, Fake Twitter Accounts, and Web Archives
- Week 4 - September 24 - Student Presentation 1
  - Shawn -- Joan Donovan, Brian Friedberg, Source Hacking: Media Manipulation in Practice, 2019.
  - Travis -- Jennifer Golbeck et al., Fake News vs Satire: A Dataset and Analysis, Proceedings of the 10th ACM Conference on Web Science, 2018.
  - Kritika -- Scott G. Ainsworth, Michael L. Nelson, Herbert Van de Sompel, Only One Out of Five Archived Web Pages Existed as Presented, Proceedings of Hypertext 2015, 2015.
  - Grant -- Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, Understanding Web Archiving Services and Their (Mis)Use on Social Media, Proceedings of ICWSM 2018.
- Week 5 - October 1 - Student Presentation 2
  - Harveen -- Andy Greenberg, Hackers Broke Into Real News Sites to Plant Fake Stories, 2020.
  - Valentina -- Louise Lief, What the news media can learn from librarians, Columbia Journalism Review, 2016.
  - Evan -- Clifford Lynch, Stewardship in the 'Age of Algorithms', First Monday 22(12), 2017.
  - Peter -- Mohamed Aturban, Michele C. Weigle, Michael L. Nelson, Difficulties of Timestamping Archived Web Pages, Technical Report arXiv:1712.03140, 2017.
- Week 6 - October 8 - A Framework for Verifying the Fixity of Archived Web Resources
- Week 7 - October 15 - Student Forensics Studies 1
- Week 8 - October 22 - Student Forensics Studies 2
- Week 9 - October 29 - Student Presentation 3
  - Travis -- Amelia Acker, Data Craft: The Manipulation of Social Media Metadata, 2018.
  - Udith -- Max Read, How Much of the Internet Is Fake? Turns Out, a Lot of It, Actually, New York Magazine, 2018.
  - Himarsha -- Kate Starbird, Carly Miller, Examining Twitter’s policy against election-related misinformation in action, 2020.
  - Jim -- L Chai, D Bau, SN Lim, P Isola, What makes fake images detectable? Understanding properties that generalize, European Conference on Computer Vision, 2020.
- Week 10 - November 5 - Student Presentation 4
  - Kritika -- Sawood Alam, Plinio Vargas, Michele C. Weigle, Michael L. Nelson, Impact of HTTP Cookie Violations in Web Archives, Technical Report arXiv:1906.07141, 2019.
  - Grant -- Takuya Watanabe, Eitaro Shioji, Mitsuaki Akiyama, and Tatsuya Mori, Melting Pot of Origins: Compromising the Intermediary Web Services that Rehost Websites, Proceedings of NDSS 2020.
  - Evan -- Jack Cushman, Ilya Kreymer, Thinking like a hacker: Security Considerations for High-Fidelity Web Archives, 2017; Ada Lerner, Tadayoshi Kohno, Franziska Roesner, Rewriting history: Changing the archived web from the present, Proceedings of the 2017 ACM SIGSAC, 2017.
  - Shawn -- Luca Luceri, Ashok Deb, Adam Badawy, Emilio Ferrara, Red Bots Do It Better: Comparative Analysis of Social Bot Partisan Behavior, Technical Report arXiv:1902.02765, 2019.
- Week 11 - November 12 - Student Forensics Studies 3
- Week 12 - November 19 - Student Forensics Studies 4
  - Travis -- Finding The Nintendo Gigaleaks
  - Evan -- Is the World Health Organization (WHO) Influenced by Political Power?
  - Valentina -- Black Lives in the Twitterverse
  - Udith -- Co-opting of #SaveTheChildren
- Week 13 - November 26 - Thanksgiving Holiday -- no class
- Week 14 - December 3 - Student Presentation 5
- Week 15 - December 10 - Student Presentation 6
  - Harveen -- Grampa, what’s a deleted tweet? and Delete Forensics
  - Valentina -- Blacktivists in the Archive
  - Udith -- Follower Factory
- Week 16 - December 17 - Student Forensics Studies 5
  - Jim -- Analyzing the War over Diversity and Ethics in the AI Community
  - Travis -- Finding JennaMarbles’s Deleted/Private YouTube Videos
  - Kritika -- Analyzing hashtag squatting by K-pop stans
  - Harveen -- Neera Tanden deletes tweets that criticized U.S. senators
  - Grant -- Tracking Narratives on Election Fraud
  - Peter -- BidenCheated Hashtag spread on Twitter
- Election Integrity Partnership Team, Repeat Offenders: Voting Misinformation on Twitter in the 2020 United States Election, 2020.
- Melanie Smith, Interpreting Social Qs: Implications of the Evolution of QAnon, 2020.
- Mary Huber, Chinese Citizens Find Ways to Circumvent COVID-19 Censorship, 2020; Amelia Acker, Platforms, Community Archives and Remembering the Pandemic, 2020.
- Donie O'Sullivan, How We Proved That the Biggest Black Lives Matter Page on Facebook Was Fake, 2020.
- Kate Starbird, Carly Miller, Examining Twitter’s policy against election-related misinformation in action, 2020.
- L Chai, D Bau, SN Lim, P Isola, What makes fake images detectable? Understanding properties that generalize, European Conference on Computer Vision, 2020.
- Andy Greenberg, Hackers Broke Into Real News Sites to Plant Fake Stories, 2020.
- Russell Brandom, Researchers uncover six-year Russian misinformation campaign across Facebook and Reddit, 2020.
- Joan Donovan, Covid hoaxes are using a loophole to stay alive—even after content is deleted, 2020.
- Joan Donovan, Protest misinformation is riding on the success of pandemic hoaxes, 2020.
- Joan Donovan, Brian Friedberg, Source Hacking: Media Manipulation in Practice, 2019.
- Renee DiResta, Isabella García-Camargo, Virality Project (US): Marketing meets Misinformation, 2020.
- Caroline Orr, Pro-Trump & Russian-Linked Twitter Accounts Are Posing As Ex-Democrats In New Astroturfed Movement, 2018.
- Takuya Watanabe, Eitaro Shioji, Mitsuaki Akiyama, and Tatsuya Mori, Melting Pot of Origins: Compromising the Intermediary Web Services that Rehost Websites, Proceedings of NDSS 2020.
- Savvas Zannettou, Jeremy Blackburn, Emiliano De Cristofaro, Michael Sirivianos, Gianluca Stringhini, Understanding Web Archiving Services and Their (Mis)Use on Social Media, Proceedings of ICWSM 2018.
- Jack Cushman, Ilya Kreymer, Thinking like a hacker: Security Considerations for High-Fidelity Web Archives, 2017; Ada Lerner, Tadayoshi Kohno, Franziska Roesner, Rewriting history: Changing the archived web from the present, Proceedings of the 2017 ACM SIGSAC, 2017.
- Ahmer Arif, Leo Graiden Stewart, Kate Starbird, Acting the Part: Examining Information Operations Within #BlackLivesMatter Discourse, Proceedings of the ACM on Human-Computer Interaction - CSCW, 2018.
- Louise Lief, What the news media can learn from librarians, Columbia Journalism Review, 2016.
- Clifford Lynch, Stewardship in the 'Age of Algorithms', First Monday 22(12), 2017.
- Luca Luceri, Ashok Deb, Adam Badawy, Emilio Ferrara, Red Bots Do It Better: Comparative Analysis of Social Bot Partisan Behavior, Technical Report arXiv:1902.02765, 2019.
- Xinyi Zhou, Reza Zafarani, Fake News: A Survey of Research, Detection Methods, and Opportunities, Technical Report arXiv:1812.00315, 2018.
- Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, Eric P. Xing, Diffusion of Lexical Change in Social Media, PLoS ONE 9(11), 2014.
- Tom Wilson, Kaitlyn Zhou, Kate Starbird, Assembling Strategic Narratives: Information Operations as Collaborative Work within an Online Community, Proceedings of the ACM on Human-Computer Interaction - CSCW, 2018.
- Renee DiResta, The Digital Maginot Line, Ribbonfarm, 2018.
- Melanie Smith, Archives: Facebook Finds “Coordinated and Inauthentic Behavior” In the Philippines; Suspends a Set of Pro-Government Pages Ahead of May Elections, 2019.
- Justin Littman, Vulnerabilities in the U.S. Digital Registry, Twitter, and the Internet Archive, 2017; Justin Littman, Suspended U.S. government Twitter accounts, 2017.
- Mohamed Aturban, Michele C. Weigle, Michael L. Nelson, Difficulties of Timestamping Archived Web Pages, Technical Report arXiv:1712.03140, 2017.
- Scott G. Ainsworth, Michael L. Nelson, Herbert Van de Sompel, Only One Out of Five Archived Web Pages Existed as Presented, Proceedings of Hypertext 2015, 2015.
- Jennifer Golbeck et al., Fake News vs Satire: A Dataset and Analysis, Proceedings of the 10th ACM Conference on Web Science, 2018.
- Amelia Acker, Data Craft: The Manipulation of Social Media Metadata, 2018.
- Max Read, How Much of the Internet Is Fake? Turns Out, a Lot of It, Actually, New York Magazine, 2018.
- Michael L. Nelson, Why we need multiple web archives: the case of blog.reidreport.com, 2018.
- Mohammed Nauman Siddique, "Grampa, what's a deleted tweet?", 2018; Ed Summers, Delete Forensics, 2017.
- Melanie Ehrenkranz, How Archivists Could Stop Deepfakes From Rewriting History, 2018.
- Kate Starbird, The Surprising Nuance Behind the Russian Troll Strategy, 2018.
- Ed Summers, Blacktivists in the Archive, 2017.
- Clifford Lynch, Managing the Cultural Record in the Information Warfare Era, EDUCAUSE Review 53(6), 2018.
- Nicholas Confessore, Gabriel J.X. Dance, Rich Harris and Mark Hansen, The Follower Factory, The New York Times, January 27, 2018.
- Sawood Alam, Plinio Vargas, Michele C. Weigle, Michael L. Nelson, Impact of HTTP Cookie Violations in Web Archives, Technical Report arXiv:1906.07141, 2019.
- Amelia Acker, Mitch Chaiet, The weaponization of web archives: Data craft and COVID-19 publics, The Harvard Kennedy School (HKS) Misinformation Review, 2020.
- Nattiya Kanhabua, Philipp Kemkes, Wolfgang Nejdl, Tu Ngoc Nguyen, Felipe Reis, Nam Khanh Tran, How to Search the Internet Archive Without Indexing It, Proceedings of TPDL 2016.
- Hugo C. Huurdeman, Anat Ben-David, Jaap Kamps, Thaer Samar, and Arjen P. de Vries, Finding Pages on the Unarchived Web, Proceedings of JCDL 2016.
- Anat Ben-David, Counter-archiving Facebook, European Journal of Communication, 35(3), 2020.
- Anat Ben-David, Adam Amram, The Internet Archive and the socio-technical construction of historical facts, Internet Histories 2(1-2), 2018.
- Anat Ben-David, 2014 not found: a cross-platform approach to retrospective web archiving, Internet Histories 3(3-4), 2019.
- Anat Ben-David, What does the Web remember of its deleted past? An archival reconstruction of the former Yugoslav top-level domain, New Media & Society, 18(7), 2016.
- Farhan Asif Chowdhury, Lawrence Allen, Mohammad Yousuf, Abdullah Mueen, On Twitter Purge: A Retrospective Analysis of Suspended Users, 4th International Workshop on Mining Actionable Insights from Social Networks, 2020.