Wh014M / Whatsapp-Webscrapper

Web scraper to extract WhatsApp messages from a user or group conversation.

Python Whatsapp Webscrapper


Requirements


Usage

  1. Download the webdrivers and place them in the drivers folder.

    • The Chrome driver must be named chromedriver.exe
    • The Firefox driver must be named geckodriver.exe
  2. Configure the browser paths in settings.conf (only Firefox, Chrome and Brave are supported).
    The file contains example paths; replace them with your own configuration.

    browser : Name of the browser to use. Supported values: chrome/firefox. For Brave, use 'chrome'.
    binary : Path to the browser executable.
    profile_path : Path to the browser user profile.

  3. In the contacts.json file, add the user(s) or group(s) whose messages you want to extract.
    last_message is the message that marks the starting point: every message after it is collected.
    One limitation is that the first run needs a very old message as last_message, which you have to find by scrolling back through the WhatsApp conversation manually.

  4. Run the program.

    python main.py

    On the first run, a browser window opens. Log in to WhatsApp by scanning the QR code and, once the session is open, close the browser.
    Run the program again: a new window opens automatically and the program starts searching for the contacts and extracting their messages.
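
The driver naming rules from steps 1 and 2 can be checked before launching. The following is a minimal sketch, not part of the project: the names DRIVER_NAMES, expected_driver and driver_is_present are hypothetical, and it assumes the drivers folder sits in the current working directory.

```python
from pathlib import Path

# Driver file names required by step 1. Brave is configured as 'chrome',
# so it uses the Chrome driver and has no entry of its own.
DRIVER_NAMES = {"chrome": "chromedriver.exe", "firefox": "geckodriver.exe"}


def expected_driver(browser, drivers_dir="drivers"):
    """Return the driver path implied by the naming rules for a browser value."""
    key = browser.strip().lower()
    if key not in DRIVER_NAMES:
        raise ValueError(f"unsupported browser value {browser!r}: use chrome or firefox")
    return Path(drivers_dir) / DRIVER_NAMES[key]


def driver_is_present(browser, drivers_dir="drivers"):
    """Report whether the expected driver file exists in the drivers folder."""
    return expected_driver(browser, drivers_dir).is_file()
```

Calling driver_is_present("chrome") before python main.py gives an early hint when the driver is missing or misnamed.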


Data Export

The program automatically creates a folder named data in the project root and exports all messages into it in CSV format, with the following columns:

  • Date
  • Hour
  • User
  • Message
  • Emojis
  • Quoted_Message

The file naming format is as follows: [contact_name/group] [date and hour].csv, for example:

  • John Smith 2021-04-03 21.25.38.csv
  • Family Group 2021-05-03 23.50.31.csv
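
The export layout described above can be sketched as follows. This is an illustration under assumptions, not the project's actual code: the function names export_name and export_messages are hypothetical, and UTF-8 encoding is assumed for the output files.

```python
import csv
from datetime import datetime
from pathlib import Path

# Column order taken from the list in this section.
COLUMNS = ["Date", "Hour", "User", "Message", "Emojis", "Quoted_Message"]


def export_name(contact, when):
    """Build '[contact_name/group] [date and hour].csv' with dots in the time."""
    return f"{contact} {when.strftime('%Y-%m-%d %H.%M.%S')}.csv"


def export_messages(contact, rows, when, data_dir="data"):
    """Write rows (dicts keyed by COLUMNS) to a CSV file inside data_dir."""
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the data folder if needed
    path = out_dir / export_name(contact, when)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

With this sketch, export_name("John Smith", datetime(2021, 4, 3, 21, 25, 38)) produces the first file name shown above.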

Settings

Firefox profile_path

In Firefox, we can find our profile in:

  • Linux: /home/user/.mozilla/firefox/xxxxxxxx.default
  • Windows: C:/Users/[user_name]/AppData/Roaming/Mozilla/Firefox/Profiles/xxxxxxxx.default

Once you have located the path, add it to settings.conf as the profile_path value.

Chrome/Brave profile_path

For Chrome or Brave, open the browser and type chrome://version/ in the address bar. The page shows the path of the user profile.
Copy the path and add it to settings.conf as the profile_path value.
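
To show how the three settings fit together, here is a minimal sketch that reads them with Python's configparser. The exact layout of settings.conf is not documented here, so the [settings] section name, the sample paths and the load_settings function are all assumptions made for illustration.

```python
import configparser

# Hypothetical file contents: the section name and both paths are placeholders.
SAMPLE = """\
[settings]
browser : chrome
binary : C:/Program Files/Google/Chrome/Application/chrome.exe
profile_path : C:/Users/user_name/AppData/Local/Google/Chrome/User Data
"""


def load_settings(text):
    """Parse the three documented keys and validate the browser value."""
    # interpolation=None keeps any '%' characters in paths literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["settings"]
    browser = section["browser"].strip().lower()
    if browser not in ("chrome", "firefox"):
        raise ValueError("browser must be chrome or firefox (use chrome for Brave)")
    return {
        "browser": browser,
        "binary": section["binary"],
        "profile_path": section["profile_path"],
    }
```

configparser splits each line at the first ':' or '=', so drive letters such as C:/ inside the values are preserved.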


Program limitations

  • If more than one message has the same text as last_message, the program treats whichever one it finds as the last extraction point, even if that message is not the one you intended as the starting point.
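
The limitation can be illustrated with a small sketch. It assumes the conversation is scanned from the newest message backwards, which is not confirmed by the project, and the function name messages_after is hypothetical.

```python
def messages_after(history, last_message):
    """Return the messages that follow the most recent match of last_message.

    history is ordered oldest to newest. Matching is by text only, so a
    repeated message can be mistaken for the intended starting point.
    """
    for index in range(len(history) - 1, -1, -1):  # scan newest to oldest
        if history[index] == last_message:
            return history[index + 1:]
    # Assumption: nothing is collected when the marker is never found.
    return []


history = ["ok", "see you at 8", "ok", "running late"]
# The intended marker may have been the first "ok", but the later one matches.
collected = messages_after(history, "ok")
```

Here collected contains only "running late", so "see you at 8" is skipped. Choosing a distinctive message as last_message reduces the chance of such a collision.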




About

License: MIT License


Languages

Language: Python 100.0%