sammacbeth / tracker-radar-detector

Code used to build a Tracker Radar data set from raw crawl data.

DuckDuckGo Tracker Radar Detector

This is the code used to build a Tracker Radar data set using crawl data from the Tracker Radar Collector.

Getting Started

To generate a Tracker Radar data set, follow these steps:

  1. Clone the Tracker Radar data repo

  2. Generate 3rd party request data using the Tracker Radar Collector

  3. Update the paths in config.json to point to your newly created crawler data files and the location of your Tracker Radar data repository

      • trackerDataLoc: path to your Tracker Radar data repository
      • crawlerDataLoc: path to your crawler data directory
      • performanceDataLoc: path to your performance crawler data
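
Before running the build, it can help to confirm that the configured paths actually exist. The following is a minimal sketch and not part of the repository: findMissingPaths is a hypothetical helper, and it assumes config.json is a flat JSON object containing the three keys listed above. A missing performanceDataLoc may be acceptable if you skip the optional performance summary.

```javascript
const fs = require('fs');

// The three path keys described in the steps above.
const PATH_KEYS = ['trackerDataLoc', 'crawlerDataLoc', 'performanceDataLoc'];

// Hypothetical helper (not part of the repository): returns the keys whose
// values are missing, empty, or point to a location that does not exist.
// `exists` is injectable so the check can be exercised without touching disk.
function findMissingPaths(config, exists = fs.existsSync) {
    return PATH_KEYS.filter((key) => {
        const value = config[key];
        return typeof value !== 'string' || value.length === 0 || !exists(value);
    });
}

// Example usage, assuming config.json sits in the current directory:
// const config = JSON.parse(fs.readFileSync('config.json', 'utf8'));
// console.log(findMissingPaths(config));
```

An empty array means every configured path was found.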

Generating Tracker Radar data

  • Install dependencies

npm install

  • Build site performance summary (optional)

npm run build-performance

  • Update entity data (optional). Note: requires some manual validation of the output data, see here for more info

npm run update-entities
npm run apply-entity-changes

  • Build Tracker Radar data files

npm run build
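
The order of the commands above can be summarized in a short sketch. planBuild is a hypothetical helper, not part of the repository; it only lists the commands and does not run them, since the entity step requires some manual validation of its output.

```javascript
// Hypothetical helper (not part of the repository): lists the commands
// from the steps above in the order they appear, skipping optional ones
// unless they are requested.
function planBuild({ performance = false, entities = false } = {}) {
    const steps = ['npm install'];
    if (performance) {
        steps.push('npm run build-performance');
    }
    if (entities) {
        // The output of the entity update needs some manual validation.
        steps.push('npm run update-entities', 'npm run apply-entity-changes');
    }
    steps.push('npm run build');
    return steps;
}

// Print the plan for a build that includes the performance summary.
console.log(planBuild({ performance: true }).join('\n'));
```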

Note that resolving CNAMEs requires Node.js version 12 or later. You can disable CNAME resolution by setting the options treatCnameAsFirstParty=true and keepFirstParty=false in the config file.
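
The CNAME requirement can be checked before starting a build. This is a minimal sketch under stated assumptions, not the project's own logic: the helper names are hypothetical, and it assumes the two options are top-level booleans in config.json.

```javascript
// Hypothetical helpers (not part of the repository).

// Parses the major version from a string such as process.version ("v12.18.3").
function nodeMajorVersion(version = process.version) {
    return parseInt(version.replace(/^v/, '').split('.')[0], 10);
}

// True when both options are set to the values that disable CNAME resolution.
function cnameResolutionDisabled(config) {
    return config.treatCnameAsFirstParty === true && config.keepFirstParty === false;
}

// Returns an error message when CNAME resolution is enabled on a Node.js
// version older than 12, otherwise null.
function checkCnameSetup(config, version = process.version) {
    if (cnameResolutionDisabled(config)) {
        return null;
    }
    if (nodeMajorVersion(version) < 12) {
        return `CNAME resolution needs Node.js 12 or later, found ${version}`;
    }
    return null;
}

// Example usage with the running Node.js version and an empty config:
// console.log(checkCnameSetup({}));
```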

Contributing

Reporting bugs

  1. Check whether the bug has already been reported
  2. Create a bug report issue

New features

Right now all new feature development is handled internally.

Bug fixes

Most bug fixes are handled internally, but we will accept pull requests for bug fixes if you first:

  1. Create an issue describing the bug.
  2. Get approval from DDG staff before working on it. Since most bug fixes and feature development are handled internally, we want to make sure that your work doesn't conflict with any current projects.

Questions or help with anything else DuckDuckGo related?

See DuckDuckGo Help Pages.

This software is licensed under the terms of the Apache License, Version 2.0 (see LICENSE).
