WikiPathFinder

WikiPathFinder is a tool that finds the shortest path of links from START_WIKIPEDIA_URL to END_WIKIPEDIA_URL.

For Users

Background

WikiRacing is a well-known game in which two players start at the same START_WIKIPEDIA_URL and race to reach the same END_WIKIPEDIA_URL by only clicking links. Whoever reaches END_WIKIPEDIA_URL first wins.

How to Use

Install all necessary node modules

cheerio (for web scraping): npm i cheerio
request-promise: npm i request-promise
dcp: npm i dcp-client
prompt-sync: npm i prompt-sync
require() function: npm i --save-dev @types/node

Start terminal program via node out/index.js
Enter start URL (eg: https://en.wikipedia.org/wiki/Gamer)
Enter end URL (eg: https://en.wikipedia.org/wiki/Google)
Use DCP (y/n) ? (eg: y)
Obtain results. (An empty result, [], means no path was found.)

Example Usage (with DCP)

Start terminal program via node out/index.js
Enter start URL (eg: https://en.wikipedia.org/wiki/Gamer)
Enter end URL (eg: https://en.wikipedia.org/wiki/Google)
Use DCP (y/n) ? (eg: y)
Obtain results.

Result:
START https://en.wikipedia.org/wiki/Gamer
TO https://en.wikipedia.org/wiki/Gamer
TO END https://en.wikipedia.org/wiki/Google

Example Usage (without DCP)

Start terminal program via node out/index.js
Enter start URL (eg: https://en.wikipedia.org/wiki/Among_Us)
Enter end URL (eg: https://en.wikipedia.org/wiki/University_of_Guelph)
Use DCP (y/n) ? (eg: n)
Obtain results.

Result:
START https://en.wikipedia.org/wiki/Among_Us
TO https://en.wikipedia.org/wiki/Cover_art
TO https://en.wikipedia.org/wiki/Illustration
TO https://en.wikipedia.org/wiki/Ben_Shneiderman
TO END https://en.wikipedia.org/wiki/University_of_Guelph
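
The START / TO / TO END layout in the results above could be produced from a plain array of URLs. The following is a minimal sketch, assuming the path is held as a string array and that an empty array means no path was found. The name formatPath is hypothetical and is not taken from the project's source.

```typescript
// Hypothetical helper (not from the project): renders a path in the
// START / TO / TO END layout. An empty array means no path was found.
function formatPath(path: string[]): string {
  if (path.length === 0) {
    return "No path found";
  }
  return path
    .map((url, i) => {
      if (i === 0) return `START ${url}`; // first page
      if (i === path.length - 1) return `TO END ${url}`; // last page
      return `TO ${url}`; // intermediate click
    })
    .join("\n");
}

console.log(
  formatPath([
    "https://en.wikipedia.org/wiki/Among_Us",
    "https://en.wikipedia.org/wiki/Cover_art",
    "https://en.wikipedia.org/wiki/University_of_Guelph",
  ])
);
```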

For Developers

Installing Dependencies

Install all dependencies listed in package.json: npm install

Node Dependencies

If npm install doesn't work:

TypeScript: npm i -g typescript
cheerio (for web scraping): npm i cheerio
request-promise: npm i request-promise
request-promise types: npm i --save-dev @types/request-promise
prompt-sync: npm i prompt-sync
prompt-sync types: npm i --save-dev @types/prompt-sync

How we use DCP

The original plan was to run both the web scraping and the findPath algorithm on DCP. However, a mentor told us that DCP cannot access the internet, which severely limited how we could integrate DCP into our idea. For now, the user has two options, using DCP or not using DCP, each with its own pros and cons:

Using DCP Pros:

The findPath algorithm runs very fast.
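
Because DCP has no internet access, a path search running there can only work on link data that was scraped in advance. The sketch below shows one way such an offline search could look. It assumes breadth-first search, which the README does not confirm, and the LinkGraph type and this findPath signature are illustrative guesses rather than the project's actual code.

```typescript
// Assumed shape of the pre-scraped data: page URL -> URLs linked from that page.
type LinkGraph = Map<string, string[]>;

// Breadth-first search (an assumption; the project's real findPath may differ).
// Returns the path with the fewest clicks, or [] when no path exists.
function findPath(graph: LinkGraph, start: string, end: string): string[] {
  if (start === end) return [start];
  const parent = new Map<string, string>(); // page -> page it was reached from
  const visited = new Set<string>([start]);
  const queue: string[] = [start];
  for (let head = 0; head < queue.length; head++) {
    const page = queue[head];
    for (const next of graph.get(page) ?? []) {
      if (visited.has(next)) continue;
      visited.add(next);
      parent.set(next, page);
      if (next === end) {
        // Walk the parent links back to the start, then reverse.
        const path = [end];
        let cur = end;
        while (cur !== start) {
          cur = parent.get(cur) as string;
          path.push(cur);
        }
        return path.reverse();
      }
      queue.push(next);
    }
  }
  return []; // matches the "[] means no path found" convention
}
```

Breadth-first search visits pages in order of click distance, so the first time the end page is reached the path is already a shortest one.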

Using DCP Cons:

All web scraping has to be done on the local system beforehand because DCP has no internet access.
Because of how our algorithm works, it uses either 1 worker or 300+ workers (300+ is the number of URLs on the Wikipedia start page).
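
One reading of the "1 worker or 300+ workers" trade-off is that the search is split by the links on the start page, with one unit of work per link. The sketch below illustrates that idea only. Slice, makeSlices, and pickShortest are hypothetical names, and the code does not call the dcp-client API, so how the slices would actually be submitted to DCP is left out.

```typescript
// Hypothetical unit of work: search from one first-level link toward the end page.
interface Slice {
  firstLink: string;
  end: string;
}

// One slice per distinct link on the start page, so a start page with
// 300+ links yields 300+ slices.
function makeSlices(startLinks: string[], end: string): Slice[] {
  const unique = Array.from(new Set(startLinks));
  return unique.map((firstLink) => ({ firstLink, end }));
}

// Combine per-slice results: keep the shortest non-empty path and
// put the start page back in front of it.
function pickShortest(start: string, results: string[][]): string[] {
  let best: string[] = [];
  for (const path of results) {
    if (path.length === 0) continue; // this slice found nothing
    if (best.length === 0 || path.length < best.length) best = path;
  }
  return best.length === 0 ? [] : [start, ...best];
}
```

With this split there is no natural middle ground: either the whole search runs as a single unit, or it fans out to as many units as the start page has links.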

Without DCP Pros:

Web scraping is dynamic.
The user can enter any Wikipedia links.
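
Dynamic scraping means each page's outgoing links are extracted as the search reaches it. The sketch below shows the extraction step on an HTML string. It uses a regular expression instead of cheerio so that it has no dependencies, and both the extractWikiLinks name and the filtering rules are assumptions rather than the project's own logic.

```typescript
// Hypothetical helper: extracts links to other Wikipedia articles from page HTML.
// Uses a regular expression rather than cheerio to stay dependency-free.
function extractWikiLinks(html: string, origin = "https://en.wikipedia.org"): string[] {
  // Capture the /wiki/... part of each href, ignoring any #fragment or ?query.
  const hrefPattern = /href="(\/wiki\/[^"#?]+)[^"]*"/g;
  const seen = new Set<string>();
  const links: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const path = match[1];
    // Simplification: skip namespaced pages such as File:, Help:, or Special:.
    // This also drops the few article titles that contain a colon.
    if (path.indexOf(":") !== -1) continue;
    const url = origin + path;
    if (!seen.has(url)) {
      seen.add(url);
      links.push(url);
    }
  }
  return links;
}
```

Fetching the HTML is omitted here; in the project that step is presumably handled by request-promise, with cheerio doing the parsing.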

Without DCP Cons:

Path finding takes a very long time.
Web scraping speed is limited by the user's system.

About

HackTheJob 2022 Submission - WikiPathFinder is a program that finds the fastest possible path from one wiki page to another by only clicking links.

JeremyTubongbanua / WikiPathFinder