elifish4 / webcrawler


webcrawler

I create a class called LinkParser that inherits some methods from HTMLParser, which is why HTMLParser is passed into the class definition.
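A minimal sketch of what such a class might look like, assuming Python 3 and the standard library `html.parser` and `urllib` modules. The attribute names `base_url` and `links`, and the content-type check, are assumptions made for illustration and may differ from the code in this repository.

```python
from html.parser import HTMLParser
from urllib import parse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url  # assumed attribute: used to resolve relative links
        self.links = set()        # assumed attribute: links found so far

    def handle_starttag(self, tag, attrs):
        # HTMLParser calls this hook for every opening tag it encounters.
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    # Turn relative links such as "about.html" into absolute URLs.
                    self.links.add(parse.urljoin(self.base_url, value))

    def getLinks(self, url):
        """Fetch url and return (page_text, set_of_links)."""
        self.base_url = url
        self.links = set()
        with urlopen(url) as response:
            # Skip anything that is not HTML (images, PDFs, and so on).
            if "text/html" not in response.headers.get("Content-Type", ""):
                return "", set()
            html = response.read().decode("utf-8", errors="replace")
        self.feed(html)
        return html, self.links
```

Because the parsing lives in `handle_starttag`, the class can also be exercised without a network connection by calling `feed()` on an HTML string directly.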

The crawler takes in a URL, a word to find, and the number of pages to search through before giving up.

In the main loop, create a LinkParser, get all the links on the page, and search the page for the word or string. The getLinks function returns two things: the web page itself (useful for searching for the word) and a set of links from that web page (useful for deciding where to go next).
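The loop described above can be sketched roughly as follows. The function name `spider`, its parameter names, and the `get_links` callable are hypothetical and chosen for illustration. `get_links` stands in for a call to getLinks and is assumed to return the page text together with the links found on that page.

```python
from collections import deque


def spider(start_url, word, max_pages, get_links):
    """Breadth-first crawl from start_url, looking for word.

    get_links(url) is a hypothetical callable that must return
    (page_text, iterable_of_links). Returns the URL of the first page
    containing word, or None if max_pages pages were visited without a match.
    """
    to_visit = deque([start_url])
    seen = {start_url}            # avoid queueing the same URL twice
    pages_visited = 0

    while to_visit and pages_visited < max_pages:
        url = to_visit.popleft()
        pages_visited += 1
        try:
            html, links = get_links(url)
        except Exception:
            # A page that fails to load still counts toward the limit.
            continue
        if word in html:
            return url
        for link in links:
            if link not in seen:
                seen.add(link)
                to_visit.append(link)
    return None
```

Passing the fetch step in as an argument keeps the loop independent of the network, so it can be tried against a small in-memory dictionary of pages before pointing it at a real site.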
