HelloChatterbox / PyWikiHow

scrap wiki how for info

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

PyWikiHow

An unofficial WikiWow python API. Uses BeautifulSoup to scrape WikiHow and return the data you want.

Installation

pip install pywikihow

Usage

Random How To

Learn random stuff! Retuns a random WikiHow article. Sometimes they're weird.

from pywikihow import RandomHowTo

how_to = RandomHowTo()
how_to.print()

Searching

from pywikihow import WikiHow, search_wikihow


max_results = 1  # default for optional argument is 10
how_tos = search_wikihow("how to learn programming", max_results)
assert len(how_tos) == 1
how_tos[0].print()


# for efficiency and to get unlimited entries, the best is to use the generator
for how_to in WikiHow.search("how to learn python"):
    how_to.print()

Parsing

Manipulate HowTo objects

from pywikihow import HowTo

how_to = HowTo("https://www.wikihow.com/Train-a-Dog")

data = how_to.as_dict()

print(how_to.url)
print(how_to.title)
print(how_to.intro)
print(how_to.n_steps)
print(how_to.summary)

first_step = how_to.steps[0]
first_step.print()
data = first_step.as_dict()

how_to.print(extended=True)

ToDo

  • Many WikiHow articles also contain "Parts" which break down further into sub-steps. Write a function to parse these additional divisions.
  • Add parser for tips
  • Add parser for warnings

About

scrap wiki how for info

License:MIT License


Languages

Language:Python 100.0%