JTFouquier / tokenizer

Tokenizer homework for Bioinformatics class

7711 Homework 2 Type/Token Ratios

Use Python 3 to run hw2.py:

If Python 3 is your default interpreter, run python hw2.py. Otherwise, run python3 hw2.py.

If you have any questions about running this program, please contact jennifer dot harper @ucdenver.edu or ping me on the CPBS Slack channel.

Original Directions

Homework: type/token ratios

  1. Write a program to count the number of word tokens in a text file. Compare your output to a classmate’s. What decisions did you make differently regarding what counts as a “token”?
  2. Write a program to count the number of word types in a text file. Compare your output to a classmate’s. What decisions did you make differently regarding what counts as a type?
  3. As you have seen, type/token ratios are widely used in research in neurology, psychiatry, the social sciences, and the humanities. How could a researcher’s definitions of “type” and “token” affect whether or not they have a positive result to report?
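
To make the questions above concrete, here is a minimal sketch of how two sets of tokenization decisions change the counts for the same text. It is not the code in hw2.py. The function names, the regular expressions, and the sample sentence are all assumptions chosen for illustration.

```python
import re


def tokenize(text, lowercase=True, keep_punctuation=False):
    """Split text into tokens under one hypothetical set of decisions."""
    if lowercase:
        text = text.lower()
    if keep_punctuation:
        # A word (with internal apostrophes) or a single punctuation mark.
        pattern = r"\w+(?:'\w+)*|[^\w\s]"
    else:
        # Words only; punctuation is discarded.
        pattern = r"\w+(?:'\w+)*"
    return re.findall(pattern, text)


def type_token_ratio(tokens):
    """Number of distinct tokens (types) divided by total tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sample text, not taken from the assignment.
sample = "The cat saw the dog. The dog didn't see the cat!"

# Decision set A: case-insensitive, punctuation ignored.
strict = tokenize(sample)
# Decision set B: case-sensitive, punctuation marks count as tokens.
loose = tokenize(sample, lowercase=False, keep_punctuation=True)

print(len(strict), len(set(strict)), type_token_ratio(strict))  # 11 tokens, 6 types
print(len(loose), len(set(loose)), type_token_ratio(loose))     # 13 tokens, 9 types
```

Under these assumptions the same sentence yields a ratio of 6/11 (about 0.55) or 9/13 (about 0.69), depending only on how case and punctuation are handled. That gap is the kind of difference question 3 asks about.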

