hse-aml / natural-language-processing

Resources for "Natural Language Processing" Coursera course.

Home Page: https://www.coursera.org/learn/language-processing

Problem opening 'data/text_prepare_tests.tsv' file

mpizosdim opened this issue

Using the Docker container environment, I am getting a UnicodeDecodeError. More specifically, this code:

prepared_questions = []
for line in open('data/text_prepare_tests.tsv'):
    line = text_prepare(line.strip())
    prepared_questions.append(line)
text_prepare_results = '\n'.join(prepared_questions)
grader.submit_tag('TextPrepare', text_prepare_results)

gives the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 79: ordinal not in range(128)

To get it to run, I had to pass the encoding explicitly:

prepared_questions = []
for line in open('data/text_prepare_tests.tsv', encoding='utf-8'):
    line = text_prepare(line.strip())
    prepared_questions.append(line)
text_prepare_results = '\n'.join(prepared_questions)
grader.submit_tag('TextPrepare', text_prepare_results)

It can also be solved by reading the file with pd.read_csv.
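For anyone trying to reproduce this outside the container, here is a minimal sketch of why the explicit encoding matters. It assumes the failing byte 0xe2 is the first byte of a UTF-8 encoded character such as a curly apostrophe (a guess, since the actual line from the file is not shown here). The sample text and the temporary file are hypothetical stand-ins for the course data, and passing encoding="ascii" only imitates an environment where open() falls back to ASCII.

```python
import os
import tempfile

# Hypothetical sample line; the real content of text_prepare_tests.tsv is not shown in this thread.
sample = "What’s the difference between a list and a tuple?\n"

# Write the sample as UTF-8 bytes (assumption: the data file is stored as UTF-8).
with tempfile.NamedTemporaryFile("wb", suffix=".tsv", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
    path = tmp.name


def read_lines(path, encoding):
    """Read stripped lines from path using an explicit encoding."""
    with open(path, encoding=encoding) as handle:
        return [line.strip() for line in handle]


try:
    # Forcing ASCII imitates an environment where open() defaults to ASCII.
    try:
        read_lines(path, "ascii")
        ascii_failed = False
    except UnicodeDecodeError as err:
        ascii_failed = True
        failing_byte = err.object[err.start]  # first byte the ASCII codec rejected

    # An explicit UTF-8 encoding decodes the same bytes regardless of locale.
    utf8_lines = read_lines(path, "utf-8")
finally:
    os.remove(path)

print("ASCII read failed:", ascii_failed)
print("UTF-8 read:", utf8_lines)
```

Passing the encoding explicitly keeps the notebook independent of whatever locale the container happens to be configured with.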

Can anyone else reproduce this error?

Hi! Which Python version do you use?

@voron13e02 I am running it in the Docker image provided in the repository. (From the Dockerfile I can see it's Python 3.)
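A quick way to check both the interpreter version and the encoding in question is sketched below. It relies on my understanding that, when no encoding argument is given, open() in text mode uses the locale's preferred encoding; the variable names are just for illustration.

```python
import locale
import sys

# Interpreter version, to answer the "which Python version" question.
python_version = "{}.{}.{}".format(*sys.version_info[:3])

# Encoding that text-mode open() is expected to fall back to when none is passed
# (assumption: it follows the locale's preferred encoding).
default_file_encoding = locale.getpreferredencoding(False)

print("Python:", python_version)
print("Default text encoding:", default_file_encoding)
```

If the second line reports an ASCII variant inside the container, that would be consistent with the 'ascii' codec error above.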

Fixed.