google / civics_cdf_validator

Fix OCD-ID Validation with election_results_xml_validator

jdmgoogle opened this issue · comments

User feedback:

Your validator reads the local CSV file correctly but then does not find the OCD ID in the set built from that file.

The problem is that the validator compares the following two values at https://github.com/google/election_results_xml_validator/blob/master/rules.py#L454:

ocd-division/country:ee (a text string from the candidates XML) with

{b'ocd-division/country:ee', b'id'} (a set of bytes values from the local .csv file)

In Python 3 a text string never compares equal to a bytes value, so the membership test fails and the body of the if statement is never reached. Without the b prefix in {b'ocd-division/country:ee', b'id'} it works.
Example:

text = "ocd-division/country:ee"

ocds = {'id', 'ocd-division/country:ee'}

if text in ocds:
    print("in")
else:
    print("not in")
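
For contrast, here is a minimal sketch of the failing case next to one possible fix. The variable names are made up for illustration, and the UTF-8 decoding is an assumption about the CSV file's encoding, not something taken from the validator.

```python
# Hypothetical values mirroring the report: one str from the XML and
# one set of bytes, as produced by reading the CSV file in "rb" mode.
text = "ocd-division/country:ee"
ocds_bytes = {b"id", b"ocd-division/country:ee"}

# In Python 3, str and bytes never compare equal, so this lookup fails.
found_in_bytes = text in ocds_bytes

# Decoding each entry (UTF-8 assumed) yields str values that do match.
ocds_text = {entry.decode("utf-8") for entry in ocds_bytes}
found_in_text = text in ocds_text

print(found_in_bytes, found_in_text)
```

Opening the file in text mode, as in the replacement further down, avoids the decode step entirely.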

There is also a TODO here in your code https://github.com/google/election_results_xml_validator/blob/master/rules.py#L428

Your Code:
with io.open(countries_file, mode="rb") as fd:
    for line in fd:
        if line is not "":
            # TODO use a CSV Reader
            ocd_id_codes.add(line.split(b",")[0])

Replaced with:

import csv

with open(countries_file) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        # Skip the header row, whose first field is the column name 'id'.
        if line_count > 0:
            ocd_id_codes.add(row[0])
        line_count += 1

Note: We could probably use csv.DictReader (https://docs.python.org/2/library/csv.html#csv.DictReader), which reads the header row itself, so we wouldn't need the line count. We'd replace row[0] with row['id'] after ensuring that 'id' is one of the columns.
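
A rough sketch of what the DictReader variant might look like. The helper name load_ocd_ids, its id_column parameter, and the sample rows are hypothetical and only serve the illustration; they are not part of the validator.

```python
import csv
import io


def load_ocd_ids(csv_file, id_column="id"):
    """Collect the values of id_column from an open text-mode CSV file."""
    reader = csv.DictReader(csv_file)
    # fieldnames comes from the header row; it is None for an empty file.
    if not reader.fieldnames or id_column not in reader.fieldnames:
        raise ValueError("missing column: " + id_column)
    # Keep only rows that actually carry a value in the id column.
    return {row[id_column] for row in reader if row[id_column]}


# Made-up sample data standing in for the local countries file.
sample = io.StringIO(
    "id,name\n"
    "ocd-division/country:ee,Estonia\n"
    "ocd-division/country:us,United States\n"
)
ocd_id_codes = load_ocd_ids(sample)
print("ocd-division/country:ee" in ocd_id_codes)
```

Checking the header once up front keeps the loop free of per-row key checks, and the header value 'id' never ends up in the set.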