justJay-dev / python-safer

A web scraping API written in Python to fetch data from the Department of Transportation's https://safer.fmcsa.dot.gov


Non-normalized SAFER data can break the scrape

justJay-dev opened this issue

commented

Because we hard-type values in html.py, non-normalized input can crash the scrape. Example: search_by_us_dot(2026085)

commented
    # Format the MCS-150 mileage/year field into a dictionary
    raw_mileage_year = data['mcs_150_mileage_year']
    data['mcs_150_mileage_year'] = {
        'mileage': str(raw_mileage_year.split(' ')[0].replace(',', '')) if raw_mileage_year else None,
        'year': str(raw_mileage_year.split(' ')[1].replace('(', '').replace(')', '')) if raw_mileage_year else None
    }
commented

Changing from int to str works around the problem. It may be worth moving every field to strings and then type-checking after the scrape.
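
A rough sketch of what "strings first, type check after the scrape" could look like for this one field. The helper names `parse_mileage_year` and `to_int_or_none` are hypothetical and not part of python-safer, and the sample inputs are made up for illustration; the actual value returned for the failing DOT number is not shown in this thread.

```python
from typing import Optional


def parse_mileage_year(raw: Optional[str]) -> dict:
    """Split a raw 'mileage (year)' string into string parts, with no casting."""
    if not raw or not raw.strip():
        return {'mileage': None, 'year': None}
    parts = raw.split()
    mileage = parts[0].replace(',', '')
    # The year may be missing on non-normalized records, so guard the index.
    year = parts[1].strip('()') if len(parts) > 1 else None
    return {'mileage': mileage, 'year': year}


def to_int_or_none(value: Optional[str]) -> Optional[int]:
    """Post-scrape type check: convert to int, or return None if that fails."""
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


# Hypothetical inputs: one well-formed value and one non-normalized value.
clean = parse_mileage_year('1,234,567 (2019)')
odd = parse_mileage_year('N/A')

clean_mileage = to_int_or_none(clean['mileage'])
odd_mileage = to_int_or_none(odd['mileage'])
```

With this split, the scrape step only ever handles strings, so unexpected input ends up as `None` after the type check instead of raising partway through the parse.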