pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.

Home Page:https://pydata.github.io/pandas-datareader/stable/index.html


Key Error (trying to fetch out of range data)

jlaurentino opened this issue

File ~/jupyter_dir/lib/python3.9/site-packages/pandas/core/indexes/base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)

KeyError: 'Date'

This happens, for example, when requesting INTC from 1900-01-01 through 1980-03-16.

There is no data for that period. We should get an empty string, or an error we can catch in a try-except routine. Instead, pandas tries to read nonexistent data and crashes.
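Until the library handles this itself, one possible workaround is to catch the KeyError on the caller's side. The sketch below is only an illustration: fetch_or_empty and _fake_fetch are hypothetical names, and _fake_fetch stands in for a real reader call (such as pdr.get_data_yahoo) by raising the same KeyError shown in the traceback, so that the example runs without network access.

```python
import pandas as pd


def fetch_or_empty(fetch, symbol, start, end):
    """Call fetch(symbol, start, end); return an empty DataFrame on KeyError."""
    try:
        return fetch(symbol, start, end)
    except KeyError:
        # Assumption: a KeyError here means the source had no rows for the
        # requested period, so the 'Date' lookup failed.
        return pd.DataFrame()


def _fake_fetch(symbol, start, end):
    # Hypothetical stand-in for a real reader call; it mimics the failure
    # reported in this issue instead of going to the network.
    raise KeyError("Date")


result = fetch_or_empty(_fake_fetch, "INTC", "1900-01-01", "1980-03-16")
print(result.empty)
```

Note that catching KeyError this broadly could also hide unrelated lookup errors, so a real caller may want to inspect the exception before treating it as "no data".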

This is not a big issue, but it matters when trying to spare the internet unnecessary traffic.

In my case, I keep some data for research and only try to fetch whatever data is not in the range already on file.
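A minimal sketch of that kind of bookkeeping is shown here, assuming the data on file covers one contiguous date range. The function name missing_ranges and the cached dates are made up for illustration; they are not part of pandas-datareader.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def missing_ranges(req_start, req_end, have_start, have_end):
    """Return the (start, end) parts of the request not already on file."""
    # Nothing on file, or no overlap at all: the whole request is missing.
    if (have_start is None or have_end is None
            or req_end < have_start or req_start > have_end):
        return [(req_start, req_end)]

    gaps = []
    if req_start < have_start:
        # Part of the request lies before the data on file.
        gaps.append((req_start, have_start - ONE_DAY))
    if req_end > have_end:
        # Part of the request lies after the data on file.
        gaps.append((have_end + ONE_DAY, req_end))
    return gaps


# Hypothetical cached range, chosen only to illustrate the calculation.
on_file = (date(1980, 3, 17), date(1985, 12, 31))
gaps = missing_ranges(date(1900, 1, 1), date(1990, 12, 31), *on_file)
for start, end in gaps:
    print(start, "through", end)
```

With this approach, a request for the gap before the first available date is exactly the kind of call that triggers the KeyError described above.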

Best,

Laurentino

This is an interesting example because if you call it with multiple tickers it will work as you want:

pdr.get_data_yahoo(["INTC", "IBM"], 1970, 1975)

The issue is that _dl_mult_symbols looks for the KeyError exception to deal with missing data, but a single call to _read_one_data doesn't generate NaNs.

@bashtage is there a reason I shouldn't add to _read_one_data to make it return missing values between start and stop?
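To make the proposal concrete, here is a rough sketch of the idea, not the library's actual code. The wrapper read_one_or_nan, the stand-in reader _no_data_reader, the business-day index, and the default column names are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def read_one_or_nan(read_one, symbol, start, end,
                    columns=("Open", "High", "Low", "Close", "Volume")):
    """Return read_one's frame, or a NaN-filled frame if it raises KeyError."""
    try:
        return read_one(symbol, start, end)
    except KeyError:
        # Assumption: fill the requested window with business days so the
        # result has the same shape as a frame that simply has no values.
        index = pd.date_range(start, end, freq="B", name="Date")
        return pd.DataFrame(np.nan, index=index, columns=list(columns))


def _no_data_reader(symbol, start, end):
    # Hypothetical stand-in for a single-symbol read that finds no rows.
    raise KeyError("Date")


frame = read_one_or_nan(_no_data_reader, "INTC", "1970-01-01", "1970-01-07")
print(frame.shape)
print(frame.isna().all().all())
```

Whether an all-NaN frame or an empty frame is the better result for a single symbol is a design question for the maintainers; the sketch only shows that the KeyError can be turned into missing values in one place.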