sverrirs / ruvsarpur

Python script to download shows off the Icelandic RÚV website.

Home Page: https://sverrirs.github.io/ruvsarpur/

Script can't find a video that is on ruv.is

Paladenimona opened this issue

I can't download 32209, 9j5d0j. It says "Nothing found to download". The script also can't "find" it.

https://www.ruv.is/sjonvarp/spila/songvakeppnin-2022/32209/9j5d0j

It seems that the switch from https://api.ruv.is/api/programs/tv/all to https://api.ruv.is/api/programs/featured/tv earlier this year is the root cause of this. The list from featured is quite comprehensive but not complete like the one we got from the former URL. I already found 2 other shows missing from the featured list.

The only complete list endpoint I could find now was https://api.ruv.is/api/programs/all but this also includes radio entries. This list would have to be pre-filtered to include only tv entries before being passed along to the code section that queries https://api.ruv.is/api/programs/program/<SID>/all for episode data. Also the format seems very different to the one provided by featured so some data massaging is probably also needed.

Thank you for the analysis, Andri. It is correct that the API endpoint I migrated the service to doesn't have all of the entries available. I will have a look at leveraging the /programs/all endpoint.

So I did a bit more research. Changing the API endpoint is trivial; you simply have to modify

all_panel_data = api_data['panels'] if 'panels' in api_data else None
data = []
# Combine all
for panel_data in all_panel_data:
  if 'programs' in panel_data:
    data.extend(panel_data['programs'])

to this

data = []
for entry in api_data:
  if entry['format'] == 'tv':
    data.append(entry)

No other changes really needed.
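
For anyone who wants to try both endpoints side by side, here is a minimal sketch that accepts either response shape. It assumes the featured response is a dict with a 'panels' list and the /programs/all response is a flat list whose entries carry a 'format' key, as in the two snippets above. The function name extract_tv_programs and the sample data are made up for illustration and are not part of ruvsarpur.

```python
def extract_tv_programs(api_data):
    """Return a flat list of TV program entries from either response shape."""
    # Featured-style response (assumed): {'panels': [{'programs': [...]}, ...]}
    if isinstance(api_data, dict):
        programs = []
        for panel in api_data.get('panels') or []:
            # Panels without a 'programs' key are skipped
            programs.extend(panel.get('programs') or [])
        return programs
    # /programs/all-style response (assumed): flat list mixing TV and radio
    if isinstance(api_data, list):
        return [e for e in api_data
                if isinstance(e, dict) and e.get('format') == 'tv']
    # Anything else (e.g. None after a failed request) yields no programs
    return []


# Hypothetical sample data, not real API output
featured_sample = {'panels': [{'programs': [{'id': 1}]}, {'title': 'empty panel'}]}
all_sample = [{'id': 1, 'format': 'tv'}, {'id': 2, 'format': 'radio'}]

print(extract_tv_programs(featured_sample))
print(extract_tv_programs(all_sample))
```

Unlike the original snippet, this version also returns an empty list when 'panels' is missing, instead of trying to loop over None.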

Which brings me to the bigger issue: data inconsistency.
It seems that the whole of api.ruv.is is in some form of a deprecation procedure. Some endpoints don't work anymore, like /api/programs/tv/all and /programs/search/tv/<query>, while others, like the api/programs/all endpoint I mentioned in my last post, have outdated and/or inconsistent data. An example of inconsistent data would be Lottó: that show has a different sid depending on whether you query api/programs/all or api/programs/featured/tv. The sid you get from api/programs/featured/tv works when querying show details; the one from api/programs/all does not.
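
To see how widespread the Lottó-style mismatch is, one could compare the two lists by title. This is only a rough sketch: the key names 'title' and 'id' are assumptions about the JSON (adjust them to whatever the responses actually use), the helper find_sid_mismatches is hypothetical, and the sids in the sample are invented.

```python
def find_sid_mismatches(list_a, list_b, title_key='title', sid_key='id'):
    """Return {title: (sid_a, sid_b)} for shows whose sid differs between lists."""
    # Index the first list by title, skipping entries missing either key
    sids_a = {e[title_key]: e[sid_key]
              for e in list_a
              if title_key in e and sid_key in e}
    mismatches = {}
    for entry in list_b:
        title = entry.get(title_key)
        sid_b = entry.get(sid_key)
        # Only report shows present in both lists with differing sids
        if title in sids_a and sid_b is not None and sids_a[title] != sid_b:
            mismatches[title] = (sids_a[title], sid_b)
    return mismatches


# Invented sample entries; real sids will differ
from_all = [{'title': 'Lottó', 'id': 111}, {'title': 'Fréttir', 'id': 5}]
from_featured = [{'title': 'Lottó', 'id': 222}, {'title': 'Fréttir', 'id': 5}]

print(find_sid_mismatches(from_all, from_featured))
```

Matching on title is a weak join (titles may repeat or differ in spelling), so treat the result as a list of candidates to check by hand.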

Looking around at other RÚV-related utilities here on GitHub, and from a cursory analysis of ruv.is, it looks like the new way of doing things is to query RÚV's GraphQL API via https://spilari.nyr.ruv.is/gql/. Example here