sselph / scraper

A scraper for EmulationStation written in Go using hashing

thegamesdb API has changed?

pgiblock opened this issue

Not sure, but I am a first-time user of this scraper. I ran into the issue "It appears that thegamesdb.net isn't up". Looking at the code, the scraper attempts to GET http://thegamesdb.net/api/GetGame.php?id=1; after following the 302, a 404 is returned. Judging from https://api.thegamesdb.net, it appears the API has changed. It looks like one now needs to hit https://api.thegamesdb.net/Games/ByGameID?id=1&apikey=<API_KEY>.
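
For illustration, a minimal Go sketch of the old and new requests (the API key value is a placeholder, not a real credential):

package main

import (
    "fmt"
    "net/http"
    "net/url"
)

func main() {
    // Old endpoint: a plain GET with no key. This now 302-redirects and ends in a 404.
    oldURL := "http://thegamesdb.net/api/GetGame.php?id=1"

    // New endpoint: the same lookup, but an apikey query parameter is required.
    // "YOUR_API_KEY" is a placeholder; real keys are issued by thegamesdb.net.
    q := url.Values{}
    q.Set("id", "1")
    q.Set("apikey", "YOUR_API_KEY")
    newURL := "https://api.thegamesdb.net/Games/ByGameID?" + q.Encode()

    for _, u := range []string{oldURL, newURL} {
        resp, err := http.Get(u)
        if err != nil {
            fmt.Println(u, "error:", err)
            continue
        }
        resp.Body.Close()
        fmt.Println(u, "->", resp.Status)
    }
}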

Is this a recent change on gdb's side? Are there any plans to support the new API? I'm going to modify the code locally, hardcode an API key temporarily, and report back. Hopefully the endpoint paths (and the addition of an API key) are all that changed, and the scraper's parser can remain as-is.

Actually I may have to remove support for this service. The API key they mention is issued to the developer and is limited to something like 1000 queries per month. I think they designed the API quota for someone running a web server that mirrors the data, not a scraper like mine.

Ideally they would reconsider and allow each user to generate their own API key with an individual quota. A shared quota for an app like mine makes no sense.

I'm working on it now... If it is minor, then expect a pull request later today.

Edit: Blarg... just read your recent comment. This stinks, as I feel their metadata is superior. Guess I'll try the 'ss' source and see if that gives me the data I want. Either that, or leverage one of the mirrors they are trying to protect against ;-)

Ah, never mind, looks like I misread. The new documentation is not very good. The limit seems like it might be per IP, so that would be roughly per user. I would just need to batch the API calls some.

At the moment there is supposedly a legacy subdomain you can add to the URL to get it working again until the code has been migrated.

Yeah. Batching sounds ideal to get the query count down. I haven't dug into the guts of the scraper enough to know how painful of a refactor that would be.

Good news: It seems that simply replacing 'thegamesdb.net' with 'legacy.thegamesdb.net' is a usable stop-gap solution.
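
As a sketch, assuming the base URL lives in a single constant (the identifier name is illustrative, not the scraper's actual one):

package scraper

// Stop-gap: point the existing XML API calls at the legacy mirror.
// gdbAPIBase is an illustrative name; the real constant in the code may differ.
const gdbAPIBase = "http://legacy.thegamesdb.net/api/" // was "http://thegamesdb.net/api/"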

Nice.

Yeah, the code was my first Go code, so it wasn't great to start with and over the years it has grown even less elegant. It does something roughly like the following, so it isn't laid out for batch processing against a single database:

for each rom found:
  for each DB:
    if result:
      break
    else:
      continue

It was designed more to try multiple databases in turn, to fill in gaps where a ROM was missing from one of them. A refactor would probably need to do something like:

for each DB:
  for each batch of unscraped roms:
    get results(batch)

Yeah, that makes sense, where the unscraped ROMs are initially the full set. Then each DB iteration sees only the set of ROMs left unresolved by the previous one. gdb might have some limit on the number of ids allowed in a single query, so some chunking might be in order as well.
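
A rough Go sketch of that shape, purely illustrative: the ROM and DB types and the batch size are assumptions here, not the scraper's real definitions.

package scraper

// ROM and DB are illustrative stand-ins for the scraper's real types.
type ROM struct{ ID, Name string }

type DB interface {
    // GetBatch resolves a batch of ROMs and returns the ones it could not match.
    GetBatch(batch []ROM) (unmatched []ROM, err error)
}

// chunk splits roms into slices of at most n, in case the API caps
// the number of ids allowed per query.
func chunk(roms []ROM, n int) [][]ROM {
    var out [][]ROM
    for len(roms) > n {
        out = append(out, roms[:n])
        roms = roms[n:]
    }
    if len(roms) > 0 {
        out = append(out, roms)
    }
    return out
}

// scrape tries each DB in order; each pass sees only the ROMs that the
// previous DBs failed to resolve, and queries them in batches.
func scrape(dbs []DB, roms []ROM, batchSize int) []ROM {
    unresolved := roms
    for _, db := range dbs {
        var next []ROM
        for _, b := range chunk(unresolved, batchSize) {
            missed, err := db.GetBatch(b)
            if err != nil {
                // On error, carry the whole batch over to the next DB.
                next = append(next, b...)
                continue
            }
            next = append(next, missed...)
        }
        unresolved = next
        if len(unresolved) == 0 {
            break
        }
    }
    return unresolved
}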

Hi there,

I'm currently maintaining TheGamesDB's new site and API and would like to give you a quick update in that regard. The new API (and site) is a complete overhaul that keeps nothing but the database from the old site, so it won't be a simple URL change; the new API now returns JSON with changed field names and data layout.
If you have any questions, feel free to tag me here or on the forum.

Regards
Zer0xFF

Thanks. Once I get an API key I'll start working on it more seriously, but if you have documentation of the response formats I can go ahead and have most of it ready. I'll start looking at refactoring the code to make batching a little easier, since the new API seems to encourage that.
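
In the meantime, here's a hypothetical sketch of what decoding the new JSON might look like; every field name below is a guess until the response formats are documented.

package scraper

import "encoding/json"

// gamesResponse is a guess at the new JSON envelope; the actual field
// names and layout are undocumented at this point.
type gamesResponse struct {
    Code   int    `json:"code"`
    Status string `json:"status"`
    Data   struct {
        Count int `json:"count"`
        Games []struct {
            ID    int    `json:"id"`
            Title string `json:"game_title"`
        } `json:"games"`
    } `json:"data"`
}

func parseGames(body []byte) (*gamesResponse, error) {
    var r gamesResponse
    if err := json.Unmarshal(body, &r); err != nil {
        return nil, err
    }
    return &r, nil
}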

I'm afraid that's not available yet, as there are still a few more things to implement, and they take priority over documentation.

We hope that keys will be reissued by next weekend.

Hi, since the API change it finds very few game images per system: e.g. for NES it finds 200 out of 400 ROMs, and for Game Boy it finds 100 out of 250. Is this going to be fixed?

After updating the scraper, the XML files have the same address, thegamesdb.net, instead of legacy.thegamesdb.net. Is this normal?

@symbios24 the legacy subdomain is the old site with only the domain changed, so the results returned shouldn't be any different.

I changed any references I was able to find, but there is the possibility I missed some. Which endpoint is still returning thegamesdb.net?

So far I tried the Game Boy/NES/Atari 2600 games and they have thegamesdb.net in the XML.

Also, Atari 5200 is still returning thegamesdb.net; I assume all the Atari systems do the same.

If you can change the scraper for PBP/PSX files to download images based on the name of the game and not on the filename's extension, that would be great.

It will require that this project (scraper) request an API key; see this post.

Then you can use the new API, e.g. https://api.thegamesdb.net/#/Games/GamesByGameName
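
A name lookup might look like the following sketch, assuming the request path mirrors the ByGameID form shown earlier; the key is again a placeholder.

package main

import (
    "fmt"
    "io"
    "net/http"
    "net/url"
)

func main() {
    // "YOUR_API_KEY" is a placeholder; a real key must be requested from thegamesdb.net.
    q := url.Values{}
    q.Set("name", "Super Mario Bros.")
    q.Set("apikey", "YOUR_API_KEY")

    resp, err := http.Get("https://api.thegamesdb.net/Games/ByGameName?" + q.Encode())
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, _ := io.ReadAll(resp.Body)
    fmt.Println(resp.Status)
    fmt.Println(string(body))
}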