Add support to specify the search term for a tv show

Question

JorTurFer opened this issue 2 years ago · comments

Jorge Turrado Ferrero commented 2 years ago

Nefarious monitors the availability of new episodes for TV shows, and this is an awesome feature ❤️
The problem is that on non-English trackers, uploaders sometimes name the torrents with the Spanish translation and sometimes with the English name. In these cases, it'd be nice if I could override the search term for a specific TV show.

For example, The Legend of Vox Machina is translated as La leyenda de Vox Machina. Nefarious tries to look for La leyenda de Vox Machina, but the uploader uploaded the whole show using the original name instead of the translated name. For movies this is not a problem because I select the torrent manually, but for TV shows it means that I have to add all the torrents one by one, and I also lose automatic tracking of new episodes.

Thanks for this awesome tool! ❤️ ❤️ ❤️ ❤️

lardbit · Answer 1 · Sat Feb 05 2022 00:49:48 GMT+0800 (China Standard Time)

Hey @JorTurFer,

Yeah that certainly makes sense. I'm trying to think of the best way to solve this. I'd like to avoid having users manually editing records.

TMDB returns the language-specific results, but it also includes the original_name. So, when searching for The Legend of Vox Machina, it returns "name": "La leyenda de Vox Machina" and "original_name": "The Legend of Vox Machina".

So, there could just be a checkbox for shows/movies labeled "use original title when searching", and then nefarious would use original_name instead of name when searching torrents.

Would that suffice for your use case?

Example requesting Spanish results:

https://api.themoviedb.org/3/search/tv?api_key=21c8985a267ac3f11ea75baf2c05c3ba&query=The%20Legend%20of%20Vox%20Machina&language=es

{
    "page": 1,
    "results": [{
        "backdrop_path": "/lX33BV2g6O2B6PwMtTUSyzGrfq9.jpg",
        "first_air_date": "2022-01-27",
        "genre_ids": [
            16,
            10765
        ],
        "id": 135934,
        "name": "La leyenda de Vox Machina",
        "origin_country": [
            "US"
        ],
        "original_language": "en",
        "original_name": "The Legend of Vox Machina",
        "overview": "Son un hatajo de pendencieros inadaptados reconvertidos en mercenarios. A Vox Machina le interesa más el dinero fácil y la cerveza barata que proteger el reino. Pero cuando este se ve amenazado por algo maligno, esta bulliciosa panda se da cuenta de que nadie más puede restablecer la justicia. Lo que empezó como un día de pago más es ahora la historia del origen de los nuevos héroes de Exandria.",
        "popularity": 178.913,
        "poster_path": "/4fqfhmVNOHe2nLcligiVMtMnfeM.jpg",
        "vote_average": 8.8,
        "vote_count": 28
    }],
    "total_pages": 1,
    "total_results": 1
}
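The checkbox idea above could be sketched roughly as follows. This is only an illustration, not nefarious code: pick_search_title and use_original are hypothetical names, and the only facts taken from TMDB are the field names (name/original_name for TV results, title/original_title for movie results).

```python
def pick_search_title(result: dict, use_original: bool) -> str:
    """Return the title to use for torrent searches from one TMDB result.

    Hypothetical helper: the function name and the use_original flag are
    assumptions made for this sketch, not part of nefarious.
    """
    # TV results carry "name"/"original_name"; movie results carry
    # "title"/"original_title".
    translated = result.get("name") or result.get("title")
    original = result.get("original_name") or result.get("original_title")
    # Fall back to the translated title if no original title is present.
    if use_original and original:
        return original
    return translated


tmdb_result = {
    "name": "La leyenda de Vox Machina",
    "original_name": "The Legend of Vox Machina",
}
print(pick_search_title(tmdb_result, use_original=True))
print(pick_search_title(tmdb_result, use_original=False))
```

With the checkbox enabled, the torrent search would use the original title; otherwise it would keep using the translated one.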

Jorge Turrado Ferrero · Answer 2 · Sat Feb 05 2022 01:35:54 GMT+0800 (China Standard Time)

That's perfect for my use case :)

lardbit · Answer 3 · Sat Feb 05 2022 03:13:06 GMT+0800 (China Standard Time)

Great. I don't think this will be a difficult implementation. I'll leave this ticket open until I have time to work on it.

Jorge Turrado Ferrero · Answer 4 · Sat Feb 05 2022 17:15:12 GMT+0800 (China Standard Time)

Thanks!! No rush at all 😄

lardbit · Answer 5 · Tue Feb 15 2022 05:18:23 GMT+0800 (China Standard Time)

I think I spoke too soon on the difficulty of this task. All the torrent name parsing logic (borrowed from Sonarr/Radarr) largely expects Latin-based languages, which is obviously very limiting. In addition, I'm noticing that releases of foreign-language films (relative to the USA) often include both the original title and the English title, which is a separate challenge because the parser has to tell the two titles apart.

For instance, searching for the movie Parasite (whose original Korean title is 기생충) returns results like:

기생충 Parasite (2019) (2160p BluRay x265 HEVC 10bit HDR AAC 7.1 Bandi)

Jorge Turrado Ferrero · Answer 6 · Tue Feb 15 2022 05:28:38 GMT+0800 (China Standard Time)

In that case no worries, thanks for trying it ❤️
WDYT about the option of specifying the name manually? Maybe it's easier.
If not, don't worry; as I said, you tried it :) You are building an awesome system.

lardbit · Answer 7 · Tue Feb 15 2022 05:35:10 GMT+0800 (China Standard Time)

I think specifying the title manually would have the same challenges. For instance, if you're searching for the original title of Parasite (e.g. 기생충), then nefarious would have to parse results like 기생충 Parasite (2019) (2160p BluRay x265 HEVC 10bit HDR AAC 7.1 Bandi) and ignore the title Parasite in order to match the original title. It gets a little tricky. Maybe I'm missing something, though. Do you have an idea for how to solve this?

Jorge Turrado Ferrero · Answer 8 · Tue Feb 15 2022 05:44:56 GMT+0800 (China Standard Time)

But you could ignore the current value if another value has been specified. I mean, if I specify 기생충 because I know that my tracker uses the original name, nefarious could drop Parasite from the search and search only with the provided name: {givenName} (2160p BluRay x265 HEVC 10bit HDR AAC 7.1 Bandi).
The provided name would replace the original one, so no extra parsing is needed.

lardbit · Answer 9 · Tue Feb 15 2022 07:39:48 GMT+0800 (China Standard Time)

That's true. Maybe it could be that simple. So, if the "search original name" option is chosen, we'd just need to strip out any matching translated part. If we're looking for the Japanese film Spirited Away (original title 千と千尋の神隠し) and a search result was:

劇場版 千と千尋の神隠し Spirited.Away (Sen to Chihiro no Kamikakushi) (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED] (Jap,Eng,Fre,Ger,Fin,Kor,Chi)

We'd have to remove Spirited.Away (and any other word-separator variation).

I'll give this a test and see how well it does.
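A minimal sketch of that stripping step, assuming a hypothetical helper named strip_translated_title (nothing here comes from the nefarious code base). It builds a regular expression that joins the words of the translated title with any run of spaces, dots, underscores or hyphens, then removes the match from the release name:

```python
import re


def strip_translated_title(release_name: str, translated_title: str) -> str:
    """Remove the translated title from a release name.

    Hypothetical helper for illustration only. The words of the translated
    title may be joined by any mix of spaces, dots, underscores or hyphens.
    """
    words = [re.escape(word) for word in translated_title.split()]
    if not words:
        # Nothing to strip; an empty pattern would match everywhere.
        return release_name
    pattern = r"[\s._-]+".join(words)
    stripped = re.sub(pattern, " ", release_name, flags=re.IGNORECASE)
    # Collapse the extra whitespace left behind by the removal.
    return re.sub(r"\s{2,}", " ", stripped).strip()


release = (
    "劇場版 千と千尋の神隠し Spirited.Away (Sen to Chihiro no Kamikakushi) "
    "(BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED] (Jap,Eng,Fre,Ger,Fin,Kor,Chi)"
)
print(strip_translated_title(release, "Spirited Away"))
```

This only removes the translated title; whether the remaining string then parses correctly is a separate question for the existing parsers.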

lardbit · Answer 10 · Tue Feb 15 2022 07:46:10 GMT+0800 (China Standard Time)

Well, shoot. I just tested the current parsing logic in nefarious and it doesn't match anything for the above result, with Spirited Away removed. nefarious has a command line parsing utility to tell you what it parses and it unfortunately didn't work:

python manage.py re-test-movie "劇場版 千と千尋の神隠し (Sen to Chihiro no Kamikakushi) (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED] (Jap,Eng,Fre,Ger,Fin,Kor,Chi)"

Returns None.

I'm assuming we'd have to update the parsing logic, and I haven't dug into that yet. Maybe it's a non-Latin language issue?

Parsing logic for movies:
https://github.com/lardbit/nefarious/blob/master/src/nefarious/parsers/movie.py

Jorge Turrado Ferrero · Answer 11 · Tue Feb 15 2022 15:46:33 GMT+0800 (China Standard Time)

Let me try during the day using that name with the series I mentioned (The Legend of Vox Machina).
Should I append the quality to the name? I mean, what should the command look like? (Sorry, I'm a total noob with Python.)

Jorge Turrado Ferrero · Answer 12 · Tue Feb 15 2022 15:48:40 GMT+0800 (China Standard Time)

Something like python manage.py re-test-movie "La Legenda de Vox Machina (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED]" ?

lardbit · Answer 13 · Tue Feb 15 2022 22:34:02 GMT+0800 (China Standard Time)

Yeah, you have the command right. I usually just run a manual search against jackett to find real responses to test with, and then pass the title to the command line utility to see how nefarious parses it. I found out a couple of things. First, most of the parsers expect a year in the title, which is why the previous examples weren't working. Second, a while ago I added a unicode "transliteration" to ASCII, which aimed to successfully match non-original ASCII titles produced by indexers (e.g. Ö becomes O). However, that works against our intention here. We could conditionally disable that transliteration when the "search original title" option is enabled.
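To make the transliteration problem concrete, here is a rough sketch using only the Python standard library. It is a stand-in, not the transliteration nefarious actually uses: normalize_title and its transliterate flag are hypothetical names, and the NFKD-plus-ASCII approach is just one simple way to turn Ö into O.

```python
import unicodedata


def normalize_title(title: str, transliterate: bool = True) -> str:
    """Lowercase a title and optionally reduce it to plain ASCII.

    Hypothetical helper: the name and the transliterate flag are
    assumptions for this sketch.
    """
    lowered = title.lower()
    if not transliterate:
        # Keep the original script, e.g. when searching original titles.
        return lowered
    # Split accented letters into base letter + combining mark,
    # then drop everything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", lowered)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(normalize_title("Ö"))
print(repr(normalize_title("千と千尋の神隠し")))
print(normalize_title("千と千尋の神隠し", transliterate=False))
```

With this particular approach, an accented Latin letter survives as its base letter, but a Japanese title is reduced to an empty string, so there is nothing left to compare against. Skipping the step when the original-title option is enabled keeps the title intact.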

Here's a quick way to set up your local python environment to be able to run the command line utility:

Change to nefarious source directory

cd nefarious/src

Create new python virtual environment (in /tmp)

python3 -mvenv /tmp/nefarious

Install python dependencies:

/tmp/nefarious/bin/pip install -r requirements.txt

Parse 1:

/tmp/nefarious/bin/python manage.py re-test-movie "La Legenda de Vox Machina 2022"

{'title': 'la legenda de vox machina', 'year': ['2013'], 'match_name': 'Normal movie format, e.g: Mission.Impossible.3.2011', 'quality': 'Bluray-720p', 'resolution': 'unknown', 'hc': False}

Parse 2 without transliteration (just return title in function):

/tmp/nefarious/bin/python manage.py re-test-movie "劇場版 千と千尋の神隠し 2013 (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED]"

{'title': '劇場版千と千尋の神隠し', 'year': ['2013'], 'match_name': 'Normal movie format, e.g: Mission.Impossible.3.2011', 'quality': 'Bluray-720p', 'resolution': 'unknown', 'hc': False}

Movie Parsers
https://github.com/lardbit/nefarious/blob/master/src/nefarious/parsers/movie.py
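To illustrate why a missing year makes a title unparseable, here is a deliberately simplified pattern. It is a hypothetical stand-in and not one of the actual expressions in movie.py; TITLE_YEAR and parse_title_year are names invented for this sketch.

```python
import re

# Hypothetical, simplified pattern: a title, one or more separators,
# then a four-digit year starting with 19 or 20.
TITLE_YEAR = re.compile(
    r"^(?P<title>.+?)[\s._(\[]+(?P<year>(?:19|20)\d{2})(?!\d)"
)


def parse_title_year(release_name: str):
    """Return the title and year, or None when no year is present."""
    match = TITLE_YEAR.match(release_name)
    if match is None:
        return None
    return {"title": match.group("title").strip(), "year": match.group("year")}


with_year = "劇場版 千と千尋の神隠し 2013 (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED]"
without_year = "劇場版 千と千尋の神隠し (Sen to Chihiro no Kamikakushi) (BD 1280x720p AVC AACx9 Subx7).mp4 [encoded by SEED]"

print(parse_title_year(with_year))
print(parse_title_year(without_year))
```

A pattern anchored on the year has nothing to split on when the release name lacks one, so the second call returns None, much like the earlier re-test-movie attempt.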

lardbit · Answer 14 · Tue Feb 15 2022 22:50:08 GMT+0800 (China Standard Time)

So, long story short, disabling transliteration when searching original titles may do the trick. I'll investigate.