jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!

YouTubeTranscriptApi.list_transcripts and YouTubeTranscriptApi.get_transcript raise `KeyError: 'translationLanguages'`

achikapa opened this issue · comments

First of all thanks for this awesome library! I came across this unexpected error:

To Reproduce

Call YouTubeTranscriptApi.list_transcripts or YouTubeTranscriptApi.get_transcript with video_id = "KSgWiokWS6g".

What code / cli command are you executing?

from youtube_transcript_api import YouTubeTranscriptApi

YouTubeTranscriptApi.get_transcript("KSgWiokWS6g")

or

YouTubeTranscriptApi.list_transcripts("KSgWiokWS6g")

Which Python version are you using?

Python 3.11.6

Which version of youtube-transcript-api are you using?

youtube-transcript-api 0.6.1

Expected behavior

I expected a transcript to be returned, or one of the following exceptions to be raised if no transcript can be retrieved: TranscriptsDisabled, NoTranscriptFound, NoTranscriptAvailable.

Actual behaviour

A KeyError: 'translationLanguages' is raised:

    119 @staticmethod
    120 def build(http_client, video_id, captions_json):
    121     """
    122     Factory method for TranscriptList.
    123 
   (...)
    131     :rtype TranscriptList:
    132     """
    133     translation_languages = [
    134         {
    135             'language': translation_language['languageName']['simpleText'],
    136             'language_code': translation_language['languageCode'],
--> 137         } for translation_language in captions_json['translationLanguages']
    138     ]
    140     manually_created_transcripts = {}
    141     generated_transcripts = {}

KeyError: 'translationLanguages'

I'm getting the same result too. :(

Hi, thanks for reporting!
It seems that YouTube has made some changes which caused the translationLanguages key to sometimes be missing from the captions json. I just published v0.6.2, which initializes translation_languages with an empty list in case that happens. So upgrading to v0.6.2 should fix this error!
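
For anyone curious what such a guard can look like, here is a minimal sketch. It is not the actual v0.6.2 code: the helper name extract_translation_languages is hypothetical, and the sample dictionary only mirrors the keys visible in the traceback above. The idea is to read the key with dict.get and fall back to an empty list when it is absent.

```python
def extract_translation_languages(captions_json):
    """Build the translation language list, tolerating a missing key.

    Hypothetical helper for illustration only; it mirrors the list
    comprehension shown in the traceback, but uses dict.get so that a
    captions JSON without 'translationLanguages' yields an empty list
    instead of raising KeyError.
    """
    return [
        {
            'language': translation_language['languageName']['simpleText'],
            'language_code': translation_language['languageCode'],
        }
        for translation_language in captions_json.get('translationLanguages', [])
    ]


# Assumed sample payload, shaped after the keys used in the traceback.
sample_captions_json = {
    'translationLanguages': [
        {'languageName': {'simpleText': 'German'}, 'languageCode': 'de'},
    ],
}

languages_present = extract_translation_languages(sample_captions_json)
languages_missing = extract_translation_languages({})
```

With this approach a video whose captions JSON lacks the key is simply treated as having no translation languages, while the rest of the transcript handling stays unchanged.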

awesome, thanks! 🚀