boto / boto

For the latest version of boto, see https://github.com/boto/boto3 -- Python interface to Amazon Web Services

Home Page: http://docs.pythonboto.org/

Unable to do SpeakerDiarization

Adarsh1999 opened this issue

I have been trying to use the speaker diarisation feature of Amazon Transcribe, working only from the material I could find online: https://docs.aws.amazon.com/transcribe/latest/dg/how-diarization.html.
The docs are unclear and do not say where to start. I put together the code below by experimenting, but it always returns an internal error. Is there a way to get speaker diarisation working, or a guide to follow?


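from __future__ import print_function
import time
import uuid

import boto3

# The client's region should match the bucket's region (us-east-2 here).
transcribe = boto3.client('transcribe')
job_name = str(uuid.uuid4())  # job names must be unique within the account
job_uri = "https://atris-bucket.s3.us-east-2.amazonaws.com/16000.wav"

transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    LanguageCode='en-US',
    MediaFormat='wav',
    MediaSampleRateHertz=16000,
    Media={'MediaFileUri': job_uri},
    Settings={
        # Speaker diarisation: label up to MaxSpeakerLabels distinct speakers.
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 3,
        # Cannot be enabled together with ShowSpeakerLabels.
        'ChannelIdentification': False,
        'ShowAlternatives': False,
        # 'VocabularyFilterName': 'string' was a placeholder copied from the
        # API reference; removed, since presumably no filter by that name exists.
    },
)

# Poll until the job reaches a terminal state.
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    job = status['TranscriptionJob']
    if job['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)

# On failure, FailureReason explains what went wrong.
if job['TranscriptionJobStatus'] == 'FAILED':
    print(job.get('FailureReason'))
print(status)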
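
Once a job like the one above completes, the speaker labels are not in `status` itself but in the transcript JSON that `status['TranscriptionJob']['Transcript']['TranscriptFileUri']` points to. The following is only a rough sketch of how that JSON might be turned into per-speaker lines. It assumes the usual output layout (`results.items` plus `results.speaker_labels.segments`); the helper name `speaker_turns` and the `sample` data are made up for illustration and are not part of boto3.

```python
def speaker_turns(transcript):
    """Group transcript words into consecutive (speaker, text) turns.

    `transcript` is the parsed transcript JSON as a dict (assumed layout).
    """
    results = transcript["results"]

    # Map each word's start_time to the speaker label assigned to it.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for word in segment["items"]:
            speaker_at[word["start_time"]] = word["speaker_label"]

    turns = []
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp; attach it to the previous word.
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]


# Hypothetical, hand-written sample in the assumed layout (not real output).
sample = {
    "results": {
        "speaker_labels": {
            "segments": [
                {"speaker_label": "spk_0", "items": [
                    {"start_time": "0.0", "speaker_label": "spk_0"}]},
                {"speaker_label": "spk_1", "items": [
                    {"start_time": "1.0", "speaker_label": "spk_1"},
                    {"start_time": "1.5", "speaker_label": "spk_1"}]},
            ]
        },
        "items": [
            {"type": "pronunciation", "start_time": "0.0",
             "alternatives": [{"content": "Hello"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
            {"type": "pronunciation", "start_time": "1.0",
             "alternatives": [{"content": "Hi"}]},
            {"type": "pronunciation", "start_time": "1.5",
             "alternatives": [{"content": "there"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
        ],
    }
}

for speaker, text in speaker_turns(sample):
    print(speaker + ": " + text)
```

For a real job, the dict would come from downloading the transcript file and passing it through `json.load`; the download step is left out here because it depends on how the output location is configured.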