Raspberry Pi Live Closed Captioning with Azure Cognitive Services

This repo contains two projects, both supporting the Raspberry Pi and desktop platforms via .NET 6:

AzureSpeechCC: This project uses Microsoft Azure Cognitive Services speech recognition to generate real-time captions.
AzureTranslateCC: This project uses Microsoft Azure Cognitive Services speech recognition and translation to generate real-time translations.

demo.mp4

Speech is captured with a USB microphone and sent by a .NET 6 application to the Azure Cognitive Services speech-to-text service. The recognized (or translated) text is then displayed as real-time captions on an LCD screen. You can also view captions on a remote screen via SSH.

Privacy Note: This project does NOT store captions. If you use this to generate in-person captions, please be sure to inform all speakers that they are being transcribed but not recorded.

You can sign up for a free 30-day trial of Azure with $200 in credits to test out this project.

Read Time: 10 min

Build Time: 20 min (excluding installation times)

Speech-to-text Cost:

Free Tier (1 concurrent request): 5 free audio hours per month
Standard Tier (100 concurrent requests): $1 per audio hour
Check the Azure pricing page for details on paid tiers.

Translation Cost:

Free Tier: 2M characters of any combination of standard translation and custom training free per month
Check the Azure pricing page for details on paid tiers.

More info on cost here.
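
As a rough illustration of the speech-to-text rates listed above, the sketch below estimates a month's cost from a whole number of audio hours. The function name `estimate_speech_cost` is made up for this example, and it assumes all usage sits on a single tier: up to 5 hours fits within the Free tier, and anything more is billed at $1 per hour on the Standard tier. Check the Azure pricing page before relying on these numbers.

```shell
# Hypothetical helper: estimate one month's speech-to-text cost in US dollars.
# Assumes the rates quoted in this README and whole-number audio hours.
estimate_speech_cost() {
  hours=$1
  if [ "$hours" -le 5 ]; then
    echo 0          # fits within the Free tier's 5 audio hours
  else
    echo "$hours"   # Standard tier: $1 for every audio hour
  fi
}

estimate_speech_cost 3    # prints 0
estimate_speech_cost 20   # prints 20
```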

Many thanks to the original developer of this open source project: Mohsin Ali! You can see Mohsin's other GitHub projects here: m-mohsin-ali (M Mohsin Ali)

Hardware Requirements

Raspberry Pi: We will be using the 4GB version.
USB plug-and-play microphone: Any USB plug-and-play device works.
LCD screen: We are using this for rich text quality. You can use any other compatible display, or skip the screen and access the Pi via SSH.
Keyboard and mouse: A basic keyboard and mouse to use as input devices with the Raspberry Pi.

Raspberry Pi Setup

Download Operating System

Note: We recommend using 64-bit Ubuntu 22.04 because it has better support for the architecture we're using. However, Raspberry Pi OS will also work for this project.

On your desktop computer, download and install Raspberry Pi Imager.
Run Raspberry Pi Imager. The home screen will appear.
Select 'CHOOSE STORAGE'
Insert the microSD card into your computer (or via a card reader).
Select the connected microSD card as your storage device.
On the home screen, select 'CHOOSE OS'.
Select in this order: 'Other general-purpose OS' > 'Ubuntu' > 'Ubuntu Desktop 22.04 LTS (RPi 4/400)'

Note: Although Raspberry Pi OS (formerly Raspbian) does come in a 64-bit version, Ubuntu has better support for the architecture and available software.
On the home screen, select 'WRITE'.
A loading bar will appear.

Note: Flashing the SD Card may take a few minutes to an hour to complete.
Safely eject the SD card and insert it into the Raspberry Pi.
If you're connecting directly to the Pi, connect the display, keyboard, and mouse.
Finally, connect the power supply!
Once the Pi boots up, configure your WiFi settings, keyboard layout and timezone.
CHANGE YOUR PASSWORD. This is important because otherwise someone could get access to your Pi and make your closed captions come out all silly.

Raspberry Pi Physical Setup

This section covers how to connect peripheral devices, like a screen, keyboard, and mouse, to our Raspberry Pi.

If you're using a screen, connect the display via USB and HDMI ports.
Connect Micro-USB (Display) to USB (RaspberryPi)
Connect the USB Mic to a Pi USB port.
Connect the keyboard and mouse. We're using a wireless dongle for both keyboard and mouse.
Finally, connect the power supply!
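
Once the Pi has booted, it can help to confirm that the USB microphone was detected before going further. The sketch below is one way to do that, assuming the `arecord` tool from the `alsa-utils` package is available; the function name `list_capture_devices` is made up for this example. Your microphone should show up as a "card" entry in the output.

```shell
# Print the ALSA capture devices, or a hint if none can be listed.
list_capture_devices() {
  if command -v arecord >/dev/null 2>&1; then
    arecord -l || echo "No capture devices found; check the USB connection"
  else
    echo "arecord not found; try: sudo apt-get install alsa-utils"
  fi
}

list_capture_devices

# Optional: record 5 seconds from the default device and play it back.
# arecord -d 5 -f cd test.wav && aplay test.wav
```

If the microphone is not the default capture device, the optional recording test may pick up a different input or fail, so treat it as a quick check rather than a full diagnosis.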

Software Updates and Installs

This section shows you how to install dependencies for the project onto your Raspberry Pi.

Follow these steps on your Raspberry Pi computer.

Open the terminal.
Make a directory to store our project by running the following commands:
```
mkdir live-captioning
cd live-captioning
```

Set up .NET by running the following command:
```
curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --channel Current
```

Once the files are installed, set the environment variables by running the following commands:
```
echo 'export DOTNET_ROOT=$HOME/.dotnet' >> ~/.bashrc
echo 'export PATH=$PATH:$HOME/.dotnet' >> ~/.bashrc
source ~/.bashrc
```

Verify the installation:
```
dotnet --version
```
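
If `dotnet --version` reports "command not found", the environment variables are the usual suspect. The following sketch prints what the current shell sees; `check_dotnet_env` is a made-up helper name, and it assumes the install location is `$HOME/.dotnet` as configured above.

```shell
# Report whether the variables added to ~/.bashrc are active in this shell.
check_dotnet_env() {
  echo "DOTNET_ROOT=${DOTNET_ROOT:-not set}"
  case ":$PATH:" in
    *":$HOME/.dotnet:"*) echo "PATH includes $HOME/.dotnet" ;;
    *) echo "PATH is missing $HOME/.dotnet; run: source ~/.bashrc" ;;
  esac
}

check_dotnet_env

# If dotnet is available, list the installed SDKs as well.
if command -v dotnet >/dev/null 2>&1; then dotnet --list-sdks; fi
```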
Finally, install the Azure Cognitive Services speech-to-text dependencies with the following commands:
```
sudo apt-get update
sudo apt-get install build-essential libssl-dev libasound2 wget 
```

We need to manually install libssl1.1 because it's not available for Ubuntu 22. Download the package with the following command:
```
wget http://ftp.us.debian.org/debian/pool/main/o/openssl/libssl1.1_1.1.1n-0+deb11u1_arm64.deb
```

Next, install from file:
```
sudo apt install -f ./libssl1.1_1.1.1n-0+deb11u1_arm64.deb
```
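
The libssl package downloaded above is built for arm64, so it only installs on a 64-bit OS. The sketch below checks the architecture the system reports; `describe_arch` is a made-up helper name, and the fallback to `uname -m` is an assumption for systems where `dpkg` is unavailable.

```shell
# Describe whether an architecture string matches the arm64 package.
describe_arch() {
  case "$1" in
    arm64|aarch64) echo "$1: the arm64 package applies" ;;
    *) echo "$1: the arm64 package will not install here" ;;
  esac
}

# dpkg reports the Debian architecture name; fall back to uname elsewhere.
describe_arch "$(dpkg --print-architecture 2>/dev/null || uname -m)"

# After installing, confirm the package is registered:
# dpkg -s libssl1.1 | grep Status
```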

Set up Azure Cognitive Services

Now it's time to sign up for Azure Cognitive Services and get our API keys!

Follow these steps on your desktop or laptop computer.

Sign up for a free Azure account here. Your free trial lasts 30 days and includes $200 Azure credits.
Once you're logged in to your Azure dashboard, select 'Create a Resource'.
Select (or search for) Cognitive Services.
Create a new speech service.
From here, you will need the keys and the region to set up speech-to-text on the Raspberry Pi.
Copy one of the keys (any of them will work) and the Location/Region value.
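
Before putting the key into the project, you may want to confirm that the key and region match. The sketch below requests a short-lived token from the regional token endpoint using `curl`. It assumes the standard Speech service endpoint pattern shown in the code, and `SPEECH_KEY`, `SPEECH_REGION`, and `speech_token_url` are names chosen for this example, not something the project defines. A printed status of 200 suggests the pair is valid, while 401 usually points to a wrong key or region.

```shell
# Build the token endpoint for a region such as "westus".
speech_token_url() {
  echo "https://$1.api.cognitive.microsoft.com/sts/v1.0/issueToken"
}

# Print only the HTTP status code of the token request.
# SPEECH_KEY and SPEECH_REGION are hypothetical variables you set yourself.
if [ -n "${SPEECH_KEY:-}" ] && [ -n "${SPEECH_REGION:-}" ]; then
  curl -s -o /dev/null -w "%{http_code}\n" -X POST \
    -H "Ocp-Apim-Subscription-Key: $SPEECH_KEY" \
    -H "Content-Length: 0" \
    "$(speech_token_url "$SPEECH_REGION")"
else
  echo "Set SPEECH_KEY and SPEECH_REGION first"
fi
```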

Run the Project!

The following section shows you how to run the project on your Raspberry Pi. Follow these steps on your Raspberry Pi computer.

If you don't already have it, install git with the following command:
```
sudo apt install git
```
Navigate to the project folder that we created earlier:
```
cd live-captioning
```

Clone this repository:
```
git clone https://github.com/jenfoxbot/closed-captioning-azure-speech-ai
```

Navigate to the folder for the project you want to run. For live captions:
```
cd closed-captioning-azure-speech-ai/code/AzureSpeechCC
```
Or, for live translations:
```
cd closed-captioning-azure-speech-ai/code/AzureTranslateCC
```

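Add your Cognitive Services key and region to the code. Open Program.cs in an editor:
```
nano Program.cs
```
Then replace the two placeholder strings in the Program class:
```
class Program
{
    static string YourSubscriptionKey = "Enter your Key Here";
    static string YourServiceRegion = "Enter your Region here";
    ...
```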

Press CTRL+X, then Y and Enter, to save and overwrite the file.
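
If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch that swaps the two placeholder strings shown above using `sed`; `set_speech_credentials` is a made-up helper, and it assumes the key and region contain only letters and digits (anything else could confuse the `sed` replacement).

```shell
# Hypothetical helper: replace the two placeholder strings in Program.cs.
# Usage: set_speech_credentials <file> <key> <region>
set_speech_credentials() {
  file=$1; key=$2; region=$3
  # Write to a temporary file first, then replace the original.
  sed -e "s|Enter your Key Here|$key|" \
      -e "s|Enter your Region here|$region|" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
}

# Example with placeholder values (not real credentials):
# set_speech_credentials Program.cs 0123456789abcdef0123456789abcdef westus
# grep -n "static string Your" Program.cs   # confirm both lines changed
```

Keep in mind that the key now lives in a source file, so avoid committing or sharing that file.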
Add the Azure Speech SDK lib/package to the code directory by running the following:
```
dotnet add package Microsoft.CognitiveServices.Speech
```
We did it!! Let's run the code and see our wizardry in action:
```
dotnet build
dotnet run
```
Note: Once you've built the project, you can run the program with only this command:
```
dotnet run
```

Test out different audio sources, try different sounds and voices, and explore the capabilities and limits of the live speech-to-text translation!

Going Further

Make the project portable by getting an enclosure for the Pi, a small touch screen, and a USB-C battery.

Show us your creations by tagging us on Twitter, @MakersAtMicrosoft, or using the hashtag #AzureLiveCaptions!
