oTranscribe is a free web app designed to take the pain out of transcribing recorded interviews.
- Pause (ESC), rewind (F1) and fast-forward (F2) without taking your hands off the keyboard
- Adjust playback speed with a slider or using F3/F4
- Your transcript is automatically saved to the browser's
- Rich text support using
- YouTube and video file support
... and more!
Despite advances in machine learning, and in Automatic Speech Recognition (ASR) in particular, language work in the audiovisual sector, such as transcription, translation and subtitling, still relies on manual labor by experts. Over the last decade, new technologies have mainly contributed to Computer Assisted Transcription, through startups that combine ASR and related technologies (speaker diarization, punctuation and capitalization recovery, etc.) with an online editor.
However, there is a barrier to entry for these tools, mostly due to their cost to the client: they are priced by the length of the audio to be processed or transcribed. We aim to present a low-cost alternative to these platforms, for both the end user and the provider, by taking advantage of the latest developments in speech technology, namely edge (client-side) computing and open ASR models that are small and precise.
With this tool, language workers can upload a file to the web application and receive a basic transcription. Because ASR decoding is done on the client side, server-side processing is limited, so the provider can serve multiple users without worrying about per-user computational cost.
This project requires Node version 12. We recommend using the Node Version Manager (nvm) tool and the Yarn package manager. With them, you can switch to Node version 12 locally and install the requirements:
nvm use 12
yarn install
Download a copy
Although a web version is available, you can install oTranscribe anywhere by following these steps:
- Download the current ZIP archive.
- Compile the CSS and JS with Webpack (see below for more detailed instructions).
- Upload the files in the newly generated dist folder to a server of your choice.
Please note that, in Chrome, local copies of oTranscribe may not run correctly due to the browser's privacy settings.
- Install Node.js and NPM.
- npm install to install dependencies
- make build_prod to compile the
Usage and compilation (Extended version)
Code lives in
The dist folder will contain the resulting oTranscribe+ files and folders. To emulate access from a remote browser, run the following Python command from that folder:
python3 -m http.server
Once it is running, open port 8000 on your local machine (http://localhost:8000) in your browser, where oTranscribe+ should be served.
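If you would rather start the local server from a script than from inside the dist folder, the same idea can be sketched with Python's standard library. This is only an illustration and not part of oTranscribe+. The names make_server and serve_dist are hypothetical, and the sketch binds to 127.0.0.1 only, whereas the plain python3 -m http.server command listens on all interfaces by default.

```python
import functools
import http.server
from pathlib import Path


def make_server(directory, port=8000):
    """Build an HTTP server that serves static files from `directory`.

    Hypothetical helper: binds to the loopback interface only.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve_dist(dist="dist", port=8000):
    """Serve the build output until interrupted with Ctrl+C."""
    dist_path = Path(dist).resolve()
    with make_server(dist_path, port) as httpd:
        print(f"Serving {dist_path} at http://localhost:{port}")
        httpd.serve_forever()
```

Calling serve_dist() from the repository root after a build should then make the contents of dist available at http://localhost:8000.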
OTR file format
oTranscribe has its own file format (.otr), which is just a JSON file with the following parameters:
- text: The raw HTML of the transcript
- media: If available, the name of the last media used
- media-source: If available, a link to the last media used
- media-time: If available, the playtime of the last media used
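Since an .otr file is plain JSON, it can be read or produced with any JSON library. The sketch below is a minimal illustration that uses only the four keys listed above. The helper names dump_otr and parse_otr are hypothetical, the sample values are made up, and the sketch assumes that an unavailable field is simply omitted and that media-time is a number; the format description above confirms neither.

```python
import json

# The four parameters described for the .otr format.
OTR_KEYS = ("text", "media", "media-source", "media-time")


def dump_otr(text, media=None, media_source=None, media_time=None):
    """Serialize a transcript to an .otr-style JSON string.

    Optional fields are left out when not provided (an assumption).
    """
    doc = {"text": text}
    optional = {
        "media": media,
        "media-source": media_source,
        "media-time": media_time,
    }
    doc.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(doc, ensure_ascii=False)


def parse_otr(raw):
    """Parse an .otr JSON string; missing fields come back as None."""
    data = json.loads(raw)
    return {key: data.get(key) for key in OTR_KEYS}


# Hypothetical sample values, for illustration only.
raw = dump_otr("<p>Hello</p>", media="interview.mp3", media_time=12.5)
parsed = parse_otr(raw)
```

Here parsed["text"] holds the transcript HTML, and any field missing from the file, such as media-source in this sample, is reported as None.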
oTranscribe is not fully tested. There are only a small number of tests, for data migration.
To set up, install CasperJS.
Then run a server at the root directory of this repository at
http://localhost:8000, and on the command line run:
casperjs test tests/
Translations have been provided by the following talented and generous volunteers:
- Catalan: Joan Montané and Jon Sindreu.
- Chinese: baiqj, Cindy Ng, Andy Pan, Cp0204 and Robin Kwong
- Danish: Christian Bruun.
- Dutch: Patrick Mackaaij and Marjolein Quist.
- Filipino: Patricia Albano.
- French: Olivier Aubert, @goofy-bz and Dr J Rogel-Salazar.
- German: Dr J Rogel-Salazar and Lisa Bernhardt.
- Indonesian: Joy Tikoalu.
- Italian: Dr J Rogel-Salazar, Edoardo Putti and Federico Lasta.
- Japanese: harupong.
- Norwegian: Hallvar Hauge Johnsen
- Polish: Emil Maruszczak and Piotr Tarasewicz.
- Portuguese: enVide neFelibata.
- Brazilian Portuguese: Leonardo Barichello and Carlos Eduardo Pinheiro Rocha.
- Romanian: Iain Apreotesei and Catalina Albeanu
- Russian: Pavel Osminin
- Spanish: Cristian Duque, Dr J Rogel-Salazar and Adrián Blanco.
- Swedish: c3ons.
- Turkish: Mehmet S. DERİNDERE.
- Ukrainian: Myroslav Opyr
- Vietnamese: Trần Ngọc Quân
- Greek: Konstantinos Alexiou
More about translating oTranscribe here.