Alexanda's starred repositories
cs-self-learning
A self-study guide to computer science
leedl-tutorial
Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
pyvideotrans
Translate a video from one language to another and add dubbing.
PyQt-Fluent-Widgets
A fluent design widgets library based on C++ Qt/PyQt/PySide. Make Qt Great Again.
X-AnyLabeling
Effortless data labeling with AI support from Segment Anything and other awesome models.
parler-tts
Inference and training library for high-quality TTS models.
Parselmouth
Praat in Python, the Pythonic way
Whisper-WebUI
A web UI for easy subtitle generation using the Whisper model.
Chenyme-AAVT
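A fully automatic (audio) video translation project. It uses Whisper to recognize speech, an AI large language model to translate the subtitles, and finally merges the subtitles with the video to produce the translated video.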
Meta-voicebox
Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.
praatIO
A Python library for working with Praat, TextGrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting features from and manipulating audio files, given hierarchical time-aligned transcriptions (utterance > word > syllable > phone, etc.).
create_pictures
A Praat script for creating pictures (waveform, spectrogram, and pitch contour, aligned with a TextGrid). It creates figures in PNG, PDF, WMF, EPS, or PraatPic format for all the Sound and TextGrid files it finds in a folder. Each picture contains an optional waveform, an optional spectrogram, an optional F0 track, and, also optionally, the content of the tiers of the TextGrid associated with the sound file.
Speech-and-Language-Processing-3rd-Edition-Solutions
Solutions for the book "Speech and Language Processing" (3rd ed. draft) by Dan Jurafsky and James H. Martin
forcealign
ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level text alignments of audio, with each word or phoneme's start and end time within the audio. ForceAlign was designed to be easy to install and use, without requiring any third-party, non-Python dependencies.
dsp_tutorials
I wanted guided tutorials on digital signal processing so I decided to create them. The result is this ebook: "Digital Signal Processing for Speech, Language, and Hearing Scientists"
vlabeler-textgrid
A set of plugins of vLabeler for Praat TextGrid
HermeSpeechRecorder
Web application for speech recording
Anchor-annotator
Anchor annotator is a program for inspecting corpora for the Montreal Forced Aligner and correcting transcriptions and pronunciations
Phoneme-Forced-Alignment
Comparison of methods to perform forced-alignment of phonemes in English
jason2textgrid
Python script to convert WhisperX JSON time-stamps to Praat TextGrid files
forced_alignment
Slovene speech alignment with Montreal Forced Aligner
whisper-webmaus
A set of scripts for generating transcriptions with Whisper and TextGrids with WebMAUS from audio recordings