dennislwm / playscribe

playscribe


1. Introduction

1.1. Purpose

This document describes the playscribe automation and manual services that are provided for DevSecOps Engineers, Mac Users and Mobile Users.

1.2. Audience

The audience for this document includes:

  • Mobile User who will add URLs, via Bookmark or Email, and consume RSS feeds on their iPhone device.

  • Mac User who will add URLs, via Bookmark or Email, consume RSS feeds, and query text from an archive on their workstation.

  • DevSecOps Engineer who will design system workflows, configure any SaaS or self-hosted services, and plan for disaster recovery.


2. System Overview

2.1. Benefits and Values

  1. This system reuses a design similar to that of the playboard automation for its workflow.

  2. The Mobile or Mac User adds a YouTube URL with the tag youtube, which is analogous to sending a message to a queue.

  3. The messages are sent to a SaaS application (Pinboard.in), which manages the URLs and provides a separate RSS feed for each tag that can be consumed.

  4. Currently there isn't an automated process to consume the RSS feed, process a URL and publish the result.

2.2. Workflow

This project uses several methods and products to optimize your workflow.

  • Uses a SaaS application (Pinboard.in) to produce Bookmarks using a Bookmarklet or by sending an Email.
  • Uses a SaaS application (Pinboard.in) to manage Bookmarks and share them as separate RSS feeds using Tags.
  • Uses a macOS/iOS application (NetNewsWire) to consume individual RSS feeds and save them to cloud storage.
  • Uses a Cloud Storage (OneDrive.com) to read, write and synchronise the RSS feeds.
  • Uses a Continuous Integration pipeline (GitHub Actions) to consume an RSS feed, process a Bookmark and publish the result.
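
The "consume an RSS feed" step can be sketched in shell. This is a minimal illustration, not the project's pipeline: the function name extract_item_links is hypothetical, and it assumes the feed places each tag on its own line, which a real feed may not do.

```shell
# Print the <link> of every <item> in an RSS document read from stdin.
# Assumption: <item>, <link>...</link> and </item> each sit on their own line.
# Links outside an <item> (for example the channel link) are skipped.
extract_item_links() {
  awk '
    /<item[ >]/ { in_item = 1 }
    /<\/item>/  { in_item = 0 }
    in_item && /<link>/ {
      line = $0
      sub(/.*<link>/, "", line)     # drop everything up to the opening tag
      sub(/<\/link>.*/, "", line)   # drop the closing tag and what follows
      print line
    }
  '
}
```

A saved feed could then be piped through it, for example `extract_item_links < feed.xml`, where feed.xml is a placeholder file name. URLs containing XML entities such as `&amp;` would still need decoding.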

3. User Personas

3.1. RACI Matrix

Category Activity Mobile User Mac User DevSecOps
Execution Download an autogenerated subtitle file from YouTube R,A
Execution Convert a subtitle file to text R,A

4. Requirements

4.1. Local workstation


5. Installation and Configuration


6. Execution

6.1. Download an autogenerated subtitle file from YouTube

  1. Open a terminal and run the command yt-dlp --version to check if it is installed.
2023.10.1
  2. Type the following command to download the autogenerated subtitle of a YouTube video. Replace [YOUTUBE_URL] with any YouTube link, e.g. https://www.youtube.com/watch?v=x3vnCKivCjs.
yt-dlp --write-auto-sub --skip-download [YOUTUBE_URL]

You should see an output similar to below.

[youtube] Extracting URL: https://www.youtube.com/watch?v=x3vnCKivCjs
[youtube] x3vnCKivCjs: Downloading webpage
[youtube] x3vnCKivCjs: Downloading ios player API JSON
[youtube] x3vnCKivCjs: Downloading android player API JSON
[youtube] x3vnCKivCjs: Downloading m3u8 information
[info] x3vnCKivCjs: Downloading subtitles: en
[info] x3vnCKivCjs: Downloading 1 format(s): 22
[info] Writing video subtitles to: The Fastest Way to Lose Belly Fat [x3vnCKivCjs].en.vtt
[download] Destination: The Fastest Way to Lose Belly Fat [x3vnCKivCjs].en.vtt
[download] 100% of   85.84KiB in 00:00:00 at 879.88KiB/s

The subtitle file name is created using the video title and code, e.g. The Fastest Way to Lose Belly Fat [x3vnCKivCjs].en.vtt.
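
Because the video code is wrapped in square brackets, it can be recovered from the file name with plain parameter expansion. The sketch below assumes the bracketed code is the last bracketed group in the name; the function name video_id_from_subtitle is hypothetical.

```shell
# Print the text between the last "[" and the "]" that follows it.
video_id_from_subtitle() {
  name=$1
  name=${name##*\[}               # remove everything up to the last "["
  printf '%s\n' "${name%%\]*}"    # remove the first "]" and everything after it
}
```

For example, `video_id_from_subtitle "The Fastest Way to Lose Belly Fat [x3vnCKivCjs].en.vtt"` prints x3vnCKivCjs.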

If you open the vtt file, you'll notice that it contains metadata tags that make it hard to read. You'll have to convert the subtitles to plain text, as described in section 6.2.
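
To give a feel for what those tags look like and what stripping them involves, here is a rough sed-based sketch. It is neither the Python script nor the jq method from section 6.2, only an approximation that assumes the usual layout of autogenerated captions: a WEBVTT header, cue lines containing "-->", inline tags such as <c> or <00:00:00.480>, and each caption line repeated across consecutive cues. The function name vtt_to_text is hypothetical.

```shell
# Read a .vtt file on stdin and print an approximate plain-text transcript.
vtt_to_text() {
  sed -e '/^WEBVTT/d' \
      -e '/^Kind:/d' \
      -e '/^Language:/d' \
      -e '/-->/d' \
      -e 's/<[^>]*>//g' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^$/d' |
    uniq   # collapse caption lines repeated by consecutive cues
}
```

It could be run as `vtt_to_text < FILE.vtt`, with FILE.vtt standing in for the downloaded subtitle file. Repeats that are not adjacent would survive this filter.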

6.2. Convert a subtitle file to text

Note: The Python script below has been deprecated in favour of a JSON method using jq.

Fortunately, there is a Python script that converts a YouTube subtitle file (vtt) to plain text. Credit to glasslion for making it open source.

  1. Open a terminal and run the command jq --version && yt-dlp --version to check if both apps are installed.
jq-1.6
2023.10.1
  2. In your terminal, type the following command. Replace [YOUTUBE_URL] with any YouTube link, e.g. https://www.youtube.com/watch?v=x3vnCKivCjs.
yt-dlp --skip-download --write-auto-sub --quiet --sub-format json3 [YOUTUBE_URL]
  3. Type the following command to convert the subtitle file to text. Replace [JSON_FILE] with your subtitle file, e.g. The\ Fastest\ Way\ to\ Lose\ Belly\ Fat\ \[x3vnCKivCjs\].en.json3.
jq -r '.events[] | select(.segs and .segs[0].utf8 != "\n") | [.segs[].utf8] | join("")' [JSON_FILE] \
  | paste -sd' ' - | fold -sw60
  4. If successful, you should see an output similar to below.
today I'm going to share with you the absolute fastest way
to lose your belly now you could have the best willpower
the best discipline really want it really bad and never
really see any results because you're missing the technique
you're missing the strategy I'm the perfect example I took
guitar lessons for six years right as a teenager and I
never really progressed or never really went anywhere
because the techniques that were taught to me were just not
that great great the same thing happened with tennis in
college I was never taught the right technique and so I
...
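
The last two stages of the conversion pipeline, paste and fold, are what turn one caption fragment per line into the wrapped paragraph shown above. They can be tried in isolation with a small wrapper; the name join_and_wrap is hypothetical and the width of 60 simply mirrors the command in this section.

```shell
# Join all lines from stdin with single spaces, then wrap at word
# boundaries so that no output line is longer than 60 columns.
join_and_wrap() {
  paste -sd' ' - | fold -sw60
}
```

For instance, `printf '%s\n' 'today' "I'm going" 'to share' | join_and_wrap` prints the three fragments as one line. Note that fold -s breaks after a space, so wrapped lines may end with a trailing blank.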

7. Troubleshooting and FAQs


8. References

The following resources were used as a single-use reference.

Title Author
How to extract closed caption transcript from YouTube video? StackOverflow
