thehanimo / ocr-bot

An action to automatically extract keywords from images in issue bodies, making them searchable 🔍

OCR Bot 🤖

This action uses naptha/tesseract.js to extract text from images attached to issue comments.

The extracted text is appended to the issue body.

This makes the extracted text searchable through GitHub's search box.
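The flow described above (find the images, run OCR on each, collect the text) can be sketched roughly as follows. This is a minimal illustration, not the action's actual source: the names `extractImageUrls` and `extractKeywords` are hypothetical, and the OCR step is passed in as a `recognize` function so the sketch runs without any OCR library installed.

```javascript
// Hypothetical helper: collect image URLs from an issue body.
// Only HTML <img src="..."> tags and Markdown ![alt](url) images are matched;
// plain links that are not part of an image are ignored.
function extractImageUrls(body) {
  if (!body) return [];
  const imagePattern =
    /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["'][^>]*>|!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/gi;
  const urls = [];
  for (const match of body.matchAll(imagePattern)) {
    // Group 1 is set for HTML images, group 2 for Markdown images.
    urls.push(match[1] || match[2]);
  }
  // Drop duplicates while keeping first-seen order.
  return [...new Set(urls)];
}

// Hypothetical pipeline: run the injected `recognize(url)` function on every
// image and join the non-empty results, one image per line.
async function extractKeywords(body, recognize) {
  const texts = [];
  for (const url of extractImageUrls(body)) {
    const text = await recognize(url);
    if (text && text.trim()) texts.push(text.trim());
  }
  return texts.join("\n");
}
```

With tesseract.js, `recognize` could be a thin wrapper around `Tesseract.recognize(url, "eng")`, which resolves to a result whose `data.text` field holds the recognized text. Injecting the function keeps the URL parsing testable on its own.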

Inspired by imjasonh/ideas/issues/76

Usage

Create a workflow file (e.g. .github/workflows/ocr-bot.yml; see Creating a Workflow file) with the following content:

name: "OCR Bot"
on:
  issues:
    types: [opened, edited]

jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: thehanimo/ocr-bot@v1.0.0
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Done! You should see OCR keywords being added to issues that contain images. Something like this:

OCR Keywords Mild Splendour of the various-vested Night! Mother of wildly-working visions! haill I watch thy gliding, while with watery light Thy weak eye glimmers through a fleecy veil; And when thou lovest thy pale orb to shroud Behind the gather’d blackness lost on high; And when thou dartest from the wind-rent cloud Thy placid lightning o’er the awaken’d sky.
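Because the workflow also runs on `edited` events, appending the keywords naively could add the section again on every edit. The README does not say how the action handles this. One way a similar tool could keep the update idempotent is to wrap the section in marker comments and replace it on each run. The sketch below assumes that approach; the marker strings and the `upsertKeywords` name are hypothetical.

```javascript
// Hypothetical markers delimiting the bot-managed section of the issue body.
const START = "<!-- ocr-keywords:start -->";
const END = "<!-- ocr-keywords:end -->";

// Return the issue body with exactly one keywords section at the end.
// Any section written by a previous run is removed first.
function upsertKeywords(body, keywords) {
  const original = body || "";
  const startIdx = original.indexOf(START);
  const endIdx = original.indexOf(END);

  let base = original;
  if (startIdx !== -1 && endIdx > startIdx) {
    // Cut out the old section, markers included.
    base = original.slice(0, startIdx) + original.slice(endIdx + END.length);
  }
  base = base.trimEnd();

  // No keywords: return the body without any bot-managed section.
  if (!keywords || !keywords.trim()) return base;

  const section = `${START}\nOCR Keywords\n${keywords.trim()}\n${END}`;
  return base ? `${base}\n\n${section}` : section;
}

// Usage: running the update twice leaves the body unchanged the second time.
const once = upsertKeywords("Bug report", "hello world");
const twice = upsertKeywords(once, "hello world");
console.log(once === twice);
```

HTML comments are not rendered in an issue body, so markers like these stay invisible to readers while still letting the tool find its own section.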

Development

Install the dependencies

npm install

Run the tests ✔️

$ npm test

 PASS  ./index.test.js
  ✓ empty comment (3 ms)
  ✓ links outside img tag (1 ms)
  ✓ extract text (1 ms)
...

License: MIT License

Language: JavaScript 100.0%