This repository has been archived by the owner on Jan 7, 2023. It is now read-only.
generated from actions/javascript-action

An action to automatically extract keywords from images in issue bodies, making them searchable 🔍

thehanimo/ocr-bot

OCR Bot 🤖

This action uses naptha/tesseract.js to extract text from images attached to issue comments.

The extracted text is appended to the issue body.

This makes the extracted text searchable through GitHub's search box.
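
The flow described above (find images, run OCR, append the text to the issue body) can be sketched roughly as follows. This is a minimal illustration, not the action's actual source: the helper names `extractImageUrls`, `withKeywords`, and `ocrIssueBody`, and the `MARKER` comment, are hypothetical. The OCR step is passed in as a function so the sketch runs without any dependencies.

```javascript
// Hypothetical marker used to find a previously appended keywords section.
const MARKER = "<!-- ocr-keywords -->";

// Collect image URLs from <img> tags and Markdown image syntax only,
// so plain links outside an image are ignored.
// HTML matches are listed first, then Markdown matches; duplicates are dropped.
function extractImageUrls(body) {
  if (!body) return [];
  const htmlImg = /<img\b[^>]*?\bsrc=["']([^"']+)["'][^>]*>/gi;
  const mdImg = /!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g;
  const urls = [];
  for (const re of [htmlImg, mdImg]) {
    for (const match of body.matchAll(re)) {
      urls.push(match[1]);
    }
  }
  return [...new Set(urls)];
}

// Return the body with a single keywords section at the end.
// Any earlier section (everything from MARKER on) is replaced, so
// re-running on an edited issue does not stack duplicate sections.
function withKeywords(body, texts) {
  const base = (body || "").split(MARKER)[0].trimEnd();
  const keywords = texts
    .map((t) => t.replace(/\s+/g, " ").trim())
    .filter(Boolean)
    .join(" ");
  if (!keywords) return base;
  return `${base}\n\n${MARKER}\nOCR Keywords\n${keywords}`;
}

// `recognize` is any async function mapping an image URL to its text.
async function ocrIssueBody(body, recognize) {
  const urls = extractImageUrls(body);
  const texts = await Promise.all(urls.map((url) => recognize(url)));
  return withKeywords(body, texts);
}
```

With tesseract.js installed, `recognize` could plausibly be `async (url) => (await Tesseract.recognize(url, "eng")).data.text`; whether ocr-bot calls the library this way is an assumption.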

Inspired by imjasonh/ideas/issues/76

Usage

Create a workflow file (e.g. .github/workflows/ocr-bot.yml; see Creating a Workflow file) with the following content:

name: "OCR Bot"
on:
  issues:
    types: [opened, edited]

jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: thehanimo/ocr-bot@<version>  # replace <version> with a released tag
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Done! You should see OCR keywords being added to issues that contain images. Something like this:

OCR Keywords

Mild Splendour of the various-vested Night! Mother of wildly-working visions! haill I watch thy gliding, while with watery light Thy weak eye glimmers through a fleecy veil; And when thou lovest thy pale orb to shroud Behind the gather’d blackness lost on high; And when thou dartest from the wind-rent cloud Thy placid lightning o’er the awaken’d sky.
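
The workflow passes `GITHUB_TOKEN` to the action so that it can edit the issue. As a rough sketch of that write-back step (not the action's real code), the helper below builds the parameters for an issue update from an `@actions/github`-style context and skips the update when the body has not changed. The name `buildUpdateRequest` is hypothetical, and the wiring with `@actions/core` and `@actions/github` is shown only in comments because it is an assumption about how the action is put together.

```javascript
// Build the parameters for an "update issue" API call, or return null
// when there is no issue in the event payload or the body is unchanged.
// `context` is expected to look like github.context from @actions/github:
//   context.repo          -> { owner, repo }
//   context.payload.issue -> { number, body, ... }
function buildUpdateRequest(context, newBody) {
  const issue = context.payload.issue;
  if (!issue || newBody === (issue.body || "")) return null;
  return {
    ...context.repo,
    issue_number: issue.number,
    body: newBody,
  };
}

// Possible wiring inside the action entry point (assumed, not verified
// against the ocr-bot source):
//
//   const core = require("@actions/core");
//   const github = require("@actions/github");
//   const octokit = github.getOctokit(core.getInput("GITHUB_TOKEN"));
//   const request = buildUpdateRequest(github.context, newBody);
//   if (request) await octokit.rest.issues.update(request);
```

Returning `null` for an unchanged body avoids a needless API call when an issue is edited without adding or removing images.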

Development

Install the dependencies

npm install

Run the tests ✔️

$ npm test

 PASS  ./index.test.js
  ✓ empty comment (3 ms)
  ✓ links outside img tag (1 ms)
  ✓ extract text (1 ms)
...
