ychwanegu manylion atalnodi a priflythrennu / add details on restoring punctuation and capitalization
DewiBrynJones committed Oct 26, 2022
1 parent 7c74994 commit 0724787
Showing 2 changed files with 56 additions and 1 deletion.
30 changes: 29 additions & 1 deletion inference/server/README.md
@@ -71,4 +71,32 @@ ID Start End Transcript
1 0.619581589958159 5.170041841004185 mae ganddynt ddau o blant mab a merch
```

The server also provides a very simple HTML GUI for using/supporting the API above. Go to http://localhost:5511/static_html/index.hml


## Restore capitalization and punctuation

Results from the speech recognition model are always in lowercase and contain no punctuation of any kind, such as question marks, full stops or hyphens, as in the example above: "mae ganddynt ddau o blant mab a merch". The transcription server can be connected to a punctuation server (from our other project on GitHub: https://github.com/techiaith/docker-atalnodi-server).

Install the punctuation server, note its web address (such as http://localhost:5555), and enter the URL in a new file named `external_api_urls.py` in the `worker` folder. E.g.

```shell
$ cat worker/external_api_urls.py
PUNCTUATION_API_URL = "http://localhost:5555/restore"
```

Then restart the server with:

```shell
$ make down
$ make up
```

Testing the API with the `speech.wav` file will now return text that is capitalized and punctuated:

```
$ curl localhost:5511/get_srt/?stt_id=.....
1
00:00:00,619 --> 00:00:05,170
Mae ganddynt ddau o blant, mab a merch.
```
27 changes: 27 additions & 0 deletions inference/server/README_en.md
@@ -74,3 +74,30 @@ ID Start End Transcript


The server provides a very simple HTML GUI in order to use/support the above API. Go to http://localhost:5511/static_html/index.hml

## Restore capitalization and punctuation

Results from the speech recognition model are always in lowercase and contain no punctuation marks such as question marks, full stops or colons, as in the example transcription result above: "mae ganddynt ddau o blant mab a merch". You can therefore connect the transcription server to a punctuation server installed from our other project on GitHub; see https://github.com/techiaith/docker-atalnodi-server.

Install the punctuation server and enter its web address (such as http://localhost:5555/restore) into a new file named `external_api_urls.py` in the `worker` folder. E.g.

```shell
$ cat worker/external_api_urls.py
PUNCTUATION_API_URL = "http://localhost:5555/restore"
```

Then restart your speech recognition server:

```shell
$ make down
$ make up
```
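
Because `external_api_urls.py` is a file you create yourself, code that reads it has to cope with it being absent. The following is only a sketch of how that could be done, not the worker's actual implementation: the function name `load_punctuation_url` and the fallback to `None` are assumptions made for illustration.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def load_punctuation_url(worker_dir: str = "worker") -> Optional[str]:
    """Return PUNCTUATION_API_URL from external_api_urls.py, or None.

    Hypothetical helper: the real worker may read the setting differently.
    """
    config_path = Path(worker_dir) / "external_api_urls.py"
    if not config_path.is_file():
        # No file means no punctuation server has been configured.
        return None

    # Load the file as a module without requiring it to be on sys.path.
    spec = importlib.util.spec_from_file_location(
        "external_api_urls", str(config_path)
    )
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # The variable itself may also be missing from the file.
    return getattr(module, "PUNCTUATION_API_URL", None)
```

With a design like this, a caller would skip punctuation restoration whenever the function returns `None` and return the raw lowercase transcript instead.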

Testing the API with the `speech.wav` file will now return a transcription that is capitalized and punctuated:

```
$ curl localhost:5511/get_srt/?stt_id=.....
1
00:00:00,619 --> 00:00:05,170
Mae ganddynt ddau o blant, mab a merch.
```
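
The SRT cue above uses the same start and end times that the earlier transcription example reported in seconds (0.619581589958159 and 5.170041841004185). As a rough illustration of that conversion, the sketch below formats seconds as SRT timestamps. The helper names are hypothetical, and truncating to whole milliseconds is an assumption inferred from the sample values; the server's own code may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    # Assumption: milliseconds are truncated, not rounded, which is
    # consistent with the sample output (0.6195... -> 619).
    total_ms = int(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time range line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    print(srt_cue(1, 0.619581589958159, 5.170041841004185,
                  "Mae ganddynt ddau o blant, mab a merch."))
```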
