EpubToTxt

Extracts all the text content from an EPUB file and saves it as a .txt file.

To run it:

go run main.go -epub=<epub file> -regex=<regex file> -output=<output directory>

A regex file can be supplied to replace certain parts of the epub content.

The regex file is read in pairs of lines: line i (the first of a pair) defines the regex to match, and line i+1 (the second) defines what to replace each match with.

I wanted to remove the <rt> tags and everything between them, because they confuse my text-parsing software, so my regex file looks like this:

<rt>(.*?)</rt>
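
To illustrate the pair format, here is a minimal sketch in Go of how such a regex file could be parsed and applied. It is not taken from main.go: the names rule, parseRules and applyRules are hypothetical, and it assumes that a missing or empty second line means "replace with nothing" and that pairs with an empty pattern line are skipped.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule pairs a compiled pattern with its replacement text.
// (Hypothetical type, used only for this illustration.)
type rule struct {
	pattern     *regexp.Regexp
	replacement string
}

// parseRules reads regex-file content two lines at a time:
// the first line of each pair is the pattern, the second the replacement.
// Assumption: a missing second line is treated as an empty replacement.
func parseRules(content string) ([]rule, error) {
	content = strings.ReplaceAll(content, "\r\n", "\n")
	lines := strings.Split(content, "\n")

	var rules []rule
	for i := 0; i < len(lines); i += 2 {
		if lines[i] == "" {
			continue // assumption: skip pairs whose pattern line is empty
		}
		re, err := regexp.Compile(lines[i])
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", i+1, err)
		}
		replacement := ""
		if i+1 < len(lines) {
			replacement = lines[i+1]
		}
		rules = append(rules, rule{pattern: re, replacement: replacement})
	}
	return rules, nil
}

// applyRules runs every rule over the text, in file order.
func applyRules(text string, rules []rule) string {
	for _, r := range rules {
		text = r.pattern.ReplaceAllString(text, r.replacement)
	}
	return text
}

func main() {
	// The example regex file from above, with an empty replacement line.
	rules, err := parseRules("<rt>(.*?)</rt>\n\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(applyRules("漢<rt>かん</rt>字<rt>じ</rt>", rules))
}
```

Because the pattern uses the non-greedy `.*?`, each <rt>...</rt> pair is matched separately instead of one match swallowing everything between the first <rt> and the last </rt>. Note that Go's ReplaceAllString expands `$1`-style references in the replacement text, so a replacement line can refer back to capture groups.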
