
HakuPiku/EpubToTxt


EpubToTxt

Extracts all the text content from an epub file and saves it as a txt file.

To run it:

go run main.go -epub=<epub file> -regex=<regex file> -output=<output directory>

A regex file can be added to replace certain parts of the epub content.

The i-th (first) line in the regex file defines the regex to match, and the (i+1)-th (second) line defines what each match is replaced with.

I wanted to remove everything between <rt> tags because they mess up my text parsing software, so my regex file looks like this (the second line is empty, so each match is replaced with nothing):

<rt>(.*?)</rt>
 
