
Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library. Includes changes to download a topic or collection.

RedMuggle/safaribooks

 
 


SafariBooks

Forked and based on https://github.com/lorenzodifuccia/safaribooks but with options to download a full topic or a playlist

Download and generate EPUB of your favorite books from Safari Books Online library. This could be useful if you have a subscription and want to read your ebooks on e-readers like Kindle. I'm not responsible for the use of this program; it is for personal and educational purposes only.
Before using it, please read O'Reilly's Terms of Service.

Overview:

Requirements & Setup:

The program requires python3 and either pip3 or pipenv to be installed.

$ git clone https://github.com/0x6f677548/safaribooks.git
Cloning into 'safaribooks'...

$ cd safaribooks/
$ pip3 install -r requirements.txt

OR

$ pipenv install && pipenv shell

The program depends on only two Python 3 modules:

lxml>=4.1.1
requests>=2.20.0
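
If you want to confirm that the installed modules meet these minimums before running the script, a small check along the following lines may help. This is only a sketch: the helper names (`installed_version`, `version_tuple`, `at_least`) are hypothetical and not part of safaribooks.py, and the comparison only looks at the leading numeric part of each version component.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn '2.20.0' into (2, 20, 0); stop at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(found: str, minimum: str) -> bool:
    """Compare two version strings numerically, padding the shorter with zeros."""
    have, need = version_tuple(found), version_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    # Minimums taken from the list above.
    for name, minimum in (("lxml", "4.1.1"), ("requests", "2.20.0")):
        found = installed_version(name)
        ok = found is not None and at_least(found, minimum)
        print(f"{name}: installed={found} required>={minimum} ok={ok}")
```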

Usage:

It's simple to use: choose a book from the library and, in the following command, replace:

  • the X-es with its ID,
  • email:password with your own credentials.

$ python3 safaribooks.py --cred "account_mail@mail.com:password01" --bookid XXXXXXXXXXXXX

The ID is the string of digits found in the URL of the book description page:
https://www.safaribooksonline.com/library/view/book-name/XXXXXXXXXXXXX/
For example: https://www.safaribooksonline.com/library/view/test-driven-development-with/9781491958698/
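
If you would rather paste a full URL than pick the digits out by hand, the ID can be pulled out programmatically. The sketch below assumes the ID is the last all-digit path segment of a `/library/view/` URL, as in the example above; `extract_book_id` is a hypothetical helper, not something safaribooks.py provides.

```python
from typing import Optional
from urllib.parse import urlparse


def extract_book_id(url: str) -> Optional[str]:
    """Return the numeric book ID from a book page URL, or None if absent."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Assumption: book pages always contain a "view" path segment.
    if "view" not in segments:
        return None
    # Scan from the end so a trailing chapter file name is skipped.
    for segment in reversed(segments):
        if segment.isascii() and segment.isdigit():
            return segment
    return None


if __name__ == "__main__":
    url = ("https://www.safaribooksonline.com/library/view/"
           "test-driven-development-with/9781491958698/")
    print(extract_book_id(url))
```

The returned value is what you would pass to --bookid.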

In the current fork, you can also download a topic:

$ python3 safaribooks.py --cred "account_mail@mail.com:password01" --topic linux

or a playlist:

$ python3 safaribooks.py --cred "account_mail@mail.com:password01" --collection GUID

The collection ID is the GUID found in the URL of the playlist description page:
https://learning.oreilly.com/playlists/<GUID>/
For example: https://learning.oreilly.com/playlists/c1c43f61-9e45-4bd1-84d5-84799b1e6f44/
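
The GUID can be extracted and sanity-checked in a similar way. This is a minimal sketch, assuming the GUID is the path segment right after `playlists` as in the example above; `extract_playlist_id` is a hypothetical helper name, and the standard `uuid` module is used only to reject values that are not well-formed GUIDs.

```python
import uuid
from typing import Optional
from urllib.parse import urlparse


def extract_playlist_id(url: str) -> Optional[str]:
    """Return the playlist GUID from a playlist URL, or None if not found."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    try:
        # The GUID is expected right after the "playlists" segment.
        candidate = segments[segments.index("playlists") + 1]
    except (ValueError, IndexError):
        return None
    try:
        # uuid.UUID raises ValueError for malformed input.
        return str(uuid.UUID(candidate))
    except ValueError:
        return None


if __name__ == "__main__":
    url = "https://learning.oreilly.com/playlists/c1c43f61-9e45-4bd1-84d5-84799b1e6f44/"
    print(extract_playlist_id(url))
```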

Program options:

$ python3 safaribooks.py --help
usage: safaribooks.py [--cred <EMAIL:PASS> | --login] [--no-cookies]
                      [--kindle] [--preserve-log] [--help] [--title]
                      [--bookid] <BOOK ID>
                      [--topic] <TOPIC>
                      [--collection] <COLLECTION ID>
                      

Download and generate an EPUB of your favorite books from Safari Books Online.

optional arguments:
  --cred <EMAIL:PASS>          Credentials used to perform the auth login on Safari
                               Books Online. Es. ` --cred
                               "account_mail@mail.com:password01" `.
  --login                      Prompt for credentials used to perform the auth login
                               on Safari Books Online.
  --no-cookies                 Prevent your session data to be saved into
                               `cookies.json` file.
  --bookid <BOOK ID>           Book digits ID that you want to download. You can find
                               it in the URL (X-es):
                               `https://learning.oreilly.com/library/view/book-
                               name/XXXXXXXXXXXXX/`
  --topic <TOPIC>              Downloads all the books in a topic. You can find 
                               it in the URL:
                               `https://learning.oreilly.com/library/topics/<topic>/`
  --collection <COLLECTION ID> Downloads all the books in a collection. You can find 
                               it in the URL:
                               `https://learning.oreilly.com/playlists/<collection>/`

  --kindle                     Add some CSS rules that block overflow on `table` and
                               `pre` elements. Use this option if you are going to
                               export the EPUB to E-Readers like Amazon Kindle.
  --title                      Output file will be based on title instead of ISBN.
                                Use this option if you want output files with `Title (ISBN).epub`.

  --preserve-log               Leave the `info_XXXXXXXXXXXXX.log` file even if there
                               isn't any error.
  --help                       Show this help message.

The first time you use the program, you'll have to specify your Safari Books Online account credentials (mind any special characters in your password).
On later runs, as long as the session has not expired, you can omit the credentials, because the program saves your session cookies in a file called cookies.json.
For SSO, use the sso_cookies.py program to create the cookies.json file from the SSO cookies retrieved from your browser session.
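
To check whether a usable cookies.json is already in place before deciding to omit the credentials, something like the following could be used. It is a sketch under one loud assumption: that cookies.json holds a flat JSON object mapping cookie names to values. `load_cached_cookies` is a hypothetical helper and does not tell you whether the session has expired on the server side.

```python
import json
from pathlib import Path
from typing import Dict


def load_cached_cookies(path: str = "cookies.json") -> Dict[str, str]:
    """Return cached cookies as a name -> value dict, or {} if unusable."""
    cookie_file = Path(path)
    if not cookie_file.is_file():
        return {}
    try:
        data = json.loads(cookie_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return {}
    # Assumption: the file is a flat JSON object of cookie names to values.
    if not isinstance(data, dict):
        return {}
    return {str(name): str(value) for name, value in data.items()}


if __name__ == "__main__":
    cookies = load_cached_cookies()
    if cookies:
        print(f"Found {len(cookies)} cached cookies; --cred may be omitted.")
    else:
        print("No usable cookies.json; pass --cred or --login.")
```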

Be careful if you use a shared PC, because anyone who has access to your files can steal your session. If you don't want to cache the cookies, use the --no-cookies option and provide your credentials every time through the --cred option or the safer --login one, which prompts you for credentials during script execution.
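
If you do keep cookies.json on a machine that others can log into, tightening the file permissions is one way to reduce the exposure. The sketch below assumes a POSIX system (on Windows, `os.chmod` only toggles the read-only flag); `restrict_to_owner` is a hypothetical helper, and the script itself is not known to do this for you.

```python
import os
import stat


def restrict_to_owner(path: str) -> int:
    """Make `path` readable and writable by its owner only; return the new mode."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # same effect as `chmod 600`
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    if os.path.isfile("cookies.json"):
        mode = restrict_to_owner("cookies.json")
        print(f"cookies.json mode is now {oct(mode)}")
```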

You can configure proxies by setting the HTTPS_PROXY environment variable on your system or by using the USE_PROXY directive in the script.
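
To see which HTTPS proxy, if any, is currently set in your environment, the standard library can report it. This sketch only inspects the environment; it does not touch the USE_PROXY directive, and `https_proxy_from_env` is a hypothetical helper name.

```python
import urllib.request
from typing import Optional


def https_proxy_from_env() -> Optional[str]:
    """Return the HTTPS proxy URL configured in the environment, if any."""
    # getproxies_environment() collects variables named <scheme>_proxy,
    # in either case, so HTTPS_PROXY shows up under the key "https".
    return urllib.request.getproxies_environment().get("https")


if __name__ == "__main__":
    print(https_proxy_from_env() or "No HTTPS proxy configured")
```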

Calibre EPUB conversion

Important: since the script only downloads HTML pages and creates a raw EPUB, many of the CSS and XML/HTML directives are wrong for an e-reader. To get the best output quality, I suggest always converting the EPUB produced by the script to a standard EPUB with Calibre. You can also use the command-line version of Calibre with ebook-convert, e.g.:

$ ebook-convert "XXXX/safaribooks/Books/Test-Driven Development with Python 2nd Edition (9781491958698)/9781491958698.epub" "XXXX/safaribooks/Books/Test-Driven Development with Python 2nd Edition (9781491958698)/9781491958698_CLEAR.epub"

After the conversion, you can read 9781491958698_CLEAR.epub on any e-reader and delete all the other files.
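
If you convert many books, the same ebook-convert call can be scripted. The following is a sketch, assuming Calibre's `ebook-convert` is on your PATH and that you want the `_CLEAR` naming used above; the helper names are hypothetical, and the example path is a placeholder.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def clear_epub_path(source: Path) -> Path:
    """Map `9781491958698.epub` to `9781491958698_CLEAR.epub` in the same folder."""
    return source.with_name(f"{source.stem}_CLEAR{source.suffix}")


def build_convert_command(source: Path) -> List[str]:
    """Build the `ebook-convert INPUT OUTPUT` argument list for one EPUB."""
    return ["ebook-convert", str(source), str(clear_epub_path(source))]


def convert(source: Path) -> Path:
    """Run ebook-convert on `source` and return the path of the converted file."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; install Calibre first")
    subprocess.run(build_convert_command(source), check=True)
    return clear_epub_path(source)


if __name__ == "__main__":
    # Placeholder path: point this at an EPUB produced by the script.
    print(build_convert_command(Path("Books/Some Book (9781491958698)/9781491958698.epub")))
```

Passing the arguments as a list avoids shell quoting problems with the spaces and parentheses in the output directory names.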

The program also offers an option for better compatibility when exporting the EPUB to e-readers like Amazon Kindle: --kindle, which blocks overflow on table and pre elements (see example).
In this case, I suggest converting the EPUB to AZW3 or MOBI with Calibre; remember to select Ignore margins in the conversion options:

[Image: Calibre IgnoreMargins]

Examples:

  • $ python3 safaribooks.py --cred "my_email@gmail.com:MyPassword1!" --bookid 9781491958698
    
           ____     ___         _ 
          / __/__ _/ _/__ _____(_)
         _\ \/ _ `/ _/ _ `/ __/ / 
        /___/\_,_/_/ \_,_/_/ /_/  
          / _ )___  ___  / /__ ___
         / _  / _ \/ _ \/  '_/(_-<
        /____/\___/\___/_/\_\/___/
    
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    [-] Logging into Safari Books Online...
    [*] Retrieving book info... 
    [-] Title: Test-Driven Development with Python, 2nd Edition                     
    [-] Authors: Harry J.W. Percival                                                
    [-] Identifier: 9781491958698                                                   
    [-] ISBN: 9781491958704                                                         
    [-] Publishers: O'Reilly Media, Inc.                                            
    [-] Rights: Copyright © O'Reilly Media, Inc.                                    
    [-] Description: By taking you through the development of a real web application 
    from beginning to end, the second edition of this hands-on guide demonstrates the 
    practical advantages of test-driven development (TDD) with Python. You’ll learn 
    how to write and run tests before building each part of your app, and then develop
    the minimum amount of code required to pass those tests. The result? Clean code
    that works.In the process, you’ll learn the basics of Django, Selenium, Git, 
    jQuery, and Mock, along with curre...
    [-] Release Date: 2017-08-18
    [-] URL: https://learning.oreilly.com/library/view/test-driven-development-with/9781491958698/
    [*] Retrieving book chapters...                                                 
    [*] Output directory:                                                           
        /XXXX/safaribooks/Books/Test-Driven Development with Python 2nd Edition (9781491958698)
    [-] Downloading book contents... (53 chapters)                                  
        [#####################################################################] 100%
    [-] Downloading book CSSs... (2 files)                                          
        [#####################################################################] 100%
    [-] Downloading book images... (142 files)                                      
        [#####################################################################] 100%
    [-] Creating EPUB file...                                                       
    [*] Done: /XXXX/safaribooks/Books/Test-Driven Development with Python 2nd Edition 
    (9781491958698)/9781491958698.epub
    
        If you like it, please * this project on GitHub to make it known:
            https://github.com/lorenzodifuccia/safaribooks
        e don't forget to renew your Safari Books Online subscription:
            https://learning.oreilly.com
    
    [!] Bye!!

    The result will be (opening the EPUB file with Calibre):

    [Image: Book Appearance]

  • With or without the --kindle option:

    $ python3 safaribooks.py --kindle --bookid 9781491958698

    On the right, the book created with the --kindle option; on the left, without it (default):

    [Image: NoKindle Option]

