Error while extracting - can not pinpoint the exact source of error #6

Closed
flintstones-fred opened this issue Mar 6, 2023 · 4 comments
Labels: bug (Something isn't working)

Comments

@flintstones-fred

When I try to download one particular book (most books work amazingly well), I get the following error. Unlike previous cases, I can't tell what the source of the error is. The download seems to complete without issues, so maybe the chapterizer fails? I have reproduced this on macOS and Linux, both running sonus 0.6.3.

Here is the output:

Downloading part 10 of 10... 100%
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/bin/sonus", line 8, in <module>
    sys.exit(main.main())
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sonus/main.py", line 119, in main
    chapterizer.main(tmpdir.name, output_path, generic, offset, ffmpeg_debug)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sonus/chapterizer.py", line 219, in main
    all_markers = scan_overdrive_metadata(file_list)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sonus/chapterizer.py", line 22, in scan_overdrive_metadata
    audio_data = xmltodict.parse(xmlstring)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/xmltodict.py", line 327, in parse
    parser.Parse(xml_input, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 741

@flintstones-fred
Author

The book is "Year of the Tiger" by Alice Wong.

@digitalec
Owner

digitalec commented Mar 7, 2023

Yep, it looks like the chapterizer is failing when it tries to read the XML data embedded in the file.

The XML data contains the data necessary to cut the files and tag them so I'm very curious to see what's up with that book!

I've put a hold on the book (~8 weeks) so I can investigate it as soon as it's available.

Edit: It looks like this particular error is related to the encoding of the XML data. That should be an easy fix if you're willing to do the testing for me.
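Since the traceback only reports "line 1, column 741", it may help to print the text around that position before guessing at the cause. Below is a minimal sketch using the standard library's expat bindings (the same parser xmltodict calls in the traceback). The helper name `locate_xml_error`, the `context` parameter, and the sample string are hypothetical, made up for illustration; this is not code from sonus.

```python
from xml.parsers import expat


def locate_xml_error(xmlstring, context=20):
    """Return None if xmlstring parses, else (lineno, offset, snippet).

    Hypothetical helper for illustration only. ExpatError exposes the
    1-based line number as `lineno` and the column as `offset`.
    """
    parser = expat.ParserCreate()
    try:
        parser.Parse(xmlstring, True)
    except expat.ExpatError as err:
        lines = xmlstring.splitlines() or [""]
        line = lines[err.lineno - 1]
        # Slice a window around the reported column. For non-ASCII input
        # the column may not line up exactly with character indexes.
        start = max(err.offset - context, 0)
        return err.lineno, err.offset, line[start:err.offset + context]
    return None


# Made-up sample with a bare ampersand, not the real book metadata.
sample = "<Markers><Name>Q & A</Name></Markers>"
result = locate_xml_error(sample)
if result is not None:
    lineno, offset, snippet = result
    print(f"line {lineno}, column {offset}: {snippet!r}")
```

Running the same helper on the XML string pulled from the affected file should show which characters sit near column 741.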

@digitalec digitalec self-assigned this Mar 7, 2023
@digitalec digitalec added the bug Something isn't working label Mar 7, 2023
@cerinawithasea

I knew to hold on to the files because of this thread. Did you ever get the book? I can grab it anytime.

@digitalec
Owner

This is fixed. One of the text strings within the XML data contained a bare &, which is not valid in XML text: & starts an entity reference, so a literal ampersand has to be written as &amp;.
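For anyone hitting the same error elsewhere, one way to work around a bare & is to escape it before parsing. The sketch below is an assumption about how such a workaround could look, not the actual change made in sonus; the function name `escape_bare_ampersands` and the sample string are hypothetical.

```python
import re
from xml.parsers import expat

# Match an "&" that does not start an entity or character reference
# such as &amp;, &#38; or &#x26;.
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xmlstring):
    """Replace bare ampersands with &amp; and leave existing references alone."""
    return _BARE_AMP.sub("&amp;", xmlstring)


# Made-up sample, not the real book metadata.
broken = "<Markers><Name>Q & A</Name></Markers>"
fixed = escape_bare_ampersands(broken)

# The repaired string now parses without raising ExpatError.
expat.ParserCreate().Parse(fixed, True)
print(fixed)
```

One caveat: a regex like this also rewrites ampersands inside CDATA sections, where a bare & is legal, so it is only a reasonable stopgap when the metadata does not use CDATA.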
