As pointed out in this thread, right now it only works with text-based PDFs. But there's a PR[1] which will add OCR support (using EasyOCR) for image-based PDFs at some point.
Yes, I need to work on that PR; I haven't had much free time these days. It adds OCR support using EasyOCR, which I found on HN some time ago!
A common use of COM was scripting with Visual Basic in the 1990s: for instance, asking Excel what is in cell B7, or dynamically loading a GUI component out of a DLL and scripting it into a Visual Basic application.
This blurs the boundaries between applications: you might have a Word document with an Excel spreadsheet embedded in it, and it really does boot up Excel and have Excel render itself in a rectangle inside the Word document.
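COM is still used a lot in audio (at least the COM ABI) because it allows sharing objects between programs and shared libraries while managing their destruction through reference counting. It also has a nice way to add functionality: a host can ask an object whether it supports a newer interface and fall back if it doesn't.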
It's a language-agnostic binary interface. It's kind of hard to explain without getting into the technical details of how it works. For many years it was the only stable ABI on Windows.
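For anyone who wants a feel for those technical details, here is a rough sketch of the contract every COM object follows: callers take and drop references, and they ask for interfaces by name instead of assuming them. This is only a toy model in Python, not the real binary layout; the `ComObject` class and the interface names (`IAudioEffect`, `IAudioEffect2`, `IMidiEffect`) are made up for illustration.

```python
class ComObject:
    """Toy stand-in for a COM object; it models the rules, not the real ABI."""

    def __init__(self, interfaces):
        # Maps a hypothetical interface name to whatever implements it.
        self._interfaces = dict(interfaces)
        self._refs = 1            # the creator holds the first reference
        self.destroyed = False

    def add_ref(self):
        self._refs += 1
        return self._refs

    def release(self):
        self._refs -= 1
        if self._refs == 0:
            self.destroyed = True  # real COM would free the object here
        return self._refs

    def query_interface(self, name):
        # Hand out the interface and take a reference, or None if unsupported.
        impl = self._interfaces.get(name)
        if impl is not None:
            self.add_ref()
        return impl


plugin = ComObject({"IAudioEffect": "v1 effect", "IAudioEffect2": "v2 effect"})
host_view = plugin.query_interface("IAudioEffect2")  # newer host asks for v2
legacy_view = plugin.query_interface("IMidiEffect")  # not supported by this plugin
plugin.release()                                     # creator lets go, host still holds one
alive_after_creator = not plugin.destroyed
plugin.release()                                     # host lets go, count reaches zero
```

The object is only torn down once the last holder releases it, which is why it can be passed safely between a host and a plugin built with different compilers or languages. In real COM the same three operations are `QueryInterface`, `AddRef` and `Release` on `IUnknown`, and an unsupported interface is reported with an error code rather than `None`.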
[1] https://github.com/camelot-dev/camelot/pull/209