Impossible to read ODT file previously saved by PHPWord as ODText #2493
Comments
FWIW, the document prepared by save appears to be valid; the problem does not lie with Writer/Odt. On the other hand, Reader/Odt has some gaps. At a minimum, Reader/ODText/Content has no support for |
oleibman added a commit to oleibman/PHPWord that referenced this issue on Nov 12, 2023
Fix PHPOffice#2493. There is much that the ODT Reader ignores. This change adds support for the `text:section`, `text:span`, `text:s`, and `text:tab` tags, thereby handling multiple sections, text runs, tab characters, and multiple spaces. There will still be many omissions (e.g. styles and tables), but you will now often be able to access the text content of valid ODT documents. The issue supplies two variations of a simple file: one created directly by LibreOffice, and a similar one created by PhpWord. Both are unit-tested. A `getText` method is added to TextRun to facilitate testing (and can be useful on its own). It returns the concatenated texts of all elements of the text run.
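To make the role of those four tags concrete, here is a minimal Python sketch of how a reader might turn ODT paragraph markup into plain text. It is an illustration only, not PHPWord's implementation: the helper names `flatten` and `paragraphs` and the `SAMPLE` fragment are hypothetical, and the sketch assumes that `text:s` carries its repeat count in the `text:c` attribute (defaulting to one space), as in the OpenDocument format.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespace URIs
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def q(local):
    """Qualified ElementTree name for a text: element or attribute."""
    return f"{{{TEXT_NS}}}{local}"


def flatten(node):
    """Concatenate the text of a text:p or text:span node (hypothetical helper)."""
    parts = [node.text or ""]
    for child in node:
        if child.tag == q("s"):
            # text:s stands for one or more spaces; text:c holds the count
            parts.append(" " * int(child.get(q("c"), "1")))
        elif child.tag == q("tab"):
            parts.append("\t")
        elif child.tag == q("span"):
            # a span is a nested text run, so recurse into it
            parts.append(flatten(child))
        # any other child element is ignored in this sketch
        parts.append(child.tail or "")
    return "".join(parts)


def paragraphs(content_xml):
    """Return the flattened text of every text:p, including those in text:section."""
    root = ET.fromstring(content_xml)
    # iter() walks the whole tree, so paragraphs wrapped in text:section are found too
    return [flatten(p) for p in root.iter(q("p"))]


# Hypothetical fragment, loosely shaped like the body of an ODT content.xml
SAMPLE = f"""<office:text xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <text:section text:name="Section1">
    <text:p>Hello<text:s text:c="3"/><text:span>bold</text:span><text:tab/>end</text:p>
  </text:section>
  <text:p>Outside</text:p>
</office:text>"""

print(paragraphs(SAMPLE))
```

A reader that only looks at direct `text:p` children of the document body would miss the first paragraph here, because it is wrapped in a `text:section`. That is consistent with the empty element list reported in this issue, although the sketch does not claim to reproduce the library's exact behavior.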
oleibman added a commit to oleibman/PHPWord that referenced this issue on Nov 22, 2023
Progi1984 pushed a commit to oleibman/PHPWord that referenced this issue on Nov 30, 2023
Describe the Bug
Reading an ODT file that was previously saved by PHPWord returns an empty array of elements.
Steps to Reproduce
Expected Behavior
Should return an array of PhpOffice\PhpWord\Element objects.
Current Behavior
Returns [] in $document->getSection(0)->elements.
Leads
The library can properly read this (produced by LibreOffice when saving a document as an .odt file):
but cannot read this (produced by the library itself when saving as .odt):
Context
Please fill in your environment information: