addHtml() inconsistency across various export formats #2232

Open
sirfragalot opened this issue May 23, 2022 · 5 comments

@sirfragalot

sirfragalot commented May 23, 2022

Describe the Bug

Inconsistency in exporting HTML to various output formats.

I develop the WIKINDX bibliographic management software, which includes a word processor built on TinyMCE. Currently we export the word processor output to RTF using code we developed ourselves (so we can appreciate the difficulties). We'd like to export to more formats such as DOCX and ODT, so I decided to give PHPWord a go. Unfortunately, the results are not yet usable, as exporting the same HTML to RTF, DOCX and ODT gives wildly different results.

A related issue is the documentation. For example, a search of the documentation for 'addhtml' produces no results, and I only came across addHtml() by chance while looking through the bug reports here. It seems a lot of (important) PHPWord functionality is simply not documented. (Yes, I know from experience that free and open source software relies on the kindness of strangers.) I mention this because the inconsistency I describe might be solved by some method or setting that I cannot find in the documentation.

Steps to Reproduce

The following code uses HTML output by TinyMCE to export to DOCX, RTF, and ODT.

require "core/libs/vendor/PHPWord/bootstrap.php";
$phpWord = new \PhpOffice\PhpWord\PhpWord();
\PhpOffice\PhpWord\Settings::setOutputEscapingEnabled(true);

echo "here";

$section = $phpWord->addSection();

$text = "
<p>Is it possible to <strong>im<em>agi</em>ne</strong> an <span style=\"text-decoration: underline;\">unimaginable</span> sound<sup>2</sup>?</p>
<p><span style=\"font-family: 'courier new', courier, monospace; font-size: 36pt;\">large text different font</span></p>
<p>A number of ways to approach this question beginning with semantics (the meanings of the words in the questions):</p>
<ul>
<li>What is the definition of 'imagination' or 'to imagine' that is being used?</li>
<li><em>unimaginable/unimagined</em> by one person or within a culture?</li>
<li>There is a difference between <em>unimaginable</em> & <em>unimagined</em>.</li>
</ul>
<p><img src=\"data/images/winnmp_project_1_a8b8fcce8f468f33fff821212dcf9ebc1ea7274d.png\" width=\"400\" height=\"296\" /></p>
<p> </p>
<table style=\"border-collapse: collapse; width: 100%; height: 40.3334px; border-style: solid;\" border=\"1\">
<tbody>
<tr>
<td style=\"width: 48.1085%;\">cell 1</td>
<td style=\"width: 48.1085%;\"><span style=\"color: #e03e2d;\">cell 2</span></td>
</tr>
<tr>
<td style=\"width: 48.1085%;\">cell 3: ōöéÊÉï</td>
<td style=\"width: 48.1085%;\">cell 4: 𞤔𞤕</td>
</tr>
</tbody>
</table>
<p><a title=\"wikindx\" href=\"https://wikindx.sourceforge.io\">wikindx</a></p>
<p> </p>
<p style=\"text-align: left;\">Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified </p>
<p style=\"text-align: right;\">Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified </p>
<p>Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified </p>
<p>Centered</p>
";

\PhpOffice\PhpWord\Shared\Html::addHtml($section, $text);

$file = 'helloWorld.docx';
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'Word2007');
$objWriter->save($file);

$file = 'helloWorld.rtf';
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'RTF');
$objWriter->save($file);

$file = 'helloWorld.odt';
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'ODText');
$objWriter->save($file);

Expected Behavior

Consistent and accurate export across the three outputs.

Current Behavior

Inconsistent and inaccurate export across the three outputs.

  1. Image. Not output in ODT
  2. List. Not output in RTF and ODT
  3. Table. No border in RTF, DOCX or ODT (the RTF case was fixed in "RTF Writer : Support for Table Border Style" #2656)
  4. Table. Not 100% width in RTF, DOCX, ODT
  5. Table. Cells completely messed up in RTF
  6. Coloured font. Only in DOCX
  7. Hyperlink. Generally fine but only ODT has link in blue and underlined
  8. Paragraph. No justified or centered text in any of the three outputs
  9. UTF-8 etc. Only in ODT

Context

Please fill in your environment information:

  • PHP Version: 7.4.9
  • PHPWord Version: 0.18.3

Regards,

Mark

@wube1

wube1 commented May 26, 2022

Hi @sirfragalot, sorry to fish you out like that out of nowhere. If I'm not mistaken, you also deliver hyperion via Docker. Recently there has been a newer version, but even though the tags on your container say latest or 2.0.13, once you install the container it is still 2.0.12 from November. Is there any chance you could fix that somehow? I was really looking forward to the latest version from your container repo. And sorry for the off-topic!

@sirfragalot
Author

Hi,

I'm afraid you are mistaken. I have nothing to do with hyperion.

@wube1

wube1 commented May 30, 2022

Hi @sirfragalot, I'm so sorry then. I took into account that this might be the case, but risked it and asked anyhow. Wish you a pleasant week! Cheers!

@ledahu

ledahu commented Jun 10, 2022

Just discovered your issue after I posted mine: if you remove the enclosing <p> tag around your <img>, the image will show up.
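The workaround above could be applied before calling addHtml() with something like the following. This is only a minimal sketch: unwrapParagraphImages() is a hypothetical helper, not part of PHPWord, and it assumes the image is the only content of its paragraph, as in the TinyMCE output from the original report. A regular expression is used for brevity; a DOM-based rewrite would be more robust for arbitrary HTML.

```php
<?php
// Hypothetical helper (not a PHPWord function): strips a <p> wrapper
// when the paragraph contains nothing but a single <img> element.
function unwrapParagraphImages(string $html): string
{
    // <p ...>, optional whitespace, one <img ...>, optional whitespace, </p>
    $pattern = '~<p\b[^>]*>\s*(<img\b[^>]*>)\s*</p>~i';

    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace($pattern, '$1', $html) ?? $html;
}

$html = '<p>Some text</p><p><img src="picture.png" width="400" height="296" /></p>';
echo unwrapParagraphImages($html), PHP_EOL;
```

Paragraphs that mix text and an image are left untouched, so only the stand-alone image case described in the comment is affected. The cleaned string would then be passed to addHtml() in place of the original $text.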

@github-actions github-actions bot added the Stale label Sep 22, 2022
@Progi1984 Progi1984 removed the Stale label Nov 18, 2022
@Legendary4226

Facing the same issue: I wanted to export HTML generated by TinyMCE to ODT format, but lists (<ol> and <ul>) aren't output in the generated file.

The current and only working solution is to export to DOCX format.

@Progi1984 Progi1984 added this to the 1.3.0 milestone Aug 22, 2024
@PHPOffice PHPOffice deleted a comment from github-actions bot Aug 22, 2024