Add Support for Various Missing Features in HTML Writer #2475

Merged
Progi1984 merged 7 commits into PHPOffice:master from html_changes_5 on Nov 22, 2023

Conversation

oleibman
Contributor

@oleibman oleibman commented Sep 18, 2023

This PR supersedes #1814 and #2343, which had become badly out of sync due to recent changes to the repository.

Implement a number of features that PhpWord supports but that the PhpWord HTML Writer does not yet support.

  1. Use css @page and page declarations for sections.
  2. Wrap sections in div, with page break before each (except first).
  3. Add ability to specify generic fallback font for html (documentation change).
  4. Add ability to specify handling of whitespace in html (documentation change). Currently, Word writer preserves space but HTML writer does not.
  5. Support for Language, both for document overall and individual text elements.
  6. Support for PageBreak for HTML (currently only PDF is supported).
  7. Support for Table Border style, color, and size.
  8. Support for empty paragraphs (Word writer permits, browsers generally suppress).
  9. Default paragraph style should apply to all paragraphs, as well as class Normal.
  10. Paragraph style should support line-height.
  11. Paragraph style should support indentation.
  12. Paragraph style should support page-break-before.
  13. Paragraph style should not specify margin-top/bottom when spacing is null.
  14. Fix #2467 ("Bug while transforming html to word document with addHtml method."): text-align:right in html is not handled correctly.
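
To make items 10 to 13 concrete, here is a minimal, self-contained sketch of how paragraph style values might map to CSS declarations. It is not the code from this PR: `paragraphCss` and its array keys (`lineHeight`, `indentTwips`, `pageBreakBefore`, `spaceBeforePt`, `spaceAfterPt`) are hypothetical names chosen for illustration. The only outside fact assumed is that there are 1440 twips per inch.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not PhpWord code): builds CSS declarations for a
 * paragraph from a handful of style values, following items 10 to 13.
 *
 * @param array<string, mixed> $style illustrative keys only
 */
function paragraphCss(array $style): string
{
    $css = [];

    // Item 10: emit line-height only when a value is set.
    if (isset($style['lineHeight'])) {
        $css['line-height'] = (string) $style['lineHeight'];
    }

    // Item 11: indentation, converted from twips (1440 per inch) to inches.
    if (isset($style['indentTwips'])) {
        $css['margin-left'] = (string) round($style['indentTwips'] / 1440, 2) . 'in';
    }

    // Item 12: request a page break before the paragraph.
    if (!empty($style['pageBreakBefore'])) {
        $css['page-break-before'] = 'always';
    }

    // Item 13: a null (or absent) spacing produces no margin declaration.
    $margins = ['spaceBeforePt' => 'margin-top', 'spaceAfterPt' => 'margin-bottom'];
    foreach ($margins as $key => $property) {
        if (isset($style[$key])) {
            $css[$property] = (string) $style[$key] . 'pt';
        }
    }

    $declarations = [];
    foreach ($css as $property => $value) {
        $declarations[] = $property . ': ' . $value . ';';
    }

    return implode(' ', $declarations);
}

echo paragraphCss([
    'lineHeight' => 1.5,
    'indentTwips' => 720,
    'spaceBeforePt' => null,
    'spaceAfterPt' => 6.0,
]), PHP_EOL;
// prints "line-height: 1.5; margin-left: 0.5in; margin-bottom: 6pt;"
```

Because `isset` is false for a null value, the null `spaceBeforePt` above yields no `margin-top` at all, which is the behavior item 13 asks for.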

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.

Fixes # (issue)

Checklist:

  • I have run composer run-script check --timeout=0 and no errors were reported
  • The new code is covered by unit tests (check build/coverage for coverage report)
  • I have updated the documentation to describe the changes

@coveralls

coveralls commented Sep 18, 2023

Coverage Status

coverage: 95.489% (+13.8%) from 81.642%
when pulling 4231051 on oleibman:html_changes_5
into d5ca5b4 on PHPOffice:master.

@Progi1984
Member

@oleibman This is wonderful work. Could you allow to commit on your repository? I would like to complete your PR.

@Progi1984 Progi1984 added this to the 1.2.0 milestone Sep 20, 2023
@oleibman
Contributor Author

@Progi1984 That would be great. However, I don't understand your question. When you ask "Could you allow to commit on your repository?", are you just asking me if it's okay to commit (in which case the answer is yes), or is there something I need to do in my repository to allow you to do what you need to do?

@oleibman
Contributor Author

Done.

@Progi1984
Member

@oleibman I will work on your PR next week.

@oleibman
Contributor Author

Great! But ... I'm on vacation for a couple of weeks starting Wednesday, and am seriously considering not taking my computer with me. So, if you want to wait till I get back, I will let you know when I am available again.

@Progi1984
Member

Progi1984 commented Sep 30, 2023

@oleibman I will push some changes and wait for your return.

Have fun and rest.

@oleibman
Contributor Author

@Progi1984 I am back now, so please proceed. I noticed that this change also fixes a recent issue, so updated the description to say so (item 14 in my bullet list); no new code change is involved.

@Progi1984 Progi1984 self-requested a review November 2, 2023 12:56
@Progi1984
Member

@oleibman My goal for the week: rebase the PR, check all modifications, add some unit tests, and merge it. 💪

Please don't add to or modify this branch.

@oleibman
Contributor Author

oleibman commented Nov 5, 2023

@Progi1984 Okay. Good luck.

This PR supersedes PHPOffice#1814 and PHPOffice#2343, which had become badly out of sync due to recent changes to the repository.

Implement a number of features implemented in PhpWord,
but not yet supported in PhpWord HTML Writer.

1. Use css @page and page declarations for sections.
2. Wrap sections in div, with page break before each (except first).
3. Add ability to specify generic fallback font for html (documentation change).
4. Add ability to specify handling of whitespace in html (documentation change).
   Currently, Word writer preserves space but HTML writer does not.
5. Support for Language, both for document overall and individual text elements.
6. Support for PageBreak for HTML (currently only PDF is supported).
7. Support for Table Border style, color, and size.
8. Support for empty paragraphs (Word writer permits, browsers generally suppress).
9. Default paragraph style should apply to all paragraphs, as well as class Normal.
10. Paragraph style should support line-height.
11. Paragraph style should support indentation.
12. Paragraph style should support page-break-before.
13. Paragraph style should not specify margin-top/bottom when spacing is null.
Makes no sense to multiply by 720 when supplied in array to Paragraph, but left unchanged when supplied directly. Skip multiplication.
@Progi1984 Progi1984 force-pushed the html_changes_5 branch 4 times, most recently from 7884fb3 to 27deefb Compare November 8, 2023 19:41
@Progi1984
Member

(@oleibman Next time, please make one PR for each feature; it's easier to review.)

@oleibman
Contributor Author

oleibman commented Nov 8, 2023

@Progi1984 Apologies for the size of the PR. When I developed this one, I modeled it on similar work I had done for the ODT and RTF writers, each of which likewise consisted of many changes and was merged as a single PR. Also, I was pretty new to this, so I didn't realize this was problematic.

@Progi1984 Progi1984 force-pushed the html_changes_5 branch 4 times, most recently from 932d84c to 092ecc9 Compare November 9, 2023 10:20
@oleibman
Contributor Author

@Progi1984 I have downloaded your latest set of changes and run some tests on my own. Everything appears to be working properly except for one minor problem. You changed Writer/HTML/Style/Paragraph line 99 to:

$css[$this->getParentWriter() instanceof TCPDF ? 'text-indent' : 'margin-left'] = ((string) $inches) . 'in';

ParentWriter will be an instance of something like PhpOffice\PhpWord\Writer\PDF\TCPDF. However, because Writer/HTML/Style/Paragraph does not import that class, the TCPDF name in the instanceof test resolves to something like PhpOffice\PhpWord\Writer\HTML\Style\TCPDF, so the test is always false. Adding the following to the beginning of Writer/HTML/Style/Paragraph will correct the problem (TCPDF ignores margin-left but honors text-indent):

use PhpOffice\PhpWord\Writer\PDF\TCPDF;
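
The name-resolution behavior described above can be reproduced outside PhpWord. The following is a minimal sketch using made-up `Demo\...` namespaces and function names (none of them are PhpWord's). It relies only on two PHP language rules: an unqualified class name resolves against the current namespace unless imported with `use`, and `instanceof` against a class that does not exist quietly evaluates to false rather than raising an error.

```php
<?php

namespace Demo\Writer\PDF {
    // Stand-in for the real PDF writer class.
    class TCPDF
    {
    }
}

namespace Demo\Writer\HTML\Style {
    // No "use" statement here, so TCPDF resolves to
    // Demo\Writer\HTML\Style\TCPDF, which does not exist.
    // instanceof does not autoload or fail; it is simply false.
    function indentPropertyWithoutImport(object $writer): string
    {
        return $writer instanceof TCPDF ? 'text-indent' : 'margin-left';
    }
}

namespace Demo\Writer\HTML\StyleFixed {
    // With the import, TCPDF resolves to Demo\Writer\PDF\TCPDF.
    use Demo\Writer\PDF\TCPDF;

    function indentPropertyWithImport(object $writer): string
    {
        return $writer instanceof TCPDF ? 'text-indent' : 'margin-left';
    }
}

namespace {
    $writer = new \Demo\Writer\PDF\TCPDF();

    // prints "margin-left": the test silently fails without the import
    echo \Demo\Writer\HTML\Style\indentPropertyWithoutImport($writer), PHP_EOL;

    // prints "text-indent": the import makes the test match
    echo \Demo\Writer\HTML\StyleFixed\indentPropertyWithImport($writer), PHP_EOL;
}
```

Since the failing case produces no warning, this kind of mistake tends to surface only through output differences, as it did here.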

@oleibman
Contributor Author

@Progi1984 Before I do something dumb - can I sync my master with PhpOffice without affecting your work on this PR?

@oleibman
Contributor Author

It appears that changing Writer/HTML/Element/AbstractElement so that parentWriter is of type PhpOffice\PhpWord\Writer\HTML rather than AbstractWriter will eliminate a lot of Phpstan errors without adversely affecting anything.

@Progi1984 Progi1984 force-pushed the html_changes_5 branch 4 times, most recently from 51078b7 to f2cf2d0 Compare November 21, 2023 18:36
@oleibman
Contributor Author

@Progi1984 Downloaded the latest. After adapting some of my test scripts to new method names and/or locations, things are looking good. There are some things that are a little more difficult to code than they were, but I can still do them, so no big deal.

@Progi1984
Member

@oleibman I was going to contact you later today. I'll do some checking and merge later today.

@Progi1984 Progi1984 merged commit 2c0488c into PHPOffice:master Nov 22, 2023
13 checks passed
@Progi1984 Progi1984 deleted the html_changes_5 branch November 22, 2023 11:00
@Progi1984
Member

🎉 @oleibman Thanks for the contribution.

oleibman added a commit to oleibman/PHPWord that referenced this pull request Dec 24, 2023
Fix PHPOffice#1692. Builds on work started some time ago by @0b10011, to whom primary credit is due.

Html Reader does not process the `head` section of the document, and, in particular, does not process its `style` section. It will, however, process inline styles, so 0b10011's model of adding the title as a text run (with styles) will work well once this change is applied. However, that model would not deal with the alternative method of assigning a Title Style, and just adding the title as text. In order to accommodate that, I have removed the declaration of heading font styles in the head section, and now generate them all inline in the body. This has the added benefit of being able to read the doc as html, then saving it as docx, preserving, at least in part, any user-defined font styles. Note that html does have pre-defined title styles, but docx does not.

@constip suggests in the original issue that margin top and bottom are being applied too frequently. I believe that was addressed by recently merged PR PHPOffice#2475. It is also suggested that the `*` css selector be dropped in favor of `body`. 2475 added the body selector. I agree that this renders the `*` selector unnecessary, and, as stated in the issue, it can cause problems. This PR drops that selector. It is also suggested that `loadHTML` be used instead of `loadXML`. This is not as easy a change as it seems, because loadHTML uses ISO-8859-1 charset rather than UTF-8, so I will not attempt that change.
oleibman added a commit to oleibman/PHPWord that referenced this pull request Jan 6, 2024
Fix PHPOffice#2539. Inadvertent break in TemplateProcessor behavior with PHPOffice#2475. Deleted temp file on destruct. It will now persist, restoring prior behavior, unless user specifies otherwise in constructor.
oleibman added a commit to oleibman/PHPWord that referenced this pull request Jan 8, 2024
Replace PR PHPOffice#2542.

Fix PHPOffice#2539. Inadvertent break in TemplateProcessor behavior after PHPOffice#2475. Deleted temp file on destruct. It will now persist after destructor.
Progi1984 pushed a commit that referenced this pull request Jan 8, 2024
* Template Processor Persist File After Destruct

Replace PR #2542.

Fix #2539. Inadvertent break in TemplateProcessor behavior after #2475. Deleted temp file on destruct. It will now persist after destructor.

* Update Change Log
oleibman added a commit to oleibman/PHPWord that referenced this pull request Jan 18, 2024
Fix PHPOffice#2548. A particularly perplexing problem accidentally introduced by PR PHPOffice#2475. Problem does not arise for Php8, and does not arise for Php7 unit tests. But, running *not* under Phpunit auspices with Php7 can cause a warning message at destructor time if the `save` function has been used. A very artificial test is introduced to test this situation.
Development

Successfully merging this pull request may close these issues.

Bug while transforming html to word document with addHtml method.