CloneBlock, replaceBlock, and deleteBlock not working in template docx #2410
Comments
The #341 solution from @liborm85 works. In my case I needed to change the preg_match code. If I run into problems, I will update soon. I hope someone can fix this issue without the need to override existing code. Create a new controller, TemplateProcessorMod, that extends TemplateProcessor.
In the controller, change from
to
I encountered this back in 0.16.0 and am only updating today because of a migration to PHP 8.2. Apparently, I had fixed this in my own code. The issue is in the preg_match, and it is simple to fix in TemplateProcessor.php. I'm not a contributor, but I can tell you where the issue is. Instead of "<w:p.*" it should be "<w:p\b.*". The pattern occurs in two places on this line, so both need to be replaced, to make it look like this:
This is an issue because, without the \b, the pattern also captures <w:pPr> nodes, which orphans any <w:p> node that has a <w:pPr> node nested inside. Something like "<w:p><w:pPr>...</w:pPr></w:p>" ends up as just "<w:p>" with no closing "</w:p>" tag, because the replacement runs from "<w:pPr>" through "</w:p>" and leaves only "<w:p>". Adding "\b" requires a word boundary after the "p", so the pattern only matches "<w:p>" nodes.
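The effect of the word boundary can be checked in isolation. The sketch below is only an illustration: it uses a simplified tag pattern (`[^>]*>` instead of the `.*` found in TemplateProcessor.php), and the sample XML is made up for the demonstration.

```php
<?php
// Made-up paragraph with nested paragraph properties.
// Single quotes keep PHP from interpolating ${block_name}.
$xml = '<w:p><w:pPr><w:jc w:val="left"/></w:pPr>'
     . '<w:r><w:t>${block_name}</w:t></w:r></w:p>';

// Without \b: "<w:p" is also a prefix of "<w:pPr", so both tags match.
preg_match_all('/<w:p[^>]*>/', $xml, $m);
$loose = $m[0];   // ['<w:p>', '<w:pPr>']

// With \b: there is no word boundary between "p" and "P",
// so "<w:pPr>" is skipped and only the paragraph tag matches.
preg_match_all('/<w:p\b[^>]*>/', $xml, $m);
$strict = $m[0];  // ['<w:p>']

var_dump($loose, $strict);
```

A paragraph tag with attributes, such as `<w:p w:rsidR="...">`, still matches the second pattern, because the space after the "p" is a word boundary.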
Customer: ${customer_name}? Do you mean ${block_name}?
@thomasb88 Yes, as you mention, ${block_name}.
The current version of the PHPWord cloneBlock function:
Which means the regexp is:
This is the hard part, because the way the regexp is written seems to make assumptions about how you create your template file. Let's first replace the PHP variables, which are there for generic purposes:
Using https://regex101.com/, one can get a kind of explanation:
Not so clear? :-) Well, some summary points about this:
2 - Looking at the first or the last part, one can see that, in addition to the block_name pattern, there is a lot of stuff involving <w:p> and </w:p>, that is to say about how paragraphs interact with the block pattern (see http://officeopenxml.com/WPparagraph.php).
3 - My first try at using cloneBlock was to write the block pattern directly as text in Word, expecting to be able to read it as-is in the corresponding XML file (all of it would have the same style, as was the case from the user's point of view). It failed.
4 - It seems many things can make it fail. For example: Track Changes (which introduces w:rsidR tags); pressing Enter or Ctrl+Enter at the end of the block pattern (Enter should produce a new paragraph, whereas Ctrl+Enter should not, but I have examples where I used only Enter and there is only one paragraph with multiple runs); adding custom tabs on the ruler (w:tab); ...
5 - Word splits the document into paragraphs, then paragraph style, then run(s), then run style(s)... To see all the tag combinations that are allowed, or even used, by your OOXML generator (MS Word, Open Office...), the whole of ECMA-376 would have to be parsed, which is a really long job (https://www.ecma-international.org/publications-and-standards/standards/ecma-376/).
6 - In the latest version of PHPWord, the choice has been made to remove a lot of tags around the block pattern:
7 - The way cloneBlock is implemented seems to put the complexity in the preg_match part (as one can see, the replacement after that is easy). It also means that as the document gets bigger, it reaches the regexp limit more easily.
8 - So instead of trying to "check" the OOXML consistency, one solution is to find only the block pattern (opening and closing). That seems to be what you have done in the new controller TemplateProcessorMod.
9 - But then the "block" to replace might not be "inside" the opening and closing patterns, because of other tags that can surround the block patterns.
And so there are still assumptions that have to be made to make things work:
I see a lot of tickets about those functions, with regexp modifications to fit specific docx files, but I didn't find the corresponding specification. If I missed it, sorry, my bad (but I would be interested to see it). Otherwise, I would:
Note that in the current code, I think one of the problems is that in findContainingXmlBlockForMacro, the search for the closing block pattern starts from the opening block pattern (and so it assumes everything is contained in one paragraph, which is not always true).
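To make the idea of locating the opening and closing patterns independently more concrete, here is a minimal sketch. It is not PHPWord's implementation: `cloneBlockSketch` and `lastParagraphStart` are hypothetical helpers, and the sketch assumes that each marker sits contiguously inside its own paragraph (it gives up when both markers share one paragraph or when Word has split a marker across runs).

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: offset of the last <w:p ...> tag that opens before $pos. */
function lastParagraphStart(string $xml, int $pos): ?int
{
    // \b keeps <w:pPr> from being mistaken for a paragraph start.
    $found = preg_match_all('/<w:p\b[^>]*>/', substr($xml, 0, $pos), $m, PREG_OFFSET_CAPTURE);
    return $found ? end($m[0])[1] : null;
}

/** Hypothetical sketch, not part of PHPWord. */
function cloneBlockSketch(string $xml, string $name, int $clones): string
{
    $open = strpos($xml, '${' . $name . '}');
    $close = strpos($xml, '${/' . $name . '}');
    if ($open === false || $close === false || $close < $open) {
        return $xml; // markers missing or out of order: leave the XML untouched
    }

    // Each marker paragraph is located on its own, not relative to the other.
    $openStart = lastParagraphStart($xml, $open);
    $closeStart = lastParagraphStart($xml, $close);
    $openEnd = strpos($xml, '</w:p>', $open);
    $closeEnd = strpos($xml, '</w:p>', $close);
    if ($openStart === null || $closeStart === null || $openEnd === false || $closeEnd === false) {
        return $xml;
    }
    $openEnd += strlen('</w:p>');
    $closeEnd += strlen('</w:p>');
    if ($closeStart < $openEnd) {
        return $xml; // both markers in one paragraph: not handled by this sketch
    }

    // Everything between the two marker paragraphs is the block to repeat.
    $inner = substr($xml, $openEnd, $closeStart - $openEnd);

    return substr($xml, 0, $openStart) . str_repeat($inner, $clones) . substr($xml, $closeEnd);
}

$row = '<w:p><w:r><w:t>Customer: ${customer_name}</w:t></w:r></w:p>';
$template = '<w:body>'
    . '<w:p><w:pPr><w:jc w:val="left"/></w:pPr><w:r><w:t>${block_name}</w:t></w:r></w:p>'
    . $row
    . '<w:p><w:r><w:t>${/block_name}</w:t></w:r></w:p>'
    . '</w:body>';

$result = cloneBlockSketch($template, 'block_name', 3);
```

With the sample template, the two marker paragraphs are dropped and the paragraph between them is repeated three times. The trade-off noted in point 9 still applies: the sketch trusts that the marker paragraphs are direct siblings of the block content and does nothing to validate the surrounding OOXML.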
Describe the Bug
I am experiencing issues with the cloneBlock, replaceBlock, and deleteBlock methods in the PHPWord library. These methods are used to manipulate template DOCX files by cloning, replacing, or deleting specific blocks of content. However, they are not functioning as expected.
Steps to Reproduce
Prepare a template DOCX file with the following content:
In the controller or PHP script, use the following code:
Expected Behavior
The cloneBlock method should clone the block_name block three times in the template DOCX file, resulting in multiple repetitions of the block.
Current Behavior
The cloneBlock method does not clone the block as expected. No changes are made to the template DOCX file; the output is the same as the original template.
Additional Information
Context
Please fill in your environment information: