Tabs not working in titles (Edit: actually not really working at all anywhere) #862

Open
judgej opened this issue Aug 15, 2016 · 13 comments


judgej commented Aug 15, 2016

Setting a tab in a paragraph works fine:

$section->addText("First\tSecond\tThird", [], 'paragraphStyleWithTabs');

The First/Second/Third words get correctly positioned at the tab stops defined in the paragraph style.

When creating a title, this does not seem to work:

    $phpWord->addTitleStyle(
        3,
        [],
        ['tabs' => [new \PhpOffice\PhpWord\Style\Tab('left', 1600), ...]]
    );

    $section->addTitle("First\tSecond\tThird", 3);

The resulting DOCX document puts a space into the position where the tab should be.

However, the tab stops are correctly added to the ruler in the title, so manually replacing the space with a tab inside MS Word takes the title text to the correct tab positions. The tabs are getting in there, are supported by MS Word, but the title text is not getting its "\t" tab characters handled as text tabs in the generated document.




judgej commented Aug 15, 2016

The same happens for LibreOffice, except none of the tabs work (titles or paragraphs). The RTF output passes the raw tab character through. I'm wondering if the one case where it works for DOCX is a pure fluke, and I've just got the whole syntax wrong?


judgej commented Aug 15, 2016

Looking into the DOCX source, it looks like tab characters are carried through to that too, so they are in there.


judgej commented Aug 15, 2016

Looking at the source of the DOCX document after saving it with a manual tab added, MS Word seems to add a tab as an XML element rather than a tab character. So this text run will generate a tab between "One" and "Two":

$textrun = $section->createTextRun('entryFirst');
$textrun->addText('One');
$textrun->addText("<w:r><w:tab/></w:r>");
$textrun->addText('Two');
$textrun->addTextBreak();

I'm not sure how to get a text run into a heading though, without generating an invalid DOCX file.


judgej commented Aug 15, 2016

Spent hours trying to understand the code, but have not yet been able to wrap my head around it. What I suspect needs to happen to fix this is that text and title elements (and presumably text runs) that contain \t (tab) characters need to be split into multiple elements: text elements and tab elements. So "One\tTwo" becomes three elements in a paragraph or title: "<w:t>One</w:t>", "<w:t><w:tab/></w:t>" and "<w:t>Two</w:t>".
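To make the splitting idea concrete, here is a minimal sketch in plain PHP. The `splitOnTabs()` helper is a hypothetical name made up for illustration and is not part of PHPWord; it only shows how a string could be broken into an ordered list of text and tab segments before a writer turns them into format-specific markup.

```php
<?php
/**
 * Split a string into an ordered list of text and tab segments.
 * Hypothetical helper for illustration only; not a PHPWord function.
 *
 * @return array<int, array<string, string>>
 */
function splitOnTabs(string $text): array
{
    $segments = [];
    foreach (explode("\t", $text) as $index => $part) {
        if ($index > 0) {
            // Every boundary between two parts was a tab in the input.
            $segments[] = ['type' => 'tab'];
        }
        if ($part !== '') {
            // Skip empty parts so leading or doubled tabs add no empty text.
            $segments[] = ['type' => 'text', 'value' => $part];
        }
    }
    return $segments;
}

// "One\tTwo" yields a text segment, a tab segment, then a text segment.
$segments = splitOnTabs("One\tTwo");
```

Each writer (DOCX, ODT, RTF) could then decide how to render a 'tab' segment, which keeps the markup out of the document-building code.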


judgej commented Aug 16, 2016

Last one for tonight. This awful, nasty, dirty hack fixes the XML for tab characters for DOCX ONLY:

    protected function tabXml($string)
    {
        return str_replace("\t", '</w:t></w:r><w:r><w:tab/></w:r><w:r><w:t>', $string);
    }

    ...
    $section->addTitle($this->tabXml(htmlspecialchars("One\tTwo")), $depth);

It works, for all the wrong reasons (one of which is that addTitle() does not escape its own XML special characters, as any good function should when it is a gatekeeper between two different formats: text on one side and XML on the other). It assumes a certain XML element structure, which may change, and it's messing around with an XML stream rather than setting up structured elements to open and close their own tags (ensuring the element structure is always valid). It's all wrong and nasty, but demonstrates how tabs should work, can work, and currently do not work.

This clip shows a Header3 (the "Class 1..." line) with a tab inserted before "Minor Puppy" using the above method, and the text tabbing out to the tab stop set up in the ruler of the header 3 paragraph style.
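For contrast with the string-replacement hack, this is a rough sketch of the structured approach: letting an XML writer open and close its own tags. It uses PHP's built-in `XMLWriter` directly, and `docxRunsWithTabs()` is a hypothetical name for illustration. It is not how PHPWord's Word2007 writer is actually organised, just the shape of output it would need to produce.

```php
<?php
/**
 * Render text as WordprocessingML runs, with a <w:tab/> run per tab character.
 * Hypothetical helper built on PHP's XMLWriter; not PHPWord's own writer code.
 */
function docxRunsWithTabs(string $text): string
{
    $xml = new XMLWriter();
    $xml->openMemory();

    foreach (explode("\t", $text) as $index => $part) {
        if ($index > 0) {
            // A tab is its own element inside a run, not a character in <w:t>.
            $xml->startElement('w:r');
            $xml->writeElement('w:tab');
            $xml->endElement(); // w:r
        }
        if ($part === '') {
            continue; // nothing to write between adjacent tabs
        }
        $xml->startElement('w:r');
        $xml->startElement('w:t');
        $xml->writeAttribute('xml:space', 'preserve');
        $xml->text($part); // the writer escapes &, < and > itself
        $xml->endElement(); // w:t
        $xml->endElement(); // w:r
    }

    return $xml->outputMemory();
}

echo docxRunsWithTabs("One\tTwo"), PHP_EOL;
```

Because the writer does the escaping, the caller passes plain text and never needs `htmlspecialchars()` or hand-written closing tags.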



judgej commented Aug 16, 2016

I'm assuming ODT and RTF both have their own way of encoding tabs, which is why using \t (as the samples suggest) does not work for them either.


judgej commented Aug 16, 2016

ODT is a simpler "hack" as the text-run tags don't need to be closed and reopened:

    protected function tabOdtXml($string)
    {
        return str_replace("\t", '<text:tab/>', $string);
    }

But again, this is a hack, because the XML elements are the domain of the document output and not the construction of the document source.
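As a rough illustration of what the output side could do instead, the sketch below writes the text and `<text:tab/>` elements through PHP's built-in `XMLWriter`. The `odtSpanWithTabs()` name and the choice of a `text:span` wrapper are assumptions for the example, not PHPWord's ODT writer.

```php
<?php
/**
 * Render text inside a <text:span>, emitting <text:tab/> for each tab.
 * Hypothetical helper for illustration; the span wrapper is an assumption.
 */
function odtSpanWithTabs(string $text): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->startElement('text:span');

    foreach (explode("\t", $text) as $index => $part) {
        if ($index > 0) {
            // ODT allows the tab element inline, so no run needs closing.
            $xml->writeElement('text:tab');
        }
        if ($part !== '') {
            $xml->text($part); // escaped by the writer
        }
    }

    $xml->endElement(); // text:span
    return $xml->outputMemory();
}

echo odtSpanWithTabs("One\tTwo"), PHP_EOL;
```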

@judgej judgej changed the title Tabs not working in titles Tabs not working in titles (Edit: actually not really working at all) Aug 16, 2016
@judgej judgej changed the title Tabs not working in titles (Edit: actually not really working at all) Tabs not working in titles (Edit: actually not really working at all anywhere) Aug 16, 2016

judgej commented Aug 16, 2016

RTF uses \tab to indicate a tab is present. However, the RTF output of this library does not yet seem to support tab stops, so you are limited to the default tab stops that your RTF reader provides.
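A minimal sketch of the RTF side, assuming the text is inserted into the RTF stream without any further escaping by the library (I have not verified that). `rtfTextWithTabs()` is a hypothetical name; it escapes the RTF control characters first and then converts tabs, and it does not attempt to handle non-ASCII characters.

```php
<?php
/**
 * Escape plain text for RTF and convert tab characters to the \tab control word.
 * Hypothetical helper for illustration; non-ASCII handling is out of scope.
 */
function rtfTextWithTabs(string $text): string
{
    // Backslash and braces are control characters in RTF and must be escaped.
    $escaped = str_replace(['\\', '{', '}'], ['\\\\', '\\{', '\\}'], $text);

    // The trailing space terminates the control word and is not rendered.
    return str_replace("\t", '\\tab ', $escaped);
}

echo rtfTextWithTabs("One\tTwo"), PHP_EOL;
```

Escaping has to happen before the tab replacement, otherwise the backslash in `\tab` would itself be escaped.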

@markmorris

Is this not fixed yet, or is there a workaround? I still cannot use tabs in a title. Thanks


markmorris commented Jan 31, 2019

Last one for tonight. This awful, nasty, dirty hack fixes the XML for tab characters for DOCX ONLY:

    protected function tabXml($string)
    {
        return str_replace("\t", '</w:t></w:r><w:r><w:tab/></w:r><w:r><w:t>', $string);
    }

    ...
    $section->addTitle($this->tabXml(htmlspecialchars("One\tTwo")), $depth);

Where are you putting this code to fix the issue? No matter what I do, it does nothing.


judgej commented Feb 1, 2019

This hack is in a site that has been running for three years, so I expect the core library has changed since then (or maybe it hasn't).

I use this method in my controller (note I generate docx, odt and rtf):

    protected function tabXml($string, $extension = '.html')
    {
        $string = htmlspecialchars($string);

        if ($extension == '.rtf') {
            return str_replace("\t", '\tab ', $string);
        } elseif ($extension == '.odt') {
            return str_replace("\t", '<text:tab/>', $string);
        } elseif ($extension == '.docx') {
            return str_replace("\t", '</w:t></w:r><w:r><w:tab/></w:r><w:r><w:t>', $string);
        } else {
            return $string;
        }
    }

Then I use it to wrap any text that may contain a tab character (titles, paragraphs, etc.):

$extension = '.docx'; // must include the leading dot to match the checks in tabXml()
...
$section->addTitle(
    $this->tabXml("\tMy Great Title", $extension),
    4
);

Like I say, this should not work, as the addTitle() methods and friends should be treating the input as text and not a kind of text/XML hybrid, but it's a "temporary" workaround, and I dare not upgrade this package as a result. It's the very definition of technical debt, unfortunately.

Hope that helps.


markmorris commented May 2, 2019

Sorry to post in here again, but does anyone have a workaround for this in 0.16? I can't get the hack to work on this version, as there are quite a lot of changes.

UPDATE:
So I've been looking over the code and saw that addTitle actually supports a TextRun, so I decided to try that. At first it seemed impossible to use this way, because you can't add text to a TextRun without it being output; I ended up with two versions of the same text. HOWEVER, the tabs worked...

UPDATE 2:
So I got it to semi-work; I didn't realise you can create a TextRun without outputting it. See the example:

                $titleText = new TextRun();
                $titleText->addText("Hello \t");
                $titleText->addText('Some other text');
                $this->sceneSection->addTitle($titleText, 1);

The above works fine, and the tabs work fine. But now the TOC is completely broken and the DOCX reports an error when I open it, although everything is there other than the TOC. Even taking the tab out still breaks the TOC, so maybe this is a whole different issue?


markmorris commented May 2, 2019

I did a thing and managed to get it working in 0.16. Hacky, but it works.

In \vendor\phpoffice\phpword\src\PhpWord\Writer\Word2007\Element\Title.php I changed this:

        // Actual text
        $text = $element->getText();
        if (is_string($text)) {
            $xmlWriter->startElement('w:r');
            $xmlWriter->startElement('w:t');
            $this->writeText($text);
            $xmlWriter->endElement(); // w:t
            $xmlWriter->endElement(); // w:r
        } elseif ($text instanceof \PhpOffice\PhpWord\Element\AbstractContainer) {
            $containerWriter = new Container($xmlWriter, $text);
            $containerWriter->write();
        }

to this

        // Actual text
        $text = $element->getText();
        if (is_string($text)) {
            // Write each tab-separated segment as its own run,
            // with a separate <w:tab/> run between segments (not after the last one)
            foreach (explode("\t", $text) as $index => $segment) {
                if ($index > 0) {
                    $xmlWriter->startElement('w:r');
                    $xmlWriter->writeElement('w:tab', null);
                    $xmlWriter->endElement(); // w:r
                }
                $xmlWriter->startElement('w:r');
                $xmlWriter->startElement('w:t');
                // xml:space is an attribute of w:t, not a child element
                $xmlWriter->writeAttribute('xml:space', 'preserve');
                $this->writeText($segment);
                $xmlWriter->endElement(); // w:t
                $xmlWriter->endElement(); // w:r
            }
        } elseif ($text instanceof \PhpOffice\PhpWord\Element\AbstractContainer) {
            $containerWriter = new Container($xmlWriter, $text);
            $containerWriter->write();
        }

Just to point out, I copied the exact markup from an actual Word document.

@judgej I managed to figure out how to use TextRuns in a title.

$titleText = new TextRun();
$titleText->addText("Hello \t");
$titleText->addText('Some other text');
$sceneSection->addTitle($titleText, 1);

Tabs work this way, but unfortunately it breaks the TOC. I've added an issue for it.
#1625
