How to read docx file into the database(Mysql) #1192

Open
yoyomule opened this issue Nov 12, 2017 · 8 comments
Comments

@yoyomule

I have a Word document that contains content controls, and I want to write their values into a MySQL database according to each control's name or type. I don't know how to achieve this. I have been looking at PHPWord but don't know whether it can do it. I hope someone can suggest a method, thank you. P.S. I have attached an example Word document with content controls.
phpwordtomysql.docx

@FBnil

FBnil commented Nov 13, 2017

If I understand correctly, you want to fetch the "PlaceholderText" (which gives "单击或点击此处输入文字。", i.e. "Click or tap here to enter text."), and for the date picker you want the value of the date (2017-11-11T00:00:00Z), the data type (dateTime), and maybe the name of the table field (time)?

<w:sdtPr>
  <w:alias w:val="time"/>
  <w:date w:fullDate="2017-11-11T00:00:00Z">
    <w:dateFormat w:val="yyyy/M/d"/>
    <w:lid w:val="zh-CN"/>
    <w:storeMappedDataAs w:val="dateTime"/>
    <w:calendar w:val="gregorian"/>
  </w:date>
</w:sdtPr>

This works with my fork of TemplateProcessor. It searches for some text, such as "CASE 2" (but consider using bookmarks or something else that can be found more reliably). Once it finds the w:p paragraph around "CASE 2", it looks for the next w:sdtPr to the right of that paragraph and returns that XML segment, which we then convert to a structure so we can access its values. This part needs work: XML tags can shift and will not always be at $vals[2], so a lookup keyed by tag name, like the $hash variable, should be built for that.

For this to work, you need to replace the original template processor file with my TemplateProcessor.php.

$file_dir = storage_path('phpwordtomysql.docx');
$template = new \PhpOffice\PhpWord\TemplateProcessor($file_dir);
$xmldata = $template->processSegment('CASE 2', 'w:p', \PhpOffice\PhpWord\TemplateProcessor::SEARCH_AROUND, 1, 'MainPart', function(&$xmlSegment, &$segmentStart, &$segmentEnd, &$part){
	$segmentStart = strpos($part, '<w:sdtPr', $segmentEnd + 1);
	if ($segmentStart === false) dd("FATAL: nothing found!"); # strpos() returns false when there is no match
	$segmentEnd = strpos($part, '</w:sdtPr>', $segmentStart + 8);
	$xmlSegment = substr($part, $segmentStart, ($segmentEnd - $segmentStart));
	return false; # only getSegment
});

$p = xml_parser_create();
xml_parse_into_struct($p, $xmldata, $vals, $index);
xml_parser_free($p);
# Square brackets: curly-brace array access was removed in PHP 8
$date = $vals[2]['attributes']['W:FULLDATE']; #  "date" = "2017-11-11T00:00:00Z"
$type = $vals[5]['attributes']['W:VAL']; #  "type" = "dateTime"

$hash = [];
array_walk($vals, function(&$item) use(&$hash){
	if(array_key_exists('attributes',$item) && array_key_exists('tag',$item)){
		$hash[ $item['tag'] ] = array_values($item['attributes'])[0] ;
	}
});

dd(compact('vals','index','date', 'type', 'hash'));

*Note: dd() is Laravel's dump-and-die helper. If it is not available to you, replace it with var_dump() or similar.

This will return:

array:5 [▼
  "vals" => array:8 [▶]
  "index" => array:7 [▶]
  "date" => "2017-11-11T00:00:00Z"
  "type" => "dateTime"
  "hash" => array:6 [▼
    "W:ALIAS" => "time"
    "W:DATE" => "2017-11-11T00:00:00Z"
    "W:DATEFORMAT" => "yyyy/M/d"
    "W:LID" => "zh-CN"
    "W:STOREMAPPEDDATAAS" => "dateTime"
    "W:CALENDAR" => "gregorian"
  ]
]
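
To close the loop on the original question, here is a minimal sketch of how the $hash above might be written to MySQL with PDO. The table name docx_controls, its columns (name, type, value), and both helper functions are hypothetical and only illustrate the idea; adjust them to your own schema.

```php
<?php
// Hypothetical helper: turn the $hash from the example above into a table row.
function controlHashToRow(array $hash): array
{
    $value = $hash['W:DATE'] ?? null;
    if ($value !== null && ($hash['W:STOREMAPPEDDATAAS'] ?? '') === 'dateTime') {
        // "2017-11-11T00:00:00Z" -> "2017-11-11 00:00:00" (MySQL DATETIME literal)
        $value = (new DateTimeImmutable($value))->format('Y-m-d H:i:s');
    }

    return [
        'name'  => $hash['W:ALIAS'] ?? null,
        'type'  => $hash['W:STOREMAPPEDDATAAS'] ?? null,
        'value' => $value,
    ];
}

// Hypothetical helper: insert one row using a prepared statement.
// Assumes a table docx_controls(name, type, value) already exists.
function saveControl(PDO $pdo, array $row): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO docx_controls (name, type, value) VALUES (:name, :type, :value)'
    );
    $stmt->execute([
        ':name'  => $row['name'],
        ':type'  => $row['type'],
        ':value' => $row['value'],
    ]);
}

// Usage (the DSN and credentials are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'secret');
// saveControl($pdo, controlHashToRow($hash));
```

The prepared statement keeps the document's values out of the SQL text, which matters because the values come from a user-supplied file.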

@zw12579

zw12579 commented Nov 16, 2017

Why does processSegment return false and raise a warning on line 602 of your TemplateProcessor.php? Could you help me solve the problem? @FBnil

@FBnil

FBnil commented Nov 16, 2017

@zw12579 I recently added $direction as a parameter to processSegment(), so you need to pass it too (it is the third parameter). TemplateProcessor::SEARCH_AROUND has a value of 0. I updated my example above.

    /**
     * process a segment.
     *
     * @param string  $needle  If this is a macro, you need to add the ${} around it yourself.
     * @param string  $xmltag  an xml tag without brackets, for example:  w:p
     * @param integer $direction  in which direction should be searched. -1 left, 1 right. Default 0: around
     * @param integer $clones  How many times the segment needs to be cloned
     * @param string  $docPart 'MainPart' (default) 'Footers:1' (first footer) or 'Headers:1' (first header)
     * @param mixed   $replace true (default/cloneSegment) false(getSegment) string(replaceSegment) function(callback)
     * @param boolean $incrementVariables true by default (variables get appended #1, #2 inside the cloned blocks)
     * @param boolean $throwException false by default (it then returns false or null on errors).
     *
     * @return mixed The segment(getSegment), false (no $needle), null (no tags), true (clone/replace)
     */

@zw12579

zw12579 commented Nov 16, 2017

@FBnil Thank you very much, I'll try to fix it!

@FBnil

FBnil commented Nov 16, 2017

@zw12579 I have updated my source yet again, so you will need to download TemplateProcessor again (this fixes a bug introduced when I added the search direction).

@zw12579

zw12579 commented Nov 17, 2017

@FBnil What should I do if I want to get a picture from a docx file? Right now the picture in the XML file is only a reference (a link), not the image itself. This is my example XML. Thanks!

<w:sdtContent>
    <w:tc>
        <w:tcPr>
            <w:tcW w:w="8296" w:type="dxa"/>
        </w:tcPr>
        <w:p w:rsidR="00432BA4" w:rsidRPr="00432BA4" w:rsidRDefault="00432BA4">
            <w:r>
                <w:rPr>
                    <w:noProof/>
                </w:rPr>
                <w:drawing>
                    <wp:inline distT="0" distB="0" distL="0" distR="0">
                        <wp:extent cx="1524000" cy="857250"/>
                        <wp:effectExtent l="0" t="0" r="0" b="0"/>
                        <wp:docPr id="1" name="图片 1"/>
                        <wp:cNvGraphicFramePr>
                            <a:graphicFrameLocks
                                    xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
                                    noChangeAspect="1"/>
                        </wp:cNvGraphicFramePr>
                        <a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
                            <a:graphicData
                                    uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
                                <pic:pic
                                        xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
                                    <pic:nvPicPr>
                                        <pic:cNvPr id="0" name="Picture 1"/>
                                        <pic:cNvPicPr>
                                            <a:picLocks noChangeAspect="1" noChangeArrowheads="1"/>
                                        </pic:cNvPicPr>
                                    </pic:nvPicPr>
                                    <pic:blipFill>
                                        <a:blip r:embed="rId5" cstate="print">
                                            <a:extLst>
                                                <a:ext uri="{28A0092B-C50C-407E-A947-70E740481C1C}">
                                                    <a14:useLocalDpi
                                                            xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main"
                                                            val="0"/>
                                                </a:ext>
                                            </a:extLst>
                                        </a:blip>
                                        <a:stretch>
                                            <a:fillRect/>
                                        </a:stretch>
                                    </pic:blipFill>
                                    <pic:spPr bwMode="auto">
                                        <a:xfrm>
                                            <a:off x="0" y="0"/>
                                            <a:ext cx="1524000" cy="857250"/>
                                        </a:xfrm>
                                        <a:prstGeom prst="rect">
                                            <a:avLst/>
                                        </a:prstGeom>
                                        <a:noFill/>
                                        <a:ln>
                                            <a:noFill/>
                                        </a:ln>
                                    </pic:spPr>
                                </pic:pic>
                            </a:graphicData>
                        </a:graphic>
                    </wp:inline>
                </w:drawing>
            </w:r>
        </w:p>
    </w:tc>
</w:sdtContent>

@FBnil

FBnil commented Nov 17, 2017

You will have to expose the zip class and use its getFromName method. Modify TemplateProcessor so that it also has this function:

    /**
     * Retrieve a file inside the document (possibly binary data)
     *
     * @param string $localname
     *
     * @return string content of the file
     */
    public function getFromName($localname)
    {
        return $this->zipClass->getFromName($localname);
    }

Now all you need to figure out is which $localname it has, for example "word/media/image1.jpeg" or "word/media/image2.png".

In the relations (document.xml.rels) file:

<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.jpeg"/>
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image2.png"/>

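So you need to use getFromName to get the content of document.xml.rels (stored in the archive as "word/_rels/document.xml.rels"), then use xml_parser_create() or similar to parse the XML into a struct. From there you can look up the filename for a given Id: the r:embed="rId5" attribute on a:blip in your XML is the Id to look for, and the Target is relative to the "word/" folder.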
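
The lookup described above could be sketched as follows. This is only an illustration: imageTargetsById() and embeddedImageIds() are hypothetical helper names, DOMDocument is used here instead of xml_parser_create(), and it assumes the image targets are relative to "word/" as in the excerpt above.

```php
<?php
// Hypothetical helper: map relationship Ids to paths inside the archive,
// e.g. "rId2" => "word/media/image1.jpeg".
function imageTargetsById(string $relsXml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($relsXml);

    $targets = [];
    foreach ($doc->getElementsByTagName('Relationship') as $rel) {
        // Keep only image relationships; their Type URI ends in "/image".
        if (substr($rel->getAttribute('Type'), -6) !== '/image') {
            continue;
        }
        // Assumption: Target is relative to the "word/" folder.
        $targets[$rel->getAttribute('Id')] = 'word/' . $rel->getAttribute('Target');
    }

    return $targets;
}

// Hypothetical helper: collect every r:embed Id found in a document.xml fragment.
function embeddedImageIds(string $xml): array
{
    preg_match_all('/\br:embed="([^"]+)"/', $xml, $matches);
    return $matches[1];
}

// Usage with the getFromName() method added to TemplateProcessor above:
// $rels  = $template->getFromName('word/_rels/document.xml.rels');
// $paths = imageTargetsById($rels);
// foreach (embeddedImageIds($xmlSegment) as $id) {
//     if (isset($paths[$id])) {
//         $binary = $template->getFromName($paths[$id]); // raw image bytes
//     }
// }
```

The returned bytes can then be written to disk or stored in a BLOB column, depending on how you want to keep the picture.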

@zw12579

zw12579 commented Nov 17, 2017

You helped me a lot, once again thank you!
