Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement new feature to build manifest from any composer.lock #16

Closed
wants to merge 1 commit into from
Closed

Implement new feature to build manifest from any composer.lock #16

wants to merge 1 commit into from

Conversation

llaville
Copy link

@llaville llaville commented Feb 8, 2022

Implement feature describe in #15

Copy link
Member

@theseer theseer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally like the idea to use the composer.lock (and potentially composer.json) as an alternative source of information.

That being said, I'm not yet sure I understand the reasoning behind your overall approach. You created a class ManifestJsonDocument that to me is a mapper rather than a document, given it only returns ManifestDocument instances and does a lot of mapping of data into XML.

After that, you use our mapper to get that XML mapped to the Manifest object we actually want.

Why the intermediate format? That's redundant mapping involving the XML representation only to have it mapped to the actual Manifest Object in the next step. Why not directly map the json to our Manifest ?

If everything went well, our Serializer should be very much capable to map that into XML.

I'm also not convinced the ::fromComposer method belongs into the ManifestLoader as the idea here is logically coupled with loading our manifest from an xml format.

So maybe the composer handling deserves its own Loader? Maybe we want to call it ComposerConverter or ComposerImporter? That class could load what is needed (see other review comments) and pass that on to the mapper as JsonDocument(s) and get a Manifest-Object back. If you want to store that as XML then, that's up to the user.

Does that make any sense to you?

Comment on lines +37 to +39
$this->authors = new AuthorCollection();
$this->authors->add(new Author('John Doe', new Email('[email protected]')));
$this->license = 'MIT';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do understand the composer.lock does not contain this but usually the composer.json does.

So maybe we should change the implementation to use both files?
That would probably mean we should require a path as input to the ManifestLoader::fromComposer and not the actual filename.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above comment uses the "old" class names. The name will be different if the introduction comment would be addressed.

{
$this->id = (string) ($decodedData['content-hash'] ?? $defaultId);
$this->name = (string) ($decodedData['name'] ?? 'vendor/package');
$this->version = (string) ($decodedData['version'] ?? '0.0.0-dev');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly, this value will only be set when it (originally?) was set in the composer.json. But it is recommended to omit it as the version should be determined from the VCS.

Are we sure this is the best we can do here? I certainly don't know ;)

Comment on lines +51 to +52
$typeAttr = $doc->createAttribute('type');
$typeAttr->value = $composerDoc->getType();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually map 1:1? I wasn't aware we were that close ;)

@llaville
Copy link
Author

llaville commented Feb 9, 2022

First, I would like to thank you to take time for your review and comments !

I generally like the idea to use the composer.lock (and potentially composer.json) as an alternative source of information.

With latest LTS Composer version 2.2 announcement, I'd like to use the Composer Runtime API with Installed versions component, and of course the composer.json file because copyright info (license and authors) of root package is only available there.

That being said, I'm not yet sure I understand the reasoning behind your overall approach.

As I said in feature report, "to avoid rewrites too much code, I've reused lot of existing architecture as much as possible".

Preambule: Keep the same way, whatever you handle, an XML string, XML file, contained into a PHAR, or from another source.

You created a class ManifestJsonDocument that to me is a mapper rather than a document,
given it only returns ManifestDocument instances and does a lot of mapping of data into XML.

After that, you use our mapper to get that XML mapped to the Manifest object we actually want.

Why the intermediate format?
That's redundant mapping involving the XML representation only to have it mapped to the actual Manifest Object in the next step.
Why not directly map the json to our Manifest ?

As ManifestDocument is a specialized XML version of document (see DOMDocument) and cannot be re-used with other specialized JSON version,
I must had to write the ManifestJsonDocument that translate json source to an xml source, easy handle by ManifestDocument itself.

If everything went well, our Serializer should be very much capable to map that into XML.

I'm also not convinced the ::fromComposer method belongs into the ManifestLoader
as the idea here is logically coupled with loading our manifest from an xml format.

See my goal in preambule.

So maybe the composer handling deserves its own Loader?
Maybe we want to call it ComposerConverter or ComposerImporter?
That class could load what is needed (see other review comments) and pass that on to the mapper as JsonDocument(s) and get a Manifest-Object back. If you want to store that as XML then, that's up to the user.

In this first version I didn't want to write a lot of code, or change current API, that works fine for all users.

Except perharps for those, like me, that don't want to spent time to write an XML document (with possible XML syntax error, as it was the case with churn-php project on recent release 1.6.0, follow by a fix in 1.6.1

To conclude: I'm not convinced that this version is the best one, but it's a compromise !

In a better world, I'd like an API version that will be able to :

  • read a manifest from XML document (string or file)
  • read a manifest from a PHAR
    • from a root manifest.xml file (to keep BC)
    • from the metadata for an easy integration with BOX project
  • write a manifest
    • from Composer installation
    • to XML document from a callable function
    • to raw text format (from a template), that will allow usage like phpunit --manifest or phpcompatinfo-6.0.phar --manifest

Examples:

PHPUnit manifest
phpunit/phpunit: 9.5.10
doctrine/instantiator: 1.4.0
myclabs/deep-copy: 1.10.2
nikic/php-parser: v4.13.0
phar-io/manifest: 2.0.3
phar-io/version: 3.1.0
phpdocumentor/reflection-common: 2.2.0
phpdocumentor/reflection-docblock: 5.2.2
phpdocumentor/type-resolver: 1.5.0
phpspec/prophecy: 1.14.0
phpunit/php-code-coverage: 9.2.7
phpunit/php-file-iterator: 3.0.5
phpunit/php-invoker: 3.1.1
phpunit/php-text-template: 2.0.4
phpunit/php-timer: 5.0.3
sebastian/cli-parser: 1.0.1
sebastian/code-unit: 1.0.8
sebastian/code-unit-reverse-lookup: 2.0.3
sebastian/comparator: 4.0.6
sebastian/complexity: 2.0.2
sebastian/diff: 4.0.4
sebastian/environment: 5.1.3
sebastian/exporter: 4.0.3
sebastian/global-state: 5.0.3
sebastian/lines-of-code: 1.0.3
sebastian/object-enumerator: 4.0.4
sebastian/object-reflector: 2.0.4
sebastian/recursion-context: 4.0.4
sebastian/resource-operations: 3.0.3
sebastian/type: 2.3.4
sebastian/version: 3.0.2
symfony/polyfill-ctype: v1.23.0
theseer/tokenizer: 1.2.1
webmozart/assert: 1.10.0
PHP CompatInfo manifest
bartlett/php-compatinfo dev-master requires php ^7.4|^8.0 : 7.4.26
bartlett/php-compatinfo dev-master requires ext-json * : 7.4.26
bartlett/php-compatinfo dev-master requires ext-libxml * : 7.4.26
bartlett/php-compatinfo dev-master requires ext-pcre * : 7.4.26
bartlett/php-compatinfo dev-master requires ext-spl * : 7.4.26
bartlett/php-compatinfo dev-master requires bartlett/php-compatinfo-db ^3.6 : 3.15.0
bartlett/php-compatinfo dev-master requires composer/package-versions-deprecated ^1.8 : 1.11.99.4
bartlett/php-compatinfo dev-master requires doctrine/collections ^1.4 : 1.6.8
bartlett/php-compatinfo dev-master requires nikic/php-parser ^4.10 : v4.13.2
bartlett/php-compatinfo dev-master requires psr/log ^1.0 : 1.1.4
bartlett/php-compatinfo dev-master requires ramsey/uuid ^3.9|^4.0 : 4.2.3
bartlett/php-compatinfo dev-master requires symfony/config ^4.4|^5.0 : v5.4.0
bartlett/php-compatinfo dev-master requires symfony/console ^4.4|^5.0 : v5.4.1
bartlett/php-compatinfo dev-master requires symfony/event-dispatcher ^4.4|^5.0 : v5.4.0
bartlett/php-compatinfo dev-master requires symfony/finder ^4.4|^5.0 : v5.4.0
bartlett/php-compatinfo dev-master requires symfony/dependency-injection ^4.4|^5.0 : v5.4.1
bartlett/php-compatinfo dev-master requires symfony/serializer ^4.4|^5.0 : v5.4.0
bartlett/php-compatinfo dev-master requires symfony/stopwatch ^4.4|^5.0 : v5.4.0
bartlett/php-compatinfo dev-master requires (for development) composer/composer ^2.0 : 2.1.14
bartlett/php-compatinfo dev-master requires (for development) symfony/phpunit-bridge ^5.1 : v5.4.0

NOTE: When you said "But it is recommended to omit it as the version should be determined from the VCS.",
you probably think to a solution like https://github.com/sebastianbergmann/phpunit/blob/master/build/scripts/phar-manifest.php#L3-L15

But it suffer that PharIo\Version\Version accept only string that is Semantic Versionning compatible, and does not accept version under developement

For example with this project and this PR, if we pick version from VCS, we may have something like

  • from master branch and commit that have a tag
phar-io/manifest: 2.0.3
[email protected] in /shared/backups/github/manifest $ git log -n 1
commit 97803eca37d319dfa7826cc2437fc020857acb53 (HEAD -> master, tag: 2.0.3, origin/master, origin/HEAD)
Author: Arne Blankerts <[email protected]>
Date:   Tue Jul 20 13:28:34 2021 +0200

    update tools
  • but from another branch, or a commit that does not yet have tag
phar-io/manifest: from_composer@045047b37d1dd99bf7e97487e2dc10820439f29b

Waiting to hear you again, and clarify some points if necessary !

@llaville
Copy link
Author

llaville commented Feb 9, 2022

@theseer Working from a new draft (major API rewrites) ... Will propose a new PR tomorrow morning

@theseer
Copy link
Member

theseer commented Feb 9, 2022

Looking forward to it. I won't have much time until the weekend to look at things or even reply to your above comments though...

@llaville
Copy link
Author

Looking forward to it. I won't have much time until the weekend to look at things or even reply to your above comments though...

Understood. And I won't have time to finish it this morning, It will let me more hours to polish the new PR ;-)

@theseer
Copy link
Member

theseer commented Feb 10, 2022

First, I would like to thank you to take time for your review and comments !

Thank YOU for contributing. That's the least a project owner can do...

I generally like the idea to use the composer.lock (and potentially composer.json) as an alternative source of information.

With latest LTS Composer version 2.2 announcement, I'd like to use the Composer Runtime API with Installed versions component, and of course the composer.json file because copyright info (license and authors) of root package is only available there.

Interesting option, but nothing I'd like to see in this repository. This repository is intended to be used as a dependency when ever someone wants to read a manifest.xml file - usually from within a phar archive. Strictly speaking, even the XML-Serializer should be moved out - but that's just one class and I considered that overkill.

But we won't grow this to a gigantic mess of dependency hell and bloat that can do everything and nothing.

That being said: We maybe should create additional repositories where people wanting to build based on composer data - may it be via API or by parsing the files - can simply use it.

That being said, I'm not yet sure I understand the reasoning behind your overall approach.

As I said in feature report, "to avoid rewrites too much code, I've reused lot of existing architecture as much as possible".

Again, I fail to see where the need of a rewrite comes from?

Why the intermediate format?
That's redundant mapping involving the XML representation only to have it mapped to the actual Manifest Object in the next step.
Why not directly map the json to our Manifest ?

As ManifestDocument is a specialized XML version of document (see DOMDocument) and cannot be re-used with other specialized JSON version, I must had to write the ManifestJsonDocument that translate json source to an xml source, easy handle by ManifestDocument itself.

No you don't. That's just superfluous. Having a ComposerMapper, you can directly create the Manifest object graph. LIke the existing mapper is doing as well - just based on the XML structure.

If everything went well, our Serializer should be very much capable to map that into XML.

I'm also not convinced the ::fromComposer method belongs into the ManifestLoader
as the idea here is logically coupled with loading our manifest from an xml format.

See my goal in preambule.

I still disagree with the approach ;)

So maybe the composer handling deserves its own Loader?
Maybe we want to call it ComposerConverter or ComposerImporter?
That class could load what is needed (see other review comments) and pass that on to the mapper as JsonDocument(s) and get a Manifest-Object back. If you want to store that as XML then, that's up to the user.

In this first version I didn't want to write a lot of code, or change current API, that works fine for all users.

I feel we're still not talking about the same thing here.

To conclude: I'm not convinced that this version is the best one, but it's a compromise !

In a better world, I'd like an API version that will be able to :

* [x]  read a manifest from XML document (string or file)

* [ ]  read a manifest from a PHAR

There is nothing to be done here. The phar stream wrapper does everything.

  * [x]  from a root `manifest.xml` file (to keep BC)

I fail to see what this is but given it's checked it's probably not too much of a problem ;)

  * [ ]  from the [metadata](https://www.php.net/manual/en/phar.getmetadata) for an easy integration with [BOX](https://github.com/box-project/box) project
* [ ]  write a manifest
  
  * [ ]  from Composer installation
  * [ ]  to XML document from a callable function
  * [ ]  to raw text format (from a template), that will allow usage like `phpunit --manifest` or `phpcompatinfo-6.0.phar --manifest`

PHPUnit's manifest is a very bad example because it's not based on phar-io/manifest yet.
If there is a use case for a text based output, of course this could be done but again this is out of scope for this project. We can add a new repository that supports that of course.

But it suffer that PharIo\Version\Version accept only string that is Semantic Versionning compatible, and does not accept version under developement
[...]

phar-io/manifest: from_composer@045047b37d1dd99bf7e97487e2dc10820439f29b

Interesting problem. I have to admit I didn't think much about yet.
Maybe we need to extend phar-io/version to support these as well in one way or another.

@llaville
Copy link
Author

In summary, if I'll well understood :

  • all suggestions are out of scope of this project, because you 'll support only manifest in XML format.
  • as a project owner you're the only one to decide the right way of your project (and I agree with that)
  • be aware that if you want this project to be adopted by much more people, you'll need a feature to write quickly a manifest without to worry about XML syntax, that may be source of parsing error.

That will conclude here my contribution, for this project. Thanks again to take time to reply to all my posts !

@llaville llaville closed this Feb 11, 2022
@llaville llaville deleted the from_composer branch February 11, 2022 05:30
@theseer
Copy link
Member

theseer commented Feb 11, 2022

In summary, if I'll well understood :

* all suggestions are out of scope of this project, because you 'll support only manifest in XML format.

Not exactly what I meant even though I see one could have interpreted it like that. It depends on how you define "project" here.
I'm a fan of small repositories and libraries that do not ship bloat - apart from what the code is doing itself.

The focused on usecase for this phar-io/manifest is to provide an object Manifest that provides a useful API to get meta information about the content of a phar archive. The fact that data is stored in XML Format is an unimportant detail nobody really has to deal with - except of course requiring the necessary extensions to be available.

Everything above that is out of scope and - as mentioned - that would include the XMLSerializer as that's the opposite direction.

That - and I tried to point that out various times in my reponses - does not mean, I disagree with your idea of having something to create the same object structure from a different input format. It would merely require a new project (repository), for instance phar-io/composer-manifest-mapper or something.

* be aware that if you want this project to be adopted by much more people, you'll need a feature to write quickly a manifest without to worry about XML syntax, that may be source of parsing error.

I think this is the core of your/our misunderstanding:

  • The "official" manifest file will likely stay XML. But it could be an encrypted binary file for all a user should care.
  • You never have to manually create the XML file. That's the job of the XMLSerializer.
  • You can create the Manifest object graph by simply creating the objects as needed using plain PHP
  • I agree(d) with your assessment that having a means to generate that object graph from composer files or using its API is a good thing. That doesn't mean it has to be within this very project but just - as stated above - needs its own place

That will conclude here my contribution, for this project. Thanks again to take time to reply to all my posts !

Thanks for your time and efforts. I hope I managed to explain myself better and maybe you feel like contributing the idea as the aforementioned new project. If you're interested, I could create phar-io/composer-manifest-mapper and invite you.

@llaville
Copy link
Author

The focused on usecase for this phar-io/manifest is to provide an object Manifest that provides a useful API to get meta information about the content of a phar archive. The fact that data is stored in XML Format is an unimportant detail nobody really has to deal with - except of course requiring the necessary extensions to be available.

In such condition, an important example script is missing in your project, because I don't think all others projects that used your package generate a manifest like the following code

for example :
<?php

use PharIo\Manifest\ManifestSerializer;

require __DIR__ . '/../vendor/autoload.php';

$bundledComponentCollection = new \PharIo\Manifest\BundledComponentCollection();
$bundledComponentCollection->add(
    new \PharIo\Manifest\BundledComponent(
        'vendor/packageA',
        new \PharIo\Version\Version('0.0.0-dev')
    )
);

$manifest = new PharIo\Manifest\Manifest(
    new \PharIo\Manifest\ApplicationName('vendor/package'),
    new \PharIo\Version\Version('1.0.0'),
    new \PharIo\Manifest\Library(),
    new \PharIo\Manifest\CopyrightInformation(
        new \PharIo\Manifest\AuthorCollection(),
        new \PharIo\Manifest\License(
            'BSD-3-Clause',
            new \PharIo\Manifest\Url('https://spdx.org/licenses/BSD-3-Clause.html')
        )
    ),
    new \PharIo\Manifest\RequirementCollection(),
    $bundledComponentCollection
);

echo (new ManifestSerializer)->serializeToString($manifest);
/*
 * Output produced
 *
<?xml version="1.0" encoding="UTF-8"?>
<phar xmlns="https://phar.io/xml/manifest/1.0">
    <contains name="vendor/package" version="1.0.0" type="library"/>
    <copyright>
        <license type="BSD-3-Clause" url="https://spdx.org/licenses/BSD-3-Clause.html"/>
    </copyright>
    <requires>
        <php version="*"/>
    </requires>
    <bundles>
        <component name="vendor/packageA" version="0.0.0-dev"/>
    </bundles>
</phar>
 */

That - and I tried to point that out various times in my reponses - does not mean, I disagree with your idea of having something to create the same object structure from a different input format. It would merely require a new project (repository), for instance phar-io/composer-manifest-mapper or something.

I agree(d) with your assessment that having a means to generate that object graph from composer files or using its API is a good thing.

👍

That doesn't mean it has to be within this very project but just - as stated above - needs its own place

Why not, but at least a code base (to avoid to rewrite the same logic on each new mapper repository) should be available and independant from target generation. That is not currently the case with all *Element (that depends of DOMElement)

Thanks for your time and efforts. I hope I managed to explain myself better

Yes you did !

and maybe you feel like contributing the idea as the aforementioned new project.

Only if you are agreed to rework code base and make it independant from any mapper (see my comment above)

If you're interested, I could create phar-io/composer-manifest-mapper and invite you.

Sorry, but I'm not interrested with current code base.

@theseer
Copy link
Member

theseer commented Feb 11, 2022

The focused on usecase for this phar-io/manifest is to provide an object Manifest that provides a useful API to get meta information about the content of a phar archive. The fact that data is stored in XML Format is an unimportant detail nobody really has to deal with - except of course requiring the necessary extensions to be available.

In such condition, an important example script is missing in your project, because I don't think all others projects that used your package generate a manifest like the following code
for example :

<?php

use PharIo\Manifest\ManifestSerializer;

require __DIR__ . '/../vendor/autoload.php';

$bundledComponentCollection = new \PharIo\Manifest\BundledComponentCollection();
$bundledComponentCollection->add(
    new \PharIo\Manifest\BundledComponent(
        'vendor/packageA',
        new \PharIo\Version\Version('0.0.0-dev')
    )
);

$manifest = new PharIo\Manifest\Manifest(
    new \PharIo\Manifest\ApplicationName('vendor/package'),
    new \PharIo\Version\Version('1.0.0'),
    new \PharIo\Manifest\Library(),
    new \PharIo\Manifest\CopyrightInformation(
        new \PharIo\Manifest\AuthorCollection(),
        new \PharIo\Manifest\License(
            'BSD-3-Clause',
            new \PharIo\Manifest\Url('https://spdx.org/licenses/BSD-3-Clause.html')
        )
    ),
    new \PharIo\Manifest\RequirementCollection(),
    $bundledComponentCollection
);

echo (new ManifestSerializer)->serializeToString($manifest);
/*
 * Output produced
 *
<?xml version="1.0" encoding="UTF-8"?>
<phar xmlns="https://phar.io/xml/manifest/1.0">
    <contains name="vendor/package" version="1.0.0" type="library"/>
    <copyright>
        <license type="BSD-3-Clause" url="https://spdx.org/licenses/BSD-3-Clause.html"/>
    </copyright>
    <requires>
        <php version="*"/>
    </requires>
    <bundles>
        <component name="vendor/packageA" version="0.0.0-dev"/>
    </bundles>
</phar>
 */

You do have a Point here. The documentation as well as examples are spares. Probably too sparse to be useful beyond the basic reading part.

Your example is of course correct and that would be the intended way of use when (manually) generating a manifest.

That doesn't mean it has to be within this very project but just - as stated above - needs its own place

Why not, but at least a code base (to avoid to rewrite the same logic on each new mapper repository) should be available and independant from target generation. That is not currently the case with all *Element (that depends of DOMElement)

You lost me here again, I'm afraid. Or I still fail to make myself clear. Let me try again.

All the *Element classes are in the xml folder for hopefully obvious reasons ;) and are only (!) used when operating on the XML representation. The ManifestDocument makes use of these and is in turn used by the DocumentMapper to, well, map it into the actual Manifest Objects. The only job of the ManifestDocument is to provide an abstracted API to operate on the DOM of the XML which is our official serialization format. Nobody else has to ever deal with the XML itself, the ManifestDocument or the accompanying Element classes. It's just our way of processing the XML when loading it as a Manifest. As that XML is intended to be the official serialization format, the code to read from XML is part of this repository.

We could certainly argue that this repository - as small as it may be - could be split into 3:

  • The actual Manifest classes (basically, everything in /values)
  • Code to read the XML and map it to the above classes (basically, everything in '/xml' plus the DocumentMapper)
  • Code to Serialize the Manifest object graph to XML

With such a setup, you could combine as needed:

In most cases, people would need the XML reading package - which would require the manifest classes - to be able to deal with the manifests in PHARs.

If a project's task contains the need of creating a PHAR and wanting to add a Manifest, the Manifest Classes plus the XML Serializer are required. Now if the creating should be implemented based on composer provided input and we want to share such an implementation, we'd need yet another package that contains this fragment using the Manifest Classs pretty much like in your example and the XML Serializer to get it into our official Format.

Now arguably all these by now 4 projects are closely related but I seriously dislike having dependencies on things I don't need or code coming along I don't want. So if we'd combine all these into one, we suddenly have the composer api package (or the symfony serializer) as a dependency just to be able to read the manifest.xml from a phar without ever wanting to write anything or create our own manifest.

That's bad design (imho) and what I call bloat.

Thanks for your time and efforts. I hope I managed to explain myself better

Yes you did !

and maybe you feel like contributing the idea as the aforementioned new project.

Only if you are agreed to rework code base and make it independant from any mapper (see my comment above)

Not sure what you mean by "any mapper" but the Manifest Structure is independent.

@llaville
Copy link
Author

Forget all I said, its no more usefull.
My original goal is a quick way to add a manifest from any source that may be installed with Composer.
I didn't realized that I can do it in the box project itself. I'm working on a solution that I will publish soon.
And cherry on the cake, all format will be available, inclusive XML ;)

@llaville
Copy link
Author

@theseer Thanks to had listen to me, and your patience.
I'll keep you aware on my works to integrate phar-io/manifest (without changes) and others alternative into https://github.com/box-project/box when it will be available (normally in next 2 days ... I've already a prototype).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants