Replace hardcoded argument parsing, help text with mappings #13

ferdnyc · 2020-06-22T17:36:07Z

Speaking of things falling out of sync (yuzutech/kroki#279)...

I noticed when running kroki help convert that the list of possible arguments to --type was incomplete. (See bolded text.) Furthermore, the equivalent valid --format arguments weren't listed at all:


$ ./kroki help convert
Convert text diagram to image

Usage:
  kroki convert file [flags]

Flags:
  -c, --config string     alternate config file [env KROKI_CONFIG]
  -f, --format string     output format (default: infer from output file extension
 otherwise svg)
  -h, --help              help for convert
  -o, --out-file string   output file (default: based on path of input file); use - to output
 to STDOUT
  -t, --type string       diagram type [actdiag, blockdiag, c4plantuml, ditaa, dot, erd,
 graphviz, nomnoml, nwdiag, plantuml, seqdiag, svgbob, umlet] (default: infer
 from file extension)

Thus began my journey of discovery.

The end result is that both lists are now displayed, generated semi-automatically (from two slices that list SupportedDiagramTypes and SupportedImageFormats), with each supported kroki.DiagramType or kroki.ImageFormat automatically mapping to its own lowercased name as both the --type / --format argument, and the recognized file extension that corresponds.

The additional "alias" mappings ("jpg": kroki.JPEG, ".dot": kroki.GraphViz, etc.) are hardcoded into the initial mappings. The init() then augments those mappings with the "standard" names taken from the slices..

Those mappings are used in place of all the previously-hardcoded switch statements in cmd/convert.go, and the Supported,,, slices are also used to embed the list of valid arguments into the help text. (The additional, redundant mapping values are not displayed, to avoid clutter, though they will still be recognized.)

So, the help output looks like this now:


./kroki help convert
Convert text diagram to image

Usage:
  kroki convert file [flags]

Flags:
  -c, --config string     alternate config file [env KROKI_CONFIG]
  -f, --format string     output format [base64 jpeg pdf png svg] (default: infer
 from output file extension otherwise svg)
  -h, --help              help for convert
  -o, --out-file string   output file (default: based on path of input file); use - to output
 to STDOUT
  -t, --type string       diagram type [actdiag blockdiag c4plantuml ditaa erd
 graphviz mermaid nomnoml nwdiag plantuml rackdiag seqdiag svgbob umlet
 vega vegalite wavedrom] (default: infer from file extension)

Functionality should be unchanged, all of the same information is there, it's just structured to (hopefully) make maintenance easier longer-term.

If support is subsequently added for any new kroki.DiagramTypes or kroki.ImageFormats, those constants just have to be added to the SupportedDiagramTypes or SupportedImageFormats slices exported from cmd/convert.go and both the matching and the help text will update automatically.

Honestly, the implementation is kind of a hack, and I'm not particularly thrilled with the way it still requires a lot of "manual" mapping construction (even if it's executed programmatically from cmd/convert.go's new init() function). It feels like there should be a better way. Ideally kroki-go would publish this sort of information in some way that it could be queried directly from there, but publishing the names as constants does not lend itself to that type of introspection at all, so I really struggled to come up with a more robust API.

I had to give myself a half-day crash-course on Go just to write this code, and I don't doubt it can be improved upon. Any suggestions for less stupid, ham-fisted ways of structuring the required data for use by the code would be very welcome.

ggrossetie · 2020-06-23T11:47:26Z

This is nice, thanks!

If support is subsequently added for any new kroki.DiagramTypes or kroki.ImageFormats, those constants just have to be added to the SupportedDiagramTypes or SupportedImageFormats slices exported from cmd/convert.go and both the matching and the help text will update automatically.

I like where it's going 👍

Honestly, the implementation is kind of a hack, and I'm not particularly thrilled with the way it still requires a lot of "manual" mapping construction (even if it's executed programmatically from cmd/convert.go's new init() function). It feels like there should be a better way. Ideally kroki-go would publish this sort of information in some way that it could be queried directly from there, but publishing the names as constants does not lend itself to that type of introspection at all, so I really struggled to come up with a more robust API.

Indeed.
At some point, it should part of kroki-go, so we don't have to update the mapping in the CLI when we add/remove/update a diagram type or an image format.

I had to give myself a half-day crash-course on Go just to write this code, and I don't doubt it can be improved upon. Any suggestions for less stupid, ham-fisted ways of structuring the required data for use by the code would be very welcome.

I think we should declare two dictionaries, one for the images formats and one for the diagrams types.

`ImageFormat`:

has a unique id
is associated to a list of names
is associated to a list of file extensions

For instance, ImageFormat.JPEG:

id: JPEG
names: ["jpg", "jpeg"]
fileExtensions: [".jpg", ".jpeg"]

`DiagramType`:

has a unique id
is associated to a list of names
is associated to a list of file extensions
is associated to a list of supported (output) image format

For instance, DiagramType.GraphViz:

id: GraphViz
names: ["dot", "graphviz"]
fileExtensions: [".dot", ".gv", ".graphviz"]
supportedImageFormats: [ImageFormat.PNG, ImageFormat.SVG, ImageFormat.JPEG, ImageFormat.PDF]

We can then create maps from the dictionaries depending on what we need. We could even create a tiny "query API" (i.e. get the DiagramType for a given file extension) on top of them.

What do you think?

ferdnyc · 2020-06-25T16:23:15Z

That makes a lot of sense to me, yeah.

DiagramType:

[...]

is associated to a list of supported (output) image format

I hadn't even realized that mapping exists, when I put this together — I thought every output format was available for every input format. But from reading the Java service code I see that indeed, not every processor supports every output format. So, that adds an additional wrinkle, and makes the list of formats I added to the help not entirely correct.

If those mappings are available in the code, though, it could support something like this instead:

$ kroki-cli convert --help
…
-f, --format string     output format (default: infer from output file extension
                        otherwise svg); available formats depend on diagram type
--list-formats          list available output formats for the selected diagram type
                        (requires input filename or --type option)
…
$ kroki-cli convert -t plantuml -f list
Valid output formats for PlantUML: [base64 jpeg png svg txt utxt]
$ kroki-cli convert -t vega --list-formats
Valid output formats for Vega: [pdf png svg]

(And, of course, it could also validate the requested export format based on the input format.)

On the dictionaries...

For instance, ImageFormat.JPEG:

id: JPEG

names: ["jpg", "jpeg"]

fileExtensions: [".jpg", ".jpeg"]

Part of me thinks that there should be some concept of a "primary" vs. "alternate" or "alias" name(s) for things, to avoid redundancy. If you list the output formats supported, it doesn't make sense to display both jpg and jpeg — though, of course, both should be valid input.

And then file extensions, same thing? If you run kroki convert with -f jpeg or -f jpg, either:

The file name corresponds to your -f argument (either something.jpeg or something.jpg). Which feels a bit odd, though I suppose it does offer the user maximum control/flexibility.
The file is always named something.jpg (or always named something.jpeg) regardless. That seems more predictable, but then how does the code know which extension to prefer? "The first one on the list" feels a bit fragile.

For instance, DiagramType.GraphViz:

id: GraphViz

names: ["dot", "graphviz"]

fileExtensions: [".dot", ".gv", ".graphviz"]

supportedImageFormats: [ImageFormat.PNG, ImageFormat.SVG, ImageFormat.JPEG, ImageFormat.PDF]

It gets even trickier with diagram types, because looking at kroki/io/server/Server.java:

Graphviz is a diagramService that's registered as the diagramHandler for type names "graphviz" and "dot", which are interchangeable.
Vega is a diagramService that gets registered as two separate diagramHandler instances. One handles "vega", then a second, separate one with different options will handle "vegalite".
Blockdiag is a diagramService that's registered as a single diagramHandler for all of the following names: "blockdiag", "seqdiag", "actdiag", "nwdiag", "packetdiag", "rackdiag", but it processes each one differently.

So what's the right line to draw?

I firmly believe the user shouldn't have to care about the diagramServices and whether the same class processes Vega and VegaLite. "vega" and "vegalite" should be separate diagramTypes.
But do we treat "graphviz" and "dot" as separate diagram types? Seems silly when you can easily do this:
```
$ ./kroki convert -t graphviz hello.dot
$ ./kroki convert -t dot hello.dot
```
and you'll get the exact same hello.svg output regardless.
Blockdiag, OTOH, is completely different. All of its "names" are just aliases to the same diagramService instance, but it does matter which you use because it will process the input differently. Meaning:
```
$ ./kroki convert -t rackdiag rack.diag
$ ./kroki convert -t blockdiag rack.diag
fail to generate the image {status: 500, body: Error 500: Error: Invalid specification
…
```

So it seems clear that all of Blockdiag's names have to be separate diagramTypes. Which I guess means Graphviz's should be as well. Just so it's not some weird outlier, it may be simpler if we treat "graphviz" and "dot" as two distinct types even though they're not.

ggrossetie · 2020-06-27T13:58:53Z

Part of me thinks that there should be some concept of a "primary" vs. "alternate" or "alias" name(s) for things, to avoid redundancy. If you list the output formats supported, it doesn't make sense to display both jpg and jpeg — though, of course, both should be valid input.

Yes I think we should have a primary name "jpeg" and then alias/alternatives.

And then file extensions, same thing? If you run kroki convert with -f jpeg or -f jpg, either:
The file name corresponds to your -f argument (either something.jpeg or something.jpg). Which feels a bit odd, though I suppose it does offer the user maximum control/flexibility.
The file is always named something.jpg (or always named something.jpeg) regardless. That seems more predictable, but then how does the code know which extension to prefer? "The first one on the list" feels a bit fragile.

Regarding file extensions, we should let the users use any valid file extension associated with an ImageFormat.
For reference, the JPEG ImageFormat has actually 5 valid file extensions according to MDN: .jpg, .jpeg, .jfif, .pjpeg, .pjp.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

In my opinion, 5 file extensions is excessive and I believe that .jfif, .pjpef and .pjp are very little used. So let's keep .jpeg and .jpg.

It gets even trickier with diagram types, because looking at kroki/io/server/Server.java:
Graphviz is a diagramService that's registered as the diagramHandler for type names "graphviz" and "dot", which are interchangeable.
Vega is a diagramService that gets registered as two separate diagramHandler instances. One handles "vega", then a second, separate one with different options will handle "vegalite".
Blockdiag is a diagramService that's registered as a single diagramHandler for all of the following names: "blockdiag", "seqdiag", "actdiag", "nwdiag", "packetdiag", "rackdiag", but it processes each one differently.
So what's the right line to draw?

vega and vegalite are distinct diagram libraries and the same is true for blockdiag, seqdiag, actdiag, nwdiag, packetdiag and rackdiag.
Conversely, graphviz and dot are alias. GraphViz uses a language called DOT and the dot binary can be used to convert GraphViz diagram. That's the reason why we allow to use both names.
https://graphviz.org/documentation/

I firmly believe the user shouldn't have to care about the diagramServices and whether the same class processes Vega and VegaLite. "vega" and "vegalite" should be separate diagramTypes.

Indeed 👍

But do we treat "graphviz" and "dot" as separate diagram types? Seems silly when you can easily do this:

They are not separated diagram types, just a diagram type with a primary name and an alternative name.

and you'll get the exact same hello.svg output regardless.

Yes that's intended.

So it seems clear that all of Blockdiag's names have to be separate diagramTypes. Which I guess means Graphviz's should be as well. Just so it's not some weird outlier, it may be simpler if we treat "graphviz" and "dot" as two distinct types even though they're not.

I'm not sure I understand why it's bothering you to have more than on name to reference a diagram type? I might be wrong but some users might be more familiar with "dot" than "graphviz" and vice-versa even if they are using the exact same syntax for their diagrams.

ggrossetie · 2020-07-05T09:29:29Z

@ferdnyc If you want I can take a stab at it? I might have better suggestions/ideas if I dive into the code 😄

ggrossetie · 2021-01-06T10:10:12Z

@ferdnyc I rebased on master and made a few changes, let me know what you think.

I'm quite happy with the result. It brings a few enhancements and removes some code duplication so I would like to move forward with this. In my opinion, we should address the other concerns in subsequent pull requests. We can create an issue to track progress.

cmd/convert.go

ggrossetie force-pushed the cmd-arg-maps branch from d968d16 to 58ca9fe Compare January 6, 2021 09:57

ggrossetie reviewed Jan 6, 2021

View reviewed changes

cmd/convert.go Outdated Show resolved Hide resolved

ggrossetie force-pushed the cmd-arg-maps branch from 58ca9fe to 1338204 Compare January 6, 2021 10:12

ferdnyc and others added 4 commits January 7, 2021 17:42

cmd/convert: Create mappings for modes/types

ad1ec35

cmd/root: Use Supported mappings to display help

04a6800

Replace vars by functions

d42c365

Upgrade to kroki-go 0.6.2

68b4fc7

ggrossetie force-pushed the cmd-arg-maps branch from 1338204 to 68b4fc7 Compare January 7, 2021 16:54

ggrossetie merged commit 5105cb7 into yuzutech:master Jan 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace hardcoded argument parsing, help text with mappings #13

Replace hardcoded argument parsing, help text with mappings #13

ferdnyc commented Jun 22, 2020 •

edited

Loading

ggrossetie commented Jun 23, 2020

ferdnyc commented Jun 25, 2020

`DiagramType`:

ggrossetie commented Jun 27, 2020

ggrossetie commented Jul 5, 2020

ggrossetie commented Jan 6, 2021

Replace hardcoded argument parsing, help text with mappings #13

Replace hardcoded argument parsing, help text with mappings #13

Conversation

ferdnyc commented Jun 22, 2020 • edited Loading

ggrossetie commented Jun 23, 2020

ImageFormat:

DiagramType:

ferdnyc commented Jun 25, 2020

DiagramType:

ggrossetie commented Jun 27, 2020

ggrossetie commented Jul 5, 2020

ggrossetie commented Jan 6, 2021

ferdnyc commented Jun 22, 2020 •

edited

Loading

`ImageFormat`:

`DiagramType`:

`DiagramType`: