String template literals: Additional Features #2603

Open
jordwalke opened this issue Jul 10, 2020 · 15 comments

Comments

@jordwalke
Member

jordwalke commented Jul 10, 2020

PR 2599 implements string template literals in a way that is a non-breaking change with Reason Syntax 3.6.

It is good to go as is, but there are a couple of features that should be considered before cutting the release, as well as some features that should be added after cutting the release.

Pre-release

Auto-indenting [DONE]

String template literals will be interpreted according to the indentation of raw white space immediately before the closing backtick:

module MyModule = {
  let x = `
  this is some text
  here that spans multiple lines.
  `
};

However, refmt will print them with an additional level of indentation between the backticks, so that the output matches how other constructs wrap and pretty-print:

module MyModule = {
  let x = `
    this is some text
    here that spans multiple lines.
  `
};

The goal is that pretty-printed inline string templates eventually look like every other indented construct: not only are the strings indented like everything else, but so are the closing parens:

Printf.fprintf(fmt, `
  text here
  and here
`);
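
As a rough illustration of the interpretation rule described above, here is a minimal OCaml sketch that models it on ordinary strings. It is not refmt's implementation. The names `leading_spaces`, `strip_indent` and `dedent` are hypothetical, and the sketch assumes that the newline right after the opening backtick and the whitespace-only line before the closing backtick are both dropped, which this issue does not spell out. It covers only the interpretation step and says nothing about how the printer re-indents.

```ocaml
(* Hypothetical model of the auto-indent rule; not refmt's actual code.
   [raw] stands for the text between the two backticks. *)

(* Number of leading space characters in [s]. *)
let leading_spaces s =
  let n = String.length s in
  let rec go i = if i < n && s.[i] = ' ' then go (i + 1) else i in
  go 0

(* Drop at most [k] leading spaces from [s]. *)
let strip_indent k s =
  let drop = min k (leading_spaces s) in
  String.sub s drop (String.length s - drop)

let dedent raw =
  match List.rev (String.split_on_char '\n' raw) with
  | [] | [ _ ] -> raw (* single-line literal: nothing to strip *)
  | last :: rev_body ->
    if String.trim last <> "" then raw (* closing line carries text: leave as is *)
    else begin
      (* The whitespace before the closing backtick sets the indentation. *)
      let indent = leading_spaces last in
      let body = List.rev rev_body in
      (* Assumption: the newline right after the opening backtick is dropped. *)
      let body = match body with "" :: rest -> rest | _ -> body in
      String.concat "\n" (List.map (strip_indent indent) body)
    end

(* The first MyModule example above, written as an ordinary string. *)
let example =
  dedent "\n  this is some text\n  here that spans multiple lines.\n  "
(* example = "this is some text\nhere that spans multiple lines." *)
```

Lines that are indented further than the closing backtick keep their extra indentation in this model, since only the closing line's width is stripped.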

Prose Templates [UNDECIDED]

In most cases, when you are creating a multiline string, it is handed off to some other formatter that implements text wrapping. For JS this is the DOM, and for command line output it's something like the Format module. A "prose" mode allows you to have newlines in the strings, without actually inserting newlines in the text.

let x = `
  this is some text that you can see
  here that spans multiple lines.
  But it's actually one line.

  But then if you want an actual newline, you
  enter an additional newline as above.
`

  • A newline produces a single whitespace character instead of a newline.
  • This allows refmt to wrap the text tokens according to the editor width (or if you move these literals to greater indented contexts).
  • This should probably be the default since most tools that print strings do their own formatting that needs to be composable.
  • It would require a non-prose mode to allow multiline literals with explicit newlines.

let nonProseMode = ``
  These will be actual newlines
  not virtual ones.
``
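
To make the proposed prose rule concrete, here is a small OCaml sketch that folds already-dedented text the way the bullets above describe: a single newline becomes one space, and a blank line becomes a real newline. The function name `fold_prose` is hypothetical, and treating each extra blank line as one more newline is an assumption, since that case is not discussed here.

```ocaml
(* Hypothetical model of "prose" mode on an ordinary string. *)
let fold_prose text =
  let lines = String.split_on_char '\n' text in
  (* Collect consecutive non-blank lines into paragraphs;
     a blank line closes the current paragraph. *)
  let paragraphs, current =
    List.fold_left
      (fun (paras, cur) line ->
        if String.trim line = "" then (List.rev cur :: paras, [])
        else (paras, line :: cur))
      ([], []) lines
  in
  let paragraphs = List.rev (List.rev current :: paragraphs) in
  (* Lines inside a paragraph are joined with a space,
     paragraphs are joined with a real newline. *)
  paragraphs
  |> List.map (String.concat " ")
  |> String.concat "\n"

let folded = fold_prose "one\ntwo\n\nthree\nfour"
(* folded = "one two\nthree four" *)
```

Under this model the non-prose form would simply skip the folding step and keep every newline as written.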

Type Safe Interpolation [UNDECIDED]

Before cutting a release, we just need to implement parsing of the syntax for this feature, even if the feature itself is not implemented yet, so that people upgrading to the release after next will see identical semantics. The idea here is that interpolation can select between different types.

let x = `
  This is a ${stringExpression()} and this is an %{integerExpression}
  and this is #{somethingCustom}
`;

Challenges:

  • String templates require different conventions for type safe interpolation depending on the use case.
  • Printf requires something different and has more interpolation features than simple string concatenation but both should be supported.

Post-release

JSX Type Safe Interpolation

JSX can use the same exact string prose convention, and doc comments can use the same convention as well.

let x = <div>
  regular text here
  ${myString} %{someInt}
  #{customElement}
  regular text
</div>

  • JSX would need a non-prose opt-in just like string template literals.
  • This would probably be implemented as a lossless transform at the parsing stage, allowing other ppxs and dedicated backend-specific transforms to clarify the meaning of the interpolation and text content.

If string templates are "Prose Templates", and JSX abides by the same exact convention, then things in the editor can format more beautifully, and there's only one convention to learn everywhere.

jordwalke changed the title from "String template literals:" to "String template literals: Additional Features" on Jul 10, 2020

@Lupus

Lupus commented Jul 10, 2020

Re "Type Safe Interpolation", probably worth looking at https://github.com/janestreet/ppx_custom_printf.

"The time is %{Time} and the timezone is %{Time.Zone}."

This will be processed using the functions Time.to_string and Time.Zone.to_string, which is pretty neat. It also supports sexps, which are not really relevant for Reason, but it is still a syntax for custom processing of the payload: %{sexp:<type>}.

@jordwalke
Member Author

yeah, I just am not sure how to provide both the custom hook and the expression in the interpolation in a way that is intuitive.

let x = `
   %{Formatter, expression}
`;

For example

@Lupus

Lupus commented Jul 10, 2020

We could use type annotations, defaulting to string if absent. That's what another ppx from Jane Street, ppx_sexp_message, does.

(* Assumes Core's Unix module and ppx_sexp_message. *)
let rename ~src ~dst =
  try Unix.rename ~src ~dst
  with Unix.Unix_error (error, _, _) ->
    raise_s
      [%message
        "Error while renaming file"
          ~source:(src : string)
          ~dest:  (dst : string)
                  (error : Unix.Error.t)
      ]

They parse type annotations and plug in appropriate sexp encoders.

let x = `
   %{expression : A.B.C.typ}
`;

could assume that expression is of type A.B.C.typ and expect A.B.C.typ_to_string to be present (or just A.B.C.to_string if type is A.B.C.t).
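
To show what that naming convention could look like once expanded, here is a hand-written OCaml sketch of the code such a template might desugar to. Nothing here is generated by an existing ppx. The modules `Point` and `Temperature`, their functions, and `describe` are hypothetical names chosen only to cover both cases (a type named `t` and a type with another name).

```ocaml
(* Hypothetical modules following the convention sketched above. *)
module Point = struct
  type t = { x : int; y : int }

  (* Type is named [t], so the expected printer is [to_string]. *)
  let to_string { x; y } = Printf.sprintf "(%d, %d)" x y
end

module Temperature = struct
  type celsius = float

  (* Type is named [celsius], so the expected printer is [celsius_to_string]. *)
  let celsius_to_string c = Printf.sprintf "%.1fC" c
end

(* Roughly what a template such as
     `at %{p : Point.t} it is %{c : Temperature.celsius}`
   might expand to under that convention (an assumption, not a spec). *)
let describe (p : Point.t) (c : Temperature.celsius) =
  "at "
  ^ Point.to_string (p : Point.t)
  ^ " it is "
  ^ Temperature.celsius_to_string (c : Temperature.celsius)

let sample = describe { Point.x = 1; y = 2 } 21.5
(* sample = "at (1, 2) it is 21.5C" *)
```

The type annotation is kept in the expansion so that a mismatch between the expression and the annotated type is still reported by the type checker.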

@jfrolich

jfrolich commented Jul 10, 2020

> yeah, I just am not sure how to provide both the custom hook and the expression in the interpolation in a way that is intuitive.
>
> let x = `
>    %{Formatter, expression}
> `;
>
> For example

In that case

let x = `   
  %{Formatter.toString(expression)}
`;

Is not a big difference. Why have that extra syntax?

@Lupus

Lupus commented Jul 10, 2020

Here comes the holy war over to_string vs toString. OCaml folks prefer the former, while JS folks prefer the latter. BuckleScript builds a vertically integrated stack (BS syntax, Belt, the BS compiler), which gives it an opportunity to bake assumptions about the stdlib into the syntax. Reason is only an alternate syntax for OCaml, not a vertically integrated stack, so it does not actually make sense to bake any naming conventions into the Reason syntax itself. Personally, I would be happy if Reason did the same thing, assumed Base as the stdlib, and built tight integration at the syntax level, but that's just my opinion, nothing more :)

@jordwalke
Member Author

I think that even if toString is the convention for Reason outside of BS, it would still be worth using underscores for this one thing because if no custom formatter is supplied, you may want to automatically use string_of_int as a fallback etc.

@jordwalke
Member Author

> Is not a big difference. Why have that extra syntax?

Formatters may not just be a simple function application, but may need to accept something like a Format.t as the first argument, etc.

@jfrolich

Ah ok. Forgive me for the camelCase, that is just what I am used to typing.

@EduardoRFS
Contributor

EduardoRFS commented Jul 14, 2020

On the type safe interpolation, I would like to leave the extension one out, or leave it as a simple ppx, to be consistent with the language in general. My gut feeling is that overall we already have too many language constructs.

Most of the usage is formatting, and you would need a ppx that receives both a type and a value, which is something new in the language. I'm concerned that it will introduce another mode that users need to parse in their heads, making the language more complex and the feature hard to discover.

My take is that any syntax we can come up with will make it marginally better, while adding complexity on both our side and the user's side.

reasoning:

let input = `a %{magic} b`;
let output = "a " ++ [@reason.string][%magic] ++ " b";

// then you can do that in user space

let input = `a %{Int.t random_int} b`;
let output = "a " ++ [@reason.string][%Int.t random_int] ++ " b";

// also

let input = `a %{int random_int} b`;
let output = "a " ++ [@reason.string][%int random_int] ++ " b";

// which isn't a huge win against
let input = `a %{Int.t random_int} b`;
let input = `a ${Int.to_string(random_int)} b`;

// so tuples, that's the real problem
// compare it with a special syntax

let input = `a ${(int_left, int_right) |> [%string: (int, int)]} b`
let input = `a %{(int, int): (int_left, int_right)} b`;

// both of them feel terrible, but one is familiar, while the other is new

@jordwalke
Member Author

I think if we just make the type safe interpolation a different syntax for printf format strings, then it needn't invent any new concepts. The only problem I have with that is that the Format libraries bring in a bunch of dependencies when compiling to JavaScript or other targets, which is why I was open to alternatives. I wonder if a lot of the complexity and bulk in the format implementations is due to not having a more invasive DSL such as the one proposed above.

@Lupus

Lupus commented Jul 14, 2020

Reason can't really emit code that uses even the OCaml stdlib, because when it is used with Base, the symbols Reason would refer to might be marked as deprecated wherever Base provides an alternative way to reference them, resulting in endless deprecation warnings that break the build.

@jordwalke
Member Author

I can't imagine that causing a problem when converting:

something(`
  Hi #s{something()} there
`);

into:

something(`
  Hi %s there
`, something());

That works because the printf type trick occurs more deeply in the type system, even if you keep the types as string. It could be an issue if you perform the coercion explicitly (expr : Format.t(_)), but even so I think this is the one type that, if aliased, would need to be aliased transparently, because the Format type is special-cased in the compiler.
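
For reference, the target of that conversion is ordinary printf code in today's OCaml. The sketch below uses only the standard library's `Printf.sprintf` and `format` type. The `something` function is a stand-in for the call in the example above, and `fmt`, `greeting` and `greeting'` are hypothetical names.

```ocaml
(* Stand-in for the `something()` call in the example above. *)
let something () = "world"

(* The literal after the proposed rewrite: the interpolation has become a
   %s hole and its expression a trailing argument. *)
let greeting = Printf.sprintf "Hi %s there" (something ())

(* The same literal typed explicitly as a format: the compiler checks the
   literal against the expected format type instead of treating it as a
   plain string. *)
let fmt : (string -> string, unit, string) format = "Hi %s there"

let greeting' = Printf.sprintf fmt (something ())
(* greeting = greeting' = "Hi world there" *)
```

The only stdlib symbols involved are `Printf.sprintf` and the built-in format type, which is the narrow surface the comment above is pointing at.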

@Lupus

Lupus commented Jul 14, 2020

Re: prose templates - looks good as the default. My long log lines would finally be wrapped by refmt, yet there would be no newlines in the resulting logs. What about triple backticks? They seem to be the de facto standard for verbatim multi-line code blocks in many markup languages. Lol, I just tried to write an example in triple backticks right in this comment and failed, because GitHub's markup interprets triple backticks as the end of a verbatim code block... It would probably end up being a nightmare for code sharing.

@jordwalke
Member Author

> My long log lines would finally be wrapped by refmt, yet there will be no newlines in the resulting logs

Usually you would use a formatter to format logs (to wrap to terminal width, indent, etc.). If you include newlines inside text blocks that go through formatters, it can even end up ruining the indentation.

Also, in prose mode you can force there to be newlines by including two newlines.

@jfrolich

Relevant stage 1 JavaScript proposal:

https://github.com/tc39/proposal-string-dedent
