Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(std/encoding/csv): Add stringify functionality #8408

Merged
merged 15 commits into from
Nov 25, 2020
Merged

feat(std/encoding/csv): Add stringify functionality #8408

merged 15 commits into from
Nov 25, 2020

Conversation

jsejcksn
Copy link
Contributor

@jsejcksn jsejcksn commented Nov 16, 2020

Resolves #8342

Hi; this is my first new module contribution. While I think I checked all the items on the contributing requirements, please let me know if I missed something when you review.

I'd like specific scrutiny towards a few lines that have preceding code comments, which I'll link to in a follow-up comment.

I have not yet updated the encoding readme because I'm waiting for any changes that might need to be made before doing so.

I have created a separate module for stringifying because none of the module code depends on Deno APIs (to allow for browser compatibility), whereas in the existing csv.ts (parser) module, that's not the case.

I have re-exported the exports from the stringify module from the main csv (parser) module to simplify import usage. I think this is the right pattern. Let me know if there's a different established convention.

@jsejcksn
Copy link
Contributor Author

The parts I'd specifically like help/feedback on:

// Is regex.test more performant here? If so, how to dynamically create?
// https://stackoverflow.com/questions/3561493/
if (str.includes(sep) || str.includes(NEWLINE) || str.includes(QUOTE)) {
return `${QUOTE}${str.replaceAll(QUOTE, `${QUOTE}${QUOTE}`)}${QUOTE}`;
}

// "unknown" is more type-safe, but inconvenient for user. How to resolve?
// deno-lint-ignore no-explicit-any
fn?: (value: any) => string | Promise<string>;

} // I think this assertion is safe. Confirm?
else value = (value as ObjectWithStringPropertyKeys)[prop];

@ry
Copy link
Member

ry commented Nov 18, 2020

@jsejcksn I think the patch looks good - thank you! Can't say I have any feedback on your questions - but I'm happy to land this and allow it to be iterated on in master. Please update the readme tho.

@jsejcksn
Copy link
Contributor Author

jsejcksn commented Nov 19, 2020

@ry Readme updated. PTAL at the format check failure. I ran format locally, and it formatted the README document, but the auto-formatting broke the nested code block formatting AND failed the CI check, so I re-formatted again manually. I'm not sure what the issue is.

Also: The generated doc is missing links to some type aliases that aren't currently exported (e.g. PropertyAccessor is just used as an internal alias for string | number to improve readability and reduce typing). Should I also export every type alias used in an export of the module?

Additionally, the JSDoc outputs of some type aliases seem to have some strange behavior, but I think it's a bug in the documentation generator (e.g. the first @param includes the @ but the following ones don't.)

@jsejcksn
Copy link
Contributor Author

@uki00a Feel free to review if you're interested.

@bartlomieju
Copy link
Member

@jsejcksn great PR! Please run format and linting scripts so it can be landed

@bartlomieju bartlomieju added this to the 1.6.0 milestone Nov 24, 2020
@jsejcksn
Copy link
Contributor Author

jsejcksn commented Nov 25, 2020

Please run format and linting scripts so it can be landed

@bartlomieju Please see the comment above where I responded to Ryan.

I ran format locally, and it formatted the README document, but the auto-formatting broke the nested code block formatting AND failed the CI check

@bartlomieju
Copy link
Member

Please run format and linting scripts so it can be landed

@bartlomieju Please see the comment above where I responded to Ryan.

I ran format locally, and it formatted the README document, but the auto-formatting broke the nested code block formatting AND failed the CI check

I've formatted the file @jsejcksn, let's see if that helps. Also make sure to have git submodules up to date (git submodule update --init --recursive)

@bartlomieju
Copy link
Member

This is strange, I reformatted the file on macOS but it still fails CI :/

@jsejcksn
Copy link
Contributor Author

This is strange, I reformatted the file on macOS but it still fails CI :/

Exactly. And the CI doesn't explain why it fails.

@bartlomieju
Copy link
Member

CC @dsherret is it possible that dprint formats differently on linux compared to macos?

@dsherret
Copy link
Member

@bartlomieju that's weird. When I check out 51879ae it formats fine for me on all three operating systems (assuming 05e0760 is the proper result). The plugins are web assembly code so they shouldn't know what operating system they're running on.

You may want to try upgrading the markdown plugin to 0.4.2 (json could be upgraded to 0.7.2) as it uses a newer version of the markdown parser. Also, you may want to try with dprint 0.10.0 as that uses a newer version of the web assembly runtime it depends on.

@bartlomieju
Copy link
Member

@bartlomieju that's weird. When I check out 51879ae it formats fine for me on all three operating systems (assuming 05e0760 is the proper result). The plugins are web assembly code so they shouldn't know what operating system they're running on.

You may want to try upgrading the markdown plugin to 0.4.2 (json could be upgraded to 0.7.2) as it uses a newer version of the markdown parser. Also, you may want to try with dprint 0.10.0 as that uses a newer version of the web assembly runtime it depends on.

@dsherret thanks, I will do dprint upgrade in a separate PR.

Copy link
Member

@ry ry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thank you @jsejcksn !

@ry ry merged commit ed11eb6 into denoland:master Nov 25, 2020
@jsejcksn
Copy link
Contributor Author

assuming 05e0760 is the proper result

@dsherret @bartlomieju It is not the proper result.

It breaks the code blocks (includes literal triple ticks in the code bocks), and also improperly nests some code blocks in unrelated list items.

Visually compare the intended version (703aed6) to the auto-formatted version (05e0760).

@dsherret
Copy link
Member

dsherret commented Nov 25, 2020

@jsejcksn oh, I see. I was only checking to see if the code output was the same on all three operating systems.

Regarding that issue, if I'm reading this properly I believe it's because the code was indented (which makes it a code block) along with using backticks. It's a behaviour in the markdown parser... I guess I could add some extra stuff to the formatter to make it round down to not be an indented code block, but a fenced one. You can fix it by putting the code block at the same indentation as the paragraph above so it becomes a fenced code block and not an indented one.

@jsejcksn
Copy link
Contributor Author

You can fix it by putting the code block at the same indentation as the paragraph above so it becomes a fenced code block and not an indented one

@dsherret I'm not 100% clear on the markdown spec in regard to how to keep content separated by multiple newlines at the same indentation level as a preceding list item. I've encountered parsers which treat source differently in this regard and all claim to be compatible with GFM, so I've just always indented the content by 2 more spaces than the - of the list item and it seems to work in most places. That's what I did here as well.

How does the parser handle the case described above? Example:

source:

## Section

  - List item 1

    Paragraph content at same indentation as list item 1

    ```
    code block at same indentation as list item 1
    ```

    - Sublist item A

      Paragraph content at same indentation as sublist item A

      ```
      code block at same indentation as sublist item A
      ```

    More paragraph content at same indentation as list item 1

    ```
    code block at same indentation as list item 1
    ```

  - List item 2

    Paragraph content at same indentation as list item 2

    ```
    code block at same indentation as list item 2
    ```

output:

Section

  • List item 1

    Paragraph content at same indentation as list item 1

    code block at same indentation as list item 1
    
    • Sublist item A

      Paragraph content at same indentation as sublist item A

      code block at same indentation as sublist item A
      

    More paragraph content at same indentation as list item 1

    code block at same indentation as list item 1
    
  • List item 2

    Paragraph content at same indentation as list item 2

    code block at same indentation as list item 2
    

@dsherret
Copy link
Member

dsherret commented Nov 25, 2020

@dsherret I'm not 100% clear on the markdown spec in regard to how to keep content separated by multiple newlines at the same indentation level as a preceding list item. I've encountered parsers which treat source differently in this regard and all claim to be compatible with GFM, so I've just always indented the content by 2 more spaces than the - of the list item and it seems to work in most places. That's what I did here as well.

@jsejcksn yes, that is correct, but here it looks like the code block was indented by four spaces instead of two?

image

Regarding the code example you posted, I opened dprint/dprint-plugin-markdown#28 in order to handle this scenario more gracefully. Looks like all the code blocks stay indented when the list is initially at the wrong indentation.

@jsejcksn
Copy link
Contributor Author

@dsherret:

yes, that is correct, but here it looks like the code block was indented by four spaces instead of two?

I'm not positive, but it looks like your screenshot was from these lines:

deno/std/encoding/README.md

Lines 146 to 153 in 703aed6

- If the data is not already in the required output format, or a different column header is desired, then a `ColumnDetails` object type can be used for each column:
- **`fn?: (value: any) => string | Promise<string>`** is an optional function to transform the targeted data into the desired format
- **`header?: string`** is the optional value to use for the column header name
- **`prop: PropertyAccessor | PropertyAccessor[]`** is the property accessor (`string` or `number`) or array of property accessors used to access the data on each object
```ts
const columns = [

That code block on line 152 belongs at the same indentation level as the list item on line 146, but dprint seems to be indenting it a level further to be a part of the list item that begins on line 150.

when the list is initially at the wrong indentation

I'm not sure what this means. Are you saying that the spec forbids whitespace preceding a list marker? Will you clarify? Here's a link to list items in the spec: https://spec.commonmark.org/0.29/#list-items

@jsejcksn
Copy link
Contributor Author

@bartlomieju It seems that the PR was merged prematurely.

To be clear: I think all of the std module code is fine as-is, but the README was not formatted correctly when the merge happened. How to address that? Is there a way to push a fix or must a new PR be created?

@dsherret
Copy link
Member

@jsejcksn I was referencing the autoformatted code in 05e0760, but looking at the reverse of 703aed6 (if I understanding it correctly, the reverse shows what the formatter did?) the main issue there is that on line 100 the list was indented two spaces instead of not using an indent and so that threw the code blocks within the list off. In other words, the list on line 100 should be at 0 spaces indent and the the code blocks should be at 2 spaces indent, but instead the list is at 2 spaces indent and code block at 4 spaces indent. So either the formatter or markdown parser is not gracefully handling this scenario. I think it's probably internally storing being "one indentation level deep", but then when it measures the code block it sees it's at 4 spaces so thinks it is an indented code block instead of only a fenced one. Again, I've opened dprint/dprint-plugin-markdown#28 to handle this more gracefully. Let me know if I've missed anything though!

If you notice any other issues, you can play around in the playground with the same configuration as what's used in deno's codebase here in the playground and open an issue at dprint-plugin-markdown.

@jsejcksn
Copy link
Contributor Author

In other words, the list on line 100 should be at 0 spaces indent

@dsherret Thanks for clarifying. That's what I was wondering about when I said:

Are you saying that the spec forbids whitespace preceding a list marker? Will you clarify? Here's a link to list items in the spec: https://spec.commonmark.org/0.29/#list-items

I read more into the spec, and it appears that list markers can be indented 0-3 spaces and should be rendered the same way as not being indented. (Indenting 4 spaces indicates a code block). That detail is at 5.2.4:

https://spec.commonmark.org/0.29/#:~:text=Indentation.%20If%20a%20sequence%20of%20lines%20Ls%20constitutes%20a%20list%20item

So, if I'm interpreting it correctly, I think dprint is not following the spec in this way. If that's the case, it might be worth including that info in the issue you filed.

@bartlomieju
Copy link
Member

@bartlomieju It seems that the PR was merged prematurely.

To be clear: I think all of the std module code is fine as-is, but the README was not formatted correctly when the merge happened. How to address that? Is there a way to push a fix or must a new PR be created?

@jsejcksn new PR is required

caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 21, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 24, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 24, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 24, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 31, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 31, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 31, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Jan 31, 2021
caspervonb pushed a commit to caspervonb/deno_std that referenced this pull request Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[std/encoding/csv] Request for "stringify" function to complement "parse" function
5 participants