fix: file picker widgets removing files causing abnormal content, resulting in text garbled characters of type restapiplugin MULTIPART-FORM-DATA #31422

gzlinzihong · 2024-03-01T15:35:53Z

Description

fix filepicker widgets selects multiple files and then removes one of them, causing the final content of all file.data to be abnormal

fix The MULTIPART_FORM_DATA type in the body of the restapi plugin, when passing text type parameters, including Chinese, Japanese, etc., is garbled.

PR fixes following issue(s)

Fixes # (issue number)

31411

Type of change

Bug fix

Testing

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Also list any relevant details for your test configuration.

JUnit
Cypress

Test Plan

Add Testsmith test cases links that relate to this PR

Issues raised during DP testing

Link issues raised during DP testing for better visiblity and tracking (copy link from comments dropped on this PR)

Checklist:

Dev activity

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
My changes generate no new warnings
New and existing unit tests pass locally with my changes

Summary by CodeRabbit

New Features
- Enhanced the FilePicker widget to support better file management.
Bug Fixes
- Improved handling of character encoding in form data to support UTF-8.
Tests
- Added tests for new FilePicker functionality and data parsing improvements.

…lost.

welcome · 2024-03-01T15:35:57Z

Welcome to the Appsmith community! Thank you for your first pull request and making this project better. 🤗 Please make sure that you raise a review request so your code change does not go unnoticed.

coderabbitai · 2024-03-01T15:36:00Z

Walkthrough

The updates include enhancing the FilePickerV2 widget in the client side to manage file selections more robustly, specifically allowing for the removal of a file without affecting the integrity of other files' data. On the server side, an improvement in handling multipart form data character encoding ensures text parameters, including those in languages like Chinese and Japanese, are correctly processed without resulting in garbled text. These changes collectively aim to improve the user experience and data handling reliability.

Changes

File Path	Change Summary
`.../cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js`	Added a test case for removing a file in FilePickerV2 widget.
`.../src/widgets/FilePickerWidgetV2/widget/index.tsx`	Introduced `fileMap` object for storing selected files.
`.../helpers/restApiUtils/helpers/DataUtils.java`	Changed encoding to `UTF_8` in `parseFormData` method.
`.../helpers/DataUtilsTest.java`	Added a test for parsing multipart text data with different character sets.

Related issues

[Bug]: The filepicker widgets selects multiple files and then removes one of them, causing the final content of all file.data to be abnormal. #31411: This issue discusses problems with file data integrity in the FilePicker widget after file removal and garbled text in multipart form data, both of which are addressed by the changes in this PR. The update to file handling in the widget and the encoding fix for multipart form data directly tackle the reported bugs, making this issue relevant for linking.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit-tests for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit tests for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit tests.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
The JSON schema for the configuration file is available here.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

app/client/src/widgets/FilePickerWidgetV2/widget/index.tsx

Nikhil-Nandagopal · 2024-03-02T07:21:36Z

@gzlinzihong thanks for the PR! We'll get someone to review it

rohan-arthur · 2024-03-04T10:29:12Z

/ok-to-test tags="@tag.Filepicker, @tag.Widget, @tag.Datasource"

github-actions · 2024-03-04T10:29:39Z

Tests running at: https://github.com/appsmithorg/appsmith/actions/runs/8138779483.
Workflow: Appsmith External Integration Test Workflow.
Tags: @tag.Filepicker, @tag.Widget, @tag.Datasource.

rohan-arthur · 2024-03-04T10:29:42Z

/build-deploy-preview skip-test=true

github-actions · 2024-03-04T10:30:06Z

Deploying Your Preview: https://github.com/appsmithorg/appsmith/actions/runs/8138786096.
Workflow: On demand build Docker image and deploy preview.
skip-tests: . env: .
PR: 31422.
recreate: .

github-actions · 2024-03-04T10:36:13Z

Failed server tests

com.external.plugins.RestApiPluginTest#testGetApiWithBody
com.external.plugins.RestApiPluginTest#testGetApiWithoutBody
com.external.plugins.RestApiPluginTest#testHttpGetRequestRawBody

github-actions · 2024-03-04T11:41:16Z

Workflow run: https://github.com/appsmithorg/appsmith/actions/runs/8138779483.
Commit: ``.
Cypress dashboard url: Click here!
All cypress tests have passed 🎉🎉🎉

sneha122 · 2024-03-04T12:25:42Z

@Parthvi12 Can you please review the cypress tests?

sneha122 · 2024-03-04T12:29:26Z

...h-interfaces/src/main/java/com/appsmith/external/helpers/restApiUtils/helpers/DataUtils.java

@@ -190,7 +190,7 @@ public String parseFormData(List<Property> bodyFormData, Boolean encodeParamsTog
 byte[] valueBytesArray = new byte[0];
 if (StringUtils.hasLength(String.valueOf(property.getValue()))) {
 valueBytesArray =
- String.valueOf(property.getValue()).getBytes(StandardCharsets.ISO_8859_1);
+ String.valueOf(property.getValue()).getBytes(StandardCharsets.UTF_8);


@nidhi-nair @sumitsum Do you guys see any problems in extending the character set for multipart form data text type? Looks ok to me. Please let me know if anything else needs to be changed in order to accommodate for UTF-8

Hi @sneha122 the relvant change seems to have been added as part of this PR. From the PR it seems like adding ISO_8859_1 instead of UTF_8 could be an explicit decision. @nidhi-nair can you please check out the PR once and confirm if it is safe to replace with UTF_8 encoding ?
Also, in case we go ahead with the replacement, there are two other places in the same file where ISO_8859_1 encoding is used - we should check if they need replacement as well.
From my perspective, I think it should be safe to replace ISO_8859_1 with UTF_8. However, I would suggest waiting for @nidhi-nair 's confirmation on this.

Hi @nidhi-nair Could you please help confirm this?

Hey folks, apologies for the delay in response. That's a great catch, Sneha! Yes, the change is intentional. The two types of encoding differ from each other in that UTF-8 is a variable length encoding and ISO_8859_1 (Latin) is a fixed length encoding (of 8 bits). As a result UTF-8 has a different representation for special characters outside the first 256 code points. Latin encoding treats each byte as a separate character instead.

The reason our request parsing uses Latin encoding over UTF-8 is because the multipart form request that we use for action execution sends each character across in a single byte encoded format. I have lost a little bit of context on why that is the case, but it is likely because there is no special handling to let the client side library know otherwise. If you notice in L#470 at the request parsing stage, we have interpretted the incoming byte stream as Latin encoded, and created a String using that explicit type of encoding. As a result, when we are trying to convert the interpretted String into a bytw stream again inside DataUtils, we are using the same format, so we end up at the same kind of byte representation.

If we are making this change, we need to make sure that we use the same format of encoding across the board. Let me know if I can provide more details here. Perhaps what we need to do is to introduce the ability to even clarify the encoding for end users' endpoints.

Additionally, could I also request that we add Cypress tests to test for files as well as text bindings with non-ASCII character sets, so we know our solution here will be permanent, and no future change disregards the conclusion we get to at the end of this thread? Please make sure to cover Chinese, Japanese, and Arabic characters at a minimum.

Hey folks, apologies for the delay in response. That's a great catch, Sneha! Yes, the change is intentional. The two types of encoding differ from each other in that UTF-8 is a variable length encoding and ISO_8859_1 (Latin) is a fixed length encoding (of 8 bits). As a result UTF-8 has a different representation for special characters outside the first 256 code points. Latin encoding treats each byte as a separate character instead.

The reason our request parsing uses Latin encoding over UTF-8 is because the multipart form request that we use for action execution sends each character across in a single byte encoded format. I have lost a little bit of context on why that is the case, but it is likely because there is no special handling to let the client side library know otherwise. If you notice in L#470 at the request parsing stage, we have interpretted the incoming byte stream as Latin encoded, and created a String using that explicit type of encoding. As a result, when we are trying to convert the interpretted String into a bytw stream again inside DataUtils, we are using the same format, so we end up at the same kind of byte representation.

If we are making this change, we need to make sure that we use the same format of encoding across the board. Let me know if I can provide more details here. Perhaps what we need to do is to introduce the ability to even clarify the encoding for end users' endpoints.

Hi @nidhi-nair It is a text type rather than a blob type. You can check line L#453 where the bytes array has been converted to utf-8. However, when restapi is used later, the text type is converted to ISO_8859_1 again, resulting in garbled characters.

For specific reproduction steps, please check the chat between me and @sneha122 below this PR.

In addition, I checked ActionApi.tsx. The body is of FormData type. When appending a string value to FormData, the binary converted by the browser should be utf-8 encoded by default.

Whoops, missed that piece of context in the conversation entirely, sorry about that! Sneha or @gzlinzihong can you please help to check if the line you pointed to can be moved to Latin encoding instead? This is to check if we can use a uniform type of encoding throughout. I see that there is actually some interspersing happening between these two encodings already at L#553, and it makes me wonder what kind of behaviour we would see if a user was expecting to embed the value from a file into their binding. Please feel free to invalidate the use case itself if that is what your findings show. This is a core part of our execution flow, which is where my paranoia is coming from. :)

Hi @nidhi-nair Sorry, I don't understand how this is related to L#553 because the blob case is not actually executed.
I don't think changing the UTF-8 encoding to ISO_8859_1 in L#453 is a better way. Because this will introduce more problems, the test parameter in the example below is directly a constant, defined in actionConfig.

If the UTF-8 encoding is modified to ISO_8859_1 in L#453,although the form of the binding variable can be solved smoothly, the garbled problem will not be solved if the variable is a constant string, and because the encoding is modified before the plugin is executed, the impact The scope will design the execution of all plugins

sneha122 · 2024-03-04T14:48:58Z

Hi @gzlinzihong Thanks a lot for your contribution!

There are a couple of linting failures on server side. Could you please execute following command in server directory?
mvn spotless:apply

On client side as well there is a linting failure in file cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js, can you please format this file using prettier formatter?

These should fix all of your linting errors and we can merge the PR as quickly as we can

gzlinzihong · 2024-03-05T02:50:47Z

@sneha122 thank you for your feedback.

I have made the necessary amendments to the code on both the server side and the client side as per your suggestion. I've run the mvn spotless:apply command on the server side, and I've also formatted the cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js file with Prettier on the client side.

All the linting errors should now be corrected. Feel free to let me know if there is anything else that needs my attention.

Thank you again for your help and patience.

gzlinzihong · 2024-03-05T03:12:25Z

Hi @sneha122 I'm sorry for the confusion, I've realized that I made a mistake in the naming of my branch. If necessary, we can close this pull request and I will rename the branch and create a new pull request. Let me know if you would like me to proceed in this manner

sneha122 · 2024-03-05T05:16:32Z

@sneha122 thank you for your feedback.

I have made the necessary amendments to the code on both the server side and the client side as per your suggestion. I've run the mvn spotless:apply command on the server side, and I've also formatted the cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js file with Prettier on the client side.

All the linting errors should now be corrected. Feel free to let me know if there is anything else that needs my attention.

Thank you again for your help and patience.

Thanks a lot for these changes, the respective checks are running on the PR, once those are done we can check if we are good to go!!

sneha122 · 2024-03-05T05:24:33Z

Hi @sneha122 I'm sorry for the confusion, I've realized that I made a mistake in the naming of my branch. If necessary, we can close this pull request and I will rename the branch and create a new pull request. Let me know if you would like me to proceed in this manner

Hi @gzlinzihong, No worries we can keep this PR request and merge this once all checks pass.

sneha122 · 2024-03-05T05:24:50Z

/build-deploy-preview skip-tests=true

sneha122 · 2024-03-05T05:25:08Z

/ok-to-test tags="@tag.Filepicker, @tag.Widget, @tag.Datasource"

github-actions · 2024-03-05T05:25:13Z

Deploying Your Preview: https://github.com/appsmithorg/appsmith/actions/runs/8151480457.
Workflow: On demand build Docker image and deploy preview.
skip-tests: true.
env: ``.
PR: 31422.
recreate: .

github-actions · 2024-03-05T05:25:33Z

Tests running at: https://github.com/appsmithorg/appsmith/actions/runs/8151484166.
Workflow: Appsmith External Integration Test Workflow.
Tags: @tag.Filepicker, @tag.Widget, @tag.Datasource.

github-actions · 2024-03-05T05:39:03Z

Deploy-Preview-URL: https://ce-31422.dp.appsmith.com

github-actions · 2024-03-05T06:38:05Z

Workflow run: https://github.com/appsmithorg/appsmith/actions/runs/8151484166.
Commit: ``.
Cypress dashboard: Click here!
The following are new failures, please fix them before merging the PR:

cypress/e2e/Regression/ClientSide/Widgets/Tab/Tab_spec.js

To know the list of identified flaky tests - Refer here

github-actions · 2024-03-05T08:44:39Z

Workflow run: https://github.com/appsmithorg/appsmith/actions/runs/8151484166.
Commit: ``.
Cypress dashboard url: Click here!
All cypress tests have passed 🎉🎉🎉

sneha122 · 2024-03-05T09:22:27Z

Hi @gzlinzihong,

Could you please help me understand second part of the issue: The MULTIPART_FORM_DATA type in the body of the restapi plugin, when passing text type parameters, including Chinese, Japanese, etc., is garbled? Can you please give me exact steps to reproduce the issue so that it will be easier for me to test the deploy preview?

app/client/cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js

gzlinzihong · 2024-03-05T09:47:34Z

Hi @gzlinzihong,

Could you please help me understand second part of the issue: The MULTIPART_FORM_DATA type in the body of the restapi plugin, when passing text type parameters, including Chinese, Japanese, etc., is garbled? Can you please give me exact steps to reproduce the issue so that it will be easier for me to test the deploy preview?

Hi @sneha122 You can reproduce it by following these steps

Create a restapi query, as shown in the figure below. One of the parameters uses text type and the value is Chinese or Japanese, etc.
The test procedure for accepting requests is as follows
Print viewing request
Most applications should use UTF-8 encoding by default. If utf-8 encoding is used, the print result is as follows

sneha122 · 2024-03-05T12:36:40Z

@gzlinzihong Thanks a lot for the detailed steps, Tested the PR looks fixes look good to me!

gzlinzihong · 2024-03-07T06:28:02Z

Hi @sneha122 I would like to ask if anyone else will review it in the future?

sneha122 · 2024-03-07T06:53:44Z

Hi @gzlinzihong Let me get confirmation from Nidhi on the change of ISO to UTF-8, post that we are good to go ahead!

sneha122 · 2024-03-11T04:50:22Z

Hi @gzlinzihong, As I can see in your PR, you have resolved following two issues:

When using filepicker widgets to select multiple files and then remove one of them, all file.data contents are abnormal in the end.
The MULTIPART_FORM_DATA type in the body of the restapi plugin, when passing text type parameters, including Chinese, Japanese, etc., is garbled.

First one is completely resolved with relevant cypress tests and I have tested the changes as well. As for second one, we are still discussing the change from ISO to UTF, While we are still discussing and coming to conclusion on that, it will be great to get at least first fix merged, and we cannot do that with this PR as it contains changes for second one as well. Is it possible for you to create a separate PRs for both issues? That way we can merge changes for first one very quickly and second one we can discuss and take it forward separately. Please let me know what you think about this

gzlinzihong · 2024-03-11T04:59:09Z

Hi @gzlinzihong, As I can see in your PR, you have resolved following two issues:

When using filepicker widgets to select multiple files and then remove one of them, all file.data contents are abnormal in the end.

The MULTIPART_FORM_DATA type in the body of the restapi plugin, when passing text type parameters, including Chinese, Japanese, etc., is garbled.

First one is completely resolved with relevant cypress tests and I have tested the changes as well. As for second one, we are still discussing the change from ISO to UTF, While we are still discussing and coming to conclusion on that, it will be great to get at least first fix merged, and we cannot do that with this PR as it contains changes for second one as well. Is it possible for you to create a separate PRs for both issues? That way we can merge changes for first one very quickly and second one we can discuss and take it forward separately. Please let me know what you think about this

OK, I'll do it

gzlinzihong · 2024-03-11T06:05:57Z

Hi @sneha122
I have split it into 2 pr.
thank you for your patience!
filePickerPR
garbled characters

sneha122 · 2024-03-11T10:09:46Z

Thanks a lot @gzlinzihong This makes it super easy for us to merge the fix which is good to go, now we can focus on utf-8 separately, I will check the file picker PR and run necessary checks on it.

sneha122 · 2024-03-12T07:02:53Z

Closing this PR as we have created two separate ones, out of which one is merged, other one is pending and can be tracked here

fix: filepicker remove file lead to previously uploaded file data is …

b6daff5

…lost.

Nikhil-Nandagopal added the Task A simple Todo label Mar 1, 2024

coderabbitai bot reviewed Mar 1, 2024

View reviewed changes

app/client/src/widgets/FilePickerWidgetV2/widget/index.tsx Show resolved Hide resolved

gzlinzihong force-pushed the feature/fix-file-upload branch 5 times, most recently from 29af53e to 22517f7 Compare March 1, 2024 16:11

fix: garbled content

eb102fd

gzlinzihong force-pushed the feature/fix-file-upload branch from 22517f7 to eb102fd Compare March 1, 2024 16:11

sneha122 requested review from sneha122 and Parthvi12 March 4, 2024 12:25

sneha122 reviewed Mar 4, 2024

View reviewed changes

fix: linting failures

2c54e97

sneha122 requested a review from sumitsum March 5, 2024 09:03

Parthvi12 reviewed Mar 5, 2024

View reviewed changes

app/client/cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js Outdated Show resolved Hide resolved

app/client/cypress/e2e/Regression/ClientSide/Widgets/Filepicker/FilePickerV2_spec.js Outdated Show resolved Hide resolved

fix: Use commonlocators instead of hardcoding

76c6860

sneha122 added the Integrations Pod General Issues related to the Integrations Pod that don't fit into other tags. label Mar 8, 2024

gzlinzihong requested a review from nidhi-nair March 11, 2024 01:52

sneha122 closed this Mar 12, 2024

sneha122 mentioned this pull request Mar 12, 2024

fix: resulting in text garbled characters of type restapiplugin MULTI… #31647

Closed

fix: file picker widgets removing files causing abnormal content, resulting in text garbled characters of type restapiplugin MULTIPART-FORM-DATA #31422

fix: file picker widgets removing files causing abnormal content, resulting in text garbled characters of type restapiplugin MULTIPART-FORM-DATA #31422

Conversation

gzlinzihong commented Mar 1, 2024 • edited by coderabbitai bot Loading

Description

PR fixes following issue(s)

Type of change

Testing

How Has This Been Tested?

Test Plan

Issues raised during DP testing

Checklist:

Dev activity

Summary by CodeRabbit

welcome bot commented Mar 1, 2024

coderabbitai bot commented Mar 1, 2024 • edited Loading

Walkthrough

Changes

Related issues

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (.coderabbit.yaml)

CodeRabbit Discord Community

Nikhil-Nandagopal commented Mar 2, 2024

rohan-arthur commented Mar 4, 2024

github-actions bot commented Mar 4, 2024

rohan-arthur commented Mar 4, 2024

github-actions bot commented Mar 4, 2024

github-actions bot commented Mar 4, 2024

Failed server tests

github-actions bot commented Mar 4, 2024

sneha122 commented Mar 4, 2024

sneha122 Mar 4, 2024

Choose a reason for hiding this comment

sumitsum Mar 5, 2024 • edited Loading

Choose a reason for hiding this comment

gzlinzihong Mar 6, 2024 • edited Loading

Choose a reason for hiding this comment

nidhi-nair Mar 7, 2024

Choose a reason for hiding this comment

nidhi-nair Mar 7, 2024

Choose a reason for hiding this comment

gzlinzihong Mar 7, 2024

Choose a reason for hiding this comment

gzlinzihong Mar 7, 2024

Choose a reason for hiding this comment

nidhi-nair Mar 7, 2024

Choose a reason for hiding this comment

gzlinzihong Mar 7, 2024

Choose a reason for hiding this comment

sneha122 commented Mar 4, 2024

gzlinzihong commented Mar 5, 2024

gzlinzihong commented Mar 5, 2024

sneha122 commented Mar 5, 2024

sneha122 commented Mar 5, 2024

sneha122 commented Mar 5, 2024

sneha122 commented Mar 5, 2024

github-actions bot commented Mar 5, 2024

github-actions bot commented Mar 5, 2024

github-actions bot commented Mar 5, 2024

github-actions bot commented Mar 5, 2024

github-actions bot commented Mar 5, 2024

sneha122 commented Mar 5, 2024

gzlinzihong commented Mar 5, 2024 • edited Loading

sneha122 commented Mar 5, 2024

gzlinzihong commented Mar 7, 2024

sneha122 commented Mar 7, 2024

sneha122 commented Mar 11, 2024

gzlinzihong commented Mar 11, 2024

gzlinzihong commented Mar 11, 2024

sneha122 commented Mar 11, 2024

sneha122 commented Mar 12, 2024

gzlinzihong commented Mar 1, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 1, 2024 •

edited

Loading

CodeRabbit Configration File (`.coderabbit.yaml`)

sumitsum Mar 5, 2024 •

edited

Loading

gzlinzihong Mar 6, 2024 •

edited

Loading

gzlinzihong commented Mar 5, 2024 •

edited

Loading