Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Delta export path unescape #7473

Merged
merged 2 commits into from
Feb 19, 2024

Conversation

N-o-Z
Copy link
Member

@N-o-Z N-o-Z commented Feb 19, 2024

Closes #7460

Change Description

Background

Fix issue where physical address is not escaped before writing it as path in the delta log file

Bug Fix

Issue root cause:
When importing data, physical address is saved unescaped. When exporting a delta table the log file is rewritten in the export location and the path is modified accordingly using the object physical address. For unescaped address which contain special characters in the path we are writing an invalid path.

Testing Details

Add + modify esti tests

Breaking Change?

No

@N-o-Z N-o-Z added bug Something isn't working contributor labels Feb 19, 2024
@N-o-Z N-o-Z self-assigned this Feb 19, 2024
Copy link

E2E Test Results - DynamoDB Local - Local Block Adapter

10 passed

@N-o-Z N-o-Z added the include-changelog PR description should be included in next release changelog label Feb 19, 2024
Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but please address comments before merging.

Comment on lines +536 to +537
accessKeyID := viper.GetString("access_key_id")
secretAccessKey := viper.GetString("secret_access_key")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use testutil.SetupTestingEnv here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what sense?
testutil.SetupTestingEnv configures the values (it is used in the TestMain and is run on every test)
Here we are just retrieving the values after running testutil.SetupTestingEnv

@@ -111,7 +111,8 @@ local function export_delta_log(action, table_def_names, write_object, delta_cli
local code, obj = lakefs.stat_object(repo, commit_id, unescaped_path)
if code == 200 then
local obj_stat = json.unmarshal(obj)
local physical_path = obj_stat["physical_address"]
local u = url.parse(obj_stat["physical_address"])
local physical_path = url.build_url(u["scheme"], u["host"], u["path"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we parse and then build what looks like the exact same URL. Is this somehow the way to unescape? If so, please add a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@N-o-Z N-o-Z merged commit 70f932c into master Feb 19, 2024
35 checks passed
@N-o-Z N-o-Z deleted the fix/delta-export-logs-path-with-space-7460 branch February 19, 2024 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working contributor include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Delta table export: can't read exported data if any delta partition name includes spaces
2 participants