S3: error calling ListObjectsV2 with unusual file name in results #5043

Closed
NathanBaulch opened this issue Oct 28, 2023 · 5 comments

NathanBaulch commented Oct 28, 2023

Describe the bug

I'm unable to list the contents of a bucket due to the presence of a file with "%10" in its name.

Expected Behavior

Successfully return the file object.

Current Behavior

Error:

could not list objects: SerializationError: failed to decode REST XML response
        status code: 200, request id: XM6P29PNE0M2FX9S
caused by: XML syntax error on line 2: illegal character code U+0010

Digging deeper, it looks like the XML unmarshaler is tripping up on the string sequence &#x10; in the file name 2018_POSTER&#x10;_PRE.jpg. According to AWS Console the actual file name is 2018_POSTER%10_PRE.jpg.

Reproduction Steps

Complete example:

x := `
<ListBucketResult>
    <Contents>
        <Key>2018_POSTER&#x10;_PRE.jpg</Key>
    </Contents>
</ListBucketResult>`
r := &request.Request{
	HTTPResponse: &http.Response{Body: io.NopCloser(strings.NewReader(x))},
	Data:         &s3.ListObjectsV2Output{},
}
restxml.Unmarshal(r)
if r.Error != nil {
	panic(r.Error)
	// panic: SerializationError: failed to decode REST XML response
	// caused by: XML syntax error on line 3: illegal character code U+0010
}
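
The same failure can probably be shown without the SDK at all. The sketch below is not part of the original report: it uses only Go's standard encoding/xml package, and listResult and parseKeys are hypothetical names for a trimmed-down stand-in of the list response. It assumes the SDK error comes from the underlying XML decoder rejecting the &#x10; character reference, while a literal "%10" in the key parses without trouble.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// listResult is a hypothetical, trimmed-down stand-in for the S3 list response.
type listResult struct {
	Keys []string `xml:"Contents>Key"`
}

// parseKeys decodes the document and returns the object keys it contains.
func parseKeys(doc string) ([]string, error) {
	var res listResult
	if err := xml.Unmarshal([]byte(doc), &res); err != nil {
		return nil, err
	}
	return res.Keys, nil
}

func main() {
	// Key containing a character reference to the control character U+0010.
	bad := `<ListBucketResult><Contents><Key>2018_POSTER&#x10;_PRE.jpg</Key></Contents></ListBucketResult>`
	// Key containing the three literal characters '%', '1', '0'.
	good := `<ListBucketResult><Contents><Key>2018_POSTER%10_PRE.jpg</Key></Contents></ListBucketResult>`

	_, err := parseKeys(bad)
	fmt.Println("control character reference:", err)

	keys, err := parseKeys(good)
	fmt.Println("literal percent sequence:", keys, err)
}
```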

Possible Solution

No response

Additional Information/Context

No response

SDK version used

v1.44.320

Environment details (Version of Go (go version)? OS name and version, etc.)

go1.21.3 windows/amd64

NathanBaulch added the bug and needs-triage labels Oct 28, 2023
RanVaknin self-assigned this Oct 30, 2023
@RanVaknin
Contributor

Hi @NathanBaulch,

I'm not able to reproduce this reported behavior.

package main

import (
	"context"
	"fmt"
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	ctx := context.Background()
	sess, err := session.NewSession(&aws.Config{
		Region:   aws.String("us-east-1"),
		LogLevel: aws.LogLevel(aws.LogDebugWithHTTPBody),
	})
	if err != nil {
		panic(err)
	}

	svc := s3.New(sess)
	
	out, err := svc.ListObjectsV2WithContext(ctx, &s3.ListObjectsV2Input{
		Bucket: aws.String("foo-bucket-REDACTED"),
	})
	if err != nil {
		panic(err)
	}

	fmt.Println(len(out.Contents))
}

This prints fine:

2023/10/30 09:28:07 
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult
	xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
	<Name>foo-bucket-REDACTED</Name>
	<Prefix></Prefix>
	<KeyCount>1</KeyCount>
	<MaxKeys>1000</MaxKeys>
	<IsTruncated>false</IsTruncated>
	<Contents>
		<Key>2018_POSTER%10_PRE.jpg</Key>
		<LastModified>2023-10-30T16:26:46.000Z</LastModified>
		<ETag>REDACTED</ETag>
		<Size>59015</Size>
		<StorageClass>STANDARD</StorageClass>
	</Contents>
</ListBucketResult>

Please note that S3's object naming guidelines specifically list % as a character to avoid because it requires special handling. One thing you can try is to set EncodingType: aws.String("url") on the request input to see if this alleviates the issue.

I also noticed you are using an older SDK version. Can you try this with either the newest version or Go SDK v2?

Thanks,
Ran~

RanVaknin added the response-requested and p3 labels and removed the needs-triage label Oct 30, 2023
@NathanBaulch
Author

It looks like the file was created (not by me!) with a 0x10 character in the name. This is pretty easy to reproduce:

_, err := s3c.PutObject(&s3.PutObjectInput{
	Bucket:      aws.String(myBucket),
	Key:         aws.String("_test/foo\x10bar.txt"),
	ContentType: aws.String("text/plain"),
	Body:        strings.NewReader("hello world"),
})
if err != nil {
	panic(err)
}
_, err = s3c.ListObjectsV2(&s3.ListObjectsV2Input{
	Bucket: aws.String(myBucket),
	Prefix: aws.String("_test/"),
})
if err != nil {
	panic(err) // XML syntax error on line 9: illegal character code U+0010
}

I understand this is totally against file naming recommendations (again, not by me!), but what do I do now that I'm in this situation? I need to reliably iterate over this bucket's contents in Golang!

github-actions bot removed the response-requested label Oct 31, 2023
@RanVaknin
Contributor

RanVaknin commented Oct 31, 2023

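Hi @NathanBaulch,

Ah, now I see what is happening. In Go, \x is an escape sequence that specifies a byte value in hexadecimal notation, so "\x10" in the key is the single byte 0x10, the ASCII DLE control character, rather than the three characters "%10". S3 stores the key with that byte and returns it in the list response as the character reference &#x10; (as in your first repro). That character is not legal in XML 1.0, so Go's XML decoder rejects the response.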
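
To make the distinction concrete, here is a small sketch (not from the thread; hasXMLIllegalControl is a hypothetical helper name) that checks whether a key contains a control character XML 1.0 does not allow. It only looks at characters below U+0020, which is the case at hand, and ignores the other ranges the XML specification excludes.

```go
package main

import "fmt"

// hasXMLIllegalControl reports whether key contains a control character
// below U+0020 other than tab, line feed, or carriage return. XML 1.0
// does not allow those characters, even as character references.
func hasXMLIllegalControl(key string) bool {
	for _, r := range key {
		if r < 0x20 && r != '\t' && r != '\n' && r != '\r' {
			return true
		}
	}
	return false
}

func main() {
	raw := "_test/foo\x10bar.txt"       // \x10 is one byte, the DLE control character
	literal := "2018_POSTER%10_PRE.jpg" // '%', '1', '0' are three ordinary characters

	fmt.Println(len("\x10"), len("%10"))                                  // prints "1 3"
	fmt.Println(hasXMLIllegalControl(raw), hasXMLIllegalControl(literal)) // prints "true false"
}
```

A check like this could be run on keys before upload so that new objects with such names are caught early.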

You can get around it by requesting URL-encoded keys in the response:

out, err := svc.ListObjectsV2WithContext(ctx, &s3.ListObjectsV2Input{
	Bucket:       aws.String(myBucket),
	EncodingType: aws.String("url"),
	Prefix:       aws.String("_test/"),
})
if err != nil {
	panic(err)
}
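
One thing to keep in mind with this workaround: with EncodingType set to "url", the keys in out.Contents are expected to come back percent-encoded (the 0x10 byte would arrive as %10), so they likely need decoding before being passed to other calls such as GetObject. The sketch below is a hedged illustration, not SDK code: decodeKey is a hypothetical helper, and it assumes the SDK hands the keys back still encoded and that a "+" in the encoded key stands for a space. Both assumptions are worth verifying against your own bucket; if "+" turns out to be literal, url.PathUnescape would be the function to use instead.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeKey reverses the URL encoding applied to a key in a list response.
// Assumption: '+' represents a space, as in form-style encoding.
func decodeKey(encoded string) (string, error) {
	return url.QueryUnescape(encoded)
}

func main() {
	// Example of what an encoded key from the listing might look like.
	encoded := "_test/foo%10bar.txt"

	key, err := decodeKey(encoded)
	if err != nil {
		panic(err)
	}
	// %q shows the control byte as an escape instead of printing it raw.
	fmt.Printf("%q\n", key)
}
```

In the listing loop this would be applied to aws.StringValue(obj.Key) for each entry in out.Contents.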

Thanks,
Ran~

RanVaknin added the response-requested label Oct 31, 2023
@NathanBaulch
Author

Perfect, thanks.

github-actions bot commented Nov 1, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
