proposal: encoding/json: avoid massive escape costs #68203

Closed
karalabe opened this issue Jun 26, 2024 · 3 comments

@karalabe
Contributor

Proposal Details

It seems the json encoder and decoder have significant overhead when escaping strings. I've attached a set of benchmarks at the end of this report. In short, I have a large string (hex in this case) that I would like to insert into a JSON field. My benchmarks simply JSON-encode that single hex value.

BenchmarkMarshalString-12             	     162	   7383418 ns/op
BenchmarkMarshalRawJSON-12            	      42	  28148749 ns/op
BenchmarkMarshalTexter-12             	     153	   7785682 ns/op
BenchmarkMarshalJsoner-12             	      40	  28960272 ns/op
BenchmarkMarshalCopyString-12         	    4141	    263625 ns/op
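
To put the ns/op figures above in perspective, they can be converted into throughput. The sketch below assumes the payload is the 8388608-byte hex string produced by the benchmarks at the end of this report (4194304 source bytes, hex-encoded) and ignores the two quote characters; `mbPerSec` is a hypothetical helper, not part of any package. With these numbers the plain string encodes at roughly 1.1 GB/s, the RawMessage at roughly 0.3 GB/s, and the plain copy at roughly 32 GB/s.

```go
package main

import "fmt"

// mbPerSec is a hypothetical helper that converts a benchmark result
// (payload size in bytes, time per operation in nanoseconds) into MB/s.
func mbPerSec(payloadBytes int, nsPerOp float64) float64 {
	seconds := nsPerOp / 1e9
	return float64(payloadBytes) / seconds / 1e6
}

func main() {
	// Assumed payload: hex encoding doubles the 4194304 source bytes.
	const payload = 2 * 4194304

	// ns/op values copied from the benchmark output in this report.
	results := []struct {
		name string
		ns   float64
	}{
		{"MarshalString", 7383418},
		{"MarshalRawJSON", 28148749},
		{"MarshalTexter", 7785682},
		{"MarshalJsoner", 28960272},
		{"MarshalCopyString", 263625},
	}
	for _, r := range results {
		fmt.Printf("%-18s %10.1f MB/s\n", r.name, mbPerSec(payload, r.ns))
	}
}
```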

I would expect the performance to be near the speed of copying the data. However, Go seems to do a lot of extra processing. This report is kind of questioning various parts of that:

  • I can imagine Go wanting to double check the content of a string, but in that case, it would be nice to have a means to tell the json encoder/decoder that I know the content is valid, just parse it without wasting a ton of time.
  • I expected the RawMessage to actually not do all kinds of pre-post processing, but alas, Go elegantly ignores that it's "raw", and still does everything.
  • Annoyingly enough, for types that implement MarshalJSON, it seems the escaping runs 3 (!!!) times. I haven't found the 3rd one, but I think two of them are https://github.com/golang/go/blob/master/src/encoding/json/encode.go#L587 and the line right after, where both lines do an appendString call, which internally does the escape checks (yeah, the noescape flag only disables HTML escape checking, not ASCII escape checking).
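
The RawMessage point above can be observed without a profiler. The following is a minimal sketch, assuming only the standard encoding/json package; `marshalRaw` is a hypothetical helper introduced for illustration. If the bytes were copied through untouched, the whitespace would survive and the truncated input would not produce an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalRaw is a hypothetical helper: it passes the given bytes through
// json.Marshal as a json.RawMessage and returns the encoder's output.
func marshalRaw(raw string) (string, error) {
	out, err := json.Marshal(json.RawMessage(raw))
	return string(out), err
}

func main() {
	// The insignificant whitespace is dropped, so the encoder has scanned
	// and rewritten the "raw" bytes rather than copying them.
	out, err := marshalRaw(`{ "a" : "0f0f" }`)
	fmt.Println(out, err)

	// Truncated JSON is rejected, so the bytes are also validated.
	_, err = marshalRaw(`{"a": `)
	fmt.Println("error on invalid input:", err != nil)
}
```

The same scan applies to the output of any MarshalJSON method, which is consistent with the Jsoner benchmark tracking the RawMessage one.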

I'm not even entirely sure what's the solution to the various issues.

  • I'd expect to be able to use the json package without escaping.
  • I'd expect RawMessage to not be post-processed.
  • I'd expect the escaping code to be fast, and not take more time than encoding all the fields.
  • I'd expect encoding to run once, not 3 times.

// Imports needed by the benchmarks below.
import (
	"bytes"
	"encoding/hex"
	"encoding/json"
	"testing"
)

func BenchmarkMarshalString(b *testing.B) {
	src := bytes.Repeat([]byte{'0'}, 4194304)
	str := hex.EncodeToString(src)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		json.Marshal(str)
	}
}

func BenchmarkMarshalRawJSON(b *testing.B) {
	src := bytes.Repeat([]byte{'0'}, 4194304)
	msg := json.RawMessage(`"` + hex.EncodeToString(src) + `"`)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		json.Marshal(msg)
	}
}

func BenchmarkMarshalTexter(b *testing.B) {
	src := bytes.Repeat([]byte{'0'}, 4194304)
	txt := &Texter{str: hex.EncodeToString(src)}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		json.Marshal(txt)
	}
}

func BenchmarkMarshalJsoner(b *testing.B) {
	src := bytes.Repeat([]byte{'0'}, 4194304)
	jsn := &Jsoner{str: hex.EncodeToString(src)}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		json.Marshal(jsn)
	}
}

func BenchmarkMarshalCopyString(b *testing.B) {
	src := bytes.Repeat([]byte{'0'}, 4194304)
	str := hex.EncodeToString(src)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		buf := make([]byte, len(str)+2)
		buf[0] = '"'
		copy(buf[1:], str)
		buf[len(buf)-1] = '"'
	}
}

type Texter struct {
	str string
}

func (t Texter) MarshalText() ([]byte, error) {
	return []byte(t.str), nil
}

type Jsoner struct {
	str string
}

func (j Jsoner) MarshalJSON() ([]byte, error) {
	return []byte(`"` + j.str + `"`), nil
}
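
Until the encoder offers a way to skip the checks, one possible workaround is to assemble the output by hand for values that are known to be safe. This is only a sketch under stated assumptions: `quoteTrusted` is a hypothetical helper, and it is correct only when the input contains nothing JSON requires escaping (true for hex digits, not for arbitrary strings). It also only helps when the whole document is written by hand, since returning these bytes from MarshalJSON or wrapping them in RawMessage sends them back through the encoder's scan.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// quoteTrusted is a hypothetical helper that wraps s in double quotes
// without scanning it. ASSUMPTION: s contains no quotes, backslashes,
// control characters, or invalid UTF-8, e.g. because it is hex output.
func quoteTrusted(s string) []byte {
	buf := make([]byte, 0, len(s)+2)
	buf = append(buf, '"')
	buf = append(buf, s...)
	return append(buf, '"')
}

func main() {
	str := hex.EncodeToString(bytes.Repeat([]byte{'0'}, 1024))

	// For hex input the hand-built bytes match what json.Marshal produces.
	want, err := json.Marshal(str)
	if err != nil {
		panic(err)
	}
	fmt.Println("matches json.Marshal:", bytes.Equal(quoteTrusted(str), want))
}
```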

@gopherbot added this to the Proposal milestone Jun 26, 2024

@ianlancetaylor
Contributor

Let's focus further encoding/json optimization discussions on encoding/json/v2. #63397

@karalabe
Contributor Author

Ah, works for me!
