Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify and improve performance of rand(Char) #11178

Merged
merged 1 commit into from
May 7, 2015

Conversation

ScottPJones
Copy link
Contributor

This also fixes a minor issue that two valid code points (0x10fffe and 0x10ffff) would not be produced.
[They are valid code points or scalar values, even though they are part of the set of 66 "NonCharacters",
and per the Unicode standard, need to be allowed].
On testing on my laptop, this took about 83% of the time of the original code...
[kudos to @jakebolewski for #11033 in the first place!]

return reinterpret(Char,c)
end
end
c = rand(0x00000000:0x0010f7ff)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need

c = rand(r, 0x00000000:0x0010f7ff)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just followed what the code did before (it just had the values incorrect). Why would the r be needed, it actually is not even used (I don't know why the function even has the argument, it seems rather misleading)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm no expert on Julia's RNG code, but I think this might have been an oversight in the original code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way rand is meant to be used, here, is the 1-argument form, i.e.:
rand(Char), and that it is simply using the 1-argument form there also:

julia> typeof(0:0x10f7ff)
UnitRange{Int64}

julia> @which rand(0:0x10f7ff)
rand(r::AbstractArray{T,N}) at random.jl:203

I'm pretty sure it is correct both in the original code, and in my update.
Thanks for reviewing the code though, the dropping of r bothers me too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, and hi david ;-) rand(r, 0:0x10f7ff) gives reproducible random numbers and has side effects on r , rand(0:0x10f7ff)is not reproducible in a reasonable way and has side effects on GLOBAL_RNG . rand(r::AbstractRNG, ::Type{Char}) should not affect GLOBAL_RNG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@daviddelaat is right, the rand Range needs to be parameterized by the AbstractRNG. I'll add the fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain... I really know nothing about the way abstract ranges work, or GLOBAL_RNG...
I was just fixing the Unicode issues!
Maybe @jakebolewski should fix the range issues?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I totally understand your question, but the change Jake made in 7efbc01 isn't really specific to ranges or the RNG. It's an idiom for functions that interact with global state (like the GLOBAL_RNG or STDOUT). For example, the print function has one fallback method:

print(x) = print(STDOUT, x)

And then all custom types define methods print(io::IO, x::MyType), and within those methods they should only call the two-arg print methods, explicitly using io instead of defaulting back to STDOUT. This allows you to easily collect the output in, e.g., an IOBuffer without worrying about stuff leaking out to STDOUT.

The same applies to RNGs, since they are also a global that mutates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, now that makes sense!

jakebolewski added a commit that referenced this pull request May 7, 2015
Simplify and improve performance of rand(Char)
@jakebolewski jakebolewski merged commit 2bd0de3 into JuliaLang:master May 7, 2015
@ScottPJones ScottPJones deleted the spj/randchar branch May 7, 2015 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants