This file contains errata for the book "Julia for Data Analysis", written by Bogumił Kamiński and published by Manning Publications Co.
I show the following example of code execution:
julia> function sum_n(n)
           s = 0
           for i in 1:n
               s += i
           end
           return s
       end
sum_n (generic function with 1 method)
julia> @time sum_n(1_000_000_000)
0.000001 seconds
500000000500000000
This timing is very fast (the reason is explained in the book). However, this is the behavior under Julia 1.7.
Under Julia 1.8 and Julia 1.9, running the same code takes longer (tested under Julia 1.9-beta4):
julia> @time sum_n(1_000_000_000)
2.265569 seconds
500000000500000000
The reason for this inconsistency is a bug in the @time macro that was introduced in the Julia 1.8 release.
The sum_n(1_000_000_000) call itself (without @time) executes fast.
Here is a simplified benchmark (run under Julia 1.9-beta4):
julia> let
           start = time_ns()
           v = sum_n(1_000_000_000)
           stop = time_ns()
           v, Int(stop - start)
       end
(500000000500000000, 1000)
Unfortunately, there is an issue with the @time macro used in global scope that needs to be resolved in Base Julia.
See this issue.
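The let-block measurement above can also be packaged as a small helper. The following is only a minimal sketch: the name time_call is hypothetical (it is not part of the book or of Base Julia), and the helper simply reads time_ns() inside a function body, so it does not depend on @time being used in global scope.

```julia
# Same function as in the book example.
function sum_n(n)
    s = 0
    for i in 1:n
        s += i
    end
    return s
end

# Hypothetical helper (not from the book): times a zero-argument call
# using time_ns() inside a function body instead of @time in global scope.
function time_call(f)
    start = time_ns()
    v = f()
    stop = time_ns()
    return v, Int(stop - start)  # result and elapsed time in nanoseconds
end

value, elapsed_ns = time_call(() -> sum_n(1_000_000_000))
```

The elapsed time depends on the machine and the Julia version, so no specific figure is claimed here; the returned value should match the result shown in the examples above.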
- middle of page 20: the provided link https://mng.bz/5mWD explaining the definition of the k-times winsorized mean no longer works. Use https://web.archive.org/web/20210804184830/https://v8doc.sas.com/sashtml/insight/chap38/sect17.htm provided by The Wayback Machine instead.
I compare the following expressions:
x > 0 && println(x)
and
if x > 0
    println(x)
end
where x = -7.
I write there that Julia interprets them both in the same way.
This is true in the sense that in both cases the println function is not called (which is the focus of the example).
However, the values of these expressions differ.
The first expression evaluates to false, while the second evaluates to nothing.
Here is how you can check it:
julia> x = -7
-7
julia> show(x > 0 && println(x))
false
julia> show(if x > 0
                println(x)
            end)
nothing
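The difference becomes visible when such an expression is the last one in a function body, because its value is then the function's return value. A minimal sketch, with hypothetical function names chosen only for illustration:

```julia
# Hypothetical helper names, used only for illustration.
print_if_positive_short(x) = x > 0 && println(x)

function print_if_positive_block(x)
    if x > 0
        println(x)
    end
end

r1 = print_if_positive_short(-7)  # condition is false, so && yields false
r2 = print_if_positive_block(-7)  # no branch is taken, so the if expression yields nothing

r1 === false   # true
isnothing(r2)  # true
```

For a positive x both functions print the value and return nothing, since that is what println returns.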
- top of page 45: use in this book): should be use in this book:
- middle of page 58: y[end - the + 1] = y[end -- k] should be y[end - i + 1] = y[end - k]
- top of page 59: sort(v::AbstractVector; kwthe.) should be sort(v::AbstractVector; kws...)
- middle of Listing 6.4: codeunits("?") should be codeunits("ε")
- middle of page 189: zsdf format should be zstd format
- bottom of page 191: misssingstring should be missingstring
- top of page 191: both ratings should be ratings
- bottom of page 255: ? Error: Error adding value to column :b. should be ┌ Error: Error adding value to column :b.
- bottom of page 302:
julia> df = DataFrame(a=1:3, b=1:3, c=1:3)
3×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
???????????????????????????
   1 │     1      1      1
   2 │     2      2      2
   3 │     3      3      3
should be
julia> df = DataFrame(a=1:3, b=1:3, c=1:3)
3×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      1      1
   2 │     2      2      2
   3 │     3      3      3
- top of page 318: in the annotation to Figure 12.6 the text Applies a log1p looks like Applies a loglp (this is a display issue: in the font used, the letter l and the digit 1 look identical)