Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*.ji files need package version in slug to unbreak Revise #42404

Open
timholy opened this issue Sep 28, 2021 · 6 comments
Open

*.ji files need package version in slug to unbreak Revise #42404

timholy opened this issue Sep 28, 2021 · 6 comments

Comments

@timholy
Copy link
Sponsor Member

timholy commented Sep 28, 2021

Currently our precompile cache files (stored in ~/.julia/compiled/vX.Y/) have a slug for the pkg UUID and the Julia version:

julia/base/loading.jl

Lines 1383 to 1397 in 84cc901

function compilecache_path(pkg::PkgId, prefs_hash::UInt64)::String
entrypath, entryfile = cache_file_entry(pkg)
cachepath = joinpath(DEPOT_PATH[1], entrypath)
isdir(cachepath) || mkpath(cachepath)
if pkg.uuid === nothing
abspath(cachepath, entryfile) * ".ji"
else
crc = _crc32c(something(Base.active_project(), ""))
crc = _crc32c(unsafe_string(JLOptions().image_file), crc)
crc = _crc32c(unsafe_string(JLOptions().julia_bin), crc)
crc = _crc32c(prefs_hash, crc)
project_precompile_slug = slug(crc, 5)
abspath(cachepath, string(entryfile, "_", project_precompile_slug, ".ji"))
end
end

The package source directories (stored in ~/.julia/packages) also have a package version component to their name:

julia/base/loading.jl

Lines 169 to 173 in 84cc901

function version_slug(uuid::UUID, sha1::SHA1, p::Int=5)
crc = _crc32c(uuid)
crc = _crc32c(sha1.bytes, crc)
return slug(crc, p)
end

The thing to note is that the source code path naming is package-version dependent but the precompiled cache file is not. For Revise, this has a pretty awful consequence:

  • user loads a package. Revise notes this and stores the filename of the cachefile, knowing that it can extract the original source code later from the source-text segment of the cache file
  • user starts another julia process somewhere else on their system. Does a package update.
  • Thanks to automatic precompilation, that other julia process overwrites the cache file since the name does not depend on the package version
  • now back in the original julia session, Revise notes the manifest has updated. Tries to update the running code. Can't even extract the source file text because now the cache file has the source text for the new version package, not the old one.

You can reduce the chance of this happening by setting ENV["JULIA_PKG_PRECOMPILE_AUTO"]=0, but that isn't a real fix (if the user does a using SomePkg in the other session it will trigger overwriting of the cache file).

CC @IanButterworth (not a bug you created, but it's now much more exposed). The real fix is to track the version of the loaded package and make sure the version makes it into the slug.

@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 28, 2021

Ref some previous discussion about the path schema for the precompile files: #32651 (comment)

Tries to update the running code. Can't even extract the source file text because now the cache file has the source text for the new version package, not the old one.

So you then don't have the original source code and you can't diff it vs the new one? What's the performance impact of Revise just caching the source text of the packages it tracks?

@timholy
Copy link
Sponsor Member Author

timholy commented Sep 28, 2021

I've figured out a good workaround (arguably it's better than just a workaround): the caching of the text in the precompile file is still need to support devved packages, but it isn't needed anymore for versioned packages. Thanks to the fact that we now store multiple versions of the package and the version appears in the pkgdir slug, I can just get it instead from the actual .julia/packages/ source text: https://github.com/timholy/Revise.jl/blob/6b3c2390c341139f41c70dceb6f77f835cb5e076/src/pkgs.jl#L531-L540.

@timholy
Copy link
Sponsor Member Author

timholy commented Sep 28, 2021

What's the performance impact of Revise just caching the source text of the packages it tracks?

Pretty bad. On Revise's released version:

julia> using Revise

julia> tstart = time(); using DifferentialEquations; time()-tstart
12.63300609588623

vs one in which you add a maybe_parse_from_cache! call when you start tracking:

julia> using Revise

julia> tstart = time(); using DifferentialEquations; time()-tstart
28.40114998817444

(Nice, though, that it got all the way through parsing DE. That's a lot of packages, and helps me breathe easier knowing Revise is good enough to do that. I've been testing it against Base for a long time, but there are always weird corner cases...)

timholy added a commit to timholy/Revise.jl that referenced this issue Sep 28, 2021
)

The new features:
- stop `put!`ting to `backend.response_channel`
- allow watching multiple Manifests, in conjunction with changes to Pkg: JuliaLang/Pkg.jl#2744
- only update code for changes in the current Manifest
- use the original source-text instead of the cache when the cache has been overwritten by an external process (JuliaLang/julia#42404)

Together (even without the changes to Pkg) these should be a significant enhancement.

Fixes #647
Fixes #645
Fixes #617
@timholy
Copy link
Sponsor Member Author

timholy commented Sep 28, 2021

Should we leave this open anyway? It seems a little weird that the slug incorporates the Julia version but not the package version.

@IanButterworth
Copy link
Sponsor Member

IanButterworth commented Sep 28, 2021

Something I don't understand here is if the active project is folded into the compiled/... slug, in this example below why doesn't CSV etc. re-precompile given the active project change?

(@v1.8) pkg> activate --temp
  Activating new project at `/var/folders/_6/1yf6sj0950vcg4t91m9ltb5w0000gn/T/jl_UA4e4V`

(jl_UA4e4V) pkg> add CSV
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed SentinelArrays ─ v1.3.7
   Installed CSV ──────────── v0.9.4
    Updating `/private/var/folders/_6/1yf6sj0950vcg4t91m9ltb5w0000gn/T/jl_UA4e4V/Project.toml`
 [336ed68f] + CSV v0.9.4
    Updating `/private/var/folders/_6/1yf6sj0950vcg4t91m9ltb5w0000gn/T/jl_UA4e4V/Manifest.toml`
 [336ed68f] + CSV v0.9.4
 [944b1d66] + CodecZlib v0.7.0
 [9a962f9c] + DataAPI v1.9.0
 [e2d170a0] + DataValueInterfaces v1.0.0
 [48062228] + FilePathsBase v0.9.11
...
Precompiling project...
  10 dependencies successfully precompiled in 13 seconds (7 already precompiled)

(jl_UA4e4V) pkg> activate --temp
  Activating new project at `/var/folders/_6/1yf6sj0950vcg4t91m9ltb5w0000gn/T/jl_S2NRMy`

(jl_S2NRMy) pkg> add CSV
   Resolving package versions...
    Updating `/private/var/folders/_6/1yf6sj0950vcg4t91m9ltb5w0000gn/T/jl_S2NRMy/Project.toml`
 [336ed68f] + CSV v0.9.4
    Updating `/private/var/folders/_6/1yf6sj0950vcg4t91m9ltb5w0000gn/T/jl_S2NRMy/Manifest.toml`
 [336ed68f] + CSV v0.9.4
 [944b1d66] + CodecZlib v0.7.0
 [9a962f9c] + DataAPI v1.9.0
 [e2d170a0] + DataValueInterfaces v1.0.0
 [48062228] + FilePathsBase v0.9.11
...

(jl_S2NRMy) pkg> 

@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 28, 2021

in this example below why doesn't CSV etc. re-precompile given the active project change?

It will only recompile if it doesn't find a working existing precompile file. The one in the previous project is valid, so it will use that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants