Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check CHOLMOD version at runtime #10362

Merged
merged 1 commit into from
Mar 14, 2015
Merged

Check CHOLMOD version at runtime #10362

merged 1 commit into from
Mar 14, 2015

Conversation

andreasnoack
Copy link
Member

This one should help to fix problem like #10342 where the version of CHOLMOD linked at runtime differs from the version used to create the shim suitesparse_wrapper. If the two versions don't match, an error is thrown.

ccall((:cholmod_version, :libcholmod), Cint, (Ptr{Cint},), version_array)
ccall((:jl_cholmod_version, :libsuitesparse_wrapper), Cint, (Ptr{Cint},), tmp)
if tmp != version_array
throw(CHOLMODException("your version of CHOLMOD differ from the version used when Julia was build. These versions must match."))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

differs from the version used when Julia was built

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@andreasnoack andreasnoack force-pushed the anj/cholmod branch 2 times, most recently from a8ca62d to 297b698 Compare February 28, 2015 19:55
@tkelman
Copy link
Contributor

tkelman commented Feb 28, 2015

Does throwing exceptions in __init__ end up as a warning, or would that be an error immediately on startup making Julia unusable even for people who don't need cholmod?

@andreasnoack
Copy link
Member Author

Just checked and the result right now is

Hej
Warning: error initializing module CHOLMOD:
Base.SparseMatrix.CHOLMOD.CHOLMODException(msg="your version of CHOLMOD differs from the version used when Julia was built. These versions must match.")
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.0-dev+3625 (2015-02-28 19:55 UTC)
 _/ |\__'_|_|_|\__'_|  |  anj/cholmod/297b698* (fork: 1 commits, 0 days)
|__/                   |  x86_64-linux-gnu

though I plan to remove the "Hej".

So you can still use Julia even if you have an old or wrong version of CHOLMOD.

@ivarne
Copy link
Sponsor Member

ivarne commented Mar 1, 2015

This error will likely be encountered by people using various distributions, so there is a high probability they will not know what CHOLMOD is for, and they will have a hard time figuring out if the message is relevant to them. It is not even mentioned in the README.md, but the Lincence.md seems to indicate that it is related to some operations on sparce matrices.

I will also award a few bonus points for including the compiled version, and the current version. It's usually a struggle to make users find such information.

eg:

Julia was compiled with CHOLMOD version x.y.z, but is currently
linked to version x.y.Z, this might cause errors when working with
[whatever operations CHOLMOD provides] on sparse matrices.

@andreasnoack
Copy link
Member Author

It is a good idea to mention the version numbers in the error message. They are calculated anyway. However, real users shouldn't see this error. It is mainly meant as a help for packagers to make the right dependency requirements. Julia is broken and can segfault when the versions of CHOLMOD differ so we should really try to avoid mismatch.

@ivarne
Copy link
Sponsor Member

ivarne commented Mar 2, 2015

Real users will see this error message when the packagers mess up (which they do).

@andreasnoack
Copy link
Member Author

I'm fine with adding to the error message that the sparse solvers are broken when the versions don't match, but I feel sad that the packaging model of Linux distributions makes it so likely that Julia will end up broken. I'd really prefer that we shipped our own copies of the dependencies. I have space enough left on my harddisk to over 5000 copies of SuiteSparse, so I'd really prefer a couple of extra copies to a potentially broken application.

@tkelman
Copy link
Contributor

tkelman commented Mar 2, 2015

Encourage people to use the generic linux binaries instead of from their package manager then. You lose system-managed in-place updates that way, but you gain self-contained installation without needing root.

For distribution packaging we need to either declare more explicit version requirements on dependencies (#6742 is part of that), or register sysimg rebuild hooks. The static compilation work might make this a little less painful if the Julia sysimg could be broken up into more independent pieces, so you wouldn't have to rebuild the whole thing.

@nalimilan
Copy link
Member

This shouldn't happen in standard distro packages. Normally, packagers are expected to warn maintainers of all packages that depend on their library when they update to a backward-incompatible version so that they can rebuild their package.

So I'd say the problem comes from the fact that Julia moves very fast, and it's not easy for us to handle stable and development packages without sometimes breaking things. Maybe @staticfloat would better create a separate repo for the dependencies of stable releases, were things are more calm and users can expect more working packages.

@staticfloat
Copy link
Sponsor Member

I agree with @ivarne that we should make the error as user-facing as possible. With all the moving parts we now have with the Julia builds, it's difficult to ensure everything works when problems are silent until the moment of critical failure. The first step toward ensuring that problems like this don't arise is preemptively checking for them, so this is a nice improvement.

@nalimilan
Copy link
Member

BTW, @staticfloat, it looks like the policy for Debian packages is to include the SOVERSION in their name, so normally you cannot install Julia with an unsupported dependency version. Sounds like a sane approach to me, that Fedora unfortunately does not follow AFAICT.

@tkelman
Copy link
Contributor

tkelman commented Mar 6, 2015

This message should spell out what the consequences of this problem mean. Otherwise it'll confuse people.

Is the exact version number what we actually want to be checking here, or should we specifically be checking as many features as we can? SuiteSparse can be configured in different ways by different packagers...

@andreasnoack
Copy link
Member Author

@tkelman @ivarne I have extended the warnings in this pull request. Could you please take a look and see if the information is useful and if the language is correct.

@ivarne
Copy link
Sponsor Member

ivarne commented Mar 13, 2015

The warnings is tremendously more useful now! Thanks!

A few esthetic remarks. My terminal doesn't do a great job with breaking such long lines at word boundries, so I'd appreciate a few newlines at appropriate places. The """ quoted string also have a nice deindentation functionality that allows you to have multiline string literals without breaking the indentation pattern. (see how the leading spaces is removed by the parser.)

begin
    begin
        begin
            warn("""
                     CHOLMOD version incompatibility

                     Julia was compiled with a version of CHOLMOD that supported $(32length(IndexTypes))
                     bit integers, but is currently linked with version that supports $(8intsize) integers. This
                     might cause Julia to terminate when working with sparse matrices for operations involving
                     factorization of a matrix, e.g. solving systems of equations with \\.

                     This problem can be fixed by downloading the OS X or generic Linux binary from
                     www.julialang.org, which are shipped with the correct versions of all dependencies.
            """)
        end
    end
end

@nalimilan
Copy link
Member

@andreasnoack Somehow I missed that this commit added a warning even when only SuiteSparse's revision version (z in x.y.z) is changed. Is that absolutely necessary? Couldn't we check only the two first version numbers? It doesn't sound like a good development practice to me to break backward compatibility in an apparently bugfix-only version of a library; ideally, it would not even happen in minor releases, only in major ones. We could discuss this with upstream if needed.

This was raised in #12841, where a bugfix update of SuiteSparse in Fedora triggered the warning in Julia. Printing a scary warning all the time in a distro package isn't reasonable, and neither is updating Julia with every SuiteSparse update. I've never seen such a situation in established programs.

@andreasnoack
Copy link
Member Author

@nalimilan Good point. The test could probably be loosened to only check for match on x and y.

@nalimilan
Copy link
Member

Ah, great to hear. That makes it much easier to follow SuiteSparse updates in distros.

nalimilan added a commit that referenced this pull request Sep 7, 2015
The check introduced in #10362 is actually too strict,
as ABI break only happen with mismatches in the major
version. Warning on minor version differences means
distributions cannot update SuiteSparse without
rebuiling Julia (#12841).

Also fix the incorrect wording introuced in 25eb444:
this message corresponds to the case where the version is
different, not necessarily older.
@nalimilan
Copy link
Member

@andreasnoack See #13000.

nalimilan added a commit that referenced this pull request Sep 9, 2015
The check introduced in #10362 is actually too strict,
as ABI break only happen with mismatches in the major
version. Warning on minor version differences means
distributions cannot update SuiteSparse without
rebuiling Julia (#12841). We also require at least
libcholmod 2.1.1.

Call dlsym_e instead of dlsym, which raises an error instead
of returning C_NULL.

Also fix the incorrect wording introuced in 25eb444:
this message corresponds to the case where the version is
different, not necessarily older.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants