"Internal compiler error: too many bits types" when running testall #5688

Closed
andreasnoack opened this issue Feb 5, 2014 · 16 comments
Labels
kind:bug Indicates an unexpected problem or unintended behavior test This change adds or pertains to unit tests

Comments

@andreasnoack
Member

Recently, I have had problems running make testall on my Mac. After a while the output goes crazy with:

    From worker 3:  !!!An ERROR occurred while printing the last error!!!
    From worker 3:  ", ErrorException("error compiling print: internal compiler error: too many bits types"), Array{Ptr{Void}, 1}[0x0000000001d624a6, 0x0000000001d35629, 0x0000000001d28e8f, 0x0000000001d28c8f, 0x0000000001d2512c, 0x0000000001d1f425, 0x000000000f2eed0e, 0x0000000001d1f425, 0x000000000f2ee9a8, 0x0000000001d1f425, 0x000000000f2edabf, 0x000000000f2ed270, 0x000000000f2ed041, 0x0000000001d1f425, 0x000000000f2ee795, 0x0000000001d23ebb, 0x000000000f2ec6e7, 0x0000000001d1f425, 0x000000000f2ee656, 0x0000000001d1f425, 0x00000000039e35ff, 0x0000000003605fb9, 0x0000000003605b13, 0x0000000001d1f425, 0x00000000039eb457, 0x00000000039e510c, 0x00000000039e5876, 0x0000000001d1f425, 0x0000000001d08973, 0x0000000001d61819, 0x0000000001d08d24])
error compiling print: internal compiler error: too many bits types
 in     From worker 3:  ("
    From worker 3:  !!!An ERROR occurred while printing the last error!!!
    From worker 3:  ", ErrorException("error compiling print: internal compiler error: too many bits types"), Array{Ptr{Void}, 1}[0x0000000001d624a6, 0x0000000001d35629, 0x0000000001d28e8f, 0x0000000001d28c8f, 0x0000000001d2512c, 0x0000000001d1f425, 0x000000000f2eed0e, 0x0000000001d1f425, 0x000000000f2ee9a8, 0x0000000001d1f425, 0x000000000f2edabf, 0x000000000f2ed270, 0x000000000f2ed041, 0x0000000001d1f425, 0x000000000f2ee795, 0x0000000001d23ebb, 0x000000000f2ec6e7, 0x0000000001d1f425, 0x000000000f2ee656, 0x0000000001d1f425, 0x00000000039e35ff, 0x0000000003605fb9, 0x0000000003605b13, 0x0000000001d1f425, 0x00000000039eb457, 0x00000000039e510c, 0x00000000039e5876, 0x0000000001d1f425, 0x0000000001d08973, 0x0000000001d61819, 0x0000000001d08d24])
ERROR: error compiling print: internal compiler error: too many bits types
 in     From worker 3:  ("
    From worker 3:  !!!An ERROR occurred while printing the last error!!!
ERROR: ^Cmake[1]: *** [all] Interrupt: 2
make: *** [testall] Interrupt: 2

andreass-mbp:julia andreasnoackjensen$ error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types
 in ERROR: error compiling print: internal compiler error: too many bits types

The tests work when I run them individually. This is a mid-2009 MacBook Pro, and versioninfo() reports:

Julia Version 0.3.0-prerelease+1396
Commit f2b1168* (2014-02-05 04:25 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.0.0)
  CPU: Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm
@kmsquire
Member

kmsquire commented Feb 5, 2014

I can verify this problem on Linux.

julia> versioninfo()
Julia Version 0.3.0-prerelease+1399
Commit 20b9453* (2014-02-05 16:27 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm

@kmsquire
Member

kmsquire commented Feb 5, 2014

(I've updated the issue accordingly.)

@kmsquire kmsquire added the bug label Feb 5, 2014
@kmsquire
Member

kmsquire commented Feb 5, 2014

git bisect blames 186287a, which contains Andreas' own linear algebra updates.

It seems unlikely that this commit caused the issue; more likely it merely uncovered it.

Having only 2 processors/cores also seems to be a prerequisite for triggering this. Travis doesn't have a problem (it runs with 8 procs), and the tests run fine on an 8-core machine I tried. Artificially restricting that machine to 2 processors caused the error to appear.

@mauro3
Contributor

mauro3 commented Feb 5, 2014

I get almost the same problem on Linux, but I only observe it when I run the tests in serial (i.e. setting n=1 in runtests.jl). The error is triggered after * sparse is printed. I tried some bisecting, and the bug seems to go away when I comment out the linalg test. (I'm also on commit 20b9453*.) This seems to confirm @kmsquire's findings.

@kmsquire kmsquire added the test label Feb 5, 2014
@rsofaer
Contributor

rsofaer commented Feb 6, 2014

I have this problem on OS X 10.9. I'm on a late-2010 MacBook Air.

@nalimilan
Member

This is very strange. Here (Fedora 20 64-bit) it only happens with double-conversion 2.0.1. With the version I was using previously, 2.0.0, the tests run fine. I've tested several times and always got this result. By chance, would you be using a recent release of double-conversion 1.x or 2.x?

@mauro3
Contributor

mauro3 commented Feb 8, 2014

I ran with all the libraries as pulled in by the Julia install, which in this case means double-conversion 1.1.1.

@mschauer
Contributor

mschauer commented Feb 9, 2014

Which tests ran earlier in the same worker?

For example, after deactivating some tests I get

    JULIA test/all
        From worker 2:       * core
        From worker 3:       * keywordargs
        From worker 3:       * numbers
        From worker 2:       * strings
        From worker 2:       * spawn
        From worker 2:         [stdio passthrough ok]
        From worker 2:       * parallel
    SUCCESS

but

    JULIA test/all
        From worker 2:       * core
        From worker 3:       * keywordargs
        From worker 3:       * numbers
        From worker 2:       * strings
        From worker 2:       * functional
        From worker 2:       * bigint
        From worker 2:       * sorting
        From worker 2:       * statistics
        From worker 2:       * spawn
        From worker 2:         [stdio passthrough ok]
        From worker 3:       * parallel
exception on 2: ERROR: test error during :((readall((@cmd "\$exename -f -e 'println(STDERR,\"Hello World\")'".>@cmd "cat"))=="Hello World\n"))
assertion failed

@nalimilan
Member

Forget about double-conversion, I eventually got tests to pass with 2.0.1.

@JeffBezanson
Sponsor Member

@loladiro @vtjnash How about we simply clear the typeIdToType map when not inside the code generator? I believe these mappings are only used within a function and don't need to be globally unique. At least that would avoid this unnecessary error. Whether there are actually more than 65025 bits types is a different question --- I hope not. Could be memory corruption.
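
To make the proposal concrete, here is a rough sketch in Python rather than the runtime's own C/C++. Everything named here (`TypeIdRegistry`, `id_for`, `clear`, `compile_function`) is hypothetical and only models the idea: ids are handed out up to a fixed cap, and the mapping is dropped once a function has been compiled, on the assumption that ids only need to be unique within one function.

```python
MAX_BITS_TYPE_IDS = 65025  # the figure quoted in this comment; treated here as the cap


class TypeIdRegistry:
    """Hypothetical model of a capped type <-> id mapping."""

    def __init__(self, max_ids=MAX_BITS_TYPE_IDS):
        self.max_ids = max_ids
        self.type_to_id = {}
        self.id_to_type = {}

    def id_for(self, t):
        """Return the id for type t, allocating a new one on first sight."""
        existing = self.type_to_id.get(t)
        if existing is not None:
            return existing
        if len(self.type_to_id) >= self.max_ids:
            raise RuntimeError("too many bits types")
        new_id = len(self.type_to_id) + 1
        self.type_to_id[t] = new_id
        self.id_to_type[new_id] = t
        return new_id

    def clear(self):
        """Forget all mappings; only safe if no ids outlive the current function."""
        self.type_to_id.clear()
        self.id_to_type.clear()


def compile_function(registry, types_used):
    """Stand-in for code generation of one function: look up ids, then reset."""
    try:
        return [registry.id_for(t) for t in types_used]
    finally:
        registry.clear()  # assumption: ids need not be globally unique
```

With the reset in place, the cap bounds the number of distinct bits types used by a single function rather than by the whole session, so the error could only appear if one function really used that many.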

@vtjnash
Sponsor Member

vtjnash commented Feb 10, 2014

I would like to see why it happens first (perhaps instrument the fn with printf)

@vtjnash
Sponsor Member

vtjnash commented Feb 10, 2014

If it's memory corruption, your proposal probably wouldn't fix it

@JeffBezanson
Sponsor Member

My proposal isn't intended to fix any possible memory corruption; it's just a related observation that this error is avoidable in case it ever happens for real.

@simonster
Member

Looks like jl_eqtable_put may rehash typeToTypeId, but typeToTypeId never gets updated with the rehashed table, so jl_eqtable_get(typeToTypeId, t, NULL) will return NULL later for types that were already added. Unfortunately I don't know the workings of the GC well enough to fix this myself.
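
The failure mode can be illustrated with a small model, written in Python rather than the runtime's C. `EqTable`, `eqtable_put`, and `assign_ids` are hypothetical stand-ins, not Julia's actual functions, and the growth policy is an assumption made for illustration. The point it shows is only this: when a put may return a new, rehashed table, a caller that ignores the return value keeps querying the stale table, misses on types it already registered, and keeps handing out fresh ids.

```python
class EqTable:
    """Toy open-addressing table; stands in for the runtime's hash table."""

    def __init__(self, capacity=4):
        self.slots = [None] * capacity  # each slot is None or a (key, value) pair
        self.count = 0

    def get(self, key, default=None):
        n = len(self.slots)
        i = hash(key) % n
        for _ in range(n):
            entry = self.slots[i]
            if entry is None:
                return default
            if entry[0] == key:
                return entry[1]
            i = (i + 1) % n
        return default


def _insert(table, key, value):
    """Linear-probing insert; callers keep the table at most half full."""
    n = len(table.slots)
    i = hash(key) % n
    while table.slots[i] is not None and table.slots[i][0] != key:
        i = (i + 1) % n
    if table.slots[i] is None:
        table.count += 1
    table.slots[i] = (key, value)


def eqtable_put(table, key, value):
    """Insert and return the table to use from now on (possibly a new one)."""
    if 2 * (table.count + 1) > len(table.slots):
        bigger = EqTable(2 * len(table.slots))  # rehash into a larger table
        for entry in table.slots:
            if entry is not None:
                _insert(bigger, *entry)
        table = bigger
    _insert(table, key, value)
    return table


def assign_ids(types, keep_return_value):
    """Give each type an id on first sight; return how many ids were handed out."""
    table = EqTable()
    next_id = 0
    for t in types:
        if table.get(t) is None:
            next_id += 1
            result = eqtable_put(table, t, next_id)
            if keep_return_value:
                table = result  # keep using the possibly rehashed table
    return next_id
```

For example, with `types = ["Int8", "Int16", "Int32", "Int64"] * 100`, `assign_ids(types, True)` hands out 4 ids, while `assign_ids(types, False)` keeps allocating new ids for the same types on every pass. A fixed id budget is eventually exhausted that way even though only a handful of distinct types exist.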

@JeffBezanson
Sponsor Member

Oh my goodness how did we miss that. Thank you.

@carlobaldassi
Member

The fix solves the issue for me; make testall1 now passes.
