Cholesky factorization test failure #69

StefanKarpinski · 2014-02-02T18:39:34Z

This has been failing for at least a day:

    JULIA test/linalg
     * linalg
exception on 1: ERROR: assertion failed: |b - :(*(apd,\(capd,b)))| <= 1.4432899320127035e-12
  b = 5 1
3   5
5   3
2   1
4   1
1   5
3   5
2   3
2   5
1   2

  :(*(apd,\(capd,b))) = 4.999999999999602   1.0000000000022737
2.9999999999998863  4.999999999999545
4.999999999999943   3.0000000000004547
1.9999999999999432  1.0000000000015916
4.000000000000114   1.0000000000020464
1.0000000000002274  4.999999999998636
3   5.000000000001592
2.0000000000000853  3.0000000000002274
2.0000000000001137  4.999999999999318
1.0000000000000568  2.000000000001819

  difference = 2.2737367544323206e-12 > 1.4432899320127035e-12
 in error at error.jl:22
 in test_approx_eq at test.jl:68
 in anonymous at no file:39
 in runtests at /Users/stefan/projects/julia.alt/test/testdefs.jl:5
 in anonymous at multi.jl:613
 in run_work_thunk at multi.jl:575
 in remotecall_fetch at multi.jl:647
 in remotecall_fetch at multi.jl:662
 in anonymous at multi.jl:1382
while loading linalg.jl, in expression starting on line 23
ERROR: assertion failed: |b - :(*(apd,\(capd,b)))| <= 1.4432899320127035e-12
  b = 5 1
3   5
5   3
2   1
4   1
1   5
3   5
2   3
2   5
1   2

  :(*(apd,\(capd,b))) = 4.999999999999602   1.0000000000022737
2.9999999999998863  4.999999999999545
4.999999999999943   3.0000000000004547
1.9999999999999432  1.0000000000015916
4.000000000000114   1.0000000000020464
1.0000000000002274  4.999999999998636
3   5.000000000001592
2.0000000000000853  3.0000000000002274
2.0000000000001137  4.999999999999318
1.0000000000000568  2.000000000001819

  difference = 2.2737367544323206e-12 > 1.4432899320127035e-12
 in error at error.jl:22
 in test_approx_eq at test.jl:68
 in anonymous at no file:39
 in runtests at /Users/stefan/projects/julia.alt/test/testdefs.jl:5
 in anonymous at multi.jl:613
 in run_work_thunk at multi.jl:575
 in remotecall_fetch at multi.jl:647
 in remotecall_fetch at multi.jl:662
 in anonymous at multi.jl:1382
while loading linalg.jl, in expression starting on line 23
while loading /Users/stefan/projects/julia.alt/test/runtests.jl, in expression starting on line 23

make[1]: *** [linalg] Error 1
make: *** [test-linalg] Error 2

cc: @jiahao, @andreasnoackjensen – did you guys monkey around with these tests some time on Friday or yesterday?

Julia Version 0.3.0-prerelease+1362
Commit d0aa799* (2014-02-02 17:49 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.0.0)
  CPU: Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm


jiahao · 2014-02-02T18:58:27Z

I just pushed c88a989b557b9c27538278c9a65a5abc30eaf717 which was intended to fix this. Can you verify if this works for you? (ref: #67)

jiahao · 2014-02-02T18:59:48Z

Those pesky winged monkeys. You leave them unattended for five minutes and they escape the castle and wreak havoc.

timholy · 2014-02-02T19:26:21Z

You just need one of those magic caps so you can get them to do your bidding ("write another 200 linalg test cases, mwaaa haaa haaaaa").

StefanKarpinski · 2014-02-03T05:08:34Z

I'm not sure if this works or not, but now the linalg tests don't finish at all :-|. I'm trying to get some float range stuff working, so I don't have time to debug this at the moment.

jiahao · 2014-02-03T05:15:53Z

Situation normal for me.

amitmurthy · 2014-02-03T06:03:41Z

Situation normal on Ubuntu 13.04 with latest master. The linalg tests do take a long time though.

andreasnoack · 2014-02-06T12:34:01Z

@StefanKarpinski Do you still see this error?

mschauer · 2014-02-06T22:16:29Z

This fails with

type of a: Int32 type of b: Float16
(Automatic) upper Cholesky factor

with

exception on 1: ERROR: assertion failed: |:(det(capd)) - :(det(apd))| <= 0.002384185791015625
  :(det(capd)) = 1.626428240984628e9
  :(det(apd)) = 1.6264282409980435e9
  difference = 0.01341557502746582 > 0.002384185791015625

on

Julia Version 0.3.0-prerelease+1408
Commit fb58104* (2014-02-06 21:53 UTC)
Platform Info:
  System: Linux (i686-linux-gnu)
  CPU: Intel(R) Core(TM) Duo CPU      T2450  @ 2.00GHz
  WORD_SIZE: 32
  BLAS: libopenblas (DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm
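
For scale, the two determinants reported above can be compared in relative rather than absolute terms. This is only a small arithmetic sketch using the numbers copied from the failure message; the variable names are made up for illustration, and it says nothing about how the tolerance in the test was actually derived.

```julia
# Values copied from the failure message above.
det_capd = 1.626428240984628e9      # det of the Cholesky factorization
det_apd  = 1.6264282409980435e9     # det of the original matrix
tol      = 0.002384185791015625     # absolute tolerance used by the test

abs_diff = abs(det_capd - det_apd)  # what the test compares against tol
rel_diff = abs_diff / abs(det_apd)  # the same gap, relative to the magnitude

println("absolute difference: ", abs_diff)
println("relative difference: ", rel_diff)
```

The absolute difference exceeds the tolerance, while the relative difference is many orders of magnitude smaller, which is what one would expect when an absolute bound is applied to values of order 1e9.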

StefanKarpinski · 2014-02-06T22:41:31Z

No, I'm good now, but there do still seem to be a lot of errors on various systems.

amitmurthy · 2014-02-07T11:10:30Z

This is really odd.

On the REPL

using Base.Test
import Base.LinAlg
import Base.LinAlg: BlasComplex, BlasFloat, BlasReal

n     = 10
a = rand(n,n)
for elty in (Float32, Float64, Complex64, Complex128)
    a = convert(Matrix{elty}, a)
    # cond
    @test_approx_eq_eps cond(a, 1) 4.837320054554436e+02 0.01
    @test_approx_eq_eps cond(a, 2) 1.960057871514615e+02 0.01
    @test_approx_eq_eps cond(a, Inf) 3.757017682707787e+02 0.01
    @test_approx_eq_eps cond(a[:,1:5]) 10.233059337453463 0.01
end

fails with

ERROR: assertion failed: |:(cond(a,1)) - 483.7320054554436| <= 0.01
  :(cond(a,1)) = 2468.8115
  483.7320054554436 = 483.7320054554436
  difference = 1985.0795179820564 > 0.01
 in error at error.jl:22
 in test_approx_eq at test.jl:68
 in anonymous at no file:4

while julia runtests.jl linalg goes through.

On Ubuntu 13.10

julia> versioninfo()
Julia Version 0.3.0-prerelease+1419
Commit a673e4c* (2014-02-06 22:55 UTC)
Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY)
  LAPACK: libopenblas
  LIBM: libopenlibm

andreasnoack · 2014-02-07T11:15:41Z

I think you forgot to srand(1234321) in the REPL.
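
A minimal sketch of why the seed matters here, written for current Julia, where `srand` has been renamed to `Random.seed!` (the thread predates that rename). The hard-coded `cond` values in the tests only make sense for the one matrix that the seeded generator produces; the exact numbers drawn also depend on the Julia version, so no expected values are shown.

```julia
using LinearAlgebra, Random

# Seed, then draw: this is the matrix the hard-coded expectations refer to.
Random.seed!(1234321)
a1 = rand(10, 10)

# Reseeding with the same value reproduces the same matrix.
Random.seed!(1234321)
a2 = rand(10, 10)

# Without reseeding, the next draw is a different matrix with a different cond.
a3 = rand(10, 10)

println("seeded:      cond(a1, 1) = ", cond(a1, 1))
println("reseeded:    cond(a2, 1) = ", cond(a2, 1))
println("not seeded:  cond(a3, 1) = ", cond(a3, 1))
```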

amitmurthy · 2014-02-07T11:22:20Z

Right. I was trying to parallelize the linalg tests and it was barfing. Thanks.

jiahao · 2014-02-07T15:59:09Z

The parallelization of the linalg tests brings up the question of how the random number generator behaves in parallel. I don't think the Mersenne Twister implementation we have currently is guaranteed to behave well for providing multiple parallel streams. Perhaps @ViralBShah knows...?

We can't actually parallelize the linalg tests reliably without risking breaking all the bounds, because we implicitly rely on the stream of matrices produced by the RNG (even though it is set to a deterministic seed): once the tests run in parallel, we can no longer guarantee the order in which matrices are generated for the various tests. As I showed in JuliaLang/julia#5705, there is a small but significant probability that changing the input matrix will cause tests to fail. If we want to pursue parallelizing the linalg tests, the only sane thing we can do now is either to snapshot all the matrices being computed and save them into the test suite or, as @stevengj suggested in JuliaLang/julia#5705, to use fixed matrices and adjust the existing bounds as necessary, so that the tests are deterministic, while in the long run continuing to chip away at #67.
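
As a rough illustration of the fixed-matrix option, here is a sketch of a Cholesky round-trip check that does not touch the RNG at all. The matrix `a`, the right-hand side `b`, and the tolerance `tol` are hypothetical values picked for this example, not taken from the test suite, and `cholesky` is the current name in the `LinearAlgebra` standard library rather than whatever the tests used at the time.

```julia
using LinearAlgebra

# Hypothetical fixed input (illustration only); nonsingular, so a'a is positive definite.
a = [4.0 1.0 2.0;
     1.0 3.0 0.0;
     2.0 0.0 5.0]
apd = a' * a

# Hypothetical right-hand side with two columns.
b = [5.0 1.0;
     3.0 5.0;
     5.0 3.0]

capd = cholesky(Symmetric(apd))   # Cholesky factorization of the SPD matrix
x = capd \ b                      # solve apd * x = b using the factorization

# Same shape of check as the failing test: |b - apd*(capd\b)| <= tol
residual = maximum(abs, b - apd * x)
tol = 1e-10                       # assumed bound, would need tuning per element type

println("residual = ", residual)
```

Because the input never changes, the residual is the same on every run and in any execution order, so a bound only has to be chosen once per matrix and element type.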

jiahao · 2014-02-07T16:14:55Z

Reopening with @mschauer's reported failure. Updated title to identify the specific test that is failing.

pao · 2014-02-07T16:14:55Z

Parallel RNG: JuliaLang/julia#94.

amitmurthy · 2014-02-07T16:33:45Z

Wouldn't simply setting srand(1234321) before running the specific subset suffice? It will always be deterministic for that subset. No?

jiahao · 2014-02-07T16:39:55Z

No, this does not preserve the current behavior. Imagine if you broke up the test suite halfway through the file and wrapped them in parallel blocks. The first test in the block from the second half of the file will be getting a matrix constructed directly from the first few numbers in the stream seeded by srand(1234321). However, the current test we have would be sampling (say) the 9000-9024th numbers of the stream from srand(1234321) and would be testing an entirely different matrix.
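
The offset problem can be sketched with two explicit generator objects. This assumes current Julia's `MersenneTwister` type from the `Random` standard library; the matrix size and the variable names are made up for the example.

```julia
using Random

seed = 1234321
n = 3

# Serial run: an earlier test consumes part of the stream before the later test draws.
rng = MersenneTwister(seed)
earlier_matrix = rand(rng, n, n)   # used by a test in the first half of the file
serial_matrix  = rand(rng, n, n)   # what the later test sees today

# Split run: the second half reseeds and therefore starts at the front of the stream.
rng = MersenneTwister(seed)
split_matrix = rand(rng, n, n)     # what the later test would see after splitting

println("later test sees the same matrix as before: ", split_matrix == serial_matrix)
println("later test now sees the earlier test's matrix: ", split_matrix == earlier_matrix)
```

The split run is still deterministic, but the later test is now exercising the matrix that used to belong to the earlier test, so its hard-coded tolerance no longer applies.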

amitmurthy · 2014-02-07T16:44:47Z

I understand that. We will have to change the test parameters. But if we could intersperse the current single test suite with calls to srand(1234321) (and changed test parameters) - wherever we want tests grouped together logically - they could be run in parallel without any issues, right?

I don't know the amount of work involved in doing this; the snapshot approach may be simpler.

jiahao · 2014-02-07T16:45:56Z

Again, the point I'm trying to make is not that the current behavior is all that desirable, it is merely that we can only guarantee the tests for the current stream of matrices, because the tolerances are all essentially hard-coded for the current stream. If we change the input stream, we would in principle have to readjust more magic numbers until the tests stop failing. But I think we are both in agreement on this point.

StefanKarpinski · 2014-02-07T18:02:11Z

Sorry. Ugh. That button is where the cancel comment button should be.

andreasnoack · 2014-06-02T07:36:38Z

This one was fixed a while ago.

jiahao · 2014-06-02T15:47:32Z

@andreasnoackjensen what was the fix?

andreasnoack · 2014-06-02T16:37:22Z

My investigation suggests the initial error reported by @StefanKarpinski was fixed by your JuliaLang/julia@c88a989, but that the error reported later by @mschauer for 32-bit systems was fixed by my JuliaLang/julia@b695c7a.

jiahao · 2014-06-02T16:38:46Z

Ah, right. For some reason I was thinking about the ARPACK failure when I saw this issue.

staticfloat mentioned this issue Feb 4, 2014

Linear solver on lower triangular matrix linalg test failing #70

Closed

jiahao mentioned this issue Feb 21, 2014

32-bit julia test failures #79

Closed

KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024

This issue was closed.
