Support fat binary sysimg and/or native jit with generic sysimg #18179

Closed
simonbyrne opened this issue Aug 22, 2016 · 14 comments
Labels
building (Build system, or building Julia or its dependencies); compiler:codegen (Generation of LLVM IR and native code); performance (Must go faster); speculative (Whether the change will be implemented is speculative)

Comments

@simonbyrne
Contributor

On my (Broadwell) machine, using the latest rc:

julia> versioninfo()
Julia Version 0.5.0-rc2+0
Commit 0350e57 (2016-08-12 11:25 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin13.4.0)
  CPU: Intel(R) Core(TM) i5-5287U CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, broadwell)

julia> @code_native muladd(1.0,2.0,3.0)
    .section    __TEXT,__text,regular,pure_instructions
Filename: float.jl
    pushq   %rbp
    movq    %rsp, %rbp
Source line: 247
    mulsd   %xmm1, %xmm0
    addsd   %xmm2, %xmm0
    popq    %rbp
    retq
Source line: 247
    nop

The same version built from scratch on my machine:

julia> versioninfo()
Julia Version 0.5.0-rc2+0
Commit 0350e57 (2016-08-12 11:25 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin15.6.0)
  CPU: Intel(R) Core(TM) i5-5287U CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, broadwell)

julia> @code_native muladd(1.0,2.0,3.0)
    .section    __TEXT,__text,regular,pure_instructions
Filename: float.jl
    pushq   %rbp
    movq    %rsp, %rbp
Source line: 247
    vfmadd213sd %xmm2, %xmm1, %xmm0
    popq    %rbp
    retq
Source line: 247
    nopl    (%rax,%rax)
simonbyrne added the performance (Must go faster) label on Aug 22, 2016
@tkelman
Contributor

tkelman commented Aug 22, 2016

You'd have to rebuild the system image to enable the native instruction set, since vfmadd213sd is an FMA3 instruction and we build the mac binaries so they work on generic systems back to core2.
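
The two listings above differ in more than speed: a fused multiply-add rounds once, while a separate multiply and add rounds twice. The following is a minimal sketch of that difference, not code from this thread. The input values are chosen purely for illustration, and it assumes a Julia version where `fma`, `muladd`, and `ldexp` are available in Base.

```julia
# Inputs chosen so that the exact product a*b = 1 - 2^-60 is not
# representable as a Float64 and rounds to 1.0.
tiny = ldexp(1.0, -30)       # exactly 2^-30
a = 1.0 + tiny
b = 1.0 - tiny

unfused = a * b - 1.0        # product is rounded to 1.0 first, so this is 0.0
fused   = fma(a, b, -1.0)    # one rounding at the end, so this is -2^-60

# muladd may use either form, whichever the compiled code finds cheaper.
chosen = muladd(a, b, -1.0)

println("a*b - 1.0          = ", unfused)
println("fma(a, b, -1.0)    = ", fused)
println("muladd(a, b, -1.0) = ", chosen,
        chosen == fused ? "  (fused)" : "  (separate multiply and add)")
```

Which form `muladd` reports depends on how the code was compiled, so the last line is a rough probe only and can differ between a generic build and one targeting the native CPU.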

yuyichao changed the title from "FMA instructions not working on OS X 0.5-rc2 build" to "Support fat binary sysimg" on Aug 22, 2016
yuyichao added the speculative (Whether the change will be implemented is speculative), building (Build system, or building Julia or its dependencies), and compiler:codegen (Generation of LLVM IR and native code) labels on Aug 22, 2016
yuyichao changed the title from "Support fat binary sysimg" to "Support fat binary sysimg and/or native jit with generic sysimg" on Aug 22, 2016
@simonbyrne
Contributor Author

Or alternatively, could we rebuild the sysimg as part of the installation process?

@yuyichao
Contributor

That would increase the installation time by a lot and also raise the system requirements (you need to have a compatible linker).

@simonbyrne
Contributor Author

I realise the commenters on this thread might be the exception, but how often do you expect users to reinstall Julia? It seems like a price worth paying. (I can't comment about the linker though)

@tkelman
Contributor

tkelman commented Aug 22, 2016

Kind of a duplicate of #14995. You won't have a linker present if you don't have development tools installed, which will be a small percentage of OSX or Linux machines and a large percentage of Windows ones.

@simonbyrne
Contributor Author

This is particularly important if we want to be able to exploit all the fancy vectorized instructions that Intel keeps foisting upon us.

@yuyichao
Contributor

yuyichao commented Aug 22, 2016

My (downstream) solution to this problem is to compile multiple versions of Julia with different instruction sets enabled (in particular I use x86_64 and core-avx2).

@yuyichao
Contributor

yuyichao commented Aug 22, 2016

> how often do you expect users to reinstall Julia? It seems like a price worth paying.

I imagine that can make the difference between whether the installation process can be done in a class or not.

(edit: in other words, #14995 could be okay, but not by default)

@simonbyrne
Contributor Author

But we could at least have it as an option, or even a post-install script (make-julia-faster.sh).

@yuyichao
Contributor

Yes, and that's #14995

@simonbyrne
Contributor Author

Alright, I guess close this as a duplicate.

@tkelman
Contributor

tkelman commented Aug 22, 2016

We already have that, it's called build_sysimg.jl. #14995 goes a little further with some non-portable tricks to do switching between multiple images.

With any of the compiled libraries that haven't gone to the lengths of implementing dynamic arch like openblas has, the choice is between not fully utilizing modern instruction sets, or people getting SIGILL if they try to install on older systems. I think the binaries that we provide should be as widely usable as we can make them, even if that comes with some performance cost. We could eventually move to building and distributing multiple instruction sets' worth of binaries, but that's going to be a lot of release overhead for something you can get right now by building from source.
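
For readers who want to try the build_sysimg.jl route on a binary install, the Julia 0.5-era developer documentation describes invoking the script roughly as follows. Treat this as a sketch for that release line only: `JULIA_HOME` and `build_sysimg.jl` were removed in later Julia versions, the default CPU target is understood to be the host CPU, a working linker is needed as noted earlier in the thread, and the install directory must be writable.

```julia
# Julia 0.5-era only: JULIA_HOME and build_sysimg.jl do not exist in Julia 1.x.
# Load the helper script that ships in the share/julia directory of the install.
include(joinpath(JULIA_HOME, Base.DATAROOTDIR, "julia", "build_sysimg.jl"))

# Rebuild the default system image in place; force=true allows overwriting
# the existing image.
build_sysimg(force=true)
```

After restarting Julia, rerunning `@code_native muladd(1.0,2.0,3.0)` is one way to check whether the fused instruction is now emitted.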

@simonbyrne
Contributor Author

But it should be easier for users to do (build_sysimg.jl isn't even mentioned in Performance Tips). Also, people are used to long installation times for complex software (ever installed Matlab, or TeX?).

I agree there's not much we can do about binary dependencies, but that's a point in our favour. If a user can download some Julia code and find it instantly faster than some random C library simply because it can use faster instructions, then that's a huge bonus.

@tkelman
Contributor

tkelman commented Aug 22, 2016

> ever installed Matlab

Far too many times for one person to ever have to sit through. Long install times are a negative. It's also not always safe to assume that Julia was installed into a user-writable location, so this may have to be done by the same person who initially does the install. Hopefully modularization of the standard library will help allow these things to be compiled into multiple files rather than one monolithic sys.so, so things can be recompiled selectively and faster.

3 participants