Experiment with different mallocs #128

Closed
ViralBShah opened this issue Jul 17, 2011 · 9 comments
Labels
performance Must go faster speculative Whether the change will be implemented is speculative

Comments

@ViralBShah
Member

There are some very fast mallocs out there:

http://www.canonware.com/jemalloc/
http://goog-perftools.sourceforge.net/doc/tcmalloc.html

It would be useful to try these out in place of the system malloc, to see if we get a real performance boost. I do feel that using jemalloc, rather than the system one, may also make performance and memory behaviour more uniform across different OSes.

I see use of malloc in src/, src/flisp, and src/support. It would also be nice to refactor the code so that other malloc implementations can be experimented with easily.

I guess the libraries in external will continue using their own malloc.
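
A minimal sketch of the kind of refactoring suggested above, written in C on the assumption that the call sites in src/ are C. Every name here (`allocator_t`, `set_allocator`, `rt_malloc`, `counting_malloc`, and so on) is hypothetical and does not come from the Julia sources; the idea is only that call sites go through one table of function pointers, so a different malloc can be swapped in at a single place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table; not an existing Julia runtime type. */
typedef struct {
    void *(*malloc_fn)(size_t size);
    void *(*realloc_fn)(void *ptr, size_t size);
    void  (*free_fn)(void *ptr);
} allocator_t;

/* Default: forward everything to the system allocator. */
static allocator_t current_allocator = { malloc, realloc, free };

/* Swap in another implementation, e.g. the entry points of another malloc. */
static void set_allocator(const allocator_t *a)
{
    assert(a && a->malloc_fn && a->realloc_fn && a->free_fn);
    current_allocator = *a;
}

/* Call sites would use these wrappers instead of malloc/realloc/free. */
static void *rt_malloc(size_t size)           { return current_allocator.malloc_fn(size); }
static void *rt_realloc(void *p, size_t size) { return current_allocator.realloc_fn(p, size); }
static void  rt_free(void *p)                 { current_allocator.free_fn(p); }

/* Example replacement: counts calls, then forwards to the system malloc. */
static size_t malloc_calls = 0;
static void *counting_malloc(size_t size)
{
    malloc_calls++;
    return malloc(size);
}
```

To experiment, one would fill an `allocator_t` with the functions of the allocator under test (here `{ counting_malloc, realloc, free }` as a stand-in) and pass it to `set_allocator` before any allocation happens. Memory obtained from one allocator must be freed by the same one, so swapping mid-run is not safe in general. On Linux, allocators such as jemalloc can usually also be tried without any code change by preloading their shared library via `LD_PRELOAD`.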

@ViralBShah
Member Author

This doesn't seem necessary unless we have performance measurements that require the use of a faster malloc.

StefanKarpinski pushed a commit that referenced this issue Feb 8, 2018
0.4 precompilation functions
@oschulz
Contributor

oschulz commented Feb 13, 2020

Just saw that Google published a revamped version of TCMalloc (https://github.com/google/tcmalloc) - given current core counts, would a multi-threaded malloc like that be beneficial for Julia?

@oschulz
Contributor

oschulz commented Feb 13, 2020

@yuyichao
Contributor

It'll certainly not benefit most uses since they don't use malloc anyway.

@oschulz
Contributor

oschulz commented Feb 13, 2020

Ah, thanks - was just curious.

@oscardssmith
Member

A benchmark came up on Slack today where our allocation is really slow. If you repeatedly create a 100,000,000-element undef vector, Julia is about 10x slower than numpy. We believe the problem is the lack of interaction between the allocator and the GC.
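
For a rough baseline of what the raw system allocator costs for the same pattern, something like the following C sketch could be used. It is not the benchmark from Slack; the function name `time_alloc_free` is made up here, and it only times a `malloc`/`free` pair of a given size, touching a single byte so that the compiler cannot drop the pair.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Average wall-clock seconds per malloc+free pair of `nbytes` (must be > 0),
   over `reps` repetitions (must be > 0). Returns -1.0 if an allocation fails. */
static double time_alloc_free(size_t nbytes, int reps)
{
    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);            /* C11 standard clock */
    for (int i = 0; i < reps; i++) {
        volatile char *buf = malloc(nbytes);
        if (buf == NULL)
            return -1.0;
        buf[0] = 1;                         /* touch one byte so the pair is kept */
        free((void *)buf);
    }
    timespec_get(&t1, TIME_UTC);
    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    return elapsed / reps;
}
```

Assuming 8-byte elements (the element type is not stated above), the case described would be `time_alloc_free((size_t)100000000 * 8, 100)`. The buffer is left uninitialized apart from one byte, which roughly mirrors an undef vector.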

@Seelengrab
Contributor

FYI, I "only" had a 2x drop in performance:

Projects $ python3
Python 3.7.5 (default, Apr 19 2020, 20:18:17)
[GCC 9.2.1 20191008] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit(stmt='np.empty(int(1e9), dtype=np.float64)', setup='import numpy as np')
66.47885020000103
>>> import numpy
>>> numpy.version.version
'1.18.3'
>>>
Projects $ julia -q
julia> versioninfo()
Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

julia> using BenchmarkTools

julia>  dummy = zeros(Int64(1e9));

julia> @benchmark similar($dummy)
BenchmarkTools.Trial:
  memory estimate:  7.45 GiB
  allocs estimate:  2
  --------------
  minimum time:     114.000 μs (65.79% GC)
  median time:      162.300 μs (71.69% GC)
  mean time:        175.176 μs (76.59% GC)
  maximum time:     49.640 ms (99.85% GC)
  --------------
  samples:          10000
  evals/sample:     1

The vast majority of the time is spent in GC, even for the minimum time, which is why we suspect that deallocating or reusing allocated blocks is somehow slow.
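
A note on comparing the two transcripts: `timeit.timeit` runs the statement 1,000,000 times by default and returns the total, so the 66.48 printed by Python is seconds per million calls, i.e. about 66.5 μs per call. The small C sketch below (helper names are made up for illustration) spells out that conversion and the resulting ratios against the BenchmarkTools numbers.

```c
#include <stddef.h>

/* timeit.timeit() returns the TOTAL time for `number` runs
   (default number = 1,000,000); convert to microseconds per call. */
static double per_call_us(double total_seconds, double number)
{
    return total_seconds / number * 1e6;
}

/* How many times slower `a_us` is than `b_us`. */
static double slowdown(double a_us, double b_us)
{
    return a_us / b_us;
}

/* With the figures from the transcripts:
     numpy  = per_call_us(66.47885020000103, 1e6)   about 66.5 us
     slowdown(114.0, numpy)                          about 1.7 (Julia minimum)
     slowdown(162.3, numpy)                          about 2.4 (Julia median)  */
```

This is where the "2x" comes from: it sits between the ratio for the minimum and the ratio for the median.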

@racinmat

racinmat commented Feb 3, 2022

I wonder if #42566 would be solved, or at least partially mitigated, by using jemalloc. Engineers at LinkedIn were able to solve nasty memory leaks by replacing glibc's malloc with jemalloc: https://engineering.linkedin.com/blog/2021/taming-memory-fragmentation-in-venice-with-jemalloc

@oschulz
Contributor

oschulz commented Feb 4, 2022

I'm not an expert on allocators, but I just came across this, in case it's of interest:

http://ithare.com/testing-memory-allocators-ptmalloc2-tcmalloc-hoard-jemalloc-while-trying-to-simulate-real-world-loads/

They seem to like jemalloc. :-) It's not a very recent comparison, though.

vtjnash pushed a commit that referenced this issue Dec 8, 2023
Stdlib: Statistics
URL: https://github.com/JuliaStats/Statistics.jl.git
Stdlib branch: master
Julia branch: jn/loading-stdlib-exts
Old commit: 04e5d89
New commit: 68869af
Julia version: 1.11.0-DEV
Statistics version: 1.11.1(Does not match)
Bump invoked by: @vtjnash
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
JuliaStats/Statistics.jl@04e5d89...68869af

```
$ git log --oneline 04e5d89..68869af
68869af Bump patch for version 1.11.1
89f5fc7 Create tagbot.yml
dc844db CI: restore v1.9.4 to build matrix (#159)
d0523ae relax test for mapreduce_empty (#156)
d1c1c42 Drop support for v1.9 in CI (#157)
bfc6326 Fix `quantile` with `Date` and `DateTime` (#153)
b8ea3d2 Prevent overflow in `mean(::AbstractRange)` and relax type constraint. (#150)
a88ae4f Document MATLAB behavior in `quantile` docstring (#152)
46290a0 Revert "Prepare standalone package, step 2 (#128)" (#148)
81a90af make SparseArrays a weak dependency (#134)
```

Co-authored-by: Dilum Aluthge <[email protected]>
6 participants