Improve GroupBy performance by reducing calls to hash (use one Dict) #280

Merged (2 commits) on May 15, 2020

tkf (Member) commented on May 15, 2020

No description provided.
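
Since the PR has no description, here is a minimal sketch of the general idea the title points at. It is not the code changed in this PR; the function names and the sum/count state are hypothetical. Each `Dict` access hashes the key, so keeping all per-key state in one dictionary instead of two cuts the number of hash computations per element.

```julia
# Hypothetical illustration only; not the code changed in this PR.

# Two dictionaries keyed by the same group key: every element pays for a
# lookup and a store in each dictionary.
function groupsum_two_dicts(keyfn, xs)
    sums = Dict{Int,Int}()
    counts = Dict{Int,Int}()
    for x in xs
        k = keyfn(x)
        sums[k] = get(sums, k, 0) + x       # lookup + store in `sums`
        counts[k] = get(counts, k, 0) + 1   # lookup + store in `counts`
    end
    return sums, counts
end

# One dictionary holding both pieces of state per key: one lookup and one
# store per element.
function groupsum_one_dict(keyfn, xs)
    state = Dict{Int,Tuple{Int,Int}}()
    for x in xs
        k = keyfn(x)
        s, c = get(state, k, (0, 0))
        state[k] = (s + x, c + 1)
    end
    return state
end

state = groupsum_one_dict(x -> x % 3, 1:6)
println(state[1])   # (sum, count) for key 1
```

Whether the actual change follows this shape is an assumption; the benchmark rows for `["groupby", "sum", ...]` further down are the evidence of its effect.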

codecov-io commented on May 15, 2020

Codecov Report

Merging #280 into master will decrease coverage by 6.76%.
The diff coverage is 71.87%.

@@            Coverage Diff             @@
##           master     #280      +/-   ##
==========================================
- Coverage   90.49%   83.73%   -6.77%     
==========================================
  Files          20       20              
  Lines        1378     1377       -1     
==========================================
- Hits         1247     1153      -94     
- Misses        131      224      +93     
| Impacted Files | Coverage Δ |
|---|---|
| src/library.jl | 90.25% <71.87%> (-3.78%) ⬇️ |
| src/show.jl | 68.50% <0.00%> (-21.34%) ⬇️ |
| src/basics.jl | 66.66% <0.00%> (-19.05%) ⬇️ |
| src/interop/onlinestats.jl | 66.66% <0.00%> (-14.29%) ⬇️ |
| src/progress.jl | 74.74% <0.00%> (-12.13%) ⬇️ |
| src/processes.jl | 77.01% <0.00%> (-10.35%) ⬇️ |
| src/reduce.jl | 70.00% <0.00%> (-10.20%) ⬇️ |
| src/core.jl | 88.88% <0.00%> (-1.59%) ⬇️ |
| src/unordered.jl | 96.29% <0.00%> (-0.32%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7453396...066ab03.

github-actions (Contributor) commented

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 15 May 2020 - 03:01
    • Baseline: 15 May 2020 - 03:05
  • Package commits:
    • Target: 5ef086
    • Baseline: 745339
  • Julia commits:
    • Target: 381693
    • Baseline: 381693
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["filter_map_map!", "xf"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["gemm", "fusedmul", "xf", "16"] | 0.77 (5%) ✅ | 1.00 (1%) |
| ["gemm", "fusedmul", "xf", "2"] | 0.83 (5%) ✅ | 1.00 (1%) |
| ["gemm", "fusedmul", "xf", "32"] | 0.77 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "linalg", "8"] | 1.05 (5%) ❌ | 1.00 (1%) |
| ["gemm", "mul", "man", "false", "256"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "ivdep", "8"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["gemm", "mul", "man", "true", "256"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "false", "256"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "false", "8"] | 0.79 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "ivdep", "8"] | 0.90 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "true", "8"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["groupby", "sum", "sac"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["groupby", "sum", "xf-with-init"] | 0.85 (5%) ✅ | 66.29 (1%) ❌ |
| ["groupby", "sum", "xf-without-init"] | 0.81 (5%) ✅ | 66.29 (1%) ❌ |
| ["missing_dot", "xf_nota"] | 1.07 (5%) ❌ | 1.00 (1%) |
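
The time ratios in this table can be reproduced from the per-job results further down. The sketch below is one reading of the legend, not the benchmarking tool's implementation: `classify` is a hypothetical helper that divides the target time by the baseline time and compares the result against the noise tolerance shown in parentheses.

```julia
# Hypothetical helper, written from the legend above; it is not the
# benchmarking tool's own code.
# `tolerance` is the noise tolerance shown in parentheses (0.05 for time).
function classify(target, baseline; tolerance = 0.05)
    r = target / baseline
    if r > 1 + tolerance
        return r, :regression
    elseif r < 1 - tolerance
        return r, :improvement
    else
        return r, :invariant
    end
end

# ["groupby", "sum", "xf-with-init"]: 257.614 μs (target) vs 303.912 μs (baseline)
r, verdict = classify(257.614, 303.912)
println(round(r; digits = 2), " ", verdict)   # prints "0.85 improvement"
```

The memory ratio cannot be reproduced as exactly from the rounded KiB figures in the result tables, so it is left out of this sketch.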

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["findall"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["groupby", "sum"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Target

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz       5836 s          0 s       1295 s      55996 s          0 s
       #2  2095 MHz      52049 s          0 s       1586 s       9656 s          0 s
       
  Memory: 6.764888763427734 GB (3362.03515625 MB free)
  Uptime: 647.0 sec
  Load Avg:  1.08154296875  0.99755859375  0.5966796875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      28490 s          0 s       1437 s      57274 s          0 s
       #2  2095 MHz      53519 s          0 s       1691 s      32121 s          0 s
       
  Memory: 6.764888763427734 GB (3435.390625 MB free)
  Uptime: 889.0 sec
  Load Avg:  1.0  1.0  0.703125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 03:01
  • Package commit: 5ef086
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
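
As a rough illustration of how an ID addresses a nested suite, the sketch below uses plain nested `Dict`s as a stand-in. The `suite` contents and the `lookup` helper are assumptions made for the example; a real suite holds benchmark objects rather than strings.

```julia
# Stand-in for a nested benchmark suite (hypothetical data).
suite = Dict{String,Any}(
    "groupby" => Dict{String,Any}(
        "sum" => Dict{String,Any}(
            "sac" => "340.718 μs",
            "xf-with-init" => "257.614 μs",
        ),
    ),
)

# Walk the ID one level at a time, equivalent to suite["groupby"]["sum"]["sac"].
lookup(suite, id) = foldl(getindex, id; init = suite)

println(lookup(suite, ["groupby", "sum", "sac"]))   # prints "340.718 μs"
```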

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["cat", "base"] | 2.167 μs (5%) | | | |
| ["cat", "xf"] | 2.045 μs (5%) | | | |
| ["collect", "filter-missing"] | 117.512 μs (5%) | | 33.03 KiB (1%) | 19 |
| ["collect", "identity-float"] | 84.409 μs (5%) | | 256.89 KiB (1%) | 19 |
| ["collect", "identity-union"] | 402.743 μs (5%) | | 285.69 KiB (1%) | 6675 |
| ["dot", "blas"] | 1.480 μs (5%) | | | |
| ["dot", "man"] | 1.470 μs (5%) | | | |
| ["dot", "rf"] | 2.667 μs (5%) | | | |
| ["dot", "xf"] | 2.678 μs (5%) | | | |
| ["filter_map_map!", "man"] | 63.404 μs (5%) | | | |
| ["filter_map_map!", "xf"] | 64.704 μs (5%) | | 144 bytes (1%) | 8 |
| ["filter_map_reduce", "man"] | 287.525 μs (5%) | | | |
| ["filter_map_reduce", "xf"] | 287.526 μs (5%) | | | |
| ["findall", "base"] | 991.601 μs (5%) | | 2.00 MiB (1%) | 21 |
| ["findall", "xf-array"] | 781.377 μs (5%) | | 3.05 MiB (1%) | 100014 |
| ["findall", "xf-iter"] | 999.996 μs (5%) | | 2.00 MiB (1%) | 28 |
| ["gemm", "fusedmul", "blas", "16"] | 5.127 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "2"] | 3.758 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "32"] | 7.256 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "8"] | 3.985 ms (5%) | | | |
| ["gemm", "fusedmul", "xf", "16"] | 4.349 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "2"] | 543.832 μs (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "32"] | 8.360 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "8"] | 2.616 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "mul", "linalg", "256"] | 1.112 ms (5%) | | | |
| ["gemm", "mul", "linalg", "32"] | 3.750 μs (5%) | | | |
| ["gemm", "mul", "linalg", "8"] | 315.280 ns (5%) | | | |
| ["gemm", "mul", "man", "false", "256"] | 1.925 ms (5%) | | | |
| ["gemm", "mul", "man", "false", "32"] | 5.934 μs (5%) | | | |
| ["gemm", "mul", "man", "false", "8"] | 475.541 ns (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "256"] | 1.966 ms (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "32"] | 4.943 μs (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "8"] | 563.281 ns (5%) | | | |
| ["gemm", "mul", "man", "true", "256"] | 1.938 ms (5%) | | | |
| ["gemm", "mul", "man", "true", "32"] | 5.750 μs (5%) | | | |
| ["gemm", "mul", "man", "true", "8"] | 498.485 ns (5%) | | | |
| ["gemm", "mul", "xf", "false", "256"] | 1.928 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "32"] | 5.967 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "8"] | 471.459 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "256"] | 1.988 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "32"] | 4.829 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "8"] | 449.525 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "256"] | 1.934 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "32"] | 5.884 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "8"] | 473.505 ns (5%) | | 48 bytes (1%) | 2 |
| ["groupby", "sum", "sac"] | 340.718 μs (5%) | | 313.14 KiB (1%) | 10007 |
| ["groupby", "sum", "xf-with-init"] | 257.614 μs (5%) | | 157.44 KiB (1%) | 10008 |
| ["groupby", "sum", "xf-without-init"] | 246.614 μs (5%) | | 157.44 KiB (1%) | 10008 |
| ["missing_argmax", "man"] | 3.350 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "rf"] | 3.100 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "xf"] | 3.100 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_dot", "equiv"] | 1.810 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "man"] | 1.700 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "naive"] | 5.901 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf"] | 1.690 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf_nota"] | 1.760 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "xf"] | 256.322 μs (5%) | | 71.98 KiB (1%) | 3736 |
| ["missing_dot", "xf_nota"] | 261.223 μs (5%) | | 72.14 KiB (1%) | 3742 |
| ["partition_by", "man"] | 2.465 ms (5%) | | 352 bytes (1%) | 4 |
| ["partition_by", "xf"] | 2.417 ms (5%) | | 576 bytes (1%) | 7 |

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 03:05
  • Package commit: 745339
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["cat", "base"] | 2.167 μs (5%) | | | |
| ["cat", "xf"] | 2.044 μs (5%) | | | |
| ["collect", "filter-missing"] | 114.105 μs (5%) | | 33.03 KiB (1%) | 19 |
| ["collect", "identity-float"] | 83.404 μs (5%) | | 256.89 KiB (1%) | 19 |
| ["collect", "identity-union"] | 407.018 μs (5%) | | 285.97 KiB (1%) | 6685 |
| ["dot", "blas"] | 1.490 μs (5%) | | | |
| ["dot", "man"] | 1.480 μs (5%) | | | |
| ["dot", "rf"] | 2.678 μs (5%) | | | |
| ["dot", "xf"] | 2.678 μs (5%) | | | |
| ["filter_map_map!", "man"] | 64.002 μs (5%) | | | |
| ["filter_map_map!", "xf"] | 69.202 μs (5%) | | 144 bytes (1%) | 8 |
| ["filter_map_reduce", "man"] | 287.413 μs (5%) | | | |
| ["filter_map_reduce", "xf"] | 287.612 μs (5%) | | | |
| ["findall", "base"] | 986.143 μs (5%) | | 2.00 MiB (1%) | 21 |
| ["findall", "xf-array"] | 775.833 μs (5%) | | 3.05 MiB (1%) | 100014 |
| ["findall", "xf-iter"] | 996.743 μs (5%) | | 2.00 MiB (1%) | 28 |
| ["gemm", "fusedmul", "blas", "16"] | 5.149 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "2"] | 3.835 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "32"] | 7.263 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "8"] | 4.024 ms (5%) | | | |
| ["gemm", "fusedmul", "xf", "16"] | 5.679 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "2"] | 657.025 μs (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "32"] | 10.841 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "8"] | 2.666 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "mul", "linalg", "256"] | 1.110 ms (5%) | | | |
| ["gemm", "mul", "linalg", "32"] | 3.900 μs (5%) | | | |
| ["gemm", "mul", "linalg", "8"] | 300.000 ns (5%) | | | |
| ["gemm", "mul", "man", "false", "256"] | 2.029 ms (5%) | | | |
| ["gemm", "mul", "man", "false", "32"] | 5.900 μs (5%) | | | |
| ["gemm", "mul", "man", "false", "8"] | 500.000 ns (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "256"] | 2.066 ms (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "32"] | 4.800 μs (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "8"] | 500.000 ns (5%) | | | |
| ["gemm", "mul", "man", "true", "256"] | 2.044 ms (5%) | | | |
| ["gemm", "mul", "man", "true", "32"] | 5.800 μs (5%) | | | |
| ["gemm", "mul", "man", "true", "8"] | 500.000 ns (5%) | | | |
| ["gemm", "mul", "xf", "false", "256"] | 2.035 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "32"] | 6.200 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "8"] | 600.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "256"] | 2.051 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "32"] | 4.600 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "8"] | 500.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "256"] | 2.026 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "32"] | 5.900 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "8"] | 500.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["groupby", "sum", "sac"] | 313.012 μs (5%) | | 313.14 KiB (1%) | 10007 |
| ["groupby", "sum", "xf-with-init"] | 303.912 μs (5%) | | 2.38 KiB (1%) | 16 |
| ["groupby", "sum", "xf-without-init"] | 305.511 μs (5%) | | 2.38 KiB (1%) | 16 |
| ["missing_argmax", "man"] | 3.350 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "rf"] | 3.188 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "xf"] | 3.100 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_dot", "equiv"] | 1.790 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "man"] | 1.690 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "naive"] | 5.883 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf"] | 1.690 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf_nota"] | 1.820 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "xf"] | 252.611 μs (5%) | | 72.19 KiB (1%) | 3746 |
| ["missing_dot", "xf_nota"] | 244.510 μs (5%) | | 72.17 KiB (1%) | 3744 |
| ["partition_by", "man"] | 2.468 ms (5%) | | 352 bytes (1%) | 4 |
| ["partition_by", "xf"] | 2.347 ms (5%) | | 576 bytes (1%) | 7 |

Runtime information

| Runtime Info | |
|---|---|
| BLAS #threads | 2 |
| BLAS.vendor() | openblas64 |
| Sys.CPU_THREADS | 2 |

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.198
BogoMIPS:            4190.39
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves

| Cpu Property | Value |
|---|---|
| Brand | Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz |
| Vendor | :Intel |
| Architecture | :Skylake |
| Model | Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 1024, 36608) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 512 bit = 64 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

github-actions (Contributor) commented

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 15 May 2020 - 03:02
    • Baseline: 15 May 2020 - 03:07
  • Package commits:
    • Target: 5ef086
    • Baseline: 745339
  • Julia commits:
    • Target: 381693
    • Baseline: 381693
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: JULIA_NUM_THREADS => 2
    • Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["collect", "unordered", "basesize=1024"] | 0.88 (5%) ✅ | 0.94 (1%) ✅ |
| ["findfirst", "n=1000", "foldl"] | 0.88 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 0.87 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=500", "foldl"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["parallel_histogram", "comm", "basesize=4096"] | 1.00 (5%) | 1.02 (1%) ❌ |
| ["parallel_histogram", "comm", "basesize=8192"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["sum", "valley", "reduce", "basesize=256"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["sum", "valley", "reduce", "basesize=512"] | 1.07 (5%) ❌ | 1.00 (1%) |
Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["collect", "assoc"]
  • ["collect"]
  • ["collect", "unordered"]
  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["overhead"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["sum", "random"]
  • ["sum", "random", "reduce"]
  • ["sum", "uniform"]
  • ["sum", "uniform", "reduce"]
  • ["sum", "valley"]
  • ["sum", "valley", "reduce"]
  • ["words"]

Julia versioninfo

Target

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      47622 s          0 s       2327 s      19560 s          0 s
       #2  2095 MHz      47932 s          0 s       2206 s      16724 s          0 s
       
  Memory: 6.764888763427734 GB (3473.58984375 MB free)
  Uptime: 737.0 sec
  Load Avg:  1.68212890625  1.5927734375  1.02294921875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      69138 s          0 s       2784 s      26064 s          0 s
       #2  2095 MHz      70727 s          0 s       2734 s      21850 s          0 s
       
  Memory: 6.764888763427734 GB (3473.38671875 MB free)
  Uptime: 1024.0 sec
  Load Avg:  1.779296875  1.6455078125  1.189453125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 03:02
  • Package commit: 5ef086
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["collect", "assoc", "basesize=1"] | 354.113 ms (5%) | 13.655 ms | 87.05 MiB (1%) | 1558050 |
| ["collect", "assoc", "basesize=1024"] | 221.375 ms (5%) | | 1.84 MiB (1%) | 1780 |
| ["collect", "assoc", "basesize=32"] | 225.459 ms (5%) | | 5.63 MiB (1%) | 52986 |
| ["collect", "seq"] | 440.569 ms (5%) | | 512.98 KiB (1%) | 22 |
| ["collect", "unordered", "basesize=1"] | 442.065 ms (5%) | | 29.15 MiB (1%) | 402372 |
| ["collect", "unordered", "basesize=1024"] | 282.762 ms (5%) | | 799.73 KiB (1%) | 4230 |
| ["collect", "unordered", "basesize=32"] | 252.249 ms (5%) | | 1.47 MiB (1%) | 16604 |
| ["findfirst", "n=1000", "foldl"] | 631.501 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 321.487 ms (5%) | | 563.63 KiB (1%) | 10203 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 340.747 ms (5%) | | 287.00 KiB (1%) | 5210 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 333.997 ms (5%) | | 149.11 KiB (1%) | 2712 |
| ["findfirst", "n=400", "foldl"] | 505.868 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 281.274 ms (5%) | | 1.02 MiB (1%) | 18924 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 243.155 ms (5%) | | 525.41 KiB (1%) | 9526 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 262.177 ms (5%) | | 266.94 KiB (1%) | 4861 |
| ["findfirst", "n=500", "foldl"] | 79.442 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 46.872 ms (5%) | | 157.08 KiB (1%) | 2834 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 41.864 ms (5%) | | 84.28 KiB (1%) | 1520 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 43.053 ms (5%) | | 48.08 KiB (1%) | 868 |
| ["overhead", "default"] | 178.905 μs (5%) | | 146.16 KiB (1%) | 2629 |
| ["overhead", "stoppable=false"] | 178.504 μs (5%) | | 146.16 KiB (1%) | 2629 |
| ["overhead", "stoppable=true"] | 322.409 μs (5%) | | 146.41 KiB (1%) | 2645 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 4.290 ms (5%) | | 732.06 KiB (1%) | 103 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 5.334 ms (5%) | | 2.07 MiB (1%) | 503 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 4.664 ms (5%) | | 1.43 MiB (1%) | 243 |
| ["parallel_histogram", "comm", "basesize=16384"] | 15.041 ms (5%) | | 1.22 MiB (1%) | 156 |
| ["parallel_histogram", "comm", "basesize=4096"] | 24.215 ms (5%) | | 1.08 MiB (1%) | 3120 |
| ["parallel_histogram", "comm", "basesize=8192"] | 20.579 ms (5%) | | 1.25 MiB (1%) | 2201 |
| ["parallel_histogram", "seq"] | 7.847 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["sum", "random", "foldl"] | 16.970 ms (5%) | | | |
| ["sum", "random", "reduce", "basesize=128"] | 8.469 ms (5%) | | 313.23 KiB (1%) | 6061 |
| ["sum", "random", "reduce", "basesize=256"] | 8.454 ms (5%) | | 155.08 KiB (1%) | 3010 |
| ["sum", "random", "reduce", "basesize=512"] | 8.256 ms (5%) | | 76.28 KiB (1%) | 1486 |
| ["sum", "uniform", "foldl"] | 15.598 ms (5%) | | | |
| ["sum", "uniform", "reduce", "basesize=128"] | 8.642 ms (5%) | | 313.34 KiB (1%) | 6068 |
| ["sum", "uniform", "reduce", "basesize=256"] | 8.512 ms (5%) | | 155.08 KiB (1%) | 3010 |
| ["sum", "uniform", "reduce", "basesize=512"] | 8.593 ms (5%) | | 76.23 KiB (1%) | 1483 |
| ["sum", "valley", "foldl"] | 16.645 ms (5%) | | | |
| ["sum", "valley", "reduce", "basesize=128"] | 8.816 ms (5%) | | 313.28 KiB (1%) | 6064 |
| ["sum", "valley", "reduce", "basesize=256"] | 9.174 ms (5%) | | 155.09 KiB (1%) | 3011 |
| ["sum", "valley", "reduce", "basesize=512"] | 9.172 ms (5%) | | 76.27 KiB (1%) | 1485 |
| ["words", "nthreads=1"] | 36.286 ms (5%) | 6.454 ms | 64.67 MiB (1%) | 2092676 |
| ["words", "nthreads=2"] | 22.023 ms (5%) | | 65.39 MiB (1%) | 2092830 |
| ["words", "nthreads=4"] | 23.155 ms (5%) | | 66.02 MiB (1%) | 2093122 |

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 03:07
  • Package commit: 745339
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["collect", "assoc", "basesize=1"] | 360.826 ms (5%) | 12.951 ms | 87.05 MiB (1%) | 1558015 |
| ["collect", "assoc", "basesize=1024"] | 224.143 ms (5%) | | 1.84 MiB (1%) | 1781 |
| ["collect", "assoc", "basesize=32"] | 224.459 ms (5%) | | 5.63 MiB (1%) | 52990 |
| ["collect", "seq"] | 446.135 ms (5%) | | 512.98 KiB (1%) | 22 |
| ["collect", "unordered", "basesize=1"] | 446.104 ms (5%) | | 29.15 MiB (1%) | 402308 |
| ["collect", "unordered", "basesize=1024"] | 322.189 ms (5%) | | 853.63 KiB (1%) | 7724 |
| ["collect", "unordered", "basesize=32"] | 254.463 ms (5%) | | 1.46 MiB (1%) | 15944 |
| ["findfirst", "n=1000", "foldl"] | 717.210 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 368.585 ms (5%) | | 563.47 KiB (1%) | 10193 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 363.241 ms (5%) | | 286.98 KiB (1%) | 5209 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 363.701 ms (5%) | | 149.11 KiB (1%) | 2712 |
| ["findfirst", "n=400", "foldl"] | 526.953 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 275.780 ms (5%) | | 1.02 MiB (1%) | 18877 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 273.818 ms (5%) | | 525.44 KiB (1%) | 9528 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 273.089 ms (5%) | | 267.03 KiB (1%) | 4867 |
| ["findfirst", "n=500", "foldl"] | 86.730 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 45.164 ms (5%) | | 157.02 KiB (1%) | 2830 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 44.455 ms (5%) | | 84.27 KiB (1%) | 1519 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 48.401 ms (5%) | | 48.06 KiB (1%) | 867 |
| ["overhead", "default"] | 175.103 μs (5%) | | 146.14 KiB (1%) | 2628 |
| ["overhead", "stoppable=false"] | 178.703 μs (5%) | | 146.14 KiB (1%) | 2628 |
| ["overhead", "stoppable=true"] | 331.006 μs (5%) | | 146.41 KiB (1%) | 2645 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 4.273 ms (5%) | | 732.06 KiB (1%) | 103 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 5.304 ms (5%) | | 2.07 MiB (1%) | 503 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 4.736 ms (5%) | | 1.43 MiB (1%) | 242 |
| ["parallel_histogram", "comm", "basesize=16384"] | 14.871 ms (5%) | | 1.22 MiB (1%) | 156 |
| ["parallel_histogram", "comm", "basesize=4096"] | 24.314 ms (5%) | | 1.06 MiB (1%) | 5758 |
| ["parallel_histogram", "comm", "basesize=8192"] | 19.242 ms (5%) | | 1.25 MiB (1%) | 2211 |
| ["parallel_histogram", "seq"] | 8.250 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["sum", "random", "foldl"] | 16.393 ms (5%) | | | |
| ["sum", "random", "reduce", "basesize=128"] | 8.723 ms (5%) | | 313.31 KiB (1%) | 6066 |
| ["sum", "random", "reduce", "basesize=256"] | 8.297 ms (5%) | | 155.05 KiB (1%) | 3008 |
| ["sum", "random", "reduce", "basesize=512"] | 7.880 ms (5%) | | 76.25 KiB (1%) | 1484 |
| ["sum", "uniform", "foldl"] | 15.611 ms (5%) | | | |
| ["sum", "uniform", "reduce", "basesize=128"] | 8.365 ms (5%) | | 313.34 KiB (1%) | 6068 |
| ["sum", "uniform", "reduce", "basesize=256"] | 8.517 ms (5%) | | 155.08 KiB (1%) | 3010 |
| ["sum", "uniform", "reduce", "basesize=512"] | 8.206 ms (5%) | | 76.25 KiB (1%) | 1484 |
| ["sum", "valley", "foldl"] | 17.034 ms (5%) | | | |
| ["sum", "valley", "reduce", "basesize=128"] | 8.891 ms (5%) | | 313.28 KiB (1%) | 6064 |
| ["sum", "valley", "reduce", "basesize=256"] | 8.590 ms (5%) | | 155.08 KiB (1%) | 3010 |
| ["sum", "valley", "reduce", "basesize=512"] | 8.543 ms (5%) | | 76.25 KiB (1%) | 1484 |
| ["words", "nthreads=1"] | 35.754 ms (5%) | 6.144 ms | 64.96 MiB (1%) | 2102056 |
| ["words", "nthreads=2"] | 20.988 ms (5%) | | 65.68 MiB (1%) | 2102210 |
| ["words", "nthreads=4"] | 22.155 ms (5%) | | 66.32 MiB (1%) | 2102499 |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["collect", "assoc"]
  • ["collect"]
  • ["collect", "unordered"]
  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["overhead"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["sum", "random"]
  • ["sum", "random", "reduce"]
  • ["sum", "uniform"]
  • ["sum", "uniform", "reduce"]
  • ["sum", "valley"]
  • ["sum", "valley", "reduce"]
  • ["words"]

Julia versioninfo

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      69138 s          0 s       2784 s      26064 s          0 s
       #2  2095 MHz      70727 s          0 s       2734 s      21850 s          0 s
       
  Memory: 6.764888763427734 GB (3473.38671875 MB free)
  Uptime: 1024.0 sec
  Load Avg:  1.779296875  1.6455078125  1.189453125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.127
BogoMIPS:            4190.25
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
Cpu Property Value
Brand Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Vendor :Intel
Architecture :Skylake
Model Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores 2 physical cores, 2 logical cores (on executing CPU)
No Hyperthreading detected
Clock Frequencies Not supported by CPU
Data Cache Level 1:3 : (32, 1024, 36608) kbytes
64 byte cache line size
Address Size 48 bits virtual, 44 bits physical
SIMD 512 bit = 64 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via rdtsc
TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring Performance Monitoring Counters (PMC) are not supported
Hypervisor Yes, Microsoft

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 15 May 2020 - 03:17
    • Baseline: 15 May 2020 - 03:21
  • Package commits:
    • Target: 8b7319
    • Baseline: 745339
  • Julia commits:
    • Target: 381693
    • Baseline: 381693
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
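The marking rule described above can be sketched in a few lines. This is a minimal illustration, not the benchmarking tool's own code: the function name `judge_ratio` is hypothetical, and it assumes a result counts as significant only when the target/baseline ratio differs from 1.0 by more than the tolerance shown in parentheses. The sample timings are the `["groupby", "sum", "xf-with-init"]` times (in μs) from the Target and Baseline result tables of this report.

```python
def judge_ratio(target, baseline, tolerance):
    """Return (ratio, verdict) for a target/baseline pair.

    Assumed rule: the ratio must leave the band 1.0 +/- tolerance
    before it is reported as a regression or an improvement.
    """
    ratio = target / baseline
    if ratio - tolerance > 1.0:
        verdict = "regression"
    elif ratio + tolerance < 1.0:
        verdict = "improvement"
    else:
        verdict = "invariant"
    return ratio, verdict


# Target 214.002 μs vs. baseline 271.702 μs, with a 5% time tolerance.
ratio, verdict = judge_ratio(214.002, 271.702, 0.05)
print(f"{ratio:.2f} {verdict}")  # 0.79 improvement
```

Under this assumed rule, a ratio such as 1.04 with a 5% tolerance stays inside the band and would be treated as invariant, so it carries no mark.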

ID | time ratio | memory ratio
["cat", "xf"] | 1.08 (5%) ❌ | 1.00 (1%)
["collect", "filter-missing"] | 1.09 (5%) ❌ | 1.00 (1%)
["dot", "rf"] | 0.95 (5%) ✅ | 1.00 (1%)
["filter_map_map!", "man"] | 0.91 (5%) ✅ | 1.00 (1%)
["filter_map_map!", "xf"] | 1.15 (5%) ❌ | 1.00 (1%)
["filter_map_reduce", "man"] | 1.08 (5%) ❌ | 1.00 (1%)
["filter_map_reduce", "xf"] | 1.08 (5%) ❌ | 1.00 (1%)
["gemm", "mul", "linalg", "32"] | 0.90 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "linalg", "8"] | 0.90 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "man", "false", "32"] | 1.16 (5%) ❌ | 1.00 (1%)
["gemm", "mul", "man", "false", "8"] | 0.91 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "man", "ivdep", "32"] | 1.06 (5%) ❌ | 1.00 (1%)
["gemm", "mul", "man", "ivdep", "8"] | 1.25 (5%) ❌ | 1.00 (1%)
["gemm", "mul", "man", "true", "256"] | 0.91 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "man", "true", "8"] | 0.89 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "xf", "false", "256"] | 0.91 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "xf", "false", "32"] | 1.07 (5%) ❌ | 1.00 (1%)
["gemm", "mul", "xf", "false", "8"] | 0.84 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "xf", "ivdep", "32"] | 0.93 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "xf", "ivdep", "8"] | 0.86 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "xf", "true", "32"] | 0.95 (5%) ✅ | 1.00 (1%)
["gemm", "mul", "xf", "true", "8"] | 1.05 (5%) ❌ | 1.00 (1%)
["groupby", "sum", "xf-with-init"] | 0.79 (5%) ✅ | 66.29 (1%) ❌
["groupby", "sum", "xf-without-init"] | 0.76 (5%) ✅ | 66.29 (1%) ❌
["missing_argmax", "man"] | 0.92 (5%) ✅ | 1.00 (1%)
["missing_argmax", "xf"] | 0.95 (5%) ✅ | 1.00 (1%)
["missing_dot", "rf_nota"] | 1.05 (5%) ❌ | 1.00 (1%)
["missing_dot", "xf_nota"] | 1.41 (5%) ❌ | 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["findall"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["groupby", "sum"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Target

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      17802 s          0 s       1288 s      51574 s          0 s
       #2  2095 MHz      39968 s          0 s       1508 s      29693 s          0 s
       
  Memory: 6.764888763427734 GB (3288.72265625 MB free)
  Uptime: 726.0 sec
  Load Avg:  1.04150390625  1.0087890625  0.6435546875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      30192 s          0 s       1407 s      63219 s          0 s
       #2  2095 MHz      51664 s          0 s       1709 s      41989 s          0 s
       
  Memory: 6.764888763427734 GB (3371.8828125 MB free)
  Uptime: 969.0 sec
  Load Avg:  1.05029296875  1.0419921875  0.751953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 3:17
  • Package commit: 8b7319
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
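As a rough illustration of how an ID of the form [parent_group, child_group, ..., key] addresses one entry in a nested suite, the sketch below walks a plain nested dictionary. The dictionary `suite` and the helper `lookup` are hypothetical stand-ins, not the actual benchmark suite object; the stored numbers are the `["groupby", "sum", ...]` times (in μs) from this table.

```python
from functools import reduce

# Hypothetical stand-in for a nested benchmark suite (times in μs).
suite = {
    "groupby": {
        "sum": {
            "sac": 302.003,
            "xf-with-init": 214.002,
            "xf-without-init": 208.502,
        }
    }
}


def lookup(groups, id_path):
    """Follow [parent_group, child_group, ..., key] one level at a time."""
    return reduce(lambda group, name: group[name], id_path, groups)


print(lookup(suite, ["groupby", "sum", "xf-with-init"]))  # 214.002
```

A shorter path such as `["groupby", "sum"]` returns the whole child group rather than a single benchmark.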

ID time GC time memory allocations
["cat", "base"] 1.933 μs (5%)
["cat", "xf"] 1.960 μs (5%)
["collect", "filter-missing"] 113.102 μs (5%) 33.03 KiB (1%) 19
["collect", "identity-float"] 80.602 μs (5%) 256.89 KiB (1%) 19
["collect", "identity-union"] 389.408 μs (5%) 285.69 KiB (1%) 6671
["dot", "blas"] 1.440 μs (5%)
["dot", "man"] 1.350 μs (5%)
["dot", "rf"] 2.433 μs (5%)
["dot", "xf"] 2.567 μs (5%)
["filter_map_map!", "man"] 59.801 μs (5%)
["filter_map_map!", "xf"] 75.201 μs (5%) 144 bytes (1%) 8
["filter_map_reduce", "man"] 276.105 μs (5%)
["filter_map_reduce", "xf"] 276.105 μs (5%)
["findall", "base"] 980.718 μs (5%) 2.00 MiB (1%) 21
["findall", "xf-array"] 770.214 μs (5%) 3.05 MiB (1%) 100014
["findall", "xf-iter"] 973.517 μs (5%) 2.00 MiB (1%) 28
["gemm", "fusedmul", "blas", "16"] 5.139 ms (5%)
["gemm", "fusedmul", "blas", "2"] 3.787 ms (5%)
["gemm", "fusedmul", "blas", "32"] 7.161 ms (5%)
["gemm", "fusedmul", "blas", "8"] 4.054 ms (5%)
["gemm", "fusedmul", "xf", "16"] 4.183 ms (5%) 160 bytes (1%) 6
["gemm", "fusedmul", "xf", "2"] 502.905 μs (5%) 160 bytes (1%) 6
["gemm", "fusedmul", "xf", "32"] 8.326 ms (5%) 160 bytes (1%) 6
["gemm", "fusedmul", "xf", "8"] 2.041 ms (5%) 160 bytes (1%) 6
["gemm", "mul", "linalg", "256"] 993.415 μs (5%)
["gemm", "mul", "linalg", "32"] 3.325 μs (5%)
["gemm", "mul", "linalg", "8"] 270.291 ns (5%)
["gemm", "mul", "man", "false", "256"] 1.851 ms (5%)
["gemm", "mul", "man", "false", "32"] 5.550 μs (5%)
["gemm", "mul", "man", "false", "8"] 456.350 ns (5%)
["gemm", "mul", "man", "ivdep", "256"] 1.797 ms (5%)
["gemm", "mul", "man", "ivdep", "32"] 4.543 μs (5%)
["gemm", "mul", "man", "ivdep", "8"] 500.534 ns (5%)
["gemm", "mul", "man", "true", "256"] 1.824 ms (5%)
["gemm", "mul", "man", "true", "32"] 5.500 μs (5%)
["gemm", "mul", "man", "true", "8"] 443.595 ns (5%)
["gemm", "mul", "xf", "false", "256"] 1.719 ms (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "false", "32"] 5.150 μs (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "false", "8"] 419.802 ns (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "ivdep", "256"] 1.786 ms (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "ivdep", "32"] 4.086 μs (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "ivdep", "8"] 431.663 ns (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "true", "256"] 1.801 ms (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "true", "32"] 5.217 μs (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "true", "8"] 421.325 ns (5%) 48 bytes (1%) 2
["groupby", "sum", "sac"] 302.003 μs (5%) 313.14 KiB (1%) 10007
["groupby", "sum", "xf-with-init"] 214.002 μs (5%) 157.44 KiB (1%) 10008
["groupby", "sum", "xf-without-init"] 208.502 μs (5%) 157.44 KiB (1%) 10008
["missing_argmax", "man"] 2.975 μs (5%) 32 bytes (1%) 1
["missing_argmax", "rf"] 2.845 μs (5%) 32 bytes (1%) 1
["missing_argmax", "xf"] 2.822 μs (5%) 32 bytes (1%) 1
["missing_dot", "equiv"] 1.750 μs (5%) 16 bytes (1%) 1
["missing_dot", "man"] 1.530 μs (5%) 16 bytes (1%) 1
["missing_dot", "naive"] 5.283 μs (5%) 16 bytes (1%) 1
["missing_dot", "rf"] 1.540 μs (5%) 16 bytes (1%) 1
["missing_dot", "rf_nota"] 1.670 μs (5%) 16 bytes (1%) 1
["missing_dot", "xf"] 227.704 μs (5%) 72.02 KiB (1%) 3738
["missing_dot", "xf_nota"] 309.205 μs (5%) 71.89 KiB (1%) 3731
["partition_by", "man"] 2.331 ms (5%) 352 bytes (1%) 4
["partition_by", "xf"] 2.245 ms (5%) 576 bytes (1%) 7

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["findall"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["groupby", "sum"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      17802 s          0 s       1288 s      51574 s          0 s
       #2  2095 MHz      39968 s          0 s       1508 s      29693 s          0 s
       
  Memory: 6.764888763427734 GB (3288.72265625 MB free)
  Uptime: 726.0 sec
  Load Avg:  1.04150390625  1.0087890625  0.6435546875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 3:21
  • Package commit: 745339
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
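The noise tolerance can be read as an interval around each reported value. The sketch below is an illustration under that reading only; the helper name `tolerance_interval` is hypothetical. It uses the baseline time of `["groupby", "sum", "xf-with-init"]` from this table (271.702 μs, 5%) and checks whether the corresponding target time (214.002 μs) falls inside the interval.

```python
def tolerance_interval(reported, tolerance):
    """Return (low, high) bounds of reported +/- reported * tolerance."""
    half_width = reported * tolerance
    return reported - half_width, reported + half_width


# Baseline: 271.702 μs with a 5% time tolerance.
low, high = tolerance_interval(271.702, 0.05)
print(f"{low:.3f} to {high:.3f}")  # 258.117 to 285.287

# The target time lies outside the baseline's noise interval.
print(low <= 214.002 <= high)  # False
```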

ID time GC time memory allocations
["cat", "base"] 1.933 μs (5%)
["cat", "xf"] 1.810 μs (5%)
["collect", "filter-missing"] 104.201 μs (5%) 33.03 KiB (1%) 19
["collect", "identity-float"] 81.300 μs (5%) 256.89 KiB (1%) 19
["collect", "identity-union"] 395.803 μs (5%) 285.73 KiB (1%) 6672
["dot", "blas"] 1.430 μs (5%)
["dot", "man"] 1.400 μs (5%)
["dot", "rf"] 2.567 μs (5%)
["dot", "xf"] 2.567 μs (5%)
["filter_map_map!", "man"] 65.601 μs (5%)
["filter_map_map!", "xf"] 65.201 μs (5%) 144 bytes (1%) 8
["filter_map_reduce", "man"] 255.902 μs (5%)
["filter_map_reduce", "xf"] 256.102 μs (5%)
["findall", "base"] 954.208 μs (5%) 2.00 MiB (1%) 21
["findall", "xf-array"] 774.506 μs (5%) 3.05 MiB (1%) 100014
["findall", "xf-iter"] 997.308 μs (5%) 2.00 MiB (1%) 28
["gemm", "fusedmul", "blas", "16"] 4.963 ms (5%)
["gemm", "fusedmul", "blas", "2"] 3.714 ms (5%)
["gemm", "fusedmul", "blas", "32"] 7.075 ms (5%)
["gemm", "fusedmul", "blas", "8"] 4.008 ms (5%)
["gemm", "fusedmul", "xf", "16"] 4.087 ms (5%) 160 bytes (1%) 6
["gemm", "fusedmul", "xf", "2"] 523.004 μs (5%) 160 bytes (1%) 6
["gemm", "fusedmul", "xf", "32"] 8.279 ms (5%) 160 bytes (1%) 6
["gemm", "fusedmul", "xf", "8"] 2.053 ms (5%) 160 bytes (1%) 6
["gemm", "mul", "linalg", "256"] 987.907 μs (5%)
["gemm", "mul", "linalg", "32"] 3.700 μs (5%)
["gemm", "mul", "linalg", "8"] 300.000 ns (5%)
["gemm", "mul", "man", "false", "256"] 1.883 ms (5%)
["gemm", "mul", "man", "false", "32"] 4.800 μs (5%)
["gemm", "mul", "man", "false", "8"] 500.000 ns (5%)
["gemm", "mul", "man", "ivdep", "256"] 1.839 ms (5%)
["gemm", "mul", "man", "ivdep", "32"] 4.300 μs (5%)
["gemm", "mul", "man", "ivdep", "8"] 400.000 ns (5%)
["gemm", "mul", "man", "true", "256"] 2.008 ms (5%)
["gemm", "mul", "man", "true", "32"] 5.300 μs (5%)
["gemm", "mul", "man", "true", "8"] 500.000 ns (5%)
["gemm", "mul", "xf", "false", "256"] 1.894 ms (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "false", "32"] 4.800 μs (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "false", "8"] 500.000 ns (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "ivdep", "256"] 1.852 ms (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "ivdep", "32"] 4.400 μs (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "ivdep", "8"] 500.000 ns (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "true", "256"] 1.871 ms (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "true", "32"] 5.500 μs (5%) 48 bytes (1%) 2
["gemm", "mul", "xf", "true", "8"] 400.000 ns (5%) 48 bytes (1%) 2
["groupby", "sum", "sac"] 300.602 μs (5%) 313.14 KiB (1%) 10007
["groupby", "sum", "xf-with-init"] 271.702 μs (5%) 2.38 KiB (1%) 16
["groupby", "sum", "xf-without-init"] 272.802 μs (5%) 2.38 KiB (1%) 16
["missing_argmax", "man"] 3.225 μs (5%) 32 bytes (1%) 1
["missing_argmax", "rf"] 2.933 μs (5%) 32 bytes (1%) 1
["missing_argmax", "xf"] 2.978 μs (5%) 32 bytes (1%) 1
["missing_dot", "equiv"] 1.720 μs (5%) 16 bytes (1%) 1
["missing_dot", "man"] 1.510 μs (5%) 16 bytes (1%) 1
["missing_dot", "naive"] 5.283 μs (5%) 16 bytes (1%) 1
["missing_dot", "rf"] 1.620 μs (5%) 16 bytes (1%) 1
["missing_dot", "rf_nota"] 1.590 μs (5%) 16 bytes (1%) 1
["missing_dot", "xf"] 233.802 μs (5%) 72.25 KiB (1%) 3748
["missing_dot", "xf_nota"] 219.301 μs (5%) 72.09 KiB (1%) 3740
["partition_by", "man"] 2.297 ms (5%) 352 bytes (1%) 4
["partition_by", "xf"] 2.239 ms (5%) 576 bytes (1%) 7

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["findall"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["groupby", "sum"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      30192 s          0 s       1407 s      63219 s          0 s
       #2  2095 MHz      51664 s          0 s       1709 s      41989 s          0 s
       
  Memory: 6.764888763427734 GB (3371.8828125 MB free)
  Uptime: 969.0 sec
  Load Avg:  1.05029296875  1.0419921875  0.751953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.079
BogoMIPS:            4190.15
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
Cpu Property Value
Brand Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Vendor :Intel
Architecture :Skylake
Model Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores 2 physical cores, 2 logical cores (on executing CPU)
No Hyperthreading detected
Clock Frequencies Not supported by CPU
Data Cache Level 1:3 : (32, 1024, 36608) kbytes
64 byte cache line size
Address Size 48 bits virtual, 44 bits physical
SIMD 512 bit = 64 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via rdtsc
TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring Performance Monitoring Counters (PMC) are not supported
Hypervisor Yes, Microsoft

@github-actions
Contributor

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 15 May 2020 - 03:18
    • Baseline: 15 May 2020 - 03:23
  • Package commits:
    • Target: 8b7319
    • Baseline: 745339
  • Julia commits:
    • Target: 381693
    • Baseline: 381693
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: JULIA_NUM_THREADS => 2
    • Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
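The ratios in the table below can be reproduced from the Target and Baseline result tables once both values are expressed in the same unit. The following sketch is only an illustration: the unit tables and the helpers `parse_quantity` and `ratio` are hypothetical, and the inputs are the rounded figures printed in this report, so the results may differ slightly from ratios computed on the raw measurements.

```python
# Conversion factors to seconds and bytes for the units used in the report.
TIME_UNITS = {"ns": 1e-9, "μs": 1e-6, "ms": 1e-3, "s": 1.0}
MEMORY_UNITS = {"bytes": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def parse_quantity(text, units):
    """Convert a string such as '990.83 KiB' to a number in base units."""
    value, unit = text.split()
    return float(value) * units[unit]


def ratio(target, baseline, units):
    """Target/baseline ratio after normalizing both quantities."""
    return parse_quantity(target, units) / parse_quantity(baseline, units)


# ["collect", "unordered", "basesize=1024"]: time, same unit on both sides.
print(f"{ratio('388.710 ms', '306.938 ms', TIME_UNITS):.2f}")  # 1.27

# ["parallel_histogram", "comm", "basesize=16384"]: memory, MiB vs. KiB.
print(f"{ratio('1.22 MiB', '990.83 KiB', MEMORY_UNITS):.2f}")  # 1.26
```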

ID | time ratio | memory ratio
["collect", "unordered", "basesize=1024"] | 1.27 (5%) ❌ | 1.14 (1%) ❌
["parallel_histogram", "assoc", "basesize=4096"] | 1.04 (5%) | 1.15 (1%) ❌
["parallel_histogram", "comm", "basesize=16384"] | 1.01 (5%) | 1.26 (1%) ❌
["parallel_histogram", "comm", "basesize=8192"] | 0.84 (5%) ✅ | 1.00 (1%)
["words", "nthreads=2"] | 0.95 (5%) ✅ | 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["collect", "assoc"]
  • ["collect"]
  • ["collect", "unordered"]
  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["overhead"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["sum", "random"]
  • ["sum", "random", "reduce"]
  • ["sum", "uniform"]
  • ["sum", "uniform", "reduce"]
  • ["sum", "valley"]
  • ["sum", "valley", "reduce"]
  • ["words"]

Julia versioninfo

Target

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      44738 s          0 s       2415 s      31008 s          0 s
       #2  2095 MHz      53682 s          0 s       2282 s      21840 s          0 s
       
  Memory: 6.764888763427734 GB (3510.68359375 MB free)
  Uptime: 798.0 sec
  Load Avg:  1.67529296875  1.48779296875  0.89111328125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      66442 s          0 s       2962 s      37623 s          0 s
       #2  2095 MHz      76714 s          0 s       2791 s      27095 s          0 s
       
  Memory: 6.764888763427734 GB (3497.83203125 MB free)
  Uptime: 1089.0 sec
  Load Avg:  1.83740234375  1.6591796875  1.12353515625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 3:18
  • Package commit: 8b7319
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["collect", "assoc", "basesize=1"] 383.095 ms (5%) 11.644 ms 87.05 MiB (1%) 1558010
["collect", "assoc", "basesize=1024"] 238.960 ms (5%) 1.84 MiB (1%) 1775
["collect", "assoc", "basesize=32"] 244.027 ms (5%) 5.63 MiB (1%) 52989
["collect", "seq"] 475.105 ms (5%) 512.98 KiB (1%) 22
["collect", "unordered", "basesize=1"] 481.217 ms (5%) 29.15 MiB (1%) 402466
["collect", "unordered", "basesize=1024"] 388.710 ms (5%) 928.31 KiB (1%) 12504
["collect", "unordered", "basesize=32"] 271.152 ms (5%) 1.47 MiB (1%) 16666
["findfirst", "n=1000", "foldl"] 735.508 ms (5%)
["findfirst", "n=1000", "reduce", "basesize=128"] 381.325 ms (5%) 563.98 KiB (1%) 10226
["findfirst", "n=1000", "reduce", "basesize=256"] 373.960 ms (5%) 287.16 KiB (1%) 5220
["findfirst", "n=1000", "reduce", "basesize=512"] 380.248 ms (5%) 149.13 KiB (1%) 2713
["findfirst", "n=400", "foldl"] 557.467 ms (5%)
["findfirst", "n=400", "reduce", "basesize=128"] 283.865 ms (5%) 1.02 MiB (1%) 18956
["findfirst", "n=400", "reduce", "basesize=256"] 283.251 ms (5%) 525.39 KiB (1%) 9525
["findfirst", "n=400", "reduce", "basesize=512"] 284.965 ms (5%) 267.05 KiB (1%) 4868
["findfirst", "n=500", "foldl"] 95.503 ms (5%)
["findfirst", "n=500", "reduce", "basesize=128"] 49.113 ms (5%) 157.16 KiB (1%) 2839
["findfirst", "n=500", "reduce", "basesize=256"] 48.641 ms (5%) 84.34 KiB (1%) 1524
["findfirst", "n=500", "reduce", "basesize=512"] 52.192 ms (5%) 48.09 KiB (1%) 869
["overhead", "default"] 198.710 μs (5%) 146.16 KiB (1%) 2629
["overhead", "stoppable=false"] 194.809 μs (5%) 146.16 KiB (1%) 2629
["overhead", "stoppable=true"] 329.615 μs (5%) 146.42 KiB (1%) 2646
["parallel_histogram", "assoc", "basesize=16384"] 5.146 ms (5%) 732.06 KiB (1%) 103
["parallel_histogram", "assoc", "basesize=4096"] 5.886 ms (5%) 2.07 MiB (1%) 503
["parallel_histogram", "assoc", "basesize=8192"] 5.437 ms (5%) 1.43 MiB (1%) 242
["parallel_histogram", "comm", "basesize=16384"] 15.908 ms (5%) 1.22 MiB (1%) 155
["parallel_histogram", "comm", "basesize=4096"] 24.767 ms (5%) 1.09 MiB (1%) 3865
["parallel_histogram", "comm", "basesize=8192"] 17.272 ms (5%) 1.23 MiB (1%) 961
["parallel_histogram", "seq"] 9.382 ms (5%) 364.63 KiB (1%) 25
["sum", "random", "foldl"] 18.383 ms (5%)
["sum", "random", "reduce", "basesize=128"] 9.709 ms (5%) 313.34 KiB (1%) 6068
["sum", "random", "reduce", "basesize=256"] 9.449 ms (5%) 155.09 KiB (1%) 3011
["sum", "random", "reduce", "basesize=512"] 9.351 ms (5%) 76.27 KiB (1%) 1485
["sum", "uniform", "foldl"] 18.112 ms (5%)
["sum", "uniform", "reduce", "basesize=128"] 9.803 ms (5%) 313.34 KiB (1%) 6068
["sum", "uniform", "reduce", "basesize=256"] 9.290 ms (5%) 155.08 KiB (1%) 3010
["sum", "uniform", "reduce", "basesize=512"] 9.177 ms (5%) 76.25 KiB (1%) 1484
["sum", "valley", "foldl"] 18.683 ms (5%)
["sum", "valley", "reduce", "basesize=128"] 9.955 ms (5%) 313.31 KiB (1%) 6066
["sum", "valley", "reduce", "basesize=256"] 9.585 ms (5%) 155.09 KiB (1%) 3011
["sum", "valley", "reduce", "basesize=512"] 9.473 ms (5%) 76.25 KiB (1%) 1484
["words", "nthreads=1"] 41.990 ms (5%) 7.415 ms 64.86 MiB (1%) 2098673
["words", "nthreads=2"] 21.880 ms (5%) 65.22 MiB (1%) 2098750
["words", "nthreads=4"] 23.123 ms (5%) 66.12 MiB (1%) 2099047

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["collect", "assoc"]
  • ["collect"]
  • ["collect", "unordered"]
  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["overhead"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["sum", "random"]
  • ["sum", "random", "reduce"]
  • ["sum", "uniform"]
  • ["sum", "uniform", "reduce"]
  • ["sum", "valley"]
  • ["sum", "valley", "reduce"]
  • ["words"]

Julia versioninfo

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      44738 s          0 s       2415 s      31008 s          0 s
       #2  2095 MHz      53682 s          0 s       2282 s      21840 s          0 s
       
  Memory: 6.764888763427734 GB (3510.68359375 MB free)
  Uptime: 798.0 sec
  Load Avg:  1.67529296875  1.48779296875  0.89111328125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 15 May 2020 - 3:23
  • Package commit: 745339
  • Julia commit: 381693
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|----|-----:|--------:|-------:|------------:|
| ["collect", "assoc", "basesize=1"] | 373.100 ms (5%) | 12.591 ms | 87.05 MiB (1%) | 1558023 |
| ["collect", "assoc", "basesize=1024"] | 239.584 ms (5%) | | 1.84 MiB (1%) | 1780 |
| ["collect", "assoc", "basesize=32"] | 244.110 ms (5%) | | 5.63 MiB (1%) | 52990 |
| ["collect", "seq"] | 471.833 ms (5%) | | 512.98 KiB (1%) | 22 |
| ["collect", "unordered", "basesize=1"] | 476.355 ms (5%) | | 29.15 MiB (1%) | 402846 |
| ["collect", "unordered", "basesize=1024"] | 306.938 ms (5%) | | 814.78 KiB (1%) | 5238 |
| ["collect", "unordered", "basesize=32"] | 269.828 ms (5%) | | 1.47 MiB (1%) | 16610 |
| ["findfirst", "n=1000", "foldl"] | 741.928 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 377.760 ms (5%) | | 563.89 KiB (1%) | 10220 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 378.382 ms (5%) | | 287.17 KiB (1%) | 5221 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 382.041 ms (5%) | | 149.17 KiB (1%) | 2716 |
| ["findfirst", "n=400", "foldl"] | 559.227 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 284.042 ms (5%) | | 1.02 MiB (1%) | 18930 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 281.920 ms (5%) | | 525.94 KiB (1%) | 9560 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 285.216 ms (5%) | | 267.13 KiB (1%) | 4873 |
| ["findfirst", "n=500", "foldl"] | 95.553 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 48.759 ms (5%) | | 157.17 KiB (1%) | 2840 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 48.701 ms (5%) | | 84.36 KiB (1%) | 1525 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 51.792 ms (5%) | | 48.09 KiB (1%) | 869 |
| ["overhead", "default"] | 193.817 μs (5%) | | 146.16 KiB (1%) | 2629 |
| ["overhead", "stoppable=false"] | 194.216 μs (5%) | | 146.16 KiB (1%) | 2629 |
| ["overhead", "stoppable=true"] | 338.028 μs (5%) | | 146.39 KiB (1%) | 2644 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 4.931 ms (5%) | | 732.06 KiB (1%) | 103 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 5.684 ms (5%) | | 1.80 MiB (1%) | 497 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 5.289 ms (5%) | | 1.43 MiB (1%) | 242 |
| ["parallel_histogram", "comm", "basesize=16384"] | 15.754 ms (5%) | | 990.83 KiB (1%) | 156 |
| ["parallel_histogram", "comm", "basesize=4096"] | 24.694 ms (5%) | | 1.10 MiB (1%) | 5242 |
| ["parallel_histogram", "comm", "basesize=8192"] | 20.682 ms (5%) | | 1.23 MiB (1%) | 1060 |
| ["parallel_histogram", "seq"] | 9.045 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["sum", "random", "foldl"] | 18.351 ms (5%) | | | |
| ["sum", "random", "reduce", "basesize=128"] | 9.531 ms (5%) | | 313.30 KiB (1%) | 6065 |
| ["sum", "random", "reduce", "basesize=256"] | 9.473 ms (5%) | | 155.08 KiB (1%) | 3010 |
| ["sum", "random", "reduce", "basesize=512"] | 9.406 ms (5%) | | 76.27 KiB (1%) | 1485 |
| ["sum", "uniform", "foldl"] | 17.900 ms (5%) | | | |
| ["sum", "uniform", "reduce", "basesize=128"] | 9.549 ms (5%) | | 313.36 KiB (1%) | 6069 |
| ["sum", "uniform", "reduce", "basesize=256"] | 9.454 ms (5%) | | 155.09 KiB (1%) | 3011 |
| ["sum", "uniform", "reduce", "basesize=512"] | 9.088 ms (5%) | | 76.23 KiB (1%) | 1483 |
| ["sum", "valley", "foldl"] | 18.505 ms (5%) | | | |
| ["sum", "valley", "reduce", "basesize=128"] | 9.828 ms (5%) | | 313.25 KiB (1%) | 6062 |
| ["sum", "valley", "reduce", "basesize=256"] | 9.702 ms (5%) | | 155.09 KiB (1%) | 3011 |
| ["sum", "valley", "reduce", "basesize=512"] | 9.503 ms (5%) | | 76.23 KiB (1%) | 1483 |
| ["words", "nthreads=1"] | 41.159 ms (5%) | 7.004 ms | 64.57 MiB (1%) | 2089918 |
| ["words", "nthreads=2"] | 23.069 ms (5%) | | 65.29 MiB (1%) | 2090073 |
| ["words", "nthreads=4"] | 23.428 ms (5%) | | 65.74 MiB (1%) | 2090222 |
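
As a worked example of how to read the noise tolerances, the sketch below computes the interval implied by a reported value and its percentage. It uses the first row of the table (373.100 ms with a 5% tolerance). `tolerance_bounds` is a hypothetical helper written for this illustration, not a BenchmarkTools function.

```julia
# Illustrative helper (an assumption, not a library API): the interval within
# which the "true" value is expected to fall, given a relative noise tolerance.
tolerance_bounds(value, tol) = (value * (1 - tol), value * (1 + tol))

# Reported time for ["collect", "assoc", "basesize=1"]: 373.100 ms (5%)
lower, upper = tolerance_bounds(373.100, 0.05)
println("expected range: ", round(lower; digits = 3), " ms to ", round(upper; digits = 3), " ms")
```

For this row the range works out to roughly 354.4 ms to 391.8 ms, so two runs whose times differ by less than that margin should not be read as a real change.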

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["collect", "assoc"]
  • ["collect"]
  • ["collect", "unordered"]
  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["overhead"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["sum", "random"]
  • ["sum", "random", "reduce"]
  • ["sum", "uniform"]
  • ["sum", "uniform", "reduce"]
  • ["sum", "valley"]
  • ["sum", "valley", "reduce"]
  • ["words"]

Julia versioninfo

Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1020-azure #21~18.04.1-Ubuntu SMP Wed Apr 15 09:35:56 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      66442 s          0 s       2962 s      37623 s          0 s
       #2  2095 MHz      76714 s          0 s       2791 s      27095 s          0 s
       
  Memory: 6.764888763427734 GB (3497.83203125 MB free)
  Uptime: 1089.0 sec
  Load Avg:  1.83740234375  1.6591796875  1.12353515625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.196
BogoMIPS:            4190.39
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
| Cpu Property | Value |
|:-------------|:------|
| Brand | Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz |
| Vendor | :Intel |
| Architecture | :Skylake |
| Model | Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 1024, 36608) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 512 bit = 64 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@mergify mergify bot merged commit 4dc04f2 into master May 15, 2020
@mergify mergify bot deleted the groupby branch May 15, 2020 03:23