
Cut down memory requirements for same-split reshape where possible #873

Closed

Conversation

@ClaudiaComito (Contributor) commented Sep 24, 2021

Description

When reshaping distributed DNDarrays:

  • if new_split is the same as the original split, and
  • if the current distribution (lshapes) allows it,
    then reshape each process-local tensor via PyTorch, stitch the locally reshaped tensors together along the split axis, and balance the result.

This bypasses the memory-intensive general implementation of the distributed reshape in many cases; a minimal sketch of this fast path is shown below.
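
The following sketch is only an illustration of the idea, not the actual implementation in this PR: the helper name reshape_same_split, the simplifying assumptions on the shapes, and the fallback to ht.reshape are all made up for the example.

import heat as ht


def reshape_same_split(x, new_shape, new_split):
    # Fall back to Heat's general (memory-intensive) reshape when the fast
    # path does not apply.
    if x.split is None or new_split != x.split:
        return ht.reshape(x, new_shape, new_split=new_split)

    # Simplifying assumptions for this sketch: the dimensions before the split
    # axis are unchanged and the size along the split axis grows by an integer
    # factor, so each process's chunk maps onto a contiguous block of the
    # result and can be reshaped independently.
    factor = new_shape[new_split] // x.shape[x.split]
    local_shape = list(new_shape)
    local_shape[new_split] = x.lshape[x.split] * factor

    # Reshape only the process-local torch tensor: no communication and no
    # copy of the global array.
    local_reshaped = x.larray.reshape(local_shape)

    # Stitch the local chunks into a distributed DNDarray along the split
    # axis and balance it across processes.
    out = ht.array(local_reshaped, is_split=new_split)
    out.balance_()
    return out

For the array used in the example below, reshape_same_split(x, (10, 10000), 1) should give the same result as x.reshape(10, -1) while touching only each process's local chunk.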

Example:

import time
import tracemalloc

import torch
import heat as ht

# track host memory allocations on each process
tracemalloc.start()
t_x = torch.arange(100000).reshape(10, -1, 10)
x = ht.array(t_x, split=1)
current, peak = tracemalloc.get_traced_memory()
print(f"BEFORE RESHAPE: Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")

# baseline: process-local torch reshape
start = time.perf_counter()
t_x = t_x.reshape(10, -1)
end = time.perf_counter()
current, peak = tracemalloc.get_traced_memory()
print(f"after torch.reshape: Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")
print("torch.reshape takes ", (end - start), " seconds.")

# distributed reshape, split axis unchanged
start = time.perf_counter()
x = x.reshape(10, -1)
end = time.perf_counter()
current, peak = tracemalloc.get_traced_memory()
print(f"after ht.reshape: Current memory usage is {current / 10**6}MB; Peak was {peak / 10**6}MB")
print("ht.reshape takes ", (end - start), " seconds.")

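The script was run on 2 MPI processes; with Open MPI, a command along the lines of mpirun --tag-output -np 2 python reshape_bench.py (the script name here is made up) produces the per-rank [1,0]<stdout>:/[1,1]<stdout>: prefixes seen below.
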
Results on master, 2 processes:

[1,0]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003077MB <---
[1,0]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,0]<stdout>:torch.reshape takes  2.068399999988202e-05  seconds.
[1,1]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003105MB <---
[1,1]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,1]<stdout>:torch.reshape takes  2.2049000000023966e-05  seconds.
[1,1]<stdout>:after ht.reshape: Current memory usage is 0.372806MB; Peak was 0.383006MB <---
[1,1]<stdout>:ht.reshape takes  0.020710520999999815  seconds.
[1,0]<stdout>:after ht.reshape: Current memory usage is 0.372689MB; Peak was 0.382889MB <---
[1,0]<stdout>:ht.reshape takes  0.02076237900000022  seconds.

Results on enhancement/distributed_reshape_same_split, 2 processes:

[1,0]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003077MB  <---
[1,0]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,0]<stdout>:torch.reshape takes  1.6194000000080422e-05  seconds.
[1,1]<stdout>:BEFORE RESHAPE: Current memory usage is 0.002501MB; Peak was 0.003105MB <---
[1,1]<stdout>:after torch.reshape: Current memory usage is 0.003669MB; Peak was 0.004101MB <---
[1,1]<stdout>:torch.reshape takes  1.3567999999963831e-05  seconds.
[1,0]<stdout>:after ht.reshape: Current memory usage is 0.010736MB; Peak was 0.012752MB <---
[1,0]<stdout>:ht.reshape takes  0.015495102000000038  seconds.
[1,1]<stdout>:after ht.reshape: Current memory usage is 0.010736MB; Peak was 0.01278MB <---
[1,1]<stdout>:ht.reshape takes  0.01551089800000005  seconds.

Issue/s addressed: #874

Changes proposed:

  • see above

Type of change

  • New feature (non-breaking change which adds functionality)

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

No.

@ClaudiaComito ClaudiaComito added the enhancement New feature or request label Sep 24, 2021
@ClaudiaComito ClaudiaComito added this to the 1.2.x milestone Sep 24, 2021
@ClaudiaComito ClaudiaComito linked an issue Sep 27, 2021 that may be closed by this pull request
@coquelin77 (Member) commented
Failures may be solved by #857; we would need to merge to be certain.

@codecov bot commented Jan 20, 2022

Codecov Report

Merging #873 (0f4fd60) into master (293d873) will decrease coverage by 7.63%.
The diff coverage is 60.00%.


@@            Coverage Diff             @@
##           master     #873      +/-   ##
==========================================
- Coverage   95.50%   87.87%   -7.64%     
==========================================
  Files          64       64              
  Lines        9579     9588       +9     
==========================================
- Hits         9148     8425     -723     
- Misses        431     1163     +732     
Flag Coverage Δ
gpu 87.87% <60.00%> (-6.77%) ⬇️
unit ?

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
heat/core/manipulations.py 92.51% <60.00%> (-6.44%) ⬇️
heat/optim/dp_optimizer.py 13.59% <0.00%> (-82.49%) ⬇️
heat/optim/utils.py 38.15% <0.00%> (-61.85%) ⬇️
heat/nn/data_parallel.py 75.17% <0.00%> (-19.32%) ⬇️
heat/spatial/distance.py 80.90% <0.00%> (-15.08%) ⬇️
heat/core/relational.py 91.04% <0.00%> (-8.96%) ⬇️
heat/core/linalg/qr.py 91.25% <0.00%> (-8.75%) ⬇️
heat/utils/data/partial_dataset.py 87.17% <0.00%> (-7.18%) ⬇️
heat/cluster/spectral.py 88.57% <0.00%> (-5.72%) ⬇️
... and 12 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 293d873...0f4fd60. Read the comment docs.

@ghost commented Apr 27, 2022

CodeSee Review Map: review these changes using an interactive CodeSee Map.

@ClaudiaComito ClaudiaComito changed the base branch from main to release/1.2.x April 27, 2022 13:11
@ClaudiaComito ClaudiaComito self-assigned this Feb 13, 2023
@ClaudiaComito (Contributor, Author) commented
Superseded by #1125.

@mtar mtar deleted the enhancement/distributed_reshape_same_split branch March 23, 2023 12:30
Labels: enhancement (New feature or request)
Projects: none yet
Linked issues: reshape memory requirements
3 participants