benchmark regressions #9914
For plots: these were never in the sub-second range.
Makes sense, so we only need to investigate
Added
It looks like the issue for both fetch and push is with index collection - specifically, the problem is that sqltrie traverse is slow. This is a local fetch of the cats_dogs dataset (25k files) in main; using 3.10.0 (without the index collection), the fetch only takes 10.3s with the profiler enabled:
Caching the
So fixing this will require revisiting how the sqltrie traversal queries work. Or maybe we need to consider only using sqltrie for dump/load and then doing all of our index operations in memory. cc @efiop
Actually, it looks like we can probably just get away with building an in-memory pygtrie for sqltrie views, since views are the only place where we heavily use traverse right now.
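A minimal sketch of that direction, assuming the view can be dumped once as (key, value) pairs; `materialize_view` and `view_items` are illustrative names, not sqltrie's actual API:

```python
import pygtrie

def materialize_view(view_items):
    # Build an in-memory pygtrie from a one-time dump of the sqltrie view,
    # so repeated traversals never touch SQLite again.
    trie = pygtrie.Trie()
    for key, value in view_items:  # key is a tuple of path parts, e.g. ("data", "cats", "1.jpg")
        trie[key] = value
    return trie

# Traversal then happens entirely in memory:
# for key, value in materialize_view(view_items).iteritems(prefix=("data",)):
#     ...
```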
I think closing was an accident @efiop?
Yeah, sorry about that. We only have
Does iterative/sqltrie#24 fix all pull regressions? Also, I opened iterative/dvc-bench#478 since I noticed the benchmarks are failing.
Yes, all the pull regressions should be resolved after the sqltrie changes.
Switched pull -> fetch in #10117 and there is no huge regression for local:
but there is for checkout, which we already knew.
Ok, so for the record: with small files (esp. the ones in mnist) the overhead of callback creation in https://github.com/iterative/dvc-objects/blob/231e5273cf929cea2424c9d6e885f72bb5d90750/src/dvc_objects/fs/utils.py#L167 is just too wasteful and unnecessary. There are a few other things we can do there as well (like using a file-sized buffer for small files, using a larger buffer for large files, or even just using one of shutil's copy functions, which can leverage things like sendfile if needed), but even that alone speeds up link creation for me by something like 50x. reflink/hardlink/symlink bypass that path and thus don't suffer from the overhead, so our goal here is for
EDIT: Merged iterative/dvc-objects#229 and triggered https://github.com/iterative/dvc-bench/actions/runs/7040080174 , let's see...
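A rough sketch of that idea, assuming an fsspec-style callback with `set_size`/`relative_update`; the threshold and function name below are illustrative, not the actual dvc-objects code:

```python
import os
import shutil

SMALL_COPY_THRESHOLD = 8 * 1024 * 1024  # illustrative cutoff, not the real value

def copy_file(src, dst, callback=None):
    size = os.path.getsize(src)
    if callback is None or size < SMALL_COPY_THRESHOLD:
        # shutil.copyfile takes fast platform paths (sendfile on Linux,
        # fcopyfile on macOS) and skips per-chunk callback overhead entirely.
        shutil.copyfile(src, dst)
        return
    # Large files: stream with a big buffer and report progress as we go.
    callback.set_size(size)
    buf_size = 4 * 1024 * 1024
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        while chunk := fsrc.read(buf_size):
            fdst.write(chunk)
            callback.relative_update(len(chunk))
```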
There is no reason to create a progress bar overhead for anything that takes < 2sec to copy on a modern machine. Related iterative/dvc#9914
What changed between 2.58.2 and 3.0.0? The callback was still there before that, no?
This again wastes a lot of time, as `set_size` might trigger a pbar refresh. Related #229 Related iterative/dvc#9914
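For illustration, one way to keep repeated `set_size` calls cheap, assuming a tqdm-backed bar; this wrapper is hypothetical, not the dvc-objects implementation:

```python
from tqdm import tqdm

class QuietSizeCallback:
    """Only refresh the bar when the reported size actually changes,
    so repeated set_size() calls do not force a redraw every time."""

    def __init__(self, bar: tqdm):
        self.bar = bar
        self._size = None

    def set_size(self, size):
        if size != self._size:  # skip the redraw when nothing changed
            self._size = size
            self.bar.total = size
            self.bar.refresh()

    def relative_update(self, inc=1):
        self.bar.update(inc)
```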
The main thing that changed is #9444. Before it, we were processing files one-by-one and passing them to
This is a bit early, but I've sent a few more fixes and now, at least on my Linux machine, it looks like we are not only back to 2.58.2 but actually faster too (~30sec vs ~37sec for mnist checkout). Waiting for proper confirmation in dvc-bench https://github.com/iterative/dvc-bench/actions/runs/7066555049/job/19239201268
Great work, @efiop. The only thing that I want to mention is that iterative/dvc-objects#231 is going to make it slower for users with large files, so we should benchmark on that kind of dataset to be sure. ThreadPoolExecutor has high overhead for small files, but that overhead is noise for larger files, which take a lot of time anyway. Parallelizing also helps SSDs, since they have a lot of bandwidth, and HDDs in planning their shortest route.
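To make the trade-off concrete, a sketch of size-based dispatch; the cutoff and function names are assumptions for illustration, not what dvc-objects actually does:

```python
import os
from concurrent.futures import ThreadPoolExecutor

SMALL_FILE_CUTOFF = 1024 * 1024  # 1 MiB, purely illustrative

def transfer_all(pairs, transfer_one, max_workers=8):
    """pairs: (src, dst) tuples; transfer_one copies/links a single file."""
    small, large = [], []
    for src, dst in pairs:
        (small if os.path.getsize(src) < SMALL_FILE_CUTOFF else large).append((src, dst))

    # Small files: thread-pool task overhead dominates, so run them inline.
    for src, dst in small:
        transfer_one(src, dst)

    # Large files: parallelism helps saturate SSD bandwidth (and lets an HDD
    # scheduler reorder requests), so keep the executor for these.
    if large:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            list(pool.map(lambda p: transfer_one(*p), large))
```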
Maybe it's time to write checkout in rust? 😉
Can we test on some dataset like https://www.kaggle.com/datasets/thedevastator/open-orca-augmented-flan-dataset? We don't need to put it into our regular benchmarks, but it would be good to test it out this time to see how these changes perform on a large file.
Sorry for the delay, I was AFK. Indeed, we should benchmark other datasets as well, no doubt about that. And we should perform well in both cases; mnist is a very good edge case for that. We can pick smarter strategies based on what we are dealing with. OpenOrca seems pretty good for testing a dataset that has one big-ish file, so we should definitely add it to the dvc-bench datasets so that we have it around at our disposal; I'll upload it later today 👍 . I was also looking into some famous/well-known video/sound/other catalogue-like datasets to test many (but not too many, maybe up to 10-100) big-ish files (so typical classification datasets won't do), to go a bit beyond that. Those videos have to be non-copyrighted ones, so youtube-based ones won't do. If you have ideas - let me know.
Thanks for adding @efiop! To be clear, I didn't mean we have to add it to dvc-bench, although I'm fine with doing so. Sometimes just doing a one-off test is fine if we have something particular to test.
Reminder: we have a gentree command that can randomly generate a dataset.
dvc-data gentree data 10 1Gb --depth 1
1.0G data
- ├── dir00
86M │ └── 0.txt
- ├── dir01
111M │ └── 0.txt
- ├── dir02
79M │ └── 0.txt
- ├── dir03
129M │ └── 0.txt
- ├── dir04
100M │ └── 0.txt
- ├── dir05
95M │ └── 0.txt
- ├── dir06
159M │ └── 0.txt
- ├── dir07
124M │ └── 0.txt
- ├── dir08
95M │ └── 0.txt
77M └── 0.txt
@dberenbaum It is very nice to have it in the toolset generally, but it also allows us to trigger
For the record, @skshetry and I did some optimizations on reflink creation as part of https://github.com/iterative/dql/pull/1007; it is approx 2-3 times faster now than it was before. One thing that is obvious from the benchmarks is that fetch/push-noop are kinda slow, but we are not caching the compacted/deduplicated list of files we get from going through all revisions just yet, so I'll get back to that after it is done.
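For context on what a reflink is: on Linux it is a single ioctl asking the filesystem to share extents copy-on-write. A bare-bones, Linux-only sketch with no fallback, not the dvc-objects implementation:

```python
import fcntl

FICLONE = 0x40049409  # ioctl number for a CoW clone on btrfs/XFS

def reflink(src: str, dst: str) -> None:
    # Clone src into dst without copying any data blocks; raises OSError
    # (e.g. EOPNOTSUPP/EXDEV) on filesystems that cannot do CoW clones.
    with open(src, "rb") as s, open(dst, "wb") as d:
        fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
```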
For the record, some benchmarks with different datasets:
- mnist (70K+ ~300B images): https://bench.dvc.org/schedule/7147613411_2.html
- xray (6K ~30KB images): https://bench.dvc.org/workflow_dispatch/7151367965_3.html
- used-cargs (1.4G single file)
Worth noting that these are not "real" benchmarks for older versions though, since those may be benefitting from updates to dvc-objects and other libs, right?
So overall, it looks like a great improvement, but we are trading off some performance on big files to get improvements on many small files. Is that right?
Any updates here? What about s3? Any idea why it doesn't show the same improvements as other clouds in the benchmarks?
Sorry, I was convinced I responded before, but probably forgot to post.
Yes, mostly recent versions, as very old ones will have more strict requirements.
Marginally in
No updates, haven't looked into this since last time 🙁 s3 is likely because of chunking configuration (similar to the improvements that @pmrowla made in adlfs in the past), but I haven't dug into it yet.
@pmrowla Any thoughts on the s3 performance? I'd like to look into that either as part of this issue or a new one. Is there anything else left before closing this issue?
@dberenbaum this can be closed, we have a separate issue for the s3 performance: #9893
Investigate regressions in https://github.com/iterative/dvc-bench for:
- plots
- exp show