Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed Up JSON filtering on large objects (AzDO) #3702

Closed
Eldarrin opened this issue Sep 28, 2022 · 8 comments
Closed

Speed Up JSON filtering on large objects (AzDO) #3702

Eldarrin opened this issue Sep 28, 2022 · 8 comments
Labels
feature-request All issues for new features that have not been committed to needs-discussion

Comments

@Eldarrin
Copy link
Contributor

Proposal

Improve the speed of json object searching by using filtering as opposed to loop iteration. Particularly around:

https://github.com/kedacore/keda/blob/main/pkg/scalers/azure_pipelines_scaler.go#L260

for _, value := range jobs { <stuff>

Use-Case

Where queue lengths can consistantly grow, this causes keda scalers to slow down and may cause an event to happen slower than liked. This means that in this use-case agents would be more responsive to spin up times

Anything else?

No response

@Eldarrin Eldarrin added feature-request All issues for new features that have not been committed to needs-discussion labels Sep 28, 2022
@tomkerkhove tomkerkhove moved this to Proposed in Roadmap - KEDA Core Sep 28, 2022
@Eldarrin
Copy link
Contributor Author

Eldarrin commented Oct 2, 2022

Ok, so written some performance tests. durations are: original, using struct json, pre-stripping the jobs+struct. 3 models, parent template, demand style, base style.

TL;DR; pre-stripping and structing combined gives the best performance on large queues with templates. On small queues there is no effect (sub 50 µs),

Data @ 1000 length finished job queues

PARENT MODEL
loadvars: dotnet60-keda-template 1000
elapsed: 18.0186ms 15.5343ms 15.509ms 2.4843ms 2.5096ms
elapsed: 17.573ms 15.5113ms 15.1693ms 2.0617ms 2.4037ms
elapsed: 16.55ms 16.0125ms 16.6713ms 537.5µs -121.3µs
elapsed: 17.615ms 16.1678ms 15.4581ms 1.4472ms 2.1569ms
elapsed: 16.4567ms 15.1234ms 14.9863ms 1.3333ms 1.4704ms
elapsed: 17.0224ms 15.5668ms 15.2566ms 1.4556ms 1.7658ms
elapsed: 17.0199ms 14.5121ms 15.5297ms 2.5078ms 1.4902ms
elapsed: 17.0255ms 14.5244ms 15.5076ms 2.5011ms 1.5179ms
elapsed: 16.5132ms 14.509ms 15.0376ms 2.0042ms 1.4756ms
elapsed: 16.0211ms 16.0176ms 14.5084ms 3.5µs 1.5127ms
avg improvement 1.63362ms 1.61815ms

DEMAND MODEL
loadvars: kubectl 1000
elapsed: 18.1213ms 15.4379ms 15.6236ms 2.6834ms 2.4977ms
elapsed: 15.5076ms 15.6823ms 16.0601ms -174.7µs -552.5µs
elapsed: 16.0193ms 16.3354ms 15.0896ms -316.1µs 929.7µs
elapsed: 16.4844ms 16.0103ms 16.599ms 474.1µs -114.6µs
elapsed: 16.1438ms 14.6167ms 16.5129ms 1.5271ms -369.1µs
elapsed: 17.5418ms 14.5102ms 16.0939ms 3.0316ms 1.4479ms
elapsed: 17.0101ms 15.5695ms 15.0164ms 1.4406ms 1.9937ms
elapsed: 18.0478ms 15.5407ms 16.1588ms 2.5071ms 1.889ms
elapsed: 17.09ms 14.6112ms 14.7172ms 2.4788ms 2.3728ms
elapsed: 17.0757ms 15.0215ms 14.5119ms 2.0542ms 2.5638ms
avg improvement 1.57061ms 1.26584ms

BASE MODEL
loadvars: 1000
elapsed: 18.0258ms 16.0143ms 15.022ms 2.0115ms 3.0038ms
elapsed: 16.137ms 14.5118ms 16.0197ms 1.6252ms 117.3µs
elapsed: 16.0939ms 16.045ms 15.5733ms 48.9µs 520.6µs
elapsed: 15.5383ms 15.5061ms 16.0178ms 32.2µs -479.5µs
elapsed: 15.7654ms 15.1277ms 15.5416ms 637.7µs 223.8µs
elapsed: 17.5378ms 15.1562ms 15.1195ms 2.3816ms 2.4183ms
elapsed: 16.6696ms 15.1509ms 14.5413ms 1.5187ms 2.1283ms
elapsed: 16.4351ms 15.5902ms 15.2911ms 844.9µs 1.144ms
elapsed: 17.151ms 15.5707ms 15.5085ms 1.5803ms 1.6425ms
elapsed: 17.5191ms 15.1393ms 20.628ms 2.3798ms -3.1089ms
avg improvement 1.30608ms 761.02µs

@JorTurFer
Copy link
Member

Sorry for the slow response,
I don't understand totally the thing. We are earning in average 1ms, right? How had you taken the metrics? Using the scaler code?
I agree that we should be the most efficient as possible, but IDK if 1ms is a differential value.

Any way, if the changes required don't make the code unreadable, I think that every performance improvement is worth, my concern here is that if we are changing this for performing better, we should measure the performance with the scaler code directly

@JorTurFer
Copy link
Member

are you willing to contribute with this?

@Eldarrin
Copy link
Contributor Author

Eldarrin commented Oct 7, 2022

Sure, I extracted the scalar code azure_pipelines_scalar to another project and replaced the loader with a json file and ran the scalar against it. I'll commit my code to my fork so you can take a look. I actually think it becomes more readable :)

@Eldarrin
Copy link
Contributor Author

Eldarrin commented Oct 7, 2022

@JorTurFer JorTurFer moved this from Proposed to To Do in Roadmap - KEDA Core Oct 7, 2022
@JorTurFer
Copy link
Member

True, it's quite cleaner :)
So if it's cleaner and performs better.. we should merge it ❤️
Could you open a PR?

@JorTurFer
Copy link
Member

BTW, thanks a ton for the in depth research 🙇

@Eldarrin
Copy link
Contributor Author

Eldarrin commented Oct 7, 2022

no worries, I'll give it a final tidy and raise a PR

JorTurFer pushed a commit that referenced this issue Oct 10, 2022
* Improve AzDO profilng speed of queues (#3702)

Signed-off-by: mortx <[email protected]>
Repository owner moved this from To Do to Ready To Ship in Roadmap - KEDA Core Oct 10, 2022
@JorTurFer JorTurFer moved this from Ready To Ship to Done in Roadmap - KEDA Core Mar 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request All issues for new features that have not been committed to needs-discussion
Projects
Archived in project
Development

No branches or pull requests

2 participants