Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tools: new benchmark for the hashkv performance #9866

Merged
merged 2 commits into from
Aug 14, 2018

Conversation

spzala
Copy link
Member

@spzala spzala commented Jun 19, 2018

Provider a way to etcd users to check how long it takes to get
hashkv in user enviornment. The command output will provide time taken
to get hashkv along with db file size.

Fixed #8910

@spzala
Copy link
Member Author

spzala commented Jun 19, 2018

@gyuho hi, I tried it on a db file size around ~8GB and that's the output the new command provides,

./bin/etcdctl check hashkv
HashKV : 4280873459
Endpoint : 127.0.0.1:2379
Time taken to get hashkv : 2.695825965s
DB size : 8269 MB

Hope it makes sense? The command will provide a way for any user to check the hashkv performance in their own environment.

@spzala spzala requested a review from gyuho June 19, 2018 19:21
@xiang90
Copy link
Contributor

xiang90 commented Jun 19, 2018

this is more a benchmark than a check. for a check, we should provide a criteria, and output failure or success.

@gyuho
Copy link
Contributor

gyuho commented Jun 19, 2018

Agree with @xiang90. And we don't need a new command for hash kv, we already have:

https://github.com/coreos/etcd/blob/897c0559400ed6c7c004b0cf3fc3b56f55bfaa86/etcdctl/ctlv3/command/ep_command.go#L69-L74

For #8910, we just need to find out the cost of fetching hash kv.

@spzala
Copy link
Member Author

spzala commented Jun 19, 2018

@xiang90 I see, thanks much for your comments! @gyuho had a similar thoughts but I was wondering if it makes any sense with check!

@codecov-io
Copy link

Codecov Report

Merging #9866 into master will increase coverage by 0.09%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9866      +/-   ##
==========================================
+ Coverage    69.1%   69.19%   +0.09%     
==========================================
  Files         385      385              
  Lines       35703    35703              
==========================================
+ Hits        24672    24706      +34     
+ Misses       9228     9198      -30     
+ Partials     1803     1799       -4
Impacted Files Coverage Δ
etcdserver/api/v3rpc/lease.go 69.31% <0%> (-4.55%) ⬇️
pkg/adt/interval_tree.go 86.48% <0%> (-3.91%) ⬇️
raft/node.go 90.83% <0%> (-1.6%) ⬇️
lease/leasehttp/http.go 58.08% <0%> (-1.48%) ⬇️
proxy/grpcproxy/watch.go 88.19% <0%> (-1.25%) ⬇️
clientv3/watch.go 93.93% <0%> (-0.47%) ⬇️
etcdserver/server.go 73.31% <0%> (-0.3%) ⬇️
etcdserver/v3_server.go 74.7% <0%> (+0.23%) ⬆️
lease/lessor.go 87.42% <0%> (+0.62%) ⬆️
pkg/transport/listener_tls.go 66.22% <0%> (+0.66%) ⬆️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 55a05d9...10d5ef5. Read the comment docs.

@spzala spzala added the WIP label Jul 8, 2018
@spzala spzala force-pushed the checkhashkv branch 5 times, most recently from 3f112b5 to 5338419 Compare July 16, 2018 21:25
@spzala spzala changed the title Etcdctl: check the performance of hashkv Tools: new benchmark for the hashkv performance Jul 16, 2018
@spzala spzala removed the WIP label Jul 17, 2018
@spzala spzala self-assigned this Jul 17, 2018
@spzala
Copy link
Member Author

spzala commented Jul 17, 2018

@xiang90 @gyuho hello, does this approach of creating a benchmark cmd makes sense? Thanks!

@gyuho
Copy link
Contributor

gyuho commented Jul 17, 2018

@spzala Yes! subcommand in benchmark is better. I will take a look at this PR this week. Thanks!

@spzala
Copy link
Member Author

spzala commented Jul 17, 2018

@gyuho sounds great. Thanks much!!

@gyuho
Copy link
Contributor

gyuho commented Jul 19, 2018

@spzala On second thought, instead of duplicating code for key writes, how about adding a flag --check-hash-kv under benchmark put command. Because we only want to know the duration before and after hash kv operation, this should be simple to add.

@spzala
Copy link
Member Author

spzala commented Jul 19, 2018

@gyuho initially I thought of moving some of write code into benchmark util method but wasn't sure if that could have make sense. Adding a sub-command sounds good, let me try it and will update the code. Thanks!

if errhash != nil {
fmt.Fprintf(os.Stderr, "Failed to get the hashkv of endpoint %s (%v)\n", host, errhash)
panic(err)
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not need else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, makes sense..not needed.

if errstatus != nil {
fmt.Println(fmt.Fprintf(os.Stderr, "Failed to get the status of endpoint %s (%v)\n", host, errstatus))
panic(err)
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not need else?

@gyuho
Copy link
Contributor

gyuho commented Aug 2, 2018

@spzala Overall looks great. Could you fix CI and provide some sample outputs with GBs of data? Thanks a lot!

@spzala
Copy link
Member Author

spzala commented Aug 2, 2018

@gyuho thanks much, will be updating output example shortly.

@spzala
Copy link
Member Author

spzala commented Aug 6, 2018

@gyuho made a minor change in the latest commit in how output is being displayed (db size in GB vs MB I had in previous commit). Here are few related examples output,

HaskKV Summary:
  HashKV: 1407003244
  Endpoint: http://127.0.0.1:2379
  Time taken to get hashkv: 917.035345ms
  DB size: 1.09GB
HaskKV Summary:
  HashKV: 1286658187
  Endpoint: http://127.0.0.1:2379
  Time taken to get hashkv: 1.609064019s
  DB size: 2.06GB
HaskKV Summary:
  HashKV: 4280873459
  Endpoint: http://127.0.0.1:2379
  Time taken to get hashkv: 4.971754313s
  DB size: 8.27GB

@spzala spzala removed the WIP label Aug 6, 2018
ctx, cancel := context.WithCancel(context.Background())
clients[0].HashKV(ctx, eps[0], epHashKVRev)
if !strings.HasPrefix(eps[0], `http://`) {
host = "http://" + eps[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyuho hi, I thought we check it by convention. It's not needed as you pointed out. I will remove it. Thanks!

}
dbsize := float64(respstatus.DbSize) / (1000 * 1000 * 1000)
results += fmt.Sprintf(" DB size: %vGB \t\n", strconv.FormatFloat(dbsize, 'f', 2, 64))
fmt.Println(results)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spzala Let's use humanize.Bytes to string-ify db size. Then, this should be good to merge. Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyuho thanks, this's good to know. Will make changes.

@gyuho
Copy link
Contributor

gyuho commented Aug 8, 2018

Can you also comment this result #9866 (comment) on
#8910? Once this gets merged, we can close it.

fmt.Println(fmt.Fprintf(os.Stderr, "Failed to get the status of endpoint %s (%v)\n", host, errstatus))
panic(err)
}
results += fmt.Sprintf(" DB size: %v \t\n", humanize.Bytes(uint64(respstatus.DbSize)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just " DB size: %s"?


// Get the db size
if errstatus != nil {
fmt.Println(fmt.Fprintf(os.Stderr, "Failed to get the status of endpoint %s (%v)\n", host, errstatus))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just do fmt.Printf("...\n")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, thanks @gyuho!

panic(err)
}
results += fmt.Sprintf("\nHaskKV Summary:\n")
results += fmt.Sprintf(" HashKV: %v \t\n", resphash.Hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ \t\n"/\n" to all? We just need \n.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyuho my bad, yes we don't need tab. Thanks!!

The benchmark as a sub command of put when provided will fetch hashkv and checks
the time taken to get it.

Fixed # 8910

Provider a way to etcd user to check how long it takes to get
hashkv in user enviornment. The command ouput will provide time taken
to get hashkv along with db file size.
@gyuho gyuho merged commit 3c89938 into etcd-io:master Aug 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants