Memory leak #230

Closed
deric opened this issue May 24, 2018 · 17 comments

Comments

@deric

deric commented May 24, 2018

After updating aerospike-client-go, Telegraf started leaking memory in the aerospike plugin (influxdata/telegraf#4195). The cause is probably related to changes made between 1.25.0 and 1.32.0.

@khaf
Collaborator

khaf commented May 24, 2018

Thanks for your report. Do you think you can provide a memory profile using pprof to help us track the issue down?

@deric
Author

deric commented May 24, 2018

@khaf Sure, after 4min:

Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.telegraf.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
File: telegraf
Type: inuse_space
Time: May 24, 2018 at 1:55pm (UTC)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 19717.60kB, 100% of 19717.60kB total
Showing top 10 nodes out of 33
      flat  flat%   sum%        cum   cum%
10563.33kB 53.57% 53.57% 10563.33kB 53.57%  github.com/aerospike/aerospike-client-go.newPartitions
    6536kB 33.15% 86.72%     6536kB 33.15%  github.com/aerospike/aerospike-client-go.NewConnection
 1577.92kB  8.00% 94.72%  1577.92kB  8.00%  github.com/zensqlmonitor/go-mssqldb.init
  528.17kB  2.68% 97.40%   528.17kB  2.68%  github.com/aerospike/aerospike-client-go.clonePartitions
  512.19kB  2.60%   100%   512.19kB  2.60%  runtime.malg
         0     0%   100%   528.17kB  2.68%  github.com/aerospike/aerospike-client-go.(*Cluster).tend
         0     0%   100%  1089.33kB  5.52%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func1
         0     0%   100% 14425.50kB 73.16%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func3
         0     0%   100%  1584.50kB  8.04%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func4
         0     0%   100%   528.17kB  2.68%  github.com/aerospike/aerospike-client-go.(*Cluster).waitTillStabilized.func1

@TheSAS

TheSAS commented May 24, 2018

Telegraf v1.6.3 (git: release-1.6 890f1d3a)

$ go tool pprof -top http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/ovilchynskyy/pprof/pprof.telegraf.alloc_objects.alloc_space.inuse_objects.inuse_space.002.pb.gz
File: telegraf
Type: inuse_space
Time: May 24, 2018 at 4:55pm (EEST)
Showing nodes accounting for 21774.10kB, 100% of 21774.10kB total
      flat  flat%   sum%        cum   cum%
11619.67kB 53.36% 53.36% 11619.67kB 53.36%  github.com/aerospike/aerospike-client-go.newPartitions /go/src/github.com/aerospike/aerospike-client-go/partitions.go
 5991.33kB 27.52% 80.88%  5991.33kB 27.52%  github.com/aerospike/aerospike-client-go.NewConnection /go/src/github.com/aerospike/aerospike-client-go/connection.go
 2112.67kB  9.70% 90.58%  2112.67kB  9.70%  github.com/aerospike/aerospike-client-go.clonePartitions /go/src/github.com/aerospike/aerospike-client-go/partitions.go
     514kB  2.36% 92.94%      514kB  2.36%  github.com/aerospike/aerospike-client-go/types/rand.init /go/src/github.com/aerospike/aerospike-client-go/types/rand/xor_shift128.go
  512.20kB  2.35% 95.30%   512.20kB  2.35%  github.com/aerospike/aerospike-client-go.(*Cluster).addNodes.func1 /go/src/github.com/aerospike/aerospike-client-go/cluster.go
  512.19kB  2.35% 97.65%   512.19kB  2.35%  runtime.malg /usr/local/go/src/runtime/proc.go
  512.05kB  2.35%   100%   512.05kB  2.35%  github.com/aerospike/aerospike-client-go.(*nodeStats).getAndReset /go/src/github.com/aerospike/aerospike-client-go/node_stats.go
         0     0%   100%   512.20kB  2.35%  github.com/aerospike/aerospike-client-go.(*Cluster).addNodes /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%   512.05kB  2.35%  github.com/aerospike/aerospike-client-go.(*Cluster).aggregateNodestats /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%  1601.54kB  7.36%  github.com/aerospike/aerospike-client-go.(*Cluster).seedNodes.func2 /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%  2624.71kB 12.05%  github.com/aerospike/aerospike-client-go.(*Cluster).tend /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%  3812.67kB 17.51%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func1 /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100% 11619.67kB 53.36%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func4 /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%  2624.71kB 12.05%  github.com/aerospike/aerospike-client-go.(*Cluster).waitTillStabilized.func1 /go/src/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%     4902kB 22.51%  github.com/aerospike/aerospike-client-go.(*Node).GetConnection /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%  3812.67kB 17.51%  github.com/aerospike/aerospike-client-go.(*Node).Refresh /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%  3812.67kB 17.51%  github.com/aerospike/aerospike-client-go.(*Node).RequestInfo /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%     4902kB 22.51%  github.com/aerospike/aerospike-client-go.(*Node).getConnection /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%     4902kB 22.51%  github.com/aerospike/aerospike-client-go.(*Node).getConnectionWithHint /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%  3812.67kB 17.51%  github.com/aerospike/aerospike-client-go.(*Node).initTendConn /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100% 11619.67kB 53.36%  github.com/aerospike/aerospike-client-go.(*Node).refreshPartitions /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%  3812.67kB 17.51%  github.com/aerospike/aerospike-client-go.(*Node).requestInfo /go/src/github.com/aerospike/aerospike-client-go/node.go
         0     0%   100%  1089.33kB  5.00%  github.com/aerospike/aerospike-client-go.(*nodeValidator).seedNodes /go/src/github.com/aerospike/aerospike-client-go/node_validator.go
         0     0%   100%  1089.33kB  5.00%  github.com/aerospike/aerospike-client-go.(*nodeValidator).validateAlias /go/src/github.com/aerospike/aerospike-client-go/node_validator.go
         0     0%   100% 11619.67kB 53.36%  github.com/aerospike/aerospike-client-go.(*partitionParser).parseReplicasMaster /go/src/github.com/aerospike/aerospike-client-go/partition_parser.go
         0     0%   100%  5991.33kB 27.52%  github.com/aerospike/aerospike-client-go.NewSecureConnection /go/src/github.com/aerospike/aerospike-client-go/connection.go
         0     0%   100%  1089.33kB  5.00%  github.com/aerospike/aerospike-client-go.RequestNodeInfo /go/src/github.com/aerospike/aerospike-client-go/info.go
         0     0%   100%  1089.33kB  5.00%  github.com/aerospike/aerospike-client-go.RequestNodeStats /go/src/github.com/aerospike/aerospike-client-go/info.go
         0     0%   100%      514kB  2.36%  github.com/aerospike/aerospike-client-go.init <autogenerated>
         0     0%   100% 11619.67kB 53.36%  github.com/aerospike/aerospike-client-go.newPartitionParser /go/src/github.com/aerospike/aerospike-client-go/partition_parser.go
         0     0%   100%  2112.67kB  9.70%  github.com/aerospike/aerospike-client-go.partitionMap.merge /go/src/github.com/aerospike/aerospike-client-go/partitions.go
         0     0%   100%   512.20kB  2.35%  github.com/aerospike/aerospike-client-go/types/atomic.(*SyncVal).Update /go/src/github.com/aerospike/aerospike-client-go/types/atomic/sync_val.go
         0     0%   100%  1089.33kB  5.00%  github.com/influxdata/telegraf/plugins/inputs/aerospike.(*Aerospike).Gather.func1 /go/src/github.com/influxdata/telegraf/plugins/inputs/aerospike/aerospike.go
         0     0%   100%  1089.33kB  5.00%  github.com/influxdata/telegraf/plugins/inputs/aerospike.(*Aerospike).gatherServer /go/src/github.com/influxdata/telegraf/plugins/inputs/aerospike/aerospike.go
         0     0%   100%      514kB  2.36%  github.com/influxdata/telegraf/plugins/inputs/aerospike.init <autogenerated>
         0     0%   100%      514kB  2.36%  github.com/influxdata/telegraf/plugins/inputs/all.init <autogenerated>
         0     0%   100%      514kB  2.36%  main.init <autogenerated>
         0     0%   100%      514kB  2.36%  runtime.main /usr/local/go/src/runtime/proc.go
         0     0%   100%   512.19kB  2.35%  runtime.mstart /usr/local/go/src/runtime/proc.go
         0     0%   100%   512.19kB  2.35%  runtime.newproc.func1 /usr/local/go/src/runtime/proc.go
         0     0%   100%   512.19kB  2.35%  runtime.newproc1 /usr/local/go/src/runtime/proc.go
         0     0%   100%   512.19kB  2.35%  runtime.systemstack /usr/local/go/src/runtime/asm_amd64.s

@deric
Author

deric commented May 24, 2018

After 16min:

Showing nodes accounting for 118.80MB, 99.17% of 119.80MB total
Dropped 12 nodes (cum <= 0.60MB)
Showing top 10 nodes out of 32
      flat  flat%   sum%        cum   cum%
   72.73MB 60.71% 60.71%    72.73MB 60.71%  github.com/aerospike/aerospike-client-go.newPartitions
   39.89MB 33.30% 94.00%    39.89MB 33.30%  github.com/aerospike/aerospike-client-go.NewConnection
    4.64MB  3.87% 97.88%     4.64MB  3.87%  github.com/aerospike/aerospike-client-go.clonePartitions
    1.54MB  1.29% 99.17%     1.54MB  1.29%  github.com/zensqlmonitor/go-mssqldb.init
         0     0% 99.17%     4.64MB  3.87%  github.com/aerospike/aerospike-client-go.(*Cluster).tend
         0     0% 99.17%     6.38MB  5.33%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func1
         0     0% 99.17%    92.65MB 77.33%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func3
         0     0% 99.17%     7.74MB  6.46%  github.com/aerospike/aerospike-client-go.(*Cluster).tend.func4
         0     0% 99.17%     4.64MB  3.87%  github.com/aerospike/aerospike-client-go.(*Cluster).waitTillStabilized.func1
         0     0% 99.17%    34.04MB 28.41%  github.com/aerospike/aerospike-client-go.(*Node).GetConnection
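
To put the two snapshots side by side, here is a rough sketch that converts the `flat` values from the 4-minute and 16-minute profiles to bytes and computes the average growth. It assumes both profiles come from the same process, so 12 minutes elapsed between them, and it relies on pprof's memory units being binary multiples (1kB = 1024 B). The function names `parseSize` and `growthPerMinute` are made up for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize converts a pprof size such as "10563.33kB" or "72.73MB" to bytes.
func parseSize(s string) (float64, error) {
	units := []struct {
		suffix string
		factor float64
	}{
		// "B" must come last because every other suffix also ends in "B".
		{"GB", 1 << 30}, {"MB", 1 << 20}, {"kB", 1 << 10}, {"B", 1},
	}
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			v, err := strconv.ParseFloat(strings.TrimSuffix(s, u.suffix), 64)
			if err != nil {
				return 0, err
			}
			return v * u.factor, nil
		}
	}
	return 0, fmt.Errorf("unknown unit in %q", s)
}

// growthPerMinute returns the average growth in MB per minute
// between two samples taken `minutes` apart.
func growthPerMinute(before, after string, minutes float64) (float64, error) {
	b, err := parseSize(before)
	if err != nil {
		return 0, err
	}
	a, err := parseSize(after)
	if err != nil {
		return 0, err
	}
	return (a - b) / minutes / (1 << 20), nil
}

func main() {
	// flat inuse_space values copied from the two profiles in this thread.
	samples := []struct{ name, at4min, at16min string }{
		{"newPartitions", "10563.33kB", "72.73MB"},
		{"NewConnection", "6536kB", "39.89MB"},
		{"clonePartitions", "528.17kB", "4.64MB"},
	}
	for _, s := range samples {
		rate, err := growthPerMinute(s.at4min, s.at16min, 12)
		if err != nil {
			fmt.Println(s.name, "error:", err)
			continue
		}
		fmt.Printf("%-16s %.2f MB/min\n", s.name, rate)
	}
}
```

For newPartitions this works out to roughly 5.2 MB per minute, which matches the steady growth described in this report.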

@khaf
Collaborator

khaf commented May 24, 2018

Do you think there's an easy way for me to set up Telegraf to test this myself? A link to the documentation would also help.

It would also help if you described your setup: the Aerospike Server version, edition, and number of nodes.

@deric
Author

deric commented May 24, 2018

Just grab a package from the release page and configure the aerospike input in /etc/telegraf/telegraf.conf:

[[inputs.aerospike]]
  servers = ["localhost:3000"]

then test the configuration with:

su - telegraf -s /bin/sh -c "telegraf -config /etc/telegraf/telegraf.conf -test -input-filter aerospike"

Probably no outputs are needed, but I haven't tested that.

We're using Aerospike 3.13.0.7 CE with 7 nodes.

@TheSAS

TheSAS commented May 24, 2018

My setup is:

  • Aerospike Community Edition build 3.14.1.1
  • 1 node in the cluster

Outputs should be added; you can use:

[[outputs.file]]
  files = ["stdout"]  # files = ["/dev/null"]
  data_format = "influx"

@khaf
Collaborator

khaf commented Jun 22, 2018

I can't reproduce this issue (anymore). Monitoring Telegraf's memory via htop, I don't see it growing. Does it still happen for you?

@deric
Author

deric commented Jun 22, 2018

@khaf which version of Telegraf do you use? The bug is present only in version 1.6.3; after that, aerospike-client-go was reverted to the old stable version.

@khaf
Collaborator

khaf commented Jul 17, 2018

@deric sorry, it seems I missed the GitHub notification in my mail. I just downloaded the latest version from GitHub and compiled it. I noticed the code already included the latest version of the client, and it didn't suffer from any memory leaks.

Do you still see any leaks with the latest Telegraf / Aerospike version?

@deric
Author

deric commented Jul 18, 2018

@khaf as you can see in telegraf#4128:

Obviously you won't see any problems in the latest version, as it uses an old aerospike-client-go version.

If you wanna reproduce the issue, please download telegraf v1.6.3.

@khaf
Collaborator

khaf commented Jul 19, 2018

Downloading the pre-compiled version won't help me much, since I need to compile it myself to be able to both reproduce the issue and then git bisect.

Are you using the prebuilt versions of Telegraf? Could you try building it from source yourself with the latest Go client to see if the memory leak is still there? I tried it with the latest client and I couldn't reproduce it.

@deric
Author

deric commented Jul 19, 2018

It's enough to check out the master branch and update the Gopkg.lock file:

git clone https://github.com/influxdata/telegraf.git
cd telegraf
make
./telegraf -config config.toml --pprof-addr localhost:6060

Results for each aerospike-client-go version:

  • v1.24.0 no memory leak
  • v1.25.1 no memory leak
  • v1.26.0 no memory leak
  • v1.27.0 no memory leak
  • v1.28.0 leaks memory
  • v1.32.0 leaks memory
  • v1.33.0 leaks memory
  • v1.34.0 still leaks memory:
# go tool pprof -top http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /root/pprof/pprof.telegraf.alloc_objects.alloc_space.inuse_objects.inuse_space.019.pb.gz
File: telegraf
Type: inuse_space
Time: Jul 19, 2018 at 4:54pm (UTC)
Showing nodes accounting for 244.18MB, 97.99% of 249.19MB total
Dropped 28 nodes (cum <= 1.25MB)
      flat  flat%   sum%        cum   cum%
  152.16MB 61.06% 61.06%   152.16MB 61.06%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.newPartitions /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/partitions.go
   79.22MB 31.79% 92.85%    79.22MB 31.79%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.NewConnection /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/connection.go
    9.80MB  3.93% 96.78%     9.80MB  3.93%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.clonePartitions /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/partitions.go
    2.51MB  1.01% 97.79%     3.51MB  1.41%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.newConnectionQueue /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/connection_queue.go
    0.50MB   0.2% 97.99%    10.30MB  4.13%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.partitionMap.merge /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/partitions.go
         0     0% 97.99%     3.19MB  1.28%  github.com/influxdata/telegraf/plugins/inputs/aerospike.(*Aerospike).Gather.func1 /opt/go/src/github.com/influxdata/telegraf/plugins/inputs/aerospike/aerospike.go
         0     0% 97.99%     3.19MB  1.28%  github.com/influxdata/telegraf/plugins/inputs/aerospike.(*Aerospike).gatherServer /opt/go/src/github.com/influxdata/telegraf/plugins/inputs/aerospike/aerospike.go

@deric
Author

deric commented Jul 19, 2018

pprof from v1.24:

# go tool pprof -top http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
File: telegraf
Type: inuse_space
Time: Jul 19, 2018 at 5:03pm (UTC)
Showing nodes accounting for 4048.41kB, 100% of 4048.41kB total
      flat  flat%   sum%        cum   cum%
  788.96kB 19.49% 19.49%   788.96kB 19.49%  github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb.init /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb/cp936.go
  788.96kB 19.49% 38.98%   788.96kB 19.49%  github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb.init /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb/cp949.go
  788.96kB 19.49% 58.46%   788.96kB 19.49%  github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb.init /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb/cp950.go
  641.34kB 15.84% 74.31%   641.34kB 15.84%  github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb.init /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/zensqlmonitor/go-mssqldb/cp932.go
  528.17kB 13.05% 87.35%   528.17kB 13.05%  regexp.(*bitState).reset /usr/local/go/src/regexp/backtrack.go
  512.02kB 12.65%   100%   512.02kB 12.65%  github.com/influxdata/telegraf/metric.(*metric).AddField /opt/go/src/github.com/influxdata/telegraf/metric/metric.go
         0     0%   100%   528.17kB 13.05%  github.com/influxdata/telegraf/agent.(*Agent).Connect /opt/go/src/github.com/influxdata/telegraf/agent/agent.go

v1.26.0:

Saved profile in /root/pprof/pprof.telegraf.alloc_objects.alloc_space.inuse_objects.inuse_space.046.pb.gz
File: telegraf
Type: inuse_space
Time: Jul 19, 2018 at 5:21pm (UTC)
Showing nodes accounting for 4113.54kB, 100% of 4113.54kB total
      flat  flat%   sum%        cum   cum%
    1025kB 24.92% 24.92%     1025kB 24.92%  runtime.allocm /usr/local/go/src/runtime/proc.go
 1024.14kB 24.90% 49.81%  1024.14kB 24.90%  github.com/influxdata/telegraf/vendor/github.com/aws/aws-sdk-go/aws/endpoints.init /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aws/aws-sdk-go/aws/endpoints/defaults.go
  528.17kB 12.84% 62.65%   528.17kB 12.84%  regexp.(*bitState).reset /usr/local/go/src/regexp/backtrack.go
  512.16kB 12.45% 75.10%   512.16kB 12.45%  github.com/influxdata/telegraf/vendor/github.com/miekg/dns.reverseInt16 /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/miekg/dns/reverse.go
  512.06kB 12.45% 87.55%   512.06kB 12.45%  github.com/influxdata/telegraf/plugins/inputs/ceph.init.0 /opt/go/src/github.com/influxdata/telegraf/plugins/inputs/ceph/ceph.go
  512.02kB 12.45%   100%   512.02kB 12.45%  regexp.makeOnePass.func1 /usr/local/go/src/regexp/onepass.go
         0     0%   100%   528.17kB 12.84%  github.com/influxdata/telegraf/agent.(*Agent).Connect /opt/go/src/github.com/influxdata/telegraf/agent/agent.go
         0     0%   100%   528.17kB 12.84%  github.com/influxdata/telegraf/logger.(*telegrafLog).Write /opt/go/src/github.com/influxdata/telegraf/logger/logger.go

v1.28.0:

Showing nodes accounting for 13155.96kB, 100% of 13155.96kB total
      flat  flat%   sum%        cum   cum%
11075.50kB 84.19% 84.19% 11075.50kB 84.19%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.(*partitionParser).parseReplicasMaster /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/partition_parser.go
 1056.33kB  8.03% 92.22%  1056.33kB  8.03%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.partitionMap.merge /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/cluster.go
  512.10kB  3.89% 96.11%   512.10kB  3.89%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.(*Cluster).addNodes.func1 /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/cluster.go
  512.02kB  3.89%   100%   512.02kB  3.89%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.newSingleConnectionQueue /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/connection_queue.go
         0     0%   100%   512.10kB  3.89%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.(*Cluster).addNodes /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/cluster.go
         0     0%   100%   512.02kB  3.89%  github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go.(*Cluster).createNode /opt/go/src/github.com/influxdata/telegraf/vendor/github.com/aerospike/aerospike-client-go/cluster.go

The issue was most likely introduced between v1.27.0 and v1.28.0.
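
The narrowing-down done by hand in this thread can be sketched as a binary search over the tested client versions. This is only an illustration, assuming the results are ordered oldest to newest and that once a version leaks, every later one leaks too. The `result` type and `firstLeaking` function are hypothetical names; the data is the list reported above.

```go
package main

import (
	"fmt"
	"sort"
)

// result records whether a given aerospike-client-go version leaked.
type result struct {
	version string
	leaks   bool
}

// firstLeaking returns the index of the first leaking version,
// or -1 if none of them leak. It requires the slice to be ordered
// so that all non-leaking versions precede all leaking ones.
func firstLeaking(results []result) int {
	i := sort.Search(len(results), func(i int) bool { return results[i].leaks })
	if i == len(results) {
		return -1
	}
	return i
}

func main() {
	// Outcomes as reported in this thread.
	results := []result{
		{"v1.24.0", false}, {"v1.25.1", false}, {"v1.26.0", false}, {"v1.27.0", false},
		{"v1.28.0", true}, {"v1.32.0", true}, {"v1.33.0", true}, {"v1.34.0", true},
	}
	i := firstLeaking(results)
	switch {
	case i < 0:
		fmt.Println("no leaking version found")
	case i == 0:
		fmt.Println("every tested version leaks; first:", results[0].version)
	default:
		fmt.Printf("last good: %s, first bad: %s\n", results[i-1].version, results[i].version)
	}
}
```

The same idea is what `git bisect` automates at commit granularity between the last good and first bad tags.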

@ghost

ghost commented Sep 22, 2018

I'm having trouble with client-go 1.31.0, which consumes a huge amount of memory around "Connection" handling.

@deric
Can you think of anything that might have caused this?

refs: https://discuss.aerospike.com/t/huge-amount-of-memory-usage-in-newpartitions-and-newconnection/5533/6

v1.31.0:

~ # go tool pprof http://localhost:6000/debug/pprof/heap
Fetching profile from http://localhost:6000/debug/pprof/heap
Saved profile in /root/pprof/pprof.main.localhost:6000.alloc_objects.alloc_space.inuse_objects.inuse_space.005.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) 
(pprof) top
17.70GB of 18.15GB total (97.51%)
Dropped 798 nodes (cum <= 0.09GB)
Showing top 10 nodes out of 50 (cum >= 0.21GB)
      flat  flat%   sum%        cum   cum%
   12.18GB 67.10% 67.10%    12.18GB 67.10%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.newPartitions
    4.34GB 23.92% 91.02%     4.35GB 23.96%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.NewConnection
    0.51GB  2.81% 93.83%     0.51GB  2.81%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.clonePartitions
    0.19GB  1.05% 94.88%     0.19GB  1.05%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.newSingleConnectionQueue
    0.18GB  1.02% 95.90%     0.23GB  1.25%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.(*info).parseMultiResponse
    0.10GB  0.53% 96.43%     0.64GB  3.50%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.(*Cluster).tend
    0.05GB  0.28% 96.71%     0.12GB  0.66%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.(*peerListParser).readPeer
    0.05GB  0.28% 96.99%     0.44GB  2.43%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.parsePeers
    0.05GB  0.28% 97.27%     0.10GB  0.56%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go/types.(*Message).Serialize
    0.04GB  0.24% 97.51%     0.21GB  1.16%  github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.(*info).sendCommand
(pprof) 
(pprof) 
(pprof) list github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.newPartitions
Total: 18.15GB
ROUTINE ======================== github.com/my-app/vendor/github.com/aerospike/aerospike-client-go.newPartitions in /go/src/github.com/my-app/vendor/github.com/aerospike/aerospike-client-go/partitions.go
   12.18GB    12.18GB (flat, cum) 67.10% of Total
 Error: open /go/src/github.com/my-app/vendor/github.com/aerospike/aerospike-client-go/partitions.go: no such file or directory

@khaf
Collaborator

khaf commented Dec 3, 2018

This seems to have been caused by the addition and default value of Policy.SocketTimeout, which caused a lot of connection churn. Can anyone test this with the latest release, v1.37.0?

@khaf
Collaborator

khaf commented Dec 5, 2018

I'm closing this issue since it is fixed. Feel free to reopen or file a new issue.
