A benchmark of Fedora-4 performance #773

Closed
jcoyne opened this issue Dec 10, 2014 · 14 comments

@jcoyne
Contributor

jcoyne commented Dec 10, 2014

time = Benchmark.measure do
  (1..1000).each { GenericFile.create! { |f| f.apply_depositor_metadata('jcoyne') } }
end
=> #<Benchmark::Tms:0x007fb240810548 @label="", @real=631.307004, @cstime=0.0, @cutime=0.0, @stime=32.4, @utime=261.5, @total=293.9>
puts time
261.500000  32.400000 293.900000 (631.307004)

That's an ingest rate of 95 records per minute. Interestingly, the CPU was not maxed out; most of the time was spent waiting on IO (Fedora/Solr).
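For reference, the rate and the rough CPU/wait split can be worked out from the numbers in the `Benchmark::Tms` output. This is only a back-of-the-envelope sketch: it assumes the create loop is single-threaded, so wall-clock time not accounted for by CPU time is treated as time blocked on Fedora/Solr.

```ruby
# Figures copied from the Benchmark::Tms output above.
RECORDS = 1000
real    = 631.307004 # wall-clock seconds (@real)
total   = 293.9      # user + system CPU seconds (@total)

records_per_minute = RECORDS / real * 60
cpu_share          = total / real   # fraction of wall-clock time spent on CPU
wait_share         = 1 - cpu_share  # remainder, assumed to be IO wait

puts format('%.1f records/minute', records_per_minute)
puts format('%.0f%% CPU, %.0f%% waiting', cpu_share * 100, wait_share * 100)
```

By this estimate a bit more than half of the elapsed time was spent outside the Ruby process.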

@jcoyne
Contributor Author

jcoyne commented Dec 10, 2014

Here's what happens in each create:

  • Post object to Fedora
  • Check for (HEAD) child "content" (3x)
  • Check for (HEAD) child "thumbnail" (3x)
  • Check for (HEAD) child "full_text" (3x)
  • Check for (HEAD) the object
  • Retrieve (GET) the object
  • Check for (HEAD) child "characterization"
  • Check for (HEAD) child "full_text"
  • Query Solr for collection members (4x)
  • Write the object to solr
  • POST ACL to Fedora
  • Check for (HEAD) the ACL
  • Retrieve (GET) the ACL
  • Write the object to solr (with ACL data)

See a full log here: https://gist.github.com/jcoyne/b46309ce0a81fa6e2782

There are some clear opportunities for reducing the duplicate (3x and 4x) requests. Since the object has assertions about its children, we may be able to remove the HEAD requests when children aren't present too.
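One generic way to avoid repeating the same existence check is to remember the answer per path. The sketch below is illustrative only and is not taken from ActiveFedora or the ldp gem: `ExistenceCache`, the lambda standing in for the HEAD request, and the `/objects/1/...` paths are all hypothetical.

```ruby
# Hypothetical wrapper: remembers the outcome of each existence check so
# repeated lookups for the same path do not trigger another request.
class ExistenceCache
  attr_reader :requests

  # checker is any callable that takes a path and returns true/false,
  # e.g. a lambda that issues a HEAD request.
  def initialize(checker)
    @checker  = checker
    @results  = {}
    @requests = 0
  end

  def exists?(path)
    return @results[path] if @results.key?(path)

    @requests += 1
    @results[path] = @checker.call(path)
  end
end

# Stand-in for the real HTTP call: pretend no child resources exist yet.
head  = ->(_path) { false }
cache = ExistenceCache.new(head)

%w[content thumbnail full_text].each do |child|
  3.times { cache.exists?("/objects/1/#{child}") }
end

puts cache.requests # 3 checks issued instead of 9
```

A real implementation would also have to drop the cached answer whenever a child is created or deleted, otherwise a stale "missing" result would be returned.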

@jcoyne jcoyne added this to the fedora-4 milestone Dec 10, 2014
@carolyncole
Contributor

@jcoyne Does look like we are hitting fedora too much and could use some caching. Are we to the point where we want to tackle this, or is it something for the future?

@jcoyne
Contributor Author

jcoyne commented Dec 10, 2014

@Cam156 I think now is a good time to look at improving performance if we don't have anything critical to work on. We are hitting Fedora just a bit too much and we could perhaps shave 100ms off each create. One problem is that we have to push to Solr twice because of the logic that validates an object has a depositor and that the depositor should have edit access. I'm wondering if there's any way around that.

@hectorcorrea
Contributor

@jcoyne what do we get from the HEAD calls? Do we really need them? (They are in LDP, right?)

@jcoyne
Contributor Author

jcoyne commented Dec 10, 2014

@hectorcorrea you get an acknowledgement that the resource exists without having to load the resource itself. They are faster, particularly for large resources.

@jcoyne
Contributor Author

jcoyne commented Dec 10, 2014

@hectorcorrea additionally you get all the headers like Content-Disposition, Content-Type, Content-Length, etc.
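To illustrate what a HEAD call gives you, here is a minimal sketch using Ruby's standard `Net::HTTP`. The `head_info` helper, the `FakeHTTP` stub, and the `/rest/obj1/...` paths are made up for the example; the stub stands in for a live Fedora connection so the snippet runs without a server.

```ruby
require 'net/http'

# Returns selected response headers if the resource exists, nil otherwise.
# http is anything that responds to #head(path) the way Net::HTTP does.
def head_info(http, path)
  response = http.head(path)
  return nil unless response.is_a?(Net::HTTPSuccess)

  {
    content_type:   response['Content-Type'],
    content_length: response['Content-Length']
  }
end

# Stub standing in for a real connection (hypothetical paths and values).
class FakeHTTP
  def head(path)
    if path == '/rest/obj1/content'
      res = Net::HTTPOK.new('1.1', '200', 'OK')
      res['Content-Type']   = 'image/png'
      res['Content-Length'] = '2048'
      res
    else
      Net::HTTPNotFound.new('1.1', '404', 'Not Found')
    end
  end
end

http = FakeHTTP.new
p head_info(http, '/rest/obj1/content')   # headers only, no body transferred
p head_info(http, '/rest/obj1/thumbnail') # nil, the resource is missing
```

Against a real server the same helper could be called inside `Net::HTTP.start(host, port) { |http| ... }`.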

@jcoyne
Contributor Author

jcoyne commented Dec 10, 2014

samvera/ldp#34 removes the duplicate 404s.

@jcoyne
Contributor Author

jcoyne commented Dec 10, 2014

samvera-deprecated/hydra-collections#50 halves the number of Solr calls required to index the collections.

@jcoyne
Contributor Author

jcoyne commented Dec 11, 2014

samvera/active_fedora#645 further reduces calls to Solr when objects are not members of any collections.

@jcoyne
Contributor Author

jcoyne commented Dec 11, 2014

irb(main):006:0> time = Benchmark.measure do
irb(main):007:1*   (1..1000).each { GenericFile.create! { |f| f.apply_depositor_metadata('jcoyne') } }
irb(main):008:1> end
=> #<Benchmark::Tms:0x007fcf76fd7440 @label="", @real=410.517791, @cstime=0.0, @cutime=0.0, @stime=26.590000000000003, @utime=235.13, @total=261.72>
irb(main):009:0> puts time
235.130000  26.590000 261.720000 (410.517791)
=> nil

Yay, now we're up to 146 per minute.

@jcoyne jcoyne closed this as completed Dec 11, 2014
@jcoyne
Contributor Author

jcoyne commented Dec 15, 2014

Compare to Fedora 3:

> puts time
490.660000  16.990000 507.650000 (737.432756)

@hectorcorrea
Contributor

👍

@mjgiarlo
Member

How many per minute is F3, @jcoyne?

@rotated8
Contributor

Roughly 81 per minute, I believe.
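The per-minute figures quoted in this thread can be reproduced from the wall-clock times of the three runs. A small sketch, assuming each run created 1000 records as in the benchmark loop (the labels and the `per_minute` helper are just for illustration):

```ruby
# Wall-clock seconds for 1000 creates, copied from the three runs above.
RUNS = {
  'Fedora 4, before' => 631.307004,
  'Fedora 4, after'  => 410.517791,
  'Fedora 3'         => 737.432756
}.freeze

# Converts an elapsed time for a batch into records per minute.
def per_minute(seconds, records = 1000)
  records / seconds * 60
end

RUNS.each do |label, seconds|
  puts format('%-18s %6.1f records/minute', label, per_minute(seconds))
end
```

This gives roughly 95, 146 and 81 records per minute respectively.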

jcoyne added a commit that referenced this issue Nov 17, 2016