Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: federation requests timeout #24

Open
wants to merge 2 commits into
base: development
Choose a base branch
from

Conversation

Bilelkihal
Copy link
Collaborator

@Bilelkihal Bilelkihal commented Jan 9, 2025

Related issue: ontoportal-lirmm/bioportal_web_ui#889

Problem Description

The current implementation of the get method has a timeout set to 60 seconds. This creates a significant delay in cases where a portal (e.g., BiodivPortal) is down because multiple requests are sent for a single operation. For example:

1 - If BiodivPortal is down, selecting it on the browse page triggers multiple requests to its API endpoints (e.g., /ontologies, /submissions, /categories, etc.).
2 - Each request waits for the full 60-second timeout before failing.
3 - This results in compounded delays, causing the user to wait for several minutes before receiving feedback.

Proposed solution in this PR

To address this issue, this PR introduces the following improvements:

  • When the first request to a portal (e.g., BiodivPortal) times out, the system caches the "portal down" status for 10 minutes.
  • Subsequent requests to the same portal within this cache period will be skipped, avoiding unnecessary delays.
  • The request timeout is reduced from 60 seconds to 20 seconds.

Result

  • If a portal is down, the total wait time for the user is 20 seconds before getting the message: "The portal ... is down".

Demo

Fully explained in browse page (enable sound)
https://drive.google.com/file/d/13JAVRZradprsXNjwmhJj_AMA58WwTTYr/view?usp=sharing

Screenshot from search page
image

@Bilelkihal Bilelkihal self-assigned this Jan 9, 2025
@Bilelkihal Bilelkihal added the bug label Jan 9, 2025
Copy link
Collaborator Author

@Bilelkihal Bilelkihal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -42,6 +54,14 @@ def request_portals(params = {})

portals
end


def portal_name_from_id(id)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be refactured to this

def portal_name_from_id(id)
  LinkedData::Client::HTTP.federated_conn.find { |_, value| value.url_prefix.to_s.eql?(id) }&.first || ''
end 

portal_status = Rails.cache.read("federation_portal_up_#{portal_name}")
end

unless portal_status

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do it inline here like this

next [OpenStruct.new(errors: "Problem retrieving #{portal_name}")] unless portal_status

@syphax-bouazzouni
Copy link

syphax-bouazzouni commented Jan 9, 2025

The demo video was nice

@jonquet
Copy link

jonquet commented Jan 9, 2025

Quick question: can you leave the time out at 60s and do only one call if the first one fails. I mean, you go to the 2 other calls only of the first one doe snot times out.

@Bilelkihal
Copy link
Collaborator Author

we decided today to keep the timeout 30s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants