Agent search node timeout #3937
Conversation
Force-pushed from f4ae944 to dd73fdc
Two main points, the rest are nits:
- We should have a single point where we are catching the timeouts instead of handling openai and other ones directly up the stack
- We should fail loudly in other exception cases to avoid hidden errors
quality_str: str = merge_message_runs(response, chunk_separator="")[0].content
answer_quality = "yes" in quality_str.lower()
We should not be using string matches for logic like this; or at least wrap it in a well-encapsulated utility function.
Agreed, created a function called binary_string_test(text, positive_value="yes") for that now.
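A minimal sketch of what that helper might look like (the thread gives only the signature; the body here is an assumption):

def binary_string_test(text: str, positive_value: str = "yes") -> bool:
    # Encapsulates the yes/no string match in one place so call sites
    # don't scatter ad-hoc `"yes" in text.lower()` checks around.
    return positive_value in text.lower()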
    quality_str: str = merge_message_runs(response, chunk_separator="")[0].content
    answer_quality = "yes" in quality_str.lower()
except openai.APITimeoutError:
Since we support a lot of different LLMs and they may raise different errors, should there be a central place to catch these library-specific timeout errors (like around the LLM layer) and raise our own general LLMTimeoutError?
That way the rest of the code can use the centralized exception.
Done. LiteLLM errors are caught in chat_llm.py and then raised as our own errors, LLMTimeoutError and LLMRateLimitError.
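For illustration, a hedged sketch of that centralization, assuming LiteLLM's standard exception types (the actual chat_llm.py code isn't shown in this thread):

import litellm


class LLMTimeoutError(Exception):
    # Centralized replacement for library-specific timeout errors.
    pass


class LLMRateLimitError(Exception):
    # Centralized replacement for library-specific rate-limit errors.
    pass


def completion_with_translated_errors(**kwargs):
    try:
        return litellm.completion(**kwargs)
    except litellm.Timeout as e:
        raise LLMTimeoutError("LLM call timed out") from e
    except litellm.RateLimitError as e:
        raise LLMRateLimitError("LLM call was rate limited") from e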
agent_error: AgentError | None = None
response: list | None = None
try:
    response = list(
If we don't need to capture individual tokens, there's an invoke function that gathers all of the tokens. That should be more suitable here.
Done
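Roughly the change being suggested, assuming a LangChain-style LLM interface that exposes both stream and invoke (a sketch, not the actual diff):

# Before: streaming, only to collect all tokens into a list anyway
response = list(llm.stream(prompt))

# After: invoke gathers the full response in a single call
response = llm.invoke(prompt)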
@@ -1,6 +1,7 @@
from datetime import datetime
from typing import cast

import openai
See comment below; it would let us avoid importing openai here, or other LLM specifics in other modules, since this can be correctly centralized.
done
cited_documents = [
    context_docs[id] for id in answer_citation_ids if id < len(context_docs)
]
log_results = ""
Probably an explicit null makes more sense here.
Just for logging. But done.
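The suggestion, sketched against the variable from the diff above:

log_results: str | None = None  # explicit "nothing logged" instead of ""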
)
except openai.APITimeoutError:
    return (
        history  # this is what is done at this point anyway, so wwe default to this
nit: typo here and below
done
        history  # this is what is done at this point anyway, so wwe default to this
    )

except Exception:
I wonder if it's better to fail loudly and propagate up the stack when it's not an exception we are explicitly handling, here and in other places. It prevents silent failures from sticking around for a long time.
In general, I agree. Here, however, it is only a nice-to-have.
fwiw except Exception makes me very sad :'(
DONE!!
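Sketched, inside a node function like the one in the diff above, the pattern the thread converges on: catch only the expected, centralized errors and let everything else propagate (LLMTimeoutError and the fallback to history follow the discussion; the invoke call is assumed):

try:
    response = llm.invoke(prompt)
except LLMTimeoutError:
    # Expected failure mode: fall back to the existing history.
    return history
# No bare `except Exception`: anything unexpected fails loudly.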
@@ -13,6 +13,20 @@
AGENT_DEFAULT_MAX_ANSWER_CONTEXT_DOCS = 10
AGENT_DEFAULT_MAX_STATIC_HISTORY_WORD_LENGTH = 2000

AGENT_DEFAULT_TIMEOUT_OVERWRITE_LLM_GENERAL_GENERATION = 1  # in seconds
I think in these cases, it's actually better to just use the number in the code below directly 😰. But whatever, doesn't matter
I prefer it as is, to see the parameters in one go. In fact, I think we should do that in general.
backend/onyx/llm/chat_llm.py
Outdated
) -> BaseMessage:
    if LOG_DANSWER_MODEL_INTERACTIONS:
        self.log_model_configs()

    response = cast(
        litellm.ModelResponse,
        self._completion(
            prompt, tools, tool_choice, False, structured_response_format
            prompt,
Not that this matters and I'll stop pointing these out, but slight preference for named args in the case where there are many.
I was there beforehand. But done.
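The named-args version would read roughly like this (inside the method from the diff; stream=False as the keyword for the positional False is an assumption):

response = cast(
    litellm.ModelResponse,
    self._completion(
        prompt=prompt,
        tools=tools,
        tool_choice=tool_choice,
        stream=False,
        structured_response_format=structured_response_format,
    ),
)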
backend/onyx/prompts/agent_search.py
Outdated
@@ -5,8 +5,9 @@
NO_RECOVERED_DOCS = "No relevant information recovered"
YES = "yes"
NO = "no"

AGENT_LLM_TIMEOUT_MESSAGE = "The agent timed out. Please try again."
These aren't prompts though; prefer a constants.py file under the agents directory.
done
A few small streaming things need changes
prompt=msg,
timeout_overwrite=AGENT_TIMEOUT_OVERWRITE_LLM_SUBANSWER_CHECK,
prefer timeout_override
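For context, a hedged sketch of how a per-call timeout override might be threaded down to the client (class and attribute names are made up for illustration; litellm.completion does accept a per-request timeout):

import litellm


class ExampleLLM:
    def __init__(self, model_name: str, default_timeout: int) -> None:
        self.model_name = model_name
        self.default_timeout = default_timeout

    def _completion(self, messages: list, timeout_override: int | None = None):
        # Fall back to the configured default when no override is given.
        return litellm.completion(
            model=self.model_name,
            messages=messages,
            timeout=timeout_override or self.default_timeout,
        )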
Resolved review threads:
- ...ts/agent_search/deep_search/initial/generate_individual_sub_answer/nodes/check_sub_answer.py (outdated)
- ...ts/agent_search/deep_search/initial/generate_individual_sub_answer/nodes/check_sub_answer.py
- ...ts/agent_search/deep_search/initial/generate_individual_sub_answer/nodes/check_sub_answer.py (outdated)
- ...ts/agent_search/deep_search/initial/generate_initial_answer/nodes/generate_initial_answer.py
- ...end/onyx/agents/agent_search/deep_search/shared/expanded_retrieval/nodes/verify_documents.py (outdated)
backend/onyx/llm/chat_llm.py
Outdated
tools,
tool_choice,
structured_response_format,
timeout_overwrite,
override
done
<></>
) : isComplete ? (
    error && (
        <p className="mt-2 mx-4 text-red-700 text-sm my-auto">
maybe could have a cute lil error component here? Would make me happier by slightly reducing indentation :')
- overwrite -> override
- enums for error types
- some nits

Rolled into #3994
Description
Timeouts for Agent Search. The goal is to i) avoid hanging because of a stuck validation step and ii) avoid leaving the user unsure whether the process is broken if the LLM hangs.
Strategy:
Behavior:
How Has This Been Tested?
Still in the works: behavior if question generation hangs.
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.