-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: update timeout error UX #10274
Conversation
superset-frontend/src/components/ErrorMessage/TimeoutErrorMessage.tsx
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great feature! Added a couple of comments.
{['chart', 'dashboard'].includes(extra.location) && extra.owners && ( | ||
<> | ||
<br /> | ||
<p>{t('Please reach out to the Chart Owner(s) for assistance.')}</p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why is "Chart Owners" in title case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is as defined in @Steejay's specs, which would you prefer here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can go either way if we are considering Chart Owner a proper noun for Superset and will be consistent everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@etr2460 I interpreted Chart Owners as the title of a role so I defined it w title case.
</> | ||
)} | ||
</div> | ||
)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of two big code chunks, how about creating two subcomponents SimpleTimeoutErrorMessage
and DetailedTimeoutErrorMessage
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me look into simplifying this a bit more. I might pull some things out into smaller, more reusable components. It's a bit tricky, because depending on where the error is being rendered, it's supposed to look different (sometimes using a modal, sometimes expanding inline)
export const ERROR_DATASOURCE_TOO_LARGE = { | ||
message: t('Error 1000 - The datasource is too large to query.'), | ||
link: `${ROOT_ERROR_REFERENCE_URL}error-1000`, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be more structural? I'm imagining a errors.json
somewhere that could potentially be shared by both the frontend and doc generator:
{
"DATASOURCE_TOO_LARGE": {
"code": 1001,
"title": "The database is too large to query",
"description": "It's likely your datasource has grown too large to run the current
query, and is timing out. You can resolve this by reducing the size of your datasource or by modifying your query to only process a subset of your data."
},
"DATASOURCE_OVERLOAD": {
"code": 1002,
....
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 this. Error messages should be defined in API level. So that API user (that use json response without js rendered chart) should get the same message when some query went wrong.
Front-end should handle how to render error by its code, not define what is this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there are a couple things being conflated here I think. First off, an idea of API error codes. These are already consistently defined and described on both the backend and the frontend here:
https://github.com/apache/incubator-superset/blob/master/superset/errors.py#L24
Part of the problem is that each of these errors from a software perspective could have multiple possible root causes and mitigations from the user's perspective. In this case, a timeout error can have multiple different root causes and mitigations, and these root causes/mitigations could apply to many different software errors too. We could add another field onto the error payload coming from the backend to include root causes and mitigations, but I also want to ensure we have freedom on the frontend to render these errors in whatever rich format we see fit. Thus why the software's idea of errors exists on the backend, but the user interpretations live in the frontend.
Finally, as to making the errors more structural: I agree this can be defined better, I'll restructure the TS object. However, I'm a bit anxious about investing a bunch of time into autogenning docs from it since we just voted to change doc systems a couple days ago. It might be better to wait for the doc migration, then to build an autogenned doc page with error details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Didn't think it was necessary to build autogen right now either.
1c4437d
to
cc26a70
Compare
Do you have specifications on what are errors, what are warnings etc? |
0c3ebcf
to
8aa7e32
Compare
super(SupersetTimeoutException, self).__init__(message) | ||
self.error = SupersetError( | ||
error_type=error_type, message=message, level=level, extra=extra | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether we can just extend the attributes of SupersetError
directly on SupersetException
and use a custom JSON encoder to serialize it. This way you remove a layer of abstraction and the need to map the arguments for any new errors your might add in the future,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, possibly. I'm not super familiar with what the plan is for exception handling on the Python side, but having the SupersetExceptions automatically contain the error properties might make sense.
The reason I shied away from it originally was so that it was very clear what attributes the clients need to render the errors and what parts are server specific. But maybe that's not a big issue. Maybe @john-bodley has thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktmud could you expand (possibly with an example) on your thoughts? Are you looking for more of a generic container for handling error properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking maybe subclasses of SupersetException
itself can handle encoding of error code and messages. Here's an example:
class SupersetException(Exception, SupersetAlert):
def __init__(...) -> None:
SupersetAlert.__init__(
self, code=code, message=message, description=description, extra=extra
)
Exception.__init__(self, self.message)
def json_error_response(error: SupersetException) -> FlaskResponse:
return jsonify({ "error": error.to_json() })
8aa7e32
to
00f8ecd
Compare
Codecov Report
@@ Coverage Diff @@
## master #10274 +/- ##
==========================================
+ Coverage 65.57% 70.46% +4.88%
==========================================
Files 600 602 +2
Lines 32101 32177 +76
Branches 3244 3263 +19
==========================================
+ Hits 21051 22672 +1621
+ Misses 10869 9401 -1468
+ Partials 181 104 -77
Continue to review full report at Codecov.
|
const message = ( | ||
<> | ||
<p> | ||
{t('This may be triggered by:')} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking the logic for this may be triggered by
, should come from a standard, general json format generated from backend.
front-end should not decide what error is caused by what reason. Front-end timeout limit is defined in config, it's not something decided by front-end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've introduced an error_codes
mapping within extra
that gets passed down from the backend. You're still free to alter it on the frontend as needed, but it consolidates potential root causes to the backend with the more developer helpful error_type
24c85f7
to
c44b8aa
Compare
"message": _("Error 1001 - The database is under an unusual load."), | ||
}, | ||
] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, just realized a bigger question: are there cases that the backend will be able to tell which is the actual reason? If not, why can't they be the same error code?
## Error 1000 - Timeout communicating with datasource.
Your datasource may be too large to query or was under an unusual load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also possible to remove such mapping altogether:
class SupersetError(IntEnum):
BACKEND_TIMEOUT = 1001
class SupersetException(Exception):
code = None
def __init__(self, message=None, description=None):
if not self.code:
raise RuntimeError('Must not raise SupersetException directly')
error_name = SupersetError(self.code).name
self.message = message or _(f'error.{error_name}.message')
self.description = description or _(f'error.{error_name}.message')
class SupersetTimeoutException(SupersetException):
code = SupersetErrorEnum.BACKEND_TIMEOUT
Then add the detailed message and explanation in i18n.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this again points to the differentiation between the actual error and potential root causes. In this case, Error 1001
might be the root cause for many different errors (a timeout error, a generic DB error, a 503/504 from the db engine, etc.). That was the reason for splitting them into two different structures. However, I agree it's getting into a zone of too many different abstraction layers with both the error_type and the error_codes...
@Steejay, this is something where your thoughts would be helpful: e.g. what relationship do you see between error types and the in product user facing error codes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of the error code is to suggest a path to a solution by highlighting the potential sources. Ideally we'd like to design a message that can pinpoint the source but this can't always be achieved so instead we try to provide as much context as possible to let the technical user troubleshoot. But youre right, the same error source can lead to different errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've expanded my thoughts a little and here's a more detailed example: https://gist.github.com/ktmud/59ccbdf6c96570e1ffe41ee8e5ecd479
Let me know if anything was unclear.
Even if an error may represent different root causes, we can return a generic error code and potential causes in extras
if needed. If the backend is able to identify the root cause, then cool, we return a more specific error code. But there doesn't have to be another layer of abstraction with all the mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree error codes and possible root causes shouldn't be mixed together as suggesting root causes and providing remedies are more of a documentation issue rather than something the program cares about.
I've removed the description
field in my example code, but other parts of the code still feel like an improvement to me though. There really is no need to have two separate entities for exception catching and messaging.
As for the mapping between symptoms and root causes, maybe we can extend the errors.json
file I suggested earlier and add a causes
field?
{
"issues": {
"DATASOURCE_TOO_LARGE": {
"code": 1001,
"title": "The database is too large to query",
"description": "It's likely your datasource has grown too large to run the current query, and is timing out. You can resolve this by reducing the size of your datasource or by modifying your query to only process a subset of your data."
},
"DATASOURCE_OVERLOAD": {
"code": 1002,
// ....
},
},
"causes": {
"BACKEND_TIMEOUT_ERROR": ["DATASOURCE_TOO_LARGE", "DATASOURCE_OVERLOAD"],
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might need separate entities for exception catching and error messaging. One example is for async queries in sql lab. We would catch an exception from the query failing, but then only want to write a subset of attributes to the query table for returning to the client later. Then when the client polls for query status, we're returning an error payload within that request, but no exception is occuring. They seem like 2 different entities to me, and I'm not sure how much is improved by having a complex inheritance tree consisting of both native Python Exceptions and UI specific payload classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are indeed two entities, one is runtime exception, another is the result payload used by the API. But it doesn't mean the payload cannot be generated by exception itself.
The improvement is that you don't have to manually generate the payload every time an exception occurs. You just store relevant information as top-level attributes on the Exception and some general error handlers will return a consistent payload whenever needed. The payload doesn't have to be UI specific. We can certainly return different payloads for storage and API by adding arguments to the to_json
method, if that's what we want.
I just find it weird that you had to initialize two classes in order to raise an API-conforming exception. We already use class inheritance for Exceptions and used the pattern extensively for superset.commands.exceptions
, so I'm not sure an additional mixin to serialize the exception is that much of complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, i haven't seen the superset.commands.exceptions
module yet, i'll look at that and see how other parts of the code handles errors. This has been a wild west of sorts on the backend side, and I've mainly been focused on making sure whatever happens on the backend, the frontend gets something consistent. If there's a more standard approach now (i think it was discussed in SIP-41) then i'm happy to follow it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a caveat of some potential future refactoring:
- Create an extended
SupersetErrorAlert
component that renders conformingSupersetError
in consistent format and makeTimeoutErrorMessage
reusable. - (Optional) Consolidate
SupersetError
andSupersetException
in the backend.
super(SupersetTimeoutException, self).__init__(message) | ||
self.error = SupersetError( | ||
error_type=error_type, message=message, level=level, extra=extra | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktmud could you expand (possibly with an example) on your thoughts? Are you looking for more of a generic container for handling error properties?
db3d101
to
d041659
Compare
@graceguo-supercat and @ktmud, sorry for the delayed response here. I think this is in a good place for final review now. Some significant changes from before:
Thanks for taking the time to be diligent here and get this PR into a better state than it started. Hopefully it should be good to go! |
Are the constructs of how the errors are defined still a blocking issue? If so SIP-41 describes the somewhat current state of how errors are handled by the backend and there is clearly a number of inconsistencies which need to be addressed. I wonder whether it makes sense to flush out the architecture of error constructs in SIP-41 as the problem is far-reaching and beyond just the scope of the issues/concerns raised here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, never intended to let the discussions regarding error handling in the backend block this PR. Sorry for not stamping earlier!
@@ -409,6 +409,10 @@ div.tablePopover { | |||
padding-right: 8px; | |||
} | |||
|
|||
.ResultSetErrorMessage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mind use "-" not case in css class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my bad, updated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one lint comment for css class name. otherwise LGTM!
d041659
to
e41361f
Compare
SUMMARY
Implements new error designs within Superset, starting with timeout errors. This standardizes all timeout errors, frontend and backend, to the UI component. Future errors will be built following this same pattern. Also adds documentation pointers for potential root causes so users can unblock themselves
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TEST PLAN
CI + unit tests
Set the timeouts to 1 second and see the new error component across Dashboards, Explore, and SQL Lab. Ensure everything renders correctly when there are no errors.
ADDITIONAL INFORMATION
to: @graceguo-supercat @ktmud @nytai @rusackas
cc: @Steejay