k8s-ci-robot result comment grid has stale results while new jobs are running #4602

Closed
stevekuznetsov opened this issue Sep 18, 2017 · 44 comments
Labels
area/prow: Issues or PRs related to prow
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@stevekuznetsov
Contributor

stevekuznetsov commented Sep 18, 2017

When someone issues a /retest, the result matrix has a link to failed results from the past even when a new build of that job is running.

/area prow
/kind bug
/cc @Kargakis

k8s-ci-robot added the area/prow and kind/bug labels on Sep 18, 2017
@xiangpengzhao
Contributor

Sometimes the stale results are useful to me. For example, I may want to check more details for the failed jobs while the new jobs are running. It's hard to find that link if there is no result comment.

@stevekuznetsov
Contributor Author

You should always be able to see historical results in the Gubernator link. Right now we have a lot of new users being very confused by issuing /retest and immediately seeing failures while the new jobs are still running.

@xiangpengzhao
Contributor

Hmm...I'm on the fence... 😄

@stevekuznetsov
Contributor Author

I don't know. The comment, as a statement of "this is the current state of testing for your PR," is incorrect today. New users are constantly being confused by this. We already have a mechanism to show historical data. At the very least, we need to be clear that the test failed but we are still re-running it. Developers are not getting the right feedback for their /retest action today.

@xiangpengzhao
Contributor

Agreed, it's confusing for some new contributors to see the failure listed while a re-test is running. But it's not easy to find historical data for the tests if there isn't a link shown somewhere. Most people can't remember the link. Fortunately, most people don't care about the historical failures, either :)

@stevekuznetsov
Contributor Author

The link to Gubernator should be in the same comment, just a couple lines lower.

@xiangpengzhao
Contributor

If all the failed tests are re-running, as this issue proposes, the comment will be removed, right? So I still can't find the link at that very time.

@spxtr
Contributor

spxtr commented Sep 18, 2017

I'm on the fence about this. The current state is shown in the status lines at the bottom of the PR.

So I still can't find the link at that very time.

You can if you know how to go from the details link.

@xiangpengzhao
Contributor

You can if you know how to go from the details link.

So I can't if I don't know :) and maybe this is one of the reasons why we want that comment 😄

@stevekuznetsov
Contributor Author

I think it's exactly the fact that the comment and the status are incongruous that is the problem @spxtr ... if the test is running, it has not failed.

@stevekuznetsov
Contributor Author

I guess I should re-frame this. Right now the dual messaging is:

  • confusing new users (and the fact that old users are used to it and not confused is not a good thing either)
  • not giving developers the feedback they need when they push /retest

We should find a way to address that.

@xiangpengzhao
Contributor

Maybe we can modify the comment to tell contributors which results are stale and which are fresh.

@stevekuznetsov
Contributor Author

I was thinking about that, but we already have a place to put stale results, and that's the historical grid in Gubernator. I don't see a compelling reason to invent a new place for it, but we could add a column, I guess.
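
For illustration, a hypothetical version of the grid with such a column (the job names, layout, and "State" wording are all invented here, not the bot's actual format):

    Test name               Commit   Details       State               Rerun command
    pull-test-infra-bazel   abc1234  test history  re-running (stale)  /test pull-test-infra-bazel
    pull-test-infra-verify  abc1234  test history  failed              /test pull-test-infra-verify

A row marked "re-running (stale)" would make it explicit that the linked failure is from the previous round and a fresh result is on the way.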

@xiangpengzhao
Contributor

SGTM :)

@stevekuznetsov
Contributor Author

/cc @bparees

@stevekuznetsov
Contributor Author

In case we need it... more devs confused by this behavior: openshift/openshift-ansible#6328 (comment)

@stevekuznetsov
Contributor Author

/cc @sosiouxme

@stevekuznetsov
Contributor Author

@cjwagner do you have any thoughts on this one?

@cjwagner
Member

cjwagner commented Dec 4, 2017

I think it's exactly the fact that the comment and the status are incongruous that is the problem

The only case in which these are not the same is when the status contexts say that tests are running and the comment shows previous test results. This doesn't seem like a confusing state to me: the comment shows test results, so it seems apparent that any results displayed while tests are running refer to the previous round of tests. Furthermore, the user's /test or /retest comment (or new commit push) would have to appear after the bot notification on the PR, so it should be obvious that the results are not new.

I don't feel strongly about this though. Adding a column to the comment to show the job start or end times might be the best way to minimize confusion.
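
If we went that route, here is a minimal sketch of rendering such a row. It assumes hypothetical types rather than Prow's actual report code (Result, renderRow, and the column layout are invented for illustration):

    package main

    import (
        "fmt"
        "strings"
        "time"
    )

    // Result is a hypothetical row in the bot's comment grid.
    type Result struct {
        JobName  string
        Commit   string
        URL      string
        Started  time.Time
        Finished *time.Time // nil while the job is still running
    }

    // renderRow formats one grid row, showing when the run started and
    // either when it finished or a note that it is still in flight.
    func renderRow(r Result) string {
        finished := "still running"
        if r.Finished != nil {
            finished = r.Finished.UTC().Format(time.RFC3339)
        }
        return strings.Join([]string{
            r.JobName,
            r.Commit,
            r.URL,
            r.Started.UTC().Format(time.RFC3339),
            finished,
        }, " | ")
    }

    func main() {
        started := time.Date(2017, 12, 4, 15, 0, 0, 0, time.UTC)
        fmt.Println(renderRow(Result{
            JobName: "pull-test-infra-bazel",
            Commit:  "abc1234",
            URL:     "https://prow.example.com/build/123",
            Started: started,
        }))
    }

A "still running" cell carries much of the same signal as dropping the row, but keeps the link to the previous failure for anyone who wants it.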

@BenTheElder
Member

I don't feel strongly about this though. Adding a column to the comment to show the job start or end times might be the best way to minimize confusion.

I would note that the start times are already somewhat confusing, since some of our tooling considers "start" to be when the job was scheduled (Deck, at least) and other parts consider whatever started.json claims, which can often have large skew when we're at capacity.
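
To make that skew concrete, a minimal sketch, assuming started.json carries a Unix-seconds "timestamp" field and taking the scheduling time from a hypothetical job record (the 25-minute gap is invented):

    package main

    import (
        "encoding/json"
        "fmt"
        "log"
        "time"
    )

    // started mirrors the "timestamp" field written to a job's started.json artifact.
    type started struct {
        Timestamp int64 `json:"timestamp"`
    }

    func main() {
        // Hypothetical numbers: the job was scheduled at 15:00, but the cluster
        // was at capacity, so the pod only started (and wrote started.json) at 15:25.
        scheduledAt := time.Date(2017, 12, 4, 15, 0, 0, 0, time.UTC)
        raw := fmt.Sprintf(`{"timestamp": %d}`, scheduledAt.Add(25*time.Minute).Unix())

        var s started
        if err := json.Unmarshal([]byte(raw), &s); err != nil {
            log.Fatal(err)
        }
        podStart := time.Unix(s.Timestamp, 0).UTC()

        // The two notions of "start" disagree by the scheduling delay.
        fmt.Printf("scheduled:    %s\nstarted.json: %s\nskew:         %s\n",
            scheduledAt, podStart, podStart.Sub(scheduledAt))
    }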

@bparees
Contributor

bparees commented Dec 4, 2017

The only case in which these are not the same is when the status contexts say that tests are running and the comment shows previous test results. This doesn't seem like a confusing state to me: the comment shows test results, so it seems apparent that any results displayed while tests are running refer to the previous round of tests.

It's not so obvious when you get an email (GitHub notification) from the bot telling you which tests failed and you just click the links from the email instead of going to the PR (because you're on your phone).

@stevekuznetsov
Contributor Author

This doesn't seem like a confusing state to me: the comment shows test results, so it seems apparent that any results displayed while tests are running refer to the previous round of tests.

I think it's a lot easier for us, as people pretty intimate with the code and the system in general, to understand this. We've pretty consistently seen developers get confused about this exact distinction ever since we moved to using Prow. I think it's pretty clear that there is a messaging gap we should bridge here for developers who are more concerned with tests getting run and being green than with how Prow works or the intricacies of the comment.

It seems like, to some extent, we're re-creating the status signal in this comment, as it's a way to ping everyone involved when the state changes (as opposed to the real status API, which is silent). If there is not a strong NACK to simply not having out-of-date rows in the table, I will open a PR to do that. If someone wants to see historical results, they can always view Gubernator.
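
A minimal sketch of that behavior, with hypothetical types rather than Prow's actual reporting code: a failure row survives only if it is the newest run of its job, so a job with a retest in flight contributes no stale row to the comment.

    package main

    import "fmt"

    // jobResult is a hypothetical record of one run of one job.
    type jobResult struct {
        Name    string
        State   string // "pending", "success", "failure", ...
        BuildID int    // higher means newer run
    }

    // currentFailures keeps only failures that are the latest run of their
    // job, dropping any failure superseded by a newer (e.g. pending) run.
    func currentFailures(results []jobResult) []jobResult {
        latest := map[string]jobResult{}
        for _, r := range results {
            if prev, ok := latest[r.Name]; !ok || r.BuildID > prev.BuildID {
                latest[r.Name] = r
            }
        }
        var rows []jobResult
        for _, r := range latest {
            if r.State == "failure" {
                rows = append(rows, r)
            }
        }
        return rows
    }

    func main() {
        rows := currentFailures([]jobResult{
            {Name: "pull-test-infra-bazel", State: "failure", BuildID: 1},
            {Name: "pull-test-infra-bazel", State: "pending", BuildID: 2}, // retest in flight
            {Name: "pull-test-infra-verify", State: "failure", BuildID: 1},
        })
        fmt.Println(rows) // only pull-test-infra-verify's failure remains
    }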

@sosiouxme
Contributor

sosiouxme commented Dec 5, 2017 via email

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Mar 5, 2018
@sosiouxme
Contributor

I for one would still like this to be fixed. Very confusing.
/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Mar 5, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jul 3, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@BenTheElder
Member

/remove-lifecycle rotten
/reopen
this is still true and still confuses developers

k8s-ci-robot reopened this on Apr 19, 2019
@k8s-ci-robot
Contributor

@BenTheElder: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/reopen
this is still true and still confuses developers

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot removed the lifecycle/rotten label on Apr 19, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jul 19, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Aug 18, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BenTheElder
Member

/remove-lifecycle rotten
/reopen

@k8s-ci-robot
Contributor

@BenTheElder: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot reopened this on Sep 17, 2019
k8s-ci-robot removed the lifecycle/rotten label on Sep 17, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Dec 16, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jan 15, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
