Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scala CE3 #111

Merged
merged 21 commits into from
Feb 20, 2024
Merged

Scala CE3 #111

merged 21 commits into from
Feb 20, 2024

Conversation

jamesward
Copy link
Owner

No description provided.

@jamesward
Copy link
Owner Author

sbt 1.9.0 deprecated the IntegrationTest thing so I just made the tests regular tests instead of setting up a subproject. Technically they are integration tests but for this example I don't think it is necessary to be correct on that as I'd rather not have subprojects. @paul-snively let me know if you'd rather go the subproject route.

@paul-snively
Copy link
Contributor

On two runs now, I get a 500 response from the server on scenario 6.

Server invocation:

docker run -it -p8080:8080 ghcr.io/jamesward/easyracer --debug > easyracer.log 2>&1

Run the scala-ce3 subproject with sbt run.

Logs are attached (and not very helpful). :-)
easyracer.log

@jamesward
Copy link
Owner Author

I'm not sure why we don't see it in CI but the server seems to be doing the correct thing so it seems there is something in raceSuccessAll that is not handling the first response (which is correctly a 500) as a loser. F.race does say:
If the winner completes with Outcome.Errored, the race raises the error.

So maybe that is why? But I'm not sure how to fix it.

@paul-snively
Copy link
Contributor

Oh, I misremembered how the client code works. It raises an error at the end if any of the scenarios' results aren't "right".

OK, I've looked at everything locally again, and the code runs fine. It does log that 500 status, but that's apparently http4s logging the decoding failure from expect. The actual value/error returned is correct and raceSuccessAll handles it correctly.

So it looks like we're back to the drawing board on your CI problem. Can you tell me a bit more about what that looks like in your CI environment?

@jamesward
Copy link
Owner Author

Ah. Just a red herring then. 😀

For the CI problem, in Scenario 9 the letters are not in the right order and I'm not sure why as the code looks right. And it's odd that it works fine for me locally. Let me know if you have any ideas on how it might be possible to get those in the wrong order.

@jamesward
Copy link
Owner Author

I enabled logging and see the responses and instants:

List(
  (2023-10-09T19:35:47.515385Z,t),
  (2023-10-09T19:35:47.556897Z,i),
  (2023-10-09T19:35:47.375633Z,h),
  (2023-10-09T19:35:46.856362Z,r),
  (2023-10-09T19:35:47.234961Z,g),
)

The responses should be spaced out by 1 second each as that is how the server is sending them. I'm still not sure how that is going wrong and only in CI. My only guess is that some amount of parallelism is being based on number of cores but I'm not sure how to try and change that.

@paul-snively
Copy link
Contributor

Yep, I think I see the problem. Great detective work! I'll poke at this a bit later.

@paul-snively
Copy link
Contributor

Oh, nevermind: the issue isn't on the client side, right? I added logging to my scenario9, and in running the test I saw the second between each response, as you said.

@jamesward
Copy link
Owner Author

I think this has to be an issue on the client side (not 100% certain though). None of the other clients have an issue with this. It only happens in CI (where there are 2 cores). I guess we could try to reproduce this locally by somehow only having 2 cores... Will have to investigate how to turn my cores off or something.

@paul-snively
Copy link
Contributor

@jamesward, I've updated my branch on my fork, for what that's worth.

But it's not likely to be worth much, as a conversation I had with Daniel Spiewak on Discord explains:

fwiw -XX:ActiveProcessorCount=2 is the flag you need
so it's not entirely clear from that PR discussion thread what kind of order is being expected
are you seeing the list items out of order?
where is the ordering coming from?
oh it's because you're re-sorting by realTimeInstant
you're basically making the assumption that CE will schedule now <- Clock[IO].realTimeInstant at the moment it receives the response from the server, but that won't happen
this gets complicated and hairy, but the tldr is that the response from the server comes in as an async callback which shows up on the external pool and gets pulled off by worker threads in some order and with some timing
it's entirely possible and plausible even with a small number of workers that things can get reordered with respect to clock time
and at an absolute minimum, the responses from the server sockets goes through multiple nondeterministic linearizations, including one within the kernel itself
basically, don't expect sequencing which isn't enforced by the structure of the program 🙂

@jamesward jamesward merged commit bee2b66 into main Feb 20, 2024
18 of 24 checks passed
@jamesward
Copy link
Owner Author

Thanks @paul-snively! I've merged this and we can do some more tweaking as needed.

@jamesward jamesward deleted the scala-ce3 branch November 21, 2024 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants