
🐛 Bug Report: Refreshing OAuth token... forever and stop working #150

Closed
freedit-dev opened this issue Jun 23, 2024 · 32 comments
Labels
bug Something isn't working

Comments

@freedit-dev
Contributor

It has happened several times that my instance stops working, and the last log line is always:

INFO redlib::oauth > [⌛] 86279s Elapsed! Refreshing OAuth token...

Restarting redlib always fixes it.

Maybe there should be a timeout for refreshing the token.

Thanks for your great work.

freedit-dev added the bug label on Jun 23, 2024
@sigaloid
Member

Weird. The refresh code is literally just logging in again, which is what happens at launch.

redlib/src/oauth.rs

Lines 102 to 108 in 6c2579c

async fn refresh(&mut self) -> Option<()> {
// Refresh is actually just a subsequent login with the same headers (without the old token
// or anything). This logic is handled in login, so we just call login again.
let refresh = self.login().await;
info!("Refreshing OAuth token... {}", if refresh.is_some() { "success" } else { "failed" });
refresh
}

I suppose it could have something to do with the header cache not being cleared.

Can you reproduce this consistently? If so, can you try running off of this branch? https://github.com/redlib-org/redlib/tree/try_fix_oauth_refresh

@extremelyonline

Not sure if this is the same issue as #81

@freedit-dev
Contributor Author

freedit-dev commented Jun 27, 2024

On the latest commit (fix_oauth_ratelimit), the instance stops responding (no error log). The last log lines:

Jun 27 01:41:19 base redlib[4038062]:  ERROR redlib::utils  > Error page rendered: Reddit error 400 "null": "Bad Request"
Jun 27 01:41:57 base redlib[4038062]:  ERROR redlib::utils  > Error page rendered: Reddit error 400 "null": "Bad Request"
Jun 27 01:43:33 base redlib[4038062]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
Jun 27 01:43:33 base redlib[4038062]:  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
Jun 27 01:43:33 base redlib[4038062]:  INFO  redlib::oauth  > Refreshing OAuth token... success
Jun 27 01:43:33 base redlib[4038062]:  WARN  redlib::client > Rate limit 8 is low. Spawning force_refresh_token()
Jun 27 01:43:33 base redlib[4038062]:  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
Jun 27 01:43:33 base redlib[4038062]:  INFO  redlib::oauth  > Refreshing OAuth token... success
Jun 27 01:47:35 base redlib[4038062]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
Jun 27 01:47:35 base redlib[4038062]:  WARN  redlib::client > Rate limit 8 is low. Spawning force_refresh_token()

I think token retrieval needs a timeout.

And restarting fixed it.

@sigaloid
Member

Could be an issue with the atomic/concurrent nature of the token retrieval. I'd appreciate it if you could test after #158 to see if it makes any difference.

@freedit-dev
Contributor Author

freedit-dev commented Jun 27, 2024

Wow! Quick fix.

Deployed.

https://uptime.freedit.eu/status/up

@freedit-dev
Contributor Author

@sigaloid It hung again:

Jun 27 02:39:12 base redlib[4049503]:  TRACE redlib::client > Ratelimit remaining: 11.0
Jun 27 02:39:12 base redlib[4049503]:  TRACE redlib::client > Ratelimit used: 89
Jun 27 02:39:12 base redlib[4049503]:  TRACE redlib::client > Ratelimit reset: 48
Jun 27 02:39:14 base redlib[4049503]:  TRACE redlib::client > Ratelimit remaining: 10.0
Jun 27 02:39:14 base redlib[4049503]:  TRACE redlib::client > Ratelimit used: 90
Jun 27 02:39:14 base redlib[4049503]:  TRACE redlib::client > Ratelimit reset: 45
Jun 27 02:39:18 base redlib[4049503]:  TRACE redlib::client > Ratelimit remaining: 9.0
Jun 27 02:39:18 base redlib[4049503]:  TRACE redlib::client > Ratelimit used: 91
Jun 27 02:39:18 base redlib[4049503]:  TRACE redlib::client > Ratelimit reset: 41
Jun 27 02:39:18 base redlib[4049503]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
Jun 27 02:39:18 base redlib[4049503]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99

@sigaloid
Member

Hm. Okay, I see.

[screenshot: Cloudflare error page]
Side note, really funny that cloudflare's message here uses Redlib's CSS styling.

I'm going to try to reproduce this locally first with some kind of load tester.

@freedit-dev
Contributor Author

Maybe you should not use the async RwLock; use the std RwLock instead.

@sigaloid
Member

sigaloid commented Jun 27, 2024

I don't think it's a synchronization issue. It syncs on mine when I simulate high load.

I don't know what exactly the problem is - you're saying it's hanging precisely after the line "TRACE redlib::oauth > Rolling over refresh token. Current rate limit: 99"? Nothing else?

If so, it sounds like you cannot retrieve new tokens via the automatic refresh. Which makes no sense: it generates one every time you start it, and a refresh should be no different from restarting the program.

@sigaloid
Member

here's an example of my log output:

 TRACE redlib::client > Ratelimit remaining: 9.0
 TRACE redlib::client > Ratelimit used: 91
 TRACE redlib::client > Ratelimit reset: 103
 WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
 TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99
 INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
 INFO  redlib::oauth  > Refreshing OAuth token... success
 TRACE redlib::client > Ratelimit remaining: 8.0
 TRACE redlib::client > Ratelimit used: 92
 TRACE redlib::client > Ratelimit reset: 102
 TRACE redlib::client > Ratelimit remaining: 5.0
 TRACE redlib::client > Ratelimit used: 95
 TRACE redlib::client > Ratelimit reset: 102
 TRACE redlib::client > Ratelimit remaining: 7.0
 TRACE redlib::client > Ratelimit used: 93
 TRACE redlib::client > Ratelimit reset: 102
 TRACE redlib::client > Ratelimit remaining: 6.0
 TRACE redlib::client > Ratelimit used: 94
 TRACE redlib::client > Ratelimit reset: 102
 WARN  redlib::client > Rate limit 6 is low. Spawning force_refresh_token()
 TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99
 INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
 INFO  redlib::oauth  > Refreshing OAuth token... success
 TRACE redlib::client > Ratelimit remaining: 3.0

Aside from the occasional slow response, and the very low rate-limit readings right before a refresh possibly overwriting the stored rate limit, my refreshing works just fine.

@freedit-dev
Contributor Author

Yes, it's occasional. And it hangs exactly at TRACE redlib::oauth > Rolling over refresh token. Current rate limit: 99, nothing else. And restarting always fixes it.

Jun 27 03:36:31 base redlib[4054219]:  WARN  redlib::client > Rate limit 8 is low. Spawning force_refresh_token()
Jun 27 03:36:31 base redlib[4054219]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99
Jun 27 03:36:31 base redlib[4054219]:  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
Jun 27 03:36:31 base redlib[4054219]:  INFO  redlib::oauth  > Refreshing OAuth token... success
Jun 27 03:36:31 base redlib[4054219]:  TRACE redlib::client > Ratelimit remaining: 7.0
Jun 27 03:36:31 base redlib[4054219]:  TRACE redlib::client > Ratelimit used: 93
Jun 27 03:36:31 base redlib[4054219]:  TRACE redlib::client > Ratelimit reset: 208
Jun 27 03:36:34 base redlib[4054219]:  WARN  redlib::client > Rate limit 7 is low. Spawning force_refresh_token()
Jun 27 03:36:34 base redlib[4054219]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99
Jun 27 03:36:34 base redlib[4054219]:  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
Jun 27 03:36:34 base redlib[4054219]:  INFO  redlib::oauth  > Refreshing OAuth token... success
Jun 27 03:36:34 base redlib[4054219]:  TRACE redlib::client > Ratelimit remaining: 99.0
Jun 27 03:36:34 base redlib[4054219]:  TRACE redlib::client > Ratelimit used: 1
Jun 27 03:36:34 base redlib[4054219]:  TRACE redlib::client > Ratelimit reset: 205
Jun 27 03:36:39 base redlib[4054219]:  TRACE redlib::client > Ratelimit remaining: 99.0

I just wanted to say that maybe we should add better logging to diagnose whether this is caused by the async RwLock or by a token-retrieval timeout.

@freedit-dev
Contributor Author

I just added more logging, and it is stuck here:

let resp = client.request(request).await.ok()?;

And restarting fixed it.

@freedit-dev
Contributor Author

I'll try with a timeout.

trace!("Sending OAuth token request...");
let work = client.request(request);
let resp = match tokio::time::timeout(Duration::from_secs(10), work).await {
	Ok(result) => result.ok()?,
	Err(_) => {
		error!("Timeout: no response in 10 seconds.");
		return None;
	}
};
trace!("Received response from OAuth token request...");
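The wrapper above bounds the wait on the token request with tokio::time::timeout. The same bound-the-wait idea can be sketched with std-only primitives (the slow_request helper here is a hypothetical stand-in for client.request(request)):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the slow/hanging network call.
fn slow_request(delay: Duration) -> String {
    thread::sleep(delay);
    "token-response".to_string()
}

// Run the work on a worker thread and give up if no result arrives in time,
// mirroring what tokio::time::timeout does for a future.
fn request_with_timeout(delay: Duration, timeout: Duration) -> Option<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore send errors: the receiver may already have timed out.
        let _ = tx.send(slow_request(delay));
    });
    rx.recv_timeout(timeout).ok()
}

fn main() {
    // Fast response: completes within the deadline.
    assert!(request_with_timeout(Duration::from_millis(10), Duration::from_millis(500)).is_some());
    // Hung response: the caller regains control instead of blocking forever.
    assert!(request_with_timeout(Duration::from_millis(500), Duration::from_millis(10)).is_none());
}
```

Note that a timeout only helps if the hang is in the request itself; if the task is stuck waiting on a lock before the request is even sent, the timeout never starts.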

@freedit-dev
Contributor Author

freedit-dev commented Jun 27, 2024

It looks like some kind of deadlock:

Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit remaining: 10.0
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit used: 90
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit reset: 77
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit remaining: 7.0
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit used: 93
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit reset: 76
Jun 27 15:58:43 base redlib[4141438]:  WARN  redlib::client > Rate limit 7 is low. Spawning force_refresh_token()
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::oauth  > Trying to refresh OAuth token...
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::oauth  > Sending OAuth token request...
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit remaining: 9.0
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit used: 91
Jun 27 15:58:43 base redlib[4141438]:  TRACE redlib::client > Ratelimit reset: 77
Jun 27 15:58:44 base redlib[4141438]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()

Two force_refresh_token() calls happened.

Another one caught:

Jun 27 16:08:25 base redlib[4142262]:  TRACE redlib::client > Ratelimit used: 92
Jun 27 16:08:25 base redlib[4142262]:  TRACE redlib::client > Ratelimit reset: 95
Jun 27 16:08:25 base redlib[4142262]:  TRACE redlib::client > Ratelimit remaining: 9.0
Jun 27 16:08:25 base redlib[4142262]:  TRACE redlib::client > Ratelimit used: 91
Jun 27 16:08:25 base redlib[4142262]:  TRACE redlib::client > Ratelimit reset: 95
Jun 27 16:08:27 base redlib[4142262]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
Jun 27 16:08:27 base redlib[4142262]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 99
Jun 27 16:08:27 base redlib[4142262]:  TRACE redlib::oauth  > Trying to refresh OAuth token...
Jun 27 16:08:27 base redlib[4142262]:  TRACE redlib::oauth  > Sending OAuth token request...
Jun 27 16:08:27 base redlib[4142262]:  TRACE redlib::client > Ratelimit remaining: 7.0
Jun 27 16:08:27 base redlib[4142262]:  TRACE redlib::client > Ratelimit used: 93
Jun 27 16:08:27 base redlib[4142262]:  TRACE redlib::client > Ratelimit reset: 92
Jun 27 16:08:27 base redlib[4142262]:  WARN  redlib::client > Rate limit 7 is low. Spawning force_refresh_token()

@freedit-dev
Contributor Author

(quoting sigaloid's comment above: "here's an example of my log output … my refreshing works just fine")

Even on success, there are two refreshes happening at the same time.

@sigaloid
Member

That shouldn't be a problem, though. Independent threads can of course reach Reddit simultaneously; I don't know why it would block in this case. There's nothing wrong with overwriting the token with a new one twice in succession, and the RwLock will prevent simultaneous modification.

Pushing a PR that should avoid the simultaneous token case, but I'm unsure if this will fix the problem. It rolls over on my machine under heavy load, but then again, so does main right now...

@sigaloid
Member

@freedit-dev @ButteredCats I know you both experienced this issue; please let me know if #160 helps. The latest tag at quay.io has the newest version.

@freedit-dev
Contributor Author

freedit-dev commented Jun 28, 2024

Jun 28 03:49:51 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 62.0, we have 62.
Jun 28 03:49:51 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 38
Jun 28 03:49:51 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 8
Jun 28 03:49:53 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 61.0, we have 61.
Jun 28 03:49:53 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 39
Jun 28 03:49:53 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 6
Jun 28 03:49:53 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 60.0, we have 60.
Jun 28 03:49:53 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 40
Jun 28 03:49:53 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 6
Jun 28 03:49:55 base redlib[25514]:  ERROR redlib::utils  > Error page rendered: Nothing here
Jun 28 03:50:03 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 99.0, we have 59.
Jun 28 03:50:03 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 1
Jun 28 03:50:03 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 597
Jun 28 03:50:05 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 98.0, we have 58.
Jun 28 03:50:05 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 2
Jun 28 03:50:05 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 595
Jun 28 03:50:06 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 97.0, we have 57.
Jun 28 03:50:06 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 3
Jun 28 03:50:06 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 594
Jun 28 03:50:07 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 96.0, we have 56.
Jun 28 03:50:07 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 4
Jun 28 03:50:07 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 593
Jun 28 03:50:08 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 95.0, we have 55.
Jun 28 03:50:08 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 5
Jun 28 03:50:08 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 591
Jun 28 03:50:08 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 94.0, we have 54.
Jun 28 03:50:08 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 6
Jun 28 03:50:08 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 591
Jun 28 03:50:09 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 93.0, we have 53.
Jun 28 03:50:09 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 7
Jun 28 03:50:09 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 590
Jun 28 03:50:10 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 92.0, we have 52.
Jun 28 03:50:10 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 8
Jun 28 03:50:10 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 589

Then it goes back to normal:

Jun 28 03:51:57 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 50.0, we have 10.
Jun 28 03:51:57 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 50
Jun 28 03:51:57 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 482
Jun 28 03:51:59 base redlib[25514]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
Jun 28 03:51:59 base redlib[25514]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 8
Jun 28 03:51:59 base redlib[25514]:  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
Jun 28 03:51:59 base redlib[25514]:  INFO  redlib::oauth  > Refreshing OAuth token... success
Jun 28 03:51:59 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 49.0, we have 9.
Jun 28 03:51:59 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 51
Jun 28 03:51:59 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 480
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 99.0, we have 99.
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 1
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 480
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 98.0, we have 98.
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 2
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 479
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 96.0, we have 97.
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 4
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 479
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 97.0, we have 96.
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 3
Jun 28 03:52:00 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 479
Jun 28 03:52:02 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 95.0, we have 95.

@freedit-dev
Contributor Author

@sigaloid So far so good. But I want to say the async RwLock is not a good idea. Token refresh is not a frequent operation, and the current solution (async RwLock) is too complex.

From tokio doc https://docs.rs/tokio/latest/tokio/sync/struct.RwLock.html#method.read

Note that under the priority policy of RwLock, read locks are not granted until prior write locks, to prevent starvation. Therefore deadlock may occur if a read lock is held by the current task, a write lock attempt is made, and then a subsequent read lock attempt is made by the current task.
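The hazard in that quote can be sketched with the std RwLock, which also refuses a writer while any read guard is alive; under tokio's write-preferring policy, a queued writer in turn blocks new readers, so read guard held, then write attempt queued, then a second read by the same task deadlocks. A minimal std-only illustration of the blocking half (a real recursive-read deadlock would simply hang):

```rust
use std::sync::RwLock;

// While any read guard is alive, a writer cannot acquire the lock.
fn write_blocked_while_reading(lock: &RwLock<u32>) -> bool {
    let _read_guard = lock.read().unwrap();
    // try_write fails instead of blocking, letting us observe the contention.
    lock.try_write().is_err()
}

fn main() {
    let lock = RwLock::new(0u32);
    assert!(write_blocked_while_reading(&lock));
    // Once the read guard is dropped, the writer gets in.
    assert!(lock.try_write().is_ok());
}
```

This matches the thread's symptom: force_refresh_token() takes a write lock while in-flight request tasks still hold (or re-acquire) read locks, and everything stalls.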

@sigaloid
Member

Hm. There's something I could possibly do to fix that: create a new Oauth struct and only call write to do an atomic swap, so the write lock is held for far less time (just the time to do the swap, not the entire refresh call). Is that the kind of thing you mean?

Otherwise I'm not sure of a better solution besides a resource pool, and that's a lot of overhead IMO versus just fixing the in-place solution until it's performant.

@freedit-dev
Contributor Author

The async lock is not just an asynchronous version of the std RwLock and is much more complex to deal with. Maybe you could try arc-swap instead.
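The idea behind arc-swap is to make reads lock-free: readers load a snapshot Arc, and the refresher publishes a whole new token by swapping the pointer. A std-only sketch of the same publish-by-swap pattern (the Token and TokenCell names are hypothetical; the real crate type is arc_swap::ArcSwap), where no guard is ever held across an .await:

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for redlib's OAuth state.
struct Token {
    access_token: String,
}

struct TokenCell {
    inner: RwLock<Arc<Token>>,
}

impl TokenCell {
    fn new(t: Token) -> Self {
        Self { inner: RwLock::new(Arc::new(t)) }
    }

    // Readers clone the Arc out; the read guard lives only for the clone
    // and is never held across an await point.
    fn load(&self) -> Arc<Token> {
        self.inner.read().unwrap().clone()
    }

    // The refresher builds the new token entirely outside the lock, then
    // takes the write guard only for the pointer swap.
    fn store(&self, t: Token) {
        *self.inner.write().unwrap() = Arc::new(t);
    }
}

fn main() {
    let cell = TokenCell::new(Token { access_token: "old".into() });
    let snapshot = cell.load(); // an in-flight request keeps the old token
    cell.store(Token { access_token: "new".into() });
    assert_eq!(snapshot.access_token, "old");    // old snapshot still valid
    assert_eq!(cell.load().access_token, "new"); // new readers see the swap
}
```

With this shape there is nothing for a refresh to deadlock on: in-flight requests keep their own Arc, and the swap never waits on them.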

@freedit-dev
Contributor Author

#160 doesn't work:

Jun 28 05:03:44 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 89
Jun 28 05:03:44 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 375
Jun 28 05:03:45 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 10.0, we have 10.
Jun 28 05:03:45 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 90
Jun 28 05:03:45 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 374
Jun 28 05:03:47 base redlib[25514]:  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
Jun 28 05:03:47 base redlib[25514]:  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 8
Jun 28 05:03:47 base redlib[25514]:  TRACE redlib::client > Ratelimit remaining: Header says 9.0, we have 9.
Jun 28 05:03:47 base redlib[25514]:  TRACE redlib::client > Ratelimit used: 91
Jun 28 05:03:47 base redlib[25514]:  TRACE redlib::client > Ratelimit reset: 372

@sigaloid
Member

It doesn't roll over the tokens?

The logging order doesn't really matter; that's not necessarily indicative of an issue. But you're saying it hangs again?

@freedit-dev
Contributor Author

@sigaloid Right, it hung again, even on the latest commit.

Maybe we should look at these again: as quoted above, the extra logging showed it stuck at

let resp = client.request(request).await.ok()?;

and restarting fixed it; I then tried wrapping the request in a 10-second tokio::time::timeout.

Even the timeout doesn't work.

@ButteredCats
Contributor

I updated and mine hung again within a minute; it didn't seem to last any longer, just like freedit's.

When I can, I'll enable the extra debugging and see if I have any logs to add, but that won't be until tonight, and I doubt there's anything different from what freedit has sent.

@freedit-dev
Contributor Author

Restarting always fixes it, so my instance now auto-restarts every 20 minutes.

@sigaloid
Member

Yeah, the issue is that the rollover doesn't work once the rate limit is hit, even though new token generation works at startup.

@freedit-dev, you mentioned arc-swap. I should have time tonight to push out a commit that switches to it. I assumed a RwLock for the OAuth client would be the best solution, but honestly I'm not sure it is anymore. I don't know why it's not working, but I seriously doubt it's performance: CPU-based atomics are plenty to handle this workload of a few dozen requests a minute. On that note, what is your CPU architecture?

@sigaloid
Member

Interesting. Check out #162, let me know here if it helps

@ButteredCats
Contributor

Seems to have fixed it for me!

@sigaloid
Member

Splendid to hear!!

@freedit-dev
Contributor Author

So far so good. Thanks!

@sigaloid
Member

Amazing. Glad I was finally able to fix it on the third go-around; huge thanks to @freedit-dev for recommending ArcSwap. I bet it was exactly the deadlock you mentioned. 🤔 I did limit the scope to as small as possible, but ArcSwap is the better long-term solution. Thanks again @ButteredCats and @freedit-dev for helping me debug this, as it really only showed up under heavy load! Closing :)
