[1.8.0] "No active pools" with dozen of them #194
Comments
Sorry, but there hasn't been a single change to the network code since 1.6.x. I restart my proxy several times an hour without issues. With such a fucked-up log file I can't help you. |
Maybe you're using a newer boost/GCC/updated Visual Studio or other libs than on your old build. |
I don't know why the log is pasted that way :) My locale is Russian; that error is something like "IO operation was cancelled". |
As I said, I've noticed miners getting stuck like that on previous builds. My environment hasn't changed other than the miner and CC server versions. But I've restarted the proxy before (on earlier versions) and it wasn't that reproducible. |
Falling back to the older version for now; I will try to reproduce it with an English locale. |
Is it Snippa's, MoneroOcean or some other fork? I saw this multiple times and it was always (in my case) a proxy issue. I'm running multiple 1.8.0 miners atm and everything works well (MO proxy with some amendments on rPi2/bionic/docker container). |
It isn't a proxy issue: the miner has plenty of backup pools, it just stops doing anything when it encounters that error on the main pool. But the proxy is xmrig-proxy, if that's somehow relevant ... |
Without a proper log I can't help you. But to be honest, the only change in 1.8 is the algorithm. Nothing else. So if the problem exists, it is in all versions since 1.6. I would bet that if I created a new build with just the version incremented, there would be ppl saying the old one was better. |
Okay, I've reverted the miners to 1.7.0. They stop mining when the main proxy goes offline, with the same error message but without "no pools, stop mining", and they don't switch to backup pools either; but when the main pool comes back online they resume mining. That's the difference from 1.8.0. |
I published the log on pastebin > https://pastebin.com/tqEpdSkE |
When they lose the connection to the proxy they have to stop, because everything they mine is pure waste. When the pool connection is dropped, the miner will try again after some time. How long did you wait? |
I don't see a problem in your log. It starts mining after a while. Btw, I still just see ?????? in the log. |
Of course they stop when they lose the connection to the pool, but they don't switch to any backup pool either. There was probably a 5-minute window between when I restarted the proxy and when I noticed that the miners weren't doing anything. And the miner config is 5 retries with a 1-second timeout. |
Oh, I forgot to mention: that's the 1.7.0 behaviour, it resumes mining; 1.8.0 doesn't. I'll get home in an hour and try to reproduce it again with Russian and English locales. Thanks for your attention. |
Sorry for pointing at the proxy, but the log is literally unreadable... It might be something with boost, but something similar was in xmrig <2.5.3, where the miner couldn't recover the connection or switch to a failover pool, and there is no boost used in the original xmrig, so... |
Full log > https://pastebin.com/vN0EV0e0 To trigger the bug you need to configure at least 3 pools: pool #1 is online, then goes offline; pool #2 is unresolvable (either a nonexistent domain name, an unresponsive IP or a closed port); pool #3 is the fallback pool. When the miner first connects to pool #1 and that pool then goes offline, the miner gets stuck at pool #2 and doesn't go any further, suspending mining indefinitely (even when it is configured with 1 retry and a 1-second retry-pause). Note that if pool #1 is offline at startup, the miner successfully connects to fallback pool #3, skipping unresolvable pool #2; but when pool #1 comes online and then goes offline again, it still gets stuck at #2. Hope that helps. |
Thanks for the tests. Your 2nd test will not have this bug; I didn't test that case either. As for resuming mining after the main pool goes offline and online again, I actually forgot to test that, I was too focused on the issue that I was seeing even on older versions (tested above). Will test that case now. |
I'll record some videos tomorrow. But from what I've tested so far, 1.7 behaves exactly the same way 1.8 does. So this has to be a coincidence. |
Silly me. The reason I created this ticket is that after a proxy restart a major part of my miners was gone from the dashboard, so I assumed they had stopped mining, but that actually means that they either crashed or could not connect to (or were refused by) the CC server. I'll try to reproduce this now. So there are actually 2 separate bugs. One is an indefinite loop on an unresolvable pool, where the miner just stops mining; this bug has been here for at least two releases prior to 1.8.0, I just wasn't bothered enough to investigate it until now. The second one is still in question. |
Okay, I upgraded to 1.8.0 again and restarted the proxy and CC server several times; it's not reproducing. So there's only one bug for now: the miner can't fall back to pools further down the list after an unresolvable one.
Exactly, this bug was in 1.7.0 too :) Tell us if you're still unable to reproduce it. |
I have the same results as @djfinch. You're the one who said it was working on 1.7; I always said that 1.7 does the same as 1.8, and that's still valid. |
I didn't quite get it: you don't consider @djfinch's test #1 behaviour a bug? As for the second bug (miners disappearing from the dashboard for an unknown reason), I will watch closely and report if I stumble upon it again. I had per-miner logs disabled, so I couldn't see what was going on client side; I've enabled them now, so it will be easier to see what's going on. |
I don't know what's misleading... "I have the same results as @djfinch." In other words, "I see the issue he was able to reproduce too". I was able to reproduce that. But this is still a corner case. I will look into it, though. Why the hell aren't you fixing your proxies? Maybe you have DNS issues? I restart my proxies 50 times a day when hopping coins, and I don't have these issues... |
Okay, got it. That pool config is kind of a legacy workaround: my gateway was acting weird and I added an alternative IP as well; I don't remember exactly what happened. Never mind, I'll remove these unresolvable pools for now as a workaround and report if there are problems with restarting the proxy/CC. PS: 50 times a day? Holy cow... I'm not that dedicated :) Btw, how's progress with the proxy integration? |
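For anyone applying the same workaround, here is a minimal sketch of a pre-flight check that flags pool entries whose host name does not resolve. It is not part of xmrigCC: `unresolvable_pools` and its `resolve` parameter are hypothetical names, entries are assumed to be plain `host:port` strings, and only the DNS case is covered (an unresponsive IP or a closed port would not be caught).

```python
import socket


def unresolvable_pools(pools, resolve=socket.getaddrinfo):
    """Return the "host:port" entries whose host name fails DNS resolution.

    `resolve` is injectable so the check can be exercised without a network;
    by default it is the standard library's socket.getaddrinfo.
    """
    bad = []
    for entry in pools:
        # Split on the last colon: everything before it is the host.
        host, _, port = entry.rpartition(":")
        try:
            resolve(host, int(port))
        except socket.gaierror:
            # Name resolution failed for this entry.
            bad.append(entry)
    return bad
```

Running it against the pool list before starting the miner would show which entries to drop or fix.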
Can you please test this branch: https://github.com/Bendr0id/xmrigCC/tree/proper_handling_of_dns_issues The described cases from @djfinch work now. DNS issues are now handled like normal connection errors: the miner jumps to the next pool and keeps retrying, and once the main pool is back it jumps back to it. |
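As a rough illustration of what that description could mean, here is a toy Python model of one pass over the pool list. It is not the actual xmrigCC code (which is C++); the `probe` callback, the status strings and the `dns_errors_advance` flag are all made up for this sketch. With the flag off, the model mimics the stall reported earlier in this thread; with it on, a DNS failure is treated like any other connection error.

```python
def pick_pool(pools, probe, dns_errors_advance=True):
    """Return the index of the first pool that connects, or None.

    probe(pool) is a hypothetical callback returning one of
    "ok", "conn_error" or "dns_error".
    """
    for index, pool in enumerate(pools):
        status = probe(pool)
        if status == "ok":
            return index
        if status == "dns_error" and not dns_errors_advance:
            # Modelled stall: the failover chain never gets past this pool.
            return None
        # Any other failure: move on to the next pool in the list.
    return None
```

With pool #1 offline, pool #2 unresolvable and pool #3 up, the model returns pool #3 only when DNS errors are allowed to advance the list.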
@uz-spark tested? |
Uh, sorry, not fixed. |
Why? It looks perfect: after 5 attempts it always tries to connect to the next one, and it keeps trying the others until it is able to connect to one; then it stops retrying. And at the end it is connected to one of your fallbacks. [2018-10-20 19:11:57] [inexistant_domain_name:6666] Error: "[Connect] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond" But it will always keep trying to connect to your primary server. Once it can connect to it, it will jump back to it. All that is the expected behavior. [2018-10-20 19:13:40] new job from Can't see an issue here. |
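The expected behavior described here can be sketched in a few lines of Python. This is only a model under stated assumptions, not the miner's implementation: `connect_with_retries`, `maybe_fail_back` and the `try_connect` callback are hypothetical names, and the default of 5 attempts per pool is taken from the comment above.

```python
def connect_with_retries(pools, try_connect, retries=5):
    """Walk the pool list once, giving each pool up to `retries` attempts.

    Returns the index of the first pool that connects, or None if every
    pool failed in this round (the caller would then start over).
    """
    for index, pool in enumerate(pools):
        for _ in range(retries):
            if try_connect(pool):
                return index
    return None


def maybe_fail_back(active_index, pools, try_connect):
    """While on a fallback, probe the primary and jump back once it answers."""
    if active_index != 0 and try_connect(pools[0]):
        return 0
    return active_index
```

A miner loop built on this would call `maybe_fail_back` periodically while connected to a fallback, and `connect_with_retries` whenever the active connection drops.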
I started the proxy back up to see if the miner would connect to it. It's the main proxy, first in the pool list. I've retested it with clearer settings. The hashrate is low but the difficulty is 100, and although there's some hashrate even after 'no pools, stop mining', you can see that it doesn't submit anything to the failover proxy. |
I tested a lot of cases and it always recovered, at least once the main proxy was back. In your log file, there was a phase where no fallback was responsive. I don't know if that's a real use case. |
In that case the failover pool is my proxy on the same machine as the main proxy, just on another port. I don't know why it was unresponsive for a second; maybe the host was too busy (it's on a VM). But I don't see how that's not a real use case: I just replaced a regular pool with my proxy to lower the difficulty, so I can see the hashrate more clearly. |
I mean, I think that's a pretty normal use case: Pool #1 - your main proxy |
I'll do a test with the config as close to default settings as possible to see if that's the issue; I don't know why you can't reproduce that. |
Case 1: Case 2: |
Case 3: |
Btw, the same applies when the 2nd proxy has bad DNS. Just tested it. |
Here, I retested with an almost default config and with both proxies and the miner on the same machine. In your test you don't reproduce my steps. I'll describe it again in detail: Pools :
Steps :
What you're missing in your test is that pool #2 must be offline when pool #1 goes offline; otherwise, even if pool #2 then goes offline too, the miner successfully connects to pool #3, because pool #1 is already unresponsive and the only thing it can do is try to connect to the other pools down the list, pool #3 in that case. |
So, were you able to confirm this or not? |
I noticed something like that with previous versions, but with 1.8.0 this issue is on another level.
[2018-10-17 19:47:16] * POOL #1: $PROXY1$:3333
[2018-10-17 19:47:16] * POOL #2: $PROXY2$:3333
[2018-10-17 19:47:16] * POOL #3: $PROXY3$:3333
[2018-10-17 19:47:16] * POOL #4: de01.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #5: fr01.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #6: at01.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #7: hk01.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #8: de02.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #9: fr02.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #10: at02.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #11: hk02.supportxmr.com:3333
[2018-10-17 19:47:16] * POOL #12: pool.supportxmr.com:3333
[2018-10-17 19:47:16] * CC Server: $PROXY1$:3344
[2018-10-17 19:47:16] * COMMANDS: hashrate, pause, resume, quit
[2018-10-17 19:47:16] Starting thread 1/3 affined to core: #0 -> huge pages: 1/1 scratchpad: 2.0 MB
[2018-10-17 19:47:16] Starting thread 3/3 affined to core: #1 -> huge pages: 1/1 scratchpad: 2.0 MB
[2018-10-17 19:47:16] Starting thread 2/3 affined to core: #2 -> huge pages: 1/1 scratchpad: 2.0 MB
[2018-10-17 19:47:16] use pool $PROXY1$:3333
[2018-10-17 19:47:16] new job from $PROXY1$:3333 with diff 3000 and PoW 1
[2018-10-17 19:47:19] accepted (1/0) diff 3000 (1 ms)
[2018-10-17 19:47:38] new job from $PROXY1$:3333 with diff 3000 and PoW 1
[2018-10-17 19:47:39] accepted (2/0) diff 3000 (3 ms)
[2018-10-17 19:47:53] SIGHUP received, exiting
[2018-10-17 19:47:53] no active pools, stop mining
[2018-10-17 19:47:53] [$PROXY1$:3333] Error: "[Read] The I/O operation was aborted because of a thread exit or an application request"
That is, on proxy restart the miner stops mining entirely and doesn't resume. Strangely enough, I have a few machines that stop mining too (no pools, stop mining) but resume when the main proxy comes back online.
[2018-10-17 17:49:09] new job from $PROXY1$:3333 with diff 3000 and PoW 1
[2018-10-17 17:49:10] new job from $PROXY1$:3333 with diff 3000 and PoW 1
[2018-10-17 17:49:18] [$PROXY2$:3333] Error: "[Connect] (unreadable: Russian-locale message lost to encoding)"
[2018-10-17 17:49:19] [$PROXY1$:3333] Error: "[Read] End of file"
[2018-10-17 17:49:19] no active pools, stop mining
[2018-10-17 17:49:19] [$PROXY1$:3333] Error: "[Read] (unreadable: Russian-locale message lost to encoding)"
[2018-10-17 17:49:20] use pool $PROXY1$:3333