[BUG] Printer freezes on bigger prints. #17161

Closed
Haxk20 opened this issue Mar 14, 2020 · 85 comments
Labels
A: LPC176x · Bug: Potential ? · Needs: More Data (We need more data in order to proceed)

Comments

@Haxk20
Contributor

Haxk20 commented Mar 14, 2020

Bug Description

We slice the model in Cura, transfer it to an SD card, and start the print.
The print starts out just fine, but after some time (anywhere from 10 minutes to several HOURS) it just freezes. The steppers stop moving and the LCD just says the SD card has been plugged in. But I know that nobody touched the SD card (I stood next to the printer for those 10 minutes and saw it just stop).
Weirdly enough, this also happens when printing over USB. The LCD says the same thing.
I had this happen on a huge print (1.5 days), which stopped after 16 hours.
I would be kinda OK with it if the bed and hotend cooled to room temperature when it stopped, but they stay heated. This is a huge fire hazard.

My Configurations

Configuration files.zip

Steps to Reproduce

  1. Start a print.

Expected behavior: The print starts and finishes fine, resulting in a printed model.

Actual behavior: The print starts and freezes after some time.

Here is the G-code of a print that stopped after 17 minutes:
Gcode.zip

@Haxk20
Contributor Author

Haxk20 commented Mar 16, 2020

I just tried wiping the entire SD card. That gave me one successful vase mode print, but the second one failed. @thinkyhead I opened a new issue as you requested.

@Fantomiaso

I have exactly the same issue.
SKR 1.3 + TMC2208 in UART mode. StealthChop (linear advance is OFF in firmware). No display. Anycubic i3 Mega hardware.
Prints from both the onboard SD card and USB freeze at a random point. Heaters, motors, and fans remain ON. The USB-serial bridge stops responding, but after reconnecting USB I can send commands to Marlin again. I think the problem is in the SKR hardware. When I tested the board outside the case, any touch of the board or wires could halt it. Inside the metal case the board is shielded, but some interference can still occur.
Tomorrow I will try better grounding of the board and case. If that does not help, I may add some capacitors on the 5V and 3V3 lines. In my opinion, reliable operation needs 2000-3000uF on 12/24V and at least 220uF on 5V and 3V3, but there is only 100uF on 12/24V, 22uF on 3V3, and 33uF on 5V.

My config:
Config.zip

@thinkyhead
Member

thinkyhead commented Mar 16, 2020

When Marlin itself is paused or stopped, it sends messages to the host at regular intervals, so that is something to look for.

Marlin will sit and wait if commands are no longer coming in from the SD card or from the host. There's currently no failover mode that deals with a hung up host or an SD card that stops responding. So that would be something new to work on.
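For anyone checking this from the host side: the regular messages mentioned above appear to be the `busy:` keepalive lines controlled by the options below. This is a minimal sketch, assuming a stock Marlin 2.0.x Configuration.h; the values are the usual defaults and are not taken from the configs posted in this thread.

```cpp
// Configuration.h (Marlin 2.0.x) - host keepalive settings.
// Assumption: stock defaults; verify against your own config.
#define HOST_KEEPALIVE_FEATURE        // Send "busy" status messages to the host while Marlin cannot accept commands
#define DEFAULT_KEEPALIVE_INTERVAL 2  // Seconds between "busy" messages; adjustable at runtime with M113
#define BUSY_WHILE_HEATING            // Also send "busy" messages while waiting for heaters
```

If the serial log keeps showing lines such as `echo:busy: processing` after the freeze, Marlin is still running and waiting on something. If the log is silent but Marlin still answers a manually sent command such as M105, it is most likely idle and waiting for commands rather than hung.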

@thinkyhead added the Bug: Potential ?, A: LPC176x, and Needs: More Data labels Mar 16, 2020
@martend

martend commented Mar 17, 2020

I run SKR 1.3 and 1.4 boards with matching TMC2208s on 24V and don't run into these problems. I use BTT stepsticks in UART mode with StealthChop etc. on. I also run one fan over the stepsticks and used the little potentiometers to turn the factory current setting of 1.2 down to about 1.0 / 0.8. While doing some prints I observed that if the sticks got too hot, or an axis (stepper motor) had too little or too much current, the print would fail. Other functions would continue like nothing happened, including the serial connection. A flimsy connection or cable could also be the cause; our cheap hardware wants good connectors and decent wires :-) I'm halfway through a 5.5-hour print now, and later this evening I plan to start a 12-hour one with the current Marlin firmware on the board. I'll report how it turns out.

@Fantomiaso

Added 470uF capacitors on the 3V3 and 5V lines and 2200uF on 12/24V. Will try some prints over the next 24 hours.

@Fantomiaso

Fantomiaso commented Mar 18, 2020

OK, things are getting more interesting. At first glance, adding capacitors solved the problem, but after 4-5 hours the print stopped. Further attempts failed after random durations.
Then I updated the firmware to the latest version and, while migrating configs, I changed
#define NO_TIMEOUTS
With the value 1000 the print gets much further, but I noticed freezes lasting about one second. Eventually the print stopped as well. It was a 2.5-hour print with about 8 freezes. The situation is the same with USB and SD prints. I'll try configs from other printers with a working SKR 1.3 and Marlin to find out what could be wrong. I'll also get one more SKR to try, but with TMC2209 drivers.

@martend

martend commented Mar 18, 2020

@Fantomiaso could you share the gcode file you are printing? Would like to start a print and see what it does on my machine.

@Fantomiaso

Fantomiaso commented Mar 18, 2020

One of the parts with issues:
AI3M_Ai3MEGA-S_CaliperHolder_Right.zip

> @Fantomiaso could you share the gcode file you are printing? Would like to start a print and see what it does on my machine.

@makemerush

I previously had this issue on an SKR 1.4 Turbo with TMC2130 in SPI mode for X/Y/Z and a TMC2225 in UART mode for E.

Build 2.0.5.1 has fixed it for me! Not sure what changed that would have resolved it though. Recommend the new build if you haven't tried it yet.

@Haxk20
Contributor Author

Haxk20 commented Mar 18, 2020

@makemerush The thing is, this isn't a new issue. A few people had it months ago. Then it stopped out of nowhere, and now it's back. It's really crazy because I have just done a few prints and all of them went fine. And they weren't quick prints; they were 5+ hour prints. So TBH I have no real clue where the issue is coming from.
Oh, and did I forget to say that the issue went away without reflashing the firmware or anything?
Yeah, it seems to be gone, just out of nowhere.
And the worst part is I never know when it will come back.

@Fantomiaso

OK, I think I know what the problem is in my case. I started a new print from SD via Repetier-Host and detached the USB cable from the printer (I don't have a display at all, so I start prints from the host). No freezes, no print stops. I tried different cables and hosts. In all cases except one, the problem came back. That one case was when I used a battery-powered Odroid XU4 as the host. My theory is that the USB implementation is bad at the hardware level. Maybe power interference, maybe signal. I don't have enough instrumentation to figure it out. I will try a USB print from the Odroid next week, when I get a stronger battery.

@Haxk20
Contributor Author

Haxk20 commented Mar 19, 2020

I turned off StealthChop here and the print has been going fine for 2 hours so far. It is a long print, so I will report whether it freezes or finishes.

@Fantomiaso

> I turned off StealthChop here and the print has been going fine for 2 hours so far. It is a long print, so I will report whether it freezes or finishes.

There is a known problem with TMC2208 drivers. An extruder driver in StealthChop mode may freeze when using linear advance. The RepRap wiki suggests using:
#define MINIMUM_STEPPER_PULSE 2
(if that does not work, try 4)
When TMC2208s are used in "classic" mode (no UART), only the extruder driver freezes. I suspect that in UART mode its freeze can lead to deadlocks when the response from this one driver never arrives. If setting the pulse duration does not help, it is enough to switch only the extruder driver to SpreadCycle, and everything starts to work. The problem is not observed with the other drivers, or when linear advance is turned off.
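The workaround described above could look roughly like this in the config. This is a minimal sketch, assuming a Marlin 2.0.x Configuration_adv.h and TMC2208 drivers in UART mode (so the `STEALTHCHOP_*` options take effect); it is not taken from any config posted in this thread.

```cpp
// Configuration_adv.h (Marlin 2.0.x) - sketch of the workaround described above.
// Assumption: TMC2208 drivers in UART mode; adapt to your own driver setup.

#define MINIMUM_STEPPER_PULSE 2   // Minimum step pulse width in microseconds; try 4 if 2 does not help

#define STEALTHCHOP_XY            // Keep StealthChop on X and Y
#define STEALTHCHOP_Z             // Keep StealthChop on Z
//#define STEALTHCHOP_E           // Left disabled so the extruder driver runs in SpreadCycle
```

With UART-connected drivers, the stepping mode can also be switched at runtime with M569 (for example `M569 S0 E` for SpreadCycle on the extruder), which makes it possible to try the change without reflashing.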

@Haxk20
Contributor Author

Haxk20 commented Mar 19, 2020

I'm not running TMCs on the extruder for this exact reason. I run them only on X and Y.

@Haxk20
Contributor Author

Haxk20 commented Mar 19, 2020

And the printer just froze.

@martend

martend commented Mar 19, 2020

Thanks for the file @Fantomiaso, will fire up that one later.
I printed a nearly 6-hour file yesterday and it went well, but my cold end was not playing nice, so I needed to fix that before starting a long print. I'll start a 12-hour one this evening.
@Haxk20 I noticed that you and others use different models of stepper drivers in your configs, running TMCs only on X and Y. I, on the other hand, have all axes populated with TMC2208s, with StealthChop on in UART mode. Could it be that mixed stepstick configurations don't play nice together at the moment? If I had any drivers other than TMCs to put in my board I could test along, but I don't have any.

@Haxk20
Contributor Author

Haxk20 commented Mar 19, 2020

It is very much possible that there is something going on between them. Unfortunately I cannot test with different ones; I gave the original ones to a friend.
But TBH it doesn't make sense. You are experiencing the exact same issue as us. "Media inserted" on the screen.
OK, wait, I just thought of something. This always started when I changed filament. TBH this sounded very crazy at first, because how could filament cause this, right?
Well, each filament differs in how much force it can handle before it deforms, right?
This matters in the extruder too, since it needs to grab the filament with its teeth. If the filament is gripped harder and deforms more or less, the extruder pushes more or less filament.
This in turn can make the motor work harder and require more current. One solution is to raise the current, but that's not the correct one: because the deformation is different, the extruder is pushing a different amount of filament. And PETG and PLA are chemically very different. So the correct solution is to recalibrate the extruder steps per mm.

(I just came up with that and I may be horribly wrong. So tomorrow I will recalibrate and see what happens. As a safety precaution I will also raise the extruder current.)

@thinkyhead
Member

Are you able to force the printer to freeze by turning on heaters and doing fast printing moves? If you switch to "dry run" (or disable the heaters) does the freeze stop happening?

We've recently had an issue reported where an under-powered PSU experienced a voltage drop whenever the machine was printing at high speed with the heaters on (which you kinda need). If you are able to eliminate the problem of a drop in voltage at the input to the board, that will help narrow it down to… something else.
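To put rough numbers on the under-powered PSU scenario, a back-of-the-envelope power budget can help before reaching for a multimeter. The sketch below uses hypothetical helper names (`Load`, `total_watts`, `headroom`), and the wattages in the usage note are made-up placeholders, not values from any config posted here.

```cpp
#include <initializer_list>

// One consumer on the PSU rail. The wattages passed in are placeholders;
// measure or look up the real values for your own machine.
struct Load {
  const char* name;
  double watts;
};

// Sum of all loads that can draw power at the same time.
inline double total_watts(std::initializer_list<Load> loads) {
  double sum = 0.0;
  for (const Load& l : loads) sum += l.watts;
  return sum;
}

// Fraction of the PSU rating left unused; a negative value means overloaded.
inline double headroom(double psu_watts, double load_watts) {
  return (psu_watts - load_watts) / psu_watts;
}
```

For example, `headroom(240.0, total_watts({{"bed", 180.0}, {"hotend", 40.0}, {"motion", 30.0}}))` is negative: those placeholder loads add up to 250 W against a 240 W supply, which is the kind of situation where the input voltage can sag once heaters and steppers draw power together.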

@Haxk20
Contributor Author

Haxk20 commented Mar 19, 2020

> Are you able to force the printer to freeze by turning on heaters and doing fast printing moves? If you switch to "dry run" (or disable the heaters) does the freeze stop happening?
>
> We've recently had an issue reported where an under-powered PSU experienced a voltage drop whenever the machine was printing at high speed with the heaters on (which you kinda need). If you are able to eliminate the problem of a drop in voltage at the input to the board, that will help narrow it down to… something else.

It is actually absolutely possible that it's a voltage drop. I'm technically running the bed at the limit of what the PSU can deliver. The issue is that I tried to order a new PSU, but with the world in its current state it's just not possible to get one here in the coming month.
But yes, I will absolutely try a dry run without heating up the printer.

@Haxk20
Contributor Author

Haxk20 commented Mar 20, 2020

Tested without heating and it finished fine @thinkyhead

@Haxk20
Contributor Author

Haxk20 commented Mar 20, 2020

I will heat it up tomorrow and just not add any filament to see if that freezes too.

@Haxk20
Contributor Author

Haxk20 commented Mar 24, 2020

@martend @makemerush @Fantomiaso Can you please tell me which slicer you use?
I just tried one small print that crashed on me when sliced with Cura, and sliced with PrusaSlicer it printed. Now trying a bigger print that crashed on me 100% of the times I tried it.

@makemerush

Cura 4.5 here

@Fantomiaso

In my case the problem is not in software, but in the combination of the oversimplified schematics of the SKR board (no protection at all except fuses) and a lack of grounding. I live in Belarus, and grounding in houses only became mandatory about ten years ago. My house is much older. I checked the voltage between the PC housing and the 3D printer and it was floating, with peaks near 80 volts. Even a neon tester screwdriver glows when touching the printer or PC, which tells us that it is either phase voltage or strong electromagnetic interference. That voltage occasionally latches up the MCU. The same G-code files printed fine from SD with USB detached, or via USB from an Odroid single-board PC running on battery or a fully isolated power source, but failed when USB was connected to my PC.

@Haxk20
Contributor Author

Haxk20 commented Mar 24, 2020

@Fantomiaso Can you tell me the slicer tho ?

@Fantomiaso

Fantomiaso commented Mar 24, 2020

Slic3r 1.3.1 and Cura 4.5

@cooldudie2

The problems still persist. I have now also ruled out the following:

  1. The spread jumper is not on the board during testing; the mode is set in firmware, and all axes are set to StealthChop mode.

  2. I have tried a more powerful, known-good ATX power supply. I monitored the voltage throughout the print and it barely fluctuated, 0.2 volts either side of 12V, so a power supply issue is ruled out.

  3. No TMC overtemperature pre-warn flags were detected/triggered.

Next I am going to try running the extruder in SpreadCycle mode only, to see if that's the cause; as suggested, moving the filament in and out constantly like a yo-yo may trigger a TMC lockup.
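For others who want to repeat the pre-warn check in item 3, these are the options that expose the driver status. This is a minimal sketch, assuming Marlin 2.0.x and TMC drivers wired for UART or SPI so that Marlin can read them; it is not a copy of the poster's configuration.

```cpp
// Configuration_adv.h (Marlin 2.0.x) - driver monitoring sketch.
// Assumption: TMC drivers connected via UART or SPI so their status can be read.
#define MONITOR_DRIVER_STATUS   // Periodically poll the drivers for overtemperature pre-warn and error flags
#define TMC_DEBUG               // Enable the extended M122 driver report
```

With these enabled, M122 prints a driver status report and M911 reports whether the overtemperature pre-warn flag was triggered since it was last cleared with M912.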

@amonpaike

amonpaike commented Jun 5, 2020

Guys, it hurts me to say that although I made some refinement adjustments, as was suggested to me a few posts ago, it froze on me again. Luckily, with the recent firmware that resumes prints that stop on power loss, this problem is no longer as "risky" as it was before... but the fact is that it is still here. Rarely, but it still happens.

@boelle
Contributor

boelle commented Jun 21, 2020

@Haxk20 still an issue?

@boelle
Contributor

boelle commented Jun 23, 2020

Lack of Activity
This issue is being closed due to lack of activity. If you have solved the
issue, please let us know how you solved it. If you haven't, please tell us
what else you've tried in the meantime, and possibly this issue will be
reopened.

@boelle closed this as completed Jun 23, 2020
@Haxk20
Contributor Author

Haxk20 commented Jun 23, 2020 via email

@zsoltkovacs

I ran into the same problem: SKR E3 Mini v1.2 + TMC2209 + OctoPrint.

If I print via OctoPrint, the board freezes after a while: the bed temperature goes to maximum, there is no activity on the steppers, the display freezes, the button does not respond, and it drops the serial connection, so OctoPrint detaches and cannot reconnect. A perfect fire hazard. I will put back an earlier firmware and see what happens.

Prints go well if I print from the SD card.

@c3D-Dan

c3D-Dan commented Jul 4, 2020

Another 24 hr print ruined due to inexplicable freezing

@thinkyhead
Member

> Another 24 hr print ruined due to inexplicable freezing

@c3D-Dan: From your report it sounds like a hardware issue. Bad grounding or a board with a bad voltage regulator?

@amonpaike

> Another 24 hr print ruined due to inexplicable freezing

I also recently risked losing a 10-hour print. Luckily I had activated resume on power loss, which also works with these strange freezes (because you are forced to reset the printer anyway), and I managed not to lose the printed model.
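The resume feature mentioned above is a build-time option. This is a minimal sketch, assuming Marlin 2.0.x and printing from SD card, since the recovery data is written to the card; any sub-options are left at whatever your config already has.

```cpp
// Configuration_adv.h (Marlin 2.0.x), SD card section.
// Assumption: prints run from SD card; recovery does not cover host (USB) prints.
#define POWER_LOSS_RECOVERY   // Periodically save print state so an interrupted SD print can be resumed
```

Recovery can be switched on or off at runtime with M413. Note that this only limits the damage after a freeze and reset; it does nothing about the cause.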

@c3D-Dan

c3D-Dan commented Jul 16, 2020

Changed to DRV8825 and thought the problem was fixed: a 2-day print completed successfully. The very next print failed after 36 hrs. This is on an STM32F4 platform.

@c3D-Dan

c3D-Dan commented Jul 16, 2020

> > Another 24 hr print ruined due to inexplicable freezing
>
> @c3D-Dan: From your report it sounds like a hardware issue. Bad grounding or a board with a bad voltage regulator?

I highly doubt it. Printing from a new micro SD card, this time from the board's own micro SD reader. Changed motherboards to a BTT GTR, rewired everything and, as above, changed from TMC2209 to DRV8825. Still freezing for no apparent reason. Clearly the printer thinks it's finished: the "Motion" menu is available, heaters are on, and the LCD is still responsive. It just stopped in its current position.

Not sure if it's relevant, but this time it appears to have frozen on a Z movement. It is a vase mode print, so I can see the freeze occurred at the point where Z increases.

@thinkyhead
Member

Without more data, I'm afraid there's nothing we can follow up on here.

@LeBassiste

LeBassiste commented Jul 17, 2020

i have an original UM2+ chassis w/ BTT SKR V1.4 TURBO and 4 x TMC5160 drivers running in non-stealthchop mode.
using the original ULTIMAKER 2+ ULTICONTROLLER, printing from the ulticontroller SD slot, with USB on the BTT connected to pronterface on the host computer for monitoring only. using CURA 4.6.1 w/ GCODE flavor set to MARLIN.

symptoms:

  1. printer sometimes freezes hours into the print, sometimes after just a couple of minutes. always using the same GCODE file and same SD card for testing. the SD card is a SANDISK EXTREME PRO 32GB micro SD in a micro SD adapter, plugged into the ulticontroller.
  2. when freezing, the display still shows "printing..." for about a second, then the BTT board does a reset: it shows "snow" on the display, then the bootscreen, then the normal start screen. all heaters are then off, the parts fan is off, and the hotend fan starts running if the hotend temp is still over 50°C.

observations:

  3. it seems that prints are more likely to complete successfully when i do a power OFF/ON cycle on the printer prior to starting the print.
  4. when the printer is switched on and left idling (just doing nothing), no reset occurs (i have it right on my desktop while working, so i can watch it the entire time).
  5. used a thermal camera to check BTT board temperatures. the highest is on the TMC5160 @ 60°C, all others (processor, on-board buck regulator and 3.3V LDO) are below 50°C.
  6. during all prints, i never got an OT pre-warn from the drivers.

the configuration I'm using is located here:
[https://github.com/LeBassiste/Marlin-2.0.5.3-SKR-V1.4-Turbo-UM2Plus/blob/master/Marlin-2.0.5.3-SKR1.4T_UM2plus_V2.zip]

note that i'm using the servo connector on the BTT to control the hotend-fan with PWM.
can do more testing and measurements, but need guidance as to what to look for. can play with config files, but don't have a (useful) programming background.

@ellensp
Contributor

ellensp commented Jul 17, 2020

Please test the bugfix-2.0.x branch to see where it stands. If the problem has been resolved then we can close this issue. If the issue isn't resolved yet, then we should investigate further.

@c3D-Dan

c3D-Dan commented Jul 17, 2020

> Without more data, I'm afraid there's nothing we can follow up on here.

Let me know what data you require and I'll do everything I can to provide it.

@c3D-Dan

c3D-Dan commented Jul 17, 2020

> Please test the bugfix-2.0.x branch to see where it stands. If the problem has been resolved then we can close this issue. If the issue isn't resolved yet, then we should investigate further.

Am using Bugfix 200005 2.5.3 from memory. Problem remains. Printer is very well grounded.

@ellensp
Contributor

ellensp commented Jul 17, 2020

That's ancient... 200006 is the current bugfix.

@LeBassiste

LeBassiste commented Jul 17, 2020

testing 200006 right now. did reset eeprom and bed level.
(i have difficulty testing the freeze issue, because as soon as the SD card is inserted into the ulticontroller card slot, the info screen shows "media inserted" and starts to flicker irregularly at approx. half-second intervals. when trying to select the "print from media" menu item w/ the click wheel, the menu sometimes jumps back to the info screen and sometimes stops flickering.
tried #define DOGM_SPI_DELAY_US 20 but to no avail. any other settings i should try in the display config? i have to use the ulticontroller though, can't change to other display types.)

ok, managed to get printing something: printer freezes after third layer, display still flickering, menu no longer responding to clickwheel.
edit: board does not go through reset now, as it did with marlin 2.0.5.3.

@LeBassiste

could do some more testing with 2.0.5.3 and bugfix 200006.
i actually had two "overlapping" issues, which i did not fully understand before:
1. the printer would randomly stop mid-print when printing from SD card (SD card connected to J14, i.e., EXP1): the info screen changed from "printing..." to "media inserted" and all heaters and fans stayed ON. a 100nF capacitor on EXP2 between J13:7 and J13:9 on the BTT board seems to have fixed that issue. after this, i was able to compare 2.0.5.3 and bugfix 200006 behavior during longer prints, w/o getting this particular behavior anymore.
2. unfortunately, the printer still randomly stops operating on both FW versions approx. 1 hr. into the print, and goes through a reset.

anything else i could test?

@thinkyhead
Member

A reset is highly unusual and often indicates a hardware problem of some kind. Check your grounding, especially.

@c3D-Dan

c3D-Dan commented Jul 20, 2020

#18358

This could have been, and was, passed off as a "grounding" issue until @minosg did the groundwork to identify the problem on STM32F1 boards. In his words, he suspects "this issue is not limited to them (STM32F1)".

Further, quoting minosg:

> @c3D-Dan I have suggested some items above. But the core idea is to spread the non-critical tasks around in the time domain so you don't freeze a handler so long that bits are lost.
>
> And no, this is definitely a core framework bug, which occurs because Marlin prepares the right conditions for it.
>
> Inherently Marlin trusts the frameworks to properly implement handlers, and they have to trust Marlin as the user to be reasonable in how it uses the hardware.
>
> If I were looking at your platform and the symptoms were the same, I would still check whether your framework's UART handler does anything for the ORE (overrun) bit case.

Admittedly, I'm assuming this is a framework issue. Sure, my experience is anecdotal at best. However, I have checked and rechecked grounding, replaced processors at first, added heatsinks to the MCU and more cooling, replaced the TMC drivers with new TMC drivers, then replaced the motherboard entirely, swapped the TMC drivers for DRV8825s, and rewired everything. I'm pretty sure this issue is beyond the reasonable capacity of an everyday Marlin user.

@thinkyhead Is there anything you can do to mitigate the conditions @minosg refers to, as a workaround until the framework can be fixed properly? I understand that fixing issues in frameworks isn't one of your responsibilities, and I super appreciate the insane work you put into this project. However, this issue is clearly widespread among Marlin users with large build volumes. It should not be ignored, although it has perhaps been overlooked. I think it's safe to say that if I had a standard 30x30x30 printer, I'd rarely encounter this issue, since I could rarely print long enough to hit it. Perhaps if this issue were more repeatable on relatively small prints, it would attract more attention. To be clear, the same printer, printing oodles of Kinder Surprise objects, performs perfectly well. It's not until I (attempt to) print for days on end that this issue rears its ugly head.
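To give a feel for why a stalled handler loses bytes in the overrun scenario quoted above, here is a minimal sketch of the timing involved. It assumes plain 8N1 framing and a deliberately simplified receiver model; the function names are hypothetical and the code is not taken from Marlin or from any framework.

```cpp
#include <cstdint>

// Time on the wire for one 8N1 frame (1 start + 8 data + 1 stop = 10 bits),
// in microseconds.
inline double frame_time_us(std::uint32_t baud) {
  return 10.0 * 1000000.0 / static_cast<double>(baud);
}

// Simplified, hypothetical receiver model: one received byte is already
// waiting in the data register when the handler gets blocked, so every
// further frame that completes during the blocked window is lost (overrun).
inline std::uint32_t bytes_lost(std::uint32_t baud, double blocked_us) {
  return static_cast<std::uint32_t>(blocked_us / frame_time_us(baud));
}
```

Under this model, at 250000 baud the handler can be held off for only about 40 us before the next byte is at risk, and a 1 ms stall (`bytes_lost(250000, 1000.0)`) would drop around 25 bytes. Real UARTs with a FIFO or DMA tolerate more, so the numbers are an order-of-magnitude guide rather than a measurement.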

@LeBassiste

LeBassiste commented Jul 20, 2020

@thinkyhead
apologies for being not quite clear on describing the issue. when the issue happens, the printer simply stops, but still displays "printing..." on the info display. it takes a second or two from there before the printer goes through a reset. (dark screen, boot logo, ...). i actually thought that this behavior is a MARLIN feature where a watchdog brings the printer back to a safe state after freeze. what am i missing here?
in fact, i concur with you on the probability of a grounding issue. for that matter, i tested the printer with and w/o USB connected to the PC: no change. also, the power supply has a strong (and functioning) PE connection on the mains AC side, as well as the host PC has. on top of it, both printer and PC are operated off the same AC phase.

as a side note: i'm using 4 x TMC5160 with #define TMC_USE_SW_SPI
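On the watchdog question raised above: Marlin does have a hardware watchdog option, shown below. This is a minimal sketch, assuming the stock option names from a Marlin 2.0.x Configuration_adv.h; whether it is enabled in the configuration linked earlier has not been checked here.

```cpp
// Configuration_adv.h (Marlin 2.0.x) - watchdog settings.
// Assumption: stock option names; check your own file.
#define USE_WATCHDOG              // Reset the board if the firmware stops refreshing the watchdog timer
#if ENABLED(USE_WATCHDOG)
  //#define WATCHDOG_RESET_MANUAL // Alternative reset method for boards where a hardware watchdog reset hangs
#endif
```

A freeze followed a few seconds later by a reboot with the heaters off is consistent with the watchdog firing after the firmware got stuck, although a brown-out reset would look much the same from the outside.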

@c3D-Dan

c3D-Dan commented Jul 20, 2020

> as a side note: i'm using 4 x TMC5160 with #define TMC_USE_SW_SPI

I suspected the TMC drivers were related to the issue, so I swapped them for good ol' DRV8825s at 32x microstepping. I did finally manage to complete a 3-day print; however, the following print failed on day 2, heaters on. Perhaps my first successful print was a fluke?

I'm completely prepared to test any theory anyone can throw at this without question, but within my pay grade I'm not sure there's anything else I can do to help. In order to test and debug, I need the support of those with more knowledge and experience to suggest more things I can try.

Repeatedly insisting this is a PSU, grounding, heat-related, or driver-related issue is of little help given the time and $ I've spent exploring those ideas to no avail.

@LeBassiste

LeBassiste commented Jul 30, 2020

to whom it may concern, had another freeze this morning, running the bugfix 020006. 2 hrs. into the print, printing from SD card, USB not connected. while displaying "printing...", printer freezes and switches off part fans (nozzle fan keeps running). after approx. 3-4 sec., goes through reset. had successfully printed the same gcode file (5 hrs. total) just yesterday.
any other bugfix approach i should test?

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions bot locked and limited conversation to collaborators Sep 28, 2020