-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Likely incorrect Hercules breaking/stepping handling/support? #322
Comments
Fish,
I think you are reading this correctly. As I read it, I wouldn't have chosen the word 'default', but rather simply that some models have one rate control and some models have multiple rate controls.
Yes.
I agree, As a Hercules user, I would like to see the behavior that you stated as your opinion; that ANY CPU reaching a breakpoint would stop the entire system.
I think that what are you are stating that the behavior should be is correct. However, again as a Hercules user, I would prefer the current behavior, where a range is honored. It is very difficult to diagnose code issues from the Hercules console already but having a range greatly reduces (for example) the impacts of interruptions constantly occurring while stepping, in that you can focus on a specific range and if the interruption happens and is processed by the OS and then control returns right to the next instruction in the range (as if the interruption didn't occur). I can appreciate that you want Hercules to accurately emulate the real hardware and its capabilities but in this case I think it would be a disadvantage. I'd point out that if one does not specify a range then the whole addressing space is the range, thus emulating the desired behavior that you stated. Having a smaller range available though is a very powerful nice to have. Regards, |
Hmmm... So what you're saying is, once any CPU reaches the specified breakpoint ("Address-Control") range and the the entire system (all CPUs) are stopped and instruction stepping mode is thus engaged/activated, that, as agreed, pressing the <enter> key would behave as if the "startall" command was given and thus each CPU would then single-step one and ONLY one instruction for each <enter> key press (i.e. even though the other CPUs might not be within the specified breakpoint range, because one of them is, they would all remain in instruction stepping mode, executing only one instruction at a time), but once the original CPU reached the end of its stepping/breakpoint range (or, for example, temporarily began executing instructions outside the stepping range, such as would occur if it did a Is that correct? If so, that might be tough to do! I'll have to think about that a bit. But THANK YOU! for your input/thoughts/opinion on this matter, Bob! Much appreciated buddy! |
Yes, I think you are correct. I really can't imagine that anyone would want other CPUs to continue to run and possibly change the state of things while one CPU is stopped in a range and that person is trying to evaluate the state of things at the stop. However, usually when I know that I am going to be using breakpoints and stepping, I change the configuration to have only one CPU anyway. For me at least, that is an acceptable work around and I don't think you need to try to do anything as an enhancement; again, speaking strictly for myself. Regards, |
Depending on what you do how would this s effect the SIGP instruction I am also wondering if you were to set SetThreadAffinityMask to the regs thread how it would effect issue #321 |
Huh? What the heck are you talking about? |
I just wonder if there is an IBM redbook on this and how it relates to particular models I googled z15 (latest Mainframe model) for more info |
As far as I know IBM does not document anywhere the way a particular specific mainframe model behaves. Instead, it only documents the ARCHITECTURE (i.e. the way ALL mainframe models are supposed to behave): And Chapter 12 is what documents how breakpoints and instruction stepping is supposed to behave. |
I got into the code but in a disabled interrupt exit lots of things I cannt do I want to get a good chuck of code running and I have to be careful because out of all the environments in z/os disabled code is most restrictive and there is a limited amount of things I can do. I have a few things to do sat night for a couple of hours but I plan on spending the weekend working on this thank you |
hope to have something for you sunday |
Fish I would most definitely want all other CPU's to run I know your perspective its emulating the hardware for me its debugging z/os code. I came across this debugging an SRB (code that has an option to run in a other address space with restrictions) I put in the option async. meaning wait for the code in my address space to run when the SRB completes. But typically a SRB would run concurrently stopping all CPU would make such difficult code even harder to debug, at the very least if you can give this option (letting other CPU's run) I for one would appreciate it thanks |
Isn't that what I originally suggested?
I suppose we could add a third And if there's one thing we try very hard to do with Hercules, it's conform to the published architecture! Which means the default behavior should be to stop ALL CPUs and have them ALL enter instruction step mode, and pressing the <enter> should cause ALL CPUs to execute a single instruction each. None of the CPUs would be allowed to run normally once instruction stepping mode was activated. But again, that would only be the default. The Cool? Would that be acceptable to you? |
I'm prepared to make it worth your while (I know that's not what this about) but I am kind of surprised no one noticed this bug. There really isn't anyway to debug certain code besides using this feature. Having worked with XDC ( the only debugger that debugs code in cross memory code and common storage ) this debugs code that I don't think that debugger does. Truthfully without exspatnn (excessive spin conditions actions) member allowing a cpu to spin z/os could freeze in a short period of time. I don't know when that parmlib member was introduced. Maybe a lot of the users aren't aware of this and haven't used this feature. |
Mostly just to make them consistent with one another. The new start_all_cpus function will be needed for eventual #322 implementation. [skip travis]
I'm just a very casual Hercules user, but from my point of view, it is not a problem if Hercules can do more than the real hardware could. This is true already in any case (you can't just duplicate a real life DASD, for example). Also, the architectural parts of debugging facilities are probably specified only very vaguely because they are not really part of the architecture-as-observable-from-the-software, but are more a meta-architecture. So if some models have only a single rate control which works on all CPUs, while other models have a rate control for each CPU separately, that would easily allow Hercules to have both of these at the same time, at the user's choice. Or some generalisation of them that makes sense. Including some sort of "range" which limits where the rate control is active. On the VICE emulator (for 8-bit Commodore machines) there is a full debugger with disassembler and breakpoints on program counter values or memory accesses and lots of other functionality. The original hardware had no such facilities (but these days you'd plug in an ICE for something similar, or there would be a JTAG port for it) but it absolutely makes sense for an emulator to have them. If only to help debugging the emulator itself. |
Word of caution here stopping even one cpu is not that simple z/os get bent out of shape when it doesn’t get a decent response from a cpu and stepping definitely slows that down even ONE cpu It would be good practice to issue a parmlib member to let cpu spin and that’s only for a limited time I think stopping all at least from a z/os perspective would wreck total havoc in short order and might defeat what ever the user intended
… On Aug 17, 2021, at 11:23 AM, Rhialto The M. ***@***.***> wrote:
I'm just a very casual Hercules user, but from my point of view, it is not a problem if Hercules can do more than the real hardware could. This is true already in any case (you can't just duplicate a real life DASD, for example). Also, the architectural parts of debugging facilities are probably specified only very vaguely because they are not really part of the architecture-as-observable-from-the-software, but are more a meta-architecture.
So if some models have only a single rate control which works on all CPUs, while other models have a rate control for each CPU separately, that would easily allow Hercules to have both of these at the same time, at the user's choice. Or some generalisation of them that makes sense. Including some sort of "range" which limits where the rate control is active.
On the VICE emulator (for 8-bit Commodore machines) there is a full debugger with disassembler and breakpoints on program counter values or memory accesses and lots of other functionality. The original hardware had no such facilities (but these days you'd plug in an ICE for something similar, or there would be a JTAG port for it) but it absolutely makes sense for an emulator to have them. If only to help debugging the emulator itself.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or unsubscribe.
Triage notifications on the go with GitHub Mobile for iOS or Android.
|
Note: Issue #321 ("Same instruction sometimes executed TWICE when stepping") is closely related to this one.
I need some peer review/input/feedback on this.
(*)
During my efforts to try and resolve Issue #321 ("Same instruction sometimes executed TWICE when stepping"), I have discovered some interesting things that Hercules is currently doing that doesn't seem right, but is unclear whether it actually is or not due to the vague/imprecise way in which the Principles of Operation manual is worded on the subject.
I'm talking specifically about "Chapter 12: Operator Facilities".
It's a very short chapter (it's only 7 pages long, so I would like to encourage each of you to please read it!) and it documents the proper architectural behavior of various "Operator Controls" such as the 'Load-Normal' key (i.e. the IPL or Load button), the 'Start' and 'Stop' keys (buttons), etc.
The only ones that Hercules implements that I am interested in discussing at this time as they are the only Controls directly related to GitHub Issue #321, are the "Address-Compare" Controls, the "Rate" Control, and the "Start" and "Stop" Keys as they specifically behave in "Multiprocessing Configurations" (i.e. on machines with multiple CPUs):
It is this very last statement above that interests me the most!
It would seem to imply that the DEFAULT is for there to be only ONE rate control for the entire Multiprocessing Configuration, which in turn seems to imply that when there is only one control for the entire configuration, that instruction stepping therefore applies to ALL CPUs in the configuration, which in turn implies that the 'Start' and 'Stop' controls apply to ALL CPUs in the configuration as well whevever instruction stepping is active (whereas when instruction stepping is NOT active, the 'Start' and 'Stop' controls only apply to that one specific CPU).
Am I reading this correctly? Are my above assumptions essentially correct? Or do others interpret things differently?
What I'm thinking of maybe doing is providing a new tracing option (
TRACEOPT
) called, maybeBREAKSTEP
with two settings:ONLY
andALL
, where theONLY
setting causes pressing the <enter> key to behave as it does today: it only "starts" the current target CPU, whereas the newALL
setting would cause pressing the <enter> key to behave as if the "startall" command were issued instead, thereby issuing the start function for all CPUs, causing each of them to step one instruction just the same as the current target CPU does today.Does that seem right/reasonable to you guys?
Second related question: Currently, Hercules will only "break" (enter instruction stepping mode) whenever any CPU attempts executing any instruction within the currently specified break range (identical today to the stepping range, which is yet another bug) but continues normally if that particular CPU never happens to execute any instructions in that range, REGARDLESS of whether or not some other CPU in the configuration has reached the breakpoint. This does not seem right to me.
IMO once a breakpoint has been reached by ANY CPU, the entire system should stop (i.e. ALL CPUs should stop and then enter instruction stepping mode). Do you agree or disagree with this?
Finally, one final question which I believe may be a bug since it seems to violate the Principles of Operation (even though technically is doesn't mention anything about this): we should not (IMHO) be supporting the ability to specify a "instruction stepping range" the way we are today. IMO, once a breakpoint is reached and the processor(s) (CPU(s)) is/are stopped, the breakpoint range should be automatically temporarily disabled and the instruction stepping should immediately be entered -- WITHOUT any "range". That is, all instructions (for all CPUs IMO!) at any address should be stepped, and not just those that happen to be within the specified breakpoint/stepping range the way it works today. That is to say, IMO the way it works today is dead wrong IMO. There should be no such thing as a "instruction stepping range". There should only be a "instruction BREAK range" to emulate the architected "Address-Compare" Controls, and a single flag -- NOT a range! -- to enable instruction stepping (i.e. to emulate the architected "Rate Control") for ALL CPUs in the configuration.
Sorry for such a long post, but it needed explaining. I need to know whether or not you guys concur with my assessment of the situation and my proposed fix/enhancement. Hercules's stepping/tracing logic is very complicated and given the evidence provided in GitHub Issue #321 clearly buggy. I want to get it fixed, but I don't want to break things further or suddenly introduce brand new unexpected behavior either.
So I'd really appreciate some thoughtful feedback from you.
(*)
Thanks.
(*)
And by "you" I mean not just my fellow Hercules developers (although their opinions are the ones I'm admittedly mostly interested in), but also any/all experienced Hercules users as well. If you have knowledge/experience in this area then I want to hear from you! Thanks.The text was updated successfully, but these errors were encountered: