-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Net cleanups #50
base: master
Are you sure you want to change the base?
Net cleanups #50
Conversation
…a remote message. When we read from some client A, we may end up forwarding a message to other clients. If we forward to some client B and there is a write error, then we close B and remove it from the client list. However, if before this happened A->next == B, then the read loop will still be holding on to a pointer to B, and we crash. As it's unpredictable what clients could be closed at what point, the simplest approach is to retain closed clients in the list until we are at a point where we know there are no stray pointers on stack, and only then modify the list. This also simplifies anything that has to loop over clients, as it doesn't need to worry about the current client being freed under it.
Mostly refactoring the common code that was duplicated between the different output types so that there aren't many copies floating around. This introduces "struct net_writer" to store the state of a particular type of output service - buffers, time of last write, connection count etc. prepareWrite() / completeWrite() give access to the buffer and handle the actual writes and flushing when needed. Heartbeat and time-based flushing move into a generic periodic-work function. Update the SBS output code to use the new infrastructure. This makes a big different to CPU use when under load.
Whilst I like the vast majority of this code, and agree it tidies things up, the problem I have is backward compatibility with existing installations - specifically changing the meaning of the --net-ro-size and --net-heartbeat command line switches. There is a lot of "pseudo" documentation on various forums that would be invalidated by these changes. I think the way around this would be to add new command line switches for 'your' new variables, and then add conversion factors to convert from "rates" to "times" where necessary. That would allow current init.sh scripts and dos batch files to still work. So I'll file this pull in the "To be looked at again later" pile. Thanks. |
There are some merge conflicts that prevent this from applying cleanly to the current codebase. Can you sync your branch with the upstream? |
I don't know. I am using git hub under protest, and don't understand how a lot of it works. As soon as a merge fails I get in a right mess. My usual resolution is to take copied of both branches, manualy merge the differences (using beyond compare on windows) , and then re-post the result. However, this loses the history of the original poster. I'm open to suggestions and hints how to do this directly using git-hub tools :-) |
Just so I'm clear - I'm talking about your lines 1487 and 1663 being unsafe. |
From memory (it's been a while) I tried not to change the meaning of the existing switches - the meaning of the internal values has changed, but the option parsing got correspondingly scaled so the same option value should yield the same behaviour overall. I'll try to find some time to doublecheck that, and to update this for against the current master. |
It's conventional to make the person that submitted the pull request to sync their branch by pulling the upstream changes into it (mutability:net-cleanups, in this case) and resolving any merge conflicts themself. They can then push those recommitted conflict fixes to their branch and the pull request will automatically be updated by github to includes those new changes. |
I merged current master into this branch. Compiles, and I've eyeballed the diff, but not tested. I stomped on some changes from #33, preferring my code, as I don't believe that those changes are sufficient to avoid list corruption (see my comments on #48) (Also juggled the EOF detection code around a bit so it wasn't testing the same thing in several places) |
I looked the compatibility of the options, and I think they're OK. --net-heartbeat took a value in seconds and it still does now. The default has changed slightly, from approx 57s (900 * 64ms approx) previously to 60s now. --net-ro-rate previously took a value in 1/15th second units (actually "number of sample buffers" - so it would change if you resized the buffers or changed the sample rate) . It now behaves as if you'd provided --net-ro-interval with a value in seconds (dividing by 15). I removed this option from the help but it's still supported if provided. The only practical change is for values that represent fractional seconds; they now get truncated when converted to seconds. Is it useful to support values that represent fractional-second intervals here? It'd require changes in net_io.c to do that as currently it uses time_t to track time, which has a resolution of 1 second. We'd have to find a fractional representation that both posix & windows were happy with (does windows have gettimeofday or clock_gettime?) --net-ro-interval is the replacement for --net-ro-rate and takes values in seconds. --net-ro-size is unchanged, I think. I changed the naming of the internal variables to reflect what it's really setting: the high-water-mark for flushing data to the network. It does not (and did not, previously) control the total buffer size. Is there some change in behaviour you see here? |
This is moderately invasive; I'm happy to just keep it in my tree if you don't want to merge. (Need it locally mostly for the SBS output speedup - as otherwise my Pi runs out of CPU very fast)
See comments on 1769ac9 for most of the detail