-
Notifications
You must be signed in to change notification settings - Fork 454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: TunnelCommunity multiprocessing #2607
Conversation
ce63494
to
3025cab
Compare
Ok.. within the same timestamp a 1.9MB download goes from 0.0 to 1.0 after checkpointing. @devos50 is this the infamous checkpointing bug? (Or is this my error?)
|
8669a65
to
6e8b6c9
Compare
@qstokkink no I can't explain that error unfortunately. |
This is an actual bug in my code. Confirmed to fail on localhost reliably for this branch and pass reliably on devel. |
Alright, the bug is fixed by replacing a line of code I thought was duplicate (which it is, normally). The cause is that the We should probably look at refactoring the following line of code into the tests, where it belongs: |
@devos50 ready for whenever you feel masochistic enough to review |
@qstokkink there is much code coverage to gain. When will you write the unit tests for these uncovered lines? |
@qstokkink I also see many pylint errors that are easy to fix |
@devos50 As far as code coverage goes, I can write unit tests for the following classes somewhat easily:
And I really should, you are correct: I'll mark this as WIP until I have implemented these. The rest would really make more sense being covered by an integration test. As far as the pylint errors go, I believe fixing them all would be very much possible and very much treating the metric: I am prepared to defend each of these violations. |
ea5e024
to
1a0baf9
Compare
Alright, unit tests are now covering all classes which are not:
I think these would be better suited in a Gumby integration test, which will follow later. Note that inclusion of category 2 would mean having to come in with root access once in a while to clean up orphaned processes. Category 3's would be possible using a really fat testing framework, but then the unit tests (a) become small integration tests, (b) would also require the classes themselves to be refactored to allow for testing and (c) only serve to bring coverage up to 100% for trivial lines, while advanced functionality is already covered by the other unit tests. For now I will mark this as READY again, feel free to challenge me on any pylint violation or class assignment to the integration tests. |
@qstokkink I started reviewing, however, it might take a few days because I'm also busy with wrapping up the first alpha release of Tribler 7. Regarding the tests: the coverage of you code is quite low and we want more coverage by unit tests. While at first glance, it seems that writing tests for various classes leads to integration tests, it is often not very hard to create small self-contained unit tests using mocking. Take for instance the I don't expect that you cover all possible lines (since some are almost impossible to cover), however, the coverage of your code by the means of unit tests must be higher. The lack of unit tests is something that has plagued Tribler for many years and has lead to many regression bugs. Please try to keep these unit tests as small as possible and make sure they are only testing the thing that you actually want to test. You can take a look at the test suite in Tribler, especially the Core tests. There are various classes which are tested using mocking. If you have any questions, please let me know! |
@devos50 I hate having to implement a factory pattern/adding complexity just for testing purposes. Your point about regression bugs is very valid though, so I will (attempt to) squeeze out some more tests. |
@qstokkink I've tried various torrents on MacOS (Ubuntu, Pioneer, Big Buck Bunny). Some of them are starting the download and others are not. When the download is starting, the subprocesses seems to work correctly 👍 I don't see significant improvements to the download speed, however, I don't download much anonymously. Will test on Windows next |
Also, review is coming! |
@devos50 Thanks for the efforts! This last week-and-a-half I have been doing thesis writing, so no real progress on the additional tests: although I have started working on the Gumby experiment. Also, I guess it's at least good to see that the CPU usage is under control, even though the download speed hasn't improved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed your work. Overall, it's quite impressive. I see some advanced data structures and methods being used (SyncDict
, RPC
) and it's interesting to see how they are working together.
There is still a bit confusion left on my side on which processes there are exactly so maybe you can make this more clear (at least in your thesis)? It would also be nice if you can write a page on our readthedocs
(http://tribler.readthedocs.io/en/latest/) about the structure of the new tunnel community so other users can read about it, however, this is probably something for after this PR.
I'm very positive that the code can be merged before the release of Tribler 7 and that we can ship it as an experimental feature that can be enabled in the settings pane (this means that it will be disabled by default unless @synctext has other ideas about this). I haven't tested on Windows yet since I'm unable to get the old GUI working on it so if you have any Windows computer around, please try it out to see whether it works.
Moreover, before the Tribler 7 release, we should test how well the code behaves in a frozen environment (after baking the final .dmg
/.exe
file that is distributed) since there are some minor quirks in such environments regarding multiprocessing.
There are still many Pylint issues open that are very easy to fix. Also, try to improve the code coverage by writing small unit tests.
If you comment my feedback, please use fixup commits so I don't have to review everything again after you have made changes :)
self.tunnel_community = self.dispersy.define_auto_load( | ||
HiddenTunnelCommunityMultichain, dispersy_member, load=True, kargs=tunnel_kwargs)[0] | ||
if self.session.get_tunnel_community_pooled(): | ||
if self.session.get_tunnel_community_test_pooled(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should add test-dependent code to our Core code base. Instead, you can load this test community in the test(s) where you need this.
self.tunnel_community = self.dispersy.define_auto_load( | ||
PooledTunnelCommunity, dispersy_member, load=True, kargs=tunnel_kwargs)[0] | ||
else: | ||
if self.session.get_tunnel_community_test_pooled(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above (no test-dependent code in our Core)
Tribler/Core/SessionConfig.py
Outdated
""" | ||
return self.sessconfig.get(u'tunnel_community', u'pooled') | ||
|
||
def set_tunnel_community_test_pooled(self, value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment about test-related code in the core code base.
Tribler/Main/tribler_main.py
Outdated
@@ -912,6 +912,15 @@ def run(params=[""], autoload_discovery=True, use_torrent_search=True, use_chann | |||
params = get_unicode_sys_argv()[1:] | |||
else: | |||
params = sys.argv[1:] | |||
|
|||
if "--tunnel_subprocess" in params: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also add this option to the twistd plugin, both to the Tribler runner (tribler_plugin.py
) and the tunnel helper plugin (tunnel_helper_plugin.py
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that tribler_main.py
will be removed in a while. I guess this code is quite important for the functioning of the pooled community?
Also, is it possible to have this code in a separate file?
self.ctrl_written += s | ||
|
||
def clear_callbacks(self): | ||
"""Deal with the aftermath of calling send_rpc without a response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please have the """ and the text on separate lines (that's more in line with our used style of documentation)
# Configure MiniSession | ||
config.set_state_dir(working_dir) | ||
config.set_torrent_checking(False) | ||
if not test_mode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test-related code.
:return: RPC response code | ||
:rtype: str | ||
""" | ||
if not self.session_started: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these kind of events be logged?
required_endpoint, | ||
info_hash) | ||
|
||
def on_data(self, s): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename this parameter to be more meaningful
@@ -1401,3 +1427,22 @@ def increase_bytes_received(self, obj, num_bytes): | |||
else: | |||
raise TypeError("Increase_bytes_received() was called with an object that is not a Circuit, " + | |||
"RelayRoute or TunnelExitSocket") | |||
|
|||
def set_process(self, process): | |||
"""Set the TUnnelSubprocess we belong to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra capital letter in the word TunnelSubprocess
circuit.unverified_hop = None | ||
|
||
if circuit.state == CIRCUIT_STATE_EXTENDING: | ||
ignore_candidates = [self.crypto.key_to_bin(hop.public_key) for hop in circuit.hops] + \ | ||
ignore_candidates = [self.hops[h].node_public_key for h in circuit.hops] + \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please name the h
variable better.
@devos50 Thanks for the review! I have one rebuttal, for several of the comments on the names in the pooled_tunnel_community: the names in this class have been adopted to provide the same interface as the TunnelCommunity. If I were to change these particular method names, I would have to refactor the TunnelCommunity and the rest of Tribler. This also ties in to the TODO's for duplicate code: I was planning to give the TunnelCommunity and the PooledTunnelCommunity a shared interface/abstract parent class for structural clarity. However, I deemed this too complicated for this PR and left it as a TODO for the future. |
f8e0092
to
ea20c6c
Compare
0060ab6
to
8c8b100
Compare
Welp. I finally got it working on Windows. Turns out Windows has special binary flags for the stdio streams. Fix incoming in a little bit. |
7dd823b
to
431c065
Compare
@devos50 This is confirmed working (again) on Windows and Linux. What do you want to do with this PR? (rebasing this is going to be a little bit of a bitch, so I'll hold off on that until this gets greenlighted for merging) |
congrats on the Windows victory! |
@qstokkink good to hear! Why not rebase now? It has to be done anyways. @synctext do we want to ship this as experimental feature in the upcoming Tribler 7 release or should we wait with merging until we released v7? |
If this is stable and mature, then it's great stuff to merge. Quality over Tribler feature set.. |
d5c2857
to
51aea88
Compare
Alright, history has been rewritten, pylint errors have been minimized (I can't get rid of the remaining ones or everything breaks 😃 ): as far as I can see this is ready for review again. |
496c5a1
to
0ca5155
Compare
0ca5155
to
f7ca6ca
Compare
Changing this to WIP to implement some things on my wishlist for this PR:
|
102000b
to
8157422
Compare
With the advent of IPv8 and separating the GUI from the core, 90% of this PR is no longer required. We can filter out the useful stuff later, if/when we need it. |
Related to #1882 .
This PR adds a multiprocessing option for the TunnelCommunity by using Twisted's reactor.spawnProcess.
Architecture
The architecture of this implementation consists of two halves:
The following subsections will cover these two views. After that the generic process interface will be discussed.
Main process view
Using the
pooled
option in the settings the main process can decide to substitute its TunnelCommunity (/HiddenCommunity) for a PooledTunnelCommunity. This PooledTunnelCommunity offers the same interface as the normal TunnelCommunity. The big difference of course, is that it offloads (1) circuit creation (create_circuit
), (2) sending LibTorrent data (send_data
) and (3) retrieving download info/peers (monitor_downloads
) to other (child) processes.This offloading to child processes is handled through the ProcessManager class, leaving the PooledTunnelCommunity solely in charge of decorating the TunnelCommunity interface. This keeps the code clean and focused.
The wrapper for the child processes which the ProcessManager uses, is the TunnelProcess class (
processes/tunnel_childprocess.py
). The TunnelProcess takes care of the high-level interfacing with the TunnelSubprocess class running on the actual child process (more on that in the next subsection).The low-level twisted Protocol interfacing is handled by the TunnelProcess's parent, the ChildProcess.
Subprocess view
The TunnelSubprocess is in charge of running a TunnelCommunity which supports RPC calls from the main process. This is achieved (in line with the existing
tunnel_helper
code) by running a very light-weight Session. The low-level interfacing with the main process is handled by the TunnelSubprocess's parent class, the Subprocess class.Note that the TunnelSubprocess is in charge of all encryption/decryption and management of circuit and hop constructs. The TunnelSubprocess will share its circuits and hops with the main process, but it is the only process allowed to edit these.
The IProcess
To minimize duplicate code, both the TunnelProcess and TunnelSubprocess inherit from the IProcess interface. This interface is implemented on the ChildProcess by means of the twisted ProcessProtocol.transport and on the Subprocess using
os
file descriptors and raw threads. These raw threads on the Subprocess are delegated to the twisted threadpool a.s.a.p. to avoid threading issues. Building on the IProcess is the RPCProcess which provides generic serialization for RPC calls.The generic RPC interface, in turn, allows for object synchronization using the SyncDict class. This class is a subclass of the
dict
class, which checks for updates to its entries (RemoteObjects) and sends out minimized update messages/diffs to synchronize the main process's view of the shared objects.Data flows
There are three data flows between the main process and a child process. These consist of:
These data flows are implemented using separate process file descriptors (so, next to stdin, stdout and stderr). Note that this leaves the stdout and stderr open for debugging purposes.
The rationale for each of these streams will be discussed in the following subsections; each of these different flows has radically different data characteristics, which would otherwise interfere.
Control flow
The control flow handles messages with high diversity and low volume. This comes down to the main process requesting community setup, circuit creation and download monitoring and the child process forwarding notifications, SyncDict changes and circuit deaths.
Raw data flow
The raw data flow deals with no diversity and high volume messages. Its only purpose is to serialize and deserialize LibTorrent data as fast as possible.
Exit messages
The exit messages have a low latency requirement. If these messages were to have to wait for both data and control messages this would add unnecessary delay to Tribler shutdowns. This turned out to be especially important in a previous Proof of Concept, in which a bug clogged the pipe and the process could not exit (non-forcefully).
Reviewing this PR
The commit history of this PR has been fabricated in such a way that: