Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of RawSocket using ljsyscall #655

Merged
merged 7 commits into from
Jan 5, 2016

Conversation

dpino
Copy link
Contributor

@dpino dpino commented Nov 6, 2015

Reimplements the existing RawSocket app using ljsyscall.

assert(self.fd)
assert(C.send_packet(self.fd, p) ~= -1)
function RawSocketDev:transmit (p)
local _, err = S.write(self.sock, p.data, p.length)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if write fails to transmit p.length data? In RawSocket:push we barely call self.dev:transmit(p) and then unconditionally packet.free(p). I would suggest to retry sending until whole p.length is sent. Of course then we get branches in the Lua code, which could be better implemented in C for performance reasons (at least that is what Alex Gall did here #638)

As for the error handling, I can imagine a testcase where we send packets from a 82599 10Gbps port to a 1Gbps eth0. Then you can easily get EAGAIN or EWOULDBLOCK (see man 2 write) for our non-blocking socket. I guess we could retry a couple of times before we "drop" the packet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if write fails to transmit p.length data?

Good question!

Breaking it down:

  1. How much data should be buffered?
  2. Should this be buffered by the kernel, by Snabb, or both?
  3. What should happen when the buffer is full?

I would like to make a case in favor of the behavior on the PR now:

  1. Buffer should be sufficient for one "burst" of packets. Ballpark 256KB?
  2. The kernel already has a buffer, and we should be able to adjust its size by setsockopt(), so don't need to buffer in Snabb too.
  3. If the buffer is full (EAGAIN or EWOULDBLOCK) then we should drop the packet immediately, provided that we trust the buffer to be big enough to handle bursts and only fill up during actual congestion (insufficient throughput from the kernel).

@lukego
Copy link
Member

lukego commented Nov 8, 2015

Great hacking, @dpino! I love PRs that reduce code size while preserving functionality.

One more opportunity to simplify if you fancy: the code in dev.lua could be merged straight into raw.lua. The separation between I/O primitives and apps does not seem very important these days. That was the way we wrote code back when the "app network" concept was new and experimental but these days it seems to be well established. (That is one perspective anyway.)

@dpino
Copy link
Contributor Author

dpino commented Nov 9, 2015

I merged RawSocketDev into RawSocket. I made some slight modifications too like adding a stop call at the end of the test, and make transmit return the length of the packet transmitted. I have a patch in the queue implementing the change @nnikolaev-virtualopensystems suggested, retry transmit in case the whole length of a packet was not transmitted. Let me know what do you think.

@eugeneia
Copy link
Member

This PR also fixes #644.

@lukego
Copy link
Member

lukego commented Dec 1, 2015

This is a great initiative and the new ljsyscall code looks very crisp and clean compared with having to call out to C.

Sorry to be so slow on a detailed review. I have spotted a couple of small bugs that are worthy of fixing before merge, details below.

local index, err = S.util.if_nametoindex(ifname, sock)
if err then
S.close(sock)
return nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The engine does not support this return nil, err return value. Rather it is always expecting an app instance to be returned. I think the best action here would be to raise an exception with error(...reason...) which the engine then should catch and handle (doesn't right now by default - that is something else we need to fix separately).

@dpino dpino force-pushed the ljsyscall_rawsocket branch from 95cf294 to c89f725 Compare December 8, 2015 15:17
@dpino dpino force-pushed the ljsyscall_rawsocket branch from c89f725 to b4d6f08 Compare December 8, 2015 15:19
@dpino
Copy link
Contributor Author

dpino commented Dec 8, 2015

@lukego: Sorry, it took me a while to update the PR. I added some code to handle err in the constructor. error(errno) prints out a message according to errno and stops execution. I had to push force to update as the branch needed to be rebased against master.

lukego added a commit to lukego/snabb that referenced this pull request Dec 11, 2015
@lukego
Copy link
Member

lukego commented Dec 11, 2015

Thanks Diego! Sorry to take so long merging these recent excellent PRs. Merged to next now.

Side-topic: The intention of our git workflow is to never force-push to a branch that is the source of a pull request. This should not ever be necessary: if it seems to be then please say why so we can improve the workflow docs :-).

For the case where you want to incorporate new changes that have landed on master into a PR branch then the solution should be git merge master to bring them in with a merge commit. I know that many people hate merge commits but the Torvalds school of thought says it is valuable history. (If you don't actually care about the latest changes on master then you should also be able to ignore them and SnabbBot will be smart enough to automatically (re)test the right version i.e. latest master with the PR branch merged.)

The main reason I want to avoid force commits on PR branches is that it will make the master branch into a central bottleneck. If force pushes are allowed then no sane person will merge PRs into their own branches and everybody will wait for the "real" version to land on master before merging fixes and features. This is backwards for our distributed workflow: the master branch should be the most conservative one trailing behind the other branches that are eagerly incorporating and testing early versions of PRs.

@dpino
Copy link
Contributor Author

dpino commented Dec 16, 2015

@lukego Thanks for the long explanation. Now I have much clearer mental modal. Initially I clicked the "resolve conflict" button that merges master into the PR, but that didn't seem to me right. Now I know it's safe to do that.

@eugeneia eugeneia mentioned this pull request Dec 17, 2015
@eugeneia eugeneia merged commit b4d6f08 into snabbco:master Jan 5, 2016
@dpino dpino deleted the ljsyscall_rawsocket branch January 6, 2016 09:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants