Add lib.ctable module. #722
Conversation
Probably needs some documentation. What kind? This is ported from the PodHashMap that we use in the lwaftr. We have performance tests, but I don't know how to support them in Snabb yet. I also removed the streaming lookup bits from this table. We can add those in a separate PR, as they rely on a couple of dynasm modules. As you like.
Another thing I left out was support for saving the table to disk and loading it back again. Will do that in a future commit.
@alexandergall what would it take for you to adopt this for the learning bridge (#638) as a step towards consolidating our table structures? |
(See also #718) |
@andywingo What do you mean by “What kind?” Generally, any public API has to be accompanied by complete API documentation. It would also be useful to update this PR's description to say:
@alexandergall Here's a test MAC table:

```lua
function mac_table_check()
   local uint32_ptr_t = ffi.typeof('uint32_t*')
   local uint16_ptr_t = ffi.typeof('uint16_t*')
   local key_t = ffi.typeof('struct { uint8_t mac[6]; }')
   local value_t = ffi.typeof('struct { uint16_t lo; uint16_t hi; }')
   local cast = ffi.cast
   local bor = bit.bor
   local function hash_mac(key)
      local hi = cast(uint32_ptr_t, key.mac)[0]
      local lo = cast(uint16_ptr_t, key.mac + 4)[0]
      -- Extend lo to the upper half too so that the hash function isn't
      -- spreading around needless zeroes.
      lo = bor(lo, lshift(lo, 16))
      return hash_i32(bxor(hi, hash_i32(lo)))
   end
   -- 14-byte entries.
   local occupancy = 2e5
   local params = {
      key_type = key_t,
      value_type = value_t,
      hash_fn = hash_mac,
      max_occupancy_rate = 0.4,
      initial_size = ceil(occupancy / 0.4)
   }
   local ctab = new(params)
   -- Fill with { i, 0 } => { bnot(i) }.
   do
      local k = key_t()
      local v = value_t()
      for i = 1, occupancy do
         cast(uint32_ptr_t, k.mac)[0] = i
         cast(uint16_ptr_t, k.mac + 4)[0] = 0
         cast(uint32_ptr_t, v)[0] = bnot(i)
         ctab:add(k, v)
      end
   end
   local pmu = require('lib.pmu')
   local has_pmu_counters, err = pmu.is_available()
   if not has_pmu_counters then
      print('No PMU available: '..err)
   end
   if has_pmu_counters then pmu.setup() end
   local function measure(f, iterations)
      local set
      if has_pmu_counters then set = pmu.new_counter_set() end
      local start = C.get_time_ns()
      if has_pmu_counters then pmu.switch_to(set) end
      local res = f(iterations)
      if has_pmu_counters then pmu.switch_to(nil) end
      local stop = C.get_time_ns()
      local ns = tonumber(stop - start)
      local cycles = nil
      if has_pmu_counters then cycles = pmu.to_table(set).cycles end
      return cycles, ns, res
   end
   local function check_perf(f, iterations, max_cycles, max_ns, what)
      require('jit').flush()
      io.write(tostring(what or f)..': ')
      io.flush()
      local cycles, ns, res = measure(f, iterations)
      if cycles then
         cycles = cycles / iterations
         io.write(('%.2f cycles, '):format(cycles))
      end
      ns = ns / iterations
      io.write(('%.2f ns per iteration (result: %s)\n'):format(
         ns, tostring(res)))
      if cycles and cycles > max_cycles then
         print('WARNING: perfmark failed: exceeded maximum cycles '..max_cycles)
      end
      if ns > max_ns then
         print('WARNING: perfmark failed: exceeded maximum ns '..max_ns)
      end
      return res
   end
   local function test_lookup(count)
      local result
      local k = key_t()
      for i = 1, count do
         cast(uint32_ptr_t, k)[0] = i
         result = ctab:lookup_ptr(k).value.lo
      end
      return result
   end
   check_perf(test_lookup, 2e5, 300, 100, 'lookup (40% occupancy)')
end
```

Add that to ctable.lua and test on your machines. (The chunk assumes ctable.lua's local environment, i.e. that `ffi`, `bit`, `lshift`, `bxor`, `bnot`, `ceil`, `hash_i32`, `new`, and `C` are already in scope there.) Here on this old i7-3770 Ivy Bridge desktop I get about 50 ns/lookup. This CL doesn't have streaming lookups yet though, so I can't give those numbers.
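As an aside on the sizing arithmetic in the parameters above: `initial_size` is derived from the desired occupancy and the maximum occupancy rate, so the table never starts above its load-factor ceiling. A standalone sketch of just that calculation (plain Lua, independent of ctable):

```lua
-- Sketch of the sizing arithmetic used in the benchmark parameters:
-- with 2e5 entries and a 0.4 maximum occupancy rate, we need at least
-- occupancy / 0.4 slots so the load factor stays at or below 0.4.
local occupancy = 2e5
local max_occupancy_rate = 0.4
local initial_size = math.ceil(occupancy / max_occupancy_rate)

print(initial_size)                 -- 500000 slots
print(occupancy / initial_size)     -- 0.4 load factor at full occupancy
```

Keeping the load factor low like this trades memory for shorter probe chains, which is what makes the per-lookup numbers below plausible.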
Thanks @eugeneia, that's what I was asking for. I am surprised a bit by the conventions though -- to me a PR shouldn't say what the code does; the code should say it, augmented by comments if needed. This module is lacking the documentation block, of course. I also don't really get the purpose of out-of-line per-function documentation; it is bound to drift out of sync. We're always spelunking in the code anyway, so to me the right place to document is there. Obviously we disagree about how to do things :)
Frankly, I'm extremely glad to have the out-of-line per-exposed-API documentation. I pretty much always end up reading the code afterwards, but that serves a different set of purposes.
Basically, what @kbara said, but to elaborate a bit:
Documentation and the actual API will not drift out of sync, because I do code review and will not accept PRs that break the API (or lack documentation).
I feel like I'm being lectured to here a bit and I don't enjoy it. Respectfully, @eugeneia, I would have thought that as a person who does code review you would have followed some of the long, long discussions about ffi hash tables that have taken place over the last 3 months, and that for that reason we would start from implementation concerns rather than process. Anyway, thank you for the documentation link, and I look forward to your comments on the code.
@andywingo @lukego I would certainly appreciate it if I could replace my MAC table hack with this. Unfortunately, I don't have the time right now (I need to finish the NixOS packaging first). I haven't looked at the code either, but I'm curious how this affects the issue I was having with "branchy loops" in my bridge app. If someone volunteered to write a performant bridge app for me, I wouldn't mind :)
@alexandergall if you use the streaming lookup, there are no branches at all if you have no hash collisions. That's because it fetches the entire bucket, so to speak, into cache, then uses a branchless binary search for the hash in that bucket. (I use the terminology from your MAC table.) If you use the API in this PR you can get side traces, though, because of the linear probe. On the other hand, this API fetches less memory than a streaming lookup, which can make it competitive. In practice we haven't experienced the branchy loop problem (yet?), and this table is integrated in the lwaftr. Question for you though: what is your performance target? Is 50 ns/lookup OK?
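For illustration, here is a plain-Lua sketch of searching a sorted bucket for a hash value. The real streaming lookup does this branchlessly in dynasm; the bucket representation and names here are assumptions for the sketch, not the actual code:

```lua
-- Sketch: after fetching a whole bucket into cache, find the target hash.
-- The hashes within the bucket are assumed sorted, so a binary search
-- applies; the production version compiles this to branchless machine code.
local function bucket_search(hashes, target)
   local lo, hi = 1, #hashes
   while lo <= hi do
      local mid = math.floor((lo + hi) / 2)
      if hashes[mid] == target then
         return mid                     -- index of the matching entry
      elseif hashes[mid] < target then
         lo = mid + 1
      else
         hi = mid - 1
      end
   end
   return nil                           -- hash not present in this bucket
end

print(bucket_search({3, 7, 9, 12}, 9))  -- 3
```

The point of fetching the whole bucket first is that the search then touches only cached memory, so mispredicted branches (or, here, their absence) dominate the cost rather than cache misses.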
On an interlaken-alike haswell E5-2620v3, 20 runs of that benchmark I pasted above:
@eugeneia @andywingo GitHub Pull Requests are unfortunately ambiguous. People open a PR with a specific intention, but what is that intention? (To get code accepted upstream ASAP? To get early feedback on a design before going forward? Just to be social and show the code they are working on?) So it is easy to rub each other the wrong way, e.g. by not bothering to merge something that you assume is a draft, or by giving "I can't merge this until X, Y, Z" feedback on code that is not prepared for merge. I'm not sure how to reduce this. I tend to put hints in my PR titles like …
@andywingo No lecturing intended at all. I am sorry if my response came across as harsh; it was not intended to, and my sole intent was to answer the questions asked. @lukego raises some points that apply here regarding PR intents (probably a good idea to “standardize” the …). E.g. my perspective was “what is required for this to land in the next release,” but maybe that isn't the intent at all here. I have indeed not followed the ffi hash table discussion, so some context in the PR description would help me get into the code. Where is that discussion, by the way? Is #718 a follow-up?
@andywingo Thanks for the explanation. That sounds terrific and should eliminate a big part of the problem. There was a branchy loop in the bridge code as well (determining whether to unicast or flood a packet). I might come back to you for advice on how to eliminate that one, too. And, yes, 50 ns/lookup is nothing I would complain about :) My performance goal for the entire VPN app is fairly moderate, something like 6-8 Mpps. But getting faster is always nice.
@andywingo @alexandergall you guys might want to talk cycles, because I think you're testing with very different CPU clock speeds.
@lukego The clock on my system is 3.5 GHz.
@wingo Thank you for doing this work and sending it upstream! The "need a big table for FFI keys/values" problem has been with us since pretty much the beginning of Snabb Switch, way back when @plajjan did his DDoS protection prototype. Since then we have tried a lot of different things but never found a really general solution: until now, it would seem.

I think the code is extremely stylish, btw :-) both in this PR and in the asm code for searching the table. I would not have guessed it would be possible to have a sophisticated optimized hashtable in so few lines of simple code. For my part, I had experimented with bindings to a few off-the-shelf libraries ("Judy arrays in 1250 lines", khash, a patricia trie) but was not satisfied with any of those, and was gradually coming to the conclusion that what we really wanted was something almost exactly like what you have written here.

In the back of my mind I wonder if some day we could unify our hashtables with the builtin table support in LuaJIT -- without mixing up FFI data with the Lua heap -- but I have no idea if that is practical, and it seems like something for later, if at all.
Hi :) Picking up this thread. The reason I initially asked about the documentation format was that there was no obvious place to add documentation: the lib.foo modules are in general undocumented, with some exceptions, and the modules that are documented have their own subdirectories. There was no example. I will instead add a lib/README.ctable.md and include it in genbook.sh. Let me know if there is a better way, @eugeneia :)
@wingo Eh, please no |
@eugeneia Anything is possible :) However, it seems to me that it would be worse to require the user to write `local ctable = require('lib.ctable.ctable')` than `local ctable = require('lib.ctable')`. For that reason I think a subdirectory is not the right thing. Thoughts on naming are still welcome though :)
@wingo I agree that the repeating module names are not pretty, but plenty of other Snabb Switch modules already look like that (for the same reason). It's just a cosmetic issue that does not impede functionality, so I tend not to consider it an argument against the subdirectory.
I feel strongly enough about the unnecessary module hierarchy that I will resubmit a patch without the additional subdir. Though it's a factor to consider, to me the documentation build system should not dictate the module hierarchy. Feel free to NAK if you also feel strongly about this.
* src/Makefile (MDSRC): Rename from RMSRC, and search for any .md.src file. Update users. (MDOBJS): Rename from RMOBJS.
… functions * src/lib/ctable.lua (make_equal_fn): Add specialized implementations for 2-byte and 6-byte keys as well. (hash_32): Rename from hash_i32, as it works on any 32-bit number. (hashv_32, hashv_48, hashv_64): New functions. (selftest): Update to test the new equal functions.
* src/doc/genbook.sh: Add section for specialized data structures. * src/lib/README.ctable.md: New file.
The documentation build system does not dictate the module hierarchy; rather, the general build system does. E.g. in order to have a module that consists of N files, and a directory that contains all N files, the module hierarchy has to reflect this. Imho this issue is not severe enough to justify changing the build system. I personally think there is value in having documentation (in the form of a README.md) and code in a dedicated directory (GitHub plays nicely with this, as it will display the README.md when browsing a directory). See /SnabbCo/snabbswitch/tree/master/src/lib/hardware for an example where this works nicely (in both ways, e.g. That being said, I am not against putting
So, probably the documentation needs to move; but its final location depends on how it's integrated into the document, so please see the genbook change, @eugeneia. These commits went ahead and added README.ctable.md even after you expressed some disagreement there. I was going to change it, but then, looking at how to integrate the docs into the main book, I hesitated and left them where they are for now. I didn't add lib/README.md because I didn't know how to section the documentation for this module relative to the other modules that are in lib/ but already in genbook.sh. Having comprehensible documentation of lib/ means, to me, that perhaps we need not have a src/lib/README.md. I anticipate adding a few more of these specialized data structures in other commits, so let's set the pattern correctly here. @eugeneia for me this PR is ready to go, please tell me the changes I must make. Thanks! :)
pointer to the value in the table, or copy the value into a
user-supplied buffer, depending on what is most convenient for the user.

As an implementation detail, the table is stored as an open-addressed
Separate implementation details (these should be comments in the source) from user documentation.
We can certainly remove implementation details from here, but the performance characteristics of this table are user-visible features and it's hard to explain them without describing how the table works. WDYT?
I am sorry you feel that way. Why would you think I am frustrated? I was simply replying to the different issues that were touched on. I think you missed my point: yes, the different discussions all do relate to this PR, but they do not have to be in a single thread, and we can use GitHub's cross-referencing to make them more easily digestible.
Generally this is how I prefer it too, but in the case of proofreading documentation it tends to involve creating a LOT of commits. Maybe comments on source are cheap and I shouldn't be afraid of the overhead.
* `hash_fn`: A function that takes a key and returns a hash value.

Hash values are unsigned 32-bit integers in the range `[0,
0xFFFFFFFF)`. That is to say, `0xFFFFFFFF` is the only unsigned 32-bit
This isn't a typo: [] indicate inclusive bounds, () indicate exclusive bounds. I think it's explained in the next sentence.
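To make the half-open range concrete, here is a tiny plain-Lua sketch (a hypothetical helper, not part of ctable; the "reserved sentinel" reading is an assumption about why the upper bound is excluded):

```lua
-- Hypothetical helper illustrating the half-open range [0, 0xFFFFFFFF):
-- the upper bound itself is excluded, so 0xFFFFFFFF is never a valid
-- hash value (e.g. it could serve as a sentinel for an empty slot).
local function valid_hash(h)
   return h >= 0 and h < 0xFFFFFFFF
end

print(valid_hash(0))            -- true: lower bound is included
print(valid_hash(0xFFFFFFFE))   -- true: largest valid hash
print(valid_hash(0xFFFFFFFF))   -- false: upper bound is excluded
```

So `[0, 0xFFFFFFFF)` contains 2^32 - 1 values, one fewer than the full unsigned 32-bit range.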
@eugeneia I see it as a high priority to bring programs onto the master branch as much as possible. This is something that would be done voluntarily by people developing those programs to make their own lives easier. It is our job as upstream to "sell" this concept so that people actually decide to do it.

The reason this is important to me is that I think it is very powerful to have a common code base. This makes it possible to make "global" changes to the software, like the "straightline" redesign or rewriting the standard hashtable or ... lots of interesting things to keep the software moving forward. I see lots of this kind of work on the horizon, e.g. making programs work with any I/O interface instead of depending on an 82599 NIC, making programs parallelize in consistent ways, and so on. I would much prefer to tackle these problems with all of the relevant code "on the table" rather than tucked away in branches and without upstream test coverage.

Now, if this concept is going to succeed, people will need to actually want to host their software on the master branch. This will mean staying in control of their code and not having to worry about undue interference from gatekeepers (i.e. us) between themselves and their users. This will require establishing definite expectations about who is responsible for what.

I have discussed this privately with @alexandergall in the past in the context of his VPWS/ALX application. The basic model that we came up with was for applications to have their own private directory trees ( The expectation is that the code outside those directory trees would be shared between all the programs, and we would maintain that together in accordance with the evolving community standards.
@eugeneia really, we are running the major risk of blowing that whole scheme here. Andy is trying to submit some valuable code, and I am holding it up by talking about elaborate Git workflows, and you are holding it up by cataloguing mistakes in the README. If the downstream branches see engaging with upstream as having net-negative value, then people will not bother sending us code and we will become irrelevant. I think that we need to re-style ourselves as humble servants rather than gatekeepers :-)
descriptions, the object on the left-hand-side of the method invocation
should be a ctable.

— Method **:resize** *size*
Prepend method definitions with the class name, e.g. `ctable:new` instead of `:new`.
So, `new` is a function, but there are many methods like `resize`. I was a bit worried though that there would be confusion between the module name and the class name, and for that reason left off the LHS on these definitions. In fact, partly for that reason we don't mention the name "class" in these docs, as it doesn't really exist. What's the right solution here? Is this acceptable as I've described it above?
Alex has set the precedent for OO style in Snabb, and I have accepted/adopted his conventions because I felt they were good. In these terms the module `ctable` is a class, which would have a class method `new` to instantiate `ctable` objects. You can find examples of this in `lib.protocol` and its documentation. (Generally this is how I do it and what I recommend: look for prior examples in existing code and try to avoid breaking conventions where possible.)
I do the same, but in this particular case I have been really weirded out by the OO conventions of the protocol libraries; I find them distinctly non-snabby. Compare to `core.packet`, for example. Sometimes I avoid using the protocol libraries just because I find them too weird!
To add to that, I just noticed the hash functions of the `ctable` module; from an OO perspective these would probably be class methods as well? There is a precedent in `lib.protocol`, where header classes have class and instance methods as well, but the distinction is not made in the documentation. I guess this is a FIXME; I am somewhat out of answers.
Yes, they are definitely distinct from `core.*` (but I don't think that is necessarily an issue); imho the question is whether the style works well for the API. But we are digressing again. I say do as you see fit for now.
Edit: but please add the prefixes, e.g. “Function ctable.new ... returns a ctable.” and “Method ctable:foo ...”
I don't think the `hash_32` functions are class methods, as they don't expect to be operating on a class.
At least in a different language (like Python), `new()` would be a class method, returning an instance of the ctable class. But the hash functions don't return an instance of a class, just a value, so they are just functions that happen to reside in this module.
Just one more note:

> Edit: but please add the prefixes, e.g. “Function ctable.new ... returns a ctable.” and “Method ctable:foo ...”

In “Function `ctable.new`”, the left-hand side `ctable` is the module:

```lua
local ctable = require('lib.ctable')
```

But in “Method ctable:foo ...”, the left-hand side is just a ctable instance. This method is not available otherwise! For example, you can't get to this function except as a property of a ctable; notably, it is not an export of the `ctable` module. Are you sure you want to document it in these two ways?
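To illustrate the distinction under discussion — a module-level constructor versus instance-only methods — here is a toy sketch in plain Lua (a hypothetical mini-module, not the real ctable code):

```lua
-- Toy illustration of the function-vs-method split: `new` is reached
-- through the module table, while `add` is reached only through an
-- instance created by `new`.
local ctable = {}
ctable.__index = ctable

function ctable.new()              -- module-level constructor: ctable.new()
   return setmetatable({ n = 0 }, ctable)
end

function ctable:add(k, v)          -- instance method: t:add(k, v)
   self[k] = v
   self.n = self.n + 1
end

local t = ctable.new()             -- `ctable` here is the module
t:add('a', 1)                      -- `t` here is an instance
print(t.n)    -- 1
```

In this sketch `ctable.new` is an export of the module, while `add` only exists as a property reachable from an instance, which is the asymmetry the comment above is pointing at.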
This ctable change is looking extremely polished to me. I am looking forward to getting it on a feed to |
This reverts commit 9666348.
Thanks to Max Rottenkolber for review.
Okeysmokes, I have addressed the review comments, I think. Please take a look @eugeneia when you can. Thanks! The open question that remains is which feed to |
@wingo You have already done so much. Somebody should update the conntrack module but I think we can take this asynchronously and it need not be you if you don't feel inspired to do so. I'm assuming that this change is targeted to |
Heh, the code isn't all that much, and having a use in-tree would help. I'll handle the conntrack thing in a followup. Probably some other things will come first though. |
What is the status here? |
Merged into |
### Ctable (lib.ctable)

A *ctable* is a hash table whose keys and values are instances of FFI
data types. In Lua parlance, an FFI value is a "cdata" value, hence the
Since there are some FIXMEs in the code, I assume this will be iterated upon. I will leave some notes on things I thought about while going through this PR:
Most of the introduction (from “In Lua parlance...” to line 35) is implementation detail. I would avoid talking about it in the documentation. The reader most likely only cares about how to use the table, not how it works (the how can live in source code comments). This case is especially questionable because implementation details are mixed with usage information, so you have to read the whole lot even if you are only looking for one or the other.
I quite like the introduction, including both use and how, but maybe I'm more interested in the details of the how than the average user. I'm not against rearranging the text but I don't want to remove anything at least :)
My concern is that non-normative sections in the documentation can lead to others relying on specific behavior that is not actually part of the API. But I might be wrong; feedback in #753 would be appreciated.
Glad to have this change land on `next`. Note that there are infinitely many opportunities for everybody to further improve this code and documentation with follow-on PRs.
Revert "Remove end-addr in psid-map"
Add a standard hash table for use when dealing with FFI keys and values. See #718. For the road to this implementation, see Igalia#97, where this table was called "PodHashMap".