1 Second TTL Expires (Nearly) Immediately #307
Sadly, since there's no high-resolution clock in use, that's a thing that'll happen. The smallest realistic TTL is 2, and even that will still be +/- 1 second. It's always been this way, though, and I'm not sure it should be changed. Should I document this better somewhere before closing the issue?
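To make the failure mode concrete, here is a minimal sketch (plain Python, not memcached's actual code) of an expiry check that only has whole-second resolution. The function name and epoch values are illustrative; the point is that an item set late in a wall-clock second loses almost that entire second of its TTL:

```python
import math

def is_expired(set_time_s, ttl, now_s):
    """Expiry check with whole-second resolution, mimicking a coarse
    clock: an item set with TTL t is considered dead as soon as the
    integer clock reads (second-it-was-set-in + t) or later."""
    return math.floor(now_s) >= math.floor(set_time_s) + ttl

# Item stored at xx:00.999999 with a 1-second TTL:
set_time = 75600.999999                             # 21:00:00.999999 as seconds
assert not is_expired(set_time, 1, 75600.999999)    # alive at set time
assert is_expired(set_time, 1, 75601.000001)        # dead 2 microseconds later

# With TTL=2 (the "smallest realistic TTL"), the same item survives
# past the next second boundary, living between 1.000001s and 2s:
assert not is_expired(set_time, 2, 75601.999999)
assert is_expired(set_time, 2, 75602.000000)
```

The same TTL value therefore yields anywhere from just over (ttl - 1) seconds to just under ttl seconds of real lifetime, depending on where in the second the set lands.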
I think it could be helpful to others in the future if it was clearly stated somewhere. We've been getting burned by this for over a month, and it took us quite a while to narrow it down to this issue. We use memcached as the locking mechanism in a mutex class in PHP (using the ADD command); in some cases we want the locks to have a very short TTL, but we weren't aware that we'd be losing up to a full second from it. ;)
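The mutex-via-ADD pattern described above is exactly where the coarse clock bites. Below is a toy Python model of that pattern: the class, key names, and timestamps are hypothetical, and a plain dict with whole-second expiry stands in for the memcached server, but it reproduces how a "1 second" lock can be re-acquired milliseconds later:

```python
class CacheMutex:
    """Toy lock built on an add-style primitive with memcached-like
    whole-second TTL resolution. A dict stands in for the server."""

    def __init__(self):
        self._expiry = {}  # key -> integer clock second at which the key dies

    def acquire(self, key, ttl, now_s):
        # add() semantics: fail if the key is still live, else take it.
        exp = self._expiry.get(key)
        if exp is not None and int(now_s) < exp:
            return False
        self._expiry[key] = int(now_s) + ttl
        return True

m = CacheMutex()
assert m.acquire("job", ttl=1, now_s=100.999)   # lock taken late in a second
# Barely 2 ms later the "1 second" lock has already expired,
# so a second worker can grab it too:
assert m.acquire("job", ttl=1, now_s=101.001)
```

With `ttl=2` the second `acquire` would fail as expected, which is why a minimum TTL of 2 is the practical workaround for lock-style usage.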
It could be worth replacing the comparison that's done, from `<=` to `<`. In my opinion, it's better to have keys live a second too long rather than a second too short. As long as there are no evictions, TTLs should behave like contracts in that manner. If it stays how it is, then I think it should be documented, so that other people with real-time applications don't run into the same issue.
Maybe a one-second TTL should be refused?
Sadly, someone somewhere relies on the existing behavior. It's kind of a nightmare to change, and I've already had a lot of bloat from start-time twiddles. I can document it in doc/protocol.txt, and maybe parts of the wiki, but it's hard to guarantee people will even see that. :)
It might save someone else a bit of trouble. It took us a pretty long time to narrow it down, since it wasn't really mentioned anywhere online (that we could find). We didn't think this could be the source of our problem; we figured that if it were, someone else would have run into it already, given the maturity of memcached.
I'm sure some folks have, but few reach out, so thanks. :) I'll make sure it's added to any documentation discussing TTLs before closing the issue here.
Nitrix's suggestion to replace `<=` with `<` is hands down the most logical. It cannot affect anyone's code (per dormando's "existing behavior"), as timing accuracy is already +/- 1 second. Thus `<` cannot possibly change any code, except to guarantee a 1-second TTL (a range of 1.000001 to 1.999999 s), which is still within a +/- 1 s timeframe (1 s + 1 s = 2 s), plus 1 microsecond in the worst case. The Neoform team ought not to have experienced that downtime (red faces). May I ask for links to the "bloat" start-time twiddle issues?

Talks with Mellanox prove one thing: offloading compute, or bypassing the kernel, on an in-memory store is the only way to gain bandwidth. Why bog down distributed access with an old protocol? If I may quote: "InfiniBand and RoCE employ lossless link-level flow control, namely, credit-based flow control and Priority Flow Control. Even with unreliable transports (UC/UD), packets are never lost due to buffer overflows." If I store all my data in DRAM, then I have a 100% memcache hit rate; the only worry is BER.
It affects code if they set items with a 1s TTL and expect them to be missing some of the time, i.e. a probabilistic rate limit or lock. Though that's a guess; the truth is that anything exposed by an API to the public ends up being used by someone.

I'm not clear on what the rest of your comment is asking about. I've been working with Mellanox cards for over a decade, and the only thing consistent about them is that their production performance is typically 25-50% of what marketing promises. They make the bulk of their sales by undercutting Intel on price and aggressively finding customers until someone buys without doing careful tests first (or they don't actually need the performance but enjoy being promised it). I'd take anything they tell you with a grain of salt.
Dormando, the 1s TTL is a special case: it can time out in 1 µs, or 2.621 ms, and thus not yield any usable timing. At 2s it returns a valid delay of 1.000001 through 1.999999 s, so the spec is +0/-0.999999 s, not +/- 1 s.

Of course, an individual may opt to change `<=` to `<` in his own code. This cannot affect incoming GETs? It can only block (drop) for a 1s TTL setting, in that probabilistic "error" you mention: 1 µs, or roughly 2.6-3 ms, and then >5 ms may work (code-function dependent).

The rest of my comments are exactly what they appear to be: offloading compute, or kernel bypass, is today's only way to gain bandwidth. I may trim 12 bytes out of a memcache command, but not gain any practical speed-up for the trouble. 50% performance for 40Gb IB ConnectX cards is fantastic for lossless delivery over a twinax (Cu) passive-cable cluster interconnect. As you say, most don't need such performance, only the pleasure of the advertised promise. I know exactly what you are saying. :) What I really want:

Let's set up an isolated 8-node InfiniBand cluster at my associate's NY datacenter (already in a rack), with a 2-server HTTP front end running PF_RING ZC as the packet filter and socket-channel management for >100,000 client connections. I believe memcache is perfect for fetching 8-packet responses from a client index-header inquiry. IB is the perfect isolator between the public connection and the backend cluster's in-memory pinned database that memcache supervises on a per-server instance. It will be interesting how you intercept the IB headers as they come through managed subnets.

What makes this interesting is the wire-speed compute engine in the IB fabric (proprietary HW); let's work on buying the IP from Mellanox for a few hundred million. RAM storage and parallel distributed processing are two features of in-memory computing. An effective distribution of data processing, which is an integral part of in-memory computing, remains elusive. The way I see this: I supply all the hardware, so what is there to lose?
Can this run on Argo node OS, which specializes a single kernel into multiple aspects? I only store data tables. Regarding "...you'll still want to have the cache expire occasionally. In case your app has a bug, a crash, a network blip, or some other issue where the cache could become out of sync": the only blip comes from memcache itself, as in a memory leak; I suppose that is likely with a large client count? What is its codebase? Can it run Lua, and what safeguards are needed for ASM? Any boolean ops in the future? Then there's ESI; the only dynamics I need are for metadata (as it changes), so is ESI a memcache feature?

"Clients are not able to store all types of complex structures. You must turn the data into a pure array, or hash/table type structure before being able to store or retrieve it." Externally? How is the array sorted? Please post links to elucidate multi-dimensional storage/GET; link an entire book on the subject. "Since item flags are used when storing the item, the same client is able to know whether or not to deserialize a value before returning it on a 'get'. You don't have to do anything special." More, please. I don't wish this to occur, as is common with wrapper-offload nightmares (solved by removing handlers).

Meanwhile, I agree that aggressive marketing needs to be put in a dumpster. Mellanox is hypocritical in still selling ConnectX-2 VPI cards while dropping all support for them from Win OFED 4.9 (not that I wish to run on Windows), when ConnectX-2 VPI cards are perfectly suited for IB, not SR-IOV (who needs to virtualize a PCIe bottleneck, or server sync timestamps, which are only for forex traders who can afford Mlx-6?). https://community.mellanox.com/message/4473#comment-4473 Meanwhile, RoCE still has its problems; who says RoCE is the only RDMA possible?
That's outside the scope of this project, and this issue. I'll fix the documentation when I get a chance. |
When you set the TTL for an item while the clock is nearing the end of that second (e.g. 21:00:00.999999), the item expires 0.000001 seconds later instead of 1 second later.