Blocks are only sanity checked when debug mode is active #1152
I tend to agree. We moved this to speed things up, making the point that if we can't trust our own local repo we're in trouble already. But I think I'm somewhere in the middle -- I definitely know cases where the tradeoff of having a node operate as fast as possible by making assumptions is fine... Then again, I do think that:
should never return data that does not hash to […]
Now that I have really dug into what is happening, I agree it is less of an issue than I have made it out to be. (Please correct me if I have the following wrong.)

- Bitswap advertises a desire for a hash […]
- There is no point in trying to detect/defend against changes at the hard-disk level, because this is not part of IPFS's adversary model (which I have not seen spelled out, and I might be able to help with that).
- If the performance hit is tolerable, make it permanent (or really, like most performance/security tradeoffs, let the user decide via a config file).
That's incorrect (or should be). All blocks MUST be checked before saving.

Right, this is the case (though this contradicts the statement above).

No -- actually, some nodes will have to do this, but not all. IPFS is not a domain-specific protocol; it's a very general thing where different use cases demand sitting at different points of the security x perf tradeoff.

Right.
So here is the code that does the "sanity check" for bitswap. Essentially it takes in the given blocks, hashes them, checks whether those hashes are on the wantlist, and passes them on. It seems to discard the unwanted blocks (is this desired behavior? Do we want people to be able to broadcast arbitrary blocks to proliferate them?).

I'm interested in helping with building a real adversary model, because it actively overlaps with my research (we can handle "variable" levels of required security by breaking it down by use cases). It would help in communicating "what benefit to security does IPFS offer to me?" in explicit terms.

Wrapping up: a fix to this issue would include: […]
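The wantlist-filtering flow described above can be sketched roughly as follows. This is an illustrative Go snippet only: `filterWanted`, `hashOf`, and the plain hex-string keys are made-up names, not the actual go-ipfs bitswap code, which operates on multihashes and its own block types.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// block is a stand-in for a bitswap block: raw data plus the hash the
// sender claims it corresponds to.
type block struct {
	data      []byte
	claimedID string
}

// hashOf computes the identifier blocks are indexed by (hypothetical:
// hex-encoded SHA-256 instead of a real multihash).
func hashOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// filterWanted re-hashes every received block and keeps only those whose
// actual hash is on the wantlist; unwanted or mislabeled blocks are dropped.
func filterWanted(received []block, wantlist map[string]bool) []block {
	var kept []block
	for _, b := range received {
		id := hashOf(b.data) // trust the data, not the claimed ID
		if wantlist[id] {
			kept = append(kept, b)
		}
	}
	return kept
}

func main() {
	want := map[string]bool{hashOf([]byte("wanted data")): true}
	in := []block{
		{data: []byte("wanted data"), claimedID: "whatever"},
		{data: []byte("unsolicited data"), claimedID: "spoofed"},
	}
	fmt.Println("kept:", len(filterWanted(in, want))) // kept: 1
}
```

Because the block is keyed by the hash of its actual bytes, a peer cannot proliferate arbitrary data under a wanted hash; it can only be dropped.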
@BrendanBenshoof throwing away blocks that we don't want is desired behaviour. You should only have blocks on your system if they are explicitly requested. Also, I don't think that we should validate blocks upon reading from the disk. If an adversary has access to your disk, you have much bigger problems than whether or not that particular block is correct.
Yeah, though the model should be made in relation to where we're headed, not where we are today (of course we need to fix the impl/model divergence).

Yep.

Not in all threat models. We want some nodes to be able to validate blocks they pull in from their repos. Not all repos are pulled from local disk, and not all local disks are the same. Process and disk are very different security arenas, and one may be at much higher risk than the other, particularly with networked filesystems.
Blocks received from the network are always checked. Blocks read from disk are not checked by default, but if you set […]
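What "checking on read" amounts to can be sketched as a thin wrapper around a blockstore. The interfaces and names below (`store`, `verifyingStore`, `keyFor`) are hypothetical and not the go-ipfs blockstore API: the point is only that every block read back from disk is re-hashed and rejected if it no longer matches the key it was requested under.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// store is a hypothetical key-value blockstore interface.
type store interface {
	Get(key string) ([]byte, error)
}

type memStore map[string][]byte

func (m memStore) Get(key string) ([]byte, error) {
	data, ok := m[key]
	if !ok {
		return nil, errors.New("not found")
	}
	return data, nil
}

// keyFor derives the content key from the data (hex SHA-256 here,
// standing in for a real multihash/CID).
func keyFor(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// verifyingStore re-hashes every block read from the underlying store
// and rejects blocks that no longer hash to their key (hash-on-read).
type verifyingStore struct {
	inner store
}

func (v verifyingStore) Get(key string) ([]byte, error) {
	data, err := v.inner.Get(key)
	if err != nil {
		return nil, err
	}
	if keyFor(data) != key {
		return nil, errors.New("block data does not hash to its key")
	}
	return data, nil
}

func main() {
	good := []byte("hello")
	m := memStore{keyFor(good): good}
	s := verifyingStore{inner: m}

	if _, err := s.Get(keyFor(good)); err == nil {
		fmt.Println("intact block read OK")
	}

	// Simulate on-disk corruption: the stored bytes change but the key does not.
	m[keyFor(good)] = []byte("tampered")
	if _, err := s.Get(keyFor(good)); err != nil {
		fmt.Println("corrupted block rejected:", err)
	}
}
```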
@whyrusleeping is that captured in documentation anywhere? It should also be captured in some security notes. Poisoning or corrupting repos could be an attack vector.
cc @RichardLitt as he handled various documentation things for go-ipfs
Let's reopen this and assign it to me as a thing to add to the security docs.
I disagree with rehashing on read being on by default; there is a big con that in my opinion outweighs the possible benefits. The possible pros are disk-error detection and attack prevention, but in both of those cases IPFS isn't the primary risk factor. When a disk is dying so badly that it returns bad blocks, the user will notice from other instabilities; they certainly won't be reading IPFS's console logs, and will have noticed the performance loss and changed the disk long before. As for using IPFS as an attack vector: the user is already compromised in so many other ways once malware or a third party has access to their file system. For contrast: git by default doesn't even hash objects while fetching from the network.
I agree with @Kubuxu on this one.
Reasoning A: […]

Reasoning B: […]
Well put, @BrendanBenshoof. I'm on Reasoning B all the way through. If the user wants to decrease security to increase performance, that's their prerogative and we can give them a knob to turn. But we should not put the user in a weaker spot from the beginning.

The file system is definitely an attack vector, and it cannot be trusted as much as main memory or the CPU registers. As @BrendanBenshoof points out, the filesystem may be coming from the network as well, or may not be under our full control. This will only become more the case as we add support for independent auth agents (so your keys can be stored more securely), or for "storing" blocks in other parts of the filesystem (the source files themselves, as @kevina is writing).

We have guides and readmes, and config docs. Between: […]
I much prefer (b).
I am unlikely to get to this soon. I would suggest unassigning this and labeling it as 'help wanted'. Note: posting this may automatically remove my assignment.
Haha, thanks @RichardLitt
I need to investigate this. This bug is two years old and I think some things might have changed.
It is still the same, […]
This is currently the case. It should be flipped to true, but in the meantime let's document the current situation. See: #1152
We should evaluate the performance hit, and if it's not crushing, enable […]
Hashing tends to be pretty expensive, but we'd have to do some profiling to be sure. Ideally, we'd be at a level of performance where re-hashing would kill performance, but I'm pretty sure we aren't there yet.

Note: corrupting a local datastore isn't really a reasonable attack vector. The local disk should be more trusted than the network because we already trust that the go-ipfs binary isn't corrupted, the user's bashrc doesn't start any malware, etc. On the other hand, corrupting S3, etc., could be an issue. I wonder if we should flag different datastores as trusted/untrusted? I'm worried this adds too much complexity.
I evaluated the performance hit a long time ago. It was high enough that enabling it by default wasn't really an option, but it is a good idea to reevaluate.
IMO, we should re-evaluate and, if it does have a performance impact, figure out why and fix the performance issue.
The performance impact was just the hashing itself. Native […] If someone reads 100 MiB/s, then 33% of a CPU core will be spent on hashing.
This is ancient; modern releases of Go can hash at ~2 GiB/s per core on all modern amd64 cores (thanks to the new SHA-NI accelerated crypto). We also used a third-party SHA-256 implementation for a while which did this already.
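Spelling out the arithmetic above: 33% of a core at 100 MiB/s of reads implies native hashing throughput of roughly 300 MiB/s per core (100 / 300 ≈ 0.33), whereas at ~2 GiB/s per core the same 100 MiB/s read rate costs only about 5% of a core (100 / 2048 ≈ 0.05). A rough way to re-check the number on current hardware is a single-core measurement like the sketch below (an illustrative snippet, not a go-ipfs benchmark; block size and totals are arbitrary assumptions).

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"time"
)

func main() {
	const chunk = 1 << 20 // hash 1 MiB at a time, roughly block-sized
	const total = 1 << 30 // hash 1 GiB in total
	buf := make([]byte, chunk)

	var sink byte
	start := time.Now()
	for hashed := 0; hashed < total; hashed += chunk {
		sum := sha256.Sum256(buf)
		sink ^= sum[0] // keep a result so the work cannot be optimized away
	}
	elapsed := time.Since(start)
	_ = sink

	mibPerSec := float64(total) / (1 << 20) / elapsed.Seconds()
	fmt.Printf("~%.0f MiB/s SHA-256 per core\n", mibPerSec)
	// At ~300 MiB/s, reading 100 MiB/s costs ~33% of a core; at ~2 GiB/s, ~5%.
	fmt.Printf("hashing 100 MiB/s would use ~%.0f%% of one core\n", 100*100/mibPerSec)
}
```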
https://github.com/ipfs/go-ipfs/blob/79360bbd32d8a0b9c5ab633b5a0461d9acd0f477/blocks/blocks.go#L27
This should be checked even if debug is not enabled. Otherwise malicious blocks and merkledag structures could be loaded.
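For context, the linked line gates the block's data-vs-hash check behind a global debug flag. Below is a minimal sketch of the pattern being criticized next to the always-checked variant this issue asks for. The names (`newBlockDebugOnly`, `debugMode`, plain SHA-256 instead of a multihash) are hypothetical; this is not the exact go-ipfs source.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// debugMode mirrors the kind of global flag the linked code checks.
var debugMode = false

type block struct {
	data []byte
	hash []byte
}

// newBlockDebugOnly reflects the behavior this issue criticizes: the
// data-vs-hash check only runs when debug mode is active.
func newBlockDebugOnly(data, hash []byte) (*block, error) {
	if debugMode {
		sum := sha256.Sum256(data)
		if !bytes.Equal(sum[:], hash) {
			return nil, errors.New("data did not match given hash")
		}
	}
	return &block{data: data, hash: hash}, nil
}

// newBlockChecked is the suggested behavior: always verify, regardless of
// debug mode, so a block can never be constructed from mismatched data.
func newBlockChecked(data, hash []byte) (*block, error) {
	sum := sha256.Sum256(data)
	if !bytes.Equal(sum[:], hash) {
		return nil, errors.New("data did not match given hash")
	}
	return &block{data: data, hash: hash}, nil
}

func main() {
	good := []byte("good data")
	goodHash := sha256.Sum256(good)

	// With debug off, the gated constructor accepts mismatched data.
	if _, err := newBlockDebugOnly([]byte("tampered"), goodHash[:]); err == nil {
		fmt.Println("debug-only check: mismatch accepted")
	}
	// The always-on check rejects it.
	if _, err := newBlockChecked([]byte("tampered"), goodHash[:]); err != nil {
		fmt.Println("always-on check:", err)
	}
}
```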