Ralph Merkle had the solve the following problem in his phd thesis from 1979:
Given a vector of data items = Y1,. Y2, ... Yn design an algorithm which
can quickly authenticate a randomly chosen Y but which has modest memory
requirements, i.e., does not have a table of Y1, Y2, ... Yn.
(see page 47)
Notes:
authenticate
-> proof that item is in vector- We do not want to have a copy of the elements
A Merkle Tree is a data structure for digitally signing datasets with desire for fast verification of data consistency.
Want to have:
- Central party that manages a public, append-only log of data
- Anyone is able to verify the log (no data removed, no data changed)
Merkle Trees offers 3 properties to achieve this:
-
For any specific record
R
in a log of lengthN
, we can construct a proof of lengthO(log2 N)
allowing the client to verify thatR
is in the log. -
For any earlier log observed and remembered by the client, we can construct a proof of length
O(log2 N)
allowing the client to verify that the earlier log is a prefix of the current log. -
An auditor can efficiently iterate over the records in the log.
(For more info, see Transparent Logs for Skeptical Clients from Russ Cox)
A Merkle Tree is a tree in which every leaf [-> node without child nodes] is labelled with the cryptographic hash [-> no collision] of a data block, and every node that is not a leaf is labelled with the cryptographic hash of the labels of its child nodes. (see Wikipedia)