This is a simple implementation of the Merkle Tree with a configurable base (the number of children per node), included because it was easy to support. I do not have a formal mathematical proof that base 2 is the most efficient in terms of proof size for a given leaf capacity, but I'm fairly confident it is.
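A rough numerical check of that intuition, as a sketch rather than a proof: if a proof carries (base - 1) sibling hashes per level and the tree needs the smallest depth L with base**L >= capacity, the sibling count can be compared across bases. The function names here are hypothetical, and the count ignores the per-level index and the root.

```python
def levels_needed(capacity: int, base: int) -> int:
    """Smallest depth L such that base**L >= capacity."""
    levels, size = 0, 1
    while size < capacity:
        size *= base
        levels += 1
    return levels


def proof_hashes(capacity: int, base: int) -> int:
    """Sibling hashes in one inclusion proof: (base - 1) per level."""
    return (base - 1) * levels_needed(capacity, base)


# Sibling hashes needed for a capacity of 1024 leaves, per base.
sizes = {base: proof_hashes(1024, base) for base in (2, 3, 4, 8, 16)}
print(sizes)  # {2: 10, 3: 14, 4: 15, 8: 28, 16: 45}
```

Under this counting model base 2 is never worse than a larger base, since base <= 2**(base - 1) for every base >= 2, although ties are possible (for example, 9 leaves need 4 sibling hashes with either base 2 or base 3).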
By default, this uses 2 as its base and sha256 as its hashing function, but both can be manually specified if needed. The most practical way to use this module is to generate the leaves from a pre-computed set of messages, and then call the from_leaves class method to create an instance. I have not included a serializer.
Note that any leaf or node without a computed value will be 32 null bytes. Unlike a Merkle Mountain Range, where trees of variable depth are combined dynamically ("bagging peaks"), this implementation instead scales to any number of messages and fills in the gaps in the tree along incomplete paths with strings of null bytes. Thus, imperfect trees still guarantee the validity of inclusion proofs.
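The null-byte filling can be sketched roughly as follows. This is not the code in merkle.py: build_levels is a hypothetical helper, it uses hashlib instead of PyNaCl so it runs without extra dependencies, and it assumes each level is padded to a multiple of the base and that a parent is the hash of its concatenated children.

```python
import hashlib

NULL = bytes(32)  # 32 null bytes, the placeholder for empty slots


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves, base=2, hashfunc=sha256):
    """Return the tree as a list of levels, root level first."""
    level = list(leaves) or [NULL]
    # Pad the leaf level with null entries up to a multiple of base.
    level += [NULL] * (-len(level) % base)
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its base children, concatenated.
        parents = [hashfunc(b"".join(level[i:i + base]))
                   for i in range(0, len(level), base)]
        if len(parents) > 1:
            # Fill the gap in an incomplete level with null entries.
            parents += [NULL] * (-len(parents) % base)
        levels.append(parents)
        level = parents
    return levels[::-1]


tree = build_levels([sha256(m) for m in (b"a", b"b", b"c", b"d", b"e")])
print([len(level) for level in tree])  # [1, 2, 4, 6]
```

With five messages and base 2, the sixth leaf and the fourth node of the level above it are null placeholders, while every node that has at least one real child gets a computed hash.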
This uses sha256 from the PyNaCl library as the default hash function.
- Install the python3-nacl library if you wish to use the default configuration.
- Put merkle.py somewhere in the project files.
- from [path/to/merkle] import MerkleTree
See sample.py for some sample code.
MerkleTree inherits from list and is essentially a list of lists, with the root as the only element of the 0th list, the leaves as the contents of self[-1], and the intermediate nodes in the lists between.
- Constructors
  - __init__
  - from_leaves
  - from_messages
- Instance methods
  - calculate_tree
  - put
  - fill
  - print_hex
  - prove
- Static methods
  - print_hex_proof
  - verify
Parameters:
- base: the number of children per node
- levels: the depth of the tree
- hashfunc: a function used to compute hashes
Returns a MerkleTree with the given parameters and base**levels leaves. Each node and leaf will be 32 null bytes.
@classmethod from_leaves (leaves, base = 2, hashfunc = lambda data: sha256(data, encoder=RawEncoder))
Parameters:
- leaves: the hashes of all messages for the tree
- base: the number of children per node
- hashfunc: a function used to compute hashes
Returns a MerkleTree with the given leaves, base, and hashfunc. Note that the number of leaves will be a multiple of the base but not necessarily base**levels. Nodes that have children will have their values computed, and others will be 32 null bytes.
@classmethod from_messages (messages, base = 2, hashfunc = lambda data: sha256(data, encoder=RawEncoder))
Parameters:
- messages: a list of all messages (bytes) for the tree
- base: the number of children per node
- hashfunc: a function used to compute hashes
Computes the hashes of the messages and returns the result of from_leaves.
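The default hashfunc shown in the signatures takes bytes and returns the raw 32-byte digest. As an illustration of that contract without PyNaCl, the sketch below hashes a few messages into leaves with hashlib from the standard library. The name stdlib_sha256 is hypothetical, and whether the module accepts such a function unchanged is an assumption.

```python
import hashlib


def stdlib_sha256(data: bytes) -> bytes:
    """Bytes in, 32 raw digest bytes out, like the default lambda."""
    return hashlib.sha256(data).digest()


# Hash each message once; the resulting list is the kind of input
# that from_leaves is described as taking.
messages = [b"alpha", b"beta", b"gamma"]
leaves = [stdlib_sha256(message) for message in messages]
print([leaf.hex()[:8] for leaf in leaves])
```

Passing a function like this as hashfunc would, under that assumption, remove the need to install python3-nacl.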
Computes all the nodes from the leaves up to the root.
Parameters:
- leaf: the hash of a message that has been replaced
- index: the index of the leaf being replaced
Replaces the hash at the given index with the given leaf and then calls calculate_tree.
Parameter:
- leaves: a list of all message hashes for the tree
Replaces the leaves with the given leaves (as long as they are not in excess of base**levels), then calls calculate_tree.
Prints out the MerkleTree in a human-readable, hexadecimal format.
Parameter:
- message: a message in bytes included somewhere in the tree
Generates and returns an inclusion proof for a given message. The proof is a list of lists: each list except the last holds the index of the current child hash and all the other child hashes needed to compute the next node up. The final list contains only the root node. The first child hash is the hash of the message.
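One reading of that proof format, sketched as a standalone function rather than the module's own method: each step is [position, sibling, sibling, ...], where position is the child's place within its group of base children, and the last entry is [root]. The prove function below is hypothetical, takes the levels root first, and assumes every level below the root is already padded to a multiple of the base.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def prove(levels, message, base=2, hashfunc=sha256):
    """Build an inclusion proof from a root-first list of levels."""
    # Raises ValueError if the message hash is not among the leaves.
    index = levels[-1].index(hashfunc(message))
    proof = []
    # Walk from the leaves up to the level just below the root.
    for level in reversed(levels[1:]):
        position = index % base
        start = index - position  # first child of this group
        group = level[start:start + base]
        siblings = group[:position] + group[position + 1:]
        proof.append([position, *siblings])
        index //= base  # index of the parent in the next level up
    proof.append([levels[0][0]])
    return proof


# A hand-built, four-leaf, base-2 tree to exercise the sketch.
leaves = [sha256(m) for m in (b"a", b"b", b"c", b"d")]
mid = [sha256(leaves[0] + leaves[1]), sha256(leaves[2] + leaves[3])]
root = [sha256(mid[0] + mid[1])]
proof = prove([root, mid, leaves], b"c")
print([step[0] if len(step) > 1 else "root" for step in proof])
```

For the message b"c" this gives position 0 with the hash of b"d" as sibling, then position 1 with the left intermediate node as sibling, then the root.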
Prints out a proof in a human-readable, hexadecimal format.
Parameters:
- message: a message in bytes supposedly included in a tree
- proof: a list of lists, ostensibly the output of prove
- hashfunc: a function used to compute hashes
Verifies that the given inclusion proof for the given message is valid. Returns a boolean.
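Verification can be sketched as replaying the hashes from the message up to the root. The function below is a standalone illustration, not the module's static method. It assumes each proof step is [position, sibling, ...], that the final entry is [root], and that a parent is the hash of its children concatenated in order.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify(message, proof, hashfunc=sha256):
    """Recompute the path to the root and compare with the claimed root."""
    node = hashfunc(message)  # the first child hash is the message hash
    for position, *siblings in proof[:-1]:
        # Put the current node back at its position among its siblings.
        children = siblings[:position] + [node] + siblings[position:]
        node = hashfunc(b"".join(children))
    return node == proof[-1][0]


# A two-leaf tree built by hand, with a proof for the right-hand leaf.
left, right = sha256(b"a"), sha256(b"b")
root = sha256(left + right)
proof_for_b = [[1, left], [root]]
print(verify(b"b", proof_for_b))  # True
print(verify(b"x", proof_for_b))  # False
```

A valid proof only shows that the message is consistent with the root carried in the proof, so that root should be compared against one obtained from a trusted source.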
- Experimental refactor where leaves are the first list so the tree can dynamically grow.
- Implement a Merkle Mountain Range (MMR).
- Compare performance and efficiency of MerkleTree and MMR.
Copyright (c) 2021 Jonathan Voss
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.