The Element Accumulator is the centerpiece of Sia's State. It is a cryptographic data structure—specifically, a Merkle Binary Numeral Tree—whose leaves are "elements," primarily siacoin outputs. (Other element types are discussed below.)
In most blockchains, Outputs are stored in a database. When an output is created, it is added to the database; when a transaction spends the output, it is deleted. This approach is simple, but the database—and thus the State—can grow arbitrarily large, placing greater resource demands on nodes, and hampering syncing.
The accumulator, by contrast, consists of just a handful of Merkle tree root hashes. When a Transaction spends an output, it includes not just the output's ID (as in other blockchains), but rather the entire output object, as well as a Merkle proof asserting the output's presence in the accumulator. This same Merkle proof can then be used to compute a new accumulator in which the output is marked as spent. Thus, the accumulator distributes the "cost" of the database across the network: nodes only need to store Merkle proofs for the outputs they personally care about.
This approach to UTXO management was pioneered by Tadge Dryja in his Utreexo proposal for Bitcoin. Sia's accumulator differs in a few ways (most notably, it never deletes elements), but the core idea is largely unchanged.
Element Types
As a rule of thumb, any piece of state that has a dynamic size is "stored" in the accumulator. This means that the State object itself remains constant-size, which has numerous benefits. The Sia accumulator stores the following element types:
- Siacoin elements are the most common element type. In addition to their value and address, the maturity height is also hashed.
- Siafund elements include the total tax revenue accumulated as of the element's creation. This is used to calculate the dividend when the output is spent.
- File Contract elements may be updated by revisions multiple times before they are "spent" (that is, resolved).
- Attestation elements are included so that they may be trustlessly downloaded by light clients.
- Chain Index elements consist of a height and a Block ID. They are referenced by storage proofs, which make use of the entropy present in the ID of an ancestor block.
Each element is concatenated with its leaf index and a "spent" flag when computing its leaf hash in the Merkle tree. The inclusion of the leaf index provides some defense against collision attacks, as the attacker must commit to colliding with a specific element in the tree instead of any element. In theory, this permits truncating the digest size by a factor of the accumulator size; but out of an abundance of caution, this optimization is not currently implemented.
Proof Basis
A Merkle proof is only valid against a particular accumulator. That accumulator—or rather, the index of that accumulator's State—is the proof's basis. This basis can be outdated: after a new block is applied, the wallet still needs to update the Merkle proofs of its outputs to accord with the new state. (This is handled automatically by all Sia node software.) This also impacts protocol design: since peers may be out of sync with respect to the current State, it is crucial to specify the basis of any Merkle proofs exchanged (e.g. when collaboratively constructing a transaction), so that they can be updated if necessary.