Dear Smeshers,
Last week we released go-spacemesh 1.5.2-hotfix1, which included a fix for a serious implementation bug. We’re aware that there’s been some frustration in the community with the way this fix was communicated and released, and with the fact that it caused several identities to be invalidated. The purpose of this post is to explain what happened and why, and to explain how we can do better next time.
Background
Our team became aware of a flaw in the go-spacemesh implementation of the Spacemesh protocol in late February. The issue involves miners publishing an ATX (activation transaction) that refers not to their most recent ATX but instead to an older ATX. As explained in the CVE disclosure, the Spacemesh protocol dictates that a miner must publish ATXs serially, i.e., that each ATX must refer to the previous ATX so that each miner’s ATXs form a single chain back to the miner’s first ATX. This was described in the Spacemesh whitepaper as early as May 2018, which clearly and unambiguously states that an ATX must include a reference to a miner’s immediately previous ATX, and that two ATXs from the same miner may not have the same sequence number.
The reference go-spacemesh implementation faithfully implemented this protocol as described, as you can see in this code snippet. Before publishing a new ATX, the node always ensured that it was fully in sync and had the latest ATX information, so that it would never publish a malfeasant ATX, e.g., one with an invalid sequence number. Anyone running the reference implementation would not have run afoul of this protocol rule and produced a malfeasant ATX without circumventing these guardrails, for example by running two different nodes with the same miner identity or by manually modifying the node state.
Moreover, prior to this hotfix, the go-spacemesh node implementation would correctly identify malfeasant ATXs that violated this rule and would print an error to this effect. It’s important to point out that all of this is unambiguous and clear in the implementation: please see this code snippet.
Unfortunately, however, it did not produce and gossip a malfeasance proof to this effect, so the malfeasant identities weren’t invalidated when they should’ve been. It also failed to check that sequence numbers were unique.
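To make the rule concrete, here is a minimal sketch in Go of the two checks described above: that no two ATXs from the same miner refer to the same previous ATX, and that no two share a sequence number. This is not the go-spacemesh code. The `ATX` and `Violation` types, their field names, and the `FindViolations` function are all hypothetical and exist only for illustration. A real malfeasance proof would also need to carry the signed ATX data and be gossiped to the network, which this sketch leaves out.

```go
package main

import "fmt"

// ATX is a simplified, hypothetical stand-in for an activation transaction.
// The field names are illustrative and do not mirror go-spacemesh types.
type ATX struct {
	ID       string // identifier of this ATX
	MinerID  string // identity that published it
	PrevID   string // ID of the miner's previous ATX ("" for the first ATX)
	Sequence uint64 // position in the miner's chain
}

// Violation records two ATXs from one miner that cannot both belong
// to a single serial chain.
type Violation struct {
	MinerID string
	First   string // ID of the ATX seen first
	Second  string // ID of the conflicting ATX
	Reason  string
}

// Hypothetical map keys: one per (miner, previous ATX) and one per
// (miner, sequence number).
type prevKey struct{ miner, prev string }
type seqKey struct {
	miner string
	seq   uint64
}

// FindViolations scans ATXs in the order given and reports every ATX that
// reuses a previous-ATX reference or a sequence number that the same miner
// has already used.
func FindViolations(atxs []ATX) []Violation {
	byPrev := make(map[prevKey]string)
	bySeq := make(map[seqKey]string)
	var out []Violation
	for _, a := range atxs {
		pk := prevKey{a.MinerID, a.PrevID}
		if first, seen := byPrev[pk]; seen {
			out = append(out, Violation{
				MinerID: a.MinerID, First: first, Second: a.ID,
				Reason: "two ATXs refer to the same previous ATX",
			})
		} else {
			byPrev[pk] = a.ID
		}
		sk := seqKey{a.MinerID, a.Sequence}
		if first, seen := bySeq[sk]; seen {
			out = append(out, Violation{
				MinerID: a.MinerID, First: first, Second: a.ID,
				Reason: "two ATXs share a sequence number",
			})
		} else {
			bySeq[sk] = a.ID
		}
	}
	return out
}

func main() {
	atxs := []ATX{
		{ID: "a1", MinerID: "A", PrevID: "", Sequence: 0},
		{ID: "a2", MinerID: "A", PrevID: "a1", Sequence: 1},
		// a3 refers to the older a1 instead of the most recent a2.
		{ID: "a3", MinerID: "A", PrevID: "a1", Sequence: 2},
	}
	for _, v := range FindViolations(atxs) {
		fmt.Printf("miner %s: %s and %s: %s\n", v.MinerID, v.First, v.Second, v.Reason)
	}
}
```

In this sketch, detection only needs two lookups per ATX. The gap described above was not in noticing such a pair but in turning it into a proof that other nodes could verify and act on.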
The Hotfix
We immediately began working on a fix, but the fix was rather large because it involved introducing a new class of malfeasance proof and adding some additional guardrails. We would normally include bugfixes as part of the ordinary go-spacemesh release process. However, in the case of this update, it would’ve been immediately obvious to any observer that we were patching an existing, serious implementation flaw, so in the name of network and protocol security we decided to take the extraordinary measure of releasing a hotfix binary release before releasing the code including the hotfix. The hotfix code was released two days later once we were confident that a sufficient proportion of nodes had updated and that the flaw could no longer be abused.
Malfeasant Identities
Today, identities that violated this protocol rule in the past will be marked malfeasant and will no longer be eligible to participate in consensus or earn block rewards. It’s important to note that we did not apply this fix retroactively: proposals submitted by these identities were not retroactively cancelled, and all rewards earned by these identities are retained. No account balances were changed and no irregular state transition occurred as a result of this update. However, for purposes of protocol security, and due to the lack of slashing in the Spacemesh protocol, it’s essential that miner identities that violate protocol rules, including by constructing invalid ATXs, be marked malfeasant. In particular, in this case there was no way to “forgive” miner identities that previously generated malfeasant ATXs without sacrificing protocol security.
To reiterate: even before the fix, the go-spacemesh code clearly and unambiguously identified this behavior as malfeasant and in violation of protocol rules, and produced a warning to this effect. Moreover, as with the previous equivocation issue, unpatched go-spacemesh software would never have produced a malfeasant ATX violating this protocol rule if extraordinary steps hadn’t been taken, such as manually modifying the node state and running the same miner identity on multiple nodes with multiple PoET phases. Home miners running Smapp or go-spacemesh are therefore unaffected by this change.
We calculated that 586 miner identities in total out of about 3.4M miner identities were affected by this change and marked as malfeasant. That’s around 0.017% of all miner identities. We’re aware that some community members inadvertently fell afoul of this issue while “performing surgery” on their node by deleting state, registering to multiple PoET rounds, etc. We feel bad that this happened but please understand that we must put protocol security first and that we unfortunately have no way to differentiate between those who intentionally took advantage of the flaw and those who did so unintentionally.
Going Forward
We’ve always been committed to putting the security, stability, and integrity of the Spacemesh network and protocol first and we will continue to do so even where it causes inconvenience for certain miners or users. The Spacemesh network and protocol aren’t useful to anyone if they’re not as secure as possible.
We do our best to write bug-free code, to test code as thoroughly as possible, and to work with third-party vendors to perform audits on our code. Nevertheless, despite our best efforts, bugs like this one will appear from time to time. This is a good opportunity to remind the Spacemesh community that the Spacemesh network and software are still in alpha phase and are largely unaudited, and that you should not invest more time or money in the network than you can afford to lose. We’re grateful for the community’s patience and understanding as we continue to work towards improving the Spacemesh protocol and software, and as always we welcome your contributions to this open source project.
We’ve tried hard to communicate clearly and transparently about this issue, the resolution, and the consequences. We ask for the community’s understanding that, while we work in the open as much as possible, serious security vulnerabilities and their resolution are the one exception to this rule: in line with standard industry practice, we may from time to time have to ensure that an issue is resolved before disclosing it publicly.
Nevertheless, we recognize that we could’ve done a better job communicating all of this, and we recommit ourselves to doing an even better job next time. To this end, we remain open to your feedback, suggestions, and constructive criticism.