PLEASE NOTE: This version of the roadmap is a work in progress, and will be subject to further refinements.
In this document, our chief architect Noam Nelke has laid out the issues we’re currently working on and what we have lined up. He also explains our choices by tying the issues we’re addressing to our values and goals.
Before diving into the details, this roadmap outlines our current focus areas and the key developments we are prioritizing. Each issue addresses critical challenges or improvements necessary to enhance Spacemesh's performance, usability, and scalability.
Structural Improvements
Node Split
We’re separating the Full Node logic into two distinct parts. The
first part, the node, handles p2p, gossip, determining consensus,
executing blocks, and maintaining state. The second part, the
smeshing software, is for active participation, including managing
PoST, determining eligibility, and publishing messages like ATXs,
proposals, ballots, and Hare messages.
In accordance with the Spacemesh ethos, we always strive to make
smeshing as easy as possible for any smesher, big or small. While
we’re working on ways to make running a node less resource
intensive, we realized that running the node separately from the
smeshing software makes sense for many reasons.
Here’s a high-level, illustrative comparison of the two proposed
components:
Go-spacemesh Node | Smeshing Software |
---|---|
To be able to connect to the gossip network, nodes must be internet accessible, making security more challenging. | Only needs to support outgoing connections (to nodes and PoET servers) so can be kept isolated from the internet (block all incoming connections). |
Holds only publicly available data, making security less important. | Needs to be secure and keep the smesher’s private key safe. |
Multiple nodes can be used for redundancy and staggered upgrades. | Multiple smeshing clients for the same identity might accidentally sign and publish conflicting messages - invalidating the identity. |
Syncing and validating missing layers can be slow when catching up to the gossip network after a restart or crash. | Near instantaneous startup. |
When multiple nodes are configured, failure or restart of one of them has virtually no impact on the smeshing software. | Downtime may result in loss of rewards. |
As the above table shows, the two components have vastly different
requirements and sensitivities. Splitting the two will make them
both easier to develop, as priorities become clearer, but more
importantly, users will have a much easier time maintaining the
system.
While we initially expect smeshers to still run both parts of the
operation (the node and the smeshing software), once this is tested
and stable, we intend to support reduced-trust external nodes. This
means that if several entities offer public nodes that other
smeshers can use, there’ll be systems in place to validate them
against each other and make smart autonomous decisions about which
external nodes to delegate decisions to. More on this in a future
update.
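Since the details of the reduced-trust design are deferred to a future update, the following is only a minimal sketch of the general idea of validating nodes against each other. It assumes the smeshing software asks several external nodes for the same piece of public data (here a state hash for a layer) and only accepts a value that a strict majority agrees on. The function name `pickByQuorum`, the node names and the hash strings are all hypothetical and are not part of go-spacemesh.

```go
package main

import "fmt"

// pickByQuorum returns the value reported by at least `quorum` nodes.
// responses maps a (hypothetical) node name to the state hash that node
// reported. The quorum must be a strict majority of the responses, so at
// most one value can ever reach it.
func pickByQuorum(responses map[string]string, quorum int) (value string, ok bool) {
	if quorum*2 <= len(responses) {
		return "", false // not a strict majority: two values could both qualify
	}
	counts := make(map[string]int)
	for _, hash := range responses {
		counts[hash]++
	}
	for hash, n := range counts {
		if n >= quorum {
			return hash, true
		}
	}
	return "", false // the nodes disagree too much to trust any answer
}

func main() {
	// Illustrative data only: two nodes agree, one reports something else.
	responses := map[string]string{
		"node-a": "0xabc",
		"node-b": "0xabc",
		"node-c": "0xdef",
	}
	hash, ok := pickByQuorum(responses, 2)
	fmt.Println(hash, ok) // prints "0xabc true"
}
```

A real mechanism would presumably also verify whatever can be checked locally instead of relying on agreement alone. The sketch shows only the cross-checking step.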
Status:
The rollout plan has 3 stages:
Internal PoC - This phase is done and we’ve
proven that it works. We’re still testing and refining the code
ahead of the next stage.
Initial Version - This version will initially be
released in parallel with the existing unified version for limited
testing by smeshers that are willing to take some risk, possibly
for a smaller part of their deployment. As we iron out issues,
interested parties can build services on top of this architecture.
Eventually we’ll start recommending this version to all
smeshers.
Initial version should be released in a few weeks.
Node Rewrite - We plan to rewrite the node using
this architecture. This will make the code easier to maintain,
more efficient and more stable.
This is a longer term project that will not get significant development resources before the new PoST protocol is nearing completion.
Sync v2
We’re reworking sync to make it considerably faster and more
bandwidth efficient. This includes quick set reconciliation between
neighboring nodes, fast detection when the node is already synced
(much shorter wait for sync after a restart), greatly improved
bandwidth usage (up to 10x savings in some scenarios) and smarter
peer selection.
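To illustrate why set reconciliation can save bandwidth, here is a small sketch of one common approach, range-based reconciliation: two peers compare a compact fingerprint of a range of item IDs, skip the range if the fingerprints match, and otherwise split it and recurse. This is an assumption about the general technique and not a description of the Sync v2 protocol. The names `missing`, `fp` and `inRange`, the use of `uint64` IDs and the XOR fingerprint are all illustrative. A real implementation would use a collision-resistant fingerprint and exchange messages over the network instead of reading both sets locally.

```go
package main

import (
	"fmt"
	"sort"
)

// fingerprint is a toy summary of a range: XOR of the items plus their count.
type fingerprint struct {
	xor   uint64
	count int
}

func fp(items []uint64) fingerprint {
	f := fingerprint{count: len(items)}
	for _, it := range items {
		f.xor ^= it
	}
	return f
}

// inRange returns the items of a sorted slice that fall in [lo, hi).
func inRange(sorted []uint64, lo, hi uint64) []uint64 {
	i := sort.Search(len(sorted), func(k int) bool { return sorted[k] >= lo })
	j := sort.Search(len(sorted), func(k int) bool { return sorted[k] >= hi })
	return sorted[i:j]
}

// missing returns the items in [lo, hi) that remote has and local lacks.
// Both slices must be sorted and free of duplicates.
func missing(local, remote []uint64, lo, hi uint64) []uint64 {
	l, r := inRange(local, lo, hi), inRange(remote, lo, hi)
	if fp(l) == fp(r) {
		return nil // fingerprints match: treat the whole range as synced
	}
	if len(r) <= 2 {
		// Small range: cheaper to compare the items directly.
		have := make(map[uint64]bool, len(l))
		for _, it := range l {
			have[it] = true
		}
		var out []uint64
		for _, it := range r {
			if !have[it] {
				out = append(out, it)
			}
		}
		return out
	}
	// Otherwise split the range in half and reconcile each half.
	mid := lo + (hi-lo)/2
	return append(missing(local, remote, lo, mid), missing(local, remote, mid, hi)...)
}

func main() {
	local := []uint64{1, 5, 9, 12, 20}
	remote := []uint64{1, 5, 7, 9, 12, 15, 20}
	fmt.Println(missing(local, remote, 0, 32)) // prints "[7 15]"
}
```

When the two sets are already identical, the very first fingerprint comparison succeeds and nothing else is exchanged, which is the intuition behind fast detection of an already synced node. The sketch only finds what the local side lacks. Running it with the arguments swapped covers the other direction.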
Status:
We’re preparing to ship the new sync to testers and possibly to a wider audience in “server only mode”. This is expected even before node-split.
New init and smeshing apps
We’re releasing new graphical apps for managing storage
initialization and smeshing.
Storage initialization is already performed in a separate process
today, so we’re now making it easier for users to manage this
process in a separate app. Many users use separate machines for
initialization and smeshing, so now they will be able to use this
lightweight tool on the initialization machine and not bother
installing it on the smeshing machine.
After node-split, smeshing will be separate from running a node, so
this lightweight app will allow managing smeshing in an intuitive
way.
Status:
The new apps become relevant along with node-split, so they’re expected to be released shortly after that.
Identity Split Fixes
There is a very large number of smeshing identities on Spacemesh
today. This results not only in a large number of ATXs being published
each epoch, but also in a huge number of block proposals and ballots each
layer. The large number of block proposals causes Hare pre-round
messages to be very large and sometimes not propagate in time.
Sometimes not all block proposals that are published in time reach a
critical mass of Hare participants to be considered and rewarded.
We’ve identified the root cause for the increased number of identities
to be a shortcoming of our design that allows a multitude of smaller
identities, when used in a specific way, to be more efficient than a
single large identity. The fix for this loophole is a new version of
the PoST protocol, based on the same initialized files. This new
version is a bigger update that requires significant speccing,
implementation work and testing.
In the meantime, we’re releasing a small part of the solution - the
ability to merge ATXs. This will enable large smeshers to merge their
ATXs, reducing the total number of ATXs published, but more
importantly, it will result in coalescing those smeshers’
eligibilities, so we won’t see as many block proposals and ballots, and
Hare pre-round messages will be smaller as a result.
ATX Merge
The first step in the implementation is allowing smeshers to merge
their identities. To make this possible without significant changes
to the protocol, we’ll now allow ATXs to contain a list of PoST
proofs instead of just a single one.
This change requires some additional mechanics to prevent abuse.
Identities that are consolidated into other identities have to
commit to this union, and can never join any other identity set.
The mechanism for generating and validating PoST proofs remains
unchanged and the weight of the consolidated ATX is simply the sum
of weights of the individual PoST proofs.
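The two rules above (the consolidated weight is the sum of the individual weights, and an identity that committed to one set can never join another) can be sketched as follows. This is a simplified illustration and not the actual v2 ATX format: the types `PostProof`, `MergedATX` and `Commitments`, the `SetID` field and the weight values are all hypothetical, and proof validation itself is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// PostProof is a hypothetical stand-in for one identity's PoST proof.
type PostProof struct {
	Identity string
	Weight   uint64
}

// MergedATX carries a list of PoST proofs instead of a single one.
type MergedATX struct {
	SetID  string // hypothetical identifier of the identity set
	Proofs []PostProof
}

// Weight is simply the sum of the weights of the individual proofs.
func (a MergedATX) Weight() uint64 {
	var total uint64
	for _, p := range a.Proofs {
		total += p.Weight
	}
	return total
}

var ErrCommittedElsewhere = errors.New("identity already committed to another set")

// Commitments records which set each identity has committed to.
type Commitments map[string]string

// Accept rejects the ATX if any of its identities belongs to a different
// set. Commitments are recorded only when every identity passes the check.
func (c Commitments) Accept(atx MergedATX) error {
	for _, p := range atx.Proofs {
		if set, ok := c[p.Identity]; ok && set != atx.SetID {
			return fmt.Errorf("%s: %w", p.Identity, ErrCommittedElsewhere)
		}
	}
	for _, p := range atx.Proofs {
		c[p.Identity] = atx.SetID
	}
	return nil
}

func main() {
	seen := Commitments{}
	first := MergedATX{SetID: "set-1", Proofs: []PostProof{{"id-a", 300}, {"id-b", 200}}}
	fmt.Println(first.Weight(), seen.Accept(first)) // prints "500 <nil>"

	// id-b already committed to set-1, so it cannot join set-2.
	second := MergedATX{SetID: "set-2", Proofs: []PostProof{{"id-b", 200}, {"id-c", 100}}}
	fmt.Println(seen.Accept(second))
}
```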
Having fewer heavier ATXs doesn’t just mean that the list of active
identities becomes significantly smaller, it also means that not
every individual identity gets their own guaranteed eligibility. To
be clear - this has no impact on the total weight or total reward of
combined smeshers, but it means they may need to publish fewer
ballots and proposals to achieve the same impact and reward. Even
smeshers that control hundreds of identities might need to publish only a
single ballot and proposal after merging their identities. This
reduces much of the ongoing load during the epoch. Having to
validate and propagate orders of magnitude fewer proposals will
ensure that all smeshers who published their proposal in time will
get rewarded.
Critically, it doesn’t just reduce the number of ballots and block
proposals each layer, but also the size of Hare pre-round messages.
Every pre-round message lists all block proposals that a smesher
sees in that layer. Today the list can be thousands of proposal IDs.
This set of large messages has to be validated and gossiped very
quickly, before the next round of Hare. By shrinking this message
back to its intended size, we’re ensuring Hare stability and
consistency.
Status:
The code is mostly complete. However, rolling out this change safely is tricky and will have to be done carefully. It’s in advanced phases of testing and should be rolled out by February. We expect to support v2 ATXs on mainnet starting from epoch 41.
New PoST
While ATX Merge fixes many of the issues that stem from splitting
identities, it doesn’t fix the root cause. Having to validate and
propagate all the individual PoST proofs that are included in ATXs
is still a heavy burden.
The long term solution is a new version of PoST. It eliminates
the incentive to have smaller identities and makes combined PoST
proofs for merged identities almost the same size as proofs for
single identities, without requiring smeshers to re-initialize
their PoST data.
Today there’s a label quality threshold for the proof which depends
on the smesher’s allocated storage size. Smeshers need to find a
proof with enough valid labels. This provides some degrees of
freedom, as long as enough sufficiently good labels are found. This
is what creates the bad incentive to split identities.
By removing the concept of the threshold and having smeshers include
the “best” (lowest hash) labels from across all their identities we
eliminate the degrees of freedom. The weight of the combined proof
is then determined by the quality of labels. Nodes estimate the
allocation size that would have led to this quality of labels (in a
deterministic way) and use that weight to grant the smesher
eligibility, voting weight and rewards.
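The spec for this is still being finalized, so the following is only a sketch of how an allocation size could be estimated deterministically from label quality. It assumes label hashes behave like uniformly random 64-bit values and applies a standard "k minimum values" style estimate: if the k-th lowest hash found is v, the number of labels searched is roughly (k-1) * 2^64 / v. The function name `estimateLabels` and the choice of estimator are assumptions for illustration and may differ from the final protocol. Integer arithmetic is used so that every node would compute exactly the same result.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// estimateLabels estimates how many labels were searched to find the given
// lowest hashes, assuming hashes are uniform over the uint64 range.
// It returns false if fewer than two hashes are supplied or the estimate
// would not fit in a uint64.
func estimateLabels(lowest []uint64) (uint64, bool) {
	k := uint64(len(lowest))
	if k < 2 {
		return 0, false
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]uint64(nil), lowest...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	kth := sorted[k-1] // the largest of the k lowest hashes

	if kth <= k-1 {
		return 0, false // quotient would overflow 64 bits
	}
	// floor((k-1) * 2^64 / kth), computed as a 128-bit by 64-bit division.
	est, _ := bits.Div64(k-1, 0, kth)
	return est, true
}

func main() {
	// Illustrative hashes only: four evenly spaced values.
	hashes := []uint64{3 << 54, 1 << 54, 4 << 54, 2 << 54}
	est, ok := estimateLabels(hashes)
	fmt.Println(est, ok) // prints "768 true"
}
```

Lower hashes produce a larger estimate: halving every hash in the example doubles the result. Because the hashes a smesher happens to find vary from epoch to epoch, an estimate of this kind fluctuates around the true allocation, which matches the statistical variability described in the text.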
The side effect of this is that smesher weight will be subject to
some statistical variability from epoch to epoch, depending on the
quality of labels they happened to find. However, smeshers will no
longer have an incentive to split their identities and smeshing will
become fair again.
Status:
We’re finalizing the spec for this, and development is set to begin within the next few weeks. It’s not a small undertaking, so expect a few months of development and testing before an official launch date is announced.
Athena
Athena is our most ambitious undertaking since launching Spacemesh. We
promised to be the best and most efficient smart contract platform and
we intend to keep that promise. Athena development is happening in a
parallel track to the above, and we’re making good progress.
While the groundwork for running arbitrary template code is being
developed, the research team is working on narrowing down the launch
scope of the infrastructure around the core VM. We’re trying to get an
initial version in the hands of template developers as quickly as
possible, while keeping the path clear towards our ultimate vision for
the interaction between the VM and the rest of the system.
We’re aiming for the simplest, yet most powerful, mechanism that’s as
easy as possible for smeshers to support. This will allow us to keep
fees down and throughput up.
Status:
While all other tasks listed above can be considered “maintenance”
(making the existing system work better), Athena is a real upgrade.
We’re working on this in a parallel, completely separate track.
An Athena testnet is already running. We took some shortcuts to make
it operational, so it doesn’t really resemble how the final product
will work, but it allows us to “kick the tires” on the code we
already have.
Getting Athena to be ready for a prime-time rollout will take some
time, but we’re confident that it will be worth the wait.