Rajvardhan

February 2, 2023

Exploring Cosmos: A Security Primer

A developer’s guide to building secure applications on Cosmos

What Is Cosmos? Why Use Cosmos?

What exactly is Cosmos, the so-called internet of blockchains? Simply put, it’s a network of interoperable blockchains. The Cosmos SDK provides developers a framework to easily create their own application-specific chain that can communicate with other chains via the inter-blockchain communication (IBC) protocol.

Now, why would we want to create our own application-specific blockchain in the first place? One advantage is performance. A custom blockchain means we don’t need to compete with other applications for resources. The underlying blockchain is used for a single purpose instead of being shared by a million other applications. There’s also no need for transactions to go through the bottleneck of being interpreted by a virtual machine.

Another big advantage is sovereignty. Unlike applications deployed on general-purpose blockchains, the governance of our application is not limited by the governance of the underlying blockchain. That is, there is only one layer of governance.

Architecture Overview

Application-specific blockchains can be broken up into three parts: the networking layer, the consensus layer, and the application layer.

Tendermint Core handles the first two layers for us, propagating transactions and allowing nodes to agree on the transactions that should be added to the blockchain. As developers, we only need to implement the application layer, which is responsible for processing transactions and updating state. We can use the Cosmos SDK, a Golang framework, to build this.

An application built using the Cosmos SDK is composed of individual modules. Modules can be thought of as building blocks that can be composed together to create a complete blockchain application. Each module in the SDK is responsible for a specific aspect of the blockchain application, such as managing the state of the blockchain or interacting with external systems. By using modules, developers can easily reuse existing functionality and focus on building the unique features of their application.

The Cosmos SDK includes a number of built-in modules, such as the staking module, the governance module, and the IBC module, and it also allows developers to create their own custom modules. The vision is to create a massive ecosystem of open-source modules, allowing developers to quickly and easily create complex applications.

However, without an access control mechanism, the potential for malicious modules creates a large security risk. For instance, an open-source module could turn malicious due to a supply chain attack. To mitigate this risk, the Cosmos SDK uses the object-capability model to enforce boundaries between modules. A module can only obtain a capability if it is passed a reference to an object that holds that capability.

Object Capability Model

Each module can be thought of as an independent state machine. A module can use one or more KVStores (key-value stores) to maintain its state. The store key gives unrestricted read and write access to the store. A Keeper object is defined to hold this key, along with methods to interface with the store in a more constrained manner. The keeper is named so because it is the gatekeeper to the store. It is therefore essential that the storeKey is not exposed to other modules in the application.

Modules can define interfaces and extend their Keeper to interact with external modules. The interface should only expose the methods that are essential for the module’s operation. The following is an example of the staking module’s Keeper, which holds keepers from the bank and auth modules.

// Keeper of the x/staking store
type Keeper struct {
    storeKey   storetypes.StoreKey
    cdc        codec.BinaryCodec
    authKeeper types.AccountKeeper
    bankKeeper types.BankKeeper
    hooks      types.StakingHooks
    authority  string
}

The bank module defines the following interface.

type Keeper interface {
    SendKeeper
    WithMintCoinsRestriction(MintingRestrictionFn) BaseKeeper

    InitGenesis(sdk.Context, *types.GenesisState)
    ExportGenesis(sdk.Context) *types.GenesisState
    ...
    MintCoins(ctx sdk.Context, moduleName string, amt sdk.Coins) error
    BurnCoins(ctx sdk.Context, moduleName string, amt sdk.Coins) error

    DelegateCoins(ctx sdk.Context, delegatorAddr, moduleAccAddr sdk.AccAddress, amt sdk.Coins) error
    UndelegateCoins(ctx sdk.Context, moduleAccAddr, delegatorAddr sdk.AccAddress, amt sdk.Coins) error

    types.QueryServer
}

Note that the BankKeeper interface, which the staking module defines for its own use, includes only a subset of these methods. Only this subset is exposed to the staking module.

type BankKeeper interface {
    GetAllBalances(ctx sdk.Context, addr sdk.AccAddress) sdk.Coins
    GetBalance(ctx sdk.Context, addr sdk.AccAddress, denom string) sdk.Coin
    LockedCoins(ctx sdk.Context, addr sdk.AccAddress) sdk.Coins
    SpendableCoins(ctx sdk.Context, addr sdk.AccAddress) sdk.Coins

    GetSupply(ctx sdk.Context, denom string) sdk.Coin

    SendCoinsFromModuleToModule(ctx sdk.Context, senderPool, recipientPool string, amt sdk.Coins) error
    UndelegateCoinsFromModuleToAccount(ctx sdk.Context, senderModule string, recipientAddr sdk.AccAddress, amt sdk.Coins) error
    DelegateCoinsFromAccountToModule(ctx sdk.Context, senderAddr sdk.AccAddress, recipientModule string, amt sdk.Coins) error

    BurnCoins(ctx sdk.Context, name string, amt sdk.Coins) error
}

The external keepers are then initialized as the new interface type in the staking module’s constructor.

// NewKeeper creates a new staking Keeper instance
func NewKeeper(
    cdc codec.BinaryCodec,
    key storetypes.StoreKey,
    ak types.AccountKeeper,
    bk types.BankKeeper,
    authority string,
) *Keeper {
    ...
    return &Keeper{
        storeKey:   key,
        cdc:        cdc,
        authKeeper: ak,
        bankKeeper: bk,
        hooks:      nil,
        authority:  authority,
    }
}
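The pattern above can be illustrated with a minimal, standard-library-only sketch. It does not use the Cosmos SDK: the names bankKeeper, restrictedBank, and stakingKeeper are hypothetical stand-ins, and a plain map plays the role of the store. The point is only that the consuming module is handed a narrow interface, so the minting method is not part of what it can call.

```go
package main

import (
	"errors"
	"fmt"
)

// bankKeeper is a toy stand-in for a keeper that owns its module state.
// The unexported map plays the role of the store guarded by a store key.
type bankKeeper struct {
	balances map[string]int64
}

func newBankKeeper() *bankKeeper {
	return &bankKeeper{balances: map[string]int64{}}
}

// MintCoins is a powerful capability that most callers should not receive.
func (k *bankKeeper) MintCoins(addr string, amt int64) {
	k.balances[addr] += amt
}

func (k *bankKeeper) GetBalance(addr string) int64 { return k.balances[addr] }

func (k *bankKeeper) SendCoins(from, to string, amt int64) error {
	if amt < 0 || k.balances[from] < amt {
		return errors.New("insufficient funds")
	}
	k.balances[from] -= amt
	k.balances[to] += amt
	return nil
}

// restrictedBank is the narrow view handed to a consuming module.
// It deliberately omits MintCoins.
type restrictedBank interface {
	GetBalance(addr string) int64
	SendCoins(from, to string, amt int64) error
}

// stakingKeeper only ever sees the restricted interface.
type stakingKeeper struct {
	bank restrictedBank
}

func (s stakingKeeper) Delegate(delegator, pool string, amt int64) error {
	return s.bank.SendCoins(delegator, pool, amt)
}

func main() {
	bank := newBankKeeper()
	bank.MintCoins("alice", 100)

	staking := stakingKeeper{bank: bank}
	fmt.Println(staking.Delegate("alice", "bonded_pool", 40))             // <nil>
	fmt.Println(bank.GetBalance("alice"), bank.GetBalance("bonded_pool")) // 60 40
	// staking.bank.MintCoins(...) would not compile: the method is not part of restrictedBank.
}
```

One caveat of this sketch: in Go, a type assertion can recover the concrete type behind an interface when that type is visible to the caller, so the narrow interface limits what a module is handed rather than acting as a hard sandbox.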

BaseApp

The BaseApp type implements most of the core functionality of the Cosmos SDK. It primarily consists of the application blockchain interface (ABCI) implementation to interact with Tendermint, a multistore to persist the state, and a message service router to route transactions to the appropriate modules. A developer would typically extend BaseApp to create an application on top of the Cosmos SDK.

type BaseApp struct {
    // initialized on creation
    logger            log.Logger
    name              string                      // application name from abci.Info
    db                dbm.DB                      // common DB backend
    cms               storetypes.CommitMultiStore // Main (uncached) state
    ...

    mempool         mempool.Mempool            // application side mempool
    anteHandler     sdk.AnteHandler            // ante handler for fee and auth
    postHandler     sdk.PostHandler            // post handler, optional, e.g. for tips
    initChainer     sdk.InitChainer            // initialize state with validators and state blob
    beginBlocker    sdk.BeginBlocker           // logic to run before any txs
    processProposal sdk.ProcessProposalHandler // the handler which runs on ABCI ProcessProposal
    prepareProposal sdk.PrepareProposalHandler // the handler which runs on ABCI PrepareProposal
    endBlocker      sdk.EndBlocker             // logic to run after all txs, and to determine valset changes
    ...

    // checkState is set on InitChain and reset on Commit
    // deliverState is set on InitChain and BeginBlock and set to nil on Commit
    checkState           *state // for CheckTx
    deliverState         *state // for DeliverTx
    processProposalState *state // for ProcessProposal
    prepareProposalState *state // for PrepareProposal
  ...
}

Some of the important components include the following:

CommitMultiStore is the main uncached state present in BaseApp. It consists of KVStores from each module in the application. The state is committed at the end of each block, once precommits have been signed by more than two-thirds of the validators (by voting power).
anteHandler is run during both CheckTx and DeliverTx phases and is responsible for authentication, fee payment, and other pre-execution checks.
The application maintains a mempool that stores valid transactions from the CheckTx phase. These are later used by PrepareProposalHandler to propose a block, while ProcessProposalHandler validates blocks proposed by others.
checkState and deliverState are cached states that are branched from CommitMultiStore. These are used during the CheckTx and DeliverTx phases respectively.
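The idea of branching a cached state from the committed store can be sketched in a few lines. This is a simplified illustration and not the SDK's store implementation: kvStore, cacheStore, and branch are hypothetical names, and deletions and nested branches are left out. Writes are buffered in the branch and reach the parent only when Write is called, so a failed transaction is undone simply by dropping the branch.

```go
package main

import "fmt"

// kvStore is a toy stand-in for the committed (uncached) state.
type kvStore map[string]string

// cacheStore is a branch of a parent store: reads fall through to the
// parent, writes are buffered until Write is called.
type cacheStore struct {
	parent  kvStore
	pending map[string]string
}

func branch(parent kvStore) *cacheStore {
	return &cacheStore{parent: parent, pending: map[string]string{}}
}

func (c *cacheStore) Get(key string) (string, bool) {
	if v, ok := c.pending[key]; ok {
		return v, true
	}
	v, ok := c.parent[key]
	return v, ok
}

func (c *cacheStore) Set(key, value string) { c.pending[key] = value }

// Write flushes the buffered writes into the parent. The keys are distinct,
// so the map iteration order does not affect the result.
func (c *cacheStore) Write() {
	for k, v := range c.pending {
		c.parent[k] = v
	}
	c.pending = map[string]string{}
}

func main() {
	committed := kvStore{"balance/alice": "100"}

	// A failing transaction: the branch is dropped without calling Write.
	failed := branch(committed)
	failed.Set("balance/alice", "0")

	// A successful transaction: the branch is flushed to the parent.
	ok := branch(committed)
	ok.Set("balance/alice", "90")
	ok.Write()

	fmt.Println(committed["balance/alice"]) // 90
}
```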

Transaction Flow

CheckTx

Once a transaction is received by a full node running Tendermint, it sends the CheckTx message to the application layer through the ABCI. The goal of this phase is to eliminate invalid transactions early on and safeguard the full node’s mempool from spam transactions. Note that no gas fees are charged during this phase; therefore, it is recommended to keep the checks lightweight.

The CheckTx phase performs the following steps:

The transaction is decoded into sdk.Msgs from raw bytes provided by Tendermint.
It then runs validateBasic on each sdk.Msg, which performs basic sanity checks.
The checkState, which itself is cached, is branched again before the anteHandler is invoked. This is done to ensure that writes to the state are not committed if anteHandler fails.
The anteHandler performs authentication and fee checks on the transaction. Note that no precise gas checks are performed as the handler does not process sdk.Msg. The gas fee is calculated based on transaction size and minGasPrices, which is specific to the node.
If the validation checks pass, the transaction is added to Tendermint’s mempool and the application side mempool if it exists.

BeginBlock

When a block proposal is received by the Tendermint engine, it uses ABCI to send the BeginBlock message to the application layer. This allows developers to execute code before Msgs are run. It also resets the main gas meter, branches CommitMultiStore to initialize deliverState, and calls BeginBlocker() for all loaded modules. It is important to avoid computationally heavy logic in BeginBlock, since it is not directly associated with a user transaction and no gas fee can be charged.

DeliverTx

The DeliverTx phase performs the same checks as the CheckTx phase before executing the transaction (except for the gas fee checks in anteHandler, which differ between nodes). The transaction is also removed from the application-side mempool. Each Msg in the transaction is then routed to the appropriate module’s protobuf service. If all Msgs execute successfully and the postHandler succeeds, the branched multistore is written to deliverState’s CacheMultiStore.

This is where the bulk of the application logic is defined and executed.

Notably, the auth module exposes an AnteHandler that performs signature verification, fee deduction, gas calculation, and account sequence incrementing, the last of which prevents replay attacks.

EndBlock

The EndBlock message is sent to the application by Tendermint once all the transactions have finished executing. Like BeginBlock, it can be used by developers to execute code after Msgs have been run. It also calls EndBlocker() for all loaded modules.

Commit

Once consensus has been reached (i.e., the underlying Tendermint engine has received precommits from more than 2/3 of the validators by voting power), it sends the Commit ABCI message to the application. The Cosmos SDK then writes the branched multistore from deliverState to app.cms, which is then persisted. deliverState holds all state transitions from the BeginBlock, DeliverTx, and EndBlock phases. Finally, the SDK sets deliverState to nil and returns the commit hash for app.cms to Tendermint.

Potential Bugs Using Cosmos SDK

Non-determinism

One of the most common mistakes is having non-deterministic behavior in your application, leading to consensus failing and the blockchain halting. For instance, the Security Advisory Jackfruit reported a problem where the authz module was vulnerable due to non-determinism. The module used local clock times, which are subjective to the node. The block header’s timestamp should have been used instead.

func (g Grant) ValidateBasic() error {
    // BAD: time.Now() reads the node's local clock, so nodes can disagree
    if g.Expiration.Unix() < time.Now().Unix() {
        return sdkerrors.Wrap(ErrInvalidExpirationTime, "Time can't be in the past")
    }
    return nil
}

Another common source of non-determinism is iterating over maps. This is because the iteration order of maps is not deterministic. It is recommended to only iterate over maps when either clearing a map or retrieving the keys from a map so it can be sorted.

m := map[string]int{
    "a": 0,
    "b": 1,
    "c": 2,
    "d": 3,
}
// BAD: Non-deterministic
for k, v := range m {
    fmt.Println(k, v)
}

// GOOD: Deterministic
ks := make([]string, 0, len(m))
for k := range m {
    ks = append(ks, k)
}
sort.Strings(ks)
for _, k := range ks {
    fmt.Println(k, m[k])
}

Other sources of non-determinism include (but are not limited to) using certain system packages (e.g., unsafe, runtime, reflect, math/rand), goroutine execution order, and calls to any external sources (e.g., disk, network).

Integer overflows

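Integer overflow is possible in Go, where fixed-width arithmetic wraps around silently. It is recommended to use a checked math library (such as the Cosmos SDK’s math package) instead of raw arithmetic on amounts. Caution should also be taken when casting from an unsigned integer to a signed integer.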

var x uint8 = 255
// INT OVERFLOW: prints 0
fmt.Println(x + 1)

Float (lack of) associativity

Floats are not associative, due to having finite precision. If you are using floats, make sure that the loss in precision does not result in ill-defined behavior.

var f, f2, f3 float64 = 0.1, 0.2, 0.3
// FALSE
fmt.Println((f+f2)+f3 == f+(f2+f3))

Panics and unbounded computation in BeginBlock/EndBlock

Care should be taken so that panics and excessive computation do not occur in BeginBlock or EndBlock. Mistakes in these functions can result in denial of service. Note that BeginBlock and EndBlock are not tied to a user sending a transaction, so you cannot directly collect gas in these functions. If needed, gas should be collected beforehand, in the transaction that schedules the work.
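One common way to keep per-block work bounded is to cap how many queued items are handled in each block and carry the remainder over. The sketch below illustrates this under stated assumptions: the queue type, the endBlock method, and the maxItemsPerBlock constant are all hypothetical and not part of the SDK.

```go
package main

import "fmt"

// maxItemsPerBlock is a hypothetical cap on the work done in one block.
const maxItemsPerBlock = 2

// queue holds work items scheduled by earlier transactions.
type queue struct {
	pending []string
}

// endBlock processes at most maxItemsPerBlock items and leaves the rest
// for later blocks, so the cost per block stays bounded no matter how
// long the queue grows. It returns the number of items processed.
func (q *queue) endBlock(process func(string)) int {
	n := len(q.pending)
	if n > maxItemsPerBlock {
		n = maxItemsPerBlock
	}
	for _, item := range q.pending[:n] {
		process(item)
	}
	q.pending = q.pending[n:]
	return n
}

func main() {
	q := &queue{pending: []string{"a", "b", "c", "d", "e"}}
	for height := 1; len(q.pending) > 0; height++ {
		done := q.endBlock(func(string) {})
		fmt.Println("height", height, "processed", done)
	}
	// height 1 processed 2
	// height 2 processed 2
	// height 3 processed 1
}
```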

Inter-Blockchain Communication (IBC)

Now that we have an application-specific blockchain, we want to be able to communicate with other chains. This is done through the Inter-Blockchain Communication (IBC) protocol. So, how does IBC work?

One component of IBC is the light client. Light clients are on-chain nodes that store a subset of the blockchain state and serve to track the consensus state of another chain. A unique client ID is used to identify each light client. It also contains the proof specs of the other chain, allowing it to verify commitment proofs against the consensus state. This allows resource-constrained devices to participate in the network.

Light clients are encapsulated in connectionEnd objects. A pair of connectionEnd objects on separate blockchains make up a connection. Through a four-way handshake, connections establish the identity of the other chain and verify that the light clients correctly correspond to the connected chains.

In order to pass data (e.g., when performing the connection handshake), blockchains first commit state to a specific path. A relayer monitors this path and relays the data, alongside a proof, to the counterparty chain. The proof is then passed to the light client and verified. With this architecture, you do not need to trust the relayer. If a relayer passes faulty data, the proof is simply rejected, which only affects the liveness of the network. The security of IBC, in theory, reduces to the security of the connected chains.
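The reason an untrusted relayer cannot forge data can be illustrated with a toy Merkle proof check. This is a deliberately simplified sketch and not the ICS-23 proof format used by IBC: the step type and the hashPair, leafHash, and verify functions are hypothetical, and there is no domain separation between leaves and inner nodes. The light client only needs the committed root; any data that does not hash up to that root is rejected.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// step is one level of a Merkle path.
type step struct {
	Sibling []byte
	Left    bool // true if the sibling is the left child at this level
}

func leafHash(data []byte) []byte {
	h := sha256.Sum256(data)
	return h[:]
}

func hashPair(l, r []byte) []byte {
	h := sha256.Sum256(append(append([]byte{}, l...), r...))
	return h[:]
}

// verify recomputes the root from the relayed data and proof, and compares
// it with the root the light client already trusts.
func verify(root, data []byte, proof []step) bool {
	cur := leafHash(data)
	for _, s := range proof {
		if s.Left {
			cur = hashPair(s.Sibling, cur)
		} else {
			cur = hashPair(cur, s.Sibling)
		}
	}
	return bytes.Equal(cur, root)
}

func main() {
	// The source chain commits two packets; the root is what the light
	// client on the counterparty chain tracks.
	p0, p1 := []byte("packet-0"), []byte("packet-1")
	root := hashPair(leafHash(p0), leafHash(p1))

	// The relayer supplies the data together with a proof.
	proof := []step{{Sibling: leafHash(p1), Left: false}}

	fmt.Println(verify(root, p0, proof))               // true
	fmt.Println(verify(root, []byte("forged"), proof)) // false
}
```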

Once a connection has been established, a channel is created to relay packets between cross-chain modules. First, a module can bind to a port. Doing so will return a dynamic object capability. A channel can then be established between two ports with another four-way handshake. This also returns another dynamic capability. Performing any action on the ports or channel will require the correct capability, meaning malicious modules cannot use ports or channels they do not own. A module can now pass data to another module by sending packets through the channel. This construction is named the IBC/TAO layer – TAO standing for transport, authentication, and ordering.

Interchain Security

In Tendermint, more than 2/3 of the validators (weighted by voting power) must agree to reach consensus. This means that if you control 1/3 or more of the voting power, you can attack the liveness of the chain. If you control more than 2/3, you can attack the correctness of the chain.
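These thresholds can be written as integer comparisons, which avoids floating-point rounding. The sketch below is only an illustration: canCommit and canStall are hypothetical helper names, and it assumes the total voting power is small enough that multiplying it by 3 does not overflow an int64.

```go
package main

import "fmt"

// canCommit reports whether the given power is strictly more than 2/3 of
// the total, which is the share needed to commit a block.
func canCommit(power, total int64) bool {
	return 3*power > 2*total
}

// canStall reports whether withholding the given power leaves the rest
// with no more than 2/3 of the total, so no block can be committed.
func canStall(power, total int64) bool {
	return 3*power >= total
}

func main() {
	const total = 100
	fmt.Println(canCommit(66, total), canCommit(67, total)) // false true
	fmt.Println(canStall(33, total), canStall(34, total))   // false true
}
```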

Interchain Security allows for the sharing of validators from a provider chain (e.g., the Cosmos Hub, whose staking token is ATOM) to a consumer chain. Validators can use the tokens they staked on the provider chain to produce blocks on the consumer chain. Validators are chosen from the provider chain to run validator nodes on both consumer and provider chains. In return, they receive rewards and fees from both chains.

Of course, if a validator misbehaves or is unreliable, the tokens they staked on the provider chain can be slashed. This is achieved through a mechanism called CCV (cross-chain validation). The consumer chain can submit evidence of misbehavior to the provider chain through the CCV module, and the validator risks losing bonded tokens on the provider chain. Both the chains implement CCV modules and communicate using IBC.

IBC Security Discussion

The design of IBC certainly has security in mind. There’s no need to trust a third party to verify cross-chain communication. If there is misbehavior, there are mechanisms to limit the damage done. For example, relayers can submit proofs of validator misbehavior to the light clients, and dynamic capabilities limit the impact of a malicious module.

However, as with cross-chain bridges, IBC is still an attractive target for attackers, and security is only as strong as its weakest link. The Dragonfruit advisory (the bug responsible for the BNB Smart Chain exploit) and the Dragonberry advisory showed implementation issues in the IAVL RangeProof and ics23, respectively. Luckily, the ics23 bug was discovered by the core Cosmos and Osmosis teams before an attacker found it.

The existence of IBC, as with bridges, can also make mitigation difficult if a chain is hacked, since stolen tokens can be transferred to other chains, giving the attacker an easier exit strategy.

CosmWasm

Maintaining your own chain may be overkill for a lot of applications. You can instead use CosmWasm, a module for the Cosmos SDK that implements an on-chain smart contract VM. It provides an easy-to-use interface for developers to write smart contracts in multiple languages, which can later be compiled into WebAssembly (WASM) bytecode and executed on the Cosmos network. This can also be thought of as an L1 chain where users can deploy their own smart contracts that anyone can interact with.

To deploy the contract, we first upload the wasm code to the blockchain. If successfully uploaded, a code ID will be in the response. Then, we instantiate the contract we uploaded with some InstantiateMsg.

pub fn instantiate(
    deps: DepsMut,
    _env: Env,
    info: MessageInfo,
    msg: InstantiateMsg,
) -> Result<Response, ContractError> {
    ...
}

We can instantiate the code as many times as we like. Each instantiation creates a new contract with its own unique address (so any given contract address is instantiated exactly once). This separation between uploading code and instantiation allows the reuse of common code that only differs on initialization.

Optionally, we can provide an owner (called the admin in CosmWasm) during instantiation. The owner has the ability to migrate the contract using the MsgMigrateContract transaction, pointing the contract to new code. There is no need for messy proxy patterns.

There is a single point of entry for executing transactions. The execute function is responsible for dispatching the Msg to the correct handler. This prevents the developer from accidentally exposing an unwanted entry point.

pub fn execute(
    deps: DepsMut,
    _env: Env,
    info: MessageInfo,
    msg: ExecuteMsg,
) -> Result<Response, ContractError> {
    match msg {
        ExecuteMsg::CreatePot {
            target_addr,
            threshold,
        } => execute_create_pot(deps, info, target_addr, threshold),
        ExecuteMsg::Receive(msg) => execute_receive(deps, info, msg),
    }
}

For contract composability, CosmWasm uses the actor model. The core idea is that actors can only process a single message at a time. If a contract wants to call another contract, it needs to save its state and dispatch a message. For example, the escrow contract would finish executing and return a new message that invokes BankMsg::Send. Note that all the messages are bundled up in a single transaction, so if any messages fail, the whole transaction is reverted.

fn send_tokens(to_address: Addr, amount: Vec<Coin>, action: &str) -> Response {
    Response::new()
        .add_message(BankMsg::Send {
            to_address: to_address.clone().into(),
            amount,
        })
        .add_attribute("action", action)
        .add_attribute("to", to_address)
}

If a result is needed from a message, then a submessage can be used. A submessage will give a reply by invoking the caller’s reply function.

CosmWasm Security

CosmWasm’s design learns from Solidity and prevents common pitfalls.

Reentrancy is prevented with the actor pattern. There is no volatile state when another contract is called.
Integer underflows and overflows are prevented by turning on Rust’s overflow checks.
The developers must explicitly define entry points, so there are no accidental entry points.
Errors anywhere in the message chain will revert the whole transaction.
There are no uninitialized storage pointers.
There are no delegate calls.

Denial of service is, however, still a valid concern.

Conclusion

Overall, the Cosmos ecosystem is a unique vision for the future of decentralized apps. Many security mechanisms are in place to help prevent developers from introducing costly bugs. Combined with robust testing frameworks available in both Cosmos SDK and CosmWasm, we’re optimistic about the future of Cosmos.

About Us

Zellic specializes in securing emerging technologies. Our security researchers have uncovered vulnerabilities in the most valuable targets, from Fortune 500s to DeFi giants.

Developers, founders, and investors trust our security assessments to ship quickly, confidently, and without critical vulnerabilities. With our background in real-world offensive security research, we find what others miss.
