Suspicious Contract Detection Bot

Article by Forta Network, May 10, 2022

Christian Seifert | Forta Researcher-in-Residence

In this post, we modify an existing Detection Bot to turn it into a generic bot that generates alerts relevant to protocol security events. The starting point is the potential exploiter bot developed by Nethermind, which raises an alert when an account funded through a privacy protocol (e.g. Tornado Cash) creates a new contract. This pattern is common among malicious actors because it lets the attacker stay anonymous while preparing a subsequent attack (e.g. a reentrancy attack requires an attacker-controlled contract).


Forta leverages a network of community-developed detection bots to monitor all on-chain transactions for relevant security events. Interested protocols can subscribe to bot alerts to trigger their incident response processes. In the recent Saddle Finance hack, the attacker created an attack contract that was used to drain funds through a Uniswap oracle price manipulation. The contract contained relevant information (Uniswap, flash loan and Saddle contract addresses) that would have given Saddle an early heads-up of malicious activity targeting the Saddle protocol. By design, Forta is intended to give defenders an early signal of an attack, providing an opportunity for mitigation.

Detection Bots can be generic, raising relevant security events across the chain, such as identifying suspicious accounts based on their transaction patterns. They can also be protocol-relevant, where a protocol subscribes to an alert and is notified, for example, of a reentrancy observed on its contracts. Protocol-relevant alerts are packaged into Threat Detection Kits: bundles of generic detection bots that protocols can subscribe to in order to obtain threat-relevant alerts for their contracts, providing security value straight out of the box.

Since the potential exploiter bot developed by Nethermind is not protocol-relevant, it is of limited use to protocols: it does not expose whether the created contract is relevant to a specific protocol. Relevance is determined by whether the created contract contains addresses related to that protocol. Adding this protocol relevance to the Detection Bot is the goal of the modification discussed in this post.

To obtain those addresses, the bot first needs to identify contract creation, which happens when a transaction is sent without a To address. Note, however, that a contract can also be created by another contract within the same transaction. To catch this case, one needs to iterate over the transaction trace and extract the create events.
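
The two cases described above can be sketched as follows. This is only an illustration, not the bot's actual code: the `Trace` and `Tx` dataclasses are hypothetical stand-ins for the transaction and trace objects an SDK would expose, and their field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Trace:
    # Hypothetical, simplified view of one entry in a transaction trace.
    type: str          # e.g. "call" or "create"
    from_address: str  # account or contract that issued the action


@dataclass
class Tx:
    # Hypothetical, simplified view of a transaction.
    from_address: str
    to_address: Optional[str]  # None when the transaction deploys a contract
    traces: List[Trace] = field(default_factory=list)


def find_contract_creators(tx: Tx) -> List[str]:
    """Return each distinct address that created a contract in tx, in order."""
    creators = []
    # Case 1: a transaction without a To address is a direct deployment.
    if tx.to_address is None:
        creators.append(tx.from_address)
    # Case 2: contracts created by other contracts only appear as create traces.
    for trace in tx.traces:
        if trace.type == "create" and trace.from_address not in creators:
            creators.append(trace.from_address)
    return creators
```

Each creator returned this way can then be checked against the funding condition (e.g. funded through Tornado Cash) before any further analysis is done.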

Once the creation of a contract is identified, one needs to query the contract to extract the addresses. The contract address, however, is not specified in the To field of the trace object; it is derived by hashing the address of the creating account together with its nonce:


import rlp
from web3 import Web3


def calc_contract_address(w3, address, nonce) -> str:
   """
   this function calculates the contract address from sender/nonce
   :return: contract address: str
   """
   # RLP-encode [sender, nonce], hash with keccak256 and keep the last 20 bytes
   address_bytes = bytes.fromhex(address[2:].lower())
   encoded = rlp.encode([address_bytes, nonce])
   return Web3.toChecksumAddress(Web3.keccak(encoded)[-20:])

Once the address is found, you can start querying the contract to extract addresses that may be contained in the created contract – both statically and dynamically:

Once the contract is created, its code can be inspected to extract addresses that may be hard-coded in it. Since many 20-byte sequences in the bytecode could in theory be read as addresses, the bytecode is first disassembled into opcodes to reduce the set to likely candidates. In Python, pyevmasm can be used for this. Once the code is disassembled, one can iterate over the potential addresses (20-byte hex string parameters) and check whether each extracted address is a contract.


from pyevmasm import disassemble_hex
from web3 import Web3


def get_opcode_addresses(w3, address) -> set:
   """
   this function returns the addresses that are referenced in the opcodes of a
   contract
   :return: address_set: set (only returning contract addresses)
   """
   if address is None:
       return set()

   code = w3.eth.get_code(Web3.toChecksumAddress(address))
   opcode = disassemble_hex(code.hex())
   address_set = set()
   for op in opcode.splitlines():
       for param in op.split(' '):
           # a 20-byte parameter is '0x' followed by 40 hex characters
           if param.startswith('0x') and len(param) == 42:
               # is_contract is a helper defined elsewhere in the bot
               if is_contract(w3, param):
                   address_set.add(Web3.toChecksumAddress(param))
   return address_set
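
To illustrate why disassembly narrows the candidate set, here is a minimal sketch that walks raw bytecode without pyevmasm. It is not the bot's implementation: the function name `push20_operands` is hypothetical, and the sketch assumes that hard-coded addresses appear as operands of the PUSH20 opcode (0x73). Skipping over push operands is what keeps data bytes from being misread as instructions.

```python
from typing import Set

# EVM push opcodes: PUSH1 (0x60) through PUSH32 (0x7f); PUSH20 is 0x73.
PUSH1, PUSH20, PUSH32 = 0x60, 0x73, 0x7F


def push20_operands(bytecode: bytes) -> Set[str]:
    """Return every complete PUSH20 operand in bytecode as a 0x-prefixed hex string."""
    found = set()
    i = 0
    while i < len(bytecode):
        op = bytecode[i]
        if PUSH1 <= op <= PUSH32:
            size = op - PUSH1 + 1  # number of operand bytes that follow
            operand = bytecode[i + 1:i + 1 + size]
            if op == PUSH20 and len(operand) == 20:
                found.add("0x" + operand.hex())
            i += 1 + size  # skip the operand so data is not read as opcodes
        else:
            i += 1
    return found
```

The returned strings are only candidates; each one would still need the contract check described above before it is reported.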

Static extraction only works if the addresses embedded in the code are in the clear. If an address is set dynamically and kept in contract storage, analyzing the storage slots may reveal an address that was not visible through static analysis. Contract storage is a topic in itself, and once dynamic arrays and mappings are involved things get complicated quickly. Since we are merely interested in the contract's simple state variables, we iterate over the first few storage slots (the number is configurable):


from hexbytes import HexBytes
from web3 import Web3


def get_storage_addresses(w3, address) -> set:
   """
   this function returns the addresses that are referenced in the storage of a
   contract (first CONTRACT_SLOT_ANALYSIS_DEPTH slots)
   :return: address_set: set (only returning contract addresses)
   """
   if address is None:
       return set()

   # CONTRACT_SLOT_ANALYSIS_DEPTH is a configurable constant defined elsewhere
   empty_slot = HexBytes('0x' + '00' * 32)
   address_set = set()
   for i in range(CONTRACT_SLOT_ANALYSIS_DEPTH):
       mem = w3.eth.get_storage_at(Web3.toChecksumAddress(address), i)
       if mem != empty_slot:
           # looking at both areas of the storage slot as - depending on packing -
           # the address could be at the beginning or the end.
           if is_contract(w3, mem[0:20]):
               address_set.add(Web3.toChecksumAddress(mem[0:20].hex()))
           if is_contract(w3, mem[12:]):
               address_set.add(Web3.toChecksumAddress(mem[12:].hex()))

   return address_set
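
The two byte ranges checked above follow from how values are laid out in a 32-byte slot. The sketch below shows the slicing on its own; the helper name `slot_candidates` is hypothetical and it makes no network calls. A lone address is right-aligned and occupies bytes 12 to 32, while an address packed after a 12-byte value ends up in bytes 0 to 20.

```python
from typing import List

ZERO_ADDRESS = "0x" + "00" * 20


def slot_candidates(slot: bytes) -> List[str]:
    """Return the distinct non-zero 20-byte candidates at both ends of a 32-byte slot."""
    if len(slot) != 32:
        raise ValueError("a storage slot is exactly 32 bytes")
    candidates = []
    # slot[12:] is where a lone, right-aligned address sits.
    # slot[:20] is where an address may land when packed with smaller values.
    for chunk in (slot[12:], slot[:20]):
        candidate = "0x" + chunk.hex()
        if candidate != ZERO_ADDRESS and candidate not in candidates:
            candidates.append(candidate)
    return candidates
```

For a slot holding a single address, the second candidate is mostly zero padding and will normally fail the contract check, which is why both ranges can be tested without much noise.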

The modified bot now raises an alert when a Tornado Cash funded account creates a new contract, and extracts the relevant addresses contained in that contract as described above.
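
How a subscriber might decide whether such an alert concerns them can be sketched as a simple set intersection. This is an assumption about how the extracted addresses could be consumed, not part of the bot: the function `match_protocols` and the watchlist structure are hypothetical.

```python
from typing import Dict, Iterable, Set


def match_protocols(extracted: Iterable[str],
                    watchlists: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each protocol name to the extracted addresses found on its watchlist."""
    # Compare in lowercase so checksummed and plain hex addresses match.
    normalized = {address.lower() for address in extracted}
    hits = {}
    for protocol, addresses in watchlists.items():
        overlap = normalized & {address.lower() for address in addresses}
        if overlap:
            hits[protocol] = overlap
    return hits
```

A protocol would populate its watchlist with its own contract and token addresses; a non-empty result for that protocol is the signal that the newly created contract references it.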

The real-life examples below show how the extracted addresses put an alert into the context of a protocol and an imminent attack:

– In the Saddle Finance attack, the Uniswap price oracle was manipulated through a flash loan. The created contract contained the addresses of the flash loan provider, Uniswap, and the Saddle protocol, which could have given Saddle a heads-up that such an attack was imminent before funds were drained.

– In the Beanstalk Finance attack, there were two contract creations: the attacker contract and the BIP-18 proposal. Each contained addresses of BEAN tokens as well as a set of addresses one usually observes in a flash loan attack (flash loan provider, several DEXes and tokens), which should have been considered suspicious by the protocol.

– The Inverse Finance attack showed similar indicators when the attacker created their contract. The created contract referenced the xINV, INV and DOLA tokens, top-tier tokens (WBTC/USDC/WETH), and several DEXes, which could have given Inverse an early signal of the price manipulation the attacker was about to execute.

It is important to note that this Detection Bot is not bulletproof. An attacker can obfuscate addresses or store them at storage locations that the bot does not monitor. That said, the beauty of Forta is that the network is powered by the community, with 1000+ developers building countless iterations of Detection Bots. I believe that the wisdom of the crowd will solve many of the problems related to this challenge. If you have ideas on how to improve this detection bot (it is located here), please modify it and share it with the community on Discord or with me directly (Christian | Forta#0582 on Discord). With Forta, we will work together to secure all transactions in Web3. To share tips and tricks on how to develop great detection bots, join Forta’s Discord and share your learnings with others.