Security Research

Unleashing the Power of Forta Threat Intelligence

November 17, 2023

Forta is the largest network of security intel in Web3. The decentralized Forta Network leverages machine learning and a community of security researchers to detect exploits, scams and other threats.

Forta is a powerful Web3 protocol that generates actionable and insightful threat intelligence, but how can you access that power to answer questions about the blockchain landscape? Several feeds exist on the network that produce lists of indicators (EOAs, contracts, URLs, social media account names) of particular attacks (Scam Detector Feed, Spam Detector Feed, Token Sniffer Rug Pull Feed, Attack Detector Feed)

This data in itself is interesting. For instance, Scam Detector data in October 2023 shows that the majority of end user attack indicators reside on Ethereum, Polygon and BSC. Of the different types of end-user attacks, the data shows address poisoning being most prevalent (but this is an artifact of how these attacks operate). 

But what if you want to answer some questions that require on-chain data to answer? In those cases, Forta Threat Intelligence really shines when combining it with the cutting-edge data platform ZettaBlock. This allows you to answer questions like:

1. Given the value of all swaps on the Uniswap USDC/ETH pool, what percentage are associated with scams?

2. How many scam and spam contracts are created over time?

Answering Question #1: Given the value of all swaps on the Uniswap USDC/ETH pool, what percentage are associated with scams?

To answer the first question, USDC inflows into the pools need to be analyzed. This is when someone swaps USDC to ETH (which scammers do given USDC have blocklist functionality and the assets could be frozen by Circle, the issuer of USDC). 

Swaps on Uniswap pools emit a swap event, which shows sender/recipient, the amounts of the swap as well as the direction of the swap. Unfortunately, the sender/recipient can consist of the Uniswap router for multicall transaction. As such, in order to determine who is swapping, the initiating EOA is utilized.

The percentage is determined by summing all USDC for USDC to ETH swaps initiated by scammers vs summing all USDC for USDC-ETH swaps. 

In practice, the following three steps need to be performed:

1. Query Forta Threat Intelligence using the Forta Graph QL API
2. Upload the data to ZettaBlock
3. Build and execute the query to answer the question at hand.

The Scam Detector’s API tutorial does an excellent job of explaining the first question. Once the data has been stored in a CSV, it can be uploaded directly to ZettaBlock using the Data Sources to CSV File upload functionality. 

Afterwards, the data will be available to query. This allows you to intersect the Forta Threat Intelligence with historical on-chain activity. 

Now, moving on to building the query starting out by summing all USDC to ETH swaps for the Uniswap USDC/ETH pool via the event emitted. This gives you the denominator of $2.49B USDC:

SELECT
  SUM(amount_usdc) / POW(10, 6) AS amount_usdc_sum
FROM
  (
    SELECT
      transaction_hash,
      from_address,
      'usdc_for_eth' AS TYPE,
      CAST(argument_values [3] AS DECIMAL) AS amount_usdc,
      CAST(argument_values [4] AS DECIMAL) AS amount_eth
    FROM
      ethereum_mainnet.logs
      JOIN (
        SELECT
          from_address,
          HASH
        FROM
          ethereum_mainnet.transactions
        WHERE
          data_creation_date >= DATE('2023-09-01')
          AND data_creation_date < DATE('2023-10-01')
      ) AS txs ON ethereum_mainnet.logs.transaction_hash = txs.hash
    WHERE
      data_creation_date >= DATE('2023-09-01')
      AND data_creation_date < DATE('2023-10-01')
      AND contract_address = '0x88e6a0c2ddd26feeb64f039a2c41296fcb3f5640'
      AND CONTAINS(
        topics,
        '0xc42079f94a6350d7e6235f29174924f928cc2ac818eb64fed8004e115fbcca67'
      )
      AND CAST(argument_values [3] AS DECIMAL) > 0
  ) AS usdc_swaps

Performing the same query, but joining it with scammer addresses identified will yield scammers swapping USDC: This provides the numerator: approximately $3.5M.

SELECT
  SUM(amount_usdc) / POW(10, 6) AS amount_usdc_sum
FROM
  (
    SELECT
      transaction_hash,
      from_address,
      TYPE,
      amount_usdc,
      amount_eth,
      scammer_address
    FROM
      (
        SELECT
          transaction_hash,
          from_address,
          'usdc_for_eth' AS TYPE,
          CAST(argument_values [3] AS DECIMAL) AS amount_usdc,
          CAST(argument_values [4] AS DECIMAL) AS amount_eth
        FROM
          ethereum_mainnet.logs
          JOIN (
            SELECT
              from_address,
              HASH
            FROM
              ethereum_mainnet.transactions
            WHERE
              data_creation_date >= DATE('2023-09-01')
              AND data_creation_date < DATE('2023-10-01')
          ) AS txs ON ethereum_mainnet.logs.transaction_hash = txs.hash
        WHERE
          data_creation_date >= DATE('2023-09-01')
          AND data_creation_date < DATE('2023-10-01')
          AND contract_address = '0x88e6a0c2ddd26feeb64f039a2c41296fcb3f5640'
          AND CONTAINS(
            topics,
            '0xc42079f94a6350d7e6235f29174924f928cc2ac818eb64fed8004e115fbcca67'
          )
          AND CAST(argument_values [3] AS DECIMAL) > 0
      ) AS usdc_swaps
      LEFT OUTER JOIN (
        SELECT
          DISTINCT scammer_address
        FROM
          forta_data.scammer_addresses
       ) AS scammers ON usdc_swaps.from_address = scammers.scammer_address
  )

As such, it can concluded that in September 2023, at least 0.15% of USDC inflows into the USDC/ETH Uniswap V3 pool were coming from scammers!

Answering Question #2: How many scam and spam contracts are created over time?

For the second question, ChatGPT can conduct the analysis (GPT-4 is recommended here). One still needs to upload the scam and spam token contracts to ZettaBlock, but the following prompt can take care of the rest. 

You are a blockchain analyst that is able to use a database called zettablock that indexes blockchain data. Data is stored in AWS Athena and accessible via SQL. 

Tables are bucketed by chain, so there is ethereum_mainnet, polygon_mainnet, bsc_mainnet. Within each bucket, there are tables as follows:

contract_creations
This table contains all the contract creations performed. It contains the following fields:
transaction_hash,VARCHAR
block_number,BIGINT
block_time, TIMESTAMP
creator_address,VARCHAR
creator_address_tx,VARCHAR
address,VARCHAR
bytecode,VARCHAR
data_creation_date,DATE

This table is indexed by data_creation_date, so querying without any data_creation_date will query the entire table and cost more credits. 

You have uploaded a second table called forta_data.spamtokens_20231020, which contains contracts associated with spam. The address of the contract resides in column called col0. 

You have uploaded a third table called forta_data.scam_contract_addresses_20231025_2, which contains contracts associated with scams. The address of the contract resides in column called address. 

Create a query that shows the counts of total contract per month and total spam tokens per month and total scam contracts per month created.

ChatGPT responds with a working query:

WITH monthly_contracts AS (
  SELECT
    DATE_FORMAT(block_time, '%Y-%m') AS month,
    COUNT(*) AS total_contracts
  FROM
    ethereum_mainnet.contract_creations
  GROUP BY
    DATE_FORMAT(block_time, '%Y-%m')
),
monthly_spam AS (
  SELECT
    DATE_FORMAT(c.block_time, '%Y-%m') AS month,
    COUNT(*) AS spam_contracts
  FROM
    ethereum_mainnet.contract_creations c
  JOIN
    forta_data.spamtokens_20231020 s ON c.address = s.col0
  GROUP BY
    DATE_FORMAT(c.block_time, '%Y-%m')
),
monthly_scam AS (
  SELECT
    DATE_FORMAT(c.block_time, '%Y-%m') AS month,
    COUNT(*) AS scam_contracts
  FROM
    ethereum_mainnet.contract_creations c
  JOIN
    forta_data.scam_contract_addresses_20231025_2 sc ON c.address = sc.address
  GROUP BY
    DATE_FORMAT(c.block_time, '%Y-%m')
)
SELECT
  mc.month,
  mc.total_contracts,
  COALESCE(ms.spam_contracts, 0) AS spam_contracts,
  COALESCE(msc.scam_contracts, 0) AS scam_contracts
FROM
  monthly_contracts mc
LEFT JOIN
  monthly_spam ms ON mc.month = ms.month
LEFT JOIN
  monthly_scam msc ON mc.month = msc.month
ORDER BY
  mc.month;

Along with an excellent explanation:

To create a SQL query that will show the counts of total contract tokens, spam tokens, and scam tokens per month, we need to perform a series of JOINs and GROUP BY operations to aggregate the data by month. We'll assume that the block_time field in the contract_creations table represents the timestamp at which a contract was created, and we'll use that to group by month.

Here's what each part does:

The monthly_contracts common table expression (CTE) selects all contract creations and groups them by month. The monthly_spam CTE joins the contract_creations table with the spamtokens_20231020 table to count the number of spam contracts per month.
The monthly_scam CTE joins the contract_creations table with the scam_contract_addresses_20231025_2 table to count the number of scam contracts per month. The final SELECT statement then brings it all together, joining the three CTEs on the month and selecting the counts, using COALESCE to ensure that months without spam or scam contracts show a zero instead of a null.

Note, if there are slight syntax errors, the ChatGPT module embedded in ZettaBlock can help you quickly debug them. The resulting data yields the following results:

In summary, Forta Threat Intelligence augmented by ZettaBlock allows you to answer interesting questions about what is happening on-chain. One can query ZettaBlock directly or even use a no-code solution with ChatGPT empowering everybody to gain these valuable insights.

What on-chain questions do you wonder about? The answers are at your fingertips.

Share