Forta is the largest network of security intel in Web3. The decentralized Forta Network leverages machine learning and a community of security researchers to detect exploits, scams and other threats.
Forta is a powerful Web3 protocol that generates actionable and insightful threat intelligence, but how can you access that power to answer questions about the blockchain landscape? Several feeds exist on the network that produce lists of indicators (EOAs, contracts, URLs, social media account names) of particular attacks (Scam Detector Feed, Spam Detector Feed, Token Sniffer Rug Pull Feed, Attack Detector Feed)
This data in itself is interesting. For instance, Scam Detector data in October 2023 shows that the majority of end user attack indicators reside on Ethereum, Polygon and BSC. Of the different types of end-user attacks, the data shows address poisoning being most prevalent (but this is an artifact of how these attacks operate).
But what if you want to answer some questions that require on-chain data to answer? In those cases, Forta Threat Intelligence really shines when combining it with the cutting-edge data platform ZettaBlock. This allows you to answer questions like:
1. Given the value of all swaps on the Uniswap USDC/ETH pool, what percentage are associated with scams?
2. How many scam and spam contracts are created over time?
Answering Question #1: Given the value of all swaps on the Uniswap USDC/ETH pool, what percentage are associated with scams?
To answer the first question, USDC inflows into the pools need to be analyzed. This is when someone swaps USDC to ETH (which scammers do given USDC have blocklist functionality and the assets could be frozen by Circle, the issuer of USDC).
Swaps on Uniswap pools emit a swap event, which shows sender/recipient, the amounts of the swap as well as the direction of the swap. Unfortunately, the sender/recipient can consist of the Uniswap router for multicall transaction. As such, in order to determine who is swapping, the initiating EOA is utilized.
The percentage is determined by summing all USDC for USDC to ETH swaps initiated by scammers vs summing all USDC for USDC-ETH swaps.
In practice, the following three steps need to be performed:
1. Query Forta Threat Intelligence using the Forta Graph QL API 2. Upload the data to ZettaBlock 3. Build and execute the query to answer the question at hand.
The Scam Detector’s API tutorial does an excellent job of explaining the first question. Once the data has been stored in a CSV, it can be uploaded directly to ZettaBlock using the Data Sources to CSV File upload functionality.
Afterwards, the data will be available to query. This allows you to intersect the Forta Threat Intelligence with historical on-chain activity.
Now, moving on to building the query starting out by summing all USDC to ETH swaps for the Uniswap USDC/ETH pool via the event emitted. This gives you the denominator of $2.49B USDC:
SELECT SUM(amount_usdc) / POW(10, 6) AS amount_usdc_sum FROM ( SELECT transaction_hash, from_address, 'usdc_for_eth' AS TYPE, CAST(argument_values [3] AS DECIMAL) AS amount_usdc, CAST(argument_values [4] AS DECIMAL) AS amount_eth FROM ethereum_mainnet.logs JOIN ( SELECT from_address, HASH FROM ethereum_mainnet.transactions WHERE data_creation_date >= DATE('2023-09-01') AND data_creation_date < DATE('2023-10-01') ) AS txs ON ethereum_mainnet.logs.transaction_hash = txs.hash WHERE data_creation_date >= DATE('2023-09-01') AND data_creation_date < DATE('2023-10-01') AND contract_address = '0x88e6a0c2ddd26feeb64f039a2c41296fcb3f5640' AND CONTAINS( topics, '0xc42079f94a6350d7e6235f29174924f928cc2ac818eb64fed8004e115fbcca67' ) AND CAST(argument_values [3] AS DECIMAL) > 0 ) AS usdc_swaps
Performing the same query, but joining it with scammer addresses identified will yield scammers swapping USDC: This provides the numerator: approximately $3.5M.
SELECT SUM(amount_usdc) / POW(10, 6) AS amount_usdc_sum FROM ( SELECT transaction_hash, from_address, TYPE, amount_usdc, amount_eth, scammer_address FROM ( SELECT transaction_hash, from_address, 'usdc_for_eth' AS TYPE, CAST(argument_values [3] AS DECIMAL) AS amount_usdc, CAST(argument_values [4] AS DECIMAL) AS amount_eth FROM ethereum_mainnet.logs JOIN ( SELECT from_address, HASH FROM ethereum_mainnet.transactions WHERE data_creation_date >= DATE('2023-09-01') AND data_creation_date < DATE('2023-10-01') ) AS txs ON ethereum_mainnet.logs.transaction_hash = txs.hash WHERE data_creation_date >= DATE('2023-09-01') AND data_creation_date < DATE('2023-10-01') AND contract_address = '0x88e6a0c2ddd26feeb64f039a2c41296fcb3f5640' AND CONTAINS( topics, '0xc42079f94a6350d7e6235f29174924f928cc2ac818eb64fed8004e115fbcca67' ) AND CAST(argument_values [3] AS DECIMAL) > 0 ) AS usdc_swaps LEFT OUTER JOIN ( SELECT DISTINCT scammer_address FROM forta_data.scammer_addresses ) AS scammers ON usdc_swaps.from_address = scammers.scammer_address )
As such, it can concluded that in September 2023, at least 0.15% of USDC inflows into the USDC/ETH Uniswap V3 pool were coming from scammers!
Answering Question #2: How many scam and spam contracts are created over time?
For the second question, ChatGPT can conduct the analysis (GPT-4 is recommended here). One still needs to upload the scam and spam token contracts to ZettaBlock, but the following prompt can take care of the rest.
You are a blockchain analyst that is able to use a database called zettablock that indexes blockchain data. Data is stored in AWS Athena and accessible via SQL.
Tables are bucketed by chain, so there is ethereum_mainnet, polygon_mainnet, bsc_mainnet. Within each bucket, there are tables as follows:
contract_creations This table contains all the contract creations performed. It contains the following fields: transaction_hash,VARCHAR block_number,BIGINT block_time, TIMESTAMP creator_address,VARCHAR creator_address_tx,VARCHAR address,VARCHAR bytecode,VARCHAR data_creation_date,DATE
This table is indexed by data_creation_date, so querying without any data_creation_date will query the entire table and cost more credits.
You have uploaded a second table called forta_data.spamtokens_20231020, which contains contracts associated with spam. The address of the contract resides in column called col0.
You have uploaded a third table called forta_data.scam_contract_addresses_20231025_2, which contains contracts associated with scams. The address of the contract resides in column called address.
Create a query that shows the counts of total contract per month and total spam tokens per month and total scam contracts per month created.
ChatGPT responds with a working query:
WITH monthly_contracts AS ( SELECT DATE_FORMAT(block_time, '%Y-%m') AS month, COUNT(*) AS total_contracts FROM ethereum_mainnet.contract_creations GROUP BY DATE_FORMAT(block_time, '%Y-%m') ), monthly_spam AS ( SELECT DATE_FORMAT(c.block_time, '%Y-%m') AS month, COUNT(*) AS spam_contracts FROM ethereum_mainnet.contract_creations c JOIN forta_data.spamtokens_20231020 s ON c.address = s.col0 GROUP BY DATE_FORMAT(c.block_time, '%Y-%m') ), monthly_scam AS ( SELECT DATE_FORMAT(c.block_time, '%Y-%m') AS month, COUNT(*) AS scam_contracts FROM ethereum_mainnet.contract_creations c JOIN forta_data.scam_contract_addresses_20231025_2 sc ON c.address = sc.address GROUP BY DATE_FORMAT(c.block_time, '%Y-%m') ) SELECT mc.month, mc.total_contracts, COALESCE(ms.spam_contracts, 0) AS spam_contracts, COALESCE(msc.scam_contracts, 0) AS scam_contracts FROM monthly_contracts mc LEFT JOIN monthly_spam ms ON mc.month = ms.month LEFT JOIN monthly_scam msc ON mc.month = msc.month ORDER BY mc.month;
Along with an excellent explanation:
To create a SQL query that will show the counts of total contract tokens, spam tokens, and scam tokens per month, we need to perform a series of JOINs and GROUP BY operations to aggregate the data by month. We'll assume that the block_time field in the contract_creations table represents the timestamp at which a contract was created, and we'll use that to group by month.
Here's what each part does:
The monthly_contracts common table expression (CTE) selects all contract creations and groups them by month. The monthly_spam CTE joins the contract_creations table with the spamtokens_20231020 table to count the number of spam contracts per month. The monthly_scam CTE joins the contract_creations table with the scam_contract_addresses_20231025_2 table to count the number of scam contracts per month. The final SELECT statement then brings it all together, joining the three CTEs on the month and selecting the counts, using COALESCE to ensure that months without spam or scam contracts show a zero instead of a null.
Note, if there are slight syntax errors, the ChatGPT module embedded in ZettaBlock can help you quickly debug them. The resulting data yields the following results:
In summary, Forta Threat Intelligence augmented by ZettaBlock allows you to answer interesting questions about what is happening on-chain. One can query ZettaBlock directly or even use a no-code solution with ChatGPT empowering everybody to gain these valuable insights.
What on-chain questions do you wonder about? The answers are at your fingertips.