Introducing a Cutting-Edge ML-Powered Detection Bot for Phishing Scams

Article by Forta Network Jul. 12, 2023

Forta is a real-time detection network for security monitoring of blockchain activity. The decentralized Forta Network scans all transactions and block-by-block state changes, leveraging machine learning to detect threats and anomalies on wallets, DeFi, NFTs, bridges, governance and other Web3 systems. When issues are detected, Web3 infrastructure can respond to prevent attacks via transaction screening and incident response.


The Forta Network is once again pushing the boundaries of blockchain security by unveiling its latest innovation: a machine learning bot designed to identify phishing scammers with high precision. With phishing scams wiping out billions in user funds, the introduction of an advanced machine learning detection model represents a significant step forward in combating these security threats.

As the cryptocurrency landscape continues to grow, it has become evident that phishing scams are an increasingly prevalent and sophisticated threat. The decentralized nature of blockchain technology, while being one of its strengths, also makes it an attractive target for malicious actors. Phishing scammers exploit this by devising schemes to trick individuals into divulging sensitive information or transferring funds to fraudulent addresses. The Forta Network’s new ML-Powered Detection Bot is needed now more than ever for the following reasons:

Rapid Evolution of Scams: Forta Network’s bot adapts to evolving phishing tactics through machine learning, staying ahead of emerging threats.

High Financial Stakes: With substantial financial assets at risk, the bot’s efficient detection system is crucial to minimize losses in Ethereum transactions.

Data-Driven Decision Making: By applying machine learning to large volumes of on-chain data, the bot supports data-driven decisions that distinguish legitimate activity from scams.

Addressing the Imbalance: The bot’s use of EasyEnsembleClassifier effectively handles imbalanced datasets, focusing on the minority class for improved scam detection.

Continuous Monitoring and Real-Time Alerts: Offering 24/7 surveillance, the bot ensures vigilant monitoring and generates immediate alerts for suspicious activities on the blockchain.

Contribution to a Safer Community: By identifying and flagging potential scammers, the bot fosters a safer blockchain environment and encourages responsible conduct within the community.

Another Asset to Transaction Screening: Because Forta’s data is fed into wallet providers’ threat intelligence repositories, this bot can help wallets alert users to phishing scams via active transaction screening.

Delving Into the Technology 

Forta Network’s Ethereum Phishing Scam Detection Model relies on an EasyEnsembleClassifier that uses LightGBM (LGBM) classifiers as its base learners. The model is specifically designed to detect EOA (Externally Owned Account) phishing scammers.

EasyEnsemble is an ensemble of weak learners that employs undersampling to enhance classification on imbalanced datasets, which is ideal for phishing scam detection due to the very low percentage of addresses that are actually scammers. LightGBM, a high-performance, distributed, and efficient gradient boosting framework based on decision tree algorithms, is used for weak learners.
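The undersampling idea can be illustrated with a small, standard-library-only sketch. It is not the bot's implementation: one-feature threshold rules stand in for the LightGBM learners, and the function names and toy data are assumptions made for illustration. Each ensemble member keeps every minority (scammer) sample, draws an equally sized random subset of the majority class, and the members' votes are averaged into a score.

```python
import random
from statistics import mean


def fit_stump(rows, labels):
    """Fit a one-feature threshold rule (a deliberately weak learner)."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            for sign in (1, -1):
                preds = [1 if sign * (r[j] - t) >= 0 else 0 for r in rows]
                acc = mean(int(p == y) for p, y in zip(preds, labels))
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    _, j, t, sign = best
    return lambda r: 1 if sign * (r[j] - t) >= 0 else 0


def fit_undersampled_ensemble(rows, labels, n_members=10, seed=0):
    """Train each member on all minority rows plus a random majority subset."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]  # minority: scammers
    neg = [i for i, y in enumerate(labels) if y == 0]  # majority: everyone else
    members = []
    for _ in range(n_members):
        # Assumes the positive class is the smaller one.
        subset = pos + rng.sample(neg, len(pos))
        members.append(
            fit_stump([rows[i] for i in subset], [labels[i] for i in subset])
        )
    return members


def predict_score(members, row):
    """Fraction of members voting 'scammer', a value between 0 and 1."""
    return mean(m(row) for m in members)


# Toy data (hypothetical): one feature, 20 benign addresses, 2 scammers.
rows = [[float(x)] for x in range(20)] + [[50.0], [60.0]]
labels = [0] * 20 + [1, 1]
members = fit_undersampled_ensemble(rows, labels)
high_score = predict_score(members, [55.0])
low_score = predict_score(members, [5.0])
```

Because every member sees a balanced subset, the rare scammer class is not drowned out by the majority class during training, which is the property the paragraph above describes.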

The model produces a prediction score ranging from 0 to 1, where scores closer to 1 imply a higher likelihood that the address belongs to a phishing scammer. The bot raises an alert when the prediction score exceeds a configurable threshold, MODEL_THRESHOLD.
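The thresholding step amounts to a single comparison. The sketch below is a minimal illustration; the function name is hypothetical, and the default of 0.5 is simply the model_threshold value that appears in the example alert's metadata.

```python
MODEL_THRESHOLD = 0.5  # taken from the example alert metadata; adjust as needed


def is_phishing_scammer(score: float, threshold: float = MODEL_THRESHOLD) -> bool:
    """Return True when the prediction score strictly exceeds the threshold."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"prediction score must be in [0, 1], got {score}")
    return score > threshold


flag_high = is_phishing_scammer(0.93)  # above the threshold
flag_edge = is_phishing_scammer(0.5)   # equal to the threshold, so no alert
```

Raising the threshold trades recall for precision: fewer addresses are flagged, but those that are flagged are more likely to be scammers.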

Key Features

The model uses various features to make predictions, such as:

Addresses’ incoming and outgoing transaction count, block number, and value.
Transaction activity of addresses’ 1-degree neighbors to identify money laundering and mass scamming activities.
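A rough sketch of how such per-address features might be computed from a transaction list follows. The transaction layout (dicts with "from", "to", "value", and "block_number" keys), the function names, and the sample data are assumptions for illustration. The feature names loosely echo those in the alert metadata, but their exact definitions in the model are not given here.

```python
from statistics import pstdev


def address_features(address, txs):
    """Summarize an address's incoming and outgoing transaction activity."""
    incoming = [t for t in txs if t["to"] == address]
    outgoing = [t for t in txs if t["from"] == address]
    senders = {t["from"] for t in incoming}
    in_blocks = [t["block_number"] for t in incoming]
    return {
        "in_count": len(incoming),
        "out_count": len(outgoing),
        "from_address_nunique": len(senders),
        "from_address_count_unique_ratio": (
            len(senders) / len(incoming) if incoming else 0.0
        ),
        "in_block_number_std": pstdev(in_blocks) if in_blocks else 0.0,
        "in_value_sum": sum(t["value"] for t in incoming),
        "out_value_sum": sum(t["value"] for t in outgoing),
    }


def one_degree_neighbors(address, txs):
    """Addresses that transacted directly with the given address."""
    neighbors = set()
    for t in txs:
        if t["to"] == address:
            neighbors.add(t["from"])
        elif t["from"] == address:
            neighbors.add(t["to"])
    neighbors.discard(address)
    return neighbors


# Hypothetical sample: three deposits into 0xS, one withdrawal out of it.
sample_txs = [
    {"from": "0xa", "to": "0xS", "value": 1.0, "block_number": 10},
    {"from": "0xb", "to": "0xS", "value": 2.0, "block_number": 20},
    {"from": "0xa", "to": "0xS", "value": 3.0, "block_number": 30},
    {"from": "0xS", "to": "0xx", "value": 5.0, "block_number": 40},
]
features = address_features("0xS", sample_txs)
neighbors = one_degree_neighbors("0xS", sample_txs)
```

Neighbor-level features could then be obtained by calling address_features on each member of neighbors and aggregating the results, for example with a median or minimum.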

Performance Metrics

The model has been evaluated against a test set from a Kaggle competition, with the following results:

Precision: 0.69
Recall: 0.44
F1-score: 0.54

Furthermore, in a dataset of recent scams from May and June of 2023, it identified 54% of known scams with 88% precision.
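As a quick consistency check, the F1-score is the harmonic mean of precision and recall. The snippet below recomputes it for the Kaggle figures and, treating "identified 54% of known scams" as a recall of 0.54 (an interpretation, not a figure stated above), derives the corresponding value for the May and June 2023 dataset.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


kaggle_f1 = round(f1_score(0.69, 0.44), 2)  # matches the reported 0.54
recent_f1 = round(f1_score(0.88, 0.54), 2)  # derived here, not reported above
```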

Alerts

The model raises an EOA-PHISHING-SCAMMER alert when a phishing scammer is detected. The severity is always set to CRITICAL, and the type is set to SUSPICIOUS. The metadata of the alert includes model features and prediction values along with feature generation and prediction response times.
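The sketch below shows one way such an alert payload could be assembled as a plain dictionary. It does not use the Forta SDK. The helper names and the "prediction_score" metadata key are hypothetical, while the remaining field names and fixed values follow the alert description above.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def build_alert(address, features, score, feature_time, prediction_time, threshold):
    """Assemble an EOA-PHISHING-SCAMMER alert as a plain dict."""
    metadata = {
        "scammer": address,
        "feature_generation_time_sec": feature_time,
        "prediction_time_sec": prediction_time,
        "prediction_score": score,  # hypothetical key name
        "model_threshold": threshold,
    }
    metadata.update(features)  # model features are included alongside timings
    return {
        "name": "Phishing Scammer Detected",
        "description": f"{address} has been identified as a phishing scammer",
        "alertId": "EOA-PHISHING-SCAMMER",
        "severity": "Critical",  # always Critical for this alert
        "type": "Suspicious",
        "metadata": metadata,
    }


total, elapsed = timed(sum, [1, 2, 3])  # timing any callable
alert = build_alert(
    "0xabc", {"feature_2_from_address_nunique": 202}, 0.9, 1.5, 0.1, 0.5
)
```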

Example Alert

$ npm run tx 0x3dd20a427489cec65c333cfdbc9d76c8533b57e9735961bead79a9d6729c3dd1
...
1 findings for transaction 0x3dd20a427489cec65c333cfdbc9d76c8533b57e9735961bead79a9d6729c3dd1 {
  "name": "Phishing Scammer Detected",
  "description": "0xc6f5341d0cfea47660985b1245387ebc0dbb6a12 has been identified as a phishing scammer",
  "alertId": "EOA-PHISHING-SCAMMER",
  "protocol": "ethereum",
  "severity": "Critical",
  "type": "Suspicious",
  "metadata": {
    "scammer": "0xc6f5341d0cfea47660985b1245387ebc0dbb6a12",
    "feature_generation_time_sec": 55.393977834,
    "prediction_time_sec": 3.258650750000001,
    "feature_1_from_address_count_unique_ratio": 0.8977777777777778,
    "feature_2_from_address_nunique": 202,
    "feature_3_in_block_number_std": 103944.45395073255,
    "feature_4_in_ratio": 0.0000027220471162690886,
    "feature_5_ratio_from_address_nunique": 0.6824324324324325,
    "feature_6_total_time": 9495012,
    "feature_7_from_in_min_std": 0,
    "feature_8_from_in_block_timespan_median": 477557,
    "feature_9_from_out_min_std": 0,
    "feature_10_from_out_block_std_median": 166256.50317778645,
    "feature_11_to_in_sum_min": 48516.30387100715,
    "feature_12_to_in_sum_median": 196337.4491312858,
    "feature_13_to_in_sum_median_ratio": 5190.0165806816785,
    "feature_14_to_in_min_min": 1e-18,
    "feature_15_to_in_block_std_median": 236914.9074575176,
    "feature_16_to_out_min_std": 0,
    "anomaly_score": 1,
    "model_version": "1678286940",
    "model_threshold": 0.5
  },
  "addresses": [],
  "labels": [
    {
      "entityType": "Address",
      "entity": "0xc6f5341d0cfea47660985b1245387ebc0dbb6a12",
      "label": "scammer-eoa",
      "confidence": 0.659,
      "remove": false,
      "metadata": {}
    }
  ]
}
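A consumer of these alerts, such as a wallet performing transaction screening, might extract the labeled addresses into a blocklist. The following is a minimal sketch under that assumption: the function names and the min_confidence cutoff are hypothetical, and the alert dict is a trimmed copy of the example above.

```python
def scammer_addresses(alert, min_confidence=0.5):
    """Collect addresses labeled scammer-eoa in an EOA-PHISHING-SCAMMER alert."""
    if alert.get("alertId") != "EOA-PHISHING-SCAMMER":
        return set()
    return {
        label["entity"].lower()
        for label in alert.get("labels", [])
        if label.get("label") == "scammer-eoa"
        and not label.get("remove", False)
        and label.get("confidence", 0) >= min_confidence
    }


def screen_transaction(to_address, blocklist):
    """Return True if the recipient is on the blocklist (case-insensitive)."""
    return to_address.lower() in blocklist


# Trimmed copy of the example alert.
example_alert = {
    "alertId": "EOA-PHISHING-SCAMMER",
    "labels": [
        {
            "entityType": "Address",
            "entity": "0xc6f5341d0cfea47660985b1245387ebc0dbb6a12",
            "label": "scammer-eoa",
            "confidence": 0.659,
            "remove": False,
            "metadata": {},
        }
    ],
}
blocklist = scammer_addresses(example_alert)
blocked = screen_transaction("0xC6F5341D0CFEA47660985B1245387EBC0DBB6A12", blocklist)
```

Filtering on the label's confidence lets a consumer choose how conservative to be; with this example, a cutoff above 0.659 would leave the blocklist empty.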

This model represents a substantial advancement in the detection of the phishing scams that plague Web3. By leveraging machine learning and the power of data, Forta Network is contributing to a more secure and reliable blockchain ecosystem. The Ethereum Phishing Scam Detection ML Model bot can be found deployed on the Forta Network here, and further details on the machine learning model can be found here. This bot is currently integrated into the beta version of Forta’s Scam Detector and will be promoted to the production bot in the near future.