Entity Database
The BlockchainAnalysis.io Entity Database is one of the largest curated collections of labeled blockchain addresses in the industry, containing 49M+ labeled addresses across 38 supported blockchains. It powers the platform's risk scoring, wallet screening, and compliance reporting capabilities.
The Entity Database is continuously updated through automated pipelines, manual research, and partner data feeds. New labels are ingested daily.
Overview
An entity in the BlockchainAnalysis.io context is a real-world organization, service, or actor associated with one or more blockchain addresses. Examples include cryptocurrency exchanges, DeFi protocols, sanctioned wallets, darknet marketplaces, and scam operations.
Each labeled address in the database includes:
- Entity Name — The name of the organization or actor (e.g., "Binance", "Tornado Cash", "Lazarus Group").
- Category — The type of entity (e.g., exchange, mixer, scam).
- Threat Level — The assessed risk of the entity (none, low, medium, high, critical).
- Chain — The blockchain on which the address operates.
- Data Source(s) — Where the label was obtained.
- First Seen / Last Seen — When the address was first and last observed in on-chain activity.
- Confidence Score — Internal confidence level of the label attribution (used in risk scoring).
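For readers integrating these fields into their own tooling, the record can be pictured as a small typed structure. The sketch below is illustrative only: the class name, field names, and the 0.0 to 1.0 confidence scale are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class ThreatLevel(Enum):
    """The five threat levels listed in this document."""
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass(frozen=True)
class LabeledAddress:
    """Hypothetical record shape; names are illustrative, not the real schema."""
    address: str
    entity_name: str
    category: str
    threat_level: ThreatLevel
    chain: str
    data_sources: tuple[str, ...]
    first_seen: datetime
    last_seen: datetime
    confidence: float  # assumed here to be a fraction between 0.0 and 1.0

    def __post_init__(self) -> None:
        # Basic sanity checks on the assumed value ranges.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        if self.last_seen < self.first_seen:
            raise ValueError("last_seen cannot precede first_seen")


# Placeholder values only; this is not a real address or a real label.
example = LabeledAddress(
    address="0xEXAMPLE",
    entity_name="Example Mixer",
    category="mixer",
    threat_level=ThreatLevel.HIGH,
    chain="ethereum",
    data_sources=("proprietary_research",),
    first_seen=datetime(2021, 1, 1),
    last_seen=datetime(2023, 6, 1),
    confidence=0.97,
)
```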
Entity Categories
Every labeled address is assigned to one of the following categories:
| Category | Description | Examples |
|---|---|---|
| Exchange | Centralized cryptocurrency exchanges (CEX) | Binance, Coinbase, Kraken, OKX |
| DeFi Protocol | Decentralized finance smart contracts and associated addresses | Uniswap, Aave, Compound, MakerDAO |
| Mixer | Cryptocurrency mixing/tumbling services used to obscure transaction trails | Tornado Cash, Wasabi Wallet, ChipMixer |
| Gambling | Online gambling platforms accepting cryptocurrency | Stake, Rollbit, BC.Game |
| Darknet | Darknet marketplace wallets and vendor addresses | Hydra Market, AlphaBay, Silk Road |
| Scam | Addresses associated with known scams, phishing, rug pulls, and fraud | Various phishing campaigns, rug pull deployers |
| Sanctioned | Addresses on government sanctions lists (OFAC SDN, EU, UN) | OFAC-designated wallets, Lazarus Group |
| Bridge | Cross-chain bridge protocol contracts and operator addresses | Wormhole, LayerZero, Multichain |
| NFT Marketplace | NFT trading platforms and associated smart contracts | OpenSea, Blur, Magic Eden |
| DAO | Decentralized autonomous organization treasuries and governance contracts | Uniswap DAO, Aave DAO, MakerDAO |
| Payment Processor | Crypto payment service providers | BitPay, CoinGate, NOWPayments |
| Custodian | Institutional custody providers | Fireblocks, BitGo, Anchorage |
| Mining Pool | Mining pool payout and deposit addresses | F2Pool, Foundry, AntPool |
| ATM Operator | Bitcoin ATM networks and their collection addresses | Bitcoin Depot, CoinFlip |
| Unknown/Other | Addresses with confirmed labels that do not fit neatly into the above categories | Various |
Threat Levels
Each entity (and its associated addresses) is assigned a threat level that reflects the risk of interacting with that entity. Threat levels are used as a key input to the Risk Score calculation.
| Threat Level | Numeric Range | Description |
|---|---|---|
| None | 0 | No identified risk. Mainstream, regulated entities with strong compliance programs (e.g., Coinbase, Kraken). |
| Low | 1–25 | Minimal risk. Established entities with adequate compliance, but may operate in less regulated jurisdictions. |
| Medium | 26–50 | Moderate risk. Entities with limited compliance controls, P2P services, or privacy-focused features. |
| High | 51–75 | Elevated risk. Entities associated with obfuscation, darknet exposure, or repeated involvement in scams. |
| Critical | 76–100 | Maximum risk. Sanctioned entities, confirmed scam operators, terrorist financing, or active exploit wallets. |
Threat levels are assessed based on available data and are updated as new information becomes available. A "none" threat level does not guarantee that an entity is safe — it means no adverse information has been identified at the time of labeling.
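The numeric ranges in the threat level table translate directly into a lookup. The following is a minimal sketch, assuming the numeric value is an integer from 0 to 100; the function name `threat_level_for` is hypothetical and not part of the platform's API.

```python
def threat_level_for(score: int) -> str:
    """Map a 0-100 numeric threat value to the level names in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 0:
        return "none"
    if score <= 25:
        return "low"
    if score <= 50:
        return "medium"
    if score <= 75:
        return "high"
    return "critical"
```

Note that only a value of exactly 0 maps to "none"; any non-zero value falls into "low" or above.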
Threat Level Assignment
Threat levels are determined through a combination of:
- Regulatory status — Is the entity licensed, registered, or sanctioned?
- Historical incidents — Has the entity been involved in hacks, fraud, or compliance failures?
- Operational transparency — Does the entity publish proof-of-reserves, conduct audits, and maintain KYC/AML programs?
- Category risk — Certain categories carry inherently higher baseline risk (e.g., mixers, darknet).
- Counterparty analysis — What entities does this address frequently interact with?
- Sanctions lists — Is the entity or its operators on any government sanctions list?
Data Sources
The 49M+ labels in the Entity Database are sourced from multiple channels to ensure breadth, accuracy, and timeliness:
1. Public Labels
Publicly available address labels from block explorers and community-maintained databases:
- Etherscan / BscScan / PolygonScan verified labels
- Exchange hot/cold wallet disclosures
- Protocol deployer addresses from verified contracts
- ENS (Ethereum Name Service) reverse resolutions
2. Proprietary Research
The BlockchainAnalysis.io research team conducts original investigations:
- On-chain clustering analysis (linking addresses to the same entity based on transaction patterns)
- OSINT (Open Source Intelligence) investigations
- Honeypot and phishing detection systems
- Smart contract source code analysis
- Behavioral pattern matching
3. Partner Feeds
Data shared through partnerships with compliance platforms, law enforcement agencies, and blockchain analytics providers:
- Institutional data-sharing agreements
- Cross-platform entity resolution
- LEA (Law Enforcement Agency) advisories
4. Blockchain Explorers
Automated ingestion of verified labels from major block explorers:
- Contract verification metadata
- Token creator and deployer addresses
- Proxy contract resolution
5. OFAC / Sanctions Lists
Government sanctions lists are monitored and ingested automatically:
- OFAC SDN List (US Treasury) — Updated within hours of OFAC publications
- EU Consolidated Sanctions List
- UN Security Council Sanctions
- UK HM Treasury Sanctions
- Other national sanctions lists
OFAC-sanctioned addresses are flagged with a critical threat level immediately upon ingestion. Screening results clearly indicate whether an address is directly sanctioned or has transacted with a sanctioned entity.
6. Community Reports
Crowd-sourced intelligence from verified community reporters:
- Scam reports submitted through the BlockchainAnalysis.io reporting tool
- Reports from partner platforms and industry working groups
- Validated social media reports (e.g., known rug pull announcements)
All community reports go through a verification process before being added to the database.
7. Regulator Databases (40+ Sources)
Compliance data from major financial regulators worldwide is imported and cross-referenced:
| Region | Regulators |
|---|---|
| Middle East | VARA (Dubai), DFSA (DIFC), ADGM (Abu Dhabi) |
| Asia Pacific | MAS (Singapore), JFSA (Japan), SFC (Hong Kong, 97K+ entities) |
| Europe | FCA (UK, 13.3K warning list entities), BaFin (Germany), FINMA (Switzerland) |
| Americas | SEC (US, 9.6K barred entities), FinCEN (US) |
| Oceania | ASIC (Australia, 7.1K entities) |
| International | DNB (Netherlands, 3.3K), Companies House (UK, daily officer sync) |
8. Corporate Registry Data
The Entity Database integrates corporate ownership and UBO data:
- Companies House — 5.7M UK companies with officers and PSC data
- GLEIF — 3.2M companies with Legal Entity Identifiers
- LEI Relationships — 467.9K corporate ownership links (parent/subsidiary)
- Company Officers — 2.85M director and officer records
- PSC / UBO — 14.3M Persons with Significant Control records
9. OFAC 50% Rule Derived Entities
The platform applies the OFAC 50% Rule: entities that are owned 50% or more, directly or indirectly and in the aggregate, by one or more sanctioned persons are treated as sanctioned, even if not explicitly named on the SDN list. The database includes 5,820 entities derived under this rule from OFAC's SDN Advanced XML data.
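To illustrate how derived entities can follow from ownership data, here is a minimal sketch of the 50% Rule as a fixed-point computation. It assumes ownership is available as a mapping from each entity to its owners' percentage stakes. The function name, data layout, and example names are hypothetical, and this is not a description of the platform's actual derivation pipeline.

```python
def derive_blocked_entities(
    ownership: dict[str, dict[str, float]],
    sdn_listed: set[str],
) -> set[str]:
    """Return entities that are not on the list themselves but are owned
    50% or more, in aggregate, by listed or already-derived parties.

    ownership maps entity -> {owner: percent stake} (hypothetical layout).
    """
    blocked = set(sdn_listed)
    changed = True
    # Repeat until no new entity is added, so that indirect ownership
    # through an already-derived entity is also captured.
    while changed:
        changed = False
        for entity, owners in ownership.items():
            if entity in blocked:
                continue
            blocked_stake = sum(
                pct for owner, pct in owners.items() if owner in blocked
            )
            if blocked_stake >= 50.0:
                blocked.add(entity)
                changed = True
    return blocked - sdn_listed


# Illustrative data only; none of these names refer to real parties.
ownership = {
    "HoldCo": {"PersonA": 30.0, "PersonB": 25.0, "InvestorX": 45.0},
    "OpCo": {"HoldCo": 60.0, "InvestorX": 40.0},
    "OtherCo": {"PersonA": 49.0, "InvestorX": 51.0},
}
derived = derive_blocked_entities(ownership, {"PersonA", "PersonB"})
```

In this example, HoldCo is derived because two listed persons together hold 55%, OpCo is derived because HoldCo holds 60% of it, and OtherCo is not derived because the listed stake is only 49%.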
How Entity Matching Works
When you screen an address using BlockchainAnalysis.io (via the web dashboard, API, or Telegram bot), the platform performs entity matching in the following steps:
Step 1: Direct Match
The submitted address is checked directly against the Entity Database. If an exact match is found, the entity label, category, and threat level are returned immediately.
Step 2: Cluster Expansion
If no direct match is found, the platform checks whether the address belongs to a cluster — a group of addresses controlled by the same entity. Clustering is based on:
- Common input ownership (for UTXO chains) — Addresses that appear as inputs in the same transaction are likely controlled by the same entity.
- Deposit address mapping — Exchange deposit addresses are linked to their parent exchange.
- Smart contract interaction patterns — Addresses that consistently interact with the same set of contracts in identical patterns.
- Funding source analysis — Addresses funded from the same source wallet.
If the address belongs to a known cluster, the entity label of the cluster is applied.
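The common input ownership heuristic is often implemented with a union-find (disjoint set) structure. The sketch below assumes each transaction is reduced to the list of its input addresses. All names are hypothetical, and the other clustering signals described above (deposit mapping, contract interaction patterns, funding source) are not modeled.

```python
from typing import Optional


class UnionFind:
    """Disjoint-set structure over address strings."""

    def __init__(self) -> None:
        self.parent: dict[str, str] = {}

    def find(self, x: str) -> str:
        # Register unseen addresses as their own singleton cluster.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps later lookups fast.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a: str, b: str) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_by_common_inputs(tx_inputs: list[list[str]]) -> UnionFind:
    """Merge all input addresses of each transaction into one cluster."""
    uf = UnionFind()
    for inputs in tx_inputs:
        if not inputs:
            continue
        first = inputs[0]
        uf.find(first)
        for other in inputs[1:]:
            uf.union(first, other)
    return uf


def resolve_label(address: str, uf: UnionFind, labels: dict[str, str]) -> Optional[str]:
    """Direct match first; otherwise inherit a label from the same cluster."""
    if address in labels:
        return labels[address]
    root = uf.find(address)
    for labeled_address, entity in labels.items():
        if uf.find(labeled_address) == root:
            return entity
    return None


# Illustrative data: "a" and "b" co-spend, as do "b" and "c"; "d" spends alone.
uf = cluster_by_common_inputs([["a", "b"], ["b", "c"], ["d"]])
labels = {"a": "ExampleExchange"}
```

Here `resolve_label("c", uf, labels)` returns the label attached to "a", because the two transactions link a, b, and c into one cluster. The heuristic is known to produce wrong merges for collaborative transactions such as CoinJoin, which is one reason cluster assignments are probabilistic.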
Step 3: Counterparty Analysis
Even if the address itself is unlabeled, the platform analyzes its counterparties — the addresses it has sent to or received from. This produces:
- Direct exposure — Percentage of funds sent to or received from labeled entities, broken down by category and threat level.
- Indirect exposure — Second-hop analysis showing counterparties of counterparties.
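As an illustration of direct exposure, the following sketch computes the share of transferred value per counterparty category. It assumes transfers are given as (counterparty, amount) pairs and that labels map a counterparty address to a category. The function name and the "unlabeled" bucket are assumptions made for this example.

```python
from collections import defaultdict


def direct_exposure(
    transfers: list[tuple[str, float]],
    labels: dict[str, str],
) -> dict[str, float]:
    """Return the percentage of total transferred value per category.

    Counterparties without a label are grouped under "unlabeled".
    """
    totals: dict[str, float] = defaultdict(float)
    grand_total = 0.0
    for counterparty, amount in transfers:
        grand_total += amount
        totals[labels.get(counterparty, "unlabeled")] += amount
    if grand_total == 0:
        return {}
    return {
        category: 100.0 * amount / grand_total
        for category, amount in totals.items()
    }


# Illustrative data only.
transfers = [("addr_x", 60.0), ("addr_y", 30.0), ("addr_z", 10.0)]
labels = {"addr_x": "exchange", "addr_y": "mixer"}
exposure = direct_exposure(transfers, labels)
```

With this input, 60% of the value is attributed to the exchange category, 30% to mixer, and 10% to unlabeled counterparties. Indirect exposure could be approximated by repeating the same calculation over each counterparty's own transfers.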
Step 4: Risk Score Computation
The entity match results feed into the Risk Score algorithm, which considers:
- Direct entity label (if found)
- Cluster membership
- Direct and indirect counterparty exposure
- Transaction patterns
- Chain-specific risk factors
Database Statistics
| Metric | Value |
|---|---|
| Total Labeled Addresses | 49M+ |
| Verified On-Chain Addresses | 4.2B+ |
| Unique Entities | 150,000+ |
| Supported Chains | 38 |
| Data Sources | 247 |
| Screening Entities | 7.5M+ |
| Company Records | 8.9M+ |
| Company Officers | 2.85M+ |
| PSC / UBO Records | 14.3M+ |
| LEI Relationships | 467.9K+ |
| Sanctions Addresses Tracked | 25,000+ |
| Categories | 15 |
| Daily Label Updates | ~50,000–100,000 |
| Average Label Confidence | >95% |
Using the Entity Database
Web Dashboard
Navigate to Compliance > Entity Database to search for entities by name, address, or category. You can:
- Search by address to see its entity label and threat level
- Browse entities by category
- Filter by threat level
- Export entity lists as CSV
API
The Entity Database is accessible via the BlockchainAnalysis.io REST API. See the API Reference for endpoints and authentication.
Telegram Bot
Use the /check or /advanced command with @BA_ScreenBot to query the Entity Database in real time. See Telegram Bot Getting Started.
Address Verification (Bloom Filter Pre-Check)
Before performing a full entity lookup, BlockchainAnalysis.io runs every submitted address through an instant address verification layer powered by a bloom filter containing 4.2B+ verified on-chain addresses sourced from Google BigQuery public datasets.
How It Works
- Pre-validation — When an address is submitted for screening, it is first checked against the bloom filter to confirm it has appeared on-chain. This takes less than 1 millisecond.
- Chain coverage — The bloom filter includes addresses from 9 major blockchains: Bitcoin, Ethereum, BNB Smart Chain, Polygon, Arbitrum, Optimism, Base, Avalanche, and Fantom.
- Result interpretation — If the address is found in the bloom filter, it is confirmed as a real on-chain address and proceeds to full entity matching. If not found, the platform flags it as potentially invalid or never-used, which is useful for detecting typos, fabricated addresses, or addresses on unsupported chains.
The bloom filter is a probabilistic data structure with a false positive rate below 0.01%. A positive match means the address has almost certainly been seen on-chain: fewer than 1 in 10,000 addresses that were never loaded will match by chance. A negative match guarantees the address is not among the covered-chain addresses loaded into the filter.
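For readers unfamiliar with the data structure, here is a minimal bloom filter sketch using the standard sizing formulas and double hashing over SHA-256. It illustrates the technique only. The class and its parameters are hypothetical and do not describe the platform's production implementation.

```python
import hashlib
import math


class BloomFilter:
    """Minimal bloom filter: no false negatives, tunable false positive rate."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.size = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
        ))
        self.hash_count = max(1, round(self.size / expected_items * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit values.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        for i in range(self.hash_count):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Placeholder strings stand in for real addresses.
seen = BloomFilter(expected_items=1000, false_positive_rate=0.0001)
for addr in ("addr_one", "addr_two", "addr_three"):
    seen.add(addr)
```

Every added item is always reported as present, while an item that was never added is reported as absent except with the configured false positive probability. Under these formulas a 0.01% target works out to roughly 19 bits per item, so a filter covering 4.2B addresses would need on the order of 10 GB; this is an estimate from the textbook formula, not a statement about the deployed system.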
Benefits
- Instant feedback — Users get immediate confirmation that an address is valid before the full screening pipeline runs.
- Reduced false screenings — Prevents wasted credits on invalid or non-existent addresses.
- Fraud detection signal — An address that does not appear in 4.2B+ verified addresses may indicate a fabricated wallet in counterparty documentation.
Data Accuracy and Limitations
While BlockchainAnalysis.io strives for maximum accuracy, no entity database is 100% complete or error-free. Labels are based on best-available information and are subject to change as new data emerges.
- False positives — An address may be incorrectly labeled. If you believe a label is wrong, contact support@blockchainanalysis.io to request a review.
- Unlabeled addresses — Not every address is labeled. An address without a label is not necessarily safe — it simply has not been identified yet.
- Lag in updates — There may be a delay between a real-world event (e.g., a new scam) and the addition of associated addresses to the database.
- Cluster accuracy — Clustering algorithms are probabilistic. Addresses may be incorrectly grouped or ungrouped.
Next Steps
- Wallet Screening — Learn how to screen addresses using the Entity Database.
- Risk Score — Understand how entity data feeds into the risk scoring algorithm.
- Sanctions Screening — Dedicated sanctions and watchlist screening.
- Supported Blockchains — Full list of chains covered by the Entity Database.