Data collection & classification
Methodology
How BA | Intel collects, validates, classifies, enriches, and retires entity labels. Suitable as an annex to Data License Agreements, as SOC 2 Type II evidence, and for GDPR Article 30 records of processing.
Full PDF
The full methodology (v1.0, 12 sections, ~12 pages) is available as PDF for due-diligence and compliance review.
At a glance
- 7 source categories: regulatory lists, industry blacklists, exchange attribution, token holders, behavioral inference, academic datasets, cluster expansion.
- 3 value tiers: T1 compliance-critical, T2 behavioral/contextual, T3 coverage breadth.
- 16 category types: EXCHANGE, MIXER, SCAM, PHISHING, SANCTIONED, EXPLOIT, BRIDGE, DEFI, NFT, GAMBLING, WALLET_SERVICE, CUSTODIAL, MINING, DAO, PAYMENT, OTHER.
- 5 threat levels: SAFE, LOW, MEDIUM, HIGH, CRITICAL.
- Confidence scoring: 1.0 authoritative → 0.5 cross-chain propagation, with decay rules.
- SANCTIONED reserved: only for addresses listed by name on OFAC / EU / UN / UK HMT / Swiss SECO. Inferred claims become OTHER + tag.
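The taxonomy above can be sketched as a small data model. This is a minimal illustration, not the BA | Intel schema: the `Label` fields, the `make_label` helper, the source identifiers in `SANCTIONS_LISTS`, and the `sanctions-inferred` tag name are all assumptions, and the ordering of threat levels is inferred from the order in which they are listed. Only the category names, the threat level names, and the rule that SANCTIONED is reserved for named listings come from the summary.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Tuple

# The 16 category types and 5 threat levels listed in the summary.
Category = Enum(
    "Category",
    "EXCHANGE MIXER SCAM PHISHING SANCTIONED EXPLOIT BRIDGE DEFI NFT "
    "GAMBLING WALLET_SERVICE CUSTODIAL MINING DAO PAYMENT OTHER",
)
# Assumption: levels are ordered as listed, SAFE lowest and CRITICAL highest.
ThreatLevel = IntEnum("ThreatLevel", "SAFE LOW MEDIUM HIGH CRITICAL")

# Hypothetical source identifiers for the five named sanctions authorities.
SANCTIONS_LISTS = frozenset({"OFAC", "EU", "UN", "UK_HMT", "SWISS_SECO"})
# Hypothetical tag name; the summary only says "OTHER + tag".
INFERRED_TAG = "sanctions-inferred"


@dataclass(frozen=True)
class Label:
    address: str
    category: Category
    threat_level: ThreatLevel
    source: str        # every label carries its source
    confidence: float  # 1.0 authoritative; lower for propagated claims
    tags: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


def make_label(address, category, threat_level, source, confidence, tags=()):
    """Build a label, downgrading SANCTIONED claims that are not named listings."""
    tags = tuple(tags)
    if category is Category.SANCTIONED and source not in SANCTIONS_LISTS:
        category = Category.OTHER
        tags = tags + (INFERRED_TAG,)
    return Label(address, category, threat_level, source, confidence, tags)
```

With this sketch, a claim sourced from `"OFAC"` keeps the SANCTIONED category, while the same claim sourced from, say, cluster expansion is stored as OTHER with the extra tag.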
Key legal defensibility points
- Every label carries source + confidence, so claims are traceable to origin.
- BA | Intel provides evidence for the customer's compliance decision; we do not make the decision.
- False-positive reports are reviewed within 5 business days and logged in an immutable audit table.
- Annual external audit planned Q4 2026 (SOC 2 Type II engagement).
- Full change log and versioning; schema breaking changes get 90-day notice.
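The two time commitments above (5 business days for false-positive review, 90-day notice for breaking schema changes) reduce to simple date arithmetic. The sketch below is illustrative only and rests on several assumptions the summary does not state: business days are Monday to Friday, public holidays are ignored, the day the report arrives is not counted, and the 90-day notice is measured in calendar days. The function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start` (Mon-Fri only)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def review_due(reported_on: date) -> date:
    """Latest review date for a false-positive report (5 business days)."""
    return add_business_days(reported_on, 5)


def earliest_breaking_change(announced_on: date) -> date:
    """Earliest date a breaking schema change may ship (90 calendar days)."""
    return announced_on + timedelta(days=90)
```

Under these assumptions, a report received on Friday 2026-01-09 would be due by Friday 2026-01-16, and a breaking change announced on 2026-01-01 could ship no earlier than 2026-04-01.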