Announcing our comprehensive safety framework for AI agents that handle financial transactions, including spending limits, human oversight, and transaction verification.

As we enable AI agents to participate in financial transactions, ensuring safety isn't optional—it's essential. Today, we're publishing our comprehensive safety framework for autonomous financial agents.
Our safety framework is built on four foundational principles:
- Users must always have the ability to oversee, modify, or revoke agent permissions at any time.
- Agents should operate with the minimum permissions necessary to accomplish their tasks.
- All agent actions must be logged and explainable to users in understandable terms.
- Where possible, actions should be reversible, with clear processes for undoing unintended transactions.
Our platform implements hierarchical spending controls:
| Limit Type | Scope | Default | Configurable |
|---|---|---|---|
| Per-transaction | Single payment | $100 | Yes |
| Daily | 24-hour rolling window | $500 | Yes |
| Weekly | 7-day rolling window | $2,000 | Yes |
| Monthly | 30-day rolling window | $5,000 | Yes |
| Lifetime | Total agent spending | Unlimited | Yes |
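
As a rough illustration of how hierarchical limits like these might be enforced, here is a minimal Python sketch. The `SpendingLimits` class, the `check_transaction` function, and the in-memory `history` list are hypothetical names used for illustration only, not the platform's API. The defaults mirror the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from decimal import Decimal
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SpendingLimits:
    """Defaults copied from the table; None means unlimited."""
    per_transaction: Decimal = Decimal("100")
    daily: Decimal = Decimal("500")
    weekly: Decimal = Decimal("2000")
    monthly: Decimal = Decimal("5000")
    lifetime: Optional[Decimal] = None


def check_transaction(
    amount: Decimal,
    now: datetime,
    history: List[Tuple[datetime, Decimal]],
    limits: SpendingLimits = SpendingLimits(),
) -> Optional[str]:
    """Return the name of the first limit the payment would exceed, or None.

    `history` holds (timestamp, amount) pairs for the agent's past payments.
    """
    if amount > limits.per_transaction:
        return "per_transaction"
    # Rolling windows are checked from narrowest to widest.
    windows = [
        ("daily", timedelta(days=1), limits.daily),
        ("weekly", timedelta(days=7), limits.weekly),
        ("monthly", timedelta(days=30), limits.monthly),
    ]
    for name, window, cap in windows:
        spent = sum((amt for ts, amt in history if now - ts < window), Decimal("0"))
        if spent + amount > cap:
            return name
    if limits.lifetime is not None:
        total = sum((amt for _, amt in history), Decimal("0"))
        if total + amount > limits.lifetime:
            return "lifetime"
    return None


# Example: $450 already spent two hours ago.
now = datetime(2024, 1, 10, 12, 0)
history = [(now - timedelta(hours=2), Decimal("450"))]
print(check_transaction(Decimal("60"), now, history))  # prints: daily
print(check_transaction(Decimal("40"), now, history))  # prints: None
```

Checking the narrowest scope first means the caller learns which specific limit blocked the payment, which helps when explaining a refusal to the user.
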
Limits can be adjusted based on trust signals:
```typescript
interface LimitAdjustment {
  // Factors that can increase limits
  positiveSignals: {
    successfulTransactions: number;
    accountAge: Duration;
    verificationLevel: 'basic' | 'enhanced' | 'enterprise';
  };
  // Factors that can decrease limits
  riskSignals: {
    unusualActivity: boolean;
    failedVerifications: number;
    disputeRate: number;
  };
}
```
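
The post does not say how these signals are combined, so the following Python sketch shows only one plausible approach: a clamped multiplier applied to a base limit. The field names mirror the interface above, but every weight, cap, and the `adjusted_limit` function itself are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from decimal import Decimal


@dataclass
class PositiveSignals:
    successful_transactions: int
    account_age: timedelta
    verification_level: str  # 'basic' | 'enhanced' | 'enterprise'


@dataclass
class RiskSignals:
    unusual_activity: bool
    failed_verifications: int
    dispute_rate: float  # assumed to be a fraction between 0.0 and 1.0


# Hypothetical weights, chosen only to make the example concrete.
VERIFICATION_BONUS = {"basic": 0.0, "enhanced": 0.5, "enterprise": 1.0}


def adjusted_limit(base: Decimal, pos: PositiveSignals, risk: RiskSignals) -> Decimal:
    """Scale a base limit up for trust signals and down for risk signals."""
    if risk.unusual_activity:
        # An active anomaly flag overrides all positive signals.
        return base / 2
    multiplier = 1.0
    multiplier += min(pos.successful_transactions, 100) / 200  # at most +0.5
    multiplier += min(pos.account_age.days, 365) / 730         # at most +0.5
    multiplier += VERIFICATION_BONUS[pos.verification_level]
    multiplier -= 0.1 * risk.failed_verifications
    multiplier -= 5 * risk.dispute_rate
    # Clamp so the limit never drops below 25% or rises above 300% of base.
    multiplier = max(0.25, min(multiplier, 3.0))
    return (base * Decimal(str(multiplier))).quantize(Decimal("0.01"))
```

Clamping the multiplier keeps a single extreme signal from either zeroing out a limit or inflating it without bound.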
Not all transactions should be fully autonomous. Our framework defines when human approval is required:
The platform supports multiple approval patterns:
Our ML-based fraud detection system monitors for:
- Unusual transaction frequency or amounts
- Known fraud patterns and attack vectors
- Deviations from established agent behavior
```python
class FraudDetector:
    def analyze_transaction(self, txn: Transaction) -> RiskScore:
        # Run each independent risk check against the transaction
        scores = [
            self.velocity_check(txn),
            self.pattern_match(txn),
            self.behavioral_analysis(txn),
            self.merchant_risk(txn),
            self.geographic_risk(txn),
        ]

        # Weighted ensemble of risk signals
        final_score = self.ensemble_score(scores)

        # BLOCK_THRESHOLD and REVIEW_THRESHOLD are defined elsewhere;
        # BLOCK_THRESHOLD is assumed to be the higher of the two.
        if final_score > BLOCK_THRESHOLD:
            return RiskScore.BLOCK
        elif final_score > REVIEW_THRESHOLD:
            return RiskScore.REQUIRE_REVIEW
        else:
            return RiskScore.APPROVE
```
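
The snippet above leaves `ensemble_score`, `RiskScore`, and the two thresholds undefined. One simple way to fill them in is a weighted average, sketched below in Python. The threshold values, the weights in the example, and the standalone `classify` helper are hypothetical, and the real system may combine signals quite differently.

```python
from enum import Enum
from typing import Sequence

# Hypothetical values for illustration; not the platform's real thresholds.
BLOCK_THRESHOLD = 0.8
REVIEW_THRESHOLD = 0.5


class RiskScore(Enum):
    APPROVE = "approve"
    REQUIRE_REVIEW = "require_review"
    BLOCK = "block"


def ensemble_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-check risk scores, each assumed to lie in [0, 1]."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


def classify(final_score: float) -> RiskScore:
    """Map a combined score to a decision, checking the stricter threshold first."""
    if final_score > BLOCK_THRESHOLD:
        return RiskScore.BLOCK
    if final_score > REVIEW_THRESHOLD:
        return RiskScore.REQUIRE_REVIEW
    return RiskScore.APPROVE


# Order: velocity, pattern, behavioral, merchant, geographic (weights are made up).
weights = [2, 3, 2, 1, 1]
decision = classify(ensemble_score([0.5, 1.0, 0.5, 0.0, 0.0], weights))
print(decision.name)  # prints: REQUIRE_REVIEW
```

Dividing by the total weight keeps the combined score on the same 0 to 1 scale as the inputs, so the thresholds stay meaningful if weights are retuned.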
Every agent action is logged. Retention periods and access vary by data type:
| Data Point | Retention | Access |
|---|---|---|
| Transaction logs | 7 years | User, Admin, Compliance |
| Authorization events | 7 years | User, Admin |
| Agent decisions | 2 years | User, Admin, Research |
| System events | 90 days | Admin, Security |
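
To make the retention table concrete, here is a small Python sketch that encodes it as configuration and answers two questions: may a given role read a data type, and has a record outlived its retention period? The `RetentionPolicy` class, the key names, and both helper functions are hypothetical. Treating a year as 365 days is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet


@dataclass(frozen=True)
class RetentionPolicy:
    retention: timedelta
    roles: FrozenSet[str]


# Values copied from the table; a year is approximated as 365 days (assumption).
YEAR = timedelta(days=365)
POLICIES = {
    "transaction_logs": RetentionPolicy(7 * YEAR, frozenset({"user", "admin", "compliance"})),
    "authorization_events": RetentionPolicy(7 * YEAR, frozenset({"user", "admin"})),
    "agent_decisions": RetentionPolicy(2 * YEAR, frozenset({"user", "admin", "research"})),
    "system_events": RetentionPolicy(timedelta(days=90), frozenset({"admin", "security"})),
}


def can_read(role: str, data_point: str) -> bool:
    """True if the role appears in the Access column for this data type."""
    return role in POLICIES[data_point].roles


def is_expired(data_point: str, created_at: datetime, now: datetime) -> bool:
    """True once a record is older than its retention period."""
    return now - created_at > POLICIES[data_point].retention
```
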
When issues are detected, our response protocol activates:
Safety is not a destination—it's an ongoing process. We commit to:
Read the full safety documentation or contact our safety team to discuss specific requirements for your use case.
Safety Team
Hyperfold Safety
Ivo Kolev, Luis Povoa, and Ali Youssef