Problem Statement
The Unified Payments Interface (UPI) has revolutionized digital payments in India, processing over 131 billion transactions worth ₹200 lakh crore in FY2024. However, this growth has been accompanied by a surge in fraudulent activities. According to the National Payments Corporation of India (NPCI), ₹1,087 crore was lost to UPI fraud in 2024, affecting approximately 1.34 million users.
Fraudsters employ sophisticated tactics such as:
- SIM Swap Scams: Attackers port a victim’s mobile number to a new SIM card, so that OTPs meant for the victim are delivered to the attacker.
- Device Cloning: Malware extracts device fingerprints (IMEI, MAC address) to mimic legitimate users.
- QR Code Phishing: Fake UPI IDs embedded in fraudulent QR codes and messages trick users into authorizing payments.
For financial institutions, the consequences are twofold: direct monetary losses and eroded customer trust. A major private bank reported ₹4.2 crore in monthly losses and a 14% decline in UPI adoption due to security concerns.
Data Collection
To combat these threats, we designed a data collection framework capturing 85 parameters across four categories:
Device Attributes
- Hardware Signatures: IMEI, MAC address, battery health, and processor type.
- Software Configuration: OS version, installed apps (hashed via SHA-256), and system fonts.
- Behavioral Patterns: Typing speed, screen tap intervals, and session duration.
Geolocation Data
- GPS Coordinates: Compared against historical patterns.
- Location Velocity: Calculated from the great-circle distance between transaction locations, using the Haversine formula:
from math import radians, sin, cos, sqrt, atan2

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    R = 6371  # Mean Earth radius in km
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat/2)**2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon/2)**2
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    return R * c
Transactions whose implied travel speed exceeds 150 km/h are flagged.
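The velocity check can be sketched as follows. This is a minimal illustration, assuming the speed is measured between a user's two most recent transactions; the `(lat, lon, timestamp)` record layout and the function names `implied_speed_kmh` and `is_velocity_anomaly` are hypothetical and not taken from the production system.

```python
from datetime import datetime, timedelta
from math import radians, sin, cos, sqrt, atan2

SPEED_LIMIT_KMH = 150  # threshold stated in the text


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    R = 6371  # mean Earth radius in km
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return R * 2 * atan2(sqrt(a), sqrt(1 - a))


def implied_speed_kmh(prev, curr):
    """Speed needed to travel between two (lat, lon, timestamp) records."""
    distance_km = haversine(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]).total_seconds() / 3600
    if hours <= 0:
        # Same (or out-of-order) timestamp: any movement at all is impossible.
        return float("inf") if distance_km > 0 else 0.0
    return distance_km / hours


def is_velocity_anomaly(prev, curr):
    """True when the implied speed exceeds the 150 km/h threshold."""
    return implied_speed_kmh(prev, curr) > SPEED_LIMIT_KMH


# Usage: Mumbai to Delhi (over 1,100 km apart) within one hour is flagged.
t0 = datetime(2024, 1, 1, 10, 0)
mumbai = (19.0760, 72.8777, t0)
delhi = (28.6139, 77.2090, t0 + timedelta(hours=1))
print(is_velocity_anomaly(mumbai, delhi))
```

Handling the zero-elapsed-time case explicitly avoids a division by zero when two transactions carry the same timestamp.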
Transaction Context
- Beneficiary History: New payees are risk-scored against known mule accounts.
- Time-Based Features: Hour of day and transaction frequency (e.g., ₹50k+ transfers at 2 AM).
Behavioral Biometrics
- Keystroke Dynamics: Measured via Android’s MotionEvent API.
  - Legitimate users exhibit consistent typing speeds (150–200 ms/keystroke).
  - Bots often have sub-100 ms intervals.
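The interval heuristic above can be sketched in a few lines. This is only an illustration, assuming key-press timestamps are already available in milliseconds; the helper names and the use of a simple mean are assumptions, and a production system would likely use richer statistics.

```python
from statistics import mean

BOT_INTERVAL_MS = 100  # the text notes that bots often show sub-100 ms intervals


def inter_key_intervals(timestamps_ms):
    """Gaps between consecutive key presses, in milliseconds."""
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def looks_automated(timestamps_ms):
    """Flag a session whose mean inter-key interval is below the bot threshold."""
    intervals = inter_key_intervals(timestamps_ms)
    if not intervals:
        return False  # fewer than two key presses: not enough evidence
    return mean(intervals) < BOT_INTERVAL_MS


# Usage: a human-like session (about 175 ms per key) versus a scripted one (40 ms).
print(looks_automated([0, 170, 350, 520, 700]))
print(looks_automated([0, 40, 80, 120]))
```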
Model Development
Dataset Construction
We analyzed 10 million anonymized UPI transactions (January 2023 to December 2024), split as follows:
- Training Set: 8 million samples (80%)
- Validation Set: 1 million (10%)
- Test Set: 1 million (10%)
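With a fraud rate this low, a purely random split can leave the validation or test set with an unrepresentative number of fraud cases. The sketch below shows one way to produce an 80/10/10 split that preserves the class ratio in every subset. Stratification is an assumption here, since the article does not say how the split was drawn, and `stratified_split` is a hypothetical helper.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=42):
    """Return (train, val, test) index lists that preserve each class's share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(len(indices) * fractions[0])
        n_val = int(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Usage on a small toy dataset with the same 0.3% fraud rate.
labels = [1] * 30 + [0] * 9_970
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```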
The class distribution was heavily imbalanced, with only 0.3% fraudulent transactions. To address this, we applied the Synthetic Minority Oversampling Technique (SMOTE) to the training set:
from imblearn.over_sampling import SMOTE
smote = SMOTE(sampling_strategy=0.12, random_state=42)
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)
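To make the effect of `sampling_strategy=0.12` concrete, the arithmetic below estimates how many synthetic fraud samples SMOTE would generate. In imbalanced-learn, a float `sampling_strategy` is the desired minority-to-majority ratio after resampling. The calculation assumes the 0.3% fraud rate also holds within the 8 million training samples; `smote_counts` is a hypothetical helper, and the counts are approximate.

```python
TRAIN_SIZE = 8_000_000      # training set size from the article
FRAUD_RATE = 0.003          # 0.3% fraud, assumed to hold in the training split
SAMPLING_STRATEGY = 0.12    # minority / majority ratio requested from SMOTE


def smote_counts(n_total, fraud_rate, ratio):
    """Return (fraud, legit, target_fraud, synthetic) sample counts."""
    n_fraud = round(n_total * fraud_rate)
    n_legit = n_total - n_fraud
    target_fraud = round(ratio * n_legit)          # minority count after SMOTE
    synthetic = max(target_fraud - n_fraud, 0)     # samples SMOTE must create
    return n_fraud, n_legit, target_fraud, synthetic


n_fraud, n_legit, target_fraud, synthetic = smote_counts(
    TRAIN_SIZE, FRAUD_RATE, SAMPLING_STRATEGY
)
fraud_share = target_fraud / (target_fraud + n_legit)
print(f"original fraud samples : {n_fraud:,}")
print(f"synthetic fraud samples: {synthetic:,}")
print(f"fraud share after SMOTE: {fraud_share:.1%}")
```

Under these assumptions, roughly 24,000 real fraud samples grow to about 957,000, which raises the fraud share of the training set from 0.3% to about 10.7%.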
Algorithm Selection
After testing logistic regression, random forests, and neural networks, XGBoost emerged as the optimal choice due to:
- Handling of Imbalanced Data: Custom loss weighting (fraud class weighted 142x).
- Explainability: Feature importance scores aligned with domain expertise.
- GPU Acceleration: Training on AWS EC2 p3.8xlarge reduced runtime from hours to minutes.
Hyperparameter Optimization
Using Optuna, we executed 500 trials to maximize AUC-ROC:
import optuna
import xgboost as xgb

# dtrain: an xgb.DMatrix built from the training data (constructed elsewhere)
def objective(trial):
    params = {
        'objective': 'binary:logistic',
        'max_depth': trial.suggest_int('max_depth', 3, 7),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3),
        'subsample': trial.suggest_float('subsample', 0.6, 1.0),
        'scale_pos_weight': trial.suggest_int('scale_pos_weight', 50, 200)
    }
    # 5-fold cross-validation; score the trial by the mean test AUC of the last round
    scores = xgb.cv(params, dtrain, nfold=5, metrics='auc')
    return scores['test-auc-mean'].iloc[-1]

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=500)
Final Parameters:
best_params = {
    'max_depth': 5,
    'learning_rate': 0.15,
    'subsample': 0.8,
    'scale_pos_weight': 142,
    # XGBoost 2.0+ deprecates 'gpu_hist'; use tree_method='hist' with device='cuda' there
    'tree_method': 'gpu_hist',
    'objective': 'binary:logistic'
}
Implementation
Cloud Infrastructure
- Data Lake: Amazon S3 stored raw transactions in Parquet format.
- Stream Processing: Apache Kafka ingested real-time data at 10,000 transactions/second.
- Model Serving: XGBoost deployed on EC2 G4 instances with NVIDIA T4 GPUs.
API Integration
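A Flask API provided risk scores to UPI apps within 18 ms:
import boto3
import numpy as np
import xgboost as xgb
from flask import Flask, request

app = Flask(__name__)

# Standard XGBoost builds do not read s3:// URIs, so download the model file first
MODEL_PATH = '/tmp/upi-risk-v4.xgb'
boto3.client('s3').download_file('digicraft-models', 'upi-risk-v4.xgb', MODEL_PATH)
model = xgb.Booster()
model.load_model(MODEL_PATH)

@app.route('/assess_risk', methods=['POST'])
def assess_risk():
    data = request.get_json()
    # preprocess() is defined elsewhere: device, location, transaction -> feature vector
    features = preprocess(data)
    dmatrix = xgb.DMatrix(np.asarray([features], dtype=np.float32))
    risk_score = float(model.predict(dmatrix)[0]) * 1000  # Scale probability to 0-1000
    return {'risk_score': int(risk_score)}, 200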
Payment Flow Integration
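As a rough illustration of how a payment flow could consume the 0-1000 score returned by `/assess_risk`, the sketch below maps a score to one of three outcomes. The threshold values (300 and 700), the `Action` names, and the `decide_action` function are hypothetical placeholders chosen for the example; they are not figures from the deployment.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"      # proceed with the payment
    STEP_UP = "step_up"  # ask for additional authentication
    BLOCK = "block"      # decline and route to fraud review


# Hypothetical cut-offs, for illustration only.
STEP_UP_THRESHOLD = 300
BLOCK_THRESHOLD = 700


def decide_action(risk_score: int) -> Action:
    """Map a 0-1000 risk score to a payment-flow action."""
    if not 0 <= risk_score <= 1000:
        raise ValueError(f"risk_score out of range: {risk_score}")
    if risk_score >= BLOCK_THRESHOLD:
        return Action.BLOCK
    if risk_score >= STEP_UP_THRESHOLD:
        return Action.STEP_UP
    return Action.ALLOW


# Usage
for score in (120, 450, 910):
    print(score, decide_action(score).value)
```

A middle "step-up" band is one common way to avoid blocking borderline transactions outright; whether the deployed system uses such a band is not stated in the article.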

Performance Metrics
Model Accuracy
Metric | Pre-Tuning | Post-Tuning |
---|---|---|
AUC-ROC | 0.91 | 0.96 |
Recall (Fraud) | 76% | 89% |
False Positive Rate | 3.8% | 1.3% |
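Recall and false positive rate do not by themselves show how many alerts are genuine. The sketch below converts the table's figures into expected counts on the 1 million sample test set, assuming it keeps the overall 0.3% fraud rate (the article does not state the test-set fraud rate explicitly). `expected_alerts` is a hypothetical helper.

```python
def expected_alerts(n_total, fraud_rate, recall, fpr):
    """Return (true_positives, false_positives, precision) implied by the rates."""
    n_fraud = round(n_total * fraud_rate)
    n_legit = n_total - n_fraud
    true_pos = round(n_fraud * recall)   # fraud correctly flagged
    false_pos = round(n_legit * fpr)     # legitimate transactions flagged
    precision = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, precision


TEST_SIZE = 1_000_000
FRAUD_RATE = 0.003  # assumed to match the overall dataset

for name, recall, fpr in [("pre-tuning", 0.76, 0.038), ("post-tuning", 0.89, 0.013)]:
    tp, fp, precision = expected_alerts(TEST_SIZE, FRAUD_RATE, recall, fpr)
    print(f"{name}: {tp:,} fraud caught, {fp:,} false alarms, precision {precision:.1%}")
```

Under this assumption, tuning raises precision from roughly 6% to roughly 17%: even at a 1.3% false positive rate, most flagged transactions are legitimate because fraud is so rare.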
System Efficiency
- Latency: 18 ms per inference (50 ms end-to-end).
- Throughput: 2,100 transactions/second on a single EC2 instance.
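The throughput figure translates into a simple capacity estimate. The sketch below assumes the 10,000 transactions/second Kafka ingest rate is the peak load that must be scored in real time; the `headroom` parameter and the `instances_needed` helper are illustrative assumptions.

```python
import math

PEAK_TPS = 10_000          # Kafka ingest rate from the article
PER_INSTANCE_TPS = 2_100   # single-instance throughput from the article


def instances_needed(peak_tps, per_instance_tps, headroom=0.0):
    """Minimum number of instances to serve peak_tps with optional spare capacity."""
    return math.ceil(peak_tps * (1 + headroom) / per_instance_tps)


print(instances_needed(PEAK_TPS, PER_INSTANCE_TPS))                # no headroom
print(instances_needed(PEAK_TPS, PER_INSTANCE_TPS, headroom=0.3))  # 30% headroom
```

With no headroom this gives 5 instances; adding 30% spare capacity for traffic spikes or instance failures gives 7.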
Impact
Deployed at a partner bank processing ₹6,600 crore/month via UPI:
- Fraud Prevention: Blocked ₹12.7 crore/month in losses.
- User Retention: 92% satisfaction rate (vs. 67% pre-deployment).
- Operational Efficiency: 78% reduction in manual fraud reviews.
Challenges & Future Directions
Persistent Gaps
- Explainability: Users demand clarity on blocked transactions.
- Zero-Day Attacks: 11% of novel fraud patterns evade detection.
- Regulatory Compliance: RBI’s evolving digital lending guidelines require agile updates.
Roadmap
- Explainable AI: Integrate SHAP values to visualize risk factors.
- Federated Learning: Collaborate with 5 banks to detect emerging threats.
- On-Device ML: TensorFlow Lite models for low-risk transactions (5ms latency).
Conclusion
This AI-driven risk engine demonstrates how machine learning can secure India’s digital payments ecosystem without compromising speed or user experience. By combining device biometrics, behavioral analytics, and scalable cloud infrastructure, financial institutions can reduce fraud losses by 85% while maintaining <2% false positives.
Reach Out to Us
At DigiCraft Technovision Private Limited, we are passionate about leveraging AI/ML technologies to solve real-world problems in the fintech space and beyond. If you have any questions about this project or want to explore how AI can transform your business operations, feel free to reach out!
Email us at [email protected]
Visit our website at https://digicraft.ai
Let’s collaborate and innovate together!