BlindML
scikit-learn for encrypted data
BlindML trains Naive Bayes, decision tree, and logistic regression models on marginals (aggregate counts derived from encrypted records) rather than on raw data. Your scikit-learn workflow stays intact. No record is ever decrypted during training or inference.
Trained on aggregate counts, not plaintext records
Marginals are cross-tabulated count summaries over encrypted fields. The Blind Insight platform computes them by issuing aggregate queries against ciphertext and returning a count table. The model trains on that table—never on feature vectors derived from decrypted records.
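A marginals table of this kind, and the class-conditional probabilities a model like Naive Bayes derives from it, can be sketched as follows. The exact format the platform returns is an assumption here; only the counts matter:

```python
# Hypothetical marginals: joint counts of (field, value, class) cells,
# as an aggregate query over ciphertext might return them.
marginals = {
    ("fraud_type", "card_fraud", "pos"): 1200,
    ("fraud_type", "card_fraud", "neg"): 300,
    ("fraud_type", "wire_fraud", "pos"): 2001,
    ("fraud_type", "wire_fraud", "neg"): 76102,
}

n_pos, n_neg = 3201, 76402  # class totals from the same aggregate query

# Training needs only these counts: P(value | class) = cell count / class total.
p_card_given_pos = marginals[("fraud_type", "card_fraud", "pos")] / n_pos
p_card_given_neg = marginals[("fraud_type", "card_fraud", "neg")] / n_neg
print(round(p_card_given_pos, 4), round(p_card_given_neg, 4))  # → 0.3749 0.0039
```

No feature vector, and no individual record, appears anywhere in that computation.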
from blind_ml import NaiveBayesModel
model = NaiveBayesModel().fit(marginals, n_pos=3201, n_neg=76402)
pred, risk = model.predict({"fraud_type": "card_fraud"})
# 0.0 F1 delta vs. plaintext baseline, 600K records

Grant the model read-only access to just the fields it trains on:

blind grants create --data '{
  "name": "blindml-training",
  "field_names": {"risk_level": true, "fraud_type": true, "is_fraud": true},
  "can_create_records": false
}'

Supported algorithms
Source of truth: blind-insight/blind-ml on GitHub (open source).
- Naive Bayes: best for multi-class classification over categorical fields, under conditional independence assumptions.
- Decision Trees: interpretable split-based models; useful when explainability or auditable decision paths are required.
- Logistic Regression: linear decision boundaries with calibrated probabilities for binary classification over structured aggregate features.
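For Naive Bayes in particular, a prediction needs nothing beyond aggregate counts. A minimal sketch with illustrative numbers (not the library's internals):

```python
# Illustrative counts for one field value, split by class.
n_pos, n_neg = 3201, 76402       # class totals
card_pos, card_neg = 1200, 300   # records where fraud_type == "card_fraud"

# Bayes' rule with count-based likelihoods and priors:
# P(pos | card_fraud) ∝ P(card_fraud | pos) * P(pos)
prior_pos = n_pos / (n_pos + n_neg)
prior_neg = n_neg / (n_pos + n_neg)
like_pos = card_pos / n_pos
like_neg = card_neg / n_neg

posterior_pos = (like_pos * prior_pos) / (like_pos * prior_pos + like_neg * prior_neg)
print(round(posterior_pos, 2))  # → 0.8
```

Every term above is a ratio of counts, which is why the model can run entirely on marginals.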
The numbers
- 0.0 F1 delta vs. plaintext baseline (600K records)
- HIPAA k=11 suppression built in; F1 holds
- Naive Bayes · Decision Trees · Logistic Regression
- scikit-learn compatible
- No plaintext exposure: training and inference on aggregates only
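The k=11 suppression means any count cell smaller than k is withheld before training. A minimal sketch, assuming simple threshold filtering (cell names are illustrative):

```python
K = 11  # k-anonymity threshold (HIPAA k=11 above)

# Hypothetical marginal counts returned by an aggregate query.
raw_counts = {"card_fraud": 1200, "wire_fraud": 2001, "check_fraud": 7}

# Cells below K are suppressed so no small group is identifiable;
# the model never sees them.
safe_counts = {cell: n for cell, n in raw_counts.items() if n >= K}
print(safe_counts)  # "check_fraud" (7 < 11) is suppressed
```

Because suppressed cells are rare by definition, dropping them leaves the overall count distribution, and hence F1, essentially unchanged.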
Start building on your schema.
Get started at the Build tier, $9/month.