BlindML

scikit-learn for encrypted data

BlindML trains Naive Bayes, decision trees, and logistic regression models on marginals—aggregate counts derived from encrypted records—rather than raw data. Your scikit-learn workflow stays intact. No record is ever decrypted during training or inference.

Trained on aggregate counts, not plaintext records

Marginals are cross-tabulated count summaries over encrypted fields. The Blind Insight platform computes them by issuing aggregate queries against ciphertext and returning a count table. The model trains on that table—never on feature vectors derived from decrypted records.
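As an illustration of what a marginal is (the records and field values here are hypothetical; in BlindML the records stay encrypted and only the aggregate counts are ever returned), a marginal over two categorical fields is just a cross-tabulated count table:

```python
from collections import Counter

# Hypothetical plaintext records, shown only to illustrate the shape of
# a marginal. BlindML never sees these; it sees only the counts below.
records = [
    {"fraud_type": "card_fraud", "is_fraud": True},
    {"fraud_type": "card_fraud", "is_fraud": False},
    {"fraud_type": "wire_fraud", "is_fraud": True},
    {"fraud_type": "card_fraud", "is_fraud": True},
]

# The marginal over (fraud_type, is_fraud): one count per value combination.
marginal = Counter((r["fraud_type"], r["is_fraud"]) for r in records)

print(marginal[("card_fraud", True)])   # 2
print(marginal[("wire_fraud", False)])  # 0
```

A model trained on this table learns value-combination frequencies, never individual feature vectors.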

blind_ml — training
from blind_ml import NaiveBayesModel

# `marginals` is the count table returned by the Blind Insight platform;
# n_pos / n_neg are the per-class record totals.
model = NaiveBayesModel().fit(marginals, n_pos=3201, n_neg=76402)
pred, risk = model.predict({"fraud_type": "card_fraud"})
# 0.0 F1 delta vs. plaintext — 600K records
blind grants create — scope training access
blind grants create --data '{
  "name": "blindml-training",
  "field_names": {"risk_level": true, "fraud_type": true, "is_fraud": true},
  "can_create_records": false
}'

Supported algorithms

Source of truth: blind-insight/blind-ml on GitHub — open source.

  • Naive Bayes: best for multi-class classification over categorical fields with conditional independence assumptions.
  • Decision Trees: interpretable split-based models; useful when explainability or auditable decision paths are required.
  • Logistic Regression: linear decision boundaries and calibrated probabilities for binary classification over structured aggregate features.
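To make the count-only training concrete, here is a minimal sketch of Naive Bayes scoring driven entirely by a marginal count table. The field names, counts, and helper function are illustrative assumptions, not the `blind_ml` internals; the point is that class priors and class-conditional likelihoods both come from aggregates alone:

```python
# Hypothetical (fraud_type, is_fraud) -> count marginal. Illustrative only.
counts = {
    ("card_fraud", True): 300, ("card_fraud", False): 1200,
    ("wire_fraud", True): 100, ("wire_fraud", False): 5400,
}
n_pos = sum(c for (_, y), c in counts.items() if y)       # positive-class total
n_neg = sum(c for (_, y), c in counts.items() if not y)   # negative-class total

def posterior_fraud(fraud_type):
    """P(is_fraud | fraud_type), estimated from counts only (Bayes' rule)."""
    n = n_pos + n_neg
    # P(x | y) * P(y) for each class, both read off the count table.
    joint_pos = counts[(fraud_type, True)] / n_pos * (n_pos / n)
    joint_neg = counts[(fraud_type, False)] / n_neg * (n_neg / n)
    return joint_pos / (joint_pos + joint_neg)

print(round(posterior_fraud("card_fraud"), 2))  # 0.2
```

Decision trees and logistic regression can likewise be fit from such tables, since split statistics and sufficient gradients over categorical fields reduce to the same cross-tabulated counts.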

The numbers.

  • 0.0 F1 delta vs. plaintext baseline (600K records)
  • HIPAA k=11 suppression built in; F1 holds
  • Naive Bayes · Decision Trees · Logistic Regression, scikit-learn compatible
  • No plaintext exposure: training and inference on aggregates only

Start building on your schema.

Get started at the Build tier, $9/month.