BlindML
scikit-learn for encrypted data
BlindML trains Naive Bayes, decision tree, and logistic regression models on marginals (aggregate counts derived from encrypted records) rather than on raw data. Your scikit-learn workflow stays intact. No record is ever decrypted during training or inference.
Trained on aggregate counts, not plaintext records
Marginals are cross-tabulated count summaries over encrypted fields. The Blind Insight platform computes them by issuing aggregate queries against ciphertext and returning a count table. The model trains on that table—never on feature vectors derived from decrypted records.
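A marginals table of this kind, and the class-conditional probabilities a model like Naive Bayes derives from it, can be sketched as follows. The exact format the platform returns is an assumption here; only the counts matter:

```python
# Hypothetical marginals: joint counts of (field, value, class) cells,
# as an aggregate query over ciphertext might return them.
marginals = {
    ("fraud_type", "card_fraud", "pos"): 1200,
    ("fraud_type", "card_fraud", "neg"): 300,
    ("fraud_type", "wire_fraud", "pos"): 2001,
    ("fraud_type", "wire_fraud", "neg"): 76102,
}

n_pos, n_neg = 3201, 76402  # class totals from the same aggregate query

# Training needs only these counts: P(value | class) = cell count / class total.
p_card_given_pos = marginals[("fraud_type", "card_fraud", "pos")] / n_pos
p_card_given_neg = marginals[("fraud_type", "card_fraud", "neg")] / n_neg
print(round(p_card_given_pos, 4), round(p_card_given_neg, 4))  # → 0.3749 0.0039
```

No feature vector, and no individual record, appears anywhere in that computation.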
from blind_ml import NaiveBayesModel
model = NaiveBayesModel().fit(marginals, n_pos=3201, n_neg=76402)
pred, risk = model.predict({"fraud_type": "card_fraud"})
# 0.0 F1 delta vs. plaintext baseline, 600K records

Grant the model read-only access to just the fields it trains on:

blind grants create --data '{
  "name": "blindml-training",
  "field_names": {"risk_level": true, "fraud_type": true, "is_fraud": true},
  "can_create_records": false
}'

Supported algorithms
Source of truth: blind-insight/blind-ml on GitHub (open source).
- Naive Bayes: best for multi-class classification over categorical fields, under conditional independence assumptions.
- Decision Trees: interpretable split-based models; useful when explainability or auditable decision paths are required.
- Logistic Regression: linear decision boundaries with calibrated probabilities for binary classification over structured aggregate features.
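For Naive Bayes in particular, a prediction needs nothing beyond aggregate counts. A minimal sketch with illustrative numbers (not the library's internals):

```python
# Illustrative counts for one field value, split by class.
n_pos, n_neg = 3201, 76402       # class totals
card_pos, card_neg = 1200, 300   # records where fraud_type == "card_fraud"

# Bayes' rule with count-based likelihoods and priors:
# P(pos | card_fraud) ∝ P(card_fraud | pos) * P(pos)
prior_pos = n_pos / (n_pos + n_neg)
prior_neg = n_neg / (n_pos + n_neg)
like_pos = card_pos / n_pos
like_neg = card_neg / n_neg

posterior_pos = (like_pos * prior_pos) / (like_pos * prior_pos + like_neg * prior_neg)
print(round(posterior_pos, 2))  # → 0.8
```

Every term above is a ratio of counts, which is why the model can run entirely on marginals.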
The numbers
- 0.0 F1 delta vs. plaintext baseline (600K records)
- HIPAA k=11 suppression built in; F1 holds
- Naive Bayes · Decision Trees · Logistic Regression
- scikit-learn compatible
- No plaintext exposure: training and inference on aggregates only
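The k=11 suppression means any count cell smaller than k is withheld before training. A minimal sketch, assuming simple threshold filtering (cell names are illustrative):

```python
K = 11  # k-anonymity threshold (HIPAA k=11 above)

# Hypothetical marginal counts returned by an aggregate query.
raw_counts = {"card_fraud": 1200, "wire_fraud": 2001, "check_fraud": 7}

# Cells below K are suppressed so no small group is identifiable;
# the model never sees them.
safe_counts = {cell: n for cell, n in raw_counts.items() if n >= K}
print(safe_counts)  # "check_fraud" (7 < 11) is suppressed
```

Because suppressed cells are rare by definition, dropping them leaves the overall count distribution, and hence F1, essentially unchanged.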
Start building on your schema.
Get started at the Build tier, $9/month.