Risk Tolerance Classifier (XGBoost, v3)

What it does:
Predicts an entrepreneur’s risk tolerance (Low, Medium, or High) from eight numeric features that capture financial, psychological, and behavioral factors.

Why it’s here:
Understanding risk tolerance is crucial for entrepreneurial readiness. This sub-model complements the Skill Level Classifier by quantifying how much uncertainty, failure, and financial variability an individual can handle.

📊 Input Features

All eight input features are numeric. Supply each one on the scale shown in parentheses after its name:

comfort_with_uncertainty (1–10)
- Meaning: How comfortable the individual feels making decisions without knowing the outcome.
- High value: Person is confident in uncertain situations → more risk-tolerant.
savings_to_expense_ratio (0.1–12.0)
- Meaning: Ratio of monthly savings to monthly expenses (financial buffer).
- High value: Stronger financial cushion → easier to tolerate risks.
runway_months (0–60)
- Meaning: How many months the person could cover costs if no new income came in.
- High value: Longer runway → more freedom to take risks.
debt_to_income_ratio (0.0–1.5)
- Meaning: Portion of income already committed to debt.
- High value: Higher debt burden → lower risk tolerance.
comfort_with_failure (1–10)
- Meaning: How resilient the person feels after setbacks or failures.
- High value: Bounces back quickly → higher risk tolerance.
entrepreneurial_experience_level (0–10)
- Meaning: Past experience starting or running ventures/projects.
- High value: More experience → typically higher tolerance.
investment_risk_history (1–10)
- Meaning: Willingness to take risks in past investments/decisions.
- High value: Prior bold decisions → greater tolerance now.
short_term_vs_long_term_preference (1–10)
- Meaning: Whether the person prefers immediate results (low) vs. long-term outcomes (high).
- High value: Long-term focus → can withstand short-term risks for future payoff.
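
The ranges above can be checked before a row is passed to the model. The following is a minimal sketch, not part of the released artifacts: `FEATURE_RANGES` and `validate_row` are hypothetical names, and the bounds are copied from the feature list above.

```python
# Hypothetical helper: bounds are taken from the feature list in this card.
FEATURE_RANGES = {
    "comfort_with_uncertainty": (1, 10),
    "savings_to_expense_ratio": (0.1, 12.0),
    "runway_months": (0, 60),
    "debt_to_income_ratio": (0.0, 1.5),
    "comfort_with_failure": (1, 10),
    "entrepreneurial_experience_level": (0, 10),
    "investment_risk_history": (1, 10),
    "short_term_vs_long_term_preference": (1, 10),
}


def validate_row(row):
    """Return a list of problems; an empty list means the row looks usable."""
    problems = []
    for name, (low, high) in FEATURE_RANGES.items():
        if name not in row:
            problems.append(f"missing feature: {name}")
            continue
        value = row[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} outside [{low}, {high}]")
    # Flag keys the model was not trained on.
    for name in row:
        if name not in FEATURE_RANGES:
            problems.append(f"unexpected feature: {name}")
    return problems


example_row = {
    "comfort_with_uncertainty": 8,
    "savings_to_expense_ratio": 3.2,
    "runway_months": 72,  # deliberately out of range
    "debt_to_income_ratio": 0.35,
    "comfort_with_failure": 7,
    "entrepreneurial_experience_level": 6,
    "investment_risk_history": 7,
    "short_term_vs_long_term_preference": 8,
}
print(validate_row(example_row))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.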

🎯 Target

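risk_tolerance (categorical), ordinal coding:
- Low → 0
- Medium → 1
- High → 2

The label is derived from the individual’s profile and reflects their comfort with uncertainty, failure, and financial tradeoffs.

Note: the shipped label map (label_map_Risk_Tolerance_v3.json) encodes the classes as High → 0, Low → 1, Medium → 2, which differs from the ordinal coding above. Decode model outputs with the label map file, as the usage example does.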

🧠 Training Setup

Algorithm: XGBoost (gradient-boosted decision trees)
Task: Tabular classification (3-class)
Dataset size: 2,000 rows (synthetic, balanced across Low/Medium/High)
Split: 80% train / 20% validation
Early stopping: Enabled

📈 Results (Validation)

Accuracy: 0.9225
Macro F1: 0.9212
Log Loss: 0.1839
Best Trees: 165

Confusion Matrix:

              Pred High   Pred Low   Pred Medium
True High           123          0             7
True Low              0        135             5
True Medium          10          9           111
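
The reported accuracy and macro F1 can be re-derived from the confusion matrix. The short sketch below uses only the counts shown above; the variable names are illustrative.

```python
labels = ["High", "Low", "Medium"]
# Rows are true classes, columns are predicted classes, in the order of `labels`.
cm = [
    [123, 0, 7],
    [0, 135, 5],
    [10, 9, 111],
]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(labels))) / total

f1_scores = {}
for i, label in enumerate(labels):
    true_positives = cm[i][i]
    actual = sum(cm[i])                    # row sum: samples whose true class is `label`
    predicted = sum(row[i] for row in cm)  # column sum: samples predicted as `label`
    precision = true_positives / predicted
    recall = true_positives / actual
    f1_scores[label] = 2 * precision * recall / (precision + recall)

macro_f1 = sum(f1_scores.values()) / len(labels)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f}")
# prints: accuracy=0.9225 macro_f1=0.9212
```

Medium has the lowest per-class F1 of the three. Most errors are confusions between Medium and one of the extremes, and no High sample is predicted as Low or vice versa.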

📂 Artifacts

xgb_model_Risk_Tolerance_v3.json → trained model
feature_order_Risk_Tolerance_v3.json → feature order (list of 8 features)
label_map_Risk_Tolerance_v3.json → mapping ({"High":0,"Low":1,"Medium":2})

🚀 Usage Example (Python)

import json, pandas as pd, numpy as np
from xgboost import XGBClassifier
from huggingface_hub import hf_hub_download

REPO_ID = "mjpsm/Risk-Tolerance-XGB"

# --- Download artifacts from Hugging Face Hub ---
model_file = hf_hub_download(REPO_ID, "xgb_model_Risk_Tolerance_v3.json")
feat_file  = hf_hub_download(REPO_ID, "feature_order_Risk_Tolerance_v3.json")
map_file   = hf_hub_download(REPO_ID, "label_map_Risk_Tolerance_v3.json")

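# --- Load model + metadata ---
clf = XGBClassifier()
clf.load_model(model_file)
with open(feat_file) as f:
    features = json.load(f)   # ordered list of the 8 feature names
with open(map_file) as f:
    label_map = json.load(f)  # label name -> class index
inv_map = {v: k for k, v in label_map.items()}  # class index -> label name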

# --- Example row ---
row = {
    "comfort_with_uncertainty": 8,
    "savings_to_expense_ratio": 3.2,
    "runway_months": 14,
    "debt_to_income_ratio": 0.35,
    "comfort_with_failure": 7,
    "entrepreneurial_experience_level": 6,
    "investment_risk_history": 7,
    "short_term_vs_long_term_preference": 8,
}

# --- Predict ---
X = pd.DataFrame([row])[features].astype("float32").values
proba = clf.predict_proba(X)[0]
pred_idx = int(np.argmax(proba))
print("Prediction:", inv_map[pred_idx], proba)
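
The probability vector printed above is ordered by class index (High, Low, Medium per the label map), which is easy to misread. A small helper can attach names to the probabilities; `named_probabilities` is a hypothetical function, and the numbers below are illustrative values, not real model output.

```python
def named_probabilities(proba, label_map):
    """Map each class name to its probability using the label map's indices."""
    return {name: float(proba[idx]) for name, idx in label_map.items()}


# Illustrative values only (assumption), in class-index order.
label_map = {"High": 0, "Low": 1, "Medium": 2}
proba = [0.71, 0.04, 0.25]

by_name = named_probabilities(proba, label_map)
best = max(by_name, key=by_name.get)
print(best, by_name)
```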

Evaluation results (self-reported) on risk_tolerance_dataset_v1 (synthetic, 2k rows)

accuracy: 0.922
macro_f1: 0.921
log_loss: 0.184
