An MCP server for SPSS-compatible datasets with rich statistical analysis, frequencies, crosstabs, reliability analysis, and AI-ready interfaces.
SAF is a privacy-first Model Context Protocol (MCP) server for working with SPSS-compatible datasets (.sav, .zsav, .por) and tabular files (.csv, .tsv, .xlsx). Built on pyreadstat, it enables AI clients and LLMs to inspect metadata, preview rows, profile variables, run frequencies & crosstabs, assess scale reliability, generate SPSS syntax, and perform comprehensive statistical analyses — all while keeping data local and secure.
- ✅ A Python MCP server implementing 24 specialized tools for dataset inspection and analysis
- ✅ Local-first, privacy-preserving assistant for
.sav,.zsav,.por,.csv,.tsv, and.xlsxfiles - ✅ Designed for research, survey, and teaching workflows with sensitive respondent data
- ✅ Sandbox-enforced with filesystem isolation via
SAF_DATA_ROOT
- ❌ Not a replacement for IBM SPSS
- ❌ Does not execute arbitrary SPSS syntax or shell commands
- ❌ Does not upload, transmit, or externally store datasets
| Format | Support |
|---|---|
.sav |
✅ SPSS system files |
.zsav |
✅ Compressed SPSS system files |
.por |
✅ SPSS portable files |
.csv |
✅ Comma-separated values |
.tsv |
✅ Tab-separated values |
.xlsx |
✅ Excel workbooks |
git clone https://github.com/azwafazil/SAF.git
cd SAF
python -m venv .venv
source .venv/bin/activate
pip install -e .export SAF_DATA_ROOT=/absolute/path/to/your/saf-datasaf-mcp # stdio (default)
MCP_TRANSPORT=streamable-http saf-mcp # HTTP/SSESAF exposes 24 tools in four groups:
| Tool | Purpose |
|---|---|
list_data_files |
Enumerate supported files under SAF_DATA_ROOT |
inspect_spss_metadata |
Read column names, labels, value labels, formats, measures |
preview_spss_data |
Return first N rows (1–500) |
profile_spss_data |
Per-column: dtype, missing count, top values, summary stats |
convert_spss_to_csv |
Write .sav / .zsav / .por → CSV |
convert_csv_to_sav |
Write .csv / .tsv / .xlsx → .sav |
generate_basic_spss_syntax |
Generate SPSS syntax (read-only) |
| Tool | Purpose |
|---|---|
cross_tabulation |
Contingency tables with row/column/total percentages |
saf_stat_frequencies |
Frequency tables for categorical/ordinal variables |
saf_stat_reliability |
Cronbach's alpha for scale reliability with item diagnostics |
saf_stat_describe_all |
Describe all numeric columns at once |
| Tool | Computes |
|---|---|
saf_stat_descriptives |
n, mean, SD, median, min/max, IQR, MAD, skew, kurtosis, 95% CI |
saf_stat_assumptions |
Shapiro-Wilk (or D'Agostino for n > 5000) + Levene's test |
saf_stat_compare_groups |
t-test (indep/paired/Welch) + Mann-Whitney + Cohen's d |
saf_stat_anova |
One-way ANOVA / Welch / Kruskal-Wallis + Tukey HSD + η² |
saf_stat_correlate |
Pearson (Fisher-z CI), Spearman, Kendall + bootstrap CIs |
saf_stat_regress |
OLS, logistic, robust (HC3) regression with CIs, AIC/BIC, VIF |
saf_stat_chi_square |
Independence test + Cramér's V + Fisher exact; GOF |
saf_stat_nonparametric |
Wilcoxon, sign, Friedman, Mann-Whitney, Kruskal-Wallis |
saf_stat_effect_size |
Cohen's d, η², Cramér's V, odds ratio, rank-biserial |
saf_stat_power |
Solve for power, sample size, or effect size |
saf_stat_outliers |
IQR (Tukey), z-score, modified z, Mahalanobis |
saf_stat_missing |
Per-column missing counts/%, complete-row ratio, MCAR sketch |
| Tool | Purpose |
|---|---|
list_tools |
List all available SAF MCP tools with descriptions |
| Resource | Purpose |
|---|---|
saf://guide |
Quick orientation to SAF and its capabilities |
saf://repo-ingestion |
Reminder of sandbox rules and data safety |
analyze_dataset_prompt |
Guided analysis workflow: inspect → preview → profile → frequencies → describe → missing |
survey_analysis_prompt |
Survey-specific workflow with optional reliability analysis |
SAF is publicly accessible at:
https://nasf.cloud/mcp
Connect any MCP-compatible agent (opencode, Hermes, Claude Code) via:
{
"mcpServers": {
"saf": {
"type": "remote",
"url": "https://nasf.cloud/mcp"
}
}
}- "List the datasets in my SAF data root."
- "Inspect metadata for
survey.sav." - "Run frequencies on
gender,age_group, andeducationfromsurvey.sav." - "Cross-tabulate
satisfactionbyregionwith column percentages." - "Check Cronbach's alpha for items q1 through q10 in
questionnaire.sav." - "Describe all numeric columns in
data.sav." - "Convert
data.xlsxtodata.sav." - "Run descriptive statistics for age, split by gender, from
survey.sav."
- All file paths resolved under
SAF_DATA_ROOT - Path traversal attempts blocked
- Supported file extensions validated
- Read/write operations confined to the sandbox
- No external uploads: data stays on your machine
- No telemetry: SAF does not phone home
- Local-first: all processing is local
saf_mcp/
├── server.py # MCP server entrypoint & tool registration
├── security.py # Sandbox enforcement & extension validation
├── spss_utils.py # SPSS/tabular read, write, preview, profile, frequencies
├── stats.py # 15 statistical analysis primitives
└── __init__.py # Package metadata
MIT License — see LICENSE for details.
Built by Azwa Fazil · azwafazil21@gmail.com