Status: Not fully validated. Use for research and experimentation. See
theory.pdffor detailed background theory and motivations.
SaxsEst is a high-performance Fortran codebase for estimating Small Angle X-ray Scattering (SAXS) intensity profiles from atomic coordinate files (XYZ). It implements three estimators for the Debye equation:
- debyeEst — exact O(N²) pairwise computation exploiting symmetry
- propoEst — constant-factor approximation of the atomic weight factor, based on the PropEstimator algorithm of Beretta & Tetek (2022)
- stratEst — stratified importance sampling with Hansen–Hurwitz correction, partitioning atoms into heavy/light strata by scattering weight
Both propoEst and stratEst treat I(q) as a purely computational object rather than a physical/geometric one.
- Fortran 90+ compiler (gfortran recommended)
- GNU Make
- OCaml runtime (for the Fortran ↔ CSV bridge in
CsvInterface/) - R with
ggplot2(for the analysis pipeline inAnalysis/) - Standard POSIX utilities (bash, coreutils)
This evaluates all .xyz files in AtomXYZ/data/ and generates a report:
./saxsEst.sh [--debug] [--help]
- CSV files with Q vs I(Q) for each estimator
- Timing statistics
- Analysis logs and plots in
Analysis/
SaxsEst/ CLI and top-level program
Est/ Estimators and sampling code (Est.f90 + inc/*.inc)
FormFact/ Atomic form factors and anomalous scattering data
Freq/ Frequency/CDF construction for stratification
AtomXYZ/ Coordinate types, distance utilities, benchmark .xyz files
CsvInterface/ Fortran ↔ OCaml CSV bridge
Analysis/ R scripts (CsvCombine.R, Plot.R), saved logs and plots
stratEst has two orthogonal toggles, both controlled via commented blocks in Est/inc/stratEst.inc.
By default, stratEst samples atoms once using q(1) and reuses the same drawn set for all q values. To resample (rebuild the CDF and redraw) for each q independently:
- Comment out lines 355–358
- Uncomment lines 360–362, 371, and 372
Seven strategies control how the sample budget is split between the heavy and light strata. Let h, l = heavy/light strata; a ∈ (0,1) = sampling fraction; budget = ⌈a × totalPopulation⌉.
To switch strategies, ensure code in lines 123–237 are fully commented out, then uncomment the block for your chosen strategy:
Allocates proportional to σ × N per stratum, clamped to [lowerBound, upperBound − lowerBound]:
upperBound = ⌈totalPopulation × a⌉
lowerBound = 2
denominator = heavy.stdev × heavy.population + light.stdev × light.population
heavySamples = min(max(⌈heavy.stdev × heavy.population × upperBound / denominator⌉, lowerBound), upperBound − lowerBound)
lightSamples = min(max(⌈light.stdev × light.population × upperBound / denominator⌉, lowerBound), upperBound − lowerBound)
totalSamples = heavySamples + lightSamples
Note: both strata are independently clamped and ceiled, so totalSamples may not equal upperBound.
Each stratum is ceiled independently (no global budget constraint):
heavySamples = ⌈heavy.population × a⌉
lightSamples = ⌈light.population × a⌉
totalSamples = heavySamples + lightSamples
Rounds the heavy stratum; light gets the remainder:
totalSamples = ⌈a × totalPopulation⌉
heavySamples = ⌈a × heavy.population⌉
lightSamples = totalSamples − heavySamples
Rounds the light stratum; heavy gets the remainder:
totalSamples = ⌈a × totalPopulation⌉
lightSamples = ⌈a × light.population⌉
heavySamples = totalSamples − lightSamples
Allocates proportional to stratum mean scattering weight. Both strata are independently ceiled (no strict budget):
upperBound = ⌈a × totalPopulation⌉
sumStratMean = heavy.mean + light.mean
heavySamples = ⌈heavy.mean / sumStratMean × upperBound⌉
lightSamples = ⌈light.mean / sumStratMean × upperBound⌉
totalSamples = heavySamples + lightSamples
Rounds the heavy stratum by mean weight; light gets the remainder:
budget = ⌈a × totalPopulation⌉
sumStratMean = heavy.mean + light.mean
heavySamples = ⌈heavy.mean / sumStratMean × budget⌉
lightSamples = budget − heavySamples
This is the default strategy in the current build.
Rounds the light stratum by mean weight; heavy gets the remainder:
budget = ⌈a × totalPopulation⌉
sumStratMean = heavy.mean + light.mean
lightSamples = ⌈light.mean / sumStratMean × budget⌉
heavySamples = budget − lightSamples
Each strategy can be combined with either sampling mode (toggle A), giving 14 total configurations.