CMA Spectral Unfolder

Multi-species spectral unfolding from scintillator stack detector data using CMA-ES (Covariance Matrix Adaptation Evolution Strategy), with Huber-robust fidelity, Tikhonov / logistic smoothness regularisation, and optional per-detector calibration tuning.

This code reconstructs particle energy spectra from a stack of scintillator detectors by inverting a pre-computed FLUKA Monte Carlo response matrix (RM). The inverse problem is ill-posed: the code solves it by minimising a regularised objective function using the derivative-free CMA-ES optimizer, supporting up to three particle species simultaneously.

Scope

The optimizer is designed for high-energy-density physics (HEDP) diagnostics, specifically scintillator-based calorimeters used in laser–plasma interaction experiments. However, the RM-inversion framework is general and applicable to any detector system for which a Monte Carlo response matrix is available.

Physics Background

The Inverse Problem

A stack of N scintillator slabs records the energy deposited by a broad-spectrum particle beam. The measured signal in detector channel \(d_j\) (\(j = 1 \ldots N_\text{FLUKA}\)) is a linear superposition of mono-energetic responses:

d_j = \int R_j(E)\,\frac{dN}{dE}\,dE \;\approx\; \sum_{i=1}^{N_\text{guess}} R_j(E_i)\,S_i

where \(R_j(E_i)\) is the response matrix element (energy deposited in detector \(j\) by a particle of energy \(E_i\)), and \(S_i = (dN/dE)_i\) is the unknown spectrum evaluated at the \(i\)-th energy bin. In matrix form: \(\mathbf{d} = \mathbf{R}\,\mathbf{s}\).

Why CMA-ES?

The response matrix is typically rank-deficient or severely ill-conditioned (condition numbers of \(10^6\)–\(10^{12}\) are common for thick-stack calorimeters). Classical linear inversions (SVD truncation, Tikhonov) struggle with positivity constraints and multi-species superposition. CMA-ES is a gradient-free evolutionary strategy that handles:

Non-convex, multi-modal landscapes
Box constraints (positivity via log-space parameterisation)
Heterogeneous variable types (spectra + calibration factors)
Arbitrary regularisation terms without differentiability requirements

Multi-Species Unfolding

When the detector is sensitive to multiple particle types (e.g. photons + electrons), the measured signal is:

d_j = \sum_{k=1}^{K} \sum_{i=1}^{N_k} R_j^{(k)}(E_i^{(k)})\,S_i^{(k)}

where \(k\) indexes the species, each with its own RM and energy grid. The optimiser solves for all species simultaneously, which breaks the degeneracy that arises from fitting a single RM to a mixed signal.

Architecture

┌─────────────────────────────────────────────────────────────────┐ │ Pipeline Overview │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ ┌──────────────┐ ┌───────────────┐ ┌───────────────┐ │ │ │ FLUKA RM │ │ Scintillator │ │ Error Files │ │ │ │ *_FLUKA.txt │ │ raw_data.tiff │ │ mean*_VAC.txt │ │ │ │ *_Energy.txt │ └──────┬────────┘ └──────┬────────┘ │ │ │ *_Spectrum.txt│ │ │ │ │ └──────┬───────┘ │ │ │ │ │ │ │ │ │ ▼ ▼ ▼ │ │ ┌──────────────┐ ┌─────────────┐ ┌──────────────┐ │ │ │DataProcessor │ │DataProcessor│ │ErrorAnalysis │ │ │ │ import_RM() │ │ read_image() │ │ get_errors() │ │ │ └──────┬───────┘ └──────┬──────┘ └──────┬───────┘ │ │ │ │ │ │ │ ▼ ▼ │ │ │ ┌──────────────┐ ┌───────────┐ │ │ │ │SpeciesConfig │────▶│MultiSpecies│ │ │ │ │ (per species)│ │Optimizer │ │ │ │ └──────────────┘ │ │ │ │ │ │ CMA-ES │ │ │ │ │ loop │ │ │ │ └─────┬─────┘ │ │ │ │ │ │ │ ▼ │ │ │ ┌──────────────────┐ │ │ │ │OptimizationResult│───────────┘ │ │ │ spectra_linear │ │ │ │ spectra_log │──▶ Plotter │ │ │ facts │ │ │ └──────────────────┘ │ └─────────────────────────────────────────────────────────────────┘

Fig. 1 — Data flow from raw inputs through the optimiser to results and error analysis.

The code is organised into six classes and a set of free functions:

Component	Role
`Config`	Global parameters: paths, energy grids, ROI, calibration, error files
`SpeciesConfig`	Per-species RM + energy grid + smoothing hyperparameters
`MultiSpeciesOptimizer`	CMA-ES driver: builds objective, runs evolution loop
`OptimizationResult`	Structured output container with accessor methods
`DataProcessor`	RM import and experimental image reading
`ErrorAnalysis`	2-D error matrix interpolation and error-bar plotting
`Plotter`	Standard diagnostic plots

Getting Started

Installation

Requirements

Python ≥ 3.9
NumPy, Pandas, Matplotlib, Pillow, SciPy
cma (pycma) — the CMA-ES implementation by N. Hansen

# Install dependencies
pip install numpy pandas matplotlib pillow scipy cma

# Clone and enter the project
git clone https://github.com/your-org/cma-unfolder.git
cd cma-unfolder

Performance Warning

CMA-ES scales as \(\mathcal{O}(n^2)\) per generation with the number of variables \(n\). With N_guess = 25 per species and calibration tuning off, \(n = 25\) (or 30 for two species). This is well within the efficient regime. Do not exceed N_guess ≈ 200 on a workstation without switching to a large-scale CMA variant.

Directory Structure

cma-unfolder/
├── CMA_optimizer.py          # Main script
├── images/
│   └── raw_data.tiff         # Scintillator image
├── RM/
│   └── Response_matrix_double_population/
│       ├── Response_matrix_p/ # Photon RM files
│       │   ├── 0.05_Energy.txt
│       │   ├── 0.05_Spectrum.txt
│       │   ├── 0.05_FLUKA.txt
│       │   ├── 0.1_Energy.txt
│       │   └── ...
│       └── Response_matrix_e/ # Electron RM files
│           └── ...
├── Error/
│   ├── mean1_VAC.txt          # Error level 1 (signal > 10%)
│   ├── mean2_VAC.txt          # Error level 2 (1% < signal < 10%)
│   ├── mean3_VAC.txt          # Error level 3 (0.1% < signal < 1%)
│   └── mean4_VAC.txt          # Error level 4 (signal < 0.1%)
└── README.md

Response Matrix Format

Each mono-energetic FLUKA simulation at energy \(E_k\) (in MeV) produces a triplet of files, named using the energy value as prefix:

File	Contents	Shape
`{E_k}_Energy.txt`	Energy bin edges of the simulated spectrum	\(N_\text{sim} \times 1\)
`{E_k}_Spectrum.txt`	Deposited energy spectrum (reserved, not currently used)	\(N_\text{sim} \times 1\)
`{E_k}_FLUKA.txt`	Integrated detector response per scintillator slab	\(N_\text{FLUKA} \times 1\)

Naming Convention

The file prefix must be a parseable floating-point number (e.g. 0.05, 1.5, 100). The importer sorts by this value and uses it as the RM energy axis.

Configuration — `Config`

All user-tunable parameters live in the Config class as class-level attributes. Modify this class to adapt the code to a new experimental setup. No constructor is needed — all values are set at import time.

Energy Grids

Attribute	Type	Description
`N_guess`	int	Number of spectral bins (CMA variables per species). Default 25.
`E_guess_range`	tuple	\(\log_{10}(E_\text{min}/\text{MeV}),\;\log_{10}(E_\text{max}/\text{MeV})\)
`E_guess`	ndarray	Log-spaced energy centroids, shape `(N_guess,)`
`smooth_factor`	float	Smoothness penalty weight, auto-scaled by mean Δlog₁₀E

Choosing N_guess

The number of spectral bins controls the trade-off between spectral resolution and numerical stability. Too few bins under-resolve spectral features; too many introduce null-space degrees of freedom that the regulariser must suppress. Start with N_guess ≈ N_FLUKA and increase if the residuals show systematic structure.

ROI & Calibration

The ROI array defines pixel rectangles for each scintillator slab in the experimental image. Each row is [y_min, y_max, x_min, x_max]. The factor array contains per-detector calibration multipliers applied to the raw FLUKA output during RM import.

User Input Required

The factor array must be determined experimentally for your specific detector stack. If your RM is already properly calibrated, set all entries to 1.0. Incorrect calibration factors will bias the unfolded spectrum systematically.

Error Analysis Files

Four text files, each containing a vector of length len(E_error), encoding the relative unfolding uncertainty at four signal-to-noise thresholds. These are pre-computed by running Calc_errors.py on the RM.

File	Signal Regime
`mean1_VAC.txt`	Signal / peak > 10%
`mean2_VAC.txt`	1% < signal / peak < 10%
`mean3_VAC.txt`	0.1% < signal / peak < 1%
`mean4_VAC.txt`	signal / peak < 0.1%

Second Species (Electrons)

When n_species = 2, the code loads a second RM from folder_path_e with its own energy grid (E_guess_e, N_guess_e). The electron energy range and smoothing weight are configured independently.

Forward Model

For a single species with spectrum \(\{S_i\}_{i=1}^{N_\text{guess}}\) in log₁₀ space, the predicted detector signal is:

\hat{d}_j = \sum_{i=1}^{N_\text{guess}} 10^{S_i} \cdot R_{ij} \cdot f_i \cdot c_j

where:

\(R_{ij}\) = FLUKA_tot[nearest[i], j] — normalised RM element
\(f_i\) = FLUKA_fact[nearest[i]] — peak normalization factor (restores absolute scale)
\(c_j\) = facts[j] — optional per-detector calibration correction (\(\approx 1\))

For \(K\) species the total signal is the sum: \(\hat{d}_j = \sum_{k=1}^{K} \hat{d}_j^{(k)}\). This is implemented in _calc_FLUKA_species() and is fully vectorised over energy bins.

Loss Function

The total objective minimised by CMA-ES is:

\mathcal{L}(\mathbf{x}) = \mathcal{L}_\text{fid} + \lambda_s\,\mathcal{L}_\text{smooth} + \lambda_c\,\mathcal{L}_\text{calib}

Fidelity Term

Relative squared residuals with Huber robustification:

\mathcal{L}_\text{fid} = \sum_{j=1}^{N_\text{FLUKA}} H_\delta\!\left(\frac{(d_j - \hat{d}_j)^2}{d_j^2}\right)

where \(H_\delta(r) = \frac{1}{2}\min(|r|,\delta)^2 + \delta\,(|r| - \min(|r|,\delta))\) transitions from quadratic to linear at threshold \(\delta\). This suppresses the influence of outlier detector channels (e.g. damaged scintillators or channels with poor RM coverage).

Huber δ

The default huber_delta = 1e-2 means that relative residuals above ~10% are down-weighted linearly rather than quadratically. Increase δ if your data is clean; decrease it if you suspect systematic errors in specific channels.

Smoothness Penalties

Simple (Tikhonov-like)

When adaptive_smoothing = False:

\mathcal{L}_\text{smooth}^{(k)} = \sum_{i=1}^{N-2}\left(S_{i+2}^{(k)} - 2S_{i+1}^{(k)} + S_i^{(k)}\right)^2

This is the squared \(\ell_2\) norm of the second finite difference — a discrete approximation to the curvature penalty \(\int |S''(E)|^2 dE\).

Logistic-Weighted

When adaptive_smoothing = True, the penalty uses first differences with bin-dependent weights:

w(x) = w_\min + (1 - w_\min)\;\sigma(-k(x - x_0))

where \(x = i/(N-1)\) is the normalised bin index and \(\sigma\) is the logistic function. This reduces regularisation strength at the high-energy tail where the RM sensitivity drops and the spectrum is expected to fall sharply, avoiding the artificial flattening that uniform smoothing produces near the cutoff.

Parameter	Attribute	Default	Effect
Onset	`smooth_transition_start`	0.85	Fraction of bins before weight starts dropping
Floor	`smooth_min_weight`	0.01	Minimum weight at the tail (0 = no regularisation)
Steepness	`smooth_steepness`	30	Logistic transition sharpness

Calibration Factor Tuning

When tune_calibration = True, \(N_\text{FLUKA}\) additional variables are appended to the CMA vector, representing per-detector multiplicative corrections \(c_j\). They are initialised near 1.0 and constrained by a smooth quartic barrier:

\mathcal{L}_\text{calib} = \sum_j \left[\text{sp}(1 - c_j)^4 + \text{sp}(c_j - 1)^4\right]

where \(\text{sp}(x) = \frac{1}{k}\log(1 + e^{kx})\) is the softplus function with sharpness \(k = 300\). This allows small deviations from unity while strongly penalising large excursions.

When to Use

Enable calibration tuning only when you have reason to believe the pre-computed factor array has residual errors (e.g. from aging scintillators or position-dependent light collection). The additional degrees of freedom can absorb real spectral features if the penalty weight facts_penalty_weight is too low.

CMA-ES Internals

The optimizer uses pycma with the following default settings:

Option	Value	Rationale
`sigma0`	3.0	Initial step size (log₁₀ scale; covers ~3 decades)
`CMA_stds`	5.0 (spectra), 0.05 (facts)	Heterogeneous scaling for different variable types
`popsize`	max(17, 4+3ln n)	Default CMA heuristic, floor at 17
`maxiter`	100,000	Upper bound; typically converges in 5k–30k
`tolx`	1e-8	Convergence on variable change
`CMA_active`	1	Active CMA (exploits negative covariance information)

The objective function is dispatched once before the loop via _build_objective_fn(), which selects among four variants (base / smooth / calib / smooth+calib) based on the active flags. This eliminates per-evaluation branching overhead.

Optimisation Vector Layout

x = [ S_photon (N_guess_p) | S_electron (N_guess_e) | facts (N_FLUKA) ]
      ←── spectral vars ──────────────────────────→  ←── optional ──→

API Reference

`SpeciesConfig`

Dataclass. Container for one particle species' RM and optimiser settings.

Constructor Parameters

Name	Type	Description
`label`	str	Identifier (e.g. `'photon'`, `'electron'`)
`E`	ndarray (N_data,)	RM energy axis (MeV, ascending)
`E_guess`	ndarray (N_guess,)	Desired bin centroids; snapped to RM axis by `resolve()`
`FLUKA_tot`	ndarray (N_data, N_FLUKA)	Normalised detector RM
`FLUKA_fact`	ndarray (N_data, 1)	Peak normalisation factors
`smooth_factor`	float	Smoothness penalty weight. Default `1.3e-5`
`smooth_transition_start`	float	Logistic onset (0–1). Default `0.85`
`smooth_min_weight`	float	Logistic floor. Default `0.01`
`smooth_steepness`	float	Logistic k. Default `30.0`

Methods

`resolve()`

Must be called once before optimisation. Maps E_guess onto the RM energy axis via nearest-neighbour lookup (injective), sets nearest_indices and N_guess.

`MultiSpeciesOptimizer`

Constructor Parameters

Name	Type	Default	Description
`species_list`	List[SpeciesConfig]	—	1–3 species configs
`Exp_FLUKA`	ndarray	—	Normalised experimental signal
`tune_calibration`	bool	False	Optimise per-detector factors
`smoothing`	bool	True	Enable smoothness regularisation
`adaptive_smoothing`	bool	False	Logistic (True) vs simple (False)
`lower_bound`	float	-20	CMA lower bound (log₁₀)
`upper_bound`	float	20	CMA upper bound (log₁₀)
`facts_bounds`	tuple	(-0.2, 0.2)	Calibration factor range around 1
`huber_delta`	float	1e-2	Huber transition threshold
`facts_penalty_weight`	float	1e-2	Weight on calibration constraint
`cma_options`	dict	None	Override CMA-ES settings

Methods

`run_CMA() → OptimizationResult`

Execute the CMA-ES evolution loop. Returns a structured OptimizationResult.

`calc_FLUKA(result) → ndarray`

Reconstruct the total detector signal from an OptimizationResult. Useful for computing post-hoc residuals.

`mmin(x) → float`

Internal objective function called by CMA-ES. Unpacks x and delegates to the pre-selected variant.

`OptimizationResult`

Dataclass returned by run_CMA().

Attribute	Type	Description
`spectra_linear`	list[ndarray]	Unfolded spectra (linear scale), one per species
`spectra_log`	list[ndarray]	Same in log₁₀ (raw CMA output)
`facts`	ndarray or None	Optimised calibration factors
`species_labels`	list[str]	Species names
`E_guess_list`	list[ndarray]	Energy grids (snapped to RM)
`es_result`	CMAResult	Raw pycma result object

`get(label) → (E_guess, spectrum_linear)`

Convenience accessor. Raises KeyError if the label is not found.

E, spec = result.get('photon')
plt.loglog(E, spec)

`DataProcessor`

`import_RM() → (Spectrum_tot, FLUKA_tot, FLUKA_fact, E)`

Reads all *_FLUKA.txt / *_Energy.txt file triplets from Config.folder_path. Applies Config.factor, peak-normalises each row, and returns arrays sorted by ascending energy.

`read_image(norm_flag=True, plot_flag=False) → ndarray`

Opens Config.image_path, extracts integrated pixel intensity per ROI. If norm_flag is True, each value is divided by the ROI pixel count.

`find_nearest(array, values) → ndarray`

Injective nearest-neighbour mapping: returns one unique index per value. Raises ValueError if any value falls outside the array range.

`ErrorAnalysis`

Constructor

ErrorAnalysis(E_error, error_files, ddv, window_size=200)

Loads four error-level files and builds a 2-D error matrix over (normalised signal, energy), smoothed with a moving-average kernel of width window_size.

`get_errors(x_vals, data_vals) → ndarray`

Bilinear interpolation on the error matrix. data_vals is peak-normalised internally. Returns NaN where interpolation fails (out-of-bounds).

`plot_error_results(E_guess, signal, y_errors, std)`

Produces a publication-ready log–log plot with error bars combining unfolding uncertainty, systematic error (std), and Poisson counting noise.

`Plotter`

`plot_results(Exp_FLUKA, FLUKA_sim)`

Overlays experimental and reconstructed detector signals (normalised).

`plot_spectrum(E_guess, sim)`

Log–log plot of the unfolded spectrum.

Helper Functions

Function	Description
`_smoothness_simple(spectrum, E_guess)`	Second-difference curvature penalty
`_smoothness_logistic(spectrum, ...)`	Logistic-weighted first-difference penalty
`_huber_loss(residuals, delta, weights)`	Element-wise Huber loss
`_softplus(x, k)`	Smooth ReLU approximation
`_facts_penalty(facts, k)`	Quartic barrier for calibration factors
`make_synthetic_exp(...)`	Forward-model a known spectrum into synthetic detector data

Tutorial: Single-Species Unfolding

The simplest use case: unfold a photon spectrum from 17 scintillator channels.

from CMA_optimizer import (
    Config, DataProcessor, SpeciesConfig,
    MultiSpeciesOptimizer, Plotter
)

# 1. Load RM
_, FLUKA_tot, FLUKA_fact, E = DataProcessor.import_RM()

# 2. Read experimental data
Exp_FLUKA = DataProcessor.read_image(norm_flag=True)
Exp_max = Exp_FLUKA.max()
Exp_FLUKA /= Exp_max

# 3. Configure species
species = SpeciesConfig(
    label='photon',
    E=E,
    E_guess=Config.E_guess.copy(),
    FLUKA_tot=FLUKA_tot,
    FLUKA_fact=FLUKA_fact,
    smooth_factor=5e-6,
)

# 4. Run optimizer
opt = MultiSpeciesOptimizer(
    species_list=[species],
    Exp_FLUKA=Exp_FLUKA,
    smoothing=True,
)
result = opt.run_CMA()

# 5. Plot
E_out, spec = result.get('photon')
Plotter.plot_spectrum(E_out, spec * Exp_max)

Tutorial: Multi-Species (Photon + Electron)

# After loading photon RM as above...

# Load electron RM by temporarily swapping folder_path
_backup = Config.folder_path
Config.folder_path = Config.folder_path_e
_, FLUKA_tot_e, FLUKA_fact_e, E_e = DataProcessor.import_RM()
Config.folder_path = _backup

species_e = SpeciesConfig(
    label='electron',
    E=E_e,
    E_guess=Config.E_guess_e,
    FLUKA_tot=FLUKA_tot_e,
    FLUKA_fact=FLUKA_fact_e,
    smooth_factor=1.3e-9,
    smooth_transition_start=0.30,
    smooth_min_weight=0.10,
    smooth_steepness=10,
)

opt = MultiSpeciesOptimizer(
    species_list=[species_p, species_e],
    Exp_FLUKA=Exp_FLUKA,
    smoothing=True,
)
result = opt.run_CMA()

# Access each species independently
E_p, spec_p = result.get('photon')
E_e, spec_e = result.get('electron')

Electron Smoothing

The electron species typically requires a much lower smooth_factor and an earlier logistic transition onset because electron RMs tend to have sharper energy dependence and fewer spectral bins.

Tutorial: Synthetic Closure Test

The main() function demonstrates a closure test: a known analytical spectrum is forward-modelled through the RM to produce synthetic data, then recovered by the optimizer. This validates the pipeline end-to-end.

from CMA_optimizer import make_synthetic_exp

# Define a known exponential photon spectrum
E_temp = 10   # MeV temperature
known = (1 / Config.E_guess) * np.exp(-Config.E_guess / E_temp)

# Generate synthetic detector signal
Exp_FLUKA = make_synthetic_exp(FLUKA_tot, FLUKA_fact, E, Config.E_guess, known)
Exp_max = Exp_FLUKA.max()
Exp_FLUKA /= Exp_max

# Run optimizer and compare result.spectra_linear[0] against `known`

Diagnostic Value

If the closure test fails to recover the input spectrum, possible causes include: insufficient N_guess, incorrect smooth_factor, or RM energy coverage that does not span the input spectrum. Always run a closure test before analysing real data.

Tutorial: Calibration Factor Tuning

opt = MultiSpeciesOptimizer(
    species_list=[species_p],
    Exp_FLUKA=Exp_FLUKA,
    tune_calibration=True,
    facts_penalty_weight=1e-2,
    smoothing=True,
)
result = opt.run_CMA()

# Inspect optimised calibration factors
print(result.facts)  # should be close to 1.0

Degeneracy Risk

With calibration tuning enabled, the optimizer has \(N_\text{guess} + N_\text{FLUKA}\) free parameters but only \(N_\text{FLUKA}\) constraints. The problem is under-determined unless the smoothness penalty and facts penalty jointly constrain the solution. Monitor result.facts — if values drift beyond ±10%, increase facts_penalty_weight.

Hyperparameter Guide

Parameter	Typical Range	Effect of Increase
`N_guess`	10–100	Higher spectral resolution; more ill-posed
`smooth_factor`	1e-7 – 1e-3	Smoother spectrum; risk of over-regularisation
`huber_delta`	1e-3 – 1e-1	Less outlier robustness; closer to L2 loss
`facts_penalty_weight`	1e-3 – 1	Facts pinned closer to 1; less flexibility
`smooth_transition_start`	0.3 – 0.95	Later onset → more bins fully regularised
`sigma0`	1 – 5	Broader initial search; slower convergence
`popsize`	17 – 100	Better exploration; more evaluations per generation

Tuning Strategy

Start with the closure test. Adjust smooth_factor until the recovered spectrum matches the input within the expected error bars. Then apply to real data. If the residual plot shows systematic structure (oscillations in the detector-signal comparison), the smoothness is too high; if the spectrum shows high-frequency noise, it is too low.

Diagnostics & Debugging

Convergence Checks

Inspect result.es_result.stop() to see which CMA termination criterion fired.
Use cma.plot() on the logger for convergence curves (objective value, σ, axis ratios).
If maxiter is reached without convergence, increase it or widen sigma0.

Residual Analysis

FLUKA_sim = opt.calc_FLUKA(result)
residual = (Exp_FLUKA.flatten() - FLUKA_sim / FLUKA_sim.max())
plt.bar(range(Config.N_FLUKA), residual)
plt.ylabel('Relative residual')
plt.xlabel('Detector channel')
plt.show()

Common Failure Modes

Spectrum hits the upper/lower bound

If many bins saturate at upper_bound or lower_bound, widen the bounds. This typically happens when the normalisation of Exp_FLUKA is inconsistent with the RM scale, or when the RM does not cover the energy range of the true spectrum.

Optimizer converges to wrong spectrum

Run the closure test first. If it passes but real data fails, the issue is likely in the RM (missing physics, incorrect calibration) or the experimental data (background subtraction, saturation). Try enabling tune_calibration to absorb per-channel systematics.

CMA-ES stalls (σ collapses prematurely)

Increase popsize (e.g. 50–100) and/or set 'tolfun': 1e-12 in cma_options. Premature σ-collapse often indicates a flat or degenerate landscape — check that the RM has sufficient rank for the chosen N_guess.

Performance & Scaling

Configuration	N_vars	Typical Runtime
1 species, 25 bins, no calib	25	1–5 min
2 species (25+5), no calib	30	3–10 min
2 species + calibration	47	10–30 min
1 species, 100 bins	100	30–90 min

Runtimes are approximate for a single-core desktop (3 GHz). CMA-ES is inherently sequential in its covariance update, but the objective function evaluations within each generation are independent and could be parallelised via cma.CMAEvolutionStrategy's eval_parallel callback. The forward model itself (_calc_FLUKA_species) is fully vectorised with NumPy.

Extending the Code

Adding a Third Species

Create a third SpeciesConfig with its own RM and energy grid, then pass all three to MultiSpeciesOptimizer. The code supports up to 3 species natively.

Custom Regularisation

To add a new regularisation term (e.g. entropy, L1 sparsity), define a function with signature f(spectrum: ndarray) → float and add it to the appropriate _objective_* method. Create a new objective variant if needed and register it in _build_objective_fn().

Alternative Fidelity Metrics

The Huber loss can be replaced by any robust estimator (Cauchy, Tukey biweight) by modifying _huber_loss(). The relative-residual normalisation \((d - \hat{d})^2 / d^2\) can also be changed to absolute residuals or Poisson log-likelihood depending on the noise model.

Known Issues

Config as mutable class attributes: swapping Config.folder_path to load a second RM is a side-effect pattern. A future refactor should pass the folder path as an argument to import_RM().
ErrorAnalysis double-smoothing: the error matrix is smoothed once inside _create_error_matrix() and again in __init__. This is inherited from the original code and produces slightly more diffuse error estimates than a single pass.
Injective nearest-neighbour in find_nearest: if two E_guess values map to the same RM index, the second is bumped to the next available index. This can shift bins at the edges of the energy grid. Prefer well-separated E_guess values.
No automatic normalisation consistency check between Exp_FLUKA and the RM scale. The user must ensure consistent units.

License

This software is released under the PolyForm Noncommercial License 1.0.0. You may use, modify, and distribute the code for any non-commercial purpose. Commercial use requires a separate license from the author.

CMA Spectral Unfolder

Physics Background

The Inverse Problem

Why CMA-ES?

Multi-Species Unfolding

Architecture

Getting Started

Installation

Requirements

Directory Structure

Response Matrix Format

Configuration — Config

Energy Grids

ROI & Calibration

Error Analysis Files

Second Species (Electrons)

Forward Model

Loss Function

Fidelity Term

Smoothness Penalties

Simple (Tikhonov-like)

Logistic-Weighted

Calibration Factor Tuning

CMA-ES Internals

Optimisation Vector Layout

API Reference

SpeciesConfig

Constructor Parameters

Methods

resolve()

MultiSpeciesOptimizer

Constructor Parameters

Methods

run_CMA() → OptimizationResult

calc_FLUKA(result) → ndarray

mmin(x) → float

OptimizationResult

get(label) → (E_guess, spectrum_linear)

DataProcessor

import_RM() → (Spectrum_tot, FLUKA_tot, FLUKA_fact, E)

read_image(norm_flag=True, plot_flag=False) → ndarray

find_nearest(array, values) → ndarray

ErrorAnalysis

Constructor

get_errors(x_vals, data_vals) → ndarray

plot_error_results(E_guess, signal, y_errors, std)

Plotter

plot_results(Exp_FLUKA, FLUKA_sim)

plot_spectrum(E_guess, sim)

Helper Functions

Tutorial: Single-Species Unfolding

Tutorial: Multi-Species (Photon + Electron)

Tutorial: Synthetic Closure Test

Tutorial: Calibration Factor Tuning

Hyperparameter Guide

Diagnostics & Debugging

Convergence Checks

Residual Analysis

Common Failure Modes

Performance & Scaling

Extending the Code

Adding a Third Species

Custom Regularisation

Alternative Fidelity Metrics

Known Issues

License

Configuration — `Config`

`SpeciesConfig`

`resolve()`

`MultiSpeciesOptimizer`

`run_CMA() → OptimizationResult`

`calc_FLUKA(result) → ndarray`

`mmin(x) → float`

`OptimizationResult`

`get(label) → (E_guess, spectrum_linear)`

`DataProcessor`

`import_RM() → (Spectrum_tot, FLUKA_tot, FLUKA_fact, E)`

`read_image(norm_flag=True, plot_flag=False) → ndarray`

`find_nearest(array, values) → ndarray`

`ErrorAnalysis`

`get_errors(x_vals, data_vals) → ndarray`

`plot_error_results(E_guess, signal, y_errors, std)`

`Plotter`

`plot_results(Exp_FLUKA, FLUKA_sim)`

`plot_spectrum(E_guess, sim)`