Post-Processing Framework
The osiris_utils package provides a lightweight framework for post-processing OSIRIS simulation data.
Post-processors are implemented as wrappers around the existing Simulation / Diagnostic interface so that:
indexing works the same way everywhere (lazy, sliceable, tuple-sliceable)
arithmetic between diagnostics keeps working (
+,-,*,/, etc.)post-processors can be chained when it makes sense (e.g., derivatives)
This page documents the framework and the main post-processors currently available.
PostProcess Base Class
- class osiris_utils.postprocessing.postprocess.PostProcess(name: str, simulation: Any)Source
Bases:
objectBase class for post-processing wrappers (FFT_Simulation, Derivative_Simulation, etc).
- It wraps a Simulation-like object (Simulation or another PostProcess) and exposes:
.species
.loaded_diagnostics
.add_diagnostic()
Subclasses implement __getitem__ to return processed diagnostics.
- property loaded_diagnostics: dictSource
Delegate to wrapped simulation if it has loaded_diagnostics. Otherwise return empty dict.
- add_diagnostic(diagnostic: Diagnostic, name: str | None = None) strSource
Delegate add_diagnostic downwards. This lets you do: sim_fft.add_diagnostic(…)
The PostProcess class is the base class for all simulation-level post-processors.
Design goals
A post-processor should behave like a Simulation:
pp["e3"]returns a diagnostic-like object (usually aDiagnosticsubclass)pp["electrons"]["n"]returns a species diagnosticit caches created wrappers so repeated access is cheap
Most post-processors follow the same pattern:
A simulation wrapper (
Something_Simulation) inheriting fromPostProcessA diagnostic wrapper (
Something_Diagnostic) inheriting fromDiagnosticAn optional species handler (
Something_Species_Handler) for species-dependent quantities
Integration with OSIRIS Diagnostics
Post-processors are designed to work seamlessly with the OSIRIS diagnostic system:
Chaining (when appropriate): some post-processors can wrap other post-processors (e.g., derivatives of already-derived quantities).
Compatibility with visualization: results remain
Diagnostic-like and can be plotted the same way.Uniform indexing contract:
diag[i]returns a single timestepnp.ndarray(shape(x, y, z)depending on dimension)diag[i:j]returns a stacked time array (shape(t, x, y, z))diag[i, :, 100:200]uses tuple indexing (time index + spatial slices)
Important implementation note
The base Diagnostic.__getitem__ already implements:
int and slice time indexing
tuple indexing (time + spatial slices)
fast path for in-memory data
So post-processed diagnostics should generally implement:
load_all()to buildself._data(shape includes time)_frame(index, data_slice=None)to return a single timestep lazily (shape excludes time)
You usually do not need to override __getitem__ in post-processed diagnostics.
Derivative Post-Processing
The Derivative_Simulation module provides tools for computing time and spatial derivatives of diagnostics.
Derivative_Simulation Class
- class osiris_utils.postprocessing.derivative.Derivative_Simulation(simulation: Any, deriv_type: str, axis: int | tuple[int, int] | None = None, order: int = 4, stencil: Iterable[int] | None = None, deriv_order: int = 1, periodic: bool = False)Source
Bases:
PostProcessClass to compute the derivative of a diagnostic. Works as a wrapper for the Derivative_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.
This class can be initialized with either a Simulation object or another Derivative_Simulation object, allowing for chaining derivatives (e.g., second derivative = Derivative_Simulation(Derivative_Simulation(…))).
- Parameters:
simulation (Simulation or Derivative_Simulation) – The simulation object or another derivative simulation object.
deriv_type (str) – The type of derivative to compute. Options are: - ‘t’ for time derivative. - ‘x1’ for first spatial derivative. - ‘x2’ for second spatial derivative. - ‘x3’ for third spatial derivative. - ‘xx’ for second spatial derivative in two axis. - ‘xt’ for mixed derivative in time and one spatial axis. - ‘tx’ for mixed derivative in one spatial axis and time.
axis (int or tuple) – The axis to compute the derivative. Only used for ‘xx’, ‘xt’ and ‘tx’ types.
order (int) – Scheme selector (2 or 4). Ignored if stencil is provided.
stencil (Iterable[int] or None) – Integer offsets (in grid steps), e.g. [-2,-1,0,1,2] or [0,1,2,3].
deriv_order (int) – Derivative order (1 for first derivative, 2 for second derivative, …).
periodic (bool) – If True, derivatives are computed with periodic boundary conditions.
- property species: listSource
Return list of species, making this compatible with Simulation interface
- property loaded_diagnostics: dictSource
Return loaded diagnostics, making this compatible with Simulation interface
- add_diagnostic(diagnostic: Diagnostic, name: str | None = None) strSource
Add a custom diagnostic to the derivative simulation.
Derivative_Simulation behaves like an operator acting on all diagnostics inside a simulation:
wraps a
Simulation(or a compatible post-process wrapper)returns derivative diagnostics on demand
supports species and non-species diagnostics
supports lazy evaluation and efficient spatial slicing
may be chained (e.g., derivative of a derivative)
Derivative types
t: time derivative \(\\partial/\\partial t\)x1: spatial derivative \(\\partial/\\partial x_1\)x2: spatial derivative \(\\partial/\\partial x_2\)x3: spatial derivative \(\\partial/\\partial x_3\)xx: second derivative in two spatial axes (mixed or repeated), e.g. \(\\partial^2/\\partial x_i\\partial x_j\)xt: \(\\partial/\\partial x\) of \(\\partial/\\partial t\)tx: \(\\partial/\\partial t\) of \(\\partial/\\partial x\)
Usage examples
from osiris_utils.data import Simulation
from osiris_utils.postprocessing import Derivative_Simulation
sim = Simulation("/path/to/input/deck.inp")
# Spatial derivative d/dx1 of every diagnostic
dx1 = Derivative_Simulation(sim, "x1")
dE1_dx1 = dx1["e1"]
dvfl1_dx1 = dx1["electrons"]["vfl1"]
# Single timestep
arr = dE1_dx1[5]
# Time slice
arr_t = dE1_dx1[5:10]
# Spatial slicing (efficient if underlying diagnostic supports it)
arr_slice = dE1_dx1[5, :, 100:200]
# Full in-memory derivative (use only when needed)
dE1_dx1.load_all()
Implementation details and accuracy
Spatial derivatives use finite differences (2nd or 4th order depending on configuration).
Time derivatives must account for OSIRIS
ndump:effective \(\\Delta t = dt \\times ndump\)
For mixed derivatives (
tx,xt), the implementation applies the corresponding operator ordering.
Performance tips
Prefer indexing (lazy computation) when exploring data.
Use spatial slices to reduce IO and compute time.
Call
load_all()only when you truly need the full time history in memory.
Derivative_Diagnostic Class
- class osiris_utils.postprocessing.derivative.Derivative_Diagnostic(diagnostic: Diagnostic, deriv_type: str, axis: int | tuple[int, int] | None = None, order: int = 4, stencil: Iterable[int] | None = None, deriv_order: int = 1, periodic: bool = False)Source
Bases:
DiagnosticAuxiliar class to compute the derivative of a diagnostic, for it to be similar in behavior to a Diagnostic object. Inherits directly from Diagnostic to ensure all operation overloads work properly.
- Parameters:
diagnostic (Diagnostic) – The diagnostic object.
deriv_type (str) – The type of derivative to compute. Options are: ‘t’, ‘x1’, ‘x2’, ‘x3’, ‘xx’, ‘xt’ and ‘tx’.
axis (int or tuple) – The axis to compute the derivative. Only used for ‘xx’, ‘xt’ and ‘tx’ types
order (int) – Scheme selector (2 or 4). Ignored if stencil is provided.
stencil (Iterable[int] or None) – Integer offsets (in grid steps), e.g. [-2,-1,0,1,2] or [0,1,2,3].
deriv_order (int) – Derivative order (1 for first derivative, 2 for second derivative, …). Only used if stencil is provided.
periodic (bool) – If True, derivatives are computed with periodic boundary conditions.
- load_all()Source
Load all the data and compute the derivative.
- __getitem__(index)Source
Get data at a specific index.
- load_all(n_workers: int | None = None) ndarraySource
Load all data and compute the derivative.
- Parameters:
n_workers (int or None) –
Number of ProcessPoolExecutor workers for spatial derivatives (
deriv_typein{"x1", "x2", "x3"}). Spatial derivatives are independent across the time axis, so the data is split inton_workerschunks along axis 0 and processed in parallel.None(default) — single-process (safe everywhere, uses NumPy’s internal threading via MKL/OpenBLAS).> 1— activates chunked ProcessPoolExecutor. Uses theforkservermultiprocessing context, which is MPI-safe on HPC nodes (unlikefork, which can deadlock with MPI). SetNUMBA_NUM_THREADS/OMP_NUM_THREADSto 1 per worker when also using Numba to avoid thread over-subscription.
Time and mixed derivatives (
"t","xx","xt","tx") always run single-process because the stencil spans neighbouring time steps.
A derivative diagnostic is a Diagnostic wrapper that computes derivatives lazily using _frame and can compute eagerly using load_all.
It preserves metadata (grid spacing, dimensions, etc.) from the original diagnostic and remains compatible with diagnostic arithmetic.
Spectral Analysis with Fast Fourier Transform
The FFT_Simulation module provides tools for computing power spectra using the Fast Fourier Transform (FFT).
FFT_Simulation Class
- class osiris_utils.postprocessing.fft.FFT_Simulation(simulation: Simulation, fft_axis: int | list[int])Source
Bases:
PostProcessClass to handle the Fast Fourier Transform on data. Works as a wrapper for the FFT_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.
- Parameters:
simulation (Simulation) – The simulation object.
fft_axis (int or list of int) – The axis or axes to compute the FFT.
- process(diagnostic: Diagnostic) FFT_DiagnosticSource
Apply FFT to a diagnostic
FFT_Simulation is a simulation wrapper that returns FFT_Diagnostic objects.
Key points about FFT axes
OSIRIS diagnostics loaded with
load_all()have shape(t, x1, x2, x3)(depending on dimension). In that case: - axis0is time - axes1,2,3are spatialA single timestep returned by indexing has no time dimension, only spatial: - axes become
(x1, x2, x3)→ numpy axes0,1,2
Important constraint
If the FFT includes the time axis (axis
0), you cannot compute it from a single timestep. Time-domain FFT requires the full time series, therefore it requiresload_all()on the FFT diagnostic.Spatial FFTs can be computed per-timestep lazily.
Usage example
from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FFT_Simulation
sim = Simulation("/path/to/input/deck.inp")
# Spatial spectrum at each timestep (FFT along x1)
fft_x1 = FFT_Simulation(sim, 1)
e3_k = fft_x1["e3"]
k_spectrum_t10 = e3_k[10] # OK: spatial FFT at timestep 10
# Dispersion-like spectrum (time + space) requires load_all
fft_wk = FFT_Simulation(sim, (0, 1))
e3_wk = fft_wk["e3"]
e3_wk.load_all() # required
P = e3_wk.data # power spectrum |FFT|^2
FFT_Diagnostic Class
- class osiris_utils.postprocessing.fft.FFT_Diagnostic(diagnostic: Diagnostic, fft_axis: int | list[int], *, window: str | None = 'hann', window_for_spatial: str | None = None, detrend: str | None = 'mean', normalize: str = 'ortho', assume_periodic: bool = False)Source
Bases:
DiagnosticAuxiliar class to compute the FFT of a diagnostic, for it to be similar in behavior to a Diagnostic object. Inherits directly from Diagnostic to ensure all operation overloads work properly.
- Parameters:
diagnostic (Diagnostic) – The diagnostic to compute the FFT.
axis (int) – The axis to compute the FFT.
window (str or None, optional) – The window to apply before computing the FFT. Can be “hann” or None (default “hann”).
detrend (str or None, optional) – The detrending method to apply before computing the FFT. Can be “mean” or None (default “mean”).
normalize (str, optional) – The normalization method for the FFT. Can be “none”, “ortho”, or “density” (default “ortho”).
assume_periodic (bool, optional) – Whether to assume the data is periodic when choosing default windowing for spatial FFTs (default True). If True, no spatial window is applied by default; if False, the same window is
- load_all() np.ndarraySource
Load all data and compute FFT (possibly including time axis). Returns power spectrum |FFT|^2 stored in self._data.
- _frame(index: int, data_slice: tuple | None = None) np.ndarraySource
Per-timestep (lazy) FFT. Only allowed for spatial FFT. If fft_axis includes time (0), user must call load_all(). Returns power spectrum |FFT|^2.
- omega() np.ndarraySource
Get the angular frequency array for the FFT along the time dimension.
- k(axis: int | None = None) np.ndarray | dict[int, np.ndarray]Source
Get the wavenumber array for the FFT along spatial dimension(s).
- load_all() ndarraySource
Load all data and compute FFT (possibly including time axis). Returns power spectrum |FFT|^2 stored in self._data.
- omega() ndarraySource
Get the angular frequency array for the FFT along the time dimension.
- k(axis: int | None = None) ndarray | dict[int, ndarray]Source
Get the wavenumber array for the FFT along spatial dimension(s).
- Parameters:
axis (int or None, optional) – The spatial axis to compute wavenumber for (1, 2, or 3). If None, returns wavenumbers for all spatial axes in fft_axis.
- Returns:
np.ndarray or dict – If axis is specified: wavenumber array for that axis. If axis is None: dictionary mapping axis -> wavenumber array.
Notes
When load_all() is used, time axis is 0 and spatial axes are 1,2,3. When accessing single timesteps, spatial axes are 0,1,2.
- property kmax: float | ndarraySource
Nyquist wavenumber (rad/m). Upper limit of representable k.
- property omega_maxSource
Nyquist angular frequency (rad/s). Upper limit of representable ω.
A specialized diagnostic representing an FFT-based power spectrum.
Behavior
load_all()computes the FFT over the requested axes (including time if axis 0 is included)._frame(index, data_slice=None)computes a per-timestep FFT only for spatial axes.The returned data is a power spectrum (typically \(|\\mathrm{FFT}|^2\)) and is shifted using
fftshiftso that zero frequency / wavenumber is centered.
Frequency / wavenumber arrays
Use
omega()for the time-frequency axis (when axis 0 was transformed and data is loaded).Use
k(axis=...)for spatial wavenumbers.kmaxreports the Nyquist wavenumber \(\\pi/\\Delta x\) for the relevant spatial axes.
Example: Dispersion relation (ω–k)
from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FFT_Simulation
import numpy as np
import matplotlib.pyplot as plt
sim = Simulation("/path/to/input/deck.inp")
fft = FFT_Simulation(sim, (0, 1))
e1_fft = fft["e1"]
e1_fft.load_all()
omega = e1_fft.omega() # angular frequency axis (shifted)
k1 = e1_fft.k(1) # k for x1 axis (shifted)
P = np.log10(e1_fft.data + 1e-30) # safer for plotting
plt.figure()
plt.pcolormesh(k1, omega, P, shading="auto")
plt.xlabel("k1")
plt.ylabel("ω")
plt.title("Dispersion-like spectrum: E1")
plt.show()
Mean Field Theory Analysis
The Mean Field Theory (MFT) module provides tools for decomposing diagnostics into an average (mean) component and fluctuations along a chosen axis.
MFT_Simulation Class
- class osiris_utils.postprocessing.mft.MFT_Simulation(simulation: Simulation, mft_axis: int | None = None)Source
Bases:
PostProcessClass to compute the Mean Field Theory approximation of a diagnostic. Works as a wrapper for the MFT_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.
- Parameters:
simulation (Simulation) – The simulation object.
mft_axis (int) – The axis to compute the mean field theory.
The MFT decomposition returns a container-like diagnostic with two components:
"avg": mean field \(\\langle A \\rangle\)"delta": fluctuations \(\\delta A = A - \\langle A \\rangle\)
Usage example:
from osiris_utils.data import Simulation
from osiris_utils.postprocessing import MFT_Simulation
sim = Simulation("/path/to/input/deck.inp")
mft = MFT_Simulation(sim, 1) # average along x1 direction
e1 = mft["e1"]
e1_avg = e1["avg"]
e1_delta = e1["delta"]
avg_t10 = e1_avg[10]
flt_t10 = e1_delta[10]
MFT_Diagnostic Classes
- class osiris_utils.postprocessing.mft.MFT_Diagnostic(diagnostic: Diagnostic, mft_axis: int)Source
Bases:
DiagnosticContainer: gives access to “avg” and “delta” diagnostics. Not itself a time-indexed diagnostic.
- __getitem__(key: str) DiagnosticSource
Retrieve timestep data with optional spatial slicing.
- Parameters:
index (int, slice, or tuple) –
If int, loads full spatial data for that timestep.
If slice, supports start:stop:step for time. Zero-length slices return an empty array of shape (0, …).
If tuple, first element is time index/slice, remaining elements are spatial slices for each dimension. Example: diag[5, :, 100:200] loads timestep 5 with spatial slicing applied. Efficient: HDF5 reads only the requested spatial slice from disk, not the full array.
- Returns:
np.ndarray – Array for that timestep (or stacked array for a slice), optionally spatially sliced.
- Raises:
IndexError – If the index is out of range or no data generator is available.
RuntimeError – If loading a specific timestep fails.
Examples
>>> data = diag[5] # Load full spatial data for timestep 5 >>> data = diag[5, :, 100:200] # Load timestep 5 with spatial slice (only reads slice from disk!) >>> data = diag[0:10] # Load timesteps 0-9 (full spatial domain) >>> data = diag[0:10, :, 50:] # Load timesteps 0-9 with spatial slice
- class osiris_utils.postprocessing.mft.MFT_Diagnostic_Average(diagnostic: Diagnostic, mft_axis: int)Source
Bases:
DiagnosticClass to compute the average component of mean field theory. Inherits from Diagnostic to ensure all operation overloads work properly.
- Parameters:
diagnostic (Diagnostic) – The diagnostic object.
mft_axis (int) – The axis to compute the mean field theory.
- load_all() np.ndarraySource
Load all data into memory (all iterations), in a pre-allocated array.
- Parameters:
n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
”thread” —
ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at_MAX_IO_WORKERS(default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).”process” —
ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable_load_frame_workerhelper. Requiresself._file_listto be populated (i.e. only valid on the baseDiagnostic, not on post-processing wrappers that override_frame).
- Returns:
data (np.ndarray) – The data for all iterations. Also stored in self._data.
Notes
On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer
executor_type="thread"for the I/O phase and rely onOMP_NUM_THREADS/MKL_NUM_THREADSfor the compute phase.
- class osiris_utils.postprocessing.mft.MFT_Diagnostic_Fluctuations(diagnostic: Diagnostic, mft_axis: int)Source
Bases:
Diagnosticdelta = f - avg, where avg uses keepdims to broadcast.
- load_all() np.ndarraySource
Load all data into memory (all iterations), in a pre-allocated array.
- Parameters:
n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
”thread” —
ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at_MAX_IO_WORKERS(default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).”process” —
ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable_load_frame_workerhelper. Requiresself._file_listto be populated (i.e. only valid on the baseDiagnostic, not on post-processing wrappers that override_frame).
- Returns:
data (np.ndarray) – The data for all iterations. Also stored in self._data.
Notes
On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer
executor_type="thread"for the I/O phase and rely onOMP_NUM_THREADS/MKL_NUM_THREADSfor the compute phase.
Field Centering
The FieldCentering module provides tools to convert electromagnetic fields from the Yee mesh locations to cell centers.
FieldCentering_Simulation Class
- class osiris_utils.postprocessing.field_centering.FieldCentering_Simulation(simulation: Simulation)Source
Bases:
PostProcessClass to handle the field centering on data.It converts fields from the Osiris yee mesh to the center of the cells. Works as a wrapper for the FieldCentering_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.
It only works for periodic boundaries.
- Parameters:
simulation (Simulation) – The simulation object.
field (str) – The field to center.
Field centering is a purely spatial operation and can be computed lazily per timestep.
Usage example:
from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FieldCentering_Simulation
sim = Simulation("/path/to/input/deck.inp")
centered = FieldCentering_Simulation(sim)
e1c = centered["e1"]
arr = e1c[10] # centered field at timestep 10
arr_slice = e1c[10, :, :] # supports spatial slicing
FieldCentering_Diagnostic Class
- class osiris_utils.postprocessing.field_centering.FieldCentering_Diagnostic(diagnostic: Diagnostic)Source
Bases:
Diagnostic- load_all() ndarraySource
Load all data into memory (all iterations), in a pre-allocated array.
- Parameters:
n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
”thread” —
ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at_MAX_IO_WORKERS(default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).”process” —
ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable_load_frame_workerhelper. Requiresself._file_listto be populated (i.e. only valid on the baseDiagnostic, not on post-processing wrappers that override_frame).
- Returns:
data (np.ndarray) – The data for all iterations. Also stored in self._data.
Notes
On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer
executor_type="thread"for the I/O phase and rely onOMP_NUM_THREADS/MKL_NUM_THREADSfor the compute phase.
This diagnostic wraps a field diagnostic and returns its cell-centered version. The implementation assumes periodic boundaries when using shifts/rolls. ``