Post-Processing Framework

The osiris_utils package provides a lightweight framework for post-processing OSIRIS simulation data. Post-processors are implemented as wrappers around the existing Simulation / Diagnostic interface so that:

indexing works the same way everywhere (lazy, sliceable, tuple-sliceable)
arithmetic between diagnostics keeps working (+, -, *, /, etc.)
post-processors can be chained when it makes sense (e.g., derivatives)

This page documents the framework and the main post-processors currently available.

PostProcess Base Class

class osiris_utils.postprocessing.postprocess.PostProcess(name: str, simulation: Any)Source

Bases: object

Base class for post-processing wrappers (FFT_Simulation, Derivative_Simulation, etc).

It wraps a Simulation-like object (Simulation or another PostProcess) and exposes:

.species
.loaded_diagnostics
.add_diagnostic()

Subclasses implement __getitem__ to return processed diagnostics.

property loaded_diagnostics: dictSource: Delegate to wrapped simulation if it has loaded_diagnostics. Otherwise return empty dict.

add_diagnostic(diagnostic: Diagnostic, name: str | None = None) → strSource: Delegate add_diagnostic downwards. This lets you do: sim_fft.add_diagnostic(…)

The PostProcess class is the base class for all simulation-level post-processors.

Design goals

A post-processor should behave like a Simulation:

pp["e3"] returns a diagnostic-like object (usually a Diagnostic subclass)
pp["electrons"]["n"] returns a species diagnostic
it caches created wrappers so repeated access is cheap

Most post-processors follow the same pattern:

A simulation wrapper (Something_Simulation) inheriting from PostProcess
A diagnostic wrapper (Something_Diagnostic) inheriting from Diagnostic
An optional species handler (Something_Species_Handler) for species-dependent quantities

Integration with OSIRIS Diagnostics

Post-processors are designed to work seamlessly with the OSIRIS diagnostic system:

Chaining (when appropriate): some post-processors can wrap other post-processors (e.g., derivatives of already-derived quantities).
Compatibility with visualization: results remain Diagnostic-like and can be plotted the same way.
Uniform indexing contract:
- diag[i] returns a single timestep np.ndarray (shape (x, y, z) depending on dimension)
- diag[i:j] returns a stacked time array (shape (t, x, y, z))
- diag[i, :, 100:200] uses tuple indexing (time index + spatial slices)

Important implementation note

The base Diagnostic.__getitem__ already implements:

int and slice time indexing
tuple indexing (time + spatial slices)
fast path for in-memory data

So post-processed diagnostics should generally implement:

load_all() to build self._data (shape includes time)
_frame(index, data_slice=None) to return a single timestep lazily (shape excludes time)

You usually do not need to override __getitem__ in post-processed diagnostics.

Derivative Post-Processing

The Derivative_Simulation module provides tools for computing time and spatial derivatives of diagnostics.

Derivative_Simulation Class

class osiris_utils.postprocessing.derivative.Derivative_Simulation(simulation: Any, deriv_type: str, axis: int | tuple[int, int] | None = None, order: int = 4, stencil: Iterable[int] | None = None, deriv_order: int = 1, periodic: bool = False)Source

Bases: PostProcess

Class to compute the derivative of a diagnostic. Works as a wrapper for the Derivative_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

This class can be initialized with either a Simulation object or another Derivative_Simulation object, allowing for chaining derivatives (e.g., second derivative = Derivative_Simulation(Derivative_Simulation(…))).

Parameters:

simulation (Simulation or Derivative_Simulation) – The simulation object or another derivative simulation object.
deriv_type (str) – The type of derivative to compute. Options are: - ‘t’ for time derivative. - ‘x1’ for first spatial derivative. - ‘x2’ for second spatial derivative. - ‘x3’ for third spatial derivative. - ‘xx’ for second spatial derivative in two axis. - ‘xt’ for mixed derivative in time and one spatial axis. - ‘tx’ for mixed derivative in one spatial axis and time.
axis (int or tuple) – The axis to compute the derivative. Only used for ‘xx’, ‘xt’ and ‘tx’ types.
order (int) – Scheme selector (2 or 4). Ignored if stencil is provided.
stencil (Iterable[int] or None) – Integer offsets (in grid steps), e.g. [-2,-1,0,1,2] or [0,1,2,3].
deriv_order (int) – Derivative order (1 for first derivative, 2 for second derivative, …).
periodic (bool) – If True, derivatives are computed with periodic boundary conditions.

property species: listSource: Return list of species, making this compatible with Simulation interface

property loaded_diagnostics: dictSource: Return loaded diagnostics, making this compatible with Simulation interface

add_diagnostic(diagnostic: Diagnostic, name: str | None = None) → strSource: Add a custom diagnostic to the derivative simulation.

Derivative_Simulation behaves like an operator acting on all diagnostics inside a simulation:

wraps a Simulation (or a compatible post-process wrapper)
returns derivative diagnostics on demand
supports species and non-species diagnostics
supports lazy evaluation and efficient spatial slicing
may be chained (e.g., derivative of a derivative)

Derivative types

t : time derivative \(\\partial/\\partial t\)
x1 : spatial derivative \(\\partial/\\partial x_1\)
x2 : spatial derivative \(\\partial/\\partial x_2\)
x3 : spatial derivative \(\\partial/\\partial x_3\)
xx : second derivative in two spatial axes (mixed or repeated), e.g. \(\\partial^2/\\partial x_i\\partial x_j\)
xt : \(\\partial/\\partial x\) of \(\\partial/\\partial t\)
tx : \(\\partial/\\partial t\) of \(\\partial/\\partial x\)

Usage examples

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import Derivative_Simulation

sim = Simulation("/path/to/input/deck.inp")

# Spatial derivative d/dx1 of every diagnostic
dx1 = Derivative_Simulation(sim, "x1")

dE1_dx1 = dx1["e1"]
dvfl1_dx1 = dx1["electrons"]["vfl1"]

# Single timestep
arr = dE1_dx1[5]

# Time slice
arr_t = dE1_dx1[5:10]

# Spatial slicing (efficient if underlying diagnostic supports it)
arr_slice = dE1_dx1[5, :, 100:200]

# Full in-memory derivative (use only when needed)
dE1_dx1.load_all()

Implementation details and accuracy

Spatial derivatives use finite differences (2nd or 4th order depending on configuration).
Time derivatives must account for OSIRIS ndump:

effective \(\\Delta t = dt \\times ndump\)
For mixed derivatives (tx, xt), the implementation applies the corresponding operator ordering.

Performance tips

Prefer indexing (lazy computation) when exploring data.
Use spatial slices to reduce IO and compute time.
Call load_all() only when you truly need the full time history in memory.

Derivative_Diagnostic Class

class osiris_utils.postprocessing.derivative.Derivative_Diagnostic(diagnostic: Diagnostic, deriv_type: str, axis: int | tuple[int, int] | None = None, order: int = 4, stencil: Iterable[int] | None = None, deriv_order: int = 1, periodic: bool = False)Source

Bases: Diagnostic

Auxiliar class to compute the derivative of a diagnostic, for it to be similar in behavior to a Diagnostic object. Inherits directly from Diagnostic to ensure all operation overloads work properly.

Parameters:

diagnostic (Diagnostic) – The diagnostic object.
deriv_type (str) – The type of derivative to compute. Options are: ‘t’, ‘x1’, ‘x2’, ‘x3’, ‘xx’, ‘xt’ and ‘tx’.
axis (int or tuple) – The axis to compute the derivative. Only used for ‘xx’, ‘xt’ and ‘tx’ types
order (int) – Scheme selector (2 or 4). Ignored if stencil is provided.
stencil (Iterable[int] or None) – Integer offsets (in grid steps), e.g. [-2,-1,0,1,2] or [0,1,2,3].
deriv_order (int) – Derivative order (1 for first derivative, 2 for second derivative, …). Only used if stencil is provided.
periodic (bool) – If True, derivatives are computed with periodic boundary conditions.

load_all()Source: Load all the data and compute the derivative.

__getitem__(index)Source: Get data at a specific index.

load_all(n_workers: int | None = None) → ndarraySource

Load all data and compute the derivative.

Parameters:

n_workers (int or None) –

Number of ProcessPoolExecutor workers for spatial derivatives (deriv_type in {"x1", "x2", "x3"}). Spatial derivatives are independent across the time axis, so the data is split into n_workers chunks along axis 0 and processed in parallel.

None (default) — single-process (safe everywhere, uses NumPy’s internal threading via MKL/OpenBLAS).
> 1 — activates chunked ProcessPoolExecutor. Uses the forkserver multiprocessing context, which is MPI-safe on HPC nodes (unlike fork, which can deadlock with MPI). Set NUMBA_NUM_THREADS / OMP_NUM_THREADS to 1 per worker when also using Numba to avoid thread over-subscription.

Time and mixed derivatives ("t", "xx", "xt", "tx") always run single-process because the stencil spans neighbouring time steps.

A derivative diagnostic is a Diagnostic wrapper that computes derivatives lazily using _frame and can compute eagerly using load_all. It preserves metadata (grid spacing, dimensions, etc.) from the original diagnostic and remains compatible with diagnostic arithmetic.

Spectral Analysis with Fast Fourier Transform

The FFT_Simulation module provides tools for computing power spectra using the Fast Fourier Transform (FFT).

FFT_Simulation Class

class osiris_utils.postprocessing.fft.FFT_Simulation(simulation: Simulation, fft_axis: int | list[int])Source

Bases: PostProcess

Class to handle the Fast Fourier Transform on data. Works as a wrapper for the FFT_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

Parameters:

simulation (Simulation) – The simulation object.
fft_axis (int or list of int) – The axis or axes to compute the FFT.

process(diagnostic: Diagnostic) → FFT_DiagnosticSource: Apply FFT to a diagnostic

FFT_Simulation is a simulation wrapper that returns FFT_Diagnostic objects.

Key points about FFT axes

OSIRIS diagnostics loaded with load_all() have shape (t, x1, x2, x3) (depending on dimension). In that case: - axis 0 is time - axes 1,2,3 are spatial
A single timestep returned by indexing has no time dimension, only spatial: - axes become (x1, x2, x3) → numpy axes 0,1,2

Important constraint

If the FFT includes the time axis (axis 0), you cannot compute it from a single timestep. Time-domain FFT requires the full time series, therefore it requires load_all() on the FFT diagnostic.
Spatial FFTs can be computed per-timestep lazily.

Usage example

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FFT_Simulation

sim = Simulation("/path/to/input/deck.inp")

# Spatial spectrum at each timestep (FFT along x1)
fft_x1 = FFT_Simulation(sim, 1)
e3_k = fft_x1["e3"]
k_spectrum_t10 = e3_k[10]  # OK: spatial FFT at timestep 10

# Dispersion-like spectrum (time + space) requires load_all
fft_wk = FFT_Simulation(sim, (0, 1))
e3_wk = fft_wk["e3"]
e3_wk.load_all()           # required
P = e3_wk.data             # power spectrum |FFT|^2

FFT_Diagnostic Class

class osiris_utils.postprocessing.fft.FFT_Diagnostic(diagnostic: Diagnostic, fft_axis: int | list[int], *, window: str | None = 'hann', window_for_spatial: str | None = None, detrend: str | None = 'mean', normalize: str = 'ortho', assume_periodic: bool = False)Source

Bases: Diagnostic

Auxiliar class to compute the FFT of a diagnostic, for it to be similar in behavior to a Diagnostic object. Inherits directly from Diagnostic to ensure all operation overloads work properly.

Parameters:

diagnostic (Diagnostic) – The diagnostic to compute the FFT.
axis (int) – The axis to compute the FFT.
window (str or None, optional) – The window to apply before computing the FFT. Can be “hann” or None (default “hann”).
detrend (str or None, optional) – The detrending method to apply before computing the FFT. Can be “mean” or None (default “mean”).
normalize (str, optional) – The normalization method for the FFT. Can be “none”, “ortho”, or “density” (default “ortho”).
assume_periodic (bool, optional) – Whether to assume the data is periodic when choosing default windowing for spatial FFTs (default True). If True, no spatial window is applied by default; if False, the same window is

load_all() → np.ndarraySource: Load all data and compute FFT (possibly including time axis). Returns power spectrum |FFT|^2 stored in self._data.

_frame(index: int, data_slice: tuple | None = None) → np.ndarraySource: Per-timestep (lazy) FFT. Only allowed for spatial FFT. If fft_axis includes time (0), user must call load_all(). Returns power spectrum |FFT|^2.

omega() → np.ndarraySource: Get the angular frequency array for the FFT along the time dimension.

k(axis: int | None = None) → np.ndarray | dict[int, np.ndarray]Source: Get the wavenumber array for the FFT along spatial dimension(s).

load_all() → ndarraySource: Load all data and compute FFT (possibly including time axis). Returns power spectrum |FFT|^2 stored in self._data.

omega() → ndarraySource: Get the angular frequency array for the FFT along the time dimension.

k(axis: int | None = None) → ndarray | dict[int, ndarray]Source

Get the wavenumber array for the FFT along spatial dimension(s).

Parameters:: axis (int or None, optional) – The spatial axis to compute wavenumber for (1, 2, or 3). If None, returns wavenumbers for all spatial axes in fft_axis.
Returns:: np.ndarray or dict – If axis is specified: wavenumber array for that axis. If axis is None: dictionary mapping axis -> wavenumber array.

Notes

When load_all() is used, time axis is 0 and spatial axes are 1,2,3. When accessing single timesteps, spatial axes are 0,1,2.

property kmax: float | ndarraySource: Nyquist wavenumber (rad/m). Upper limit of representable k.

property omega_maxSource: Nyquist angular frequency (rad/s). Upper limit of representable ω.

A specialized diagnostic representing an FFT-based power spectrum.

Behavior

load_all() computes the FFT over the requested axes (including time if axis 0 is included).
_frame(index, data_slice=None) computes a per-timestep FFT only for spatial axes.
The returned data is a power spectrum (typically \(|\\mathrm{FFT}|^2\)) and is shifted using fftshift so that zero frequency / wavenumber is centered.

Frequency / wavenumber arrays

Use omega() for the time-frequency axis (when axis 0 was transformed and data is loaded).
Use k(axis=...) for spatial wavenumbers.
kmax reports the Nyquist wavenumber \(\\pi/\\Delta x\) for the relevant spatial axes.

Example: Dispersion relation (ω–k)

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FFT_Simulation
import numpy as np
import matplotlib.pyplot as plt

sim = Simulation("/path/to/input/deck.inp")

fft = FFT_Simulation(sim, (0, 1))
e1_fft = fft["e1"]
e1_fft.load_all()

omega = e1_fft.omega()      # angular frequency axis (shifted)
k1 = e1_fft.k(1)            # k for x1 axis (shifted)

P = np.log10(e1_fft.data + 1e-30)  # safer for plotting

plt.figure()
plt.pcolormesh(k1, omega, P, shading="auto")
plt.xlabel("k1")
plt.ylabel("ω")
plt.title("Dispersion-like spectrum: E1")
plt.show()

Mean Field Theory Analysis

The Mean Field Theory (MFT) module provides tools for decomposing diagnostics into an average (mean) component and fluctuations along a chosen axis.

MFT_Simulation Class

class osiris_utils.postprocessing.mft.MFT_Simulation(simulation: Simulation, mft_axis: int | None = None)Source

Bases: PostProcess

Class to compute the Mean Field Theory approximation of a diagnostic. Works as a wrapper for the MFT_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

Parameters:

simulation (Simulation) – The simulation object.
mft_axis (int) – The axis to compute the mean field theory.

The MFT decomposition returns a container-like diagnostic with two components:

"avg" : mean field \(\\langle A \\rangle\)
"delta" : fluctuations \(\\delta A = A - \\langle A \\rangle\)

Usage example:

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import MFT_Simulation

sim = Simulation("/path/to/input/deck.inp")

mft = MFT_Simulation(sim, 1)       # average along x1 direction
e1 = mft["e1"]

e1_avg = e1["avg"]
e1_delta = e1["delta"]

avg_t10 = e1_avg[10]
flt_t10 = e1_delta[10]

MFT_Diagnostic Classes

class osiris_utils.postprocessing.mft.MFT_Diagnostic(diagnostic: Diagnostic, mft_axis: int)Source

Bases: Diagnostic

Container: gives access to “avg” and “delta” diagnostics. Not itself a time-indexed diagnostic.

__getitem__(key: str) → DiagnosticSource

Retrieve timestep data with optional spatial slicing.

Parameters:

index (int, slice, or tuple) –

If int, loads full spatial data for that timestep.
If slice, supports start:stop:step for time. Zero-length slices return an empty array of shape (0, …).
If tuple, first element is time index/slice, remaining elements are spatial slices for each dimension. Example: diag[5, :, 100:200] loads timestep 5 with spatial slicing applied. Efficient: HDF5 reads only the requested spatial slice from disk, not the full array.

Returns:

np.ndarray – Array for that timestep (or stacked array for a slice), optionally spatially sliced.

Raises:

IndexError – If the index is out of range or no data generator is available.
RuntimeError – If loading a specific timestep fails.

Examples

>>> data = diag[5]              # Load full spatial data for timestep 5
>>> data = diag[5, :, 100:200]  # Load timestep 5 with spatial slice (only reads slice from disk!)
>>> data = diag[0:10]           # Load timesteps 0-9 (full spatial domain)
>>> data = diag[0:10, :, 50:]   # Load timesteps 0-9 with spatial slice

class osiris_utils.postprocessing.mft.MFT_Diagnostic_Average(diagnostic: Diagnostic, mft_axis: int)Source

Bases: Diagnostic

Class to compute the average component of mean field theory. Inherits from Diagnostic to ensure all operation overloads work properly.

Parameters:

diagnostic (Diagnostic) – The diagnostic object.
mft_axis (int) – The axis to compute the mean field theory.

load_all() → np.ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:

n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
- ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).
- ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

class osiris_utils.postprocessing.mft.MFT_Diagnostic_Fluctuations(diagnostic: Diagnostic, mft_axis: int)Source

Bases: Diagnostic

delta = f - avg, where avg uses keepdims to broadcast.

load_all() → np.ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:

n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
- ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).
- ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

Field Centering

The FieldCentering module provides tools to convert electromagnetic fields from the Yee mesh locations to cell centers.

FieldCentering_Simulation Class

class osiris_utils.postprocessing.field_centering.FieldCentering_Simulation(simulation: Simulation)Source

Bases: PostProcess

Class to handle the field centering on data.It converts fields from the Osiris yee mesh to the center of the cells. Works as a wrapper for the FieldCentering_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

It only works for periodic boundaries.

Parameters:

simulation (Simulation) – The simulation object.
field (str) – The field to center.

Field centering is a purely spatial operation and can be computed lazily per timestep.

Usage example:

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FieldCentering_Simulation

sim = Simulation("/path/to/input/deck.inp")

centered = FieldCentering_Simulation(sim)

e1c = centered["e1"]
arr = e1c[10]              # centered field at timestep 10
arr_slice = e1c[10, :, :]  # supports spatial slicing

FieldCentering_Diagnostic Class

class osiris_utils.postprocessing.field_centering.FieldCentering_Diagnostic(diagnostic: Diagnostic)Source

Bases: Diagnostic

load_all() → ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:

n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
- ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).
- ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

This diagnostic wraps a field diagnostic and returns its cell-centered version. The implementation assumes periodic boundaries when using shifts/rolls. ``