Simulation Interface

The Simulation class provides a high-level interface for handling OSIRIS simulation data and accessing diagnostics through a consistent, dictionary-like API.

Simulation Class

class osiris_utils.data.simulation.Simulation(input_deck_path: str)Source

Class to handle the simulation data. It is a wrapper for the Diagnostic class.’

Parameters:: input_deck_path (str) – Path to the input deck (It must be in the folder where the simulation was run)

simulation_folderSource

The simulation folder.

Type:: str

speciesSource

The species to analyze.

Type:: Species object

diagnosticsSource

Dictionary to store diagnostics for each quantity when load_all method is used.

Type:: dict

delete_all_diagnostics()Source: Delete all diagnostics.

delete_diagnostic(key)Source: Delete a diagnostic.

__getitem__(key)Source: Get a diagnostic.

delete_all_diagnostics()Source: Delete all diagnostics.

delete_diagnostic(key)Source: Delete a diagnostic.”

__getitem__(key: str) → Diagnostic | Species_HandlerSource

add_diagnostic(diagnostic, name=None)Source

Add a custom diagnostic to the simulation.

Parameters:

diagnostic (Diagnostic or array-like) – The diagnostic to add. If not a Diagnostic object, it will be wrapped in a Diagnostic object.
name (str, optional) – The name to use as the key for accessing the diagnostic. If None, an auto-generated name will be used.

Returns:

str – The name (key) used to store the diagnostic

Example

>>> sim = Simulation('path/to/simulation/input_deck.txt')
>>> nT = sim['electrons']['n'] * sim['electrons']['T11']
>>> sim.add_diagnostic(nT, 'nT')
>>> sim['nT']  # Access the custom diagnostic

property species: list[str]Source

property loaded_diagnostics: dict[str, Diagnostic]Source

A wrapper class that manages access to multiple diagnostic quantities from a single OSIRIS run.

What Simulation gives you

Dictionary-style access to diagnostics:
- sim["e1"] returns a non-species diagnostic (fields, global quantities, etc.)
- sim["electrons"]["charge"] returns a species diagnostic
Centralized metadata:
- simulation folder is inferred from the input deck path
- species list comes from the parsed input deck
Caching of created diagnostics:
- repeated access returns the same wrapper object
- you can delete cached diagnostics to free memory

Key attributes

species: list of species names found in the input deck
loaded_diagnostics: dictionary of cached non-species diagnostics

Note

Simulation does not automatically load data. Diagnostics are lazy by default. Data is read only when you index a diagnostic or call load_all().

Usage examples

Basic access (lazy)

from osiris_utils.data import Simulation

sim = Simulation("/path/to/osiris.inp")

# Non-species diagnostic (no data read yet)
e1 = sim["e1"]

# Species diagnostic
charge = sim["electrons"]["charge"]

Load specific timesteps and slices

Diagnostics support:

time indexing: diag[i] and diag[i:j]
tuple indexing: diag[i, x_slice, y_slice, ...] (time + spatial slicing)

e1 = sim["e1"]

# Single timestep (reads one file)
arr_t10 = e1[10]

# Multiple timesteps (reads only requested files)
arr_t10_20 = e1[10:20]

# Time index + spatial slicing (efficient: reads only slice from disk if supported)
# Example for 2D: (time, x1, x2)
arr_roi = e1[10, :, 100:200]

Loading everything into memory

e1 = sim["e1"]
e1.load_all()          # loads all timesteps into memory
full = e1.data         # numpy array with shape (t, ...)

Cache management

# remove one diagnostic from the simulation cache
sim.delete_diagnostic("e1")

# remove all cached diagnostics
sim.delete_all_diagnostics()

Accessing tracks diagnostics: Tracks_Diagnostic objects do not support operations but they work similarly to the
basic Diagnostic objects.

.. code-block:: python
     # tracks is a Tracks_Diagnostic object
     tracks = sim["electrons"]["tracks"]

     # Print all the data keys you can access
     print(tracks.quants)

     # Access x1 for first 10 particles across all timesteps (lazy)
     print(track_diag["x1"][0:10, :])

     # Load everything into memory
     track_diag.load_all()

     # Access data directly from memory
     print(track_diag["p2"][0:10, :])
     print(track_diag.data["p2"][0:10, :])

Integration with Diagnostics

The workflow is:

sim[quantity] or sim[species][quantity] creates a Diagnostic wrapper (lazy)
Accessing data (via indexing or load_all()) reads from disk
After the first load_all(), the Simulation caches the diagnostic so future access returns the same object
You can remove cached diagnostics explicitly to manage memory

This keeps interactive analysis fast while still scaling to large datasets.

Diagnostic System

The Diagnostic class is the foundation of data handling in osiris_utils. It represents a time series of grid data stored in OSIRIS output files, plus any derived quantities created through operations.

Diagnostic Base Class

class osiris_utils.data.diagnostic.Diagnostic(simulation_folder: str | None = None, species: Species = None, input_deck: InputDeckIO | None = None)Source

Class to handle diagnostics. This is the “base” class of the code. Diagnostics can be loaded from OSIRIS output files, but are also created when performing operations with other diagnostics. Post-processed quantities are also considered diagnostics. This way, we can perform operations with them as well.

Parameters:

species (str) – The species to handle the diagnostics.
simulation_folder (str) – The path to the simulation folder. This is the path to the folder where the input deck is located.
input_deck (str or dict, optional) – The input deck to load the diagnostic attributes. If None, the attributes are loaded from the files. If a string is provided, it is assumed to be the path to the input deck file. If a dict is provided, it is assumed to be the parsed input deck.

speciesSource

The species to handle the diagnostics.

Type:: str

dxSource

The grid spacing in each direction. If the dimension is 1, this is a float. If the dimension is 2 or 3, this is a np.ndarray.

Type:: np.ndarray(float) or float

nxSource

The number of grid points in each direction. If the dimension is 1, this is a int. If the dimension is 2 or 3, this is a np.ndarray.

Type:: np.ndarray(int) or int

xSource

The grid points.

Type:: np.ndarray

dtSource

The time step.

Type:: float

gridSource

The grid boundaries.

Type:: np.ndarray

axisSource

The axis information. Each key is a direction and the value is a dictionary with the keys “name”, “long_name”, “units” and “plot_label”.

Type:: dict

unitsSource

The units of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.

Type:: str

nameSource

The name of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.

Type:: str

labelSource

The label of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.

Type:: str

dimSource

The dimension of the diagnostic.

Type:: int

ndumpSource

The number of steps between dumps (global).

Type:: int

iterSource

The iteration of the first file after file 000000.

Type:: int

maxiterSource

The maximum number of iterations.

Type:: int

tunitsSource

The time units.

Type:: str

pathSource

The path to the diagnostic.

Type:: str

simulation_folderSource

The path to the simulation folder.

Type:: str

all_loadedSource

If the data is already loaded into memory. This is useful to avoid loading the data multiple times.

Type:: bool

dataSource

The diagnostic data. This is created only when the data is loaded into memory.

Type:: np.ndarray

get_quantity(quantity)Source: Get the data for a given quantity.

load_all()Source: Load all data into memory.

load(index)Source: Load data for a given index.

__getitem__(index)Source: Get data for a given index. Does not load the data into memory.

__iter__()Source: Iterate over the data. Does not load the data into memory.

__add__(other)Source: Add two diagnostics.

__sub__(other)Source: Subtract two diagnostics.

__mul__(other)Source: Multiply two diagnostics.

__truediv__(other)Source: Divide two diagnostics.

__pow__(other)Source: Power of a diagnostic.

plot_3d(idx, scale_type='default', boundaries=None)Source: Plot a 3D scatter plot of the diagnostic data.

time(index)Source: Get the time for a given index.

get_quantity(quantity: str) → NoneSource

Get the data for a given quantity.

Parameters:: quantity (str) – The quantity to get the data.

load_all(n_workers: int | None = None, use_parallel: bool | None = None, executor_type: Literal['thread', 'process'] = 'thread') → ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:

n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
- ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).
- ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

unload() → NoneSource: Unload data from memory. This is useful to free memory when the data is not needed anymore.

__len__() → intSource: Return the number of timesteps available.

__getitem__(index: int | slice | tuple) → ndarraySource

Retrieve timestep data with optional spatial slicing.

Parameters:

index (int, slice, or tuple) –

If int, loads full spatial data for that timestep.
If slice, supports start:stop:step for time. Zero-length slices return an empty array of shape (0, …).
If tuple, first element is time index/slice, remaining elements are spatial slices for each dimension. Example: diag[5, :, 100:200] loads timestep 5 with spatial slicing applied. Efficient: HDF5 reads only the requested spatial slice from disk, not the full array.

Returns:

np.ndarray – Array for that timestep (or stacked array for a slice), optionally spatially sliced.

Raises:

IndexError – If the index is out of range or no data generator is available.
RuntimeError – If loading a specific timestep fails.

Examples

>>> data = diag[5]              # Load full spatial data for timestep 5
>>> data = diag[5, :, 100:200]  # Load timestep 5 with spatial slice (only reads slice from disk!)
>>> data = diag[0:10]           # Load timesteps 0-9 (full spatial domain)
>>> data = diag[0:10, :, 50:]   # Load timesteps 0-9 with spatial slice

__add__(other: Diagnostic | float | ndarray) → DiagnosticSource

__sub__(other: Diagnostic | float | ndarray) → DiagnosticSource

__mul__(other: Diagnostic | float | ndarray) → DiagnosticSource

__truediv__(other: Diagnostic | float | ndarray) → DiagnosticSource

__pow__(other: Diagnostic | float | ndarray) → DiagnosticSource: Power operation. Raises the diagnostic data to the power of other. If other is a Diagnostic, it raises each timestep’s data to the corresponding timestep’s power. If other is a scalar or ndarray, it raises all data to that power.

to_h5(savename: str | None = None, index: int | list[int] | None = None, all: bool = False, verbose: bool = False, path: str | None = None) → NoneSource

Save the diagnostic data to HDF5 files.

Parameters:

savename (str, optional) – The name of the HDF5 file. If None, uses the diagnostic name.
index (int, or list of ints, optional) – The index or indices of the data to save.
all (bool, optional) – If True, save all data. Default is False.
verbose (bool, optional) – If True, print messages about the saving process.
path (str, optional) – The path to save the HDF5 files. If None, uses the default save path (in simulation folder).

plot_3d(idx, scale_type: Literal['zero_centered', 'pos', 'neg', 'default'] = 'default', boundaries: ndarray = None)Source

DEPRECATED: Use osiris_utils.vis.plot_3d instead.

Plots a 3D scatter plot of the diagnostic data (grid data).

__str__()Source: String representation of the diagnostic.

__repr__()Source: Detailed string representation of the diagnostic.

property path: strSource

property simulation_folder: strSource

property all_loaded: boolSource

property maxiter: intSource

property type: strSource

property file_list: list[str] | NoneSource: Return the cached list of HDF5 file paths (read-only).

time(index) → list[float | str]Source

attributes_to_save(index: int = 0) → NoneSource: Prints the attributes of the diagnostic.

property dx: floatSource

property nx: int | ndarraySource

property x: ndarray | list[ndarray]Source

property dt: floatSource

property grid: ndarraySource

property axis: list[dict]Source

property units: strSource

property tunits: strSource

property name: strSource

property dim: intSource

property ndump: intSource

property iter: intSource

property data: ndarraySource

property quantity: strSource

property label: strSource

Key features

Lazy loading: reads only the requested timestep(s) from disk
Time slicing: supports diag[i] and diag[i:j]
Spatial slicing: supports tuple indexing (time + spatial slices) when backed by OSIRIS HDF5
Derived diagnostics: arithmetic between diagnostics produces new diagnostics without immediately loading data
Metadata propagation: grid/time metadata is preserved for derived quantities

Common attributes

dx: grid spacing (float for 1D, array-like for >1D)
nx: number of grid points
x: coordinate arrays
dt: simulation timestep (from file metadata)
ndump: dump interval (from input deck when available; defaults to 1 otherwise)
grid: physical bounds of each axis
axis: axis metadata (names/labels/units)
dim: dimensionality (1/2/3)
maxiter: number of timesteps available for this diagnostic
name / label / units: metadata when available
data: full in-memory array after calling load_all()

Usage examples

Create and access diagnostics

In most cases you should use Simulation rather than creating Diagnostic manually:

from osiris_utils.data import Simulation

sim = Simulation("/path/to/osiris.inp")

e1 = sim["e1"]
ne = sim["electrons"]["n"]

# read a single timestep (lazy)
e1_t10 = e1[10]

# read a time slice (lazy, reads only needed files)
ne_t0_20 = ne[0:20]

Derived diagnostics (lazy)

Operations between diagnostics return a new Diagnostic-like object that can still be indexed lazily:

e1 = sim["e1"]
e2 = sim["e2"]
e3 = sim["e3"]

e_mag = (e1**2 + e2**2 + e3**2) ** 0.5

# computed on demand
e_mag_t10 = e_mag[10]

# can also load full result if desired
e_mag.load_all()
full = e_mag.data

Available Diagnostic Quantities in OSIRIS

OSIRIS provides many diagnostics. In osiris_utils, quantities are exposed via their OSIRIS-style names.

Field quantities

e1, e2, e3 — electric field components
b1, b2, b3 — magnetic field components
(and optionally part_* / ext_* variants if present in the output)

Species-dependent grid quantities

charge — charge density
j1, j2, j3 — current density components
q1, q2, q3 — charge flux components
n — convenience alias for density (implemented via OSIRIS charge with sign convention)

Velocity / moment quantities

vfl1, vfl2, vfl3 — flow velocity components
ufl1, ufl2, ufl3 — momentum components
pressure / temperature tensor components: P11, P12, …, T11, …

Phase space quantities

p1x1, p1x2, … (depends on what was configured in the OSIRIS input)

To list what the package recognizes:

from osiris_utils.data.diagnostic import which_quantities
which_quantities()

Memory-efficient processing

For large simulations, prefer lazy access patterns:

Single timestep: diag[i]
Time slice: diag[i:j]
Spatial ROI: diag[i, :, 100:200] (time + spatial slices)

Only use load_all() when you truly need the full time series in memory.

Derived diagnostics and metadata propagation

Derived diagnostics created by arithmetic preserve important metadata:

grid spacing and coordinates (dx, x, grid)
dimensionality (dim) and timestep count (maxiter)
time metadata (dt, ndump)

This makes derived quantities “first-class” diagnostics that can be:

indexed like any other diagnostic
used in further operations
passed into post-processing routines (FFT, derivatives, MFT, etc.)