Simulation Interface

The Simulation class provides a high-level interface for handling OSIRIS simulation data and accessing diagnostics through a consistent, dictionary-like API.

Simulation Class

class osiris_utils.data.simulation.Simulation(input_deck_path: str)Source

Class to handle the simulation data. It is a wrapper for the Diagnostic class.’

Parameters:

input_deck_path (str) – Path to the input deck (It must be in the folder where the simulation was run)

simulation_folderSource

The simulation folder.

Type:

str

speciesSource

The species to analyze.

Type:

Species object

diagnosticsSource

Dictionary to store diagnostics for each quantity when load_all method is used.

Type:

dict

delete_all_diagnostics()Source

Delete all diagnostics.

delete_diagnostic(key)Source

Delete a diagnostic.

__getitem__(key)Source

Get a diagnostic.

delete_all_diagnostics()Source

Delete all diagnostics.

delete_diagnostic(key)Source

Delete a diagnostic.”

__getitem__(key: str) Diagnostic | Species_HandlerSource
add_diagnostic(diagnostic, name=None)Source

Add a custom diagnostic to the simulation.

Parameters:
  • diagnostic (Diagnostic or array-like) – The diagnostic to add. If not a Diagnostic object, it will be wrapped in a Diagnostic object.

  • name (str, optional) – The name to use as the key for accessing the diagnostic. If None, an auto-generated name will be used.

Returns:

str – The name (key) used to store the diagnostic

Example

>>> sim = Simulation('path/to/simulation/input_deck.txt')
>>> nT = sim['electrons']['n'] * sim['electrons']['T11']
>>> sim.add_diagnostic(nT, 'nT')
>>> sim['nT']  # Access the custom diagnostic
property species: list[str]Source
property loaded_diagnostics: dict[str, Diagnostic]Source

A wrapper class that manages access to multiple diagnostic quantities from a single OSIRIS run.

What Simulation gives you

  • Dictionary-style access to diagnostics:

    • sim["e1"] returns a non-species diagnostic (fields, global quantities, etc.)

    • sim["electrons"]["charge"] returns a species diagnostic

  • Centralized metadata:

    • simulation folder is inferred from the input deck path

    • species list comes from the parsed input deck

  • Caching of created diagnostics:

    • repeated access returns the same wrapper object

    • you can delete cached diagnostics to free memory

Key attributes

  • species: list of species names found in the input deck

  • loaded_diagnostics: dictionary of cached non-species diagnostics

Note

Simulation does not automatically load data. Diagnostics are lazy by default. Data is read only when you index a diagnostic or call load_all().

Usage examples

Basic access (lazy)

from osiris_utils.data import Simulation

sim = Simulation("/path/to/osiris.inp")

# Non-species diagnostic (no data read yet)
e1 = sim["e1"]

# Species diagnostic
charge = sim["electrons"]["charge"]

Load specific timesteps and slices

Diagnostics support:

  • time indexing: diag[i] and diag[i:j]

  • tuple indexing: diag[i, x_slice, y_slice, ...] (time + spatial slicing)

e1 = sim["e1"]

# Single timestep (reads one file)
arr_t10 = e1[10]

# Multiple timesteps (reads only requested files)
arr_t10_20 = e1[10:20]

# Time index + spatial slicing (efficient: reads only slice from disk if supported)
# Example for 2D: (time, x1, x2)
arr_roi = e1[10, :, 100:200]

Loading everything into memory

e1 = sim["e1"]
e1.load_all()          # loads all timesteps into memory
full = e1.data         # numpy array with shape (t, ...)

Cache management

# remove one diagnostic from the simulation cache
sim.delete_diagnostic("e1")

# remove all cached diagnostics
sim.delete_all_diagnostics()

Accessing tracks diagnostics: Tracks_Diagnostic objects do not support operations but they work similarly to the
basic Diagnostic objects.

.. code-block:: python
     # tracks is a Tracks_Diagnostic object
     tracks = sim["electrons"]["tracks"]

     # Print all the data keys you can access
     print(tracks.quants)

     # Access x1 for first 10 particles across all timesteps (lazy)
     print(track_diag["x1"][0:10, :])

     # Load everything into memory
     track_diag.load_all()

     # Access data directly from memory
     print(track_diag["p2"][0:10, :])
     print(track_diag.data["p2"][0:10, :])

Integration with Diagnostics

The workflow is:

  1. sim[quantity] or sim[species][quantity] creates a Diagnostic wrapper (lazy)

  2. Accessing data (via indexing or load_all()) reads from disk

  3. After the first load_all(), the Simulation caches the diagnostic so future access returns the same object

  4. You can remove cached diagnostics explicitly to manage memory

This keeps interactive analysis fast while still scaling to large datasets.

Diagnostic System

The Diagnostic class is the foundation of data handling in osiris_utils. It represents a time series of grid data stored in OSIRIS output files, plus any derived quantities created through operations.

Diagnostic Base Class

class osiris_utils.data.diagnostic.Diagnostic(simulation_folder: str | None = None, species: Species = None, input_deck: InputDeckIO | None = None)Source

Class to handle diagnostics. This is the “base” class of the code. Diagnostics can be loaded from OSIRIS output files, but are also created when performing operations with other diagnostics. Post-processed quantities are also considered diagnostics. This way, we can perform operations with them as well.

Parameters:
  • species (str) – The species to handle the diagnostics.

  • simulation_folder (str) – The path to the simulation folder. This is the path to the folder where the input deck is located.

  • input_deck (str or dict, optional) – The input deck to load the diagnostic attributes. If None, the attributes are loaded from the files. If a string is provided, it is assumed to be the path to the input deck file. If a dict is provided, it is assumed to be the parsed input deck.

speciesSource

The species to handle the diagnostics.

Type:

str

dxSource

The grid spacing in each direction. If the dimension is 1, this is a float. If the dimension is 2 or 3, this is a np.ndarray.

Type:

np.ndarray(float) or float

nxSource

The number of grid points in each direction. If the dimension is 1, this is a int. If the dimension is 2 or 3, this is a np.ndarray.

Type:

np.ndarray(int) or int

xSource

The grid points.

Type:

np.ndarray

dtSource

The time step.

Type:

float

gridSource

The grid boundaries.

Type:

np.ndarray

axisSource

The axis information. Each key is a direction and the value is a dictionary with the keys “name”, “long_name”, “units” and “plot_label”.

Type:

dict

unitsSource

The units of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.

Type:

str

nameSource

The name of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.

Type:

str

labelSource

The label of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.

Type:

str

dimSource

The dimension of the diagnostic.

Type:

int

ndumpSource

The number of steps between dumps (global).

Type:

int

iterSource

The iteration of the first file after file 000000.

Type:

int

maxiterSource

The maximum number of iterations.

Type:

int

tunitsSource

The time units.

Type:

str

pathSource

The path to the diagnostic.

Type:

str

simulation_folderSource

The path to the simulation folder.

Type:

str

all_loadedSource

If the data is already loaded into memory. This is useful to avoid loading the data multiple times.

Type:

bool

dataSource

The diagnostic data. This is created only when the data is loaded into memory.

Type:

np.ndarray

get_quantity(quantity)Source

Get the data for a given quantity.

load_all()Source

Load all data into memory.

load(index)Source

Load data for a given index.

__getitem__(index)Source

Get data for a given index. Does not load the data into memory.

__iter__()Source

Iterate over the data. Does not load the data into memory.

__add__(other)Source

Add two diagnostics.

__sub__(other)Source

Subtract two diagnostics.

__mul__(other)Source

Multiply two diagnostics.

__truediv__(other)Source

Divide two diagnostics.

__pow__(other)Source

Power of a diagnostic.

plot_3d(idx, scale_type='default', boundaries=None)Source

Plot a 3D scatter plot of the diagnostic data.

time(index)Source

Get the time for a given index.

get_quantity(quantity: str) NoneSource

Get the data for a given quantity.

Parameters:

quantity (str) – The quantity to get the data.

load_all(n_workers: int | None = None, use_parallel: bool | None = None, executor_type: Literal['thread', 'process'] = 'thread') ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:
  • n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)

  • use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.

  • executor_type ({"thread", "process"}, optional) –

    Executor backend for parallel loading (default “thread”).

    • ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).

    • ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

unload() NoneSource

Unload data from memory. This is useful to free memory when the data is not needed anymore.

__len__() intSource

Return the number of timesteps available.

__getitem__(index: int | slice | tuple) ndarraySource

Retrieve timestep data with optional spatial slicing.

Parameters:

index (int, slice, or tuple) –

  • If int, loads full spatial data for that timestep.

  • If slice, supports start:stop:step for time. Zero-length slices return an empty array of shape (0, …).

  • If tuple, first element is time index/slice, remaining elements are spatial slices for each dimension. Example: diag[5, :, 100:200] loads timestep 5 with spatial slicing applied. Efficient: HDF5 reads only the requested spatial slice from disk, not the full array.

Returns:

np.ndarray – Array for that timestep (or stacked array for a slice), optionally spatially sliced.

Raises:
  • IndexError – If the index is out of range or no data generator is available.

  • RuntimeError – If loading a specific timestep fails.

Examples

>>> data = diag[5]              # Load full spatial data for timestep 5
>>> data = diag[5, :, 100:200]  # Load timestep 5 with spatial slice (only reads slice from disk!)
>>> data = diag[0:10]           # Load timesteps 0-9 (full spatial domain)
>>> data = diag[0:10, :, 50:]   # Load timesteps 0-9 with spatial slice
__add__(other: Diagnostic | float | ndarray) DiagnosticSource
__sub__(other: Diagnostic | float | ndarray) DiagnosticSource
__mul__(other: Diagnostic | float | ndarray) DiagnosticSource
__truediv__(other: Diagnostic | float | ndarray) DiagnosticSource
__pow__(other: Diagnostic | float | ndarray) DiagnosticSource

Power operation. Raises the diagnostic data to the power of other. If other is a Diagnostic, it raises each timestep’s data to the corresponding timestep’s power. If other is a scalar or ndarray, it raises all data to that power.

to_h5(savename: str | None = None, index: int | list[int] | None = None, all: bool = False, verbose: bool = False, path: str | None = None) NoneSource

Save the diagnostic data to HDF5 files.

Parameters:
  • savename (str, optional) – The name of the HDF5 file. If None, uses the diagnostic name.

  • index (int, or list of ints, optional) – The index or indices of the data to save.

  • all (bool, optional) – If True, save all data. Default is False.

  • verbose (bool, optional) – If True, print messages about the saving process.

  • path (str, optional) – The path to save the HDF5 files. If None, uses the default save path (in simulation folder).

plot_3d(idx, scale_type: Literal['zero_centered', 'pos', 'neg', 'default'] = 'default', boundaries: ndarray = None)Source

DEPRECATED: Use osiris_utils.vis.plot_3d instead.

Plots a 3D scatter plot of the diagnostic data (grid data).

__str__()Source

String representation of the diagnostic.

__repr__()Source

Detailed string representation of the diagnostic.

property path: strSource
property simulation_folder: strSource
property all_loaded: boolSource
property maxiter: intSource
property type: strSource
property file_list: list[str] | NoneSource

Return the cached list of HDF5 file paths (read-only).

time(index) list[float | str]Source
attributes_to_save(index: int = 0) NoneSource

Prints the attributes of the diagnostic.

property dx: floatSource
property nx: int | ndarraySource
property x: ndarray | list[ndarray]Source
property dt: floatSource
property grid: ndarraySource
property axis: list[dict]Source
property units: strSource
property tunits: strSource
property name: strSource
property dim: intSource
property ndump: intSource
property iter: intSource
property data: ndarraySource
property quantity: strSource
property label: strSource

Key features

  • Lazy loading: reads only the requested timestep(s) from disk

  • Time slicing: supports diag[i] and diag[i:j]

  • Spatial slicing: supports tuple indexing (time + spatial slices) when backed by OSIRIS HDF5

  • Derived diagnostics: arithmetic between diagnostics produces new diagnostics without immediately loading data

  • Metadata propagation: grid/time metadata is preserved for derived quantities

Common attributes

  • dx: grid spacing (float for 1D, array-like for >1D)

  • nx: number of grid points

  • x: coordinate arrays

  • dt: simulation timestep (from file metadata)

  • ndump: dump interval (from input deck when available; defaults to 1 otherwise)

  • grid: physical bounds of each axis

  • axis: axis metadata (names/labels/units)

  • dim: dimensionality (1/2/3)

  • maxiter: number of timesteps available for this diagnostic

  • name / label / units: metadata when available

  • data: full in-memory array after calling load_all()

Usage examples

Create and access diagnostics

In most cases you should use Simulation rather than creating Diagnostic manually:

from osiris_utils.data import Simulation

sim = Simulation("/path/to/osiris.inp")

e1 = sim["e1"]
ne = sim["electrons"]["n"]

# read a single timestep (lazy)
e1_t10 = e1[10]

# read a time slice (lazy, reads only needed files)
ne_t0_20 = ne[0:20]

Derived diagnostics (lazy)

Operations between diagnostics return a new Diagnostic-like object that can still be indexed lazily:

e1 = sim["e1"]
e2 = sim["e2"]
e3 = sim["e3"]

e_mag = (e1**2 + e2**2 + e3**2) ** 0.5

# computed on demand
e_mag_t10 = e_mag[10]

# can also load full result if desired
e_mag.load_all()
full = e_mag.data

Available Diagnostic Quantities in OSIRIS

OSIRIS provides many diagnostics. In osiris_utils, quantities are exposed via their OSIRIS-style names.

Field quantities

  • e1, e2, e3 — electric field components

  • b1, b2, b3 — magnetic field components

  • (and optionally part_* / ext_* variants if present in the output)

Species-dependent grid quantities

  • charge — charge density

  • j1, j2, j3 — current density components

  • q1, q2, q3 — charge flux components

  • n — convenience alias for density (implemented via OSIRIS charge with sign convention)

Velocity / moment quantities

  • vfl1, vfl2, vfl3 — flow velocity components

  • ufl1, ufl2, ufl3 — momentum components

  • pressure / temperature tensor components: P11, P12, …, T11, …

Phase space quantities

  • p1x1, p1x2, … (depends on what was configured in the OSIRIS input)

To list what the package recognizes:

from osiris_utils.data.diagnostic import which_quantities
which_quantities()

Memory-efficient processing

For large simulations, prefer lazy access patterns:

  1. Single timestep: diag[i]

  2. Time slice: diag[i:j]

  3. Spatial ROI: diag[i, :, 100:200] (time + spatial slices)

Only use load_all() when you truly need the full time series in memory.

Derived diagnostics and metadata propagation

Derived diagnostics created by arithmetic preserve important metadata:

  • grid spacing and coordinates (dx, x, grid)

  • dimensionality (dim) and timestep count (maxiter)

  • time metadata (dt, ndump)

This makes derived quantities “first-class” diagnostics that can be:

  • indexed like any other diagnostic

  • used in further operations

  • passed into post-processing routines (FFT, derivatives, MFT, etc.)