Simulation Interface
The Simulation class provides a high-level interface for handling OSIRIS simulation data and accessing diagnostics through a consistent, dictionary-like API.
Simulation Class
- class osiris_utils.data.simulation.Simulation(input_deck_path: str)Source
Class to handle the simulation data. It is a wrapper for the Diagnostic class.’
- Parameters:
input_deck_path (str) – Path to the input deck (It must be in the folder where the simulation was run)
- simulation_folderSource
The simulation folder.
- Type:
str
- speciesSource
The species to analyze.
- Type:
Species object
- diagnosticsSource
Dictionary to store diagnostics for each quantity when load_all method is used.
- Type:
dict
- delete_all_diagnostics()Source
Delete all diagnostics.
- delete_diagnostic(key)Source
Delete a diagnostic.
- __getitem__(key)Source
Get a diagnostic.
- delete_all_diagnostics()Source
Delete all diagnostics.
- delete_diagnostic(key)Source
Delete a diagnostic.”
- __getitem__(key: str) Diagnostic | Species_HandlerSource
- add_diagnostic(diagnostic, name=None)Source
Add a custom diagnostic to the simulation.
- Parameters:
diagnostic (Diagnostic or array-like) – The diagnostic to add. If not a Diagnostic object, it will be wrapped in a Diagnostic object.
name (str, optional) – The name to use as the key for accessing the diagnostic. If None, an auto-generated name will be used.
- Returns:
str – The name (key) used to store the diagnostic
Example
>>> sim = Simulation('path/to/simulation/input_deck.txt') >>> nT = sim['electrons']['n'] * sim['electrons']['T11'] >>> sim.add_diagnostic(nT, 'nT') >>> sim['nT'] # Access the custom diagnostic
- property species: list[str]Source
- property loaded_diagnostics: dict[str, Diagnostic]Source
A wrapper class that manages access to multiple diagnostic quantities from a single OSIRIS run.
What Simulation gives you
Dictionary-style access to diagnostics:
sim["e1"]returns a non-species diagnostic (fields, global quantities, etc.)sim["electrons"]["charge"]returns a species diagnostic
Centralized metadata:
simulation folder is inferred from the input deck path
species list comes from the parsed input deck
Caching of created diagnostics:
repeated access returns the same wrapper object
you can delete cached diagnostics to free memory
Key attributes
species: list of species names found in the input deckloaded_diagnostics: dictionary of cached non-species diagnostics
Note
Simulation does not automatically load data. Diagnostics are lazy by default.
Data is read only when you index a diagnostic or call load_all().
Usage examples
Basic access (lazy)
from osiris_utils.data import Simulation
sim = Simulation("/path/to/osiris.inp")
# Non-species diagnostic (no data read yet)
e1 = sim["e1"]
# Species diagnostic
charge = sim["electrons"]["charge"]
Load specific timesteps and slices
Diagnostics support:
time indexing:
diag[i]anddiag[i:j]tuple indexing:
diag[i, x_slice, y_slice, ...](time + spatial slicing)
e1 = sim["e1"]
# Single timestep (reads one file)
arr_t10 = e1[10]
# Multiple timesteps (reads only requested files)
arr_t10_20 = e1[10:20]
# Time index + spatial slicing (efficient: reads only slice from disk if supported)
# Example for 2D: (time, x1, x2)
arr_roi = e1[10, :, 100:200]
Loading everything into memory
e1 = sim["e1"]
e1.load_all() # loads all timesteps into memory
full = e1.data # numpy array with shape (t, ...)
Cache management
# remove one diagnostic from the simulation cache
sim.delete_diagnostic("e1")
# remove all cached diagnostics
sim.delete_all_diagnostics()
Accessing tracks diagnostics: Tracks_Diagnostic objects do not support operations but they work similarly to the
basic Diagnostic objects.
.. code-block:: python
# tracks is a Tracks_Diagnostic object
tracks = sim["electrons"]["tracks"]
# Print all the data keys you can access
print(tracks.quants)
# Access x1 for first 10 particles across all timesteps (lazy)
print(track_diag["x1"][0:10, :])
# Load everything into memory
track_diag.load_all()
# Access data directly from memory
print(track_diag["p2"][0:10, :])
print(track_diag.data["p2"][0:10, :])
Integration with Diagnostics
The workflow is:
sim[quantity]orsim[species][quantity]creates aDiagnosticwrapper (lazy)Accessing data (via indexing or
load_all()) reads from diskAfter the first
load_all(), theSimulationcaches the diagnostic so future access returns the same objectYou can remove cached diagnostics explicitly to manage memory
This keeps interactive analysis fast while still scaling to large datasets.
Diagnostic System
The Diagnostic class is the foundation of data handling in osiris_utils. It represents a time series of grid data stored in OSIRIS output files, plus any derived quantities created through operations.
Diagnostic Base Class
- class osiris_utils.data.diagnostic.Diagnostic(simulation_folder: str | None = None, species: Species = None, input_deck: InputDeckIO | None = None)Source
Class to handle diagnostics. This is the “base” class of the code. Diagnostics can be loaded from OSIRIS output files, but are also created when performing operations with other diagnostics. Post-processed quantities are also considered diagnostics. This way, we can perform operations with them as well.
- Parameters:
species (str) – The species to handle the diagnostics.
simulation_folder (str) – The path to the simulation folder. This is the path to the folder where the input deck is located.
input_deck (str or dict, optional) – The input deck to load the diagnostic attributes. If None, the attributes are loaded from the files. If a string is provided, it is assumed to be the path to the input deck file. If a dict is provided, it is assumed to be the parsed input deck.
- speciesSource
The species to handle the diagnostics.
- Type:
str
- dxSource
The grid spacing in each direction. If the dimension is 1, this is a float. If the dimension is 2 or 3, this is a np.ndarray.
- Type:
np.ndarray(float) or float
- nxSource
The number of grid points in each direction. If the dimension is 1, this is a int. If the dimension is 2 or 3, this is a np.ndarray.
- Type:
np.ndarray(int) or int
- xSource
The grid points.
- Type:
np.ndarray
- dtSource
The time step.
- Type:
float
- gridSource
The grid boundaries.
- Type:
np.ndarray
- axisSource
The axis information. Each key is a direction and the value is a dictionary with the keys “name”, “long_name”, “units” and “plot_label”.
- Type:
dict
- unitsSource
The units of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.
- Type:
str
- nameSource
The name of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.
- Type:
str
- labelSource
The label of the diagnostic. This info may not be available for all diagnostics, ie, diagnostics resulting from operations and postprocessing.
- Type:
str
- dimSource
The dimension of the diagnostic.
- Type:
int
- ndumpSource
The number of steps between dumps (global).
- Type:
int
- iterSource
The iteration of the first file after file 000000.
- Type:
int
- maxiterSource
The maximum number of iterations.
- Type:
int
- tunitsSource
The time units.
- Type:
str
- pathSource
The path to the diagnostic.
- Type:
str
- simulation_folderSource
The path to the simulation folder.
- Type:
str
- all_loadedSource
If the data is already loaded into memory. This is useful to avoid loading the data multiple times.
- Type:
bool
- dataSource
The diagnostic data. This is created only when the data is loaded into memory.
- Type:
np.ndarray
- get_quantity(quantity)Source
Get the data for a given quantity.
- load_all()Source
Load all data into memory.
- load(index)Source
Load data for a given index.
- __getitem__(index)Source
Get data for a given index. Does not load the data into memory.
- __iter__()Source
Iterate over the data. Does not load the data into memory.
- __add__(other)Source
Add two diagnostics.
- __sub__(other)Source
Subtract two diagnostics.
- __mul__(other)Source
Multiply two diagnostics.
- __truediv__(other)Source
Divide two diagnostics.
- __pow__(other)Source
Power of a diagnostic.
- plot_3d(idx, scale_type='default', boundaries=None)Source
Plot a 3D scatter plot of the diagnostic data.
- time(index)Source
Get the time for a given index.
- get_quantity(quantity: str) NoneSource
Get the data for a given quantity.
- Parameters:
quantity (str) – The quantity to get the data.
- load_all(n_workers: int | None = None, use_parallel: bool | None = None, executor_type: Literal['thread', 'process'] = 'thread') ndarraySource
Load all data into memory (all iterations), in a pre-allocated array.
- Parameters:
n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)
use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.
executor_type ({"thread", "process"}, optional) –
Executor backend for parallel loading (default “thread”).
”thread” —
ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at_MAX_IO_WORKERS(default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).”process” —
ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable_load_frame_workerhelper. Requiresself._file_listto be populated (i.e. only valid on the baseDiagnostic, not on post-processing wrappers that override_frame).
- Returns:
data (np.ndarray) – The data for all iterations. Also stored in self._data.
Notes
On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer
executor_type="thread"for the I/O phase and rely onOMP_NUM_THREADS/MKL_NUM_THREADSfor the compute phase.
- unload() NoneSource
Unload data from memory. This is useful to free memory when the data is not needed anymore.
- __len__() intSource
Return the number of timesteps available.
- __getitem__(index: int | slice | tuple) ndarraySource
Retrieve timestep data with optional spatial slicing.
- Parameters:
index (int, slice, or tuple) –
If int, loads full spatial data for that timestep.
If slice, supports start:stop:step for time. Zero-length slices return an empty array of shape (0, …).
If tuple, first element is time index/slice, remaining elements are spatial slices for each dimension. Example: diag[5, :, 100:200] loads timestep 5 with spatial slicing applied. Efficient: HDF5 reads only the requested spatial slice from disk, not the full array.
- Returns:
np.ndarray – Array for that timestep (or stacked array for a slice), optionally spatially sliced.
- Raises:
IndexError – If the index is out of range or no data generator is available.
RuntimeError – If loading a specific timestep fails.
Examples
>>> data = diag[5] # Load full spatial data for timestep 5 >>> data = diag[5, :, 100:200] # Load timestep 5 with spatial slice (only reads slice from disk!) >>> data = diag[0:10] # Load timesteps 0-9 (full spatial domain) >>> data = diag[0:10, :, 50:] # Load timesteps 0-9 with spatial slice
- __add__(other: Diagnostic | float | ndarray) DiagnosticSource
- __sub__(other: Diagnostic | float | ndarray) DiagnosticSource
- __mul__(other: Diagnostic | float | ndarray) DiagnosticSource
- __truediv__(other: Diagnostic | float | ndarray) DiagnosticSource
- __pow__(other: Diagnostic | float | ndarray) DiagnosticSource
Power operation. Raises the diagnostic data to the power of other. If other is a Diagnostic, it raises each timestep’s data to the corresponding timestep’s power. If other is a scalar or ndarray, it raises all data to that power.
- to_h5(savename: str | None = None, index: int | list[int] | None = None, all: bool = False, verbose: bool = False, path: str | None = None) NoneSource
Save the diagnostic data to HDF5 files.
- Parameters:
savename (str, optional) – The name of the HDF5 file. If None, uses the diagnostic name.
index (int, or list of ints, optional) – The index or indices of the data to save.
all (bool, optional) – If True, save all data. Default is False.
verbose (bool, optional) – If True, print messages about the saving process.
path (str, optional) – The path to save the HDF5 files. If None, uses the default save path (in simulation folder).
- plot_3d(idx, scale_type: Literal['zero_centered', 'pos', 'neg', 'default'] = 'default', boundaries: ndarray = None)Source
DEPRECATED: Use osiris_utils.vis.plot_3d instead.
Plots a 3D scatter plot of the diagnostic data (grid data).
- __str__()Source
String representation of the diagnostic.
- __repr__()Source
Detailed string representation of the diagnostic.
- property path: strSource
- property simulation_folder: strSource
- property all_loaded: boolSource
- property maxiter: intSource
- property type: strSource
- property file_list: list[str] | NoneSource
Return the cached list of HDF5 file paths (read-only).
- time(index) list[float | str]Source
- attributes_to_save(index: int = 0) NoneSource
Prints the attributes of the diagnostic.
- property dx: floatSource
- property nx: int | ndarraySource
- property x: ndarray | list[ndarray]Source
- property dt: floatSource
- property grid: ndarraySource
- property axis: list[dict]Source
- property units: strSource
- property tunits: strSource
- property name: strSource
- property dim: intSource
- property ndump: intSource
- property iter: intSource
- property data: ndarraySource
- property quantity: strSource
- property label: strSource
Key features
Lazy loading: reads only the requested timestep(s) from disk
Time slicing: supports
diag[i]anddiag[i:j]Spatial slicing: supports tuple indexing (time + spatial slices) when backed by OSIRIS HDF5
Derived diagnostics: arithmetic between diagnostics produces new diagnostics without immediately loading data
Metadata propagation: grid/time metadata is preserved for derived quantities
Common attributes
dx: grid spacing (float for 1D, array-like for >1D)nx: number of grid pointsx: coordinate arraysdt: simulation timestep (from file metadata)ndump: dump interval (from input deck when available; defaults to 1 otherwise)grid: physical bounds of each axisaxis: axis metadata (names/labels/units)dim: dimensionality (1/2/3)maxiter: number of timesteps available for this diagnosticname/label/units: metadata when availabledata: full in-memory array after callingload_all()
Usage examples
Create and access diagnostics
In most cases you should use Simulation rather than creating Diagnostic manually:
from osiris_utils.data import Simulation
sim = Simulation("/path/to/osiris.inp")
e1 = sim["e1"]
ne = sim["electrons"]["n"]
# read a single timestep (lazy)
e1_t10 = e1[10]
# read a time slice (lazy, reads only needed files)
ne_t0_20 = ne[0:20]
Derived diagnostics (lazy)
Operations between diagnostics return a new Diagnostic-like object that can still be indexed lazily:
e1 = sim["e1"]
e2 = sim["e2"]
e3 = sim["e3"]
e_mag = (e1**2 + e2**2 + e3**2) ** 0.5
# computed on demand
e_mag_t10 = e_mag[10]
# can also load full result if desired
e_mag.load_all()
full = e_mag.data
Available Diagnostic Quantities in OSIRIS
OSIRIS provides many diagnostics. In osiris_utils, quantities are exposed via their OSIRIS-style names.
Field quantities
e1,e2,e3— electric field componentsb1,b2,b3— magnetic field components(and optionally
part_*/ext_*variants if present in the output)
Species-dependent grid quantities
charge— charge densityj1,j2,j3— current density componentsq1,q2,q3— charge flux componentsn— convenience alias for density (implemented via OSIRIS charge with sign convention)
Velocity / moment quantities
vfl1,vfl2,vfl3— flow velocity componentsufl1,ufl2,ufl3— momentum componentspressure / temperature tensor components:
P11,P12, …,T11, …
Phase space quantities
p1x1,p1x2, … (depends on what was configured in the OSIRIS input)
To list what the package recognizes:
from osiris_utils.data.diagnostic import which_quantities
which_quantities()
Memory-efficient processing
For large simulations, prefer lazy access patterns:
Single timestep:
diag[i]Time slice:
diag[i:j]Spatial ROI:
diag[i, :, 100:200](time + spatial slices)
Only use load_all() when you truly need the full time series in memory.
Derived diagnostics and metadata propagation
Derived diagnostics created by arithmetic preserve important metadata:
grid spacing and coordinates (
dx,x,grid)dimensionality (
dim) and timestep count (maxiter)time metadata (
dt,ndump)
This makes derived quantities “first-class” diagnostics that can be:
indexed like any other diagnostic
used in further operations
passed into post-processing routines (FFT, derivatives, MFT, etc.)