Post-Processing Framework

The osiris_utils package provides a lightweight framework for post-processing OSIRIS simulation data. Post-processors are implemented as wrappers around the existing Simulation / Diagnostic interface so that:

  • indexing works the same way everywhere (lazy, sliceable, tuple-sliceable)

  • arithmetic between diagnostics keeps working (+, -, *, /, etc.)

  • post-processors can be chained when it makes sense (e.g., derivatives)

This page documents the framework and the main post-processors currently available.

PostProcess Base Class

class osiris_utils.postprocessing.postprocess.PostProcess(name: str, simulation: Any)Source

Bases: object

Base class for post-processing wrappers (FFT_Simulation, Derivative_Simulation, etc).

It wraps a Simulation-like object (Simulation or another PostProcess) and exposes:
  • .species

  • .loaded_diagnostics

  • .add_diagnostic()

Subclasses implement __getitem__ to return processed diagnostics.

property loaded_diagnostics: dictSource

Delegate to wrapped simulation if it has loaded_diagnostics. Otherwise return empty dict.

add_diagnostic(diagnostic: Diagnostic, name: str | None = None) strSource

Delegate add_diagnostic downwards. This lets you do: sim_fft.add_diagnostic(…)

The PostProcess class is the base class for all simulation-level post-processors.

Design goals

A post-processor should behave like a Simulation:

  • pp["e3"] returns a diagnostic-like object (usually a Diagnostic subclass)

  • pp["electrons"]["n"] returns a species diagnostic

  • it caches created wrappers so repeated access is cheap

Most post-processors follow the same pattern:

  1. A simulation wrapper (Something_Simulation) inheriting from PostProcess

  2. A diagnostic wrapper (Something_Diagnostic) inheriting from Diagnostic

  3. An optional species handler (Something_Species_Handler) for species-dependent quantities

Integration with OSIRIS Diagnostics

Post-processors are designed to work seamlessly with the OSIRIS diagnostic system:

  • Chaining (when appropriate): some post-processors can wrap other post-processors (e.g., derivatives of already-derived quantities).

  • Compatibility with visualization: results remain Diagnostic-like and can be plotted the same way.

  • Uniform indexing contract:

    • diag[i] returns a single timestep np.ndarray (shape (x, y, z) depending on dimension)

    • diag[i:j] returns a stacked time array (shape (t, x, y, z))

    • diag[i, :, 100:200] uses tuple indexing (time index + spatial slices)

Important implementation note

The base Diagnostic.__getitem__ already implements:

  • int and slice time indexing

  • tuple indexing (time + spatial slices)

  • fast path for in-memory data

So post-processed diagnostics should generally implement:

  • load_all() to build self._data (shape includes time)

  • _frame(index, data_slice=None) to return a single timestep lazily (shape excludes time)

You usually do not need to override __getitem__ in post-processed diagnostics.

Derivative Post-Processing

The Derivative_Simulation module provides tools for computing time and spatial derivatives of diagnostics.

Derivative_Simulation Class

class osiris_utils.postprocessing.derivative.Derivative_Simulation(simulation: Any, deriv_type: str, axis: int | tuple[int, int] | None = None, order: int = 4, stencil: Iterable[int] | None = None, deriv_order: int = 1, periodic: bool = False)Source

Bases: PostProcess

Class to compute the derivative of a diagnostic. Works as a wrapper for the Derivative_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

This class can be initialized with either a Simulation object or another Derivative_Simulation object, allowing for chaining derivatives (e.g., second derivative = Derivative_Simulation(Derivative_Simulation(…))).

Parameters:
  • simulation (Simulation or Derivative_Simulation) – The simulation object or another derivative simulation object.

  • deriv_type (str) – The type of derivative to compute. Options are: - ‘t’ for time derivative. - ‘x1’ for first spatial derivative. - ‘x2’ for second spatial derivative. - ‘x3’ for third spatial derivative. - ‘xx’ for second spatial derivative in two axis. - ‘xt’ for mixed derivative in time and one spatial axis. - ‘tx’ for mixed derivative in one spatial axis and time.

  • axis (int or tuple) – The axis to compute the derivative. Only used for ‘xx’, ‘xt’ and ‘tx’ types.

  • order (int) – Scheme selector (2 or 4). Ignored if stencil is provided.

  • stencil (Iterable[int] or None) – Integer offsets (in grid steps), e.g. [-2,-1,0,1,2] or [0,1,2,3].

  • deriv_order (int) – Derivative order (1 for first derivative, 2 for second derivative, …).

  • periodic (bool) – If True, derivatives are computed with periodic boundary conditions.

property species: listSource

Return list of species, making this compatible with Simulation interface

property loaded_diagnostics: dictSource

Return loaded diagnostics, making this compatible with Simulation interface

add_diagnostic(diagnostic: Diagnostic, name: str | None = None) strSource

Add a custom diagnostic to the derivative simulation.

Derivative_Simulation behaves like an operator acting on all diagnostics inside a simulation:

  • wraps a Simulation (or a compatible post-process wrapper)

  • returns derivative diagnostics on demand

  • supports species and non-species diagnostics

  • supports lazy evaluation and efficient spatial slicing

  • may be chained (e.g., derivative of a derivative)

Derivative types

  • t : time derivative \(\\partial/\\partial t\)

  • x1 : spatial derivative \(\\partial/\\partial x_1\)

  • x2 : spatial derivative \(\\partial/\\partial x_2\)

  • x3 : spatial derivative \(\\partial/\\partial x_3\)

  • xx : second derivative in two spatial axes (mixed or repeated), e.g. \(\\partial^2/\\partial x_i\\partial x_j\)

  • xt : \(\\partial/\\partial x\) of \(\\partial/\\partial t\)

  • tx : \(\\partial/\\partial t\) of \(\\partial/\\partial x\)

Usage examples

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import Derivative_Simulation

sim = Simulation("/path/to/input/deck.inp")

# Spatial derivative d/dx1 of every diagnostic
dx1 = Derivative_Simulation(sim, "x1")

dE1_dx1 = dx1["e1"]
dvfl1_dx1 = dx1["electrons"]["vfl1"]

# Single timestep
arr = dE1_dx1[5]

# Time slice
arr_t = dE1_dx1[5:10]

# Spatial slicing (efficient if underlying diagnostic supports it)
arr_slice = dE1_dx1[5, :, 100:200]

# Full in-memory derivative (use only when needed)
dE1_dx1.load_all()

Implementation details and accuracy

  • Spatial derivatives use finite differences (2nd or 4th order depending on configuration).

  • Time derivatives must account for OSIRIS ndump:

    effective \(\\Delta t = dt \\times ndump\)

  • For mixed derivatives (tx, xt), the implementation applies the corresponding operator ordering.

Performance tips

  • Prefer indexing (lazy computation) when exploring data.

  • Use spatial slices to reduce IO and compute time.

  • Call load_all() only when you truly need the full time history in memory.

Derivative_Diagnostic Class

class osiris_utils.postprocessing.derivative.Derivative_Diagnostic(diagnostic: Diagnostic, deriv_type: str, axis: int | tuple[int, int] | None = None, order: int = 4, stencil: Iterable[int] | None = None, deriv_order: int = 1, periodic: bool = False)Source

Bases: Diagnostic

Auxiliar class to compute the derivative of a diagnostic, for it to be similar in behavior to a Diagnostic object. Inherits directly from Diagnostic to ensure all operation overloads work properly.

Parameters:
  • diagnostic (Diagnostic) – The diagnostic object.

  • deriv_type (str) – The type of derivative to compute. Options are: ‘t’, ‘x1’, ‘x2’, ‘x3’, ‘xx’, ‘xt’ and ‘tx’.

  • axis (int or tuple) – The axis to compute the derivative. Only used for ‘xx’, ‘xt’ and ‘tx’ types

  • order (int) – Scheme selector (2 or 4). Ignored if stencil is provided.

  • stencil (Iterable[int] or None) – Integer offsets (in grid steps), e.g. [-2,-1,0,1,2] or [0,1,2,3].

  • deriv_order (int) – Derivative order (1 for first derivative, 2 for second derivative, …). Only used if stencil is provided.

  • periodic (bool) – If True, derivatives are computed with periodic boundary conditions.

load_all()Source

Load all the data and compute the derivative.

__getitem__(index)Source

Get data at a specific index.

load_all(n_workers: int | None = None) ndarraySource

Load all data and compute the derivative.

Parameters:

n_workers (int or None) –

Number of ProcessPoolExecutor workers for spatial derivatives (deriv_type in {"x1", "x2", "x3"}). Spatial derivatives are independent across the time axis, so the data is split into n_workers chunks along axis 0 and processed in parallel.

  • None (default) — single-process (safe everywhere, uses NumPy’s internal threading via MKL/OpenBLAS).

  • > 1 — activates chunked ProcessPoolExecutor. Uses the forkserver multiprocessing context, which is MPI-safe on HPC nodes (unlike fork, which can deadlock with MPI). Set NUMBA_NUM_THREADS / OMP_NUM_THREADS to 1 per worker when also using Numba to avoid thread over-subscription.

Time and mixed derivatives ("t", "xx", "xt", "tx") always run single-process because the stencil spans neighbouring time steps.

A derivative diagnostic is a Diagnostic wrapper that computes derivatives lazily using _frame and can compute eagerly using load_all. It preserves metadata (grid spacing, dimensions, etc.) from the original diagnostic and remains compatible with diagnostic arithmetic.

Spectral Analysis with Fast Fourier Transform

The FFT_Simulation module provides tools for computing power spectra using the Fast Fourier Transform (FFT).

FFT_Simulation Class

class osiris_utils.postprocessing.fft.FFT_Simulation(simulation: Simulation, fft_axis: int | list[int])Source

Bases: PostProcess

Class to handle the Fast Fourier Transform on data. Works as a wrapper for the FFT_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

Parameters:
  • simulation (Simulation) – The simulation object.

  • fft_axis (int or list of int) – The axis or axes to compute the FFT.

process(diagnostic: Diagnostic) FFT_DiagnosticSource

Apply FFT to a diagnostic

FFT_Simulation is a simulation wrapper that returns FFT_Diagnostic objects.

Key points about FFT axes

  • OSIRIS diagnostics loaded with load_all() have shape (t, x1, x2, x3) (depending on dimension). In that case: - axis 0 is time - axes 1,2,3 are spatial

  • A single timestep returned by indexing has no time dimension, only spatial: - axes become (x1, x2, x3) → numpy axes 0,1,2

Important constraint

  • If the FFT includes the time axis (axis 0), you cannot compute it from a single timestep. Time-domain FFT requires the full time series, therefore it requires load_all() on the FFT diagnostic.

  • Spatial FFTs can be computed per-timestep lazily.

Usage example

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FFT_Simulation

sim = Simulation("/path/to/input/deck.inp")

# Spatial spectrum at each timestep (FFT along x1)
fft_x1 = FFT_Simulation(sim, 1)
e3_k = fft_x1["e3"]
k_spectrum_t10 = e3_k[10]  # OK: spatial FFT at timestep 10

# Dispersion-like spectrum (time + space) requires load_all
fft_wk = FFT_Simulation(sim, (0, 1))
e3_wk = fft_wk["e3"]
e3_wk.load_all()           # required
P = e3_wk.data             # power spectrum |FFT|^2

FFT_Diagnostic Class

class osiris_utils.postprocessing.fft.FFT_Diagnostic(diagnostic: Diagnostic, fft_axis: int | list[int], *, window: str | None = 'hann', window_for_spatial: str | None = None, detrend: str | None = 'mean', normalize: str = 'ortho', assume_periodic: bool = False)Source

Bases: Diagnostic

Auxiliar class to compute the FFT of a diagnostic, for it to be similar in behavior to a Diagnostic object. Inherits directly from Diagnostic to ensure all operation overloads work properly.

Parameters:
  • diagnostic (Diagnostic) – The diagnostic to compute the FFT.

  • axis (int) – The axis to compute the FFT.

  • window (str or None, optional) – The window to apply before computing the FFT. Can be “hann” or None (default “hann”).

  • detrend (str or None, optional) – The detrending method to apply before computing the FFT. Can be “mean” or None (default “mean”).

  • normalize (str, optional) – The normalization method for the FFT. Can be “none”, “ortho”, or “density” (default “ortho”).

  • assume_periodic (bool, optional) – Whether to assume the data is periodic when choosing default windowing for spatial FFTs (default True). If True, no spatial window is applied by default; if False, the same window is

load_all() np.ndarraySource

Load all data and compute FFT (possibly including time axis). Returns power spectrum |FFT|^2 stored in self._data.

_frame(index: int, data_slice: tuple | None = None) np.ndarraySource

Per-timestep (lazy) FFT. Only allowed for spatial FFT. If fft_axis includes time (0), user must call load_all(). Returns power spectrum |FFT|^2.

omega() np.ndarraySource

Get the angular frequency array for the FFT along the time dimension.

k(axis: int | None = None) np.ndarray | dict[int, np.ndarray]Source

Get the wavenumber array for the FFT along spatial dimension(s).

load_all() ndarraySource

Load all data and compute FFT (possibly including time axis). Returns power spectrum |FFT|^2 stored in self._data.

omega() ndarraySource

Get the angular frequency array for the FFT along the time dimension.

k(axis: int | None = None) ndarray | dict[int, ndarray]Source

Get the wavenumber array for the FFT along spatial dimension(s).

Parameters:

axis (int or None, optional) – The spatial axis to compute wavenumber for (1, 2, or 3). If None, returns wavenumbers for all spatial axes in fft_axis.

Returns:

np.ndarray or dict – If axis is specified: wavenumber array for that axis. If axis is None: dictionary mapping axis -> wavenumber array.

Notes

When load_all() is used, time axis is 0 and spatial axes are 1,2,3. When accessing single timesteps, spatial axes are 0,1,2.

property kmax: float | ndarraySource

Nyquist wavenumber (rad/m). Upper limit of representable k.

property omega_maxSource

Nyquist angular frequency (rad/s). Upper limit of representable ω.

A specialized diagnostic representing an FFT-based power spectrum.

Behavior

  • load_all() computes the FFT over the requested axes (including time if axis 0 is included).

  • _frame(index, data_slice=None) computes a per-timestep FFT only for spatial axes.

  • The returned data is a power spectrum (typically \(|\\mathrm{FFT}|^2\)) and is shifted using fftshift so that zero frequency / wavenumber is centered.

Frequency / wavenumber arrays

  • Use omega() for the time-frequency axis (when axis 0 was transformed and data is loaded).

  • Use k(axis=...) for spatial wavenumbers.

  • kmax reports the Nyquist wavenumber \(\\pi/\\Delta x\) for the relevant spatial axes.

Example: Dispersion relation (ω–k)

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FFT_Simulation
import numpy as np
import matplotlib.pyplot as plt

sim = Simulation("/path/to/input/deck.inp")

fft = FFT_Simulation(sim, (0, 1))
e1_fft = fft["e1"]
e1_fft.load_all()

omega = e1_fft.omega()      # angular frequency axis (shifted)
k1 = e1_fft.k(1)            # k for x1 axis (shifted)

P = np.log10(e1_fft.data + 1e-30)  # safer for plotting

plt.figure()
plt.pcolormesh(k1, omega, P, shading="auto")
plt.xlabel("k1")
plt.ylabel("ω")
plt.title("Dispersion-like spectrum: E1")
plt.show()

Mean Field Theory Analysis

The Mean Field Theory (MFT) module provides tools for decomposing diagnostics into an average (mean) component and fluctuations along a chosen axis.

MFT_Simulation Class

class osiris_utils.postprocessing.mft.MFT_Simulation(simulation: Simulation, mft_axis: int | None = None)Source

Bases: PostProcess

Class to compute the Mean Field Theory approximation of a diagnostic. Works as a wrapper for the MFT_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

Parameters:
  • simulation (Simulation) – The simulation object.

  • mft_axis (int) – The axis to compute the mean field theory.

The MFT decomposition returns a container-like diagnostic with two components:

  • "avg" : mean field \(\\langle A \\rangle\)

  • "delta" : fluctuations \(\\delta A = A - \\langle A \\rangle\)

Usage example:

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import MFT_Simulation

sim = Simulation("/path/to/input/deck.inp")

mft = MFT_Simulation(sim, 1)       # average along x1 direction
e1 = mft["e1"]

e1_avg = e1["avg"]
e1_delta = e1["delta"]

avg_t10 = e1_avg[10]
flt_t10 = e1_delta[10]

MFT_Diagnostic Classes

class osiris_utils.postprocessing.mft.MFT_Diagnostic(diagnostic: Diagnostic, mft_axis: int)Source

Bases: Diagnostic

Container: gives access to “avg” and “delta” diagnostics. Not itself a time-indexed diagnostic.

__getitem__(key: str) DiagnosticSource

Retrieve timestep data with optional spatial slicing.

Parameters:

index (int, slice, or tuple) –

  • If int, loads full spatial data for that timestep.

  • If slice, supports start:stop:step for time. Zero-length slices return an empty array of shape (0, …).

  • If tuple, first element is time index/slice, remaining elements are spatial slices for each dimension. Example: diag[5, :, 100:200] loads timestep 5 with spatial slicing applied. Efficient: HDF5 reads only the requested spatial slice from disk, not the full array.

Returns:

np.ndarray – Array for that timestep (or stacked array for a slice), optionally spatially sliced.

Raises:
  • IndexError – If the index is out of range or no data generator is available.

  • RuntimeError – If loading a specific timestep fails.

Examples

>>> data = diag[5]              # Load full spatial data for timestep 5
>>> data = diag[5, :, 100:200]  # Load timestep 5 with spatial slice (only reads slice from disk!)
>>> data = diag[0:10]           # Load timesteps 0-9 (full spatial domain)
>>> data = diag[0:10, :, 50:]   # Load timesteps 0-9 with spatial slice
class osiris_utils.postprocessing.mft.MFT_Diagnostic_Average(diagnostic: Diagnostic, mft_axis: int)Source

Bases: Diagnostic

Class to compute the average component of mean field theory. Inherits from Diagnostic to ensure all operation overloads work properly.

Parameters:
  • diagnostic (Diagnostic) – The diagnostic object.

  • mft_axis (int) – The axis to compute the mean field theory.

load_all() np.ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:
  • n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)

  • use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.

  • executor_type ({"thread", "process"}, optional) –

    Executor backend for parallel loading (default “thread”).

    • ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).

    • ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

class osiris_utils.postprocessing.mft.MFT_Diagnostic_Fluctuations(diagnostic: Diagnostic, mft_axis: int)Source

Bases: Diagnostic

delta = f - avg, where avg uses keepdims to broadcast.

load_all() np.ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:
  • n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)

  • use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.

  • executor_type ({"thread", "process"}, optional) –

    Executor backend for parallel loading (default “thread”).

    • ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).

    • ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

Field Centering

The FieldCentering module provides tools to convert electromagnetic fields from the Yee mesh locations to cell centers.

FieldCentering_Simulation Class

class osiris_utils.postprocessing.field_centering.FieldCentering_Simulation(simulation: Simulation)Source

Bases: PostProcess

Class to handle the field centering on data.It converts fields from the Osiris yee mesh to the center of the cells. Works as a wrapper for the FieldCentering_Diagnostic class. Inherits from PostProcess to ensure all operation overloads work properly.

It only works for periodic boundaries.

Parameters:
  • simulation (Simulation) – The simulation object.

  • field (str) – The field to center.

Field centering is a purely spatial operation and can be computed lazily per timestep.

Usage example:

from osiris_utils.data import Simulation
from osiris_utils.postprocessing import FieldCentering_Simulation

sim = Simulation("/path/to/input/deck.inp")

centered = FieldCentering_Simulation(sim)

e1c = centered["e1"]
arr = e1c[10]              # centered field at timestep 10
arr_slice = e1c[10, :, :]  # supports spatial slicing

FieldCentering_Diagnostic Class

class osiris_utils.postprocessing.field_centering.FieldCentering_Diagnostic(diagnostic: Diagnostic)Source

Bases: Diagnostic

load_all() ndarraySource

Load all data into memory (all iterations), in a pre-allocated array.

Parameters:
  • n_workers (int, optional) – Number of parallel workers. If None, an HPC-safe default is chosen: - “thread”: min(_MAX_IO_WORKERS, cpu_count, size-1) - “process”: min(cpu_count, size-1)

  • use_parallel (bool, optional) – If True, force parallel loading. If False, force sequential. If None (default), automatically choose based on data size.

  • executor_type ({"thread", "process"}, optional) –

    Executor backend for parallel loading (default “thread”).

    • ”thread” — ThreadPoolExecutor. Correct for I/O-bound HDF5 reads: h5py releases the GIL during reads, so threads genuinely run concurrently. I/O workers are capped at _MAX_IO_WORKERS (default 32) to prevent overwhelming parallel filesystems (Lustre/GPFS on HPC clusters such as Marenostrum 5 or Deucalion).

    • ”process” — ProcessPoolExecutor. Bypasses the GIL entirely; useful when each frame involves heavy CPU computation beyond raw I/O. Each worker opens its own HDF5 handle via the picklable _load_frame_worker helper. Requires self._file_list to be populated (i.e. only valid on the base Diagnostic, not on post-processing wrappers that override _frame).

Returns:

data (np.ndarray) – The data for all iterations. Also stored in self._data.

Notes

On HPC systems using vendor-tuned BLAS/FFTW (MKL on Marenostrum 5, AOCL on Deucalion), post-processing numpy operations (FFT, gradient) are already internally multi-threaded. Prefer executor_type="thread" for the I/O phase and rely on OMP_NUM_THREADS / MKL_NUM_THREADS for the compute phase.

This diagnostic wraps a field diagnostic and returns its cell-centered version. The implementation assumes periodic boundaries when using shifts/rolls. ``