A meta-data standard
for mesh based data ...
and particle data sets.
Clean Common Structure
... and Domain Specific Extensions
*currently implemented, but not limited to
For hierarchical, self-describing data formats.
How to avoid confusion between actual low-level data sets &
attributes (HDF5 speak), variables &
attributes (ADIOS speak), files & groups/folders
and the actual physical quantities?
openPMD naming convention
Physical quantities are
records:
- discretized vector field \(\vec F(\vec r)\) on a mesh
- discretized scalar field \(T(\vec r)\) on a mesh
- particle property position \(\vec r_i\)
- ...
Record: each particle property or mesh
Their actual components are stored in (multi-dimensional) arrays
(= data sets / variables) inside those records!
Position.x: not a record, it's a component
Required
attributes for each
record:
- unitSI: conversion factor to common unit system
- unitDimension: parsable dimensionality
- time / timeUnitSI: iteration != time
Attributes for all records
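As a minimal sketch of how a reader might apply these attributes: stored values are multiplied by `unitSI` to obtain SI values, and the physical time of an iteration comes from `time` and `timeUnitSI`, not from the iteration index. The function names and all numbers below are hypothetical and only serve as illustration.

```python
def to_si(values, unit_si):
    """Scale stored values by the record's unitSI conversion factor."""
    return [v * unit_si for v in values]


def iteration_time(time, time_unit_si):
    """Physical time in seconds; independent of the iteration index."""
    return time * time_unit_si


# Hypothetical record: positions stored in micrometres (unitSI = 1e-6).
stored_positions = [1.0, 2.5, 4.0]
positions_m = to_si(stored_positions, unit_si=1.0e-6)

# Hypothetical iteration: time = 50.0 stored in units of 1 fs.
seconds = iteration_time(time=50.0, time_unit_si=1.0e-15)

print(positions_m)
print(seconds)
```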
unitDimension:
- automate description of units
- powers of the 7 (SI) base measures
- e.g., `V/m` is length^1.0 mass^1.0 time^-3.0 electric current^-1.0 thermodynamic temperature^0.0 amount of substance^0.0 luminous intensity^0.0: `[1.0, 1.0, -3.0, -1.0, 0.0, 0.0, 0.0]`
- (if your record can be scaled in a general normalized way, choose a reference and write a note in the `comment` attribute)
Attributes for all records
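The `unitDimension` array can be built mechanically from the powers of the seven base quantities. The sketch below assumes the ordering given in the example above; the short keys (`L`, `M`, `T`, ...) and the helper names are hypothetical conveniences, not part of the standard.

```python
# Order of the seven SI base quantities: length, mass, time,
# electric current, thermodynamic temperature, amount of substance,
# luminous intensity. The short keys are an assumption of this sketch.
SI_BASE = ("L", "M", "T", "I", "theta", "N", "J")


def unit_dimension(powers):
    """Return the 7-element list of base-quantity powers."""
    unknown = set(powers) - set(SI_BASE)
    if unknown:
        raise KeyError(f"unknown base quantities: {sorted(unknown)}")
    return [float(powers.get(name, 0.0)) for name in SI_BASE]


def multiply(a, b):
    """Dimension of a product of two quantities: the exponents add."""
    return [x + y for x, y in zip(a, b)]


# Electric field in V/m.
electric_field = unit_dimension({"L": 1, "M": 1, "T": -3, "I": -1})
print(electric_field)  # [1.0, 1.0, -3.0, -1.0, 0.0, 0.0, 0.0]

# Field times length gives an electric potential (V).
potential = multiply(electric_field, unit_dimension({"L": 1}))
```

Storing exponents instead of unit strings lets a tool derive the dimension of combined quantities by simple arithmetic.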
Required
attributes for each
mesh record:
- geometry (+ parameters)
- grid: spacing, global offset, axis labels
- data order (C/Fortran)
Required
attributes for each
mesh record component:
Attributes for mesh records
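To illustrate how the grid attributes might be used together, the following sketch computes the coordinates of a cell on a regular Cartesian grid from its index, the grid spacing and the global offset. The formula (offset plus index times spacing) and all names and values are assumptions for illustration; the standard defines the exact attribute names and any per-component position within a cell.

```python
def cell_coordinates(index, spacing, global_offset, axis_labels):
    """Map a cell index to labelled coordinates on a regular grid."""
    coords = [o + i * d for i, d, o in zip(index, spacing, global_offset)]
    return dict(zip(axis_labels, coords))


# Hypothetical 2D mesh.
coords = cell_coordinates(
    index=(2, 4),
    spacing=(0.5, 0.25),
    global_offset=(1.0, 2.0),
    axis_labels=("x", "y"),
)
print(coords)  # {'x': 2.0, 'y': 3.0}
```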
Required records for each
particle species:
- position
- position offset (can be constant)
Optional:
Attributes for particle records
Examples for records
electric field \(\vec E(\vec r)\):
temperature \(T(\vec r)\):
Legend:
Group /
(multi-dim) Array
electron position \(\vec r\):
- / ... / particles / electrons / position /
electron charge \(Q\):
- / ... / particles / electrons
Legend:
Group /
(1D) Array
Ok ok, I got it! But what about constant components in a record?
electron charge \(Q\) might be constant for all particles stored in species
electrons:
- / ... / particles / electrons
- charge ← might be very large
Legend:
Group /
Array /
Attribute
- / ... / particles / electrons
- charge
- value ← few bytes
- shape
- unitSI
Legend:
Group /
Array /
Attribute
possible for
any record component, e.g.:
- / ... / particles / electrons
Legend:
Group /
Array /
Attribute
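A reader can treat constant and regular components uniformly. The sketch below models a regular component as a list and a constant component as a dictionary holding the `value` and `shape` attributes shown above; this in-memory representation and the function name are hypothetical, and only 1D particle data is handled.

```python
def load_component(component):
    """Return a record component as a flat list of values.

    A dict stands for a constant component (attributes only),
    anything else is read as a regular 1D array.
    """
    if isinstance(component, dict):
        (n,) = component["shape"]  # 1D particle data assumed
        return [component["value"]] * n
    return list(component)


# Constant charge: a few bytes on disk, expanded only on request.
constant_charge = {"value": -1.0, "shape": (5,), "unitSI": 1.602176565e-19}

# Regular array: one stored value per particle.
array_charge = [-1.0, -1.0, -2.0]

print(load_component(constant_charge))
print(load_component(array_charge))
```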
All right. But how to handle petabytes of data written from thousands (to millions) of compute nodes?
- parallel, community file formats: writing/reading based on MPI & MPI-I/O
- examples:
- PHDF5 .h5 (parallel/strided, uncompressed)
- ADIOS .bp (aggregated, compressed)
Parallel file formats
Particle Patches: Honor Decomposition
Particle Patches: Disjoint Particle Sets
Particle Patches: [Offset:Offset+Count]
Particle Patches: (Spatial) Hyperrectangles
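The `[Offset:Offset+Count]` idea can be sketched as plain slicing: each patch selects one contiguous, disjoint range of the particle arrays. The `Patch` type and helper names are hypothetical, and the spatial extent of a patch is left out for brevity.

```python
from typing import NamedTuple


class Patch(NamedTuple):
    offset: int  # index of the first particle in the patch
    count: int   # number of particles in the patch


def patch_slice(data, patch):
    """Select the particles of one patch: data[offset:offset+count]."""
    return data[patch.offset:patch.offset + patch.count]


def are_disjoint(patches):
    """True if no two patches share a particle index."""
    end = 0
    for p in sorted(patches):  # sorted by offset, then count
        if p.offset < end:
            return False
        end = p.offset + p.count
    return True


# Hypothetical species with 6 particles written by 3 ranks.
position_x = [0.1, 0.2, 1.1, 1.2, 1.3, 2.1]
patches = [Patch(0, 2), Patch(2, 3), Patch(5, 1)]

print(patch_slice(position_x, patches[1]))  # [1.1, 1.2, 1.3]
print(are_disjoint(patches))                # True
```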
In principle and everywhere*: a human-readable comment (text) attribute is encouraged for everything not covered by the standard.
* reserved for each group and data set
Comment Attribute
openPMD defines a minimal set of attributes.
You can always add more attributes and records!
openPMD is not exclusive
openPMD defines a minimal set of attributes, e.g.
- openPMD: identifier
- basePath: prefix, currently fixed to `/data/`
- meshesPath: relative sub-group, e.g., `meshes/`
- particlesPath: relative sub-group, e.g., `particles/`
Required Base Attributes for `/`
- author: My Name <email@example.com>
- software: e.g., PIConGPU
- softwareVersion: e.g., 0.1.0
- date: 2015-12-02 17:48:42 +0100
Recommended Base Attributes for `/`
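The `date` example above matches the format string `%Y-%m-%d %H:%M:%S %z` of Python's standard `datetime` module. The sketch below assumes this format is the intended one, and shows parsing the example value and writing it back unchanged.

```python
from datetime import datetime, timedelta

# Assumed format, inferred from the example value on the slide.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"

parsed = datetime.strptime("2015-12-02 17:48:42 +0100", DATE_FORMAT)
roundtrip = parsed.strftime(DATE_FORMAT)

print(parsed.utcoffset() == timedelta(hours=1))
print(roundtrip)
```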
$ ./checkOpenPMD_h5.py -i example.h5 --EDPIC
Warning: Attribute softwareVersion (recommended) does NOT exist in `/`!
Found 1 iteration(s)
Iteration 0 : found 2 meshes
Iteration 0 : found 1 particle species
Result: 0 Errors and 1 Warning.
Developer Tools: Checker Script
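The kind of test the checker reports on can be sketched in a few lines. This is not the implementation of `checkOpenPMD_h5.py`; it is a simplified illustration that works on a plain dictionary of root attributes and only knows the attribute names listed above. All attribute values are placeholders.

```python
REQUIRED = ("openPMD", "basePath", "meshesPath", "particlesPath")
RECOMMENDED = ("author", "software", "softwareVersion", "date")


def check_root(attrs):
    """Return (errors, warnings) for the attributes of `/`."""
    errors = [
        f"Error: Attribute {name} (required) does NOT exist in `/`!"
        for name in REQUIRED if name not in attrs
    ]
    warnings = [
        f"Warning: Attribute {name} (recommended) does NOT exist in `/`!"
        for name in RECOMMENDED if name not in attrs
    ]
    return errors, warnings


# Placeholder root attributes; softwareVersion is left out on purpose.
root = {
    "openPMD": "1.0.0",
    "basePath": "/data/",
    "meshesPath": "meshes/",
    "particlesPath": "particles/",
    "author": "My Name <email@example.com>",
    "software": "PIConGPU",
    "date": "2015-12-02 17:48:42 +0100",
}

errors, warnings = check_root(root)
for message in errors + warnings:
    print(message)
print(f"{len(errors)} error(s), {len(warnings)} warning(s)")
```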
Serial processing: high-level API
- meta-data parsing
- file-format independent (ADIOS & HDF5)
- object oriented: meshes, particles & iterations
Developer Tools: Serial Processing
openPMD viewer
- Python API: openPMD object aware
- GUI: IPython Notebook (interactive, remote)
- ideal for investigating 1-2D data (or slices)
- modular: e.g., domain specific analysis chains
Developer Tools: Parallel Processing
Parallel processing I: integration in suites such as
Developer Tools: Parallel Processing
Parallel processing II: integration in parallel python processing via
pyDive
- numpy-like parallel access
- read and write support
- based on zeroMQ / Jupyter notebook
Developer Tools: Parallel Processing
Common tools
- openPMD-viewer numpy-like parallel access
- read and write support
- GUI: Jupyter notebook widgets
Developer Tools: Parallel Processing
Extensions: Domain-Specific Additions
Implemented by PIConGPU & Warp
- electro-dynamic and electro-static PIC
- additional attributes
- naming conventions for records
The ED-PIC Extension
One could add extensions for
- Experimental data: CCD images, interferograms, ...
- Simulations: MD, FEM, ...
Other Domain-Specific Extensions
For future versions
- record patches and AMR support
- irregular Cartesian grids
- more mesh types: if required
Future Plans