Structuring heterogeneous data

Dear All,

I have been following the discussion on the mailing list for quite a while and have a rather specific question for which I couldn't find an answer so far. I would really appreciate any comments, ideas and questions on my problem. Basically, I need to decide on a structure for the output data of a simulation toolkit that is being developed as part of my PhD project (crystal plasticity, damask.mpie.de), so that it can be stored with HDF5.

As usual in simulations, we step through time and write out results at selected time steps. My idea is to organize the time steps in groups. Since we're using the finite element method, the data is available at integration points (IPs), so depending on the problem we have results at N integration points for M time steps. Since the structure of the software reflects a multiscale problem, the result at each IP consists of multiple layers of inhomogeneous data. The outermost layer is called the homogenization layer, and depending on the selected scheme (each scheme has a unique name), the type and the size of the results per IP may vary.

Depending on the selected homogenization scheme, each IP consists of 1 to X components, each having a combination of two outputs called "crystallite" and "constitutive". Again, depending on the selected types, these outputs may vary in size and type, and they can be combined arbitrarily. At first I thought about using a compound data type for each active combination of homogenization/crystallite/constitutive, but this might hurt performance and would also complicate postprocessing (adding further derived quantities to the data). If this approach is used, there are basically two variants: store all quantities of the same combination in one dataset and add a mapping from each IP to its dataset and position within that dataset, or store all points in the correct order (and not use datasets).
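The mapping from IP to dataset plus position within the dataset could look roughly like the sketch below. It is plain Python with no HDF5 calls, only meant to show the bookkeeping. The combination names, the values and the helper names `group_by_combination` and `lookup` are made up for illustration; in a real file each table would presumably become one dataset and the mapping one additional dataset.

```python
from collections import defaultdict

# Hypothetical per-IP results: (combination name, values for that IP).
# Names and values are invented; different combinations have different sizes.
ip_results = [
    ("hom_A/cryst_1/const_X", [1.0, 2.0, 3.0]),
    ("hom_B/cryst_1/const_Y", [4.0]),
    ("hom_A/cryst_1/const_X", [5.0, 6.0, 7.0]),
    ("hom_B/cryst_1/const_Y", [8.0]),
]


def group_by_combination(results):
    """Split per-IP results into one table per combination plus an IP mapping."""
    tables = defaultdict(list)  # combination -> rows (one dataset each)
    mapping = []                # IP index -> (combination, row in that table)
    for combination, values in results:
        # The next free row in this combination's table is its current length.
        mapping.append((combination, len(tables[combination])))
        tables[combination].append(values)
    return dict(tables), mapping


def lookup(tables, mapping, ip):
    """Fetch the values of one IP by following the mapping."""
    combination, row = mapping[ip]
    return tables[combination][row]


tables, mapping = group_by_combination(ip_results)
```

Every read of a single IP goes through one extra indirection, which is the cost of keeping each table homogeneous in size and type.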
The other option would be to store each output in a dataset with length M, but in that case it would be necessary to have an efficient way to record that certain outputs are not available on a number of IPs.
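One possible way to record which IPs lack a given output is to store only the values that exist, together with a sorted list of the IP indices they belong to. The sketch below is again plain Python without HDF5. The helper names `compress` and `value_at` and the example column are assumptions for illustration, and whether this is efficient enough in practice would have to be measured.

```python
from bisect import bisect_left


def compress(column, missing=None):
    """Keep only the IPs that carry this output, plus their sorted IP indices."""
    ips = [ip for ip, value in enumerate(column) if value is not missing]
    values = [column[ip] for ip in ips]
    return ips, values


def value_at(ips, values, ip):
    """Return the output at `ip`, or None if the output is not defined there."""
    pos = bisect_left(ips, ip)  # binary search in the sorted index list
    if pos < len(ips) and ips[pos] == ip:
        return values[pos]
    return None


# Hypothetical output that exists on IPs 0, 2 and 5 only (None = not available).
full_column = [0.1, None, 0.3, None, None, 0.6]
ips, values = compress(full_column)
```

The alternative would be a full-length dataset with a reserved fill value for the missing entries, which is simpler to read back but wastes space when an output is present on only a few IPs.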

I'm looking forward to any suggestions on how to structure my data. Any replies are appreciated.

best regards
Martin

---
Max-Planck-Institut für Eisenforschung GmbH
Max-Planck-Straße 1
D-40237 Düsseldorf
