Convert seismic data (SEG-Y format) to HDF5 data?

Hi, everyone!
How can I convert seismic data (SEG-Y format) to HDF5 data? Could you give me an example? Thank you.

Have a look at these:

I think this is more of a question for geoscientists than for computer scientists or engineers. G.

I see you are asking about this on the HSDS forum, not HDF5. Are you interested in being able to read SEG-Y data stores via HSDS?

To support HSDS reading data from HDF5 files (rather than the native HSDS format), we use an approach where only the metadata of the HDF5 file needs to be converted and the bulk of the data is accessed from the HDF5 file as needed. During a hyperslab selection, for example, the data for each chunk is read from the HDF5 file based on the chunk’s position within the file. The advantage of this approach is that you don’t double the storage requirements to keep both HDF5 and HSDS versions of the data. You can read about the design here: https://github.com/HDFGroup/hsds/blob/master/docs/design/single_object/SingleObject.md.
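To make the idea concrete, here is a minimal sketch of how a hyperslab selection can be turned into a list of byte ranges to fetch. This is not HSDS code: the function names and the `chunk_index` table (chunk-grid index mapped to byte offset and length) are hypothetical stand-ins for the metadata extracted from the HDF5 file.

```python
from itertools import product

def chunks_for_selection(selection, chunk_shape):
    """Return chunk-grid indices touched by a hyperslab selection.

    selection: list of (start, stop) per dimension, stop exclusive.
    chunk_shape: chunk extent per dimension.
    """
    ranges = []
    for (start, stop), extent in zip(selection, chunk_shape):
        if stop <= start:
            return []  # an empty selection touches no chunks
        ranges.append(range(start // extent, (stop - 1) // extent + 1))
    return list(product(*ranges))

def byte_ranges(selection, chunk_shape, chunk_index):
    """Look up the (offset, length) of every chunk the selection needs."""
    return [chunk_index[idx] for idx in chunks_for_selection(selection, chunk_shape)]

# Hypothetical metadata for a 32 x 8 dataset stored as two 16 x 8 chunks.
chunk_index = {(0, 0): (4096, 512), (1, 0): (4608, 512)}

# Rows 10..29 span both chunks; columns 0..4 stay inside chunk column 0.
ranges = byte_ranges([(10, 30), (0, 5)], (16, 8), chunk_index)
```

Only the small index has to be stored on the server side; the bytes themselves stay in the original file.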

Anyway, I’ve been thinking about a similar approach for non-HDF file formats like GeoTIFF. As long as the source data format is organized in some sort of regular blocks, the same approach should be feasible.

With SEG-Y it’s a bit of a challenge, since in general the trace blocks are variable-length, which wouldn’t map neatly onto the HDF-style chunk layout. Still, this would be an interesting topic to explore.
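A rough sketch of why variable-length traces are awkward: with fixed-length traces the byte offset of trace `i` is simple arithmetic, whereas with variable-length traces you have to walk the file and read the sample count from each 240-byte trace header. The code assumes a plain SEG-Y layout (3200-byte textual header, 400-byte binary header, no extended textual headers, big-endian, 4-byte samples); the function names are made up for illustration.

```python
import struct

TEXT_HEADER = 3200    # textual file header, bytes
BINARY_HEADER = 400   # binary file header, bytes
TRACE_HEADER = 240    # per-trace header, bytes

def fixed_trace_offset(i, n_samples, bytes_per_sample=4):
    """Offset of trace i when every trace has the same number of samples."""
    trace_size = TRACE_HEADER + n_samples * bytes_per_sample
    return TEXT_HEADER + BINARY_HEADER + i * trace_size

def scan_trace_offsets(buf, bytes_per_sample=4):
    """Walk the file and return (offset, n_samples) for each trace.

    The sample count is read from bytes 115-116 of each trace header
    (zero-based offset 114), as a big-endian unsigned 16-bit integer.
    """
    offsets = []
    pos = TEXT_HEADER + BINARY_HEADER
    while pos + TRACE_HEADER <= len(buf):
        (n_samples,) = struct.unpack_from(">H", buf, pos + 114)
        offsets.append((pos, n_samples))
        pos += TRACE_HEADER + n_samples * bytes_per_sample
    return offsets
```

The table returned by `scan_trace_offsets` plays the same role as a chunk index, but the "chunks" it describes have irregular sizes, which is exactly what does not fit a regular chunk grid.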

BTW, here’s an interesting example of using HSDS with SEG-Y data that was converted to HDF5 files and then accessed with HSDS: https://github.com/openEDI/documentation/blob/main/PoroTomo/PoroTomo_Distributed_Acoustic_Sensing_(DAS)_Data_hsds.ipynb.

Hi, @526945425!

I’ve done a bit of work in this area. It’s actually not that hard, given that it’s possible to create a 1:1 mapping from SEG-Y to HDF5, such as “/text_headers”, “/binary_header”, “/trace_headers”, and “/trace_data”. Here’s a figure of the Seismic-H5 system that my colleague and I developed a couple of years ago:

We introduced some extra datasets (“metadata” and “indices”) to speed up querying of traces of interest, but that’s optional.

We opted to store the trace data compressed, using HDF5 chunks, which reduced the file size considerably. Please refer to our slides and our short paper for more details on how the mapping is established.
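For a rough idea of what such a mapping can look like with h5py, here is a minimal sketch. It is not the Seismic-H5 implementation: it assumes fixed-length traces that have already been parsed into NumPy arrays, stores the headers as raw bytes, and the function name, chunk size, and gzip settings are arbitrary choices for illustration.

```python
import h5py
import numpy as np

def write_seismic_h5(f, text_header, binary_header, trace_headers, trace_data,
                     traces_per_chunk=64):
    """Write the four SEG-Y parts into an open h5py file as separate datasets."""
    # File-level headers are kept as raw bytes so nothing is lost.
    f.create_dataset("text_headers", data=np.frombuffer(text_header, dtype=np.uint8))
    f.create_dataset("binary_header", data=np.frombuffer(binary_header, dtype=np.uint8))
    # One 240-byte row per trace.
    f.create_dataset("trace_headers", data=trace_headers)
    # Samples as a 2-D array, chunked by groups of traces and gzip-compressed.
    n_traces, n_samples = trace_data.shape
    f.create_dataset(
        "trace_data",
        data=trace_data,
        chunks=(min(traces_per_chunk, n_traces), n_samples),
        compression="gzip",
        compression_opts=4,
        shuffle=True,
    )

# Usage with synthetic data and an in-memory file (nothing is written to disk).
n_traces, n_samples = 100, 500
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    write_seismic_h5(
        f,
        text_header=bytes(3200),
        binary_header=bytes(400),
        trace_headers=np.zeros((n_traces, 240), dtype=np.uint8),
        trace_data=np.zeros((n_traces, n_samples), dtype=np.float32),
    )
    names = sorted(f.keys())
    data_shape = f["trace_data"].shape
    data_chunks = f["trace_data"].chunks
```

Chunking along the trace axis keeps whole traces together, so reading a range of traces only touches a few chunks.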

Having said this, I’m not aware of an out-of-the-box tool that converts SEG-Y to HDF5, as there are several ways to encode it as HDF5. PH5 may be a good fit if you’d like to try an open source HDF5-based file format for seismic data.

Good luck!
Lucas
