We have a workflow that copies the last 120 of 121 records from one netCDF-4 classic-model file to another. The data are compressed and chunked, with one record per chunk. Profiling shows that the vast majority of the time is spent decompressing the data (on read from the source file) and then recompressing it (on write to the target file). Both deflate and Zstd show the same behavior. A faster approach would avoid decompressing and recompressing in the first place. I can see needing to decompress the data if the user wants to get at the real values, but I just want to copy it from one file to another. I was thinking of a low-level block copy, something like what the 'dd' command does.
Is that possible with HDF5/NetCDF?
Does HDF5 NEED to decompress/recompress in this scenario?
I'm using nco-5.2.4, built with Spack, running on a Zen 2 chip under SLES 15 SP4.
Example usage:
ncrcat -7 -d time,1,120 -L 4 file.in file.out
ncrcat -7 -d time,1,120 --cmp='shf|zst,4' file.in file.out
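
To make concrete what I mean by a 'dd'-style copy, below is a rough, untested sketch of what I imagine at the HDF5 level, using the raw-chunk calls H5Dread_chunk/H5Dwrite_chunk (available since HDF5 1.10.2). The file names, the dataset name "var", the 2-D shape, and the assumption that the destination dataset already exists with identical chunking and filter settings are all placeholders for illustration; error checking is omitted.

/* Hypothetical sketch: copy still-compressed chunks verbatim between files.
 * Assumes src.h5 and dst.h5 each contain a 2-D record variable "var"
 * (time x n) with one record per chunk and the same filter pipeline. */
#include <stdlib.h>
#include <hdf5.h>

int main(void)
{
    hid_t src_file = H5Fopen("src.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    hid_t dst_file = H5Fopen("dst.h5", H5F_ACC_RDWR,  H5P_DEFAULT);
    hid_t src_dset = H5Dopen2(src_file, "var", H5P_DEFAULT);
    hid_t dst_dset = H5Dopen2(dst_file, "var", H5P_DEFAULT);

    for (hsize_t rec = 1; rec <= 120; rec++) {     /* skip record 0 */
        hsize_t src_off[2] = {rec, 0};             /* chunk origin in source */
        hsize_t dst_off[2] = {rec - 1, 0};         /* shifted origin in target */
        hsize_t nbytes = 0;
        uint32_t filter_mask = 0;

        /* On-disk (compressed) size of this chunk */
        H5Dget_chunk_storage_size(src_dset, src_off, &nbytes);

        void *buf = malloc((size_t)nbytes);

        /* Read the raw chunk bytes without running the filter pipeline... */
        H5Dread_chunk(src_dset, H5P_DEFAULT, src_off, &filter_mask, buf);

        /* ...and write them back, again bypassing decompress/recompress */
        H5Dwrite_chunk(dst_dset, H5P_DEFAULT, filter_mask, dst_off,
                       (size_t)nbytes, buf);
        free(buf);
    }

    H5Dclose(src_dset); H5Dclose(dst_dset);
    H5Fclose(src_file); H5Fclose(dst_file);
    return 0;
}

Is something along these lines achievable through the netCDF/NCO layer, or only by dropping down to HDF5 directly?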