Attribute size: C struct padding problem

Hello,

While testing VOL-REST in combination with HSDS, I think I found a bug related to the padding of a C struct. Suppose one has the following C struct:

typedef struct {
   char     str1[64];
   char     str2[64];
   int32_t  value1;
   double   value2;
   uint64_t value3;
   char     str3[64];
} att_t;

This struct holds the data of a certain HDF5 attribute, and the corresponding HDF5 attribute type has been specified based on it. Without padding the struct is 212 bytes, while with padding it is 216 bytes, at least on my machine. The VOL-REST API determines the size of this attribute type using H5Tget_size(), which reports 216 bytes. I noticed that HSDS also determines the size of this attribute from its type description, but it does not take the padding into account and therefore arrives at 212 bytes. Since these sizes differ, HSDS reports a warning and does not write the attribute data to the JSON file. If I add __attribute__((__packed__)) to the C struct declaration, the data is written to the JSON file properly.
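The 212 versus 216 byte difference can be reproduced with plain C. The following is a minimal sketch, not code from VOL-REST or HSDS: the helper names (MEMBER_SIZE, att_packed_size, att_padding_bytes, att_print_layout) are made up for illustration, and the padded size depends on the platform ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The struct from the post. */
typedef struct {
    char     str1[64];
    char     str2[64];
    int32_t  value1;
    double   value2;
    uint64_t value3;
    char     str3[64];
} att_t;

/* Size of one member without needing an instance (sizeof does not evaluate). */
#define MEMBER_SIZE(type, member) sizeof(((type *)0)->member)

/* Sum of the member sizes: what a padding-unaware reader would compute. */
size_t att_packed_size(void)
{
    return MEMBER_SIZE(att_t, str1) + MEMBER_SIZE(att_t, str2) +
           MEMBER_SIZE(att_t, value1) + MEMBER_SIZE(att_t, value2) +
           MEMBER_SIZE(att_t, value3) + MEMBER_SIZE(att_t, str3);
}

/* Bytes the compiler added for alignment on this platform. */
size_t att_padding_bytes(void)
{
    assert(sizeof(att_t) >= att_packed_size());
    return sizeof(att_t) - att_packed_size();
}

/* Print where each member actually lives in memory. */
void att_print_layout(void)
{
    printf("str1   offset %3zu size %2zu\n", offsetof(att_t, str1),   MEMBER_SIZE(att_t, str1));
    printf("str2   offset %3zu size %2zu\n", offsetof(att_t, str2),   MEMBER_SIZE(att_t, str2));
    printf("value1 offset %3zu size %2zu\n", offsetof(att_t, value1), MEMBER_SIZE(att_t, value1));
    printf("value2 offset %3zu size %2zu\n", offsetof(att_t, value2), MEMBER_SIZE(att_t, value2));
    printf("value3 offset %3zu size %2zu\n", offsetof(att_t, value3), MEMBER_SIZE(att_t, value3));
    printf("str3   offset %3zu size %2zu\n", offsetof(att_t, str3),   MEMBER_SIZE(att_t, str3));
    printf("packed %zu bytes, sizeof %zu bytes, padding %zu bytes\n",
           att_packed_size(), sizeof(att_t), att_padding_bytes());
}
```

Calling att_print_layout() from main shows the gap. On an ABI that aligns double to 8 bytes, value1 ends at byte 132 and value2 starts at byte 136, which accounts for the 4 extra bytes.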

Naively, I would think the HSDS implementation should be improved so that it takes the padding into account. Or should I use another method to get everything working? Personally, I would prefer not to use __attribute__((__packed__)) in the C struct declaration.

Thanks for reporting this!

HSDS stores attributes as JSON, and JSON is also used in the REST API to pass values via HTTP. Dataset data, on the other hand, is always stored as binary blobs, but either JSON or binary can be used in HTTP requests. Typically binary is used for performance reasons.

So my thinking is that retrieving the attribute from HSDS and fitting it into the expected C struct should be something that can be handled on the REST VOL side. HSDS does not keep track of padding bytes in the storage format.
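Handling this on the client side amounts to copying the members between the padded in-memory struct and a packed buffer. The sketch below illustrates the idea only. It is not the REST VOL implementation: field_t, ATT_FIELDS, att_pack and att_unpack are hypothetical names, and the packed buffer simply uses native byte order rather than any format HSDS defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char     str1[64];
    char     str2[64];
    int32_t  value1;
    double   value2;
    uint64_t value3;
    char     str3[64];
} att_t;

/* Hypothetical per-member descriptor: where it sits in memory and how big it is. */
typedef struct {
    size_t offset;
    size_t size;
} field_t;

#define FIELD(m) { offsetof(att_t, m), sizeof(((att_t *)0)->m) }

static const field_t ATT_FIELDS[] = {
    FIELD(str1), FIELD(str2), FIELD(value1),
    FIELD(value2), FIELD(value3), FIELD(str3)
};
enum { ATT_NFIELDS = sizeof(ATT_FIELDS) / sizeof(ATT_FIELDS[0]) };

/* Copy the members back to back into dst, skipping padding.
 * Returns the number of bytes written, or 0 if dst is too small. */
size_t att_pack(const att_t *src, unsigned char *dst, size_t dst_len)
{
    size_t pos = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < ATT_NFIELDS; i++) {
        if (pos + ATT_FIELDS[i].size > dst_len)
            return 0;
        memcpy(dst + pos, (const unsigned char *)src + ATT_FIELDS[i].offset,
               ATT_FIELDS[i].size);
        pos += ATT_FIELDS[i].size;
    }
    return pos;
}

/* Inverse of att_pack: place each packed member at its padded offset.
 * Returns the number of bytes consumed, or 0 if src is too short. */
size_t att_unpack(const unsigned char *src, size_t src_len, att_t *dst)
{
    size_t pos = 0;
    assert(src != NULL && dst != NULL);
    memset(dst, 0, sizeof *dst); /* give padding bytes a defined value */
    for (size_t i = 0; i < ATT_NFIELDS; i++) {
        if (pos + ATT_FIELDS[i].size > src_len)
            return 0;
        memcpy((unsigned char *)dst + ATT_FIELDS[i].offset, src + pos,
               ATT_FIELDS[i].size);
        pos += ATT_FIELDS[i].size;
    }
    return pos;
}
```

For att_t, att_pack writes 212 bytes regardless of how the compiler pads the struct, so the packed buffer matches the size a padding-unaware server computes from the member list.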

Hopefully Matt can weigh in on this.

Hello @jreadey,

Reading your response, I realized that I did not make it explicitly clear that I am talking about writing an attribute to HSDS using the REST API. But yes, you are right, we may have a similar issue when reading this attribute.

Best regards,
Jan-Willem

John and I discussed this, and what we’ll likely do is have HSDS accept optional information about field offsets in compound types. Clients would then 1. send this information along with the datatype, and 2. get it back from HSDS and reassemble the datatype with the exact offsets intact. See this issue on GitHub.

Hello @mlarson and @jreadey,

Your proposed solution sounds great. It will also solve the issue that arises when a user wants to store only a subset of a defined C struct in HDF5. I encountered this case by accident because I forgot to remove an unused variable from the struct. Apart from some small issues, the combination of VOL-REST and HSDS works very well. So far most of my tests are passing.

Best regards,
Jan-Willem