Offset issues for Compound Data in 64-bit systems

Hi all,
I am using compound data types with vlen strings and integers. The struct containing the strings and integers also has a couple of methods, which prevents me from using the HOFFSET macro, so I used sizeof() and/or the getSize() method of the H5 data types to determine the offsets of the members. This works on my personal 32-bit system, but ends with a segmentation fault on a 64-bit system.
I found out that the offset determined by HOFFSET differs from the one computed from sizeof() and getSize(). I worked around the problem by placing the int as the last member.
A small code sample is included at the end of this mail.

My test system is:
> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 9.10
Release: 9.10
Codename: karmic
> uname -a
Linux xxxxx 2.6.31-22-generic #70-Ubuntu SMP Wed Dec 1 23:48:17 UTC 2010 x86_64 GNU/Linux
> gcc --version
gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1

Is this a general problem/issue, or is it due to wrong usage?

~Mathias

---

#include <iostream>
#include <string>

using std::cout;
using std::endl;
using std::string;

#include "H5Cpp.h"

using namespace H5;

char* tc_string(const string& s) {
  char* ret = new char[s.length() + 1];
  size_t l = s.copy(ret, s.length(), 0);
  ret[l] = 0;
  return ret;
}

int main(int argc, char* argv[]) {

  {
    H5File* file = new H5File("test1.h5", H5F_ACC_TRUNC);

    StrType stringtype(PredType::C_S1, H5T_VARIABLE);

    struct cv {
      char* id;
      int accession;
    };
    cout << "char* first: " << endl;
    cout << " sizeof(char*) " << sizeof(char*) << endl;
    cout << " stringtype.getSize() " << stringtype.getSize() << endl;
    cout << " HOFFSET(cv, accession) " << HOFFSET(cv, accession) << endl;

    CompType cvtype(sizeof(cv));
    size_t offset = 0;
    cvtype.insertMember("id", offset, stringtype);
    offset += stringtype.getSize();
    cvtype.insertMember("accession", offset, PredType::NATIVE_INT);
    offset += PredType::NATIVE_INT.getSize();

    cv* test = new cv[2];
    test[0].accession = 12;
    test[0].id = tc_string("1.1");
    test[1].accession = 23;
    test[1].id = tc_string("2.2");

    DSetCreatPropList cparms;
    hsize_t dimds[1] = { 2 };
    DataSpace dataspace(1, dimds, dimds);
    DataSet ds = file->createDataSet("test1", cvtype, dataspace, cparms);
    ds.write(test, cvtype);
    for (int i = 0; i < 2; ++i) {
      delete[] test[i].id;
    }
    delete[] test;
    delete file;
  }

  cout << endl;

  {
    H5File* file = new H5File("test2.h5", H5F_ACC_TRUNC);

    StrType stringtype(PredType::C_S1, H5T_VARIABLE);

    struct cv {
      int accession;
      char* id;
    };

    cout << "int first:" << endl;
    cout << " sizeof(int) " << sizeof(int) << endl;
    cout << " PredType::NATIVE_INT " << PredType::NATIVE_INT.getSize() << endl;
    cout << " HOFFSET(cv, id) " << HOFFSET(cv, id) << endl;

    CompType cvtype(sizeof(cv));
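    // Members are inserted in struct order. Accumulating offsets from
    // getSize() ignores alignment padding: "id" lands at sizeof(int) here,
    // while HOFFSET(cv, id) is typically 8 on x86_64.
    size_t offset = 0;
    cvtype.insertMember("accession", offset, PredType::NATIVE_INT);
    offset += PredType::NATIVE_INT.getSize();
    cvtype.insertMember("id", offset, stringtype);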

    cv* test = new cv[2];
    test[0].accession = 12;
    test[0].id = tc_string("1.1");
    test[1].accession = 23;
    test[1].id = tc_string("2.2");

    DSetCreatPropList cparms;
    hsize_t dimds[1] = { 2 };
    DataSpace dataspace(1, dimds, dimds);
    DataSet ds = file->createDataSet("test2", cvtype, dataspace, cparms);
    ds.write(test, cvtype);
    for (int i = 0; i < 2; ++i) {
      delete[] test[i].id;
    }
    delete[] test;
    delete file;
  }
  return 0;
}

Hi Mathias,

  You should use HOFFSET for inserting your fields into the compound datatype. The compiler may insert padding before/after fields in the struct and that information is not reflected in the value returned from getSize().

  Quincey
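
As a rough illustration of the padding described above, the following standalone sketch (no HDF5 required) compares offsets accumulated from sizeof() with the offsets the compiler actually uses. The struct names IdFirst and IntFirst and the helper functions are made up for this example. The concrete numbers depend on the platform ABI, so no fixed values are claimed.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof, std::size_t
#include <iostream>

// Hypothetical stand-ins for the two struct layouts in the sample code.
struct IdFirst  { char* id; int accession; };
struct IntFirst { int accession; char* id; };

// Offset of the second member when the size of the first is simply added up,
// which is what accumulating sizeof()/getSize() values amounts to.
constexpr std::size_t summed_offset_id_first()  { return sizeof(char*); }
constexpr std::size_t summed_offset_int_first() { return sizeof(int); }

// Offset the compiler actually uses, including any alignment padding.
constexpr std::size_t real_offset_id_first()  { return offsetof(IdFirst, accession); }
constexpr std::size_t real_offset_int_first() { return offsetof(IntFirst, id); }

// Prints both values for each layout; the numbers depend on the platform ABI.
inline void print_offsets() {
  // A member cannot start before the previous member ends.
  assert(real_offset_id_first() >= summed_offset_id_first());
  assert(real_offset_int_first() >= summed_offset_int_first());

  std::cout << "char* first: summed " << summed_offset_id_first()
            << ", offsetof " << real_offset_id_first() << '\n'
            << "int first:   summed " << summed_offset_int_first()
            << ", offsetof " << real_offset_int_first() << '\n';
}
```

Calling print_offsets() on a typical x86_64 Linux build, one would expect the two values to agree for the char*-first layout and to differ for the int-first layout, because the pointer member is aligned to 8 bytes there. That matches the behaviour reported in this thread, where placing the int last happened to avoid the crash.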

On Feb 24, 2011, at 4:30 PM, Mathias Wilhelm wrote:
[original message and sample code snipped]

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Hi Quincey,
Quincey Koziol wrote:

  You should use HOFFSET for inserting your fields into the compound datatype. The compiler may insert padding before/after fields in the struct and that information is not reflected in the value returned from getSize().

I would like to use HOFFSET, but since my structs have several methods my compiler gives:

x.cpp:301: warning: invalid access to non-static data member ‘main1(int, char**)::Test::id’ of NULL object
x.cpp:301: warning: (perhaps the ‘offsetof’ macro was used incorrectly)

because they are non-PODs.

I just found a solution for it. Is this a proper way to do it?

struct AData {//contains only variables
  //variables
  int acc;
  char* id;
};

struct A : AData { //contains only functions
  //some functions
};

Using the HOFFSET(AData, id) macro compiles now without warnings and can be used in the insertMember function of CompType.
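
A minimal sketch of the split described above, written without HDF5 so that it compiles on its own with a C++11 compiler. RecordData, Record, and member_at are hypothetical names chosen for this example. The static_assert encodes an assumption: the derived struct adds no data members or virtual functions, so an array of Record has the same stride as an array of RecordData.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof, std::size_t
#include <cstring>   // std::memcpy

// Data-only base: a plain struct that offsetof()/HOFFSET accepts.
struct RecordData {
  int   acc;
  char* id;
};

// Behaviour-only layer: adds member functions but no data members.
struct Record : RecordData {
  Record(int a, char* s) { acc = a; id = s; }
  int accession() const { return acc; }
};

// If this fails, the derived type added storage, and offsets taken from
// RecordData would no longer describe the elements of a Record array.
static_assert(sizeof(Record) == sizeof(RecordData),
              "Record must not add data members");

// Reads a member of type T located `offset` bytes into the object at `base`.
// This mimics how a library would use the offsets it was given.
template <typename T>
T member_at(const RecordData* base, std::size_t offset) {
  assert(base != nullptr);
  T value;
  std::memcpy(&value, reinterpret_cast<const char*>(base) + offset, sizeof(T));
  return value;
}
```

With HDF5, the same offsets, for example HOFFSET(RecordData, id), would be passed to CompType::insertMember, and the compound type would be created with sizeof(RecordData), as in the sample code earlier in the thread.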

FYI:
My compound data structure is quite complex and consists of several nested VLEN data types, including strings and other VLEN data as well as atomic types. I need those functions to convert data structures from other APIs to H5 compound data types, but also for memory management.

~Mathias

Hi Mathias,

  Looks OK to me, but I'm far from enough of a C++ expert to comment about the style. :-) You could also just use the offsetof() macro directly, if that works better.

  Quincey

On Mar 1, 2011, at 8:21 AM, Mathias Wilhelm wrote:
[original message snipped]


Hi Quincey,
Quincey Koziol wrote:

  Looks OK to me, but I'm far from enough of a C++ expert to comment about the style. :-) You could also just use the offsetof() macro directly, if that works better.

I tried this, but I still get the same compiler message. HOFFSET and offsetof are basically the same, since HOFFSET is a redefinition of offsetof in H5Tpublic.h.
The main problem lies in the struct: offsetof() can only be used on POD ("plain old data") types.

However, I will tell you if I have trouble with this version.

Thanks,
~Mathias
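
For reference, a sketch of how one might check which structs qualify for offsetof(), assuming a C++11 or later compiler (the GCC 4.4 used in this thread defaults to the older and stricter C++03 rules). The three struct names and the is_pod_like helper are hypothetical. In C++11 terms, offsetof() is specified for standard-layout types, and "POD" means standard-layout plus trivial.

```cpp
#include <type_traits>  // std::is_standard_layout, std::is_trivial

// Ordinary member functions affect neither layout nor triviality.
struct WithMethod {
  int   acc;
  char* id;
  int accession() const { return acc; }
};

// A user-provided constructor makes the type non-trivial (so not POD),
// but it remains standard-layout.
struct WithCtor {
  int   acc;
  char* id;
  WithCtor() : acc(0), id(nullptr) {}
};

// A virtual function (typically implemented with a hidden vtable pointer)
// makes the type neither standard-layout nor trivial.
struct WithVirtual {
  int   acc;
  char* id;
  virtual ~WithVirtual() {}
};

// C++11 formulation of "POD": standard-layout and trivial.
template <typename T>
constexpr bool is_pod_like() {
  return std::is_standard_layout<T>::value && std::is_trivial<T>::value;
}

static_assert(is_pod_like<WithMethod>(), "methods alone keep a struct POD");
static_assert(!is_pod_like<WithCtor>(), "a user-provided constructor breaks POD");
static_assert(std::is_standard_layout<WithCtor>::value, "but not standard layout");
static_assert(!std::is_standard_layout<WithVirtual>::value, "virtuals break layout");
```

By these rules, a struct that only gains ordinary member functions should still work with offsetof(). The warning quoted in this thread suggests that the real structs also have constructors, destructors, or similar, though that is a guess, since they are not shown.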