Limitation with vlen-struct-vlen? [SEC=UNCLASSIFIED]

UNCLASSIFIED

Hi all,

We seem to have tripped a bug in HDF 1.8.14, unless it is an intentional limitation?

We have a scalar dataset with type VLEN { STRUCT { double-precision-floating-point, C-string } }. The memory space uses 24 bytes of storage per element while the file space uses 12 for this format, and so a "struct(no-opt)" conversion path is selected internally by HDF for reads and writes.

We initially write 100x { number, string } pairs into the dataset, and that works without problems.

We then *over-write* with 99x { number, string } pairs. This causes an error when, having actually completed the write of the new 99x elements internally, the following code is run from H5Tconv.c:3371...

                        if(!noop_conv) {
                            /* For nested VL case, free leftover heap objects from the deeper level if the length of new data elements is shorter than the old data elements.*/
                            if(nested && seq_len < bg_seq_len) {
                                size_t parent_seq_len;
                                const uint8_t *tmp;
                                size_t u;

                                /* TMP_P is reset each time in the loop because DST_BASE_SIZE may include some data in addition to VL info. - SLU */
                                for(u = seq_len; u < bg_seq_len; u++) {
                                    tmp = (uint8_t *)tmp_buf + u * dst_base_size;
                                    UINT32DECODE(tmp, parent_seq_len);
                                    if(parent_seq_len > 0) {
                                        H5F_addr_decode(dst->shared->u.vlen.f, &tmp, &(parent_hobjid.addr));
                                        UINT32DECODE(tmp, parent_hobjid.idx);
                                        if(H5HG_remove(dst->shared->u.vlen.f, dxpl_id, &parent_hobjid) < 0)
                                            HGOTO_ERROR(H5E_DATATYPE, H5E_WRITEERROR, FAIL, "Unable to remove heap object")
                                    } /* end if */
                                } /* end for */
                            } /* end if */
                        } /* end if */

Conceptually this makes sense: the leftover elements should have their nested VLEN parts freed. However, the pointer arithmetic for "tmp" points to the start of the { number, string } PAIR, so UINT32DECODE fills the parent_seq_len variable not with the length of the string (the nested VLEN) but with the first 4 bytes of the floating-point number in the structure. This seems to be a bug. It seems there should be some generic processing here, similar to how H5T_convert() is called recursively to prepare the write, as there may be more than one nested VLEN part.

I was wondering whether HDF is currently supposed to support VLEN-STRUCT-VLEN nested types in scalar datasets?

Thanks in advance,
Mark Hodson

IMPORTANT: This email remains the property of the Department of Defence and is subject to the jurisdiction of section 70 of the Crimes Act 1914. If you have received this email in error, you are requested to contact the sender and delete the email.

Mark,

This is a bug. Please provide us with an example to reproduce the problem.

Thank you!

Elena

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org<mailto:Hdf-forum@lists.hdfgroup.org>
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

UNCLASSIFIED
Hi Elena, I knew you'd ask for an example!

I've managed to reproduce the problem within the HDF test harness. Please find attached a patch for trunk/test/dsets.c for your SVN trunk at revision 27127 that exhibits the failure.

We're still unable to fix this issue in the HDF code base. Can you please provide me with an HDFFV issue number we can record in our bug tracking system?

Thanks!
Mark

dsets.c.r27127.patch (6.35 KB)

Hi Mark,

Thank you for the patch! The issue number is HDFFV-9408.

Elena

UNCLASSIFIED
Hi again,

Good news! I tried a bit harder and have managed to fix the problem within the HDF library. I have run your regression tests (Windows, MSVC 9.0 / VS 2008, 32-bit) and all pass, including the updated test for this issue contained in the updated patch I have attached (see below). I did it by writing a new internal static function that performs the necessary recursion to locate the nested VLEN allocations within the background buffer. The only concern I have is whether the use of UINT32DECODE() (which was there before - that's not my doing!) is correct for HDF addressing on 64-bit platforms; there might be some issues with the mix of "size_t" and "uint32_t", so please test on a 64-bit platform as well.

What are these patches?

dsets.c.r27127 = A patch wrt the HDF SVN trunk at r27127 which trips the bug that we identified regarding over-writing a scalar dataset with a shorter VLEN payload than was there previously. This uses ARRAY, COMPOUND and VLEN types (including VLEN strings) to test all data types that could lead to the bug. This replaces the one I sent you yesterday (and it adds ARRAY types to the test).

H5Tconv.c.r27127 = A patch wrt the HDF SVN trunk at r27127 that fixes the above problem by adding an internal static function. The function recurses over the datatype descriptor to locate every nested VLEN allocation, and it does so for each element in the background buffer that needs to be freed.

Best of luck.
Mark

dsets.c.r27127.patch (7.02 KB)

H5Tconv.c.r27127.patch (4.5 KB)

Wow! That was fast! Thank you!

I updated the JIRA issue.

We always test on a 64-bit platform, no worries.

Thanks again!

Elena