Hello,
I have encountered an error that appears to be non-deterministic. Two other
developers have reported the same issue, but neither thread reached a
conclusion:
1. http://hdf-forum.184993.n3.nabble.com/H5SL-insert-common-can-t-insert-duplicate-key-td4026817.html
2. http://hdf-forum.184993.n3.nabble.com/H5SL-duplicate-key-errors-with-HDF5-1-8-13-td4027300.html
My process is a single-threaded write from CSV to HDF5, run under a number
of different configurations that vary the compression algorithm (I use the
LZO and BLOSC filters), the chunk size, and the incoming dataset size
(number of rows). I use only the HDF5 Table API, with one minor
modification to H5TBmake_table that lets me use other compressors and
compression levels. I made no other changes to the API code.
I run 100 configurations sequentially, and each one creates one new H5
file. All configurations are fed the same input CSV files. Each H5 file is
about 2-3 GB with compression enabled. The whole process takes hours to
complete.
One of the configurations failed in the function H5TBappend_records(), but
when I singled out the problematic configuration and reran the write with
it alone, the whole write completed without error. In other words, the
error appears to occur only when all 100 configurations are run in a
single batch.
The complete error message is as follows:
-----
Running Configuration #22/100:
HDF5-DIAG: Error detected in HDF5 (1.8.13) thread 0:
#000: H5Tnative.c line 122 in H5Tget_native_type(): unable to register
data type
major: Datatype
minor: Unable to register new atom
#001: H5I.c line 895 in H5I_register(): can't insert ID node into skip
list
major: Object atom
minor: Unable to insert object
#002: H5SL.c line 995 in H5SL_insert(): can't create new skip list node
major: Skip Lists
minor: Unable to insert object
#003: H5SL.c line 687 in H5SL_insert_common(): can't insert duplicate key
major: Skip Lists
minor: Unable to insert object
H5TBappend_table returns negative value at file: test_file.csv
-----
I am sure that the process did not try to insert the same key into the
group hierarchy more than once, because when I reran the one problematic
configuration on its own, I could not reproduce the error. The process
ran correctly instead.
Moreover, on different machines the error occurred in different
configurations.
I looked at the code of the function H5SL_insert_common(). Is it possible
that the hashval is running out of space?
Any idea what causes this error? What is a skip list? Is there a possible
workaround?
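To make the hashval question concrete, here is a toy model in C of one way
a "duplicate key" could show up only in long runs: keys handed out by a
counter of limited width wrap around and land on a key that was never
released. This is not HDF5 code. Registry, registry_register, simulate and
the 4-bit ID width are all made up for illustration, and a plain sorted
linked list stands in for the skip list. Whether HDF5 1.8.13 actually
behaves like this is exactly what I am asking.

```c
#include <stdlib.h>

/* Toy model only. This is NOT HDF5's H5I/H5SL code; all names are hypothetical. */
#define ID_BITS 4u                        /* assumed, deliberately tiny key space */
#define ID_MASK ((1u << ID_BITS) - 1u)

typedef struct Node {
    unsigned key;
    struct Node *next;
} Node;

typedef struct {
    Node *head;        /* keys kept in ascending order */
    unsigned next_id;  /* next key to hand out, wraps after ID_MASK */
} Registry;

/* Insert key in sorted position. Returns 0 on success, -1 if the key is
 * already present (the "duplicate key" case) or if allocation fails. */
static int registry_insert(Registry *r, unsigned key)
{
    Node **link = &r->head;
    while (*link != NULL && (*link)->key < key)
        link = &(*link)->next;
    if (*link != NULL && (*link)->key == key)
        return -1;
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->next = *link;
    *link = n;
    return 0;
}

/* Remove key if present; do nothing otherwise. */
static void registry_remove(Registry *r, unsigned key)
{
    Node **link = &r->head;
    while (*link != NULL && (*link)->key != key)
        link = &(*link)->next;
    if (*link != NULL) {
        Node *dead = *link;
        *link = dead->next;
        free(dead);
    }
}

/* Hand out the next ID from the wrapping counter and insert it.
 * Returns the ID, or -1 if that ID is still registered. */
static int registry_register(Registry *r)
{
    unsigned id = r->next_id;
    r->next_id = (r->next_id + 1u) & ID_MASK;
    return registry_insert(r, id) == 0 ? (int)id : -1;
}

/* Register `calls` IDs one after another. Each ID is released right away,
 * except that when leak_every > 0 every call whose index is a multiple of
 * leak_every keeps its ID (a simulated leak).
 * Returns the index of the first failed registration, or -1 if none failed. */
static long simulate(long calls, long leak_every)
{
    Registry r = { NULL, 0u };
    long failed_at = -1;
    for (long i = 0; i < calls; i++) {
        int id = registry_register(&r);
        if (id < 0) {
            failed_at = i;
            break;
        }
        if (leak_every <= 0 || i % leak_every != 0)
            registry_remove(&r, (unsigned)id);
    }
    while (r.head != NULL)              /* free whatever is still registered */
        registry_remove(&r, r.head->key);
    return failed_at;
}
```

In this model simulate(1000, 0) never fails, because every key is released
before the counter comes back to it. With a leak, a short run such as
simulate(10, 5) still finishes cleanly, while a long one such as
simulate(1000, 5) fails once the counter wraps onto a leaked key. That
would be consistent with a single configuration succeeding on its own, but
it is only a guess at the mechanism.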
My HDF5 installation configuration is as follows (already built with
thread safety enabled):
----------------------------------------------------------------
SUMMARY OF THE HDF5 CONFIGURATION
=================================
General Information:
-------------------
HDF5 Version: 1.8.13
Configured on: Mon Sep 8 10:45:51 EDT 2014
Configured by: hidden
Configure mode: production
Host system: x86_64-unknown-linux-gnu
Uname information: Linux 2.6.32-431.20.3.el6.x86_64 #1 SMP Thu
Jun 19 21:14:45 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Byte sex: little-endian
Libraries: static, shared
Installation point: hidden
Compiling Options:
------------------
Compilation Mode: production
C Compiler: /usr/bin/gcc ( gcc (GCC) 4.6.2 20111027 )
CFLAGS:
H5_CFLAGS: -std=c99 -pedantic -Wall -Wextra -Wundef
-Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align
-Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes
-Wmissing-prototypes -Wmissing-declarations -Wredundant-decls
-Wnested-externs -Winline -Wno-long-long -Wfloat-equal
-Wmissing-format-attribute -Wmissing-noreturn -Wpacked
-Wdisabled-optimization -Wformat=2 -Wendif-labels
-Wdeclaration-after-statement -Wold-style-definition -Winvalid-pch
-Wvariadic-macros -Wnonnull -Winit-self -Wmissing-include-dirs
-Wswitch-default -Wswitch-enum -Wunused-macros -Wunsafe-loop-optimizations
-Wc++-compat -Wstrict-overflow -Wlogical-op -Wlarger-than=2048 -Wvla
-Wsync-nand -Wframe-larger-than=16384 -Wpacked-bitfield-compat
-Wstrict-aliasing -Wstrict-overflow=5 -Wjump-misses-init
-Wunsuffixed-float-constants -Wdouble-promotion -Wsuggest-attribute=const
-Wtrampolines -O3 -fomit-frame-pointer -finline-functions
AM_CFLAGS:
CPPFLAGS:
H5_CPPFLAGS: -D_POSIX_C_SOURCE=199506L -DNDEBUG
-UH5_DEBUG_API
AM_CPPFLAGS: -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE
-D_BSD_SOURCE
Shared C Library: yes
Static C Library: yes
Statically Linked Executables: no
LDFLAGS:
H5_LDFLAGS:
AM_LDFLAGS: -L/usr/local/szlib/lib/lib
Extra libraries: -lpthread -lz -lrt -ldl -lm
Archiver: ar
Ranlib: ranlib
Debugged Packages:
API Tracing: no
Languages:
----------
Fortran: no
C++: yes
C++ Compiler: /usr/bin/c++
C++ Flags:
H5 C++ Flags:
AM C++ Flags:
Shared C++ Library: yes
Static C++ Library: yes
Features:
---------
Parallel HDF5: no
High Level library: yes
Threadsafety: yes
Default API Mapping: v18
With Deprecated Public Symbols: yes
I/O filters (external): deflate(zlib)
I/O filters (internal): shuffle,fletcher32,nbit,scaleoffset
MPE: no
Direct VFD: no
dmalloc: no
Clear file buffers before write: yes
Using memory checker: no
Function Stack Tracing: no
Strict File Format Checks: no
Optimization Instrumentation: no
Large File Support (LFS): yes
-----------------------------------------------------------------------
Thanks.
Best,
Ching-Chia