design considerations for PyTables

I don't use Python at all myself, so please excuse these questions if they seem a little obvious. However, the code I'm writing will generate data that people will probably want to process with PyTables, so I need to take it into account.

Anyway, what sort of things do I need to take into account when defining an HDF structure in order to make it most compatible with Python/PyTables? I've heard from folks who have done a little digging that having spaces in field names of compound types can cause difficulties. Is that true? Do I need to restrict names to a specific set of characters? Are there any other such gotchas that I need to be aware of?

Thanks :)

(For the record, I did poke around a bit in the online PyTables documentation, but didn't see any such information in my brief skimming)

Hi John,

On Wednesday 12 January 2011 19:49:49, John Knutson wrote:

> I don't use python at all, myself, so please excuse these questions
> if they seem a little obvious... However, the code I'm writing will
> be generating data that people will probably want to process using
> PyTables, so I need to take it into account..
>
> Anyway, what sort of things do I need to take into account when
> defining an HDF structure in order to make it most compatible with
> Python/PyTables? I've heard from folks who have done a little
> digging that having spaces in field names of compound types can
> cause difficulties. Is that true? Do I need to restrict names to a
> specific set of characters?

No. Spaces in field names (as well as in node names) are supported.
The only drawback in doing so is that you won't be able to use the
natural naming technique for accessing field names. For example, with
PyTables you can access fields in tables with the `cols` accessor and
natural naming this way:

table.cols.info2[1:5]
table.cols.info2.info3[1:5] # nested field

but if your field names contain spaces, you should use the
`_f_col` method instead:

table.cols._f_col('info 2')
table.cols._f_col('info 2/info 3') # nested field

I personally find natural naming more convenient to use, especially
interactively, so you may want to restrict yourself and avoid
spaces in names (but you are not forced to).
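For anyone producing files from outside Python, here is a minimal sketch of how candidate field names could be screened for natural naming. It uses only the standard library and rests on one assumption: a name works after a dot (as in `table.cols.<name>`) only if it is a plain ASCII Python identifier and not a Python keyword. The helper names `supports_natural_naming` and `check_field_names` are made up for this illustration and are not part of PyTables; PyTables may also reserve further names for its own attributes, which this sketch does not check.

```python
import keyword
import re

# Assumption: natural naming needs a plain ASCII Python identifier.
_NATURAL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def supports_natural_naming(name):
    """Return True if `name` could be written after a dot, e.g. table.cols.<name>."""
    # fullmatch() requires the whole string to match, so "info 2" is rejected.
    return bool(_NATURAL_NAME.fullmatch(name)) and not keyword.iskeyword(name)


def check_field_names(names):
    """Split names into (usable with natural naming, needing string-based access)."""
    ok = [n for n in names if supports_natural_naming(n)]
    needs_lookup = [n for n in names if not supports_natural_naming(n)]
    return ok, needs_lookup


# Hypothetical field names, chosen only to exercise each case.
ok, needs_lookup = check_field_names(
    ["info2", "info 2", "2nd_reading", "class", "temp_K"]
)
print(ok)            # ['info2', 'temp_K']
print(needs_lookup)  # ['info 2', '2nd_reading', 'class']
```

Names in the second list are still usable, but only through string-based access such as the `_f_col` method shown above.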

> Are there any other such gotchas that I
> need to be aware of?

Yes. PyTables does not support the entire HDF5 specification, only a
'large enough' subset of its features. See:

http://www.pytables.org/docs/manual/apf.html

for an almost complete list of the HDF5 features that are supported
(I see now that links are not listed there, but they are supported
too).

> (For the record, I did poke around a bit in the online PyTables
> documentation, but didn't see any such information in my brief
> skimming)

More info on column accessors:

http://www.pytables.org/docs/manual/ch03.html#id333050

For future questions on PyTables, please send them to the PyTables
mailing list:

https://lists.sourceforge.net/lists/listinfo/pytables-users

Hope that helps,


--
Francesc Alted