[netcdfgroup] Subsetting data with C++ API calls

Hi

It is an excellent suggestion, and one that I also follow.

Both netCDF and HDF5 are written in C, and both have C++ "wrappers": C++ classes that simply call the C API.

Over the years I have written programs that use the netCDF/HDF5 libraries, either in C, like h5diff, or C++, like h5merge. h5merge re-does much of h5diff in a C++ way.

h5diff was written in 2003, and at the time it did not occur to me (or, probably, to anybody else at the then NCSA HDF Group) that it might
as well be written in C++.

Advantages of writing a C++ program that uses the C APIs:

1) C++ is a much more powerful language than C.

2) It has libraries like the STL (Standard Template Library), which provides data structures like vectors, lists and maps.

http://www.cplusplus.com/reference/stl/

3) The C++ "wrappers" just add another layer of functions with the same name as the underlying API.

If all they do is call the C API, why not do that myself in my program, avoiding that extra layer of functions that I don't need?

4) The C++ wrappers offer only a subset of the C API. Some functions that are sometimes needed are not available.
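As a minimal sketch of point 2 combined with direct C calls: a std::vector can own the buffer that a C-style read function fills. The function c_style_read below is a hypothetical stand-in, not a real netCDF or HDF5 call; a real program would call the library's own read function at that point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C API read call that fills a caller-supplied
// buffer and returns 0 on success. A real program would call the library here.
static int c_style_read(float* buffer, std::size_t n)
{
    if (buffer == nullptr && n > 0) return -1;
    for (std::size_t i = 0; i < n; ++i)
        buffer[i] = static_cast<float>(i);
    return 0;
}

// C++ caller: the vector owns the memory and data() hands the C function a
// raw pointer, so there is no malloc/free pair to get wrong.
std::vector<float> read_values(std::size_t n)
{
    std::vector<float> values(n);
    const int status = c_style_read(values.data(), values.size());
    assert(status == 0);
    (void)status;  // avoids an unused-variable warning when NDEBUG is defined
    return values;
}
```

Calling read_values(4) returns a vector of four floats that is freed automatically when it goes out of scope.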

Pedro


------
Pedro Vicente, Earth System Science
University of California, Irvine
http://www.ess.uci.edu/

----- Original Message ----- From: Taylor Binnington
To: Lynnes, Christopher S. (GSFC-6102)
Cc: netcdfgroup
Sent: Thursday, March 07, 2013 6:31 PM
Subject: Re: [netcdfgroup] Subsetting data with C++ API calls

Thank you, that's an excellent suggestion Christopher. I've spent the past few days using the regular C libraries with much more success, and minimal bandwidth usage. I didn't really realize how simple that could be until I tried: I had just assumed that if I wanted to program in C++, I would have to use the C++ libraries.
Thanks again!

----- Original Message ----- From: "Lynnes, Christopher S. (GSFC-6102)" <christopher.s.lynnes@nasa.gov>
To: "Taylor Binnington" <tbinnington@gmail.com>
Cc: "netcdfgroup" <netcdfgroup@unidata.ucar.edu>
Sent: Sunday, March 03, 2013 4:18 PM
Subject: Re: [netcdfgroup] Subsetting data with C++ API calls

Taylor,
I can't help thinking that the C++ library you are using seems a little more brittle than the C route at this phase in its evolution. Have you considered making calls to the C API from your C++ code? The methods for extracting subsets of variables are quite clear in the C API...
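For illustration: the netCDF C API reads a subset with calls such as nc_get_vara_float(ncid, varid, start, count, buffer), where start gives the first index along each dimension and count gives how many elements to take. The sketch below is not the library call itself. It only mimics the start/count convention on a row-major 2-D array held in memory, and the function name read_subset_2d is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy a rectangular subset out of a row-major 2-D array held in memory.
// start[d] is the first index along dimension d and count[d] is how many
// elements to take along it, the same convention the netCDF C "vara"
// functions use for their start/count arguments.
std::vector<float> read_subset_2d(const std::vector<float>& data,
                                  std::size_t nrows, std::size_t ncols,
                                  const std::size_t start[2],
                                  const std::size_t count[2])
{
    assert(data.size() == nrows * ncols);
    assert(start[0] + count[0] <= nrows);
    assert(start[1] + count[1] <= ncols);

    std::vector<float> out;
    out.reserve(count[0] * count[1]);
    for (std::size_t i = 0; i < count[0]; ++i)
        for (std::size_t j = 0; j < count[1]; ++j)
            out.push_back(data[(start[0] + i) * ncols + (start[1] + j)]);
    return out;
}
```

With a 3 x 4 grid, start = {1, 2} and count = {2, 2} select the last two columns of the last two rows. Over a remote connection, only the requested block needs to be transferred.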

On Mar 3, 2013, at 4:34 PM, Taylor Binnington <tbinnington@gmail.com> wrote:

Hello,

I'm attempting to read only certain parts (specific indices of specific variable arrays), remotely, from a MERRA HDFEOS file.

I've recently upgraded to NetCDF 4.2.1.1, using Lynton's C++ library. At first, I was trying to subset the data directly from an OPeNDAP URL supplied to NcFile, but it's been suggested to me, by an earlier post in the OPeNDAP forums, that this is not a good way to go. Instead, I should use NetCDF API calls.

I have carefully read through the C++ interface guide, including this example:

http://www.unidata.ucar.edu/software/netcdf/docs/cxx4/test_var_8cpp-example.html

but am struggling to understand how to do this. A push in the right direction would be much appreciated. The following attempt (I don't fully understand it, but was trying to emulate parts of the example linked above)

#include <iostream>
#include <netcdf>
int main() {
NcFile dataFile("http://goldsmr2.sci.gsfc.nasa.gov/opendap/hyrax/MERRA/MAT1NXSLV.5.2.0/1991/01/MERRA100.prod.assim.tavg1_2d_slv_Nx.19910101.hdf", NcFile::read);
NcGroup grouptest(dataFile.addGroup("Dataset"));
}

gives me the error:

terminate called after throwing an instance of 'netCDF::exceptions::NcNotNc4'
  what(): NcNotNc4: Attempting netcdf-4 operation on netcdf-3 file.
file: ncGroup.cpp line:265
Aborted

This is not surprising, since the file is not a NetCDF-3 file.

Thank you in advance.
Taylor

--
Taylor Binnington
e. tbinnington@gmail.com

_______________________________________________
netcdfgroup mailing list
netcdfgroup@unidata.ucar.edu
For list information or to unsubscribe, visit: http://www.unidata.ucar.edu/mailing_lists/

--
Dr. Christopher Lynnes, NASA/GSFC, ph: 301-614-5185


Hi Dennis, Lynton

Writing libraries in C is a far better solution than C++, indeed.

My point was that, given the choice as a user writing a program (not a library) in C or C++, I prefer C++.

These are two different things: libraries, and programs that use libraries.

An example

I wrote an API that had to make HDF5 API calls, and the calling program was
a C++ program.

This API is called H5NX. It is actually a "library", but we can also think of it as a layer that calls a library, in this case the HDF5 C library (not the HDF5 C++ library).

http://www.space-research.org/nexus/h5nx.html

It is similar to the HDF5 Lite "High-Level" API

http://www.hdfgroup.org/HDF5/hdf5_hl/doc/RM_hdf5lt.html

It lets you write HDF5 datasets, groups and attributes, taking advantage of
some of the features Lynton mentioned, especially

(i) use of parametric polymorphism (C++ template functions)

Here's an example of 3 calls that write an HDF5 file with a dataset

int main()
{
  H5nx h5nx;

  if (h5nx.H5NXcreate_file("h5nx_ex_scalar.h5") < 0) { };

  if (h5nx.H5NXmake_dataset_scalar("/", "dset", float(1)) < 0) { };

  if (h5nx.H5NXclose_file() < 0) { };

  return 0;
}

If you take a look at the C HDF5 Lite High-Level API, there are about 7 functions to
write a dataset, one for each type.

In the example above, there is just 1 function, with this name

h5nx.H5NXmake_dataset_scalar("/", "dset", float(1));

By means of C++ templates, the function can be used for any defined type.

The argument here is of type "float", so in this case the float instantiation of the
template is called.
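The idea can be sketched roughly as follows. This is not the H5NX source: TypeTag, TagOf and DatasetRequest are hypothetical names invented for the illustration, and the function only records what a real wrapper would pass down to the C library instead of writing a file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical type tags standing in for a C library's per-type constants.
enum class TypeTag { UInt8, Int32, Float32, Float64 };

// Maps a C++ type to its tag. The primary template is left undefined, so an
// unsupported type is rejected at compile time.
template <typename T> struct TagOf;
template <> struct TagOf<std::uint8_t> { static constexpr TypeTag tag() { return TypeTag::UInt8; } };
template <> struct TagOf<std::int32_t> { static constexpr TypeTag tag() { return TypeTag::Int32; } };
template <> struct TagOf<float>        { static constexpr TypeTag tag() { return TypeTag::Float32; } };
template <> struct TagOf<double>       { static constexpr TypeTag tag() { return TypeTag::Float64; } };

// What a real wrapper would hand to the underlying C call.
struct DatasetRequest {
    std::string group;
    std::string name;
    TypeTag tag;
    std::size_t bytes;
};

// One function name for every supported type; the compiler picks the
// instantiation from the type of the value argument.
template <typename T>
DatasetRequest make_dataset_scalar(const std::string& group,
                                   const std::string& name, T /*value*/)
{
    assert(!name.empty());
    return DatasetRequest{group, name, TagOf<T>::tag(), sizeof(T)};
}
```

A call such as make_dataset_scalar("/", "dset", float(1)) selects the float instantiation, while passing std::uint8_t(1) selects the unsigned 8-bit one, with no per-type function names.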

Not only that, but another thing that I thought worked out well
was the way locations are specified.

I took advantage of the C++ string class, so I can do things like

std::string path;

path = "/entry";
h5nx.H5NXmake_group(path, "NXentry");
path += "/";
path += DATASET_NAME;
h5nx.H5NXmake_attribute_scalar(path, "attr", uint8_t(1));

basically, just concatenating names with "/", the HDF5 and netCDF
path separator.
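A small helper can make this kind of concatenation less error-prone. The function join_path below is a hypothetical name for this sketch and is not part of H5NX, HDF5 or netCDF.

```cpp
#include <cassert>
#include <string>

// Join a parent location and a child name with the "/" separator,
// avoiding a doubled slash when the parent is the root group "/"
// or already ends in a slash.
std::string join_path(const std::string& parent, const std::string& child)
{
    assert(!child.empty());
    if (parent.empty() || parent == "/")
        return "/" + child;
    if (parent.back() == '/')
        return parent + child;
    return parent + "/" + child;
}
```

For example, join_path("/entry", "data") yields "/entry/data", and join_path("/", "entry") yields "/entry".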

As you can see, there are absolutely no identifiers (IDs for short) here.

All is done with constructed path names.

HDF5 and netCDF use IDs to join things together.

For example

to create an HDF5 dataset

H5Dcreate( hid_t loc_id, const char *name, hid_t dtype_id, hid_t space_id,
hid_t lcpl_id, hid_t dcpl_id, hid_t dapl_id )

You have no more and no less than 6 IDs... They may offer some powerful features,
but maybe I just want to do a simple thing, like writing a scalar float dataset,
without worrying about finding out about all these IDs?

It is like my example of IDs as the equivalent of the paper ticket number I am given when I take the train and want to leave my bags at a station.
Do I need 6 IDs to get my bags back? No :)

This was actually the motivation for the HDF5 Lite "High-Level"
API: to "wrap" HDF5 C calls into "High-Level" functions.

Since Lite is C, it still has IDs as parameters. If it were C++, I could have *encapsulated* all HDF5 IDs inside some "Lite" class, which is basically what I did in H5NX.

So, all the advantages are with C++.
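As a rough sketch of that kind of encapsulation (not the H5NX implementation), a class can keep the ID in a private member and close it in its destructor. The fake_c_api namespace below is a made-up stand-in for a C library that hands out integer IDs; it is not the HDF5 or netCDF API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a C library that hands out integer IDs.
// open_count() tracks how many IDs are still open.
namespace fake_c_api {
    inline int& open_count() { static int n = 0; return n; }
    inline int file_create(const char* /*name*/) { ++open_count(); return 42; }
    inline int file_close(int id)
    {
        if (id < 0) return -1;
        --open_count();
        return 0;
    }
}

// The ID lives in a private member; callers only ever see names.
class File {
public:
    explicit File(const std::string& name)
        : id_(fake_c_api::file_create(name.c_str()))
    {
        assert(id_ >= 0);
    }
    // The destructor releases the ID, so it cannot be leaked or closed twice.
    ~File() { if (id_ >= 0) fake_c_api::file_close(id_); }

    // Copying is disabled so that two objects never own the same ID.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    bool is_open() const { return id_ >= 0; }

private:
    int id_;
};
```

The calling program creates a File by name and never handles the underlying ID; the ID is closed when the object goes out of scope.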

But... beneath this H5NX API, I had to make HDF5 function calls

Then I had two choices

1) Use the HDF5 C++ API
2) Use the HDF5 C API.

Looking at the HDF5 C++ API, I saw no reason why I should use it instead of
just making direct HDF5 C API calls, with which I was much more familiar to begin with.
It was just an extra layer that I did not need.

Have a good C (or C++) weekend :)

Pedro

Note: This email thread started on the netcdf list, and I have also forwarded
it to the HDF5 mailing list.

PS: Lynton, regarding "I intended to release this 15 months ago, but unfortunately it has been delayed due to a bug, we think in the HDF5 layer": did you try help@hdfgroup.org?


------
Pedro Vicente, Earth System Science
University of California, Irvine
http://www.ess.uci.edu/

----- Original Message ----- From: "Appel, Lynton" <Lynton.Appel@ccfe.ac.uk>
To: "Lynnes, Christopher S. (GSFC-6102)" <christopher.s.lynnes@nasa.gov>
Cc: "netcdfgroup" <netcdfgroup@unidata.ucar.edu>; "Taylor Binnington" <tbinnington@gmail.com>
Sent: Friday, March 08, 2013 2:22 AM
Subject: Re: [netcdfgroup] Subsetting data with C++ API calls

Hello,

I agree with the comments made in this email conversation. However, it may be useful to understand the motivation for implementing a specific C++ API.
In essence, using it in C++ code should appear simpler and be more robust than using the C API. This is achieved by
(i) use of parametric polymorphism (C++ template functions)
(ii) encapsulation (existence of private data).
(iii) error handling: an error automatically "throws" an exception. There is extensive consistency checking within the API.
(iv) inheritance (this relates classes with similar "traits", e.g. NcFile and NcGroup; also NcType, NcVLen, NcEnumType etc.).
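Point (iii) can also be approximated in a program that calls the C API directly. The sketch below assumes the netCDF convention that a status of 0 (NC_NOERR) means success and anything else is an error; the helper name check is hypothetical and not part of either library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Turn a C-style status code into a C++ exception.
// Assumes 0 means success and any other value is an error code.
inline void check(int status, const std::string& what)
{
    assert(!what.empty());
    if (status != 0)
        throw std::runtime_error(what + " failed with status " +
                                 std::to_string(status));
}
```

Each C call is then wrapped as check(some_c_call(...), "description"), so a failure cannot be silently ignored.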

The use of complicated user-defined types in the existing C API can be difficult, particularly if you
do not know the sizes of elements at compile time. This is because you need to deal with offsets and
alignment issues of the data components. I have implemented a C++ API that can handle this
for user-defined types of arbitrary complexity. I intended to release this 15 months ago, but unfortunately
it has been delayed due to a bug, we think in the HDF5 layer.

Lynton Appel

----- Original Message ----- From: "Dennis Heimbigner" <dmh@unidata.ucar.edu>
To: "Pedro Vicente" <pvicente@uci.edu>
Cc: "Lynnes, Christopher S. (GSFC-6102)" <christopher.s.lynnes@nasa.gov>; "Taylor Binnington" <tbinnington@gmail.com>; "HDF Users Discussion List" <hdf-forum@hdfgroup.org>; <netcdfgroup@unidata.ucar.edu>
Sent: Friday, March 08, 2013 9:11 AM
Subject: Re: [netcdfgroup] Subsetting data with C++ API calls

I must disagree. Writing libraries in C is a far
better solution than C++, primarily because
almost all programming language systems
(Java, Python, etc.) can access C functions,
but few can access C++ because of name mangling
issues.
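For context, a library implemented in C++ can still export an unmangled, C-callable entry point with extern "C". The function stats_mean below is a hypothetical example for illustration, not part of netCDF or HDF5.

```cpp
#include <cassert>
#include <cstddef>

// extern "C" gives the function C linkage, so its symbol name is not
// mangled and other languages can look it up by its plain name.
// The interface uses only C types and reports errors through the return
// value, since C++ exceptions must not cross this boundary.
extern "C" int stats_mean(const double* values, std::size_t n, double* result)
{
    if (values == nullptr || result == nullptr || n == 0)
        return -1;  // invalid arguments

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += values[i];
    *result = sum / static_cast<double>(n);
    return 0;  // success
}
```

The body is free to use C++ internally; only the exported signature has to stay within what C can express.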

=Dennis Heimbigner
Unidata
