Segmentation fault on NERSC

Hello All,

I am trying to do a large run of a code on a NERSC cluster and get the following
error at runtime. Has anybody encountered something similar before? Segmentation
faults can be difficult to track down, so if anyone can advise me about debugging
this error, I'll be grateful. My guess is that it is caused by running out of memory.

Running job in /usr/common/homes/n/nikhill/testrun
**** Segmentation fault! Fault address: 0x28f3000
**** Segmentation fault! Fault address: 0x291b000
**** Segmentation fault! Fault address: 0x28f4000
**** Segmentation fault! Fault address: 0x28f4000
**** Segmentation fault! Fault address: 0x28f4000
**** Segmentation fault! Fault address: 0x28f4000
**** Segmentation fault! Fault address: 0x28f4000
**** Segmentation fault! Fault address: 0x28f4000
**** Segmentation fault! Fault address: 0x28f4000

Fault address is 0 bytes above the nearest valid
mapping boundary, which is at 0x291b000.
:
:
:

You can obtain a view of your program's memory map at
the time of the crash by rerunning with the F90_DUMP_MAP
environment variable set to a non-empty string.
Abort
Abort
Fault address is 0 bytes above the nearest valid
mapping boundary, which is at 0x28f3000.
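For reference, the rerun that the message suggests can be done from the job
script. A minimal sketch, where "aprun" and "./myprog" are placeholders for
whatever launcher and executable the batch script actually invokes:

```shell
# Set the runtime's map-dump variable before the launch line
# ("aprun -n 64 ./myprog" is a placeholder for the real launch command):
export F90_DUMP_MAP=1
echo "F90_DUMP_MAP=$F90_DUMP_MAP"   # confirm it is set
# aprun -n 64 ./myprog
```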

Regards,
Nikhil


----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

My usual approach would be to find out in what part of the
application it failed. Run it under gdb; the "where" command
will show a stack trace, which usually narrows things
down a bit. But to get a useful stack trace, the
program should be compiled with the -g option.
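That workflow, as a sketch. The compiler wrapper and file names are
placeholders; use whatever the Makefile already invokes (e.g. ftn or mpif90
on NERSC systems):

```shell
# Recompile with debug symbols and optimization off:
ftn -g -O0 -o myprog myprog.f90

# Run under gdb; after the crash, "where" prints the stack trace:
gdb ./myprog
(gdb) run
(gdb) where
```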

So, try compiling your program with -g. If the stack trace
shows the error is buried in some HDF5 calls, try rebuilding
the HDF5 library configured with "--disable-production", which both
allows stack tracing and turns on a lot of debugging
code in the HDF5 library. The executable will be slower,
but it may catch errors that are not detectable in production
mode.
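A sketch of that rebuild; the source directory and install prefix are
placeholders, and the application then needs to be relinked against the
debug install:

```shell
# Rebuild HDF5 in debug mode ("hdf5-src" and the prefix are placeholders):
cd hdf5-src
./configure --disable-production --prefix=$HOME/hdf5-debug
make
make install
# Relink the application against $HOME/hdf5-debug/lib and rerun.
```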

Those would be my initial approaches. Hope it helps.

-Albert


At 11:55 AM 7/16/2008, Nikhil Laghave wrote:
