
Error when opening an HDF5 Dataset with np.ndarray #202

Open
sbyrdsell opened this issue Oct 23, 2019 · 1 comment
@sbyrdsell

When I try to open an HDF5 Dataset with np.ndarray, the application quits unexpectedly without any error message. I have Python 2.7 installed.

See DataFile and screenshots.


Canada_Population.h5.zip

@ghost

ghost commented Oct 28, 2019

I tried the same HDF5 dataset with HDF Compass v0.6.0 and it worked. Since you mentioned np.ndarray, I also tried with Python 3.6.9. Below is the IPython output:

In [1]: import h5py

In [2]: h5py.version.version
Out[2]: '2.10.0'

In [3]: h5py.version.hdf5_version
Out[3]: '1.10.4'

In [4]: f = h5py.File('Canada_Population.h5', 'r')

In [5]: labels = f['/Record/Labels/Values']

In [6]: labels.shape
Out[6]: (1,)

In [7]: labels.dtype
Out[7]: dtype([('Country', 'O', (1,)), ('Continent', 'O', (1,)), ('Abbreviation', 'O', (1,)), ('Language', 'O', (2,)), ('DataSource', 'O', (1,))])

In [8]: labels[0]
Segmentation fault: 11

The reported stack trace indicates the segmentation fault happened during conversion of the dataset's data to NumPy memory structures by h5py:

0   _conv.cpython-36m-darwin.so     0x000000010efbde9a __pyx_f_4h5py_5_conv_conv_vlen2str + 186
1   _conv.cpython-36m-darwin.so     0x000000010efbdd58 __pyx_f_4h5py_5_conv_generic_converter + 680
2   _conv.cpython-36m-darwin.so     0x000000010efbc76e __pyx_f_4h5py_5_conv_vlen2str + 62
3   libhdf5.103.dylib               0x000000010bfadede H5T_convert + 478
4   libhdf5.103.dylib               0x000000010bfc9f31 H5T__conv_array + 2481
5   libhdf5.103.dylib               0x000000010bfadfc1 H5T_convert + 705
6   libhdf5.103.dylib               0x000000010bfc5314 H5T__conv_struct_opt + 2788
7   libhdf5.103.dylib               0x000000010bfadfc1 H5T_convert + 705
8   libhdf5.103.dylib               0x000000010bfadc28 H5Tconvert + 1272
9   defs.cpython-36m-darwin.so      0x000000010c1f433c __pyx_f_4h5py_4defs_H5Tconvert + 76
10  _proxy.cpython-36m-darwin.so    0x000000010f1aaa18 __pyx_f_4h5py_6_proxy_dset_rw + 2824
11  h5d.cpython-36m-darwin.so       0x000000010f1bbf2e __pyx_pw_4h5py_3h5d_9DatasetID_1read + 542
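The vlen2str frames above point at the conversion of variable-length strings held inside array fields. For reference, here is a sketch of the in-memory NumPy dtype that h5py builds for this compound type, matching Out[7] above (the construction via h5py.string_dtype is an assumption on my part; that helper exists since h5py 2.10):

```python
import numpy as np
import h5py

# Sketch: the NumPy dtype h5py maps this HDF5 compound type to.
# Each field is a small array of variable-length ASCII strings,
# which NumPy represents with object ('O') elements -- compare
# with Out[7] in the session above.
vstr = h5py.string_dtype(encoding='ascii')  # variable-length ASCII string

dt = np.dtype([
    ('Country',      vstr, (1,)),
    ('Continent',    vstr, (1,)),
    ('Abbreviation', vstr, (1,)),
    ('Language',     vstr, (2,)),
    ('DataSource',   vstr, (1,)),
])
print(dt)
```

Every element of the dataset therefore triggers several array-of-vlen-string conversions on read, which is exactly the code path in the stack trace.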

The h5dump description of the dataset's datatype is:

            DATATYPE  H5T_COMPOUND {
               H5T_ARRAY { [1] H5T_STRING {
                  STRSIZE H5T_VARIABLE;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               } } "Country";
               H5T_ARRAY { [1] H5T_STRING {
                  STRSIZE H5T_VARIABLE;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               } } "Continent";
               H5T_ARRAY { [1] H5T_STRING {
                  STRSIZE H5T_VARIABLE;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               } } "Abbreviation";
               H5T_ARRAY { [2] H5T_STRING {
                  STRSIZE H5T_VARIABLE;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               } } "Language";
               H5T_ARRAY { [1] H5T_STRING {
                  STRSIZE H5T_VARIABLE;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               } } "DataSource";
            }

which is, in my opinion, a bit unconventional: every field is an H5T_ARRAY of variable-length strings, and most of those arrays hold just one element. I'd suggest simplifying some of the compound fields if interoperability of this file format is important.
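One possible simplification, sketched below, is to drop the one-element H5T_ARRAY wrappers and store plain scalar variable-length strings, flattening the two-element "Language" field into a single delimited string. This is only an illustration: the file name, delimiter, and sample values here are mine, not from the original file.

```python
import os
import tempfile

import numpy as np
import h5py

# Hypothetical simpler layout: scalar variable-length string fields,
# no H5T_ARRAY wrappers.  The multi-valued "Language" field is
# flattened to one delimited string ("English;French").
vstr = h5py.string_dtype(encoding='ascii')
dt = np.dtype([
    ('Country',      vstr),
    ('Continent',    vstr),
    ('Abbreviation', vstr),
    ('Language',     vstr),   # e.g. "English;French"
    ('DataSource',   vstr),
])

path = os.path.join(tempfile.mkdtemp(), 'Canada_Population_simple.h5')
with h5py.File(path, 'w') as f:
    ds = f.create_dataset('/Record/Labels/Values', shape=(1,), dtype=dt)
    # Illustrative sample values only.
    ds[0] = ('Canada', 'North America', 'CA', 'English;French', 'example')

with h5py.File(path, 'r') as f:
    rec = f['/Record/Labels/Values'][0]
    print(rec['Country'], rec['Language'])
```

Reading such a dataset back should exercise only scalar vlen-string conversion, avoiding the array-of-vlen-string path shown in the stack trace above.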
