Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Writing Eigen::MatrixXd as column major. #532

Closed
1uc opened this issue Mar 31, 2022 · 11 comments
Closed

Writing Eigen::MatrixXd as column major. #532

1uc opened this issue Mar 31, 2022 · 11 comments
Labels
bug Eigen Any issues related to Eigen.

Comments

@1uc
Copy link
Collaborator

1uc commented Mar 31, 2022

I've created a PR for adding an example which writes an Eigen::MatrixXd, see #531. It's supposed to create a matrix

  0    1    2
100  101  102
...
900  901  902

This matrix is then written to file. When dumping the resulting file I get

HDF5 "eigen_matrix_example.h5" {
GROUP "/" {
   DATASET "dset" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 10, 3 ) / ( 10, 3 ) }
      DATA {
      (0,0):   0,   100, 200,
      (1,0): 300, 400, 500,
      (2,0): 600, 700, 800,
      (3,0): 900,   1, 101,
      (4,0): 201, 301, 401,
      (5,0): 501, 601, 701,
      (6,0): 801, 901,   2,
      (7,0): 102, 202, 302,
      (8,0): 402, 502, 602,
      (9,0): 702, 802, 902
      }
   }
}
}

Which is consistent with HDF5 assuming data is row-major (C-style) and the actual Eigen matrix being column-major (Fortran/BLAS-style). I couldn't immediately find any code that transposes the matrix before writing it.

Note that a write-read cycle using HighFive returns, as expected, the same matrix.

Can someone confirm that I've set up the example properly and maybe comment on the desired behaviour?

@tdegeus
Copy link
Collaborator

tdegeus commented Apr 6, 2022

Thanks for pointing this out!
I would say that this is not desired. As for as I know HDF5 in row-major and the data should be presented as such. Writing column-major and row-major and then reinterpreting it falsely to correct for it does not make any sense to me. What if you were to read it in with let's say h5py? Then everything would be mixed up!

What I do in H5Easy is to cast to row-major before writing

row_major_type row_major(data);
I think that the core should do this too

@matz-e
Copy link
Member

matz-e commented Apr 6, 2022

Casting it like that seems like a good idea (element of least surprise)! Should this be unified / lifted to "core" to avoid duplication? And would it be wise to make it an option to throw an error / assert if a type conversion happens?

Thinking of bigger matrices that one wouldn't really want to store in memory multiple times.

@tdegeus
Copy link
Collaborator

tdegeus commented Apr 6, 2022

@matz-e I must admit that casting was in part laziness that we can fix if we fix Eigen in the core. If we want to do thing nicely memory efficient we have to just use strides and write item-by-item in a loop

@tdegeus tdegeus added the bug label Apr 7, 2022
@1uc 1uc added the Eigen Any issues related to Eigen. label Mar 17, 2023
@cpp977
Copy link

cpp977 commented Apr 19, 2023

Are there any plans to fix the bug when writing column major matrices?

@1uc
Copy link
Collaborator Author

1uc commented Apr 19, 2023

There's few things that make this a difficult decision:

  • Any application that only uses HighFive to read and write HDF5 files, is not affected. The mistake happens consistently both during writing and reading. The problem only manifest if one reads an array written by some other library, pure HDF5, h5py or from within HighFive but when writing the array was not an Eigen::Matrix, but maybe an xtensor object.
  • As soon as we fix the bug, files that were written by an older version of HighFive can't be correctly read by the new version of HighFive, if any of the datasets was written from an Eigen::Matrix. This means we'd be breaking correct workflows.
  • The difference is often undetectable, unless the matrix has some property that's lost by this bug, it's virtually impossible to tell the difference. I don't know how we can write any test to warn, let alone undo the bug, if we're given some dataset to read. We don't know if it was written with HighFive, we don't know which version, we don't know what the type was from which we serialized the dataset. We can't tell from the shapes, we can't tell from the values either.
  • Let's stress this once more, after fixing the bug, a previously correct workflow will produce bad values. If the user is lucky then notice. If they're unlucky they wont notice, until they look at the final output, which only in the best case is obviously affected, worst case it's subtly wrong; but wrong.

Therefore, I'm not sure it's a good idea to fix it during the 2.x series (others may of course have different opinions).

That said, it's one of several things that need fixing in HighFive. The others being:

  • CMake & header organization,
  • Thread-safety (likely doesn't work at all),
  • this Eigen bug,
  • fixed length strings.

Both the CMake and serialization of strings would be much easier to fix if we could break API slightly. Hence, they might all get bundled, delayed until all of them have been fixed.

If you're using Eigen, maybe just avoid HighFive all together. OTOH that's a bit radical, maybe use it to create the groups, datasets and datatypes; but don't use it to write/read data from/to Eigen::Matrix. At that point you'd revert to pure HDF5, there's {group,file,dataset,...}.getId() which will give you an HDF5 identifier.

@tdegeus
Copy link
Collaborator

tdegeus commented Apr 19, 2023

Well... If I'm not mistaken I think that the H5Easy interface does read/write correctly (it casts to row-major before writing, or from row-major after reading). So it is not so trivial to say that one should not fix this bug directly to be backward compatible. What if one wrote with H5Easy and now reads with the core API?

@1uc
Copy link
Collaborator Author

1uc commented Apr 19, 2023

Yes, this is just another variation of reading data written by a different library; or from/to something that isn't an Eigen::Matrix. It's absolutely possible that someone is doing this. However, I suspect it's more common to stick to one of the sublibraries, i.e. either core or easy; but that is just a guess.

@cpp977
Copy link

cpp977 commented Apr 19, 2023

Yes, I see. This would lead to a huge break of existing code.
Maybe, it can be fixed in version 3 with a big exclamation mark in the release notes.
I did not know that H5Easy casts to row major. I can try this or do the cast by myself before writing the matrix.

@tdegeus
Copy link
Collaborator

tdegeus commented Apr 19, 2023

For reference

row_major_type row_major(data);
DataSet dataset = initDataset<value_type>(file, path, shape(data), options);
dataset.write_raw(row_major.data());

if (data.IsVectorAtCompileTime || data.IsRowMajor) {
return data;
}
using col_major = typename types<T>::col_major;
return col_major(data.data(), dims[0], dims[1]);


I don't care because I don't use the defective code. However,

I would like to point out that purposefully keeping a bug seems unwise: people might be creating new 'corrupted' data that might lead to (undetected?) errors when reading with other libraries (e.g. h5py) - which is not unlikely as having binding for many languages and programs is one of the key features of HDF5. I would rather think about a scheme with a runtime warning that can be silenced and a(n) (pre-processor) option to fall back on the current incorrect behaviour. Regardless on what default is kept until the next major update, the warning should really be there ASAP IMO.

Having an option to interpret col-major data row-major and inverse could even be considered a valid overload for permanent use. Though I'm not a big fan unless we find a use-case.

@1uc
Copy link
Collaborator Author

1uc commented Apr 19, 2023

Very nice write up of the alternative chain of reasoning.

Realistically, I don't see this getting fixed soon, as unpleasant as that may be. What can be done with little effort is to break it completely. It's already broken, we would simply make it break loudly.

1uc added a commit that referenced this issue Apr 19, 2023
HighFive hasn't been respecting that Eigen uses Fortran order, as a
result it's been writing `Eigen::Matrix` to file as if the data were C
order. It's been consistently reading datasets back into `Eigen::Matrix`
has if it already was in Fortran order.

This error is symmetric, i.e., if one writes an `Eigen::Matrix` to disk
via core HighFive and then reads that dataset back into an
`Eigen::Matrix` the outcome is correct (which has been check in CI).

More information can be found at:
#532
1uc added a commit that referenced this issue Apr 19, 2023
HighFive hasn't been respecting that Eigen uses Fortran order, as a
result it's been writing `Eigen::Matrix` to file as if the data were C
order. It's been consistently reading datasets back into `Eigen::Matrix`
has if it already was in Fortran order.

This error is symmetric, i.e., if one writes an `Eigen::Matrix` to disk
via core HighFive and then reads that dataset back into an
`Eigen::Matrix` the outcome is correct (which has been check in CI).

More information can be found at:
#532
alkino pushed a commit that referenced this issue May 12, 2023
HighFive hasn't been respecting that Eigen uses Fortran order, as a
result it's been writing `Eigen::Matrix` to file as if the data were C
order. It's been consistently reading datasets back into `Eigen::Matrix`
has if it already was in Fortran order.

This error is symmetric, i.e., if one writes an `Eigen::Matrix` to disk
via core HighFive and then reads that dataset back into an
`Eigen::Matrix` the outcome is correct (which has been check in CI).

More information can be found at:
#532
@1uc
Copy link
Collaborator Author

1uc commented Feb 21, 2024

Eigen support has been fixed/re-implemented in #968 for v3 and onwards. To find out if a particular codebase using v2.7.x and earlier is affected, first update to v2.8 or v2.9. Then run an suitably extensive test suite, or application runs. If the code tries to write or read any 2D Eigen objects, in a faulty manner, it will throw.

Eigen support in v3 includes row- and colum-major arrays; and maps thereof. Not (yet) supported are strided maps and padded arrays.

@1uc 1uc closed this as completed Feb 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Eigen Any issues related to Eigen.
Projects
None yet
Development

No branches or pull requests

4 participants