Out of memory error using Matrix.get with many observations #465

nikobenho · 2023-10-18T07:59:02Z

I ran into an out of memory error using the Matrix.get method in a case where I have a lot of zero-weighted observations that I would like to carry through history-matching as metadata. It looks like the error occurs when np.diag is called:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
Cell In[59], line 6
      3 print(len(obs_cov.x))
      5 # reduce cov down to only include non-zero obsverations
----> 6 obs_cov = obs_cov.get(row_names=pst.nnz_obs_names, col_names=pst.nnz_obs_names, )

File ~\Documents\Development\repos\GMDSI_notebooks\dependencies\pyemu\pyemu\mat\mat_handler.py:1686, in Matrix.get(self, row_names, col_names, drop)
   1684     return Cov(x=extract, names=names, isdiagonal=self.isdiagonal)
   1685 if self.isdiagonal:
-> 1686     extract = np.diag(self.__x[:, 0])
   1687 else:
   1688     extract = self.__x.copy()

File ~\AppData\Local\anaconda3\envs\gmdsitut\lib\site-packages\numpy\lib\twodim_base.py:293, in diag(v, k)
    291 if len(s) == 1:
    292     n = s[0]+abs(k)
--> 293     res = zeros((n, n), v.dtype)
    294     if k >= 0:
    295         i = k

MemoryError: Unable to allocate 110. GiB for an array with shape (121252, 121252) and data type float64

It seems modifying the method to use scipy.sparse.diags can offer a workaround:

    def spget(self, row_names=None, col_names=None, drop=False):

        if row_names is None and col_names is None:
            raise Exception(
                "Matrix.get(): must pass at least" + " row_names or col_names"
            )

        if row_names is not None and not isinstance(row_names, list):
            row_names = [row_names]
        if col_names is not None and not isinstance(col_names, list):
            col_names = [col_names]

        if isinstance(self, Cov) and (row_names is None or col_names is None):
            if row_names is not None:
                idxs = self.indices(row_names, axis=0)
                names = row_names
            else:
                idxs = self.indices(col_names, axis=1)
                names = col_names

            if self.isdiagonal:
                extract = self.__x[idxs].copy()
            else:
                extract = self.__x[idxs, :].copy()
                extract = extract[:, idxs]
            if drop:
                self.drop(names, 0)
            return Cov(x=extract, names=names, isdiagonal=self.isdiagonal)
        if self.isdiagonal:
            extract = sp.diags(self.__x[:, 0], offsets=0, format='csr') # switch from np to sp
        else:
            extract = self.__x.copy()

        if row_names is not None:
            row_idxs = self.indices(row_names, axis=0)
            extract = extract[row_idxs, :]  # this row is modified
            if drop:
                self.drop(row_names, axis=0)
        else:
            row_names = self.row_names

        if col_names is not None:
            col_idxs = self.indices(col_names, axis=1)
            extract = extract[:, col_idxs]  # this row is also modified
            if drop:
                self.drop(col_names, axis=1)
        else:
            col_names = copy.deepcopy(self.col_names)

        return type(self)(x=extract.toarray(), row_names=row_names, col_names=col_names) # modified to use .toarray()

Comparing the methods using a smaller amount of observations (allowing the use of both get and spget):

obs_cov_reduced = obs_cov.get(row_names=pst.nnz_obs_names, col_names=pst.nnz_obs_names, )
obs_cov_reduced_sp = obs_cov.spget(row_names=pst.nnz_obs_names, col_names=pst.nnz_obs_names, )
print(np.array_equal(obs_cov_reduced.x, obs_cov_reduced_sp.x))

True

I wanted to bring this to attention because quite a lot of use-cases seem to rely on this method. Although I'm not sure whether this modification has any unforseen impact further down the line.

The text was updated successfully, but these errors were encountered:

jtwhite79 · 2023-10-18T23:07:13Z

Thanks @nikobenho . Ive considered swapping to sparse storage for a while now bc it does handle really large cases better (or at least using sparse for a few of the more critical pinch points like get())...pull requests welcome!

wkitlasten · 2024-05-09T21:52:31Z

Maybe a similar issue(?), but I am hitting memory issues when I try to build a parcov using an .unc file with about 150,000 pars (small sample of the file below):

...
start covariance_matrix
file pp_mbr.flow_layer8_inst0pp.dat_cov.mat
variance_multiplier 0.09
first_parameter p164624
last_parameter p164840
end covariance_matrix

start covariance_matrix
file pp_mbr.flow_layer9_inst0pp.dat_cov.mat
variance_multiplier 0.09
first_parameter p164841
last_parameter p165057
end covariance_matrix

start standard_deviation
p0 1.0
p1 1.0
...

Curious if there is an easy way to build these as a bunch of smaller "blocks" and then combined into a block diagonal matrix to get around the memory issue? Or if that is already what is being done and I am just out of luck.

jtwhite79 · 2024-05-09T22:00:01Z

The pyemu helpers already do the group/block based side-stepping trick to avoid having to form that full matrix for drawing realizations. The problem is that the Cov object holds the matrix as dense in mem so its going to be huge and mostly zeros...

briochh added the enhancement label Mar 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Out of memory error using Matrix.get with many observations #465

Out of memory error using Matrix.get with many observations #465

nikobenho commented Oct 18, 2023 •

edited

Loading

jtwhite79 commented Oct 18, 2023

wkitlasten commented May 9, 2024

jtwhite79 commented May 9, 2024

Out of memory error using Matrix.get with many observations #465

Out of memory error using Matrix.get with many observations #465

Comments

nikobenho commented Oct 18, 2023 • edited Loading

jtwhite79 commented Oct 18, 2023

wkitlasten commented May 9, 2024

jtwhite79 commented May 9, 2024

nikobenho commented Oct 18, 2023 •

edited

Loading