Skip to content

Pivoted Cholesky #252

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
vlad17 opened this issue Sep 1, 2020 · 3 comments
Open

Pivoted Cholesky #252

vlad17 opened this issue Sep 1, 2020 · 3 comments

Comments

@vlad17
Copy link
Contributor

vlad17 commented Sep 1, 2020

Currently, Cholesky will fail on PSD but not PD matrices, because it calls ?pptrf.

use ndarray::{array, Array, Axis};
use ndarray_linalg::cholesky::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let x1 = array![1., 2., 3., 4., 7.];
    let x2 = array![-1., 2., 3., 5., 8.];
    let x3 = array![1., -2., 3., 6., 9.];
    let x4 = &x1 * 1. + &x2 * 2. - &x3 * 3.;
    let xs = [x1, x2, x3, x4];
    let xs = xs
        .iter()
        .map(|x| x.view().insert_axis(Axis(1)))
        .collect::<Vec<_>>();
    let X = ndarray::stack(Axis(1), &xs)?;
    println!("{:?}", X);
    let XTX = X.t().dot(&X);
    println!("{:?}", XTX);
    let chol = XTX.factorizec(UPLO::Lower)?; // Error: Lapack { return_code: 4 }
    Ok(())
}

However, if we allow pivoting, then we can return a cholesky factor U, pivot matrix P such that P U^T U P^T = A for an input matrix A that's merely PSD (this also returns the rank r).

It would be nice if ndarray-linalg could also provide this pivoted version, e.g., as shown here in python:

from scipy.linalg.lapack import dpstrf
import numpy as np
xt = np.array([
[1, 2, 3, 4, 7],
[-1, 2, 3, 5, 8],
[1, -2, 3, 6, 9]]).astype(float)
x = np.insert(xt, len(xt), xt[0] + 2 * xt[1] - 3 * xt[2], axis=0).T
xtx = x.T.dot(x)
U, P, rank, info = dpstrf(xtx)
assert info > 0
assert rank == 3
P -= 1
U = np.triu(U)
invP = np.empty_like(P)
invP[P] = np.arange(len(P), dtype=int)
print(np.linalg.norm(U.T.dot(U)[np.ix_(invP, invP)] - xtx, ord='fro')) # row indexing inverts permutations
# 3.552713678800501e-14

An interesting design question would be what the interface should be. Clearly this routine should return the factor, pivot, and rank in some form. But it'd be nice if I could take my pivoted Cholesky output, truncate U to its leading principal minor of order r, and initialize a CholeskyFactorized struct directly, so that I can just re-use existing methods for solving the reduced subsystem.

@termoshtt
Copy link
Member

There is solveh submodule for PD matrices which calls [sd]sytr[fsi] and [cz]hetr[fsi].

@vlad17
Copy link
Contributor Author

vlad17 commented Sep 5, 2020

Awesome, thanks for pointing me to it! For anyone else coming in to this issue, to save you some sleuthing I needed to do:

?sytrf/?hetrf perform the Bunch–Kaufman factorization. Sec 4.4 in Matrix Computations and Chapter 11 of Higham describe these and other methods for solving symmetric indefinite systems.

There's basically LDL^T decomposition where D is block-diagonal (for which Bunch–Kaufman provides a partial pivoting strategy, rook pivoting is an "intermediate" pivoting strategy, and Bunch and Parlett give complete pivoting).

Interestingly, a recent LAPACK working note discusses another approach based on Aasen's method, where empirical results for stability generally favor it (but it does not dominate, see Figure 8). Aasen's method decomposes into LTL^T where T is symmetric tridiagonal.

Note that these indefinite decompositions are solving more general problems than the original issue points to, which is just looking at positive semidefinite systems. I'll try the indefinite approach in my use case.

@termoshtt I couldn't find anything that compared the stability of these methods when applied to positive semidefinite systems, where in principle ?pstrf seems like the right tool for the job. Do you have a sense of how the stability would compare?

@vlad17 vlad17 closed this as completed Sep 5, 2020
@vlad17 vlad17 reopened this Sep 7, 2020
@Arnaud15
Copy link

Arnaud15 commented Sep 15, 2020

The solveh as it stands does not let users work with PSD matrices in a reliable fashion.

Example:

use ndarray::{Axis, array, arr1};
use ndarray_linalg::{FactorizeHInto, SolveH};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let x1 = array![1., 2., 3., 4., 7.] / 10.; // Remove this division and factorization will not return an error, thanks to numerical imprecisions.
    let x2 = array![-1., 2., 3., 5., 8.] / 10.;
    let x3 = 2. * &x2;
    let xs = [x1, x2, x3];
    let xs = xs
        .iter()
        .map(|x| x.view().insert_axis(Axis(1)))
        .collect::<Vec<_>>();
    let X = ndarray::stack(Axis(1), &xs)?;
    assert_eq!(X.shape(), &[5, 3]);
    let XTX = X.t().dot(&X);
    println!("{:?}", X);
    println!("{:?}", XTX);
    let f = XTX.factorizeh_into()?; // Error: Lapack { return_code: 2 }
}

The problem:

  1. x_2 and x_3 are colinear.
  2. The second diagonal entry for D in the factorization X^T X = U * D * U^T is exactly zero.
  3. LAPACK spots it and returns an INFO=2. Note that the factorization completed successfully at this stage (see docs).
  4. The solveh code as it is treats any INFO =/= 0 as an error and returns Err(LinalgError::Lapack { return_code })

In my opinion, cases where INFO > 0 for the solveh factorization should not be treated as an error, as it is expected to happen for some PSD inputs, and the factorization in those cases is still successful.

Remark: the error disappears if I remove the "/ 10." above, because then the second diagonal entry for D is only almost 0, due to numerical imprecisions : )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants