Detailled solver documentation

SGL

gglasso.solver.single_admm_solver.ADMM_SGL(S: ndarray, lambda1: float, Omega_0: ndarray, Theta_0: ndarray = array([], dtype=float64), X_0: ndarray = array([], dtype=float64), rho: float = 1.0, max_iter: int = 1000, tol: float = 1e-07, rtol: float = 0.0001, stopping_criterion: str = 'boyd', update_rho: bool = True, verbose: bool = False, measure: bool = False, latent: bool = False, mu1: float | None = None, lambda1_mask: ndarray | None = None)

This is an ADMM solver for the (Latent variable) Single Graphical Lasso problem (SGL). If latent=False, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta \in \mathbb{S}^p_{++}} - \log \det \Omega + \mathrm{Tr}(S\Omega) + \lambda \|\Theta\|_{1,od}\\s.t. \quad \Omega = \Theta\end{aligned}\end{align} \]

If latent=True, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta, L \in \mathbb{S}^p_{++}} - \log \det (\Omega) + \mathrm{Tr}(S \Omega) + \lambda_1 \|\Theta\|_{1,od} + \mu_1 \|L\|_{\star}\\s.t. \quad \Omega = \Theta - L\end{aligned}\end{align} \]

Note

  • Typically, sol['Omega'] is positive definite and sol['Theta'] is sparse.

  • We use scaled ADMM, i.e. X are the scaled (with 1/rho) dual variables for the equality constraint.

  • lambda1 can be an array using the argument lambda1_mask. The problem is then solved with the element-wise regularization strength lambda1 * lambda1_mask.

Parameters:
  • S (array (p,p)) – empirical covariance matrix. Needs to be symmetric and positive semidefinite.

  • lambda1 (float, positive) – sparsity regularization parameter.

  • Omega_0 (array (p,p)) – starting point for the Omega variable. Choose np.eye(p) if no better starting point is known.

  • Theta_0 (array (p,p), optional) – starting point for the Theta variable. If not specified, it is set to the same as Omega_0.

  • X_0 (array (p,p), optional) – starting point for the X variable. If not specified, it is set to zero array.

  • rho (float, positive, optional) – step size paramater for the augmented Lagrangian in ADMM. The default is 1. Tune this parameter for optimal performance.

  • max_iter (int, optional) – maximum number of iterations. The default is 1000.

  • tol (float, positive, optional) – tolerance for the primal residual. See “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”, Boyd et al. for details. The default is 1e-7.

  • rtol (float, positive, optional) – tolerance for the dual residual. The default is 1e-4.

  • stopping_criterion (str, optional) –

    • ‘boyd’: Stopping criterion after Boyd et al.

    • ’kkt’: KKT residual is chosen as stopping criterion. This is computationally expensive to compute.

    The default is ‘boyd’.

  • update_rho (boolean, optional) – Whether the penalty parameter rho is updated, see Boyd et al. page 20-21 for details. The default is True.

  • verbose (boolean, optional) – verbosity of the solver. The default is False.

  • measure (boolean, optional) – turn on/off measurements of runtime per iteration. The default is False.

  • latent (boolean, optional) – Solve the SGL with or without latent variables (see above for the exact formulations). The default is False.

  • mu1 (float, positive, optional) – low-rank regularization parameter. Only needs to be specified if latent=True.

  • lambda1_mask (array (p,p), non-negative, symmetric, optional) – A mask for the regularization parameter. If specified, the problem is solved with the element-wise regularization strength lambda1 * lambda1_mask.

Returns:

  • sol (dict) – contains the solution, i.e. Omega, Theta, X (and L if latent=True) after termination. All elements are (p,p) arrays.

  • info (dict) – status and measurement information from the solver.

gglasso.solver.single_admm_solver.block_SGL(S: ndarray, lambda1: float, Omega_0: ndarray, Theta_0: ndarray | None = None, X_0: ndarray | None = None, rho: float = 1.0, max_iter: int = 1000, tol: float = 1e-07, rtol: float = 0.001, stopping_criterion: str = 'boyd', update_rho: bool = True, verbose: bool = False, measure: bool = False, lambda1_mask: ndarray | None = None)

This is a wrapper for solving SGL problems on connected components of the solution and solving each block separately. See Witten, Friedman, Simon “New Insights for the Graphical Lasso” for details.

It solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta \in \mathbb{S}^p_{++}} - \log \det \Omega + \mathrm{Tr}(S\Omega) + \lambda \|\Theta\|_{1,od}\\s.t. \quad \Omega = \Theta\end{aligned}\end{align} \]

Note

  • In the original paper the l1-norm is applied as well on the diagonal (here: off-diagonal) which results in a small modification.

  • The returned solution for X is not guaranteed to be identical to the dual variable of the full solution, but can be used as starting point (e.g. in grid search).

Parameters:
  • S (array (p,p)) – empirical covariance matrix. Needs to be symmetric and positive semidefinite.

  • lambda1 (float, positive) – sparsity regularization parameter.

  • Omega_0 (array (p,p)) – starting point for the Omega variable. Choose np.eye(p) if no better starting point is known.

  • Theta_0 (array (p,p), optional) – starting point for the Theta variable. If not specified, it is set to the same as Omega_0.

  • X_0 (array (p,p), optional) – starting point for the X variable. If not specified, it is set to zero array.

  • rho (float, positive, optional) – step size paramater for the augmented Lagrangian in ADMM. The default is 1. Tune this parameter for optimal performance.

  • max_iter (int, optional) – maximum number of iterations. The default is 1000.

  • tol (float, positive, optional) – tolerance for the primal residual. See “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”, Boyd et al. for details. The default is 1e-7.

  • rtol (float, positive, optional) – tolerance for the dual residual. The default is 1e-4.

  • stopping_criterion (str, optional) –

    • ‘boyd’: Stopping criterion after Boyd et al.

    • ’kkt’: KKT residual is chosen as stopping criterion. This is computationally expensive to compute.

    The default is ‘boyd’.

  • update_rho (boolean, optional) – Whether the penalty parameter rho is updated, see Boyd et al. page 20-21 for details. The default is True.

  • verbose (boolean, optional) – verbosity of the solver. The default is False.

  • measure (boolean, optional) – turn on/off measurements of runtime per iteration. The default is False.

  • lambda1_mask (array (p,p), non-negative, symmetric, optional) – A mask for the regularization parameter. If specified, the problem is solved with the element-wise regularization strength lambda1 * lambda1_mask.

Returns:

  • sol (dict) – contains the solution, i.e. Omega, Theta, X after termination.

  • info (dict) – status and measurement information from the solver.

MGL

gglasso.solver.admm_solver.ADMM_MGL(S: ndarray, lambda1: float, lambda2: float, reg: str, Omega_0: ndarray, Theta_0: ndarray = array([], dtype=float64), X_0: ndarray = array([], dtype=float64), n_samples: int | ndarray | None = None, tol: float = 1e-05, rtol: float = 0.0001, stopping_criterion: str = 'boyd', update_rho: bool = True, rho: float = 1.0, max_iter: int = 1000, verbose: bool = False, measure: bool = False, latent: bool = False, mu1: float | ndarray | None = None)

This is an ADMM solver for the (Latent variable) Multiple Graphical Lasso problem (MGL). It jointly estimates K precision matrices of shape (p,p).

If latent=False, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta} \sum_{k=1}^{K} (-\log\det(\Omega^{(k)}) + \mathrm{Tr}(S^{(k)} \Omega^{(k)}) ) + \mathcal{P}(\Theta)\\s.t. \quad \Omega^{(k)} = \Theta^{(k)} \quad k=1,\dots,K\end{aligned}\end{align} \]

Here, \(\mathcal{P}\) is a regularization function which depends on the application. Group Graphical Lasso (GGL) or Fused Graphical Lasso (FGL) is implemented. If latent=True, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta, L}\quad \sum_{k=1}^{K} (-\log\det(\Omega^{(k)}) + \mathrm{Tr}(S^{(k)},\Omega^{(k)}) ) + \mathcal{P}(\Theta) +\sum_{k=1}^{K} \mu_{1,k} \|L^{(k)}\|_{\star}\\s.t. \quad \Omega^{(k)} = \Theta^{(k)} - L^{(k)} \quad k=1,\dots,K\end{aligned}\end{align} \]

Note

  • Typically, sol['Omega'] is positive definite and sol['Theta'] is sparse.

  • We use scaled ADMM, i.e. X are the scaled (with 1/rho) dual variables for the equality constraint.

Parameters:
  • S (array (K,p,p)) – empirical covariance matrices, i.e. S[k,:,:] contains the empirical cov. matrix of the k-th instance. Each S[k,:,:] needs to be symmetric and positive semidefinite.

  • lambda1 (float, positive) – sparsity regularization parameter.

  • lambda2 (float, positive) – group sparsity/ total variation regularization parameter.

  • reg (str) –

    choose either

    • ’GGL’: Group Graphical Lasso

    • ’FGL’: Fused Graphical Lasso

  • Omega_0 (array (K,p,p)) – starting point for the Omega variable. Use get_K_identity(K, p) from gglasso.helper.utils if no better starting point is known.

  • Theta_0 (array (p,p), optional) – starting point for the Theta variable. If not specified, it is set to the same as Omega_0.

  • X_0 (array (p,p), optional) – starting point for the X variable. If not specified, it is set to zero array.

  • n_samples (int or array of shape(K,), optional) – neg. log-likelihood is sometimes weighted with the sample size. If this is desired, specify sample size of each instance. Default is no weighting, i.e. n_samples = 1.

  • rho (float, positive, optional) – step size paramater for the augmented Lagrangian in ADMM. The default is 1. Tune this parameter for optimal performance.

  • max_iter (int, optional) – maximum number of iterations. The default is 1000.

  • tol (float, positive, optional) – tolerance for the primal residual. See ‘Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers’, Boyd et al. for details. The default is 1e-7.

  • rtol (float, positive, optional) – tolerance for the dual residual. The default is 1e-4.

  • stopping_criterion (str, optional) –

    • ‘boyd’: Stopping criterion after Boyd et al.

    • ’kkt’: KKT residual is chosen as stopping criterion. This is computationally expensive to compute.

    The default is ‘boyd’.

  • update_rho (boolean, optional) – Whether the penalty parameter rho is updated, see Boyd et al. page 20-21 for details. The default is True.

  • verbose (boolean, optional) – verbosity of the solver. The default is False.

  • measure (boolean, optional) – turn on/off measurements of runtime per iteration. The default is False.

  • latent (boolean, optional) – Solve the MGL problem with or without latent variables (see above for the exact formulations). The default is False.

  • mu1 (float or array of shape(K,), positive, optional) – low-rank regularization parameter, possibly different for each instance k=1,..,K. Only needs to be specified if latent=True.

Returns:

  • sol (dict) – contains the solution, i.e. Omega, Theta, X (and L if latent=True) after termination. All arrays are of shape (K,p,p).

  • info (dict) – status and measurement information from the solver.

gglasso.solver.ppdna_solver.PPDNA(S, lambda1, lambda2, reg, Omega_0, Theta_0=array([], dtype=float64), X_0=array([], dtype=float64), ppdna_params=None, eps_ppdna=1e-05, verbose=False, measure=False)

Proximal Point algorithm for the Multiple Graphical Lasso problem (MGL). It solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta} \sum_{k=1}^{K} (-\log\det(\Omega^{(k)}) + \mathrm{Tr}(S^{(k)} \Omega^{(k)}) ) + \mathcal{P}(\Theta)\\s.t. \quad \Omega^{(k)} = \Theta^{(k)} \quad k=1,\dots,K\end{aligned}\end{align} \]

Here, \(\mathcal{P}\) is a regularization function which depends on the application. Group Graphical Lasso (GGL) or Fused Graphical Lasso (FGL) is implemented.

Parameters:
  • S (array (K,p,p)) – empirical covariance matrices, i.e. S[k,:,:] contains the empirical cov. matrix of the k-th instance. Each S[k,:,:] needs to be symmetric and positive semidefinite.

  • lambda1 (float, positive) – sparsity regularization parameter.

  • lambda2 (float, positive) – group sparsity/ total variation regularization parameter.

  • reg (str) –

    choose either

    • ’GGL’: Group Graphical Lasso

    • ’FGL’: Fused Graphical Lasso

  • Omega_0 (array (K,p,p)) – starting point for the Omega variable. Use get_K_identity(K, p) from gglasso.helper.utils if no better starting point is known.

  • Theta_0 (array (p,p), optional) – starting point for the Theta variable. If not specified, it is set to the same as Omega_0.

  • X_0 (array (p,p), optional) – starting point for the X variable. If not specified, it is set to zero array.

  • ppdna_params (dict, optional) –

    dictionary with solver parameters. Possible keys:

    • max_iter: maximum number of iterations, default is 20.

    • sigma_0: step size starting point. Step size is increased by 1.3 in every iteration. Default for sigma_0 is 100.

  • eps_ppdna (float, positive, optional) – tolerance for the kkt residual. See ‘A proximal point dual Newton algorithm for solving group graphical Lasso problems’, Zhang et al. for details. The default is 1e-5.

  • verbose (boolean, optional) – verbosity of the solver. The default is False.

  • measure (boolean, optional) – turn on/off measurements of runtime per iteration. The default is False.

Returns:

  • sol (dict) – contains the solution, i.e. Omega, Theta, X after termination. All arrays are of shape (K,p,p).

  • info (dict) – status and measurement information from the solver.

GGL nonconforming

gglasso.solver.ext_admm_solver.ext_ADMM_MGL(S: dict, lambda1: float, lambda2: float, reg: str, Omega_0: dict, G: ndarray, X0: dict | None = None, X1: dict | None = None, tol: float = 1e-05, rtol: float = 0.0001, stopping_criterion: str = 'boyd', rho: float = 1.0, max_iter: int = 1000, verbose: bool = False, measure: bool = False, latent: bool = False, mu1: float | None = None)

This is an ADMM algorithm for solving the Group Graphical Lasso problem where not all instances have the same number of dimensions, i.e. some variables are present in some instances and not in others. A group sparsity penalty is applied to all pairs of variables present in multiple instances.

IMPORTANT: As the arrays are non-conforming in dimensions here, we operate on dictionaries with keys 1,..,K (as int) and each value is a array of shape \((p_k,p_k)\).

If latent=False, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega,\Theta,\Lambda} \sum_{k=1}^K - \log \det(\Omega^{(k)}) + \mathrm{Tr}(S^{(k)}\Omega^{(k)}) + \sum_{k=1}^K \lambda_1 ||\Theta^{(k)}||_{1,od} + \sum_{l} \lambda_2 \beta_l ||\Lambda_{[l]}||_2\\s.t. \quad \Omega^{(k)} = \Theta^{(k)} \quad k=1,\dots,K\\\quad \quad \Lambda^{(k)} = \Theta^{(k)} \quad k=1,\dots,K\end{aligned}\end{align} \]

where l indexes the groups of overlapping variables and \(\Lambda_{[l]}\) is the array of all respective components. To account for differing group sizes we multiply with \(\beta_l\), the square root of the group size.

If latent=True, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega,\Theta,\Lambda,L} \sum_{k=1}^K - \log \det(\Omega^{(k)}) + \mathrm{Tr}(S^{(k)}\Omega^{(k)}) + \sum_{k=1}^K \lambda_1 ||\Theta^{(k)}||_{1,od}\\+ \sum_{l} \lambda_2 \beta_l ||\Lambda_{[l]}||_2 +\sum_{k=1}^{K} \mu_{1,k} \|L^{(k)}\|_{\star}\\s.t. \quad \Omega^{(k)} = \Theta^{(k)} - L^{(k)} \quad k=1,\dots,K\\\quad \quad \Lambda^{(k)} = \Theta^{(k)} \quad k=1,\dots,K\end{aligned}\end{align} \]

Note

  • Typically, sol['Omega'] is positive definite and sol['Theta'] is sparse.

  • We use scaled ADMM, i.e. X0 and X1 are the scaled (with 1/rho) dual variables for the equality constraints.

Parameters:
  • S (dict) – empirical covariance matrices. S should have keys 1,..,K (as integers) and S[k] contains the \((p_k,p_k)\)-array of the empirical cov. matrix of the k-th instance. Each S[k] needs to be symmetric and positive semidefinite.

  • lambda1 (float, positive) – sparsity regularization parameter.

  • lambda2 (float, positive) – group sparsity regularization parameter.

  • reg (str) – so far only Group Graphical Lasso is available, hence choose ‘GGL’.

  • Omega_0 (dict) – starting point for the Omega variable. Should be of same form as S. If no better starting point is available, choose Omega_0[k] = np.eye(p_k) for k=1,…,K

  • G (array) – bookkeeping arrays which contains information where the respective entries for each group can be found.

  • X0 (dict, optional) – starting point for the X0 variable. If not specified, it is set to zeros.

  • X1 (dict, optional) – starting point for the X1 variable. If not specified, it is set to zeros.

  • rho (float, positive, optional) – step size paramater for the augmented Lagrangian in ADMM. The default is 1. Tune this parameter for optimal performance.

  • max_iter (int, optional) – maximum number of iterations. The default is 1000.

  • tol (float, positive, optional) – tolerance for the primal residual. See “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”, Boyd et al. for details. The default is 1e-7.

  • rtol (float, positive, optional) – tolerance for the dual residual. The default is 1e-4.

  • stopping_criterion (str, optional) –

    • ‘boyd’: Stopping criterion after Boyd et al.

    • ’kkt’: KKT residual is chosen as stopping criterion. This is computationally expensive to compute.

    The default is ‘boyd’.

  • verbose (boolean, optional) – verbosity of the solver. The default is False.

  • measure (boolean, optional) – turn on/off measurements of runtime per iteration. The default is False.

  • latent (boolean, optional) – Solve the GGL problem with or without latent variables (see above for the exact formulations). The default is False.

  • mu1 (float, positive, optional) – low-rank regularization parameter, possibly different for each instance k=1,..,K. Only needs to be specified if latent=True.

Returns:

  • sol (dict) – contains the solution, i.e. Omega, Theta, X0, X1 (and L if latent=True) after termination. All elements are dictionaries with keys 1,..,K and (p_k,p_k)-arrays as values.

  • info (dict) – status and measurement information from the solver.

Functional SGL

gglasso.solver.functional_sgl_admm.ADMM_FSGL(S: ndarray, lambda1: float, M: int, Omega_0: ndarray, Theta_0: ndarray = array([], dtype=float64), X_0: ndarray = array([], dtype=float64), rho: float = 1.0, max_iter: int = 1000, tol: float = 1e-07, rtol: float = 0.0001, update_rho: bool = True, verbose: bool = False, measure: bool = False, latent: bool = False, mu1: float | None = None)

This is an ADMM solver for the (Latent variable) Functional Single Graphical Lasso problem (FSGL). It solves a SGL problem for the case when each of the p variables has an M-dimensional functional representation (e.g. Fourier transform). Therefore, the input data has the shape (p*M,p*M).

If latent=False, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta \in \mathbb{S}^{pM}_{++}} - \log \det \Omega + \mathrm{Tr}(S\Omega) + \lambda_1 \sum_{j \neq l} \|\Theta_{jl}^M\|_{F}.\\s.t. \quad \Omega = \Theta.\end{aligned}\end{align} \]

If latent=True, this function solves

\[ \begin{align}\begin{aligned}\min_{\Omega, \Theta, L \in \mathbb{S}^{p\cdot M}_{++}} - \log \det (\Omega) + \mathrm{Tr}(S \Omega) + \lambda_1 \sum_{j\neq l} \|\Theta_{jl}^M\|_{F} + \mu_1 \|L\|_{\star}\\s.t. \quad \Omega = \Theta - L.\end{aligned}\end{align} \]

Note

  • We use scaled ADMM, i.e. X are the scaled (with 1/rho) dual variables for the equality constraint.

Parameters:
  • S (array (p*M,p*M)) – empirical covariance matrix. Needs to be symmetric and positive semidefinite.

  • lambda1 (float, positive) – regularization parameter for the Frobenius norm of the MxM subblocks.

  • M (int) – Dimension of the functional representation. See “Functional Graphical Models”, Qiao et al. for details.

  • Omega_0 (array (p*M,p*M)) – starting point for the Omega variable. Choose np.eye(p*M) if no better starting point is known.

  • Theta_0 (array (p*M,p*M), optional) – starting point for the Theta variable. If not specified, it is set to the same as Omega_0.

  • X_0 (array (p*M,p*M), optional) – starting point for the X variable. If not specified, it is set to zero array.

  • rho (float, positive, optional) – step size paramater for the augmented Lagrangian in ADMM. The default is 1. Tune this parameter for optimal performance.

  • max_iter (int, optional) – maximum number of iterations. The default is 1000.

  • tol (float, positive, optional) – tolerance for the primal residual. See “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers”, Boyd et al. for details. The default is 1e-7.

  • rtol (float, positive, optional) – tolerance for the dual residual. The default is 1e-4.

  • update_rho (boolean, optional) – Whether the penalty parameter rho is updated, see Boyd et al. page 20-21 for details. The default is True.

  • verbose (boolean, optional) – verbosity of the solver. The default is False.

  • measure (boolean, optional) – turn on/off measurements of runtime per iteration. The default is False.

  • latent (boolean, optional) – Solve the SGL with or without latent variables (see above for the exact formulations). The default is False.

  • mu1 (float, positive, optional) – low-rank regularization parameter. Only needs to be specified if latent=True.

Returns:

  • sol (dict) – contains the solution, i.e. Omega, Theta, X (and L if latent=True) after termination. All elements are (p*M,p*M) arrays.

  • info (dict) – status and measurement information from the solver.