Using the problem object

If you want to solve a (Multiple) Graphical Lasso problem, you can of course use the solvers we list in Algorithms directly. However, in most situations it is not clear how to choose the regularization parameters a priori, and thus model selection becomes necessary. Below, we describe the model selection functionalities implemented in GGLasso. To make its usage as simple as possible, we implemented a class glasso_problem which calls the solvers and model selection procedures internally and returns an estimator of the precision matrix/matrices in scikit-learn style.

Class glasso_problem

class gglasso.problem.glasso_problem(S, N, reg='GGL', reg_params=None, latent=False, G=None, do_scaling=False)

Class for Graphical Lasso problems. After solving, you can access the estimators with self.solution. See documentation of GGLassoEstimator for details.

Important attributes which determine the problem type:
  • self.multiple: specifies if SGL or MGL.

  • self.latent: specifies if latent variables are modeled.

  • self.reg: specifies if FGL or GGL (if MGL).

  • self.conforming: specifies if all variables are present in all instances (for GGL).

An instance of this class can be printed in order to inspect the derived problem formulation.

Parameters:
  • S (2d/3d-array or list/dict) –

    Empirical covariance matrices.

    • For SGL, use 2d array of shape (p,p).

    • For MGL use 3d array of shape (K,p,p).

    • For GGL with non-conforming dimensions, use a list/dict of length K. Will be transformed to a dict with keys 1,..,K.

    For MGL, each S[k] has to be symmetric and positive semidefinite. Note: scaling S to correlations might be helpful, see option do_scaling.

  • N (int or integer array of length K) – Number of samples for each instance k=1,..,K.

  • reg (str, optional) –

    Type of regularization for MGL problems.

    • ’FGL’ = Fused Graphical Lasso

    • ’GGL’ = Group Graphical Lasso

    The default is ‘GGL’.

  • reg_params (dict, optional) –

    Dictionary of regularization parameters. Needs to be specified before calling .solve().

    Possible keys are:

    • 'lambda1': float, positive

    • 'lambda2': float, positive

    • 'mu1': float or array of length K, positive. Only needed if latent = True.

    • 'lambda1_mask': array (p,p), non-negative, symmetric. The \(\lambda_1\) parameter is multiplied element-wise with this array. Only available for SGL.

  • latent (boolean, optional) –

    Specify whether latent variables should be modeled.

    • latent = True: inverse covariance is assumed to have form \(\Theta-L\) (sparse - low rank).

    • latent = False: inverse covariance is assumed to have form \(\Theta\) (sparse).

    The default is False.

  • G (3d-array of shape (2,L,K), optional) – Only needed when dimensions are non-conforming, i.e. if the number of variables differs across instances. See Nonconforming GGL on how to create G.

  • do_scaling (boolean, optional) – Whether to scale input S to correlations. The default is False. If True, the output is re-scaled to covariances after solving.
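
For illustration, here is a minimal sketch of setting up an SGL problem from synthetic data (the data generation is made up for this example; any symmetric, positive semidefinite S works):

    import numpy as np
    from gglasso.problem import glasso_problem

    p, N = 20, 100
    rng = np.random.default_rng(0)
    X = rng.standard_normal((N, p))         # N samples, p variables
    S = np.cov(X, rowvar=False)             # empirical covariance, shape (p,p)

    P = glasso_problem(S, N, latent=False)  # 2d input S => SGL problem
    print(P)                                # inspect the derived formulation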

glasso_problem.solve(Omega_0=None, solver_params={}, tol=1e-08, rtol=1e-07, solver='admm', verbose=False)

Method for solving the Graphical Lasso problem formulation. After solving, an instance of GGLassoEstimator will be created and assigned to self.solution.

Parameters:
  • Omega_0 (2d/3d-array or dict, optional) –

    Start point for solver. If not specified, identity matrix is used as a starting point.

    • For SGL, specify a symmetric 2d-array of shape (p,p).

    • For MGL, specify a symmetric (for each k) 3d-array of shape (K,p,p).

    • For non-conforming MGL, specify a dictionary with keys 1,…,K and symmetric 2d-arrays of shape \((p_k,p_k)\) as values.

  • solver_params (dict, optional) – Parameters for the solvers, passed as keyword arguments to the solver. See the docs of the solvers for more details. The verbose option of the solver is always overwritten by the verbose option of this method.

  • tol (float, optional) – Tolerance for solving. The smaller it is, the longer it will take to solve the problem. The default is 1e-8.

  • rtol (float, optional) – Relative tolerance for solving. The smaller it is, the longer it will take to solve the problem. The default is 1e-7.

  • solver (str, optional) – Solver name. At this point, we use ADMM for all formulations. The default is ‘admm’.

  • verbose (boolean, optional) – Verbosity of the solver. The default is False.

Return type:

None.
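
Continuing the sketch from above, the regularization parameters have to be set before solving (the lambda1 value here is arbitrary):

    P.set_reg_params({'lambda1': 0.05})
    P.solve(tol=1e-8, rtol=1e-7, verbose=False)

    Theta = P.solution.precision_   # sparse precision matrix estimate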

glasso_problem.model_selection(modelselect_params=None, method='eBIC', gamma=0.1, tol=1e-07, rtol=1e-07, store_all=False)

Method for doing model selection, i.e. trying to find the best regularization parameters. An instance of GGLassoEstimator will be created and assigned to self.solution.

Strategy for the different problem formulations:

  • SGL: solve on a path of \(\lambda_1\) values or on a grid of \((\lambda_1, \mu_1)\) values if latent=True. Choose the grid point where the eBIC is minimal.

  • MGL and latent=False: solve on a grid of \((\lambda_1, \lambda_2)\) values. Choose the grid point where the eBIC is minimal.

  • MGL and latent=True: in a first stage, solve SGL on a \((\lambda_1, \mu_1)\) grid for each instance \(k=1,\dots,K\) independently. Then, do a grid search on \((\lambda_1, \lambda_2)\) values and, for each \(\lambda_1\) and each instance \(k=1,\dots,K\), pick the \(\mu_1\) value which had minimal eBIC in stage one. Finally, pick again the grid point with minimal eBIC.

Parameters:
  • modelselect_params (dict, optional) – Dictionary with (a subset of) parameters for the grid search. This allows you to specify the grid which is used. Calls self.set_modelselect_params(); see the documentation of this method for details.

  • method (str, optional) – Method for choosing the best solution in the grid. Options are ‘AIC’ (Akaike Information criterion) and ‘eBIC’ (extended Bayesian information criterion). The default is ‘eBIC’.

  • gamma (float, optional) – Gamma value for eBIC. Should be between 0 and 1. The larger gamma is, the more the eBIC tends to pick sparse solutions. The default is 0.1.

  • tol (float, positive, optional) – Tolerance for the primal residual used for the solver at each grid point. The default is 1e-7.

  • rtol (float, positive, optional) – Tolerance for the dual residual used for the solver at each grid point. The default is 1e-7.

  • store_all (bool, optional) – If True, stores the solution at every grid point (for SGL) and computes the best estimator with uniform lambda1 (for MGL). The default is False.

Return type:

Sets self.reg_params to the best regularization parameters. Diagnostics can be accessed in self.modelselect_stats.
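
A sketch of a model selection call for the SGL problem from the earlier example; the grid values are arbitrary and given in descending order (see set_modelselect_params below):

    lambda1_range = np.logspace(0, -3, 10)   # descending lambda_1 grid
    P.model_selection(modelselect_params={'lambda1_range': lambda1_range},
                      method='eBIC', gamma=0.1)

    P.reg_params          # best regularization parameters on the grid
    P.modelselect_stats   # diagnostics of the grid search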

Other methods of glasso_problem

glasso_problem.set_modelselect_params(modelselect_params=None)

Set the ranges of regularization parameters for the grid searches.

Parameters:

modelselect_params (dict) –

Contains values for (a subset of) the grid parameters for \(\lambda_1\), \(\lambda_2\), \(\mu_1\). See below on how the dictionary should be structured. For optimal performance, sort the \(\lambda_1\) values in descending order.

Possible dictionary keys:
  • 'lambda1_range': array of values for \(\lambda_1\) parameter.

  • 'lambda2_range': array of values for \(\lambda_2\) parameter.

  • 'mu1_range': array of values for \(\mu_1\) parameter.

  • 'lambda1_mask': array (p,p), non-negative, symmetric. The \(\lambda_1\) parameter is multiplied element-wise with this array. Only available for SGL.
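
For example, for a problem with latent=True, both grids could be set explicitly (the values are illustrative):

    P.set_modelselect_params({'lambda1_range': np.logspace(0, -3, 10),
                              'mu1_range': np.logspace(1, -1, 5)})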

glasso_problem.set_reg_params(reg_params=None)

Sets/updates the regularization parameters for the problem.

Parameters:

reg_params (dict, optional) –

Possible keys:
  • 'lambda1': float, positive

  • 'lambda2': float, positive

  • 'mu1': float or array of length K, positive. Only needed if latent = True.

  • 'lambda1_mask': array (p,p), non-negative, symmetric. The \(\lambda_1\) parameter is multiplied element-wise with this array. Only available for SGL.
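
For example (values are illustrative):

    P.set_reg_params({'lambda1': 0.1})                       # SGL
    # For MGL: P.set_reg_params({'lambda1': 0.1, 'lambda2': 0.05})
    # With latent=True, additionally pass 'mu1'.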

glasso_problem.set_start_point(Omega_0=None)

Set the starting point for solving the problem.

Parameters:

Omega_0 (array or dict, optional) – Starting point for solver. Needs to be of same type as input data S. The default is None.
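
For example, a previous estimate can serve as a warm start (assuming P and Theta from the earlier sketches):

    P.set_start_point(Theta)   # reuse an earlier solution as start point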

Class GGLassoEstimator

class gglasso.problem.GGLassoEstimator(S, N, p, K, multiple=True, latent=False, conforming=True)

Estimator object for the solution to the Graphical Lasso problems. We highly recommend also reading the documentation of glasso_problem. Attribute naming is inspired by scikit-learn.

Important attributes:

  • self.precision_: The estimator for the sparse component of the precision matrix.

  • self.lowrank_: Only relevant if latent=True. The estimator for the low-rank component of the precision matrix.

  • self.sample_covariance_: Empirical covariance matrix used as input for Graphical Lasso.

Parameters:
  • S (2d/3d-array or dict) – Empirical covariance matrices.

  • N (int or integer array of length K.) – Number of samples for each instance k=1,..,K.

  • p (int or array of integers) – Dimension of the problem (i.e. number of variables). For non-conforming MGL, specify an array of length K.

  • K (int) – Number of instances. For SGL, use K=1.

  • multiple (boolean, optional) – Indicates whether SGL or MGL problem is solved.

  • latent (boolean, optional) – Indicates whether latent variables are modeled.

  • conforming (boolean, optional) – Indicates whether dimensions of MGL problem are conforming. If False, then all attributes are dictionaries with keys 1,..,K.
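
GGLassoEstimator objects are created internally by glasso_problem; after solving, the estimators can be read off the problem object:

    sol = P.solution
    sol.precision_           # sparse component of the precision matrix
    sol.lowrank_             # low-rank component (only if latent=True)
    sol.sample_covariance_   # the input S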

Model selection

Choosing the regularization parameters \(\lambda_1\) and \(\lambda_2\) (and \(\mu_1\) in the latent variable case) has a crucial impact on how well the Graphical Lasso solution recovers the true edges, i.e. the non-zero entries of the precision matrix.

GGLasso contains model selection functionalities for each of the problems described in Mathematical description. Model selection is done via grid searches on the regularization parameters, where the quality of a solution is assessed either with the AIC (Akaike Information Criterion) or the eBIC (Extended Bayesian Information Criterion).

Typically, the eBIC chooses sparser solutions and thus leads to less false discoveries. For a single precision matrix estimate \(\hat{\Theta}\) of dimension \(p\) with \(N\) samples it is given by

\[eBIC_\gamma(\lambda_1) = N \left(-\log \det \hat{\Theta} + \langle S, \hat{\Theta}\rangle\right) + E \log N + 4 E \gamma \log p\]

where \(E\) is the number of non-zero off-diagonal entries in the upper triangle of \(\hat{\Theta}\) and \(\gamma \in [0,1]\). The larger you choose \(\gamma\), the sparser the solution will become. For MGL problems, we extend this (following [ref2]) to

\[eBIC_\gamma(\lambda_1, \lambda_2) := \sum_{k=1}^{K} N_k \left(-\log \det \hat{\Theta}^{(k)} + \langle S^{(k)}, \hat{\Theta}^{(k)}\rangle\right) + E_k \log N_k + 4 E_k \gamma \log p_k\]

where \(E_k\) is the analogue of \(E\) for \(\hat{\Theta}^{(k)}\). We refer to [ref10] for details on the eBIC.
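
As a plain illustration of the single-estimate formula above (a minimal sketch, not the library's internal implementation; it treats exact zeros in \(\hat{\Theta}\) as absent edges):

    import numpy as np

    def ebic_single(S, Theta, N, gamma=0.1):
        """eBIC for a single precision matrix estimate Theta."""
        p = S.shape[0]
        # E: non-zero off-diagonal entries in the upper triangle of Theta
        E = np.count_nonzero(np.triu(Theta, k=1))
        sign, logdet = np.linalg.slogdet(Theta)    # stable log-determinant
        fit = N * (-logdet + np.trace(S @ Theta))  # -log det + <S, Theta>
        return fit + E * np.log(N) + 4 * E * gamma * np.log(p)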