Benchmarking

We compare the performance of the Graphical Lasso solvers implemented in GGLasso to two commonly used packages, i.e.

regain : contains an ADMM solver which is doing almost the same operations as ADMM_SGL. For details, see the original paper. [ref3]
sklearn: by default uses the coordinate descent algorithm which was originally proposed by Friedman et al. [ref1]

The results can be generated using the notebook in benchmarks/benchmarks.ipynb in the Github repository. From GGLasso we use the standard solver ADMM_SGL labeled by gglasso and the block-wise solver labeled by gglasso-block. For details, we refer to SGL solver.

import pandas as pd
import numpy as np

from gallery_helper import plot_bm

Synthetic power-law networks

We compare the solvers for a SGL problem using synthetic sparse powerlaw networks, which are generated as described in [ref2]. The solvers are tested for different values of $\lambda_1$ and different dimensionalities. These values are printed below. We solve a SGL problem using each of the solvers independently, but using the same CPUs with 64GB of RAM in total.

The results were generated on a machine equipped with AMD Opteron(tm) 6378 @ 1.40GHz (max 2.40 GHz) (8 Cores per socket, hyper-threading).

df = pd.read_csv("../data/synthetic/bm5000.csv", index_col = 0)
df.reset_index(drop=True)

	time	accuracy	hamming	iter	method	tol	rtol	p	N	l1
0	0.519696	9.849324e-03	30728	12	gglasso	1.000000e-06	1.000000e-06	500	550	0.05
1	0.792978	1.131881e-03	30632	20	gglasso	1.000000e-07	1.000000e-07	500	550	0.05
2	0.994830	5.210053e-04	30626	23	gglasso	5.000000e-08	5.000000e-08	500	550	0.05
3	1.110264	1.208771e-04	30618	29	gglasso	1.000000e-08	1.000000e-08	500	550	0.05
4	0.538926	9.849324e-03	30728	0	gglasso-block	1.000000e-06	1.000000e-06	500	550	0.05
...	...	...	...	...	...	...	...	...	...	...
205	433.196051	1.229067e-03	9838	17	regain	1.000000e-05	1.000000e-05	5000	5500	0.50
206	530.698426	2.653190e-04	9838	21	regain	5.000000e-06	5.000000e-06	5000	5500	0.50
207	747.611447	1.769017e-05	9838	30	regain	1.000000e-06	1.000000e-06	5000	5500	0.50
208	197.463880	1.191635e-08	9838	2	sklearn	1.000000e-02	1.000000e-04	5000	5500	0.50
209	199.192011	1.191635e-08	9838	2	sklearn	1.000000e-01	1.000000e-04	5000	5500	0.50

210 rows × 10 columns

all_p_N = sorted(list(set(zip(df['p'], df['N']))))
print("Dimensionality and sample size: (p,N) =", all_p_N )

all_l1 = pd.unique(df.l1)
print("Values of lambda_1:  =", all_l1 )

Dimensionality and sample size: (p,N) = [(500, 550), (1000, 1100), (2000, 2200), (4000, 4400), (5000, 5500)]
Values of lambda_1:  = [0.05 0.1  0.5 ]

Setup

Each solver terminates after a given number of maximum iterations or when some optimality condition is met. Hence, the performance is difficult to compare as these optimality criteria may differ. Thus, we select a range of values for relative (rtol) and absolute (tol) tolerance (used in ADMM) and similarly tolerance values for sklearn.

Calculating the accuracy

After solving each of the problems which each solver for different tolerance values, we compare the obtained solutions to a reference solution, denoted by $Z^\ast$.

$Z^\ast$ is obtained by solving a SGL problem by one of the solvers for very small tolerance values (we used regain and set tol=rtol=1e-10). Finally, for a solution $Z$, we calculate its accuracy using the normalized Euclidean distance:

\[\text{accuracy}(Z) = \frac{\|Z^\ast - Z \|}{ \| Z^\ast\| }.\]

Runtime and accuracy with respect to $\lambda_1$.

Now, determine a maximal accuracy $\epsilon$. For each solver, we now select the run with minimal runtime where $\text{accuracy}(Z) \leq \epsilon$ is fulfilled. We plot the results for two values of $\epsilon$:

Accuracy of $\epsilon=5\cdot10^{-3}$

plot_bm(df, min_acc= 5e-3, lambda_list=all_l1)

$$\lambda_1$ = 0.05, $\lambda_1$ = 0.1, $\lambda_1$ = 0.5$

Accuracy of $\epsilon=5\cdot10^{-2}$

plot_bm(df, min_acc = 5e-2, lambda_list=all_l1)

$$\lambda_1$ = 0.05, $\lambda_1$ = 0.1, $\lambda_1$ = 0.5$

Total running time of the script: (0 minutes 1.610 seconds)

Gallery generated by Sphinx-Gallery

Benchmarking

Synthetic power-law networks

Setup

Calculating the accuracy

Runtime and accuracy with respect to \(\lambda_1\).

Accuracy of \(\epsilon=5\cdot10^{-3}\)

Accuracy of \(\epsilon=5\cdot10^{-2}\)