.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/plot_nonconforming_ggl.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_plot_nonconforming_ggl.py: Nonconforming Group Graphical Lasso experiment =================================================== Example script for Group Graphical Lasso with non-conforming dimension, i.e. some variables exist in some instances but not in all. We generate one underlying precision matrix and then drop one block of variables in each instance. .. GENERATED FROM PYTHON SOURCE LINES 9-28 .. code-block:: Python import numpy as np import pandas as pd from matplotlib import pyplot as plt import seaborn as sns from gglasso.helper.ext_admm_helper import create_group_array, construct_indexer from gglasso.helper.utils import sparsity from gglasso.helper.data_generation import generate_precision_matrix, group_power_network, sample_covariance_matrix from gglasso.helper.ext_admm_helper import check_G, consensus from gglasso.problem import glasso_problem K = 4 p = 50 M = 10 B = int(p/M) N = 200 .. GENERATED FROM PYTHON SOURCE LINES 29-35 Generating the data ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We generate one precision matrix, sample observations, and finally drop one block of variables in each instance. It is important that the observations for each instance is a ``DataFrame`` of shape ``(p_k,N_k)`` where the index has unique ids for each variable. .. GENERATED FROM PYTHON SOURCE LINES 35-52 .. code-block:: Python p_arr = (p-B)*np.ones(K, dtype = int) num_samples = N*np.ones(K) Sigma, Theta = generate_precision_matrix(p=p, M=M, style = 'powerlaw', gamma = 2.8, prob = 0.1, seed = 3456) all_obs = dict() S = dict() for k in np.arange(K): _, obs = sample_covariance_matrix(Sigma, N, seed = 456) # drop the k-th block starting from the end all_obs[k] = pd.DataFrame(obs).drop(np.arange(p-(k+1)*B, p-k*B), axis = 0) S[k] = np.cov(all_obs[k], bias = True) .. GENERATED FROM PYTHON SOURCE LINES 53-62 Creating the groups ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ In this section, we create two important objects for the non-conforming case using functionalities of ``GGLasso``: ``ix_location`` is a dataframe with every variable as index and each columns is one dataset. It contains the index of the variable in the respective instance (``NaN`` when not present). ``G`` is a bookeeping array which keeps count of the indices of each group of overlapping variables. It is needed in the solver later on. **Important:** We only consider pairs of variables which appear at least in 3 instances here! .. GENERATED FROM PYTHON SOURCE LINES 62-75 .. code-block:: Python ix_exist, ix_location = construct_indexer(list(all_obs.values())) G = create_group_array(ix_exist, ix_location, min_inst = K-1) check_G(G, p) print("Dimensions p_k: ", p_arr) print("Sample sizes N_k: ", num_samples) print("Number of groups found: ", G.shape[1]) .. rst-class:: sphx-glr-script-out .. code-block:: none Creation of bookeeping array... 10% finished 20% finished 30% finished 40% finished 50% finished 60% finished 70% finished 80% finished 90% finished Dimensions p_k: [45 45 45 45] Sample sizes N_k: [200. 200. 200. 200.] Number of groups found: 1075 .. GENERATED FROM PYTHON SOURCE LINES 76-82 Visualizing ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We visualize the case of non-conforming variables by plotting the given empirical covariance matrices. Missing variable observations are in **white**. .. GENERATED FROM PYTHON SOURCE LINES 82-96 .. code-block:: Python fig, axs = plt.subplots(2,2, figsize = (8,8)) for k in range(K): ind = ix_exist.index[ix_exist.loc[:,k]] S_k = pd.DataFrame(S[k], index = ind, columns = ind) # extend matrix by nonexistent variables S_k = S_k.reindex(columns = ix_exist.index, index = ix_exist.index) ax = axs.ravel()[k] sns.heatmap(S_k, ax = ax, cmap = plt.cm.coolwarm, linewidth = 0.005, linecolor = 'lightgrey',\ cbar = False, vmin = -.5, vmax = .5, xticklabels = [], yticklabels = []) ax.set_title(f"Empirical covariance, instance {k}") .. image-sg:: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_001.png :alt: Empirical covariance, instance 0, Empirical covariance, instance 1, Empirical covariance, instance 2, Empirical covariance, instance 3 :srcset: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 97-101 Defining the GGL problem ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We now create the instance of Group Graphical Lasso problem. As we are in the non-conforming case, we need to spcify the array ``G`` which we created before. .. GENERATED FROM PYTHON SOURCE LINES 101-106 .. code-block:: Python P = glasso_problem(S = S, N = N, reg = "GGL", reg_params = None, latent = True, G = G, do_scaling = True) print(P) .. rst-class:: sphx-glr-script-out .. code-block:: none GROUP GRAPHICAL LASSO PROBLEM WITH LATENT VARIABLES Regularization parameters: {'lambda1': None, 'lambda2': None, 'mu1': None} .. GENERATED FROM PYTHON SOURCE LINES 107-110 Model selection ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Set the regularization parameter grids and do model selection. .. GENERATED FROM PYTHON SOURCE LINES 110-121 .. code-block:: Python l1 = np.logspace(0,-2,7) mu1 = np.logspace(1,-1,3) l2 = np.logspace(0,-2,4) modelselect_params = {'lambda1_range' : l1, 'mu1_range': mu1, 'lambda2_range': l2} P.model_selection(modelselect_params = modelselect_params, method = 'eBIC', gamma = 0.1, tol = 1e-7, rtol = 1e-7) print(P.reg_params) .. rst-class:: sphx-glr-script-out .. code-block:: none ------------Range search for instance 0------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 81 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 81 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 80 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 95 iterations with status: optimal. ADMM terminated after 36 iterations with status: optimal. ADMM terminated after 38 iterations with status: optimal. ADMM terminated after 93 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 26 iterations with status: optimal. ADMM terminated after 47 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ------------Range search for instance 1------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 81 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 100 iterations with status: optimal. ADMM terminated after 37 iterations with status: optimal. ADMM terminated after 39 iterations with status: optimal. ADMM terminated after 90 iterations with status: optimal. ADMM terminated after 22 iterations with status: optimal. ADMM terminated after 26 iterations with status: optimal. ADMM terminated after 49 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 31 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ------------Range search for instance 2------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 81 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 81 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 80 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 94 iterations with status: optimal. ADMM terminated after 37 iterations with status: optimal. ADMM terminated after 39 iterations with status: optimal. ADMM terminated after 52 iterations with status: optimal. ADMM terminated after 23 iterations with status: optimal. ADMM terminated after 26 iterations with status: optimal. ADMM terminated after 49 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 31 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ------------Range search for instance 3------------ ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 82 iterations with status: optimal. ADMM terminated after 27 iterations with status: optimal. ADMM terminated after 25 iterations with status: optimal. ADMM terminated after 81 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 95 iterations with status: optimal. ADMM terminated after 37 iterations with status: optimal. ADMM terminated after 39 iterations with status: optimal. ADMM terminated after 98 iterations with status: optimal. ADMM terminated after 24 iterations with status: optimal. ADMM terminated after 26 iterations with status: optimal. ADMM terminated after 51 iterations with status: optimal. ADMM terminated after 33 iterations with status: optimal. ADMM terminated after 31 iterations with status: optimal. ADMM terminated after 34 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 18 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 19 iterations with status: optimal. ADMM terminated after 21 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 32 iterations with status: optimal. ADMM terminated after 20 iterations with status: optimal. ADMM terminated after 28 iterations with status: optimal. ADMM terminated after 45 iterations with status: optimal. ADMM terminated after 43 iterations with status: optimal. ADMM terminated after 144 iterations with status: optimal. ADMM terminated after 143 iterations with status: optimal. ADMM terminated after 271 iterations with status: optimal. ADMM terminated after 214 iterations with status: optimal. ADMM terminated after 140 iterations with status: optimal. ADMM terminated after 144 iterations with status: optimal. ADMM terminated after 254 iterations with status: optimal. ADMM terminated after 176 iterations with status: optimal. ADMM terminated after 208 iterations with status: optimal. ADMM terminated after 219 iterations with status: optimal. ADMM terminated after 225 iterations with status: optimal. ADMM terminated after 165 iterations with status: optimal. {'lambda1': 0.2154434690031884, 'lambda2': 0.01, 'mu1': array([10., 10., 10., 10.])} .. GENERATED FROM PYTHON SOURCE LINES 122-129 Evaluation of results ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We print the ratio of non-zero entries per instance. We plot the distribution of non-zero entries per group. With group sparsity regularization we aim for many groups with either no non-zero or many non-zero entries per group. .. GENERATED FROM PYTHON SOURCE LINES 129-143 .. code-block:: Python print("Solution sparsity (ratio of nonzero entries): ") for k in np.arange(K): print(f"Instance {k}: ", sparsity(P.solution.precision_[k])) stats = P.modelselect_stats.copy() nnz,_,_ = consensus(P.solution.precision_, G) fig, ax = plt.subplots() sns.histplot(nnz, discrete = True, ax = ax) ax.set_yscale('log') ax.set_title('Nonzero entries per group') .. image-sg:: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_002.png :alt: Nonzero entries per group :srcset: /auto_examples/images/sphx_glr_plot_nonconforming_ggl_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Solution sparsity (ratio of nonzero entries): Instance 0: 0.03636363636363636 Instance 1: 0.03636363636363636 Instance 2: 0.03737373737373737 Instance 3: 0.03737373737373737 Text(0.5, 1.0, 'Nonzero entries per group') .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 14.486 seconds) .. _sphx_glr_download_auto_examples_plot_nonconforming_ggl.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_nonconforming_ggl.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_nonconforming_ggl.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_nonconforming_ggl.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_