cdt.utils

cdt.utils.R

Loading and executing functions from R packages.

This module defines the interface between R and Python using subprocess. At initialization, the toolbox checks whether R is available and sets cdt.SETTINGS.r_is_available to True if the R framework is detected. Otherwise, this module is deactivated.

Next, each time an R function is called, the availability of the required R package is tested using the DefaultRPackages.check_R_package function. Only a limited set of packages is supported; the list is defined in DefaultRPackages.

If the package is available, launch_R_script executes the function by:

Copying the R script template and modifying it with the given arguments
Copying all the data to a temporary folder
Launching an R subprocess using the modified template and the data; the script saves its results in the temporary folder
Retrieving all the results in the Python process and cleaning up all the temporary files.
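The four steps above can be sketched in a few lines. This is a minimal illustration of the template-and-subprocess pattern, not cdt's actual implementation: the helper name run_templated_script, the {INPUT} and {OUTPUT} placeholder names and the file names are assumptions, and the Python interpreter stands in for Rscript so that the sketch runs without R installed.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical template. A real one would be an R script with similar placeholders.
TEMPLATE = """
with open(r"{INPUT}") as f:
    values = [float(v) for v in f.read().split()]
with open(r"{OUTPUT}", "w") as f:
    f.write(str(sum(values)))
"""


def run_templated_script(template, data, interpreter=sys.executable):
    """Fill a script template, run it in a subprocess and read back its result."""
    workdir = Path(tempfile.mkdtemp())
    try:
        # Copy the data to a temporary folder.
        input_path = workdir / "input.txt"
        output_path = workdir / "output.txt"
        input_path.write_text(" ".join(str(v) for v in data))
        # Replace the template placeholders with the given arguments.
        script = template
        arguments = {"{INPUT}": str(input_path), "{OUTPUT}": str(output_path)}
        for placeholder, value in arguments.items():
            script = script.replace(placeholder, value)
        script_path = workdir / "script.py"
        script_path.write_text(script)
        # Launch the subprocess; the script saves its result in the folder.
        subprocess.run([interpreter, str(script_path)], check=True)
        # Retrieve the result in the calling process.
        return float(output_path.read_text())
    finally:
        # Clean up all temporary files, even if the subprocess failed.
        shutil.rmtree(workdir)


result = run_templated_script(TEMPLATE, [1.0, 2.0, 3.5])
print(result)
```

Passing the path of an Rscript executable as interpreter, together with an R template, would give the same flow against R.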

Note

For custom R configurations or paths, a placeholder for the Rscript executable path is available at cdt.SETTINGS.rpath. It should be overridden with the full path as a string.

class cdt.utils.R.DefaultRPackages[source]

Defines the R packages that can be imported and checks their availability.

The attributes define all the R packages that can be imported. Each value is initialized to None; when an attribute is first accessed, the availability of the package is checked and the value is set to True or False accordingly. A package that has already been tested (whose value is not None) is not tested again.

Variables

~DefaultRPackages.pcalg (bool) – Availability of the pcalg R package
~DefaultRPackages.kpcalg (bool) – Availability of the kpcalg R package
~DefaultRPackages.bnlearn (bool) – Availability of the bnlearn R package
~DefaultRPackages.D2C (bool) – Availability of the D2C R package
~DefaultRPackages.SID (bool) – Availability of the SID R package
~DefaultRPackages.CAM (bool) – Availability of the CAM R package
~DefaultRPackages.RCIT (bool) – Availability of the RCIT R package

Warning

The RCIT package is not the original one (github.com/ericstrobl/RCIT) but an adaptation made to fit in the PC algorithm, available at: https://github.com/Diviyan-Kalainathan/RCIT

check_R_package(package)[source]

Execute a subprocess to check the package’s availability.

Parameters: package (str) – Name of the package to be tested.
Returns: True if the package is available, False otherwise
Return type: bool
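The lazy, cached check described for DefaultRPackages could look roughly like the sketch below. The class name PackageChecker and its attributes are hypothetical and this is not cdt's code; the R expression relies on requireNamespace and quit, which are standard R functions. When no Rscript executable is found, every package is reported as unavailable.

```python
import shutil
import subprocess


class PackageChecker:
    """Hypothetical helper: test each R package once and cache the answer."""

    def __init__(self, rscript="Rscript"):
        self.rscript = rscript
        self._cache = {}  # package name -> True / False, absent until tested

    def check(self, package):
        # A package that was already tested is not tested again.
        if package not in self._cache:
            self._cache[package] = self._probe(package)
        return self._cache[package]

    def _probe(self, package):
        if shutil.which(self.rscript) is None:
            return False  # R itself is not available
        code = 'if (!requireNamespace("%s", quietly = TRUE)) quit(status = 1)' % package
        try:
            done = subprocess.run(
                [self.rscript, "-e", code],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            return False
        return done.returncode == 0


checker = PackageChecker()
print(checker.check("pcalg"))  # True or False depending on the local R setup
```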

cdt.utils.R.launch_R_script(template, arguments, output_function=None, verbose=True, debug=False)[source]

Launch an R script, starting from a template and replacing text in file before execution.

Parameters

template (str) – path to the template of the R script
arguments (dict) – Maps each placeholder of the template to the value that replaces it
output_function (function) – Function called after the R script has run; its output is returned by launch_R_script. Typically used to retrieve the results of the execution.
verbose (bool) – Sets the verbosity of the R subprocess.
debug (bool) – If True, the generated scripts are not deleted.

Returns

The output of output_function if it is not None; otherwise True or False depending on whether the execution was successful.

cdt.utils.io

Formatting and import functions.

Author: Diviyan Kalainathan. Date: 2/06/17

cdt.utils.io.read_causal_pairs(filename, scale=False, **kwargs)[source]

Read a file in the ChaLearn cause-effect pairs challenge format and convert the data of each pair into numpy.ndarray.

Parameters

filename (str or pandas.DataFrame) – path of the file to read or DataFrame containing the data
scale (bool) – Scale the data
**kwargs – parameters to be passed to pandas.read_csv

Returns

Dataframe composed of (SampleID, a (numpy.ndarray) , b (numpy.ndarray))

Return type

pandas.DataFrame

Examples

>>> from cdt.utils import read_causal_pairs
>>> data = read_causal_pairs('file.tsv', scale=True, sep='\t')
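To make the input layout concrete, here is a dependency-free sketch of the parsing step. It assumes the usual ChaLearn layout (one row per pair, with columns SampleID, A and B, where A and B hold space-separated values) and assumes that scale means standardizing each variable to zero mean and unit variance; parse_pairs and SAMPLE are hypothetical names, and the real function returns a pandas.DataFrame rather than a list.

```python
import csv
import io
from statistics import mean, pstdev

# Hypothetical two-pair file in the assumed ChaLearn layout.
SAMPLE = """SampleID,A,B
pair1,0.1 0.5 0.9,1.0 2.0 3.0
pair2,4 5,6 8
"""


def standardize(values):
    """Shift to zero mean and divide by the population standard deviation."""
    center = mean(values)
    spread = pstdev(values) or 1.0  # avoid dividing by zero on constant data
    return [(v - center) / spread for v in values]


def parse_pairs(text, scale=False):
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        a = [float(v) for v in record["A"].split()]
        b = [float(v) for v in record["B"].split()]
        if scale:
            a, b = standardize(a), standardize(b)
        rows.append((record["SampleID"], a, b))
    return rows


pairs = parse_pairs(SAMPLE)
scaled_pairs = parse_pairs(SAMPLE, scale=True)
print(pairs[0])
```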

cdt.utils.io.read_adjacency_matrix(filename, directed=True, **kwargs)[source]

Read a file (containing an adjacency matrix) and convert it into a directed or undirected networkx graph.

Parameters

filename (str or pandas.DataFrame) – file to read or DataFrame containing the data
directed (bool) – Return directed graph
kwargs – extra parameters to be passed to pandas.read_csv

Returns

networkx graph containing the graph.

Return type

networkx.DiGraph or networkx.Graph depending on the directed parameter.

Examples

>>> from cdt.utils import read_adjacency_matrix
>>> data = read_adjacency_matrix('graph_file.csv', directed=False)

cdt.utils.io.read_list_edges(filename, directed=True, **kwargs)[source]

Read a file (containing list of edges) and convert it into a directed or undirected networkx graph.

Parameters

filename (str or pandas.DataFrame) – file to read or DataFrame containing the data
directed (bool) – Return directed graph
kwargs – extra parameters to be passed to pandas.read_csv

Returns

networkx graph containing the graph.

Return type

networkx.DiGraph or networkx.Graph depending on the directed parameter.

Examples

>>> from cdt.utils import read_list_edges
>>> data = read_list_edges('graph_file.csv', directed=False)
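The two readers above accept different layouts of the same information. The sketch below, written without pandas or networkx, shows how an adjacency matrix and a list of edges can describe the same graph. The column names (Cause, Effect, Score), the labelled header row and the function names are assumptions made for the illustration, not a statement about what cdt requires.

```python
import csv
import io

# Hypothetical files: the same three-node graph in both layouts.
ADJACENCY = """,a,b,c
a,0,1,0
b,0,0,2
c,0,0,0
"""

EDGE_LIST = """Cause,Effect,Score
a,b,1
b,c,2
"""


def edges_from_adjacency(text):
    """Every non-zero cell (row, column) becomes a weighted edge row -> column."""
    rows = list(csv.reader(io.StringIO(text)))
    nodes = rows[0][1:]
    edges = {}
    for row in rows[1:]:
        source = row[0]
        for target, weight in zip(nodes, row[1:]):
            if float(weight) != 0:
                edges[(source, target)] = float(weight)
    return edges


def edges_from_list(text):
    """Every line (source, target, weight) becomes a weighted edge."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header line
    return {(source, target): float(weight) for source, target, weight in reader}


def as_undirected(edges):
    """Drop the orientation by sorting the endpoints of each edge."""
    return {tuple(sorted(pair)): weight for pair, weight in edges.items()}


from_matrix = edges_from_adjacency(ADJACENCY)
from_list = edges_from_list(EDGE_LIST)
print(from_matrix == from_list)
```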

cdt.utils.graph

Graph utilities not included in NetworkX.

cdt.utils.graph.network_deconvolution(mat, **kwargs)[source]

Python implementation/translation of network deconvolution by MIT-KELLIS LAB.

Note

For networkx graphs, use the cdt.utils.graph.remove_indirect_links function.

Code author: gidonro (GitHub username), https://github.com/gidonro/Network-Deconvolution

LICENSE: MIT-KELLIS LAB

AUTHORS: Algorithm was programmed by Soheil Feizi. Paper authors are S. Feizi, D. Marbach, M. Médard and M. Kellis. Python implementation: Gideon Rosenthal

For more details, see the following paper: Network Deconvolution as a General Method to Distinguish Direct Dependencies over Networks

By: Soheil Feizi, Daniel Marbach, Muriel Médard and Manolis Kellis Nature Biotechnology

Parameters

mat (numpy.ndarray) – Input matrix. If it is square, the program assumes it is a relevance matrix where mat(i,j) represents the similarity content between nodes i and j. Elements of the matrix should be non-negative.
beta (float) – Scaling parameter, the program maps the largest absolute eigenvalue of the direct dependency matrix to beta. It should be between 0 and 1.
alpha (float) – Fraction of edges of the observed dependency matrix to be kept in the deconvolution process.
control (int) – If 0, direct weights are returned for observed interactions only; if 1, direct weights are returned for both observed and non-observed interactions.

Returns

Output deconvolved matrix (direct dependency matrix). Its components represent direct edge weights of observed interactions. Choosing top direct interactions (a cut-off) depends on the application and is not implemented in this code.

Return type

numpy.ndarray

Example

>>> from cdt.utils.graph import network_deconvolution
>>> import networkx as nx
>>> # Generate sample data
>>> from cdt.data import AcyclicGraphGenerator
>>> graph = AcyclicGraphGenerator('linear').generate()[1]
>>> adj_mat = nx.adjacency_matrix(graph).todense()
>>> output = network_deconvolution(adj_mat)

Note

To apply ND on regulatory networks, follow steps explained in Supplementary notes 1.4.1 and 2.1 and 2.3 of the paper. In this implementation, input matrices are made symmetric.
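As a reading aid, the relation that network deconvolution inverts can be written out. This is a sketch based on the cited paper rather than on this implementation, and it leaves out the beta rescaling and the alpha thresholding described in the parameters. If the observed matrix is modelled as the sum of the direct effects and of all indirect paths, then, provided the series converges,

\[G_{obs} = G_{dir} + G_{dir}^2 + G_{dir}^3 + \cdots = G_{dir} (I - G_{dir})^{-1} \quad \Longrightarrow \quad G_{dir} = G_{obs} (I + G_{obs})^{-1}\]

With the eigendecomposition \(G_{obs} = U \Lambda_{obs} U^{-1}\), the same relation holds eigenvalue by eigenvalue:

\[\lambda_{dir} = \frac{\lambda_{obs}}{1 + \lambda_{obs}}\]

The series converges when every eigenvalue of \(G_{dir}\) has absolute value below 1, which is why beta is expected to lie between 0 and 1.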

cdt.utils.graph.aracne(m, **kwargs)[source]

Implementation of the ARACNE algorithm.

Note

For networkx graphs, use the cdt.utils.graph.remove_indirect_links function

Parameters: m (numpy.ndarray) – Input matrix. If it is square, the program assumes it is a relevance matrix where m(i,j) represents the similarity content between nodes i and j. Elements of the matrix should be non-negative.
Returns: Output deconvolved matrix (direct dependency matrix). Its components represent direct edge weights of observed interactions.
Return type: numpy.ndarray

Example

>>> from cdt.utils.graph import aracne
>>> import networkx as nx
>>> # Generate sample data
>>> from cdt.data import AcyclicGraphGenerator
>>> graph = AcyclicGraphGenerator('linear').generate()[1]
>>> adj_mat = nx.adjacency_matrix(graph).todense()
>>> output = aracne(adj_mat)

Note

Ref: ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context. Adam A Margolin, Ilya Nemenman, Katia Basso, Chris Wiggins, Gustavo Stolovitzky, Riccardo Dalla Favera and Andrea Califano. DOI: https://doi.org/10.1186/1471-2105-7-S1-S7
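The pruning idea behind ARACNE, the data processing inequality, can be illustrated on a small matrix: in every fully connected triplet of nodes, the weakest of the three links is treated as indirect and removed. The sketch below is a simplified, dependency-free version (no tolerance threshold, plain nested lists); prune_weakest_in_triplets is a hypothetical name and the code is not cdt's implementation.

```python
from itertools import combinations


def prune_weakest_in_triplets(mat):
    """Zero out the weakest edge of every fully connected triplet.

    `mat` is a symmetric, non-negative relevance matrix given as nested lists.
    Edges are first marked on the original matrix, then removed all at once,
    so one removal never influences the decision for another triplet.
    """
    n = len(mat)
    marked = set()
    for i, j, k in combinations(range(n), 3):
        triplet = [((i, j), mat[i][j]), ((i, k), mat[i][k]), ((j, k), mat[j][k])]
        if all(weight > 0 for _, weight in triplet):
            weakest_edge, _ = min(triplet, key=lambda item: item[1])
            marked.add(weakest_edge)
    pruned = [row[:] for row in mat]
    for a, b in marked:
        pruned[a][b] = pruned[b][a] = 0.0
    return pruned


relevance = [
    [0.0, 0.9, 0.2],
    [0.9, 0.0, 0.8],
    [0.2, 0.8, 0.0],
]
pruned = prune_weakest_in_triplets(relevance)
print(pruned)
```

Here the link between nodes 0 and 2 is the weakest of the triangle, so it is the one that is dropped.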

cdt.utils.graph.clr(M, **kwargs)[source]

Implementation of the Context Likelihood of Relatedness (CLR) network algorithm.

Note

For networkx graphs, use the cdt.utils.graph.remove_indirect_links function

Parameters: M (numpy.ndarray) – Input matrix. If it is square, the program assumes it is a relevance matrix where M(i,j) represents the similarity content between nodes i and j. Elements of the matrix should be non-negative.
Returns: Output deconvolved matrix (direct dependency matrix). Its components represent direct edge weights of observed interactions.
Return type: numpy.ndarray

Example

>>> from cdt.utils.graph import clr
>>> import networkx as nx
>>> # Generate sample data
>>> from cdt.data import AcyclicGraphGenerator
>>> graph = AcyclicGraphGenerator('linear').generate()[1]
>>> adj_mat = nx.adjacency_matrix(graph).todense()
>>> output = clr(adj_mat)

Note

Ref: Jeremiah J. Faith, Boris Hayete, Joshua T. Thaden, Ilaria Mogno, Jamey Wierzbowski, Guillaume Cottarel, Simon Kasif, James J. Collins, and Timothy S. Gardner. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biology, 2007
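The scoring rule of CLR, as described in the referenced paper, compares each entry with the background of its row and of its column: both z-scores are clipped at zero and combined as the square root of the sum of their squares. The sketch below applies that rule to a symmetric matrix stored as nested lists. The name clr_scores and the use of the population standard deviation are assumptions; this is an illustration and not cdt's implementation.

```python
import math
from statistics import mean, pstdev


def clr_scores(mat):
    """Combine clipped row-wise z-scores of a symmetric relevance matrix."""
    n = len(mat)
    centers = [mean(row) for row in mat]
    spreads = [pstdev(row) for row in mat]

    def clipped_z(i, j):
        # z-score of entry (i, j) against the background of row i, clipped at 0.
        if spreads[i] == 0:
            return 0.0
        return max(0.0, (mat[i][j] - centers[i]) / spreads[i])

    return [
        [math.sqrt(clipped_z(i, j) ** 2 + clipped_z(j, i) ** 2) for j in range(n)]
        for i in range(n)
    ]


relevance = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]
scores = clr_scores(relevance)
print(scores[0][1] > scores[0][2])
```

An entry only scores highly when it stands out from the other values involving the same two nodes, which is how weak, background-level links are suppressed.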

cdt.utils.graph.remove_indirect_links(g, alg='aracne', **kwargs)[source]

Apply deconvolution to a networkx graph.

Parameters

g (networkx.Graph) – Graph to apply deconvolution to
alg (str) – Algorithm to use (‘aracne’, ‘clr’, ‘nd’)
kwargs (dict) – extra options for algorithms

Returns

Graph with indirect links removed.

Return type

networkx.Graph

Example

>>> from cdt.utils.graph import remove_indirect_links
>>> import networkx as nx
>>> # Generate sample data
>>> from cdt.data import AcyclicGraphGenerator
>>> graph = AcyclicGraphGenerator('linear').generate()[1]
>>> output = remove_indirect_links(graph, alg='aracne')

cdt.utils.graph.dagify_min_edge(g)[source]

Input a graph and output a DAG.

The heuristic is to reverse the edge with the lowest score of the cycle if possible, else remove it.

Parameters: g (networkx.DiGraph) – Graph to modify to output a DAG
Returns: DAG made out of the input graph.
Return type: networkx.DiGraph

Example

>>> from cdt.utils.graph import dagify_min_edge
>>> import networkx as nx
>>> import numpy as np
>>> # Generate sample data
>>> graph = nx.DiGraph((np.ones(4) - np.eye(4)) *
...                    np.random.uniform(size=(4,4)))
>>> output = dagify_min_edge(graph)
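The heuristic described above (reverse the lowest-scoring edge of a cycle when possible, otherwise remove it) can be sketched without networkx on a dictionary of weighted edges. The function names are hypothetical, and the rule used here for "if possible" (the reversed edge must not already exist and must not close a new cycle) is an assumption that may differ from cdt's own criterion.

```python
def find_cycle(edges):
    """Return one directed cycle as a list of edges, or None if there is none."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
        successors.setdefault(v, [])
    state = {}  # node -> "active" while on the DFS path, "done" afterwards
    path = []

    def visit(node):
        state[node] = "active"
        path.append(node)
        for nxt in successors[node]:
            if state.get(nxt) == "active":
                nodes = path[path.index(nxt):] + [nxt]
                return list(zip(nodes[:-1], nodes[1:]))
            if nxt not in state:
                found = visit(nxt)
                if found is not None:
                    return found
        path.pop()
        state[node] = "done"
        return None

    for node in list(successors):
        if node not in state:
            found = visit(node)
            if found is not None:
                return found
    return None


def reachable(edges, source, target):
    """True if a directed path leads from source to target."""
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for u, v in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def dagify(weighted_edges):
    edges = dict(weighted_edges)
    while True:
        cycle = find_cycle(edges)
        if cycle is None:
            return edges
        u, v = min(cycle, key=lambda edge: edges[edge])
        weight = edges.pop((u, v))
        # The reversed edge v -> u closes a new cycle only if u still reaches v.
        if u != v and (v, u) not in edges and not reachable(edges, u, v):
            edges[(v, u)] = weight


dag = dagify({("a", "b"): 0.9, ("b", "c"): 0.8, ("c", "a"): 0.1})
print(dag)
```

Because a reversed edge is only added when it closes no cycle, every iteration removes at least one cycle and never creates one, so the loop terminates.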

cdt.utils.loss

PyTorch implementation of losses and tools.

class cdt.utils.loss.MMDloss(input_size, bandwidths=None)[source]

[torch.nn.Module] Maximum Mean Discrepancy Metric to compare empirical distributions.

The MMD score is defined by:

\[\widehat{MMD_k}(\mathcal{D}, \widehat{\mathcal{D}}) = \frac{1}{n^2} \sum_{i, j = 1}^{n} k(x_i, x_j) + \frac{1}{n^2} \sum_{i, j = 1}^{n} k(\hat{x}_i, \hat{x}_j) - \frac{2}{n^2} \sum_{i,j = 1}^n k(x_i, \hat{x}_j)\]

where \(\mathcal{D} \text{ and } \widehat{\mathcal{D}}\) represent respectively the observed and empirical distributions, \(k\) represents the RBF kernel and \(n\) the batch size.

Parameters

input_size (int) – Fixed batch size.
bandwidths (list) – List of bandwidths to take into account. Defaults to [0.01, 0.1, 1, 10, 100]
device (str) – PyTorch device on which the computation will be made. Defaults to cdt.SETTINGS.default_device.

Inputs: empirical, observed

Forward pass: Takes both the true samples and the generated sample in any order and returns the MMD score between the two empirical distributions.

empirical distribution of shape (batch_size, features): torch.Tensor containing the empirical distribution
observed distribution of shape (batch_size, features): torch.Tensor containing the observed distribution.

Outputs: score

score of shape (1): torch.Tensor containing the loss value.

Note

Ref: Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., & Smola, A. (2012). A kernel two-sample test. Journal of Machine Learning Research, 13(Mar), 723-773.

Example

>>> from cdt.utils.loss import MMDloss
>>> import torch as th
>>> x, y = th.randn(100,10), th.randn(100, 10)
>>> mmd = MMDloss(100)  # 100 is the batch size
>>> mmd(x, y)
0.0766
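For readers who want to check the estimator by hand, the MMD formula given above can be evaluated directly on small lists. The sketch below uses a single RBF kernel of the form exp(-gamma * ||x - y||^2); that kernel parametrization, the single bandwidth and the function names are assumptions made for the illustration, whereas MMDloss works on tensors and combines several bandwidths.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd(observed, generated, gamma=1.0):
    """Biased MMD estimate between two samples of the same size n."""
    n = len(observed)
    if len(generated) != n:
        raise ValueError("both samples must contain the same number of points")
    k_xx = sum(rbf(a, b, gamma) for a in observed for b in observed)
    k_yy = sum(rbf(a, b, gamma) for a in generated for b in generated)
    k_xy = sum(rbf(a, b, gamma) for a in observed for b in generated)
    return (k_xx + k_yy - 2 * k_xy) / n ** 2


near = [[0.0], [1.0]]
far = [[5.0], [6.0]]
print(mmd(near, near), mmd(near, far))
```

The score is zero when both samples coincide and grows as the two samples move apart.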

class cdt.utils.loss.MomentMatchingLoss(n_moments=1)[source]

[torch.nn.Module] L2 loss between the first k moments of two distributions, k being a parameter.

These moments are raw moments and not normalized. The loss is an L2 loss between the moments:

\[MML(X, Y) = \sum_{m=1}^{m^*} \left( \frac{1}{n_x} \sum_{i=1}^{n_x} {x_i}^m - \frac{1}{n_y} \sum_{j=1}^{n_y} {y_j}^m \right)^2\]

where \(m^*\) represents the number of moments to compute.

Parameters: n_moments (int) – Number of moments to compute.

Input: (X, Y)

X represents the first empirical distribution in a torch.Tensor of shape (?, features)
Y represents the second empirical distribution in a torch.Tensor of shape (?, features)

Output: mml

mml is the output of the forward pass and is differentiable. torch.Tensor of shape (1)

Example

>>> from cdt.utils.loss import MomentMatchingLoss
>>> import torch as th
>>> x, y = th.randn(100,10), th.randn(100, 10)
>>> mml = MomentMatchingLoss(4)
>>> mml(x, y)
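The MML formula above is easy to verify on scalar samples. The following sketch evaluates it with plain Python lists; raw_moment and moment_matching_loss are hypothetical names, and the one-dimensional setting is a simplification of the tensor version.

```python
def raw_moment(samples, order):
    """Raw (non-central, non-normalized) moment of the given order."""
    return sum(x ** order for x in samples) / len(samples)


def moment_matching_loss(xs, ys, n_moments=1):
    """Sum of squared differences between the first n_moments raw moments."""
    return sum(
        (raw_moment(xs, m) - raw_moment(ys, m)) ** 2
        for m in range(1, n_moments + 1)
    )


xs = [1.0, 2.0, 3.0]
ys = [2.0, 3.0, 4.0]
print(moment_matching_loss(xs, ys, n_moments=1))
print(moment_matching_loss(xs, ys, n_moments=2))
```

With one moment only the means (2 and 3) are compared. With two moments the squared gap between the second raw moments (14/3 and 29/3) is added, which shows how quickly higher moments dominate when the data are not scaled.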

class cdt.utils.loss.TTestCriterion(max_iter, runs_per_iter, threshold=0.01)[source]

A loop criterion based on t-test to check significance of results.

Parameters

max_iter (int) – Maximum number of iterations authorized
runs_per_iter (int) – Number of runs performed per iteration
threshold (float) – p-value threshold, under which the loop is stopped.

Example

>>> from cdt.utils.loss import TTestCriterion
>>> l = TTestCriterion(50,5)
>>> x, y = [], []
>>> while l.loop(x, y):
...     # compute loop and update results in x, y
>>> x, y  # Two lists with significant difference in score

loop(xy, yx)[source]

Tests the loop condition based on the new results and the parameters.

Parameters

xy (list) – list containing all the results for one set of samples
yx (list) – list containing all the results for the other set.

Returns

True if the loop has to continue, False otherwise.

Return type

bool

cdt.utils.loss.notears_constr(adj_m, max_pow=None)[source]

No Tears constraint for binary adjacency matrices. Represents a differentiable constraint to converge towards a DAG.

Warning

If adj_m is non binary: Feed adj_m * adj_m as input (Hadamard product).

Parameters

adj_m (array-like) – Adjacency matrix of the graph
max_pow (int) – Highest power up to which the infinite sum is computed. Defaults to the size of the adjacency matrix.

Returns

Scalar value of the loss; its type depends on the type of the input.

Return type

np.ndarray or torch.Tensor

Note

Zheng, X., Aragam, B., Ravikumar, P. K., & Xing, E. P. (2018). DAGs with NO TEARS: Continuous Optimization for Structure Learning. In Advances in Neural Information Processing Systems (pp. 9472-9483).
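The constraint from the referenced paper is tr(exp(A)) - d for a non-negative d x d matrix A, which is zero exactly when A contains no cycle. Expanding the exponential gives a sum of tr(A^k) / k!, and truncating that sum at max_pow yields the sketch below. It uses nested lists instead of arrays or tensors, and the name acyclicity_penalty is hypothetical; cdt's own function may differ in its details.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def acyclicity_penalty(adj, max_pow=None):
    """Truncated series: sum over k = 1..max_pow of trace(adj^k) / k!."""
    n = len(adj)
    if max_pow is None:
        max_pow = n  # a simple cycle in a graph of n nodes has length <= n
    term = [row[:] for row in adj]  # holds adj^k / k!, starting with k = 1
    total = 0.0
    for k in range(1, max_pow + 1):
        total += sum(term[i][i] for i in range(n))
        term = [[value / (k + 1) for value in row] for row in matmul(term, adj)]
    return total


dag = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
two_cycle = [[0, 1], [1, 0]]
print(acyclicity_penalty(dag), acyclicity_penalty(two_cycle))
```

Entry (i, i) of A^k counts the closed walks of length k through node i, so every trace is zero for a DAG and the penalty is positive as soon as one cycle exists.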

cdt.utils.parallel

This module introduces tools for execution of jobs in parallel.

By default, joblib is used for easy and efficient execution of parallel tasks. However, joblib does not support GPU management and does not kill processes at the end of each task, so the PyTorch execution context stays in GPU memory.

This module introduces equivalent tools for multiprocessing while avoiding GPU memory leaks. The functions provided here make use of GPUs; when no GPU is used, joblib is called instead.

cdt.utils.parallel.parallel_run(function, *args, nruns=None, njobs=None, gpus=None, **kwargs)[source]

Multiprocessed execution of a function with parameters, with GPU management.

This function is useful when the user wants to run a bootstrap of a function on GPU devices, as joblib does not include such a feature.

Parameters

function (function) – Function to execute.
*args – arguments going to be fed to the function.
nruns (int) – Total number of executions of the function.
njobs (int) – Number of parallel executions (defaults to cdt.SETTINGS.NJOBS).
gpus (int) – Number of GPU devices allocated to the job (defaults to cdt.SETTINGS.GPU)
**kwargs – Keyword arguments going to be fed to the function.

Returns

concatenated list of outputs of executions. The order of elements does not correspond to the initial order.

Return type

list

cdt.utils.parallel.parallel_run_generator(function, generator, njobs=None, gpus=None)[source]

Multiprocessed execution of a function with parameters, with GPU management.

Variant of the `cdt.utils.parallel.parallel_run` function, except that this function takes an iterable in place of args, kwargs and nruns.

Parameters

function (function) – Function to execute.
*args – arguments going to be fed to the function.
generator (iterable) – generator or list with the arguments for each run; each element must be a tuple of ([args], {kwargs}).
njobs (int) – Number of parallel executions (defaults to cdt.SETTINGS.NJOBS).
gpus (int) – Number of GPU devices allocated to the job (defaults to cdt.SETTINGS.GPU)
**kwargs – Keyword arguments going to be fed to the function.

Returns

concatenated list of outputs of executions. The order of elements does correspond to the initial order.

Return type

list
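The shape of the generator argument and the ordering guarantee can be illustrated with the standard library. In the sketch below a thread pool stands in for the process and GPU management of this module, which is an assumption made to keep the example self-contained; run_over_generator and power are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_over_generator(function, generator, njobs=2):
    """Call function(*args, **kwargs) for every ([args], {kwargs}) element.

    Results are collected in submission order, so the output list follows
    the order of the generator even if the runs finish out of order.
    """
    tasks = list(generator)
    with ThreadPoolExecutor(max_workers=njobs) as pool:
        futures = [pool.submit(function, *args, **kwargs) for args, kwargs in tasks]
        return [future.result() for future in futures]


def power(base, exponent=1):
    return base ** exponent


# Each element is a tuple ([args], {kwargs}), as described for `generator`.
runs = (([base], {"exponent": 2}) for base in range(5))
results = run_over_generator(power, runs)
print(results)
```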

cdt.utils.torch

PyTorch utilities for models.

Author: Diviyan Kalainathan, Olivier Goudet Date: 09/3/2018

class cdt.utils.torch.ChannelBatchNorm1d(num_channels, num_features, *args, **kwargs)[source]

Applies Batch Normalization over a 2D or 3D input (a mini-batch of 1D inputs with optional additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

The mean and standard-deviation are calculated per-dimension over the mini-batches and \(\gamma\) and \(\beta\) are learnable parameter vectors of size C (where C is the input size).

By default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1.

If track_running_stats is set to False, this layer then does not keep running estimates, and batch statistics are instead used during evaluation time as well.

Note

This momentum argument is different from the one used in optimizer classes and from the conventional notion of momentum. Mathematically, the update rule for running statistics here is \(\hat{x}_\text{new} = (1 - \text{momentum}) \times \hat{x} + \text{momentum} \times x_t\), where \(\hat{x}\) is the estimated statistic and \(x_t\) is the new observed value.

Because the Batch Normalization is done over the C dimension, computing statistics on (N, L) slices, it’s common terminology to call this Temporal Batch Normalization.
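The normalization formula and the running-statistics update rule above can be traced on a single feature with plain Python. This sketch fixes gamma to 1 and beta to 0, tracks only the running mean, and uses a hypothetical function name; it illustrates the arithmetic and is not the layer's implementation.

```python
import math


def batch_norm_step(batch, running_mean, momentum=0.1, eps=1e-5):
    """Normalize one mini-batch of scalars and update the running mean."""
    n = len(batch)
    batch_mean = sum(batch) / n
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
    # y = (x - E[x]) / sqrt(Var[x] + eps), with gamma = 1 and beta = 0.
    normalized = [(x - batch_mean) / math.sqrt(batch_var + eps) for x in batch]
    # x_new = (1 - momentum) * x_hat + momentum * x_t
    new_running_mean = (1 - momentum) * running_mean + momentum * batch_mean
    return normalized, new_running_mean


normalized, running_mean = batch_norm_step([8.0, 12.0], running_mean=0.0)
print(normalized, running_mean)
```

With momentum 0.1 the running mean moves only a tenth of the way towards each new batch mean, here from 0 to 1 for a batch mean of 10.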

Parameters

num_features – \(C\) from an expected input of size \((N, C, L)\) or \(L\) from input of size \((N, L)\)
eps – a value added to the denominator for numerical stability. Default: 1e-5
momentum – the value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e. simple average). Default: 0.1
affine – a boolean value that when set to True, this module has learnable affine parameters. Default: True
track_running_stats – a boolean value that when set to True, this module tracks the running mean and variance, and when set to False, this module does not track such statistics and always uses batch statistics in both training and eval modes. Default: True

Shape:

Input: \((N, C)\) or \((N, C, L)\)
Output: \((N, C)\) or \((N, C, L)\) (same shape as input)

Examples:

>>> import torch
>>> from torch import nn
>>> # With Learnable Parameters
>>> m = nn.BatchNorm1d(100)
>>> # Without Learnable Parameters
>>> m = nn.BatchNorm1d(100, affine=False)
>>> input = torch.randn(20, 100)
>>> output = m(input)

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class cdt.utils.torch.Linear3D(channels, in_features, out_features, batch_size=-1, bias=True, noise=False)[source]

Applies a linear transformation to the incoming data: \(y = Ax + b\).

Parameters

in_features – size of each input sample
out_features – size of each output sample
bias – If set to False, the layer will not learn an additive bias. Default: True

Shape:

Input: \((N, *, in\_features)\) where \(*\) means any number of additional dimensions
Output: \((N, *, out\_features)\) where all but the last dimension are the same shape as the input.

Variables

~Linear3D.weight – the learnable weights of the module of shape (out_features x in_features)
~Linear3D.bias – the learnable bias of the module of shape (out_features)

Examples:

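>>> import torch
>>> from cdt.utils.torch import Linear3D
>>> m = Linear3D(3, 20, 30)  # channels=3, in_features=20, out_features=30
>>> input = torch.randn(128, 20)
>>> output = m(input)
>>> print(output.size())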

extra_repr()[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(input, adj_matrix=None, permutation_matrix=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

cdt.utils.torch.gumbel_softmax(logits, tau=1, hard=False, eps=1e-10)[source]

Sample from the Gumbel-Softmax distribution and optionally discretize. Implementation taken from PyTorch (https://github.com/pytorch/pytorch/blob/e4eee7c2cf43f4edba7a14687ad59d3ed61d9833/torch/nn/functional.py).

Parameters

logits – [batch_size, n_class] unnormalized log-probs
tau – non-negative scalar temperature
hard – if True, take argmax, but differentiate w.r.t. soft sample y

Returns

[batch_size, n_class] sample from the Gumbel-Softmax distribution. If hard=True, then the returned sample will be one-hot, otherwise it will be a probability distribution that sums to 1 across classes.

Constraints

This implementation only works on batch_size x num_features tensors for now. Based on https://github.com/ericjang/gumbel-softmax/blob/3c8584924603869e90ca74ac20a6a03d99a91ef9/Categorical%20VAE.ipynb (MIT license).
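The sampling step can be followed on a single row of logits without PyTorch. The sketch below draws Gumbel noise as -log(-log(u)) with u uniform on (0, 1), adds it to the logits, divides by the temperature and applies a softmax. The function name, the clamping of u by eps and the rng argument are assumptions of this illustration; in particular, the hard branch here simply returns a one-hot vector, since plain Python has no gradients to pass through.

```python
import math
import random


def gumbel_softmax_sample(logits, tau=1.0, hard=False, eps=1e-10, rng=random):
    """Draw one Gumbel-Softmax sample for a single row of logits."""
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so that both logarithms are defined.
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    weights = [math.exp(value - top) for value in noisy]
    total = sum(weights)
    soft = [weight / total for weight in weights]
    if not hard:
        return soft
    best = soft.index(max(soft))
    return [1.0 if index == best else 0.0 for index in range(len(soft))]


rng = random.Random(0)
soft_sample = gumbel_softmax_sample([1.0, 2.0, 3.0], tau=0.5, rng=rng)
hard_sample = gumbel_softmax_sample([1.0, 2.0, 3.0], tau=0.5, hard=True, rng=rng)
print(soft_sample, hard_sample)
```

Lower temperatures push the soft sample towards a one-hot vector, while higher temperatures flatten it towards a uniform distribution.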

class cdt.utils.torch.MatrixSampler(graph_size, mask=None, gumble=False)[source]

Matrix Sampler, following a Bernoulli distribution. Differentiable.

forward(tau=1, drawhard=True)[source]: Return a sampled graph.