cdt.causality
cdt.causality.pairwise
- class cdt.causality.pairwise.model.PairwiseModel[source]
Base class for all pairwise causal inference models.
Used with undirected/directed graphs and the CEPC DataFrame format.
- orient_graph(df_data, graph, printout=None, **kwargs)[source]
Orient an undirected graph using the pairwise method defined by the subclass.
The pairwise method is run on every undirected edge.
- Parameters
df_data (pandas.DataFrame) – Data
graph (networkx.Graph) – Graph to orient
printout (str) – (optional) Path to a file in which to save temporary results
- Returns
a directed graph, which might contain cycles
- Return type
networkx.DiGraph
Warning
Requirement: the names of the nodes in the graph must correspond to the names of the variables in df_data
- predict(x, *args, **kwargs)[source]
Generic predict method; chooses the subfunction best suited to the arguments.
Depending on the type of x and of *args, this function dispatches to different functions, in the following priority order (a usage sketch follows this entry):
- If args[0] is a networkx.(Di)Graph, then self.orient_graph is executed.
- If args[0] exists, then self.predict_proba is executed.
- If x is a pandas.DataFrame, then self.predict_dataset is executed.
- If x is a pandas.Series, then self.predict_proba is executed.
- Parameters
x (numpy.array or pandas.DataFrame or pandas.Series) – First variable or dataset.
args (numpy.array or networkx.Graph) – graph or second variable.
- Returns
the prediction output
- Return type
pandas.DataFrame or networkx.DiGraph
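The sketch below makes this dispatch concrete, with ANM standing in for a concrete subclass; it only exercises call forms shown in the examples further down this page, and the indexing into the pairs DataFrame is illustrative.

# Hedged usage sketch of the predict() dispatch, using ANM as the subclass.
import networkx as nx
from cdt.causality.pairwise import ANM
from cdt.data import load_dataset

obj = ANM()
data, labels = load_dataset('tuebingen')   # CEPC-format DataFrame of pairs
scores = obj.predict(data)                 # DataFrame -> predict_dataset
a, b = data.iloc[0, 0], data.iloc[0, 1]    # the two raw variables of one pair
score = obj.predict(a, b)                  # x plus args[0] -> predict_proba

data, graph = load_dataset('sachs')
dag = obj.predict(data, nx.Graph(graph))   # graph argument -> orient_graph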
- predict_dataset(x, **kwargs)[source]
Generic dataset prediction function.
Runs the score independently on all pairs.
- Parameters
x (pandas.DataFrame) – a CEPC-format DataFrame.
kwargs (dict) – additional arguments for the algorithms
- Returns
a DataFrame with the predictions.
- Return type
pandas.DataFrame
- predict_proba(dataset, idx=0, **kwargs)[source]
Prediction method for pairwise causal inference.
predict_proba is meant to be overridden in all subclasses.
- Parameters
dataset (tuple) – Couple of np.ndarray variables to classify
idx (int) – (optional) index number for printing purposes
- Returns
Causation score (Value: 1 if a->b and -1 if b->a)
- Return type
float
ANM
- class cdt.causality.pairwise.ANM[source]
ANM algorithm.
Description: The Additive Noise Model is one of the most popular approaches for pairwise causality. It is based on the fit of the data to an additive noise model in one direction and the rejection of the model in the other direction.
Data Type: Continuous
Assumptions: Assuming that \(x\rightarrow y\), we suppose that the data follows an additive noise model, i.e. \(y=f(x)+E\), with E a noise variable and f a deterministic function. Causal inference rests on the independence between x and E: it is proven that if the data is generated by an additive noise model, the model can only be fitted in the true causal direction (a standalone sketch of this principle follows the example below).
Note
Ref : Hoyer, Patrik O and Janzing, Dominik and Mooij, Joris M and Peters, Jonas and Schölkopf, Bernhard, “Nonlinear causal discovery with additive noise models”, NIPS 2008 https://papers.nips.cc/paper/3548-nonlinear-causal-discovery-with-additive-noise-models.pdf
Example
>>> from cdt.causality.pairwise import ANM
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> data, labels = load_dataset('tuebingen')
>>> obj = ANM()
>>>
>>> # This example uses the predict() method
>>> output = obj.predict(data)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset('sachs')
>>> output = obj.orient_graph(data, nx.DiGraph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
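The following standalone sketch illustrates the ANM principle on synthetic data rather than the internals of this class: a regressor is fitted in both directions, and the direction whose residuals look more independent of the input is preferred. The squared-correlation dependence measure is a crude stand-in for the HSIC test; the data, regressor settings and measure are all illustrative assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 300)
y = np.tanh(x) + 0.1 * rng.normal(size=300)      # y = f(x) + E

def dependence(a, b):
    # crude dependence measure: correlation of centered squared values,
    # standing in for a proper independence test such as HSIC
    a, b = a - a.mean(), b - b.mean()
    return abs(np.corrcoef(a ** 2, b ** 2)[0, 1])

def residual_dependence(cause, effect):
    # alpha adds observation noise so the GP smooths instead of interpolating
    gp = GaussianProcessRegressor(alpha=0.1, normalize_y=True)
    gp.fit(cause.reshape(-1, 1), effect)
    residuals = effect - gp.predict(cause.reshape(-1, 1))
    return dependence(cause, residuals)

# Under the ANM assumption, residuals should be closer to independent
# of the input in the true causal direction.
print("x->y:", residual_dependence(x, y))   # typically the smaller value
print("y->x:", residual_dependence(y, x))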
Bivariate Fit
- class cdt.causality.pairwise.BivariateFit(ffactor=2, maxdev=3, minc=12)[source]
Bivariate Fit model.
Description: The bivariate fit model is based on a best-fit criterion relying on a Gaussian Process regressor. Used as a weak baseline.
Data Type: Continuous
Assumptions: This model is often used to show that correlation \(\neq\) causation. Its performance is very weak, as it simply asserts that the best predictive model corresponds to the causal direction (a sketch of this criterion follows the example below).
Example
>>> from cdt.causality.pairwise import BivariateFit
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> data, labels = load_dataset('tuebingen')
>>> obj = BivariateFit()
>>>
>>> # This example uses the predict() method
>>> output = obj.predict(data)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
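A minimal sketch of the best-fit criterion, assuming a Gaussian Process regressor as in the class description; the synthetic data and GP settings are illustrative, not the class internals.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def fit_error(cause, effect):
    # mean squared error of a GP regression from cause to effect
    gp = GaussianProcessRegressor(alpha=0.1, normalize_y=True)
    gp.fit(cause.reshape(-1, 1), effect)
    return np.mean((effect - gp.predict(cause.reshape(-1, 1))) ** 2)

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 300)
y = x ** 3 + rng.normal(scale=0.5, size=300)

# The criterion simply prefers the direction with the lower fit error,
# which is what makes it a weak baseline.
print("x->y error:", fit_error(x, y))
print("y->x error:", fit_error(y, x))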
CDS
- class cdt.causality.pairwise.CDS(ffactor=2, maxdev=3, minc=12)[source]
Conditional Distribution Similarity Statistic
Description: The Conditional Distribution Similarity Statistic measures the std. of the rescaled values of y (resp. x) after binning in the x (resp. y) direction. The lower the std., the more likely the pair is to be x->y (resp. y->x). It is a single feature of the Jarfo model.
Data Type: Continuous and Discrete
Assumptions: This approach computes a statistical feature of the joint distribution of the data, measuring the variance of the marginals after conditioning on bins (a sketch follows the example below).
Note
Ref : Fonollosa, José AR, “Conditional distribution variability measures for causality detection”, 2016.
Example
>>> from cdt.causality.pairwise import CDS
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> data, labels = load_dataset('tuebingen')
>>> obj = CDS()
>>>
>>> # This example uses the predict() method
>>> output = obj.predict(data)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
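A rough sketch of the statistic: bin one variable, compute the standard deviation of the standardized other variable within each bin, and average. The equal-width bins and the synthetic pair are simplifying assumptions; the cdt implementation differs in its details.

import numpy as np

def cds_score(x, y, bins=16):
    # standardize the target, bin the conditioning variable
    y = (y - y.mean()) / y.std()
    edges = np.linspace(x.min(), x.max(), bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    stds = [y[idx == b].std() for b in range(bins) if (idx == b).sum() > 1]
    return float(np.mean(stds))          # lower -> more plausible as x -> y

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x ** 2 + 0.2 * rng.normal(size=1000)
# x -> y typically yields the smaller score on this pair
print(cds_score(x, y), cds_score(y, x))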
GNN
- class cdt.causality.pairwise.GNN(nh=20, lr=0.01, nruns=6, njobs=None, gpus=None, verbose=None, batch_size=-1, train_epochs=1000, test_epochs=1000, dataloader_workers=0)[source]
Shallow Generative Neural Networks.
Description: Pairwise variant of the CGNN approach. It models both causal directions, x->y and y->x, with a 1-hidden-layer neural network and an MMD loss; the best-fitting direction is taken as the causal one (a sketch of the MMD score follows the example below).
Data Type: Continuous
Assumptions: The class of generative models is not restricted with a hard constraint, but through the hyperparameter nh. This algorithm greatly benefits from bootstrapped runs (nruns >= 12 recommended) and is computationally heavy; GPUs are recommended.
- Parameters
nh (int) – number of hidden units in the neural network
lr (float) – learning rate of the optimizer
nruns (int) – number of runs to execute per batch (before testing for significance with t-test).
njobs (int) – number of runs to execute in parallel (defaults to cdt.SETTINGS.NJOBS)
gpus (int) – number of available GPUs (defaults to cdt.SETTINGS.GPU)
idx (int) – (optional) index of the pair, for printing purposes
verbose (bool) – verbosity (defaults to cdt.SETTINGS.verbose)
batch_size (int) – batch size, defaults to full-batch
train_epochs (int) – Number of epochs used for training
test_epochs (int) – Number of epochs used for evaluation
dataloader_workers (int) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
Note
Ref : Learning Functional Causal Models with Generative Neural Networks Olivier Goudet & Diviyan Kalainathan & Al. (https://arxiv.org/abs/1709.05321)
Example
>>> from cdt.causality.pairwise import GNN
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> data, labels = load_dataset('tuebingen')
>>> obj = GNN()
>>>
>>> # This example uses the predict() method
>>> output = obj.predict(data)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
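A hedged sketch of the MMD score underlying the comparison of the two candidate directions: an RBF-kernel MMD between observed pairs and pairs sampled from a candidate generator. The single fixed bandwidth and the biased estimator are simplifying assumptions; the class may use a different kernel mixture internally.

import numpy as np

def rbf_mmd(a, b, gamma=1.0):
    # biased MMD^2 estimate between samples a (n, d) and b (m, d)
    def gram(u, v):
        sq = ((u[:, None, :] - v[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return gram(a, a).mean() + gram(b, b).mean() - 2 * gram(a, b).mean()

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))
fake = rng.normal(size=(200, 2)) + 0.5     # poorly matched generator output
close = rng.normal(size=(200, 2))          # well-matched generator output
print(rbf_mmd(real, fake), rbf_mmd(real, close))   # first value is larger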
- orient_graph(df_data, graph, printout=None, **kwargs)[source]
Orient an undirected graph using the pairwise method defined by the subclass.
The pairwise method is run on every undirected edge.
- Parameters
df_data (pandas.DataFrame or MetaDataset) – Data (check cdt.utils.io.MetaDataset)
graph (networkx.Graph) – Graph to orient
printout (str) – (optional) Path to a file in which to save temporary results
- Returns
a directed graph, which might contain cycles
- Return type
networkx.DiGraph
Note
This function is an override of the base class, in order to be able to use the torch.utils.data.Dataset classes
Warning
Requirement: the names of the nodes in the graph must correspond to the names of the variables in df_data
IGCI
- class cdt.causality.pairwise.IGCI[source]
IGCI model.
Description: Information Geometric Causal Inference is a pairwise causal discovery model considering the case of minimal noise \(Y=f(X)\), with \(f\) invertible, and leveraging asymmetries to predict causal directions.
Data Type: Continuous
Assumptions: Only the case of invertible functions is considered, as the prediction would otherwise be trivial given minimal noise.
Note
P. Daniušis, D. Janzing, J. Mooij, J. Zscheischler, B. Steudel, K. Zhang, B. Schölkopf: Inferring deterministic causal relations. Proceedings of the 26th Annual Conference on Uncertainty in Artificial Intelligence (UAI-2010). http://event.cwi.nl/uai2010/papers/UAI2010_0121.pdf
Example
>>> from cdt.causality.pairwise import IGCI
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> data, labels = load_dataset('tuebingen')
>>> obj = IGCI()
>>>
>>> # This example uses the predict() method
>>> output = obj.predict(data)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- predict_proba(dataset, ref_measure='gaussian', estimator='entropy', **kwargs)[source]
Evaluate a pair using the IGCI model.
- Parameters
dataset (tuple) – Couple of np.ndarray variables to classify
ref_measure (str) – Scaling method (gaussian (default), integral or None)
estimator (str) – method used to evaluate the pairs (entropy (default) or integral)
- Returns
value of the IGCI model; >0 if a->b, <0 if b->a (a sketch of a slope-based estimator follows below)
- Return type
float
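For concreteness, a sketch of the slope-based variant of the IGCI score under uniform [0, 1] rescaling; this method's defaults (gaussian reference measure, entropy estimator) differ, so this is only an illustration of the asymmetry being exploited.

import numpy as np

def igci_slope(x, y):
    # uniform rescaling of both variables to [0, 1]
    x = (x - x.min()) / (x.max() - x.min())
    y = (y - y.min()) / (y.max() - y.min())
    order = np.argsort(x)
    dx, dy = np.diff(x[order]), np.diff(y[order])
    keep = (dx > 0) & (dy != 0)
    # mean log-slope of the (sorted) empirical map from x to y
    return float(np.mean(np.log(np.abs(dy[keep] / dx[keep]))))

def igci_direction(x, y):
    # a negative score difference favours x -> y
    return "x->y" if igci_slope(x, y) - igci_slope(y, x) < 0 else "y->x"

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 1000)
y = np.exp(3 * x)            # invertible, (near) noiseless mechanism
print(igci_direction(x, y))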
Jarfo
- class cdt.causality.pairwise.Jarfo[source]
Jarfo model, 2nd in the Cause-Effect Pairs challenge, 1st in the Fast Causation challenge.
Description: The Jarfo model is an ensemble method for causal discovery: it builds a large set of causally relevant features (such as the ANM score) and trains a gradient boosting classifier on top.
Data Type: Continuous, Categorical, Mixed
Assumptions: This method needs a substantial amount of labelled causal pairs to train itself. Its final performance depends on the training set used.
Note
Ref : Fonollosa, José AR, “Conditional distribution variability measures for causality detection”, 2016.
Example
>>> from cdt.causality.pairwise import Jarfo
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> from sklearn.model_selection import train_test_split
>>> data, labels = load_dataset('tuebingen')
>>> X_tr, X_te, y_tr, y_te = train_test_split(data, labels, train_size=.5)
>>>
>>> obj = Jarfo()
>>> obj.fit(X_tr, y_tr)
>>> # This example uses the predict() method
>>> output = obj.predict(X_te)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- orient_graph(df_data, graph, printout=None, **kwargs)[source]
Orient an undirected graph using Jarfo, function modified for optimization.
- Parameters
df_data (pandas.DataFrame) – Data
graph (networkx.Graph) – Graph to orient
nruns (int) – number of times to rerun for each pair (bootstrap)
printout (str) – (optional) Path to a file in which to save temporary results
- Returns
a directed graph, which might contain cycles
- Return type
networkx.DiGraph
- predict_dataset(df)[source]
Runs Jarfo independently on all pairs.
- Parameters
df (pandas.DataFrame) – a CEPC-format DataFrame.
- Returns
a DataFrame with the predictions.
- Return type
pandas.DataFrame
- predict_proba(dataset, idx=0, **kwargs)[source]
Use Jarfo to predict the causal direction of a pair of vars.
- Parameters
dataset (tuple) – Couple of np.ndarray variables to classify
idx (int) – (optional) index number for printing purposes
- Returns
Causation score (Value: 1 if a->b and -1 if b->a)
- Return type
float
NCC
- class cdt.causality.pairwise.NCC[source]
Neural Causation Coefficient.
Description: The Neural Causation Coefficient (NCC) is an approach relying only on neural networks to build causally relevant embeddings of distributions during training, and classifying the pairs using the last layers of the network.
Data Type: Continuous, Categorical, Mixed
Assumptions: This method needs a substantial amount of labelled causal pairs to train itself. Its final performance depends on the training set used.
Example
>>> from cdt.causality.pairwise import NCC
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> from sklearn.model_selection import train_test_split
>>> data, labels = load_dataset('tuebingen')
>>> X_tr, X_te, y_tr, y_te = train_test_split(data, labels, train_size=.5)
>>>
>>> obj = NCC()
>>> obj.fit(X_tr, y_tr)
>>> # This example uses the predict() method
>>> output = obj.predict(X_te)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- fit(x_tr, y_tr, epochs=50, batch_size=32, learning_rate=0.01, verbose=None, device=None)[source]
Fit the NCC model.
- Parameters
x_tr (pd.DataFrame) – CEPC format dataframe containing the pairs
y_tr (pd.DataFrame or np.ndarray) – labels associated to the pairs
epochs (int) – number of training epochs
batch_size (int) – batch size used during training
learning_rate (float) – learning rate of the Adam optimizer
verbose (bool) – verbosity (defaults to cdt.SETTINGS.verbose)
device (str) – cuda or cpu device (defaults to cdt.SETTINGS.default_device)
- predict_dataset(df, device=None, verbose=None)[source]
Predict the causation coefficients on a whole CEPC-format dataset of pairs.
- Parameters
df (pd.DataFrame) – CEPC format dataframe containing the pairs
device (str) – cuda or cpu device (defaults to cdt.SETTINGS.default_device)
verbose (bool) – verbosity (defaults to cdt.SETTINGS.verbose)
- Returns
dataframe containing the predicted causation coefficients
- Return type
pandas.DataFrame
- predict_proba(dataset, device=None, idx=0)[source]
Infer causal directions using the trained NCC pairwise model.
- Parameters
dataset (tuple) – Couple of np.ndarray variables to classify
device (str) – Device to run the algorithm on (defaults to cdt.SETTINGS.default_device)
idx (int) – (optional) index number for printing purposes
- Returns
Causation score (Value: 1 if a->b and -1 if b->a)
- Return type
float
RCC
- class cdt.causality.pairwise.RCC(rand_coeff=333, nb_estimators=500, nb_min_leaves=20, max_depth=None, s=10, njobs=None, verbose=None)[source]
Randomized Causation Coefficient model. 2nd approach in the Fast Causation challenge.
Description: The Randomized Causation Coefficient (RCC) relies on the projection of the empirical distributions into an RKHS using random cosine embeddings, then classifies the pairs using a random forest based on those features (a sketch of the embedding follows the example below).
Data Type: Continuous, Categorical, Mixed
Assumptions: This method needs a substantial amount of labelled causal pairs to train itself. Its final performance depends on the training set used.
- Parameters
rand_coeff (int) – number of randomized coefficients
nb_estimators (int) – number of estimators
nb_min_leaves (int) – number of min samples leaves of the estimator
max_depth (int) – (optional) max depth of the model
s (float) – scaling
njobs (int) – number of jobs to be run in parallel (defaults to cdt.SETTINGS.NJOBS)
verbose (bool) – verbosity (defaults to cdt.SETTINGS.verbose)
Note
Ref : Lopez-Paz, David and Muandet, Krikamol and Schölkopf, Bernhard and Tolstikhin, Ilya O, “Towards a Learning Theory of Cause-Effect Inference”, ICML 2015.
Example
>>> from cdt.causality.pairwise import RCC
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> from sklearn.model_selection import train_test_split
>>> data, labels = load_dataset('tuebingen')
>>> X_tr, X_te, y_tr, y_te = train_test_split(data, labels, train_size=.5)
>>>
>>> obj = RCC()
>>> obj.fit(X_tr, y_tr)
>>> # This example uses the predict() method
>>> output = obj.predict(X_te)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset('sachs')
>>> output = obj.orient_graph(data, nx.DiGraph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- featurize_row(x, y)[source]
Projects the causal pair to the RKHS using the sampled kernel approximation.
- Parameters
x (np.ndarray) – Variable 1
y (np.ndarray) – Variable 2
- Returns
the empirical distributions projected into a single fixed-size vector (see the sketch below).
- Return type
np.ndarray
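A hedged sketch of the random cosine embedding: averaging cos(w·z + b) over a sample approximates a kernel mean map of its distribution, giving the fixed-size vector fed to the random forest. The number of features, the scale, and embedding only the joint (the actual featurize_row may also embed the marginals and concatenate) are illustrative assumptions.

import numpy as np

def embed(sample, w, b):
    # empirical kernel mean map via random cosine features:
    # mean over the sample of cos(w @ z + b), one value per random feature
    return np.cos(sample @ w.T + b).mean(axis=0)

rng = np.random.default_rng(0)
k, s = 100, 10.0                        # number of random features, scale
w = rng.normal(scale=1.0 / s, size=(k, 2))
b = rng.uniform(0, 2 * np.pi, size=k)

x = rng.normal(size=500)
y = x + 0.3 * rng.normal(size=500)
pair = np.column_stack([x, y])          # joint empirical distribution
features = embed(pair, w, b)            # fixed-size vector, shape (k,)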
RECI
- class cdt.causality.pairwise.RECI(degree=3)[source]
RECI model.
Description: Regression Error based Causal Inference (RECI) relies on a best-fit MSE criterion with a monomial regressor and [0, 1] rescaling to infer the causal direction (a sketch follows the example below).
Data Type: Continuous (depends on the regressor used)
Assumptions: No independence tests are used; the assumptions on the model depend on the regressor used by RECI.
- Parameters
degree (int) – Degree of the polynomial regression.
Note
Bloebaum, P., Janzing, D., Washio, T., Shimizu, S., & Schoelkopf, B. (2018, March). Cause-Effect Inference by Comparing Regression Errors. In International Conference on Artificial Intelligence and Statistics (pp. 900-909).
Example
>>> from cdt.causality.pairwise import RECI
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.data import load_dataset
>>> data, labels = load_dataset('tuebingen')
>>> obj = RECI()
>>>
>>> # This example uses the predict() method
>>> output = obj.predict(data)
>>>
>>> # This example uses the orient_graph() method. The dataset used
>>> # can be loaded using the cdt.data module
>>> data, graph = load_dataset("sachs")
>>> output = obj.orient_graph(data, nx.Graph(graph))
>>>
>>> # To view the directed graph run the following command
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
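A minimal sketch of the criterion, assuming plain polynomial least squares after [0, 1] rescaling (degree 3 mirrors the class default); under the RECI assumptions, the direction with the lower regression error is inferred as causal.

import numpy as np

def reci_error(cause, effect, degree=3):
    # rescale both variables to [0, 1], then fit a polynomial regressor
    cause = (cause - cause.min()) / (cause.max() - cause.min())
    effect = (effect - effect.min()) / (effect.max() - effect.min())
    coeffs = np.polyfit(cause, effect, degree)
    return float(np.mean((effect - np.polyval(coeffs, cause)) ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(size=500)
y = np.exp(x) + 0.05 * rng.normal(size=500)
# the smaller error indicates the inferred causal direction
print("x->y:", reci_error(x, y), "y->x:", reci_error(y, x))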
cdt.causality.graph
Find causal relationships and output a directed graph.
- class cdt.causality.graph.model.GraphModel[source]
Base class for all graph causal inference models.
Usage for undirected/directed graphs and raw data. All models for causal discovery from observational data are based on this class. Its main feature is the predict function, which executes a function according to the given arguments.
- create_graph_from_data(data, **kwargs)[source]
Infer a directed graph out of data.
Note
Not implemented: will be implemented by the model classes.
- orient_directed_graph(data, dag, **kwargs)[source]
Re/Orient an undirected graph.
Note
Not implemented: will be implemented by the model classes.
- orient_undirected_graph(data, umg, **kwargs)[source]
Orient an undirected graph.
Note
Not implemented: will be implemented by the model classes.
- predict(df_data, graph=None, **kwargs)[source]
Orient a graph using the method defined by the arguments.
Depending on the type of graph, this function dispatches to different functions, as follows (a usage sketch follows this entry):
- If graph is a networkx.DiGraph, then self.orient_directed_graph is executed.
- If graph is a networkx.Graph, then self.orient_undirected_graph is executed.
- If graph is None, then self.create_graph_from_data is executed.
- Parameters
df_data (pandas.DataFrame) – DataFrame containing the observational data.
graph (networkx.DiGraph or networkx.Graph or None) – Prior knowledge on the causal graph.
Warning
Requirement: the names of the nodes in the graph must correspond to the names of the variables in df_data
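A usage sketch of this dispatch, with PC standing in for a concrete subclass (PC requires the R backend; the calls mirror the examples further down this page):

import networkx as nx
from cdt.causality.graph import PC
from cdt.data import load_dataset

data, graph = load_dataset("sachs")     # graph is a networkx.DiGraph
model = PC()
out = model.predict(data)                    # None -> create_graph_from_data
out = model.predict(data, nx.Graph(graph))   # Graph -> orient_undirected_graph
out = model.predict(data, graph)             # DiGraph -> orient_directed_graph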
bnlearn-based models
- class cdt.causality.graph.bnlearn.BNlearnAlgorithm(score='NULL', alpha=0.05, beta='NULL', optim=False, verbose=None, algorithm=None)[source]
BNlearn algorithm. All the models imported from bnlearn build on this base class and share the same attributes/interface.
- Parameters
score (str) – the label of the conditional independence test to be used in the algorithm. If none is specified, the default test statistic is the mutual information for categorical variables, the Jonckheere-Terpstra test for ordered factors and the linear correlation for continuous variables. See below for available tests.
alpha (float) – a numeric value, the target nominal type I error rate.
beta (int) – a positive integer, the number of permutations considered for each permutation test. It will be ignored with a warning if the conditional independence test specified by the score argument is not a permutation test.
optim (bool) – See bnlearn-package for details.
verbose (bool) – Sets the verbosity. Defaults to SETTINGS.verbose
- Available tests:
- discrete case (categorical variables)
- – mutual information: an information-theoretic distance measure.
It’s proportional to the log-likelihood ratio (they differ by a 2n factor) and is related to the deviance of the tested models. The asymptotic χ2 test (mi and mi-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-mi), the sequential Monte Carlo permutation test (smc-mi), and the semiparametric test (sp-mi) are implemented.
- – shrinkage estimator for the mutual information (mi-sh)
An improved asymptotic χ2 test based on the James-Stein estimator for the mutual information.
- – Pearson’s X2: the classical Pearson’s X2 test for contingency tables.
The asymptotic χ2 test (x2 and x2-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-x2), the sequential Monte Carlo permutation test (smc-x2) and the semiparametric test (sp-x2) are implemented.
- discrete case (ordered factors)
- – Jonckheere-Terpstra: a trend test for ordinal variables.
The asymptotic normal test (jt), the Monte Carlo permutation test (mc-jt) and the sequential Monte Carlo permutation test (smc-jt) are implemented.
- continuous case (normal variables)
- – linear correlation: Pearson’s linear correlation.
The exact Student’s t test (cor), the Monte Carlo permutation test (mc-cor) and the sequential Monte Carlo permutation test (smc-cor) are implemented.
- – Fisher’s Z: a transformation of the linear correlation with asymptotic normal distribution.
Used by commercial software (such as TETRAD II) for the PC algorithm (an R implementation is present in the pcalg package on CRAN). The asymptotic normal test (zf), the Monte Carlo permutation test (mc-zf) and the sequential Monte Carlo permutation test (smc-zf) are implemented.
- – mutual information: an information-theoretic distance measure.
Again it is proportional to the log-likelihood ratio (they differ by a 2n factor). The asymptotic χ2 test (mi-g), the Monte Carlo permutation test (mc-mi-g) and the sequential Monte Carlo permutation test (smc-mi-g) are implemented.
- – shrinkage estimator for the mutual information (mi-g-sh):
an improved asymptotic χ2 test based on the James-Stein estimator for the mutual information.
- hybrid case (mixed discrete and normal variables)
- – mutual information: an information-theoretic distance measure.
Again it is proportional to the log-likelihood ratio (they differ by a 2n factor). Only the asymptotic χ2 test (mi-cg) is implemented.
- create_graph_from_data(data)[source]
Run the algorithm on data.
- Parameters
data (pandas.DataFrame) – DataFrame containing the data
- Returns
Solution given by the algorithm.
- Return type
networkx.DiGraph
- orient_directed_graph(data, graph)[source]
Run the algorithm on a directed_graph.
- Parameters
data (pandas.DataFrame) – DataFrame containing the data
graph (networkx.DiGraph) – Skeleton of the graph to orient
- Returns
Solution on the given skeleton.
- Return type
networkx.DiGraph
Warning
The algorithm is run on the skeleton of the given graph.
GS
- class cdt.causality.graph.bnlearn.GS(score='NULL', alpha=0.05, beta='NULL', optim=False, verbose=None)[source]
Grow-Shrink algorithm.
Description: The Grow-Shrink algorithm is a constraint-based algorithm to recover Bayesian networks. It consists of two phases: a growing phase, in which nodes are added to the Markov blanket based on conditional independence, and a shrinking phase, in which the most irrelevant nodes are removed.
Required R packages: bnlearn
Data Type: Depends on the test used. See the list of available tests in the BNlearnAlgorithm documentation above.
Assumptions: GS outputs a CPDAG, with additional assumptions depending on the conditional test used.
Note
Margaritis D (2003). Learning Bayesian Network Model Structure from Data. Ph.D. thesis, School of Computer Science, Carnegie-Mellon University, Pittsburgh, PA. Available as Technical Report CMU-CS-03-153.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import GS
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = GS()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
IAMB
- class cdt.causality.graph.bnlearn.IAMB(score='NULL', alpha=0.05, beta='NULL', optim=False, verbose=None)[source]
IAMB algorithm.
Description: IAMB (Incremental Association Markov Blanket) is a Bayesian constraint-based algorithm that recovers Markov blankets using a forward selection phase followed by a modified backward selection process.
Required R packages: bnlearn
Data Type: Depends on the test used. See the list of available tests in the BNlearnAlgorithm documentation above.
Assumptions: IAMB outputs Markov blankets of nodes, with additional assumptions depending on the conditional test used.
Note
Tsamardinos I, Aliferis CF, Statnikov A (2003). “Algorithms for Large Scale Markov Blanket Discovery”. In “Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference”, pp. 376-381. AAAI Press.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import IAMB
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = IAMB()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
Fast_IAMB
- class cdt.causality.graph.bnlearn.Fast_IAMB(score='NULL', alpha=0.05, beta='NULL', optim=False, verbose=None)[source]
Fast IAMB algorithm.
Description: Similar to IAMB, Fast-IAMB adds speculation to improve computational performance without affecting the accuracy of Markov blanket recovery.
Required R packages: bnlearn
Data Type: Depends on the test used. See the list of available tests in the BNlearnAlgorithm documentation above.
Assumptions: Fast-IAMB outputs Markov blankets of nodes, with additional assumptions depending on the conditional test used.
Note
Yaramakala S, Margaritis D (2005). “Speculative Markov Blanket Discovery for Optimal Feature Selection”. In “ICDM ’05: Proceedings of the Fifth IEEE International Conference on Data Mining”, pp. 809-812. IEEE Computer Society.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import Fast_IAMB
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = Fast_IAMB()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
Inter_IAMB
- class cdt.causality.graph.bnlearn.Inter_IAMB(score='NULL', alpha=0.05, beta='NULL', optim=False, verbose=None)[source]
Interleaved IAMB algorithm.
Description: Similar to IAMB, Interleaved-IAMB has a progressive forward selection minimizing false positives.
Required R packages: bnlearn
Data Type: Depends on the test used. See the list of available tests in the BNlearnAlgorithm documentation above.
Assumptions: Inter-IAMB outputs Markov blankets of nodes, with additional assumptions depending on the conditional test used.
Note
Yaramakala S, Margaritis D (2005). “Speculative Markov Blanket Discovery for Optimal Feature Selection”. In “ICDM ’05: Proceedings of the Fifth IEEE International Conference on Data Mining”, pp. 809-812. IEEE Computer Society.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import Inter_IAMB
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = Inter_IAMB()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
MMPC
- class cdt.causality.graph.bnlearn.MMPC(score='NULL', alpha=0.05, beta='NULL', optim=False, verbose=None)[source]
Max-Min Parents-Children algorithm.
Description: Max-Min Parents-Children (MMPC) is a two-phase algorithm with a forward and a backward pass. The forward phase recursively adds the variables with the highest association with the target, conditionally on the already selected variables. The backward pass tests the d-separability of variables conditionally on the set and subsets of the selected variables.
Required R packages: bnlearn
Data Type: Depends on the test used. See the list of available tests in the BNlearnAlgorithm documentation above.
Assumptions: MMPC outputs Markov blankets of nodes, with additional assumptions depending on the conditional test used.
Note
Tsamardinos I, Aliferis CF, Statnikov A (2003). “Time and Sample Efficient Discovery of Markov Blankets and Direct Causal Relations”. In “KDD ’03: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining”, pp. 673-678. ACM.
Tsamardinos I, Brown LE, Aliferis CF (2006). “The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm”. Machine Learning, 65(1), 31-78.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import MMPC
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = MMPC()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
CAM
- class cdt.causality.graph.CAM(score='nonlinear', cutoff=0.001, variablesel=True, selmethod='gamboost', pruning=False, prunmethod='gam', njobs=None, verbose=None)[source]
CAM algorithm [R model].
Description: Causal Additive Models, a causal discovery algorithm relying on fitting Gaussian Processes on the data, assuming additive noise and additive contributions of the variables.
Required R packages: CAM
Data Type: Continuous
Assumptions: The data follows a generalized additive noise model: each variable \(X_i\) in the graph \(\mathcal{G}\) is generated following the model \(X_i = \sum_{X_j \in \text{Pa}(X_i)} f_j(X_j) + \epsilon_i\), with \(\text{Pa}(X_i)\) the parents of \(X_i\) in \(\mathcal{G}\) and \(\epsilon_i\) mutually independent noise variables accounting for unobserved variables.
- Parameters
score (str) – Score used to fit the gaussian processes.
cutoff (float) – threshold value for variable selection.
variablesel (bool) – Perform a variable selection step.
selmethod (str) – Method used for variable selection.
pruning (bool) – Perform an initial pruning step.
prunmethod (str) – Method used for pruning.
njobs (int) – Number of jobs to run in parallel.
verbose (bool) – Sets the verbosity of the output.
- Available scores:
nonlinear: ‘SEMGAM’
linear: ‘SEMLIN’
- Available variable selection methods:
‘gamboost’: ‘selGamBoost’
‘gam’: ‘selGam’
‘lasso’: ‘selLasso’
‘linear’: ‘selLm’
‘linearboost’: ‘selLmBoost’
- Default Parameters:
FILE: ‘/tmp/cdt_CAM/data.csv’
SCORE: ‘SEMGAM’
VARSEL: ‘TRUE’
SELMETHOD: ‘selGamBoost’
PRUNING: ‘TRUE’
PRUNMETHOD: ‘selGam’
NJOBS: str(SETTINGS.NJOBS)
CUTOFF: str(0.001)
VERBOSE: ‘FALSE’
OUTPUT: ‘/tmp/cdt_CAM/result.csv’
Note
Ref: Bühlmann, P., Peters, J., & Ernest, J. (2014). CAM: Causal additive models, high-dimensional order search and penalized regression. The Annals of Statistics, 42(6), 2526-2556.
Warning
This implementation of CAM does not support starting with a graph. The adaptation will be made at a later date.
Example
>>> import networkx as nx
>>> from cdt.causality.graph import CAM
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = CAM()
>>> output = obj.predict(data)
CCDr
- class cdt.causality.graph.CCDr(verbose=None)[source]
CCDr algorithm [R model].
Description: Concave penalized Coordinate Descent with reparametrization (CCDr), a structure learning algorithm as described in Aragam and Zhou (2015). This is a fast, score-based method for learning Bayesian networks that uses sparse regularization and block-cyclic coordinate descent.
Required R packages: sparsebn
Data Type: Continuous
Assumptions: This model does not restrict or prune the search space in any way, does not assume faithfulness, does not require a known variable ordering, works on observational data (i.e. without experimental interventions), works effectively in high dimensions, and is capable of handling graphs with several thousand variables. The output of this model is a DAG.
Imported from the ‘sparsebn’ package.
Warning
This implementation of CCDr does not support starting with a graph.
Note
ref: Aragam, B., & Zhou, Q. (2015). Concave penalized estimation of sparse Gaussian Bayesian networks. Journal of Machine Learning Research, 16, 2273-2328.
Example
>>> import networkx as nx
>>> from cdt.causality.graph import CCDr
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = CCDr()
>>> output = obj.predict(data)
CGNN
- class cdt.causality.graph.CGNN(nh=20, nruns=16, njobs=None, gpus=None, batch_size=-1, lr=0.01, train_epochs=1000, test_epochs=1000, verbose=None, dataloader_workers=0)[source]
Causal Generative Neural Networks.
Description: Causal Generative Neural Networks. A score-based method that evaluates candidate graphs by generating data following the topological order of the graph using neural networks, and using MMD for evaluation.
Data Type: Continuous
Assumptions: The class of generative models is not restricted with a hard constraint, but through the hyperparameter nh. This algorithm greatly benefits from bootstrapped runs (nruns >= 12 recommended) and is computationally heavy; GPUs are recommended.
- Parameters
nh (int) – Number of hidden units in each generative neural network.
nruns (int) – Number of times to run CGNN to have a stable evaluation.
njobs (int) – Number of jobs to run in parallel. Defaults to cdt.SETTINGS.NJOBS.
gpus (int) – Number of available GPUs (initialized with cdt.SETTINGS.GPU)
batch_size (int) – batch size, defaults to full-batch
lr (float) – Learning rate for the generative neural networks.
train_epochs (int) – Number of epochs used to train the network.
test_epochs (int) – Number of epochs during which the results are harvested. The network still trains at this stage.
verbose (bool) – Sets the verbosity of the execution. Defaults to cdt.SETTINGS.verbose.
dataloader_workers (int) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
Note
Ref : Learning Functional Causal Models with Generative Neural Networks Olivier Goudet & Diviyan Kalainathan & Al. (https://arxiv.org/abs/1709.05321)
Note
The input data can be of type torch.utils.data.Dataset, or it defaults to cdt.utils.io.MetaDataset. This class is overridable to write custom data loading functions, useful for very large datasets.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import CGNN
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = CGNN()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- create_graph_from_data(data)[source]
Use CGNN to create a graph from scratch. All the possible structures are tested, which leads to a super-exponential complexity. It is preferable to start from a graph skeleton for large graphs.
- Parameters
data (pandas.DataFrame or torch.utils.data.Dataset) – Observational data on which causal discovery has to be performed.
- Returns
Solution given by CGNN.
- Return type
networkx.DiGraph
- orient_directed_graph(data, dag, alg='HC')[source]
Modify and improve a directed acyclic graph solution using CGNN.
- Parameters
data (pandas.DataFrame or torch.utils.data.Dataset) – Observational data on which causal discovery has to be performed.
dag (nx.DiGraph) – Graph that provides the initial solution, on which the CGNN algorithm will be applied.
alg (str) – Exploration heuristic to use, only “HC” is supported for now.
- Returns
Solution given by CGNN.
- Return type
networkx.DiGraph
- orient_undirected_graph(data, umg, alg='HC')[source]
Orient the undirected graph using GNN and apply CGNN to improve the graph.
- Parameters
data (pandas.DataFrame) – Observational data on which causal discovery has to be performed.
umg (nx.Graph) – Graph that provides the skeleton, on which the GNN then the CGNN algorithm will be applied.
alg (str) – Exploration heuristic to use, only “HC” is supported for now.
- Returns
Solution given by CGNN.
- Return type
networkx.DiGraph
Note
GNN (cdt.causality.pairwise.GNN) is first used to orient the undirected graph and output a DAG before applying CGNN.
GES
- class cdt.causality.graph.GES(score='obs', verbose=None)[source]
GES algorithm [R model].
Description: Greedy Equivalence Search algorithm. A score-based Bayesian algorithm that heuristically searches for the graph minimizing a likelihood score on the data.
Required R packages: pcalg
Data Type: Continuous (score='obs') or Categorical (score='int')
Assumptions: The output is a Partially Directed Acyclic Graph (PDAG) (a Markov equivalence class). The available scores assume linearity of the mechanisms and gaussianity of the data.
- Parameters
score (str) – Sets the score used by GES.
verbose (bool) – Defaults to cdt.SETTINGS.verbose.
- Available scores:
int: GaussL0penIntScore
obs: GaussL0penObsScore
Note
Ref: D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
P. Nandy, A. Hauser and M. Maathuis (2015). Understanding consistency in hybrid causal structure learning. arXiv preprint 1507.02608
P. Spirtes, C.N. Glymour, and R. Scheines (2000). Causation, Prediction, and Search, MIT Press, Cambridge (MA)
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import GES
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = GES()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- create_graph_from_data(data)[source]
Run the GES algorithm.
- Parameters
data (pandas.DataFrame) – DataFrame containing the data
- Returns
Solution given by the GES algorithm.
- Return type
networkx.DiGraph
GIES
- class cdt.causality.graph.GIES(score='obs', verbose=False)[source]
GIES algorithm [R model].
Description: Greedy Interventional Equivalence Search algorithm. A score-based Bayesian algorithm that heuristically searches for the graph minimizing a likelihood score on the data. The main difference with GES is that it accepts interventional data for its inference.
Required R packages: pcalg
Data Type: Continuous (score='obs') or Categorical (score='int')
Assumptions: The output is a Partially Directed Acyclic Graph (PDAG) (a Markov equivalence class). The available scores assume linearity of the mechanisms and gaussianity of the data.
- Parameters
score (str) – Sets the score used by GIES.
verbose (bool) – Defaults to cdt.SETTINGS.verbose.
- Available scores:
int: GaussL0penIntScore
obs: GaussL0penObsScore
Note
Ref: D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507–554.
A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409–2464.
P. Nandy, A. Hauser and M. Maathuis (2015). Understanding consistency in hybrid causal structure learning. arXiv preprint 1507.02608
P. Spirtes, C.N. Glymour, and R. Scheines (2000). Causation, Prediction, and Search, MIT Press, Cambridge (MA)
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import GIES
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = GIES()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- create_graph_from_data(data)[source]
Run the GIES algorithm.
- Parameters
data (pandas.DataFrame) – DataFrame containing the data
- Returns
Solution given by the GIES algorithm.
- Return type
networkx.DiGraph
LiNGAM
- class cdt.causality.graph.LiNGAM(verbose=False)[source]
LiNGAM algorithm [R model].
Description: Linear Non-Gaussian Acyclic model. LiNGAM handles linear structural equation models, where each variable is modeled as \(X_j = \sum_k \alpha_k P_a^{k}(X_j) + E_j, j \in [1,d]\), with \(P_a^{k}(X_j)\) the \(k\)-th parent of \(X_j\) and \(\alpha_k\) a real scalar.
Required R packages: pcalg
Data Type: Continuous
Assumptions: The underlying causal model is assumed to be composed of linear mechanisms with non-Gaussian data. Under those assumptions, the causal structure is shown to be fully identifiable (even inside the Markov equivalence class); a data-generation sketch follows the example below.
- Parameters
verbose (bool) – Sets the verbosity of the algorithm. Defaults to cdt.SETTINGS.verbose
Note
Ref: S. Shimizu, P.O. Hoyer, A. Hyvärinen, A. Kerminen (2006) A Linear Non-Gaussian Acyclic Model for Causal Discovery; Journal of Machine Learning Research 7, 2003–2030.
Warning
This implementation of LiNGAM does not support starting with a graph.
Example
>>> import networkx as nx
>>> from cdt.causality.graph import LiNGAM
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = LiNGAM()
>>> output = obj.predict(data)
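A hedged end-to-end sketch on data generated from the linear non-Gaussian SEM above: uniform noise provides the non-gaussianity that makes the direction identifiable. Running it requires the pcalg R backend, and the coefficients are illustrative.

import numpy as np
import pandas as pd
from cdt.causality.graph import LiNGAM

rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 1000)               # exogenous, non-Gaussian noise
x2 = 0.8 * x1 + rng.uniform(-1, 1, 1000)    # x2 = alpha * x1 + E2
data = pd.DataFrame({"x1": x1, "x2": x2})

output = LiNGAM().predict(data)             # expected to recover x1 -> x2
print(list(output.edges()))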
PC
- class cdt.causality.graph.PC(CItest='gaussian', method_indep='corr', alpha=0.01, njobs=None, verbose=None)[source]
PC algorithm [R model].
Description: PC (Peter-Clark) is one of the most famous constraint-based approaches for causal discovery. Based on conditional independence tests on variables and sets of variables, it has proved to be quite efficient.
Required R packages: pcalg, kpcalg, RCIT (variant, see notes)
Data Type: Continuous and discrete
Assumptions: This approach’s complexity grows rapidly with the number of variables, even for quick tests; consider graphs of fewer than 200 variables. The model assumptions made by this approach mainly depend on the type of test used. Kernel-based tests are also available. The prediction of PC is a CPDAG (identifiability up to the Markov equivalence class).
- Parameters
CItest (str) – Test for conditional independence.
alpha (float) – significance level (a number in (0, 1)) for the individual conditional independence tests.
njobs (int) – number of processor cores to use for parallel computation. Only available for method = “stable.fast” (set as default).
verbose – if TRUE, detailed output is provided.
- Variables
arguments (dict) – contains all the current parameters used in the PC algorithm execution.
dir_CI_test (dict) – contains all available conditional independence tests.
dir_method_indep (dict) – contains all available heuristics for CI testing.
- Available CI tests:
binary: "data=X, ic.method='dcc'"
discrete: "data=X, ic.method='dcc'"
hsic_gamma: "data=X, ic.method='hsic.gamma'"
hsic_perm: "data=X, ic.method='hsic.perm'"
hsic_clust: "data=X, ic.method='hsic.clust'"
gaussian: "C = cor(X), n = nrow(X)"
rcit: "data=X, ic.method='RCIT::RCIT'"
rcot: "data=X, ic.method='RCIT::RCoT'"
- Default Parameters:
FILE: ‘/tmp/cdt_pc/data.csv’
SKELETON: ‘FALSE’
EDGES: ‘/tmp/cdt_pc/fixededges.csv’
GAPS: ‘/tmp/cdt_pc/fixedgaps.csv’
CITEST: “pcalg::gaussCItest”
METHOD_INDEP: “C = cor(X), n = nrow(X)”
SELMAT: ‘NULL’
DIRECTED: ‘TRUE’
SETOPTIONS: ‘NULL’
ALPHA: ‘0.01’
VERBOSE: ‘FALSE’
OUTPUT: ‘/tmp/cdt_pc/result.csv’
Note
Ref: D.Colombo and M.H. Maathuis (2014). Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.
M. Kalisch, M. Maechler, D. Colombo, M.H. Maathuis and P. Buehlmann (2012). Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software 47(11) 1–26, http://www.jstatsoft.org/v47/i11/
M. Kalisch and P. Buehlmann (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. JMLR 8 613-636.
J. Ramsey, J. Zhang and P. Spirtes (2006). Adjacency-faithfulness and conservative causal inference. In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence. AUAI Press, Arlington, VA.
P. Spirtes, C. Glymour and R. Scheines (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press
Strobl, E. V., Zhang, K., & Visweswaran, S. (2017). Approximate Kernel-based Conditional Independence Tests for Fast Non-Parametric Causal Discovery. arXiv preprint arXiv:1702.03877.
Imported from the pcalg package.
The RCIT package has been adapted to fit the CDT package; please use the variant available at https://github.com/Diviyan-Kalainathan/RCIT
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import PC
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = PC()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- create_graph_from_data(data, **kwargs)[source]
Run the PC algorithm.
- Parameters
data (pandas.DataFrame) – DataFrame containing the data
- Returns
Solution given by PC on the given data.
- Return type
networkx.DiGraph
- orient_directed_graph(data, graph, *args, **kwargs)[source]
Run PC on a directed graph (only the skeleton of the graph is taken into account).
- Parameters
data (pandas.DataFrame) – DataFrame containing the data
graph (networkx.DiGraph) – Skeleton of the graph to orient
- Returns
Solution given by PC on the given skeleton.
- Return type
networkx.DiGraph
Warning
The algorithm is run on the skeleton of the given graph.
SAM
- class cdt.causality.graph.SAM(lr=0.01, dlr=0.001, mixed_data=False, lambda1=10, lambda2=0.001, nh=20, dnh=200, train_epochs=3000, test_epochs=1000, batch_size=-1, losstype='fgan', dagloss=True, dagstart=0.5, dagpenalization=0, dagpenalization_increase=0.01, functional_complexity='l2_norm', hlayers=2, dhlayers=2, sampling_type='sigmoidproba', linear=False, nruns=8, njobs=None, gpus=None, verbose=None)[source]
SAM Algorithm.
Description: Structural Agnostic Model is a causal discovery algorithm for DAG recovery leveraging both distributional asymmetries and conditional independencies. The first version of SAM, without the DAG constraint, is available as SAMv1.
Data Type: Continuous, (Mixed - Experimental)
Assumptions: The class of generative models is not restricted with a hard constraint, but with soft constraints parametrized by the lambda1 and lambda2 parameters, with Gumbel softmax sampling. This algorithm greatly benefits from bootstrapped runs (nruns >= 8 recommended). GPUs are recommended but not compulsory. The output is a DAG, but may need thresholding as the output is averaged over multiple runs (a thresholding sketch follows the predict method below).
- Parameters
lr (float) – Learning rate of the generators
dlr (float) – Learning rate of the discriminator
mixed_data (bool) – Experimental – Enable for mixed-type datasets
lambda1 (float) – L0 penalization coefficient on the causal filters
lambda2 (float) – L2 penalization coefficient on the weights of the neural network
nh (int) – Number of hidden units in the generators’ hidden layers (regularized with lambda2)
dnh (int) – Number of hidden units in the discriminator’s hidden layers
train_epochs (int) – Number of training epochs
test_epochs (int) – Number of test epochs (saving and averaging the causal filters)
batch_size (int) – Size of the batches to be fed to the SAM model. Defaults to full-batch.
losstype (str) – type of the loss to be used (either ‘fgan’ (default), ‘gan’ or ‘mse’)
dagloss (bool) – Activate the DAG with No-TEARS constraint
dagstart (float) – Controls when the DAG constraint is to be introduced in the training (float ranging from 0 to 1, 0 denotes the start of the training and 1 the end)
dagpenalization (float) – Initial value of the DAG constraint coefficient
dagpenalization_increase (float) – Increment added to the DAG constraint coefficient at each epoch
functional_complexity (str) – Type of functional complexity penalization (choose between ‘l2_norm’ and ‘n_hidden_units’)
hlayers (int) – Defines the number of hidden layers in the generators
dhlayers (int) – Defines the number of hidden layers in the discriminator
sampling_type (str) – Type of sampling used in the structural gates of the model (choose between ‘sigmoid’, ‘sigmoid_proba’ and ‘gumble_proba’)
linear (bool) – If true, all generators are set to be linear generators
nruns (int) – Number of runs to be made for causal estimation. Recommended: >= 32 for optimal performance.
njobs (int) – Number of jobs to be run in parallel. Recommended: 1 if no GPU is available, otherwise 2 * number of GPUs.
gpus (int) – Number of available GPUs for the algorithm
verbose (bool) – verbose mode
Note
Ref: Kalainathan, Diviyan & Goudet, Olivier & Guyon, Isabelle & Lopez-Paz, David & Sebag, Michèle. (2018). Structural Agnostic Modeling: Adversarial Learning of Causal Graphs.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import SAM
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = SAM()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- predict(data, graph=None, return_list_results=False)[source]
Execute SAM on a dataset given a skeleton or not.
- Parameters
data (pandas.DataFrame) – Observational data for estimation of causal relationships by SAM
graph (numpy.ndarray) – A priori knowledge about the causal relationships, given as an adjacency matrix; either directed or undirected links can be fed.
- Returns
Graph estimated by SAM, where A[i,j] is the term of the ith variable for the jth generator.
- Return type
networkx.DiGraph
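Since the returned graph averages the causal filters over nruns runs, its edges carry scores rather than hard decisions. Below is a sketch of the thresholding step mentioned above, assuming the scores are stored in the standard networkx 'weight' edge attribute; the cutoff value is an arbitrary illustrative choice.

import networkx as nx

def threshold_graph(g, cutoff=0.5):
    # keep only the edges whose averaged score reaches the cutoff
    out = nx.DiGraph()
    out.add_nodes_from(g.nodes())
    for u, v, w in g.edges(data="weight", default=0.0):
        if w >= cutoff:
            out.add_edge(u, v, weight=w)
    return out

# usage sketch: dag = threshold_graph(obj.predict(data), cutoff=0.5)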
SAMv1
- class cdt.causality.graph.SAMv1(lr=0.1, dlr=0.1, l1=0.1, nh=50, dnh=200, train_epochs=1000, test_epochs=1000, batch_size=-1, nruns=6, njobs=None, gpus=None, verbose=None)[source]
SAM Algorithm. Implementation of the first version of the SAM algorithm, available at https://arxiv.org/abs/1803.04929v1.
Description: Structural Agnostic Model is a fully differentiable causal discovery algorithm leveraging both distributional asymmetries and conditional independencies.
Data Type: Continuous
Assumptions: The class of generative models is not restricted with a hard constraint, but through the hyperparameter nh. This algorithm greatly benefits from bootstrapped runs (nruns >= 8 recommended). GPUs are recommended but not compulsory. The output is not a DAG.
- Parameters
lr (float) – Learning rate of the generators
dlr (float) – Learning rate of the discriminator
l1 (float) – L1 penalization on the causal filters
nh (int) – Number of hidden units in the generators’ hidden layers
dnh (int) – Number of hidden units in the discriminator’s hidden layers
train_epochs (int) – Number of training epochs
test_epochs (int) – Number of test epochs (saving and averaging the causal filters)
batch_size (int) – Size of the batches to be fed to the SAM model.
nruns (int) – Number of runs to be made for causal estimation. Recommended: >=12 for optimal performance.
njobs (int) – Number of jobs to be run in parallel. Recommended: 1 if no GPU is available, otherwise 2 * number of GPUs.
gpus (int) – Number of available GPUs for the algorithm.
verbose (bool) – verbose mode
Note
Ref: Kalainathan, Diviyan & Goudet, Olivier & Guyon, Isabelle & Lopez-Paz, David & Sebag, Michèle. (2018). SAM: Structural Agnostic Model, Causal Discovery and Penalized Adversarial Learning.
Example
>>> import networkx as nx
>>> import matplotlib.pyplot as plt
>>> from cdt.causality.graph import SAMv1
>>> from cdt.data import load_dataset
>>> data, graph = load_dataset("sachs")
>>> obj = SAMv1()
>>> # The predict() method works without a graph, or with a
>>> # directed or undirected graph provided as an input
>>> output = obj.predict(data)    # No graph provided as an argument
>>>
>>> output = obj.predict(data, nx.Graph(graph))    # With an undirected graph
>>>
>>> output = obj.predict(data, graph)    # With a directed graph
>>>
>>> # To view the graph created, run the below commands:
>>> nx.draw_networkx(output, font_size=8)
>>> plt.show()
- predict(data, graph=None, return_list_results=False)[source]
Execute SAM on a dataset given a skeleton or not.
- Parameters
data (pandas.DataFrame) – Observational data for estimation of causal relationships by SAM
graph (numpy.ndarray) – A priori knowledge about the causal relationships, given as an adjacency matrix; either directed or undirected links can be fed.
- Returns
Graph estimated by SAM, where A[i,j] is the term of the ith variable for the jth generator.
- Return type
networkx.DiGraph