The goal of causalXtreme is to provide an interface for causal discovery in linear structural equation models (SEMs) with heavy-tailed noise. For more details, see the paper "Causal discovery in heavy-tailed models" by Gnecco, N., Meinshausen, N., Peters, J., and Engelke, S. [https://arxiv.org/abs/1908.05097].
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("nicolagnecco/causalXtreme")
Let us first generate a SEM with two Student-t variables with 1.5 degrees of freedom (i.e., heavy-tailed).
library(causalXtreme)
## basic example code
set.seed(1)
sem <- simulate_data(n = 500, p = 2, prob_connect = 0.5,
                     distr = "student_t", tail_index = 1.5)
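For intuition about what such a dataset looks like, the following base-R sketch builds a two-variable linear SEM by hand instead of calling simulate_data. The helper name simulate_two_var_sem and the coefficient beta = 0.8 are illustrative assumptions and are not part of causalXtreme; the package may draw coefficients and noise differently.

```r
# Hypothetical helper (not part of causalXtreme): hand-rolled linear SEM
# X1 -> X2 with Student-t noise on both variables.
simulate_two_var_sem <- function(n, beta = 0.8, df = 1.5) {
  eps1 <- rt(n, df = df)   # heavy-tailed noise for X1
  eps2 <- rt(n, df = df)   # heavy-tailed noise for X2
  x1 <- eps1               # X1 has no parents
  x2 <- beta * x1 + eps2   # X2 depends linearly on X1
  cbind(X1 = x1, X2 = x2)
}

set.seed(1)
toy <- simulate_two_var_sem(n = 500)
dim(toy)
```

Lower values of df give heavier tails, which is the regime the package targets.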
Let us investigate the randomly generated directed acyclic graph (DAG) induced by the SEM.
sem$dag
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    0    0
We see that the first variable causes the second variable, since the entry (1, 2) of the matrix sem$dag is equal to 1. We can plot the simulated dataset.
At this point, we can compute the causal tail coefficients between the two variables X1 and X2.
causal_tail_matrix(dat = sem$dataset)
#>           [,1]      [,2]
#> [1,]        NA 0.9523333
#> [2,] 0.4816667        NA
We see that the coefficient Γ12 ≈ 1 (entry (1, 2) of the matrix) and Γ21 < 1 (entry (2, 1) of the matrix). This is evidence for a causal relationship from X1 to X2.
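To make the asymmetry concrete, here is a minimal base-R sketch of a one-sided empirical causal tail coefficient: it averages the empirical distribution function of one variable over the k observations where the other variable is largest. The helper tail_coeff_sketch, the choice k = floor(sqrt(n)), and the coefficient 0.8 are assumptions made for illustration; the estimator inside causal_tail_matrix may differ in details such as the default k or how the tails are handled.

```r
# Hypothetical helper (not part of causalXtreme): one-sided empirical
# causal tail coefficient of v1 on v2.
tail_coeff_sketch <- function(v1, v2, k) {
  n <- length(v1)
  f2 <- rank(v2) / n                                # empirical CDF of v2 at each observation
  top <- order(v1, decreasing = TRUE)[seq_len(k)]   # indices of the k largest values of v1
  mean(f2[top])                                     # average CDF of v2 when v1 is extreme
}

set.seed(1)
n <- 5000
x1 <- rt(n, df = 1.5)               # cause
x2 <- 0.8 * x1 + rt(n, df = 1.5)    # effect (coefficient 0.8 is an assumption)
k <- floor(sqrt(n))                 # number of extremes used (an assumption)

gamma_12 <- tail_coeff_sketch(x1, x2, k)  # cause -> effect direction
gamma_21 <- tail_coeff_sketch(x2, x1, k)  # effect -> cause direction
c(gamma_12 = gamma_12, gamma_21 = gamma_21)
```

In this toy setup the cause-to-effect coefficient should come out larger than the reverse one, because an extreme X1 pushes X2 into its upper tail, whereas an extreme X2 can also be produced by its own noise.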
We can also run the extremal ancestral search (EASE) algorithm, which is based on the causal tail coefficients. The algorithm estimates a causal order of the DAG from the data.
ease(dat = sem$dataset)
#> [1] 1 2
In this case, we see that the estimated causal order is correct, since the cause X1 is placed before its effect X2.
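The check described above can be sketched in base R: an order is causal if, for every edge in the adjacency matrix, the cause appears before the effect. The helper is_causal_order is hypothetical and not part of causalXtreme; the matrix below simply re-enters the DAG shown by sem$dag.

```r
# Hypothetical helper (not part of causalXtreme): TRUE if every edge i -> j
# in the adjacency matrix has i placed before j in the proposed order.
is_causal_order <- function(order_vec, dag) {
  pos <- match(seq_len(nrow(dag)), order_vec)   # position of each node in the order
  edges <- which(dag == 1, arr.ind = TRUE)      # one row per edge: (cause, effect)
  all(pos[edges[, "row"]] < pos[edges[, "col"]])
}

dag <- matrix(c(0, 1,
                0, 0), nrow = 2, byrow = TRUE)  # edge from variable 1 to variable 2

is_causal_order(c(1, 2), dag)   # TRUE: cause listed before effect
is_causal_order(c(2, 1), dag)   # FALSE: effect listed before cause
```

Note that a DAG can admit several valid causal orders, so a check of this kind verifies consistency with the graph rather than equality with one specific ordering.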