This project is an open-source community project, hosted on GitHub at the following address: https://github.com/FenTechSolutions/CausalDiscoveryToolbox
We abide by the principles of openness, respect, and consideration of others of the Python Software Foundation: https://www.python.org/psf/codeofconduct/
Bugs may occur while using this package. To fix them and improve every user's experience, it is highly recommended to submit a bug report on the GitHub issue tracker: https://github.com/FenTechSolutions/CausalDiscoveryToolbox/issues
When reporting a bug, please mention:
The cdt package version or Docker image tag.
Your Python version.
Your hardware configuration, including whether GPUs are available.
The full traceback of the error, if one is raised.
A small code snippet reproducing the bug, if the description is not explicit.
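To gather the environment details listed above, a small standard-library helper can be used; `bug_report_info` is a hypothetical convenience function for illustration, not part of the package, and the presence of a `cdt.__version__` attribute is an assumption:

```python
import platform

def bug_report_info():
    """Collect environment details to paste into a GitHub issue."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import cdt  # assumes cdt exposes __version__; may differ per release
        info["cdt"] = getattr(cdt, "__version__", "unknown")
    except ImportError:
        info["cdt"] = "not installed"
    return info

print(bug_report_info())
```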
The recommended way to contribute to the Causal Discovery Toolbox is to submit a pull request against the dev branch of https://github.com/FenTechSolutions/CausalDiscoveryToolbox
To submit a pull request, the following are required:
An up-to-date forked repository of the package and a Python 3 installation
Clone your forked version of the code locally and install it in developer mode, in a separate Python environment (e.g. a conda environment):
$ conda create --name cdt_dev python=3.6 numpy scipy scikit-learn
$ source activate cdt_dev
$ git clone git@github.com:YourLogin/CausalDiscoveryToolbox.git
$ cd CausalDiscoveryToolbox
$ git checkout dev
$ python setup.py develop
Here, python refers to your Python 3 installation.
Make your changes to the source code of the package
Test your changes using:
$ cd CausalDiscoveryToolbox
$ pip install pytest
$ pytest
If the tests pass, commit and push your changes:
$ git add .
$ git commit -m "[DEV] Your commit message"
$ git push -u origin dev
The commits must begin with a tag, defining the main purpose of the commit. Examples of tags are:
[TRAVIS] for changes on the continuous integration
[TEST] for testing and coverage
[MREL] is a reserved name for releases and major releases. It triggers package version updates on the continuous integration.
[DEPLOY] is a reserved tag for the continuous integration to upload its changes.
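The tag convention can be checked mechanically; the following sketch is illustrative only (the tag list is assembled from the tags mentioned above and is not an official, exhaustive list):

```python
import re

# Illustrative tag list, assembled from the tags mentioned in this guide.
KNOWN_TAGS = ("TRAVIS", "TEST", "DEV", "MREL", "DEPLOY")

def has_valid_tag(message):
    """Check that a commit message starts with a known bracketed tag."""
    match = re.match(r"\[([A-Z]+)\]\s", message)
    return bool(match) and match.group(1) in KNOWN_TAGS
```

For example, `has_valid_tag("[DEV] Add model wrapper")` holds, while an untagged message does not.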
Please check that your pull request complies with all the rules of the checklist:
Respected the design pattern of the package: used the networkx.DiGraph classes and the cdt.Settings module, inherited from the model classes, and verified the correct import of the new functionalities.
Added documentation for your added functionalities (check the following section)
Added corresponding tests for the added functions/classes in the test suite
Finally, submit your pull request using the GitHub website.
The package should remain as independent of other packages as possible, as it already depends on many libraries. Therefore, any contribution requiring the addition of a new dependency will be closely examined.
Two types of dependencies are possible for now: Python dependencies and R dependencies.
For R dependencies, the Docker base images have to be rebuilt; notifying the core maintainers of the package is therefore necessary for the Docker images to be updated.
The documentation of the package is automatically generated using Sphinx, by parsing the docstrings of functions and classes, as defined in the /docs/*.rst files. To add a new function to the documentation, add the respective mention in the corresponding .rst file. The documentation is automatically built and updated online by the Continuous Integration tool at each push.
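As an illustration of how Sphinx picks up docstrings, a typical autodoc directive in an .rst file looks like the following (the module path shown is a hypothetical example):

```rst
.. automodule:: cdt.causality.graph
   :members:
```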
When writing your docstrings, please use the Google Style format: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
Your docstrings must include:
A presentation of the functionality
A detailed description of the arguments and return values
A scientific source
A short example
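A minimal sketch of a Google Style docstring covering all four points is given below; the function itself is hypothetical and not part of the package, and the reference note is a placeholder:

```python
def normalize(values):
    """Scale a list of numbers to the [0, 1] range.

    Args:
        values (list of float): Numbers to rescale. Must contain at
            least two distinct values.

    Returns:
        list of float: Rescaled values, where the minimum maps to 0
        and the maximum maps to 1.

    .. note::
        Ref: a scientific reference entry would go here.

    Example:
        >>> normalize([2.0, 4.0, 6.0])
        [0.0, 0.5, 1.0]
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```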
The package is thoroughly tested, using codecov for code coverage. Tests are run by a Continuous Integration tool on each push to master/dev and on pull requests, which helps provide users with a tested, working package.
The test scripts are included in the GitHub repository, along with some sample data for the functions to be applied on.
In order to write new test functions, either add a new Python file or complete an already existing one, and add a function whose name begins with test_. This allows pytest to automatically detect the new test function.
New test functions must provide optimal code coverage of the tested functionalities, as well as test imports and result coherence.
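A minimal sketch of such a test function follows; both functions here are toy stand-ins for illustration, not actual cdt functionality:

```python
def count_edges(edges):
    """Toy stand-in for a package function: counts unique directed edges."""
    return len(set(edges))

def test_count_edges():
    # pytest discovers this function automatically because its name
    # starts with "test_"; a real test would import from cdt and check
    # both imports and result coherence.
    assert count_edges([("a", "b"), ("b", "c"), ("a", "b")]) == 2
```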
Continuous integration (travis-ci) is enabled on this project. It allows for:
Testing new code with pytest and uploading the code coverage results to https://codecov.io/gh/FenTechSolutions/CausalDiscoveryToolbox
Bumping the package to a new version and pushing it to GitHub
Building new Docker images and pushing them to https://hub.docker.com/u/fentech
Pushing the new package version to PyPI
Compiling the new documentation and uploading it online
All the tasks described above are defined in the continuous integration configuration of the project.
One of this project's main features is wrapping R libraries. In order to do this in the most efficient way, the R tasks are executed in a separate process from the main Python process, thus freeing the computation from the GIL.
A /tmp/ folder is used as a buffer, and everything is executed with the subprocess library. Check out cdt.utils.R for more detailed information.
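The general mechanism can be sketched as follows. This is only an illustration of the pattern (temporary files as exchange buffers, computation in a subprocess), not cdt's actual implementation; cdt.utils.R handles the real R calls, and here a Python one-liner stands in for an Rscript invocation:

```python
import os
import subprocess
import sys
import tempfile

def run_in_subprocess(payload):
    """Run a computation in a separate process, using temp files as buffers."""
    with tempfile.TemporaryDirectory() as tmpdir:
        in_path = os.path.join(tmpdir, "input.txt")
        out_path = os.path.join(tmpdir, "output.txt")
        # Write the input to a temporary file acting as the exchange buffer.
        with open(in_path, "w") as f:
            f.write(payload)
        # In cdt, this would launch Rscript; a Python one-liner stands in.
        script = (
            "import sys; "
            "data = open(sys.argv[1]).read(); "
            "open(sys.argv[2], 'w').write(data.upper())"
        )
        subprocess.run([sys.executable, "-c", script, in_path, out_path],
                       check=True)
        # Read the result back from the output buffer.
        with open(out_path) as f:
            return f.read()
```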
Many algorithms are computationally heavy but parallelizable, as they include bootstrapped functions, i.e. multiple runs of the same computation.
Therefore, using multiprocessing alleviates the required computation
time. For CPU jobs, we use the joblib library, for its efficiency and ease
of use. However, for GPU jobs, the multiprocessing interface was recoded,
in order to account for the available resources and a memory-leak issue between
joblib and PyTorch.
Check out cdt.utils.parallel for more detailed information.
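The bootstrapping pattern can be sketched as below; this uses the standard library's concurrent.futures as a stand-in for joblib, and both the computation and the parameters are illustrative, not cdt's actual code:

```python
import random
from concurrent.futures import ProcessPoolExecutor

def bootstrap_run(seed):
    """One bootstrapped run of a toy computation, seeded for reproducibility."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1000)) / 1000

def parallel_bootstrap(n_runs=8, n_jobs=4):
    """Run the same computation n_runs times across n_jobs worker processes."""
    with ProcessPoolExecutor(max_workers=n_jobs) as executor:
        return list(executor.map(bootstrap_run, range(n_runs)))
```

With joblib the same map would read roughly `Parallel(n_jobs=n_jobs)(delayed(bootstrap_run)(s) for s in range(n_runs))`.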