JASPAR software tools
JASPAR is supported by a growing number of open-source software tools and APIs implemented in various programming languages, including Perl, Python/Biopython, R/Bioconductor, and Ruby. In addition, the TFBS enrichment tool was introduced in the 2022 release. Each of these resources is described in more detail below.
JASPAR 2020 comes with a Representational State Transfer (REST) application programming interface (API) for accessing the JASPAR database programmatically (Khan A. et al. 2017). The RESTful API can be used from most programming languages and returns data in seven widely used formats: JSON, JSONP, JASPAR, MEME, PFM, TRANSFAC, and YAML. It also provides a browsable interface for bioinformatics tool developers. The API is freely accessible at https://jaspar2020.genereg.net/api/. To read more about the JASPAR RESTful API, visit its documentation page. If you wish to cite the JASPAR RESTful API, please check the FAQ page.
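As a minimal sketch of programmatic access, the Python snippet below builds a request URL for a single profile and decodes the JSON response using only the standard library. The base URL is the one quoted above; the `v1/matrix/<id>/` path, the `format` query parameter, and the helper names `build_matrix_url` and `fetch_matrix` are assumptions made for illustration and should be checked against the API documentation page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL quoted in the text above.
API_ROOT = "https://jaspar2020.genereg.net/api/"


def build_matrix_url(matrix_id, fmt="json"):
    """Return the URL for one matrix profile in the requested format.

    The "v1/matrix/<id>/" path and the "format" query parameter are
    assumptions; verify them against the API documentation.
    """
    query = urlencode({"format": fmt})
    return f"{API_ROOT}v1/matrix/{matrix_id}/?{query}"


def fetch_matrix(matrix_id, timeout=30):
    """Download one profile as JSON and decode it into a dict."""
    url = build_matrix_url(matrix_id, "json")
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (needs network access):
#   profile = fetch_matrix("MA0006.1")
#   print(sorted(profile))  # list the fields returned for this profile
```

Passing another of the seven format names (for example `meme` or `transfac`) to `build_matrix_url` would request the same profile in that flat-file format, in which case the response is plain text rather than JSON.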
pyJASPAR is a Pythonic interface to JASPAR transcription factor motifs. It uses Biopython and SQLite3 to provide a serverless interface to the JASPAR database for querying and accessing TF motif profiles across various JASPAR releases. The releases currently available are: JASPAR2014, JASPAR2016, JASPAR2018, and JASPAR2022. The pyJASPAR package will be updated when future JASPAR releases become available.
pyJASPAR can be easily installed using Bioconda:
conda install -c bioconda pyjaspar
Or via PyPI:
pip install pyjaspar
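Once installed, a profile can be looked up by its matrix ID. The sketch below assumes the `jaspardb` class and its `fetch_motif_by_id` method as described in the pyJASPAR documentation; the helper names `fetch_profile` and `describe` and the example ID are hypothetical and only serve the illustration.

```python
def fetch_profile(motif_id, release="JASPAR2022"):
    """Fetch one profile from the SQLite database bundled with pyJASPAR."""
    # Imported lazily so that describe() below works without pyJASPAR installed.
    from pyjaspar import jaspardb

    jdb = jaspardb(release=release)
    return jdb.fetch_motif_by_id(motif_id)


def describe(motif):
    """Return a one-line, tab-separated summary of a JASPAR motif object."""
    return f"{motif.matrix_id}\t{motif.name}\t{motif.collection}"


# Example calls (need pyJASPAR installed):
#   motif = fetch_profile("MA0095.2")
#   print(describe(motif))
```

Because the database ships with the package, the lookup runs locally and needs no network connection.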
TFBS enrichment analysis
The JASPAR enrichment tool predicts which sets of TFBSs from the JASPAR CORE collection are enriched in a given set of genomic regions. Enrichment computations are performed using the LOLA tool. The tool supports two types of computation:
- Enrichment of TFBSs in a set of genomic regions compared to a given universe of genomic regions.
- Differential TFBS enrichment when comparing one set of genomic regions (set1) to another (set2).
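To make the first type of computation concrete, the sketch below runs a one-sided Fisher's exact test on a 2x2 table of region counts, which is the kind of test LOLA uses to score overlaps. It is a standalone Python illustration, not the enrichment tool's own code; the function name `enrichment_pvalue` and the four count arguments are hypothetical.

```python
from math import comb


def enrichment_pvalue(a, b, c, d):
    """One-sided Fisher's exact test for over-representation.

    a: query regions overlapping a TFBS set
    b: remaining universe regions overlapping the TFBS set
    c: query regions without an overlap
    d: remaining universe regions without an overlap
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    n_total = a + b + c + d   # all regions in the universe
    n_hits = a + b            # regions overlapping the TFBS set
    n_query = a + c           # regions in the query set
    upper = min(n_hits, n_query)
    # Sum the hypergeometric probabilities of tables at least as extreme as observed.
    tail = sum(
        comb(n_hits, k) * comb(n_total - n_hits, n_query - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_query)


# Toy example: every one of 3 query regions overlaps, none of the other 3 do.
p_value = enrichment_pvalue(3, 0, 0, 3)
```

For the differential case, the same table could be filled with counts from set1 and set2 instead of a query set and the rest of the universe; this is an interpretation of the description above, not a statement about how the tool is implemented.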
Bioconductor TFBSTools package
TFBSTools is an R/Bioconductor package for the analysis and manipulation of transcription factor binding sites and their associated transcription factor profile matrices. TFBSTools provides a toolkit for handling TFBS profile matrices, scanning sequences and alignments (including whole genomes), and querying the JASPAR database. The functionality of the package can be easily extended to include advanced statistical analysis, data visualisation and data integration. For more information on the package and how to install it, please visit http://bioconductor.org/packages/TFBSTools/.
To retrieve data from the JASPAR database, we also provide Bioconductor data packages. Currently four JASPAR releases are available:
The Biopython package Bio.motifs has a subpackage dedicated to JASPAR, which is called Bio.motifs.jaspar. It allows the retrieval of profiles from the JASPAR database as well as reading and writing motifs in various flat file formats. To read more about this module, please visit https://biopython.org/docs/1.75/api/Bio.motifs.jaspar.html.
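As a minimal sketch of the flat-file support, the snippet below reads a motif from JASPAR-format text with `Bio.motifs.read` and writes it back out in PFM format with `Bio.motifs.write`. The sample matrix and the helper names `read_jaspar_text` and `to_pfm_text` are illustrative only, and Biopython must be installed for the two helpers to run.

```python
import io

# A small count matrix in JASPAR flat-file format, used as illustrative input.
SAMPLE_JASPAR = """\
>MA0004.1 Arnt
A  [ 4 19  0  0  0  0 ]
C  [16  0 20  0  0  0 ]
G  [ 0  1  0 20  0 20 ]
T  [ 0  0  0  0 20  0 ]
"""


def read_jaspar_text(text):
    """Parse a single motif from JASPAR-format text."""
    from Bio import motifs  # imported lazily; requires Biopython

    return motifs.read(io.StringIO(text), "jaspar")


def to_pfm_text(motif):
    """Serialise a motif as plain PFM text."""
    from Bio import motifs  # imported lazily; requires Biopython

    return motifs.write([motif], "pfm")


# Example calls (need Biopython installed):
#   motif = read_jaspar_text(SAMPLE_JASPAR)
#   print(motif.matrix_id, motif.name, motif.consensus)
#   print(to_pfm_text(motif))
```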
TFBS Perl/TFBSTools Bioconductor module
The Perl TFBS module provides functionality for a large number of tasks, including scanning sequences and alignments for putative TFBSs. It includes a JASPAR interface module, TFBS::DB::JASPAR6, for retrieving binding site profiles from the JASPAR database. This Perl module is currently not under active development. All of its functionality can be found in the TFBSTools Bioconductor package, and users are strongly encouraged to switch to TFBSTools. Full documentation is available on their dedicated websites.
bio-jaspar is a Ruby gem that provides basic functionality for parsing, searching, and comparing JASPAR motifs. To learn more about this tool, please visit https://rubygems.org/gems/bio-jaspar and the repository at https://github.com/wassermanlab/jaspar-bioruby.
Transcription Factor Flexible Models (TFFMs)
TFFMs were introduced in JASPAR2016 (6th release). TFFMs are hidden Markov-based models capturing dinucleotide dependencies in TF-DNA interactions, which have been recurrently shown to occur within TFBSs and are not captured by classical PFMs. The TFFMs need to be initialized with a PFM and trained on ChIP-seq data.