SCENIC.Rmd
SCENIC
(Single-Cell Regulatory Network Inference and Clustering) is a
computational method that provides deep insights into the regulatory
networks governing gene expression in single cells. It is highly
recommended to use the Nextflow pipeline to run SCENIC, which can be
found here.
This process requires a loom file as input, which can be generated
directly using the Seu2Loom()
function. Currently, the
SeuratExtend
package does not integrate the
RunScenic
functionality directly (requiring the use of
Nextflow command line), but if there is user demand, this could be
considered for future updates. The workflow results in a file named
“pyscenic_integrated-output.loom,” which includes a list of
transcription factors (TFs) and their regulated genes, as well as a
TF-cell matrix representing the AUCell values. The AUCell score
represents the enrichment score of all genes regulated by a TF,
indicating regulon activity. The ImportPyscenicLoom()
function allows for the import of SCENIC-generated loom files into
Seurat objects, facilitating further analysis and visualization within
the Seurat framework.
As an example, we use a pre-computed SCENIC loom file which can be downloaded as follows:
library(SeuratExtend)
scenic_loom_path <- file.path(tempdir(), "pyscenic_integrated-output.loom")
download.file("https://zenodo.org/records/10944066/files/pbmc3k_small_pyscenic_integrated-output.loom", scenic_loom_path)
# Importing SCENIC Loom Files into Seurat
pbmc <- ImportPyscenicLoom(scenic_loom_path, seu = pbmc)
If you prefer to import SCENIC results without specifying a Seurat object:
# Importing SCENIC results without an existing Seurat object
scenic_output <- ImportPyscenicLoom(scenic_loom_path)
SCENIC results are stored in seu@misc$SCENIC
, which
includes seu@misc$SCENIC$Regulons
, a list where each
element’s name is a TF name and the value is a list of genes it
regulates. Additionally, seu@misc$SCENIC$RegulonsAUC
, a
TF-cell AUCell matrix, is also loaded into the “TF” assay of the Seurat
object, making the manipulation of TF regulon activity as
straightforward as handling gene expression data.
# Viewing the outputs
tf_auc <- pbmc@misc$SCENIC$RegulonsAUC
head(tf_auc, 4:5)
## AHR ARID3A ARNT ARNTL ATF1
## CTATAAGATCGTTT-1 0.01406902 0.03347861 0.000000000 0.1144568 0.016730526
## GTGATTCTGGTTCA-1 0.00000000 0.00000000 0.000000000 0.1730939 0.017191920
## ACGTTGGACCGTAA-1 0.00000000 0.03189382 0.000000000 0.1463286 0.003671087
## GGATACTGCAGCTA-1 0.02483910 0.01347068 0.006867406 0.0945589 0.016670344
tf_gene_list <- pbmc@misc$SCENIC$Regulons
head(tf_gene_list, 5)
## $AHR
## [1] "NFYC-AS1" "MYSM1" "ZZZ3" "FUBP1" "WDR77" "TMEM183A"
## [7] "TGOLN2" "SEC22C" "ATXN7" "RBPJ" "C5orf24" "CREBRF"
## [13] "POLR1C" "YIPF3" "RUNX2" "BCLAF1" "HOXA9" "HOXA10"
## [19] "CREB5" "STAG3" "POP7" "ASAP1" "RMI1" "PSMD5"
## [25] "LRSAM1" "ABI1" "HNRNPH3" "IFT46" "ST3GAL4" "BAZ2A"
## [31] "BRI3BP" "MICU2" "TLE3" "PARP6" "NPTN" "PPCDC"
## [37] "FAM103A1" "CHMP1A" "TMEM107" "PHF12" "TLK2" "NOL11"
## [43] "MBP" "DIDO1" "SH3GL1" "SPINT2" "RBFOX2" "GABPA"
## [49] "CCT8"
##
## $ARID3A
## [1] "CDC7" "IL6R" "LAMC1" "LINC01136" "CDC42EP3" "HNMT"
## [7] "GPBAR1" "OARD1" "GNB2" "ATP6V1F" "YWHAZ" "XPA"
## [13] "ANKRD22" "APLP2" "PSMB5" "LGALS3" "RPH3AL" "EMILIN2"
## [19] "ME2" "MAFB" "GNAS" "ARID3A" "UQCRFS1" "CECR1"
##
## $ARNT
## [1] "EPHB3" "SMIM14" "GPR68" "ANXA2" "MORF4L1" "ZFP3"
##
## $ARNTL
## [1] "SLC4A10" "AP2M1" "EMC2" "PHF20L1" "SLC43A1" "CFL1" "SPG21"
## [8] "CRB3" "RPS28"
##
## $ATF1
## [1] "CAMK2N1" "PAFAH2" "BCAS2" "S100A13"
## [5] "ZNF281" "LIN9" "THUMPD2" "CALM2"
## [9] "ACVR1" "NAB1" "ZDBF2" "SNRK"
## [13] "IFRD2" "DUSP7" "MITF" "FOXP1"
## [17] "DHX36" "KIAA0232" "TEC" "G3BP2"
## [21] "ZNF330" "ADAMTS6" "PPARGC1B" "RREB1"
## [25] "FAM65B" "GNL1" "SYNGAP1" "PAQR8"
## [29] "MOSPD3" "GNB2" "FIS1" "ARF5"
## [33] "RPS6KA3" "CCDC120" "PLS3" "TMEM187"
## [37] "SGK3" "PKIA" "PAG1" "RAD54B"
## [41] "UBAP2" "TESK1" "MRRF" "BAG3"
## [45] "CARS" "PGAP2" "BTBD10" "ITFG2"
## [49] "ESYT1" "C12orf73" "MLXIP" "SPATA7"
## [53] "C15orf41" "TLE3" "KIAA0556" "DNAJA2"
## [57] "CYLD" "PDP2" "CENPN" "RPH3AL"
## [61] "SMG6" "TMEM256-PLSCR3" "ARL4D" "B3GNTL1"
## [65] "GPCPD1" "SPAG4" "MYBL2" "ADNP"
## [69] "SBNO2" "CIRBP" "MVB12A" "PLEKHF1"
## [73] "TMEM147" "SPHK2" "EMC10" "ZNF787"
## [77] "DGCR8" "CRKL" "PDE9A"
Once SCENIC data is integrated into a Seurat object, users can
leverage a variety of visualization tools provided in the
Enhanced Visualization section to explore and interpret
these regulatory networks. Both the extracted tf_auc
matrix
or the Seurat object itself can be used as inputs. Here are some
practical examples:
Since we have imported SCENIC results into the “TF” assay, we can easily access the corresponding AUCell values by prefixing “tf_” to the TF name:
DimPlot2(
pbmc,
features = c("ETS1", "ATF3", "tf_ETS1", "tf_ATF3"),
cols = list("tf_ETS1" = "D", "tf_ATF3" = "D")
)
## Loading required package: viridis
## Loading required package: viridisLite
##
## Attaching package: 'viridis'
## The following object is masked from 'package:scales':
##
## viridis_pal
## Loading required package: cowplot
##
## Attaching package: 'cowplot'
## The following object is masked from 'package:mosaic':
##
## theme_map
If you find manually adding “tf_” to each transcription factor cumbersome, you can set the default assay to “TF”, which simplifies operations involving regulon activity. For example, to create a waterfall plot that compares the regulon activity between two cell types, you can do the following:
# Setting the default assay to "TF" for easier access to regulon activity
DefaultAssay(pbmc) <- "TF"
# Creating a waterfall plot to compare regulon activity between monocytes and CD8 T cells
WaterfallPlot(
pbmc,
features = rownames(pbmc), # Using all available TFs in the "TF" assay
ident.1 = "Mono CD14", # First group of cells
ident.2 = "CD8 T cell", # Second group of cells
exp.transform = FALSE, # Disable transformation of expression data
top.n = 20 # Display the top 20 most differentially active TFs
)
##
## Attaching package: 'tidyr'
## The following object is masked from 'package:reshape2':
##
## smiths
## The following objects are masked from 'package:Matrix':
##
## expand, pack, unpack
These examples illustrate how to integrate and utilize SCENIC analysis within the Seurat framework, providing a comprehensive approach to understanding gene regulatory mechanisms at the single-cell level.
## R version 4.4.0 (2024-04-24)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 20.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_BE.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=de_BE.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=de_BE.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_BE.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Brussels
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tidyr_1.3.1 rlist_0.4.6.2 cowplot_1.1.3
## [4] viridis_0.6.5 viridisLite_0.4.2 rlang_1.1.4
## [7] scales_1.3.0 reshape2_1.4.4 mosaic_1.9.1
## [10] mosaicData_0.20.4 ggformula_0.12.0 Matrix_1.7-0
## [13] ggplot2_3.5.1 lattice_0.22-6 dplyr_1.1.4
## [16] loomR_0.2.0 itertools_0.1-3 iterators_1.0.14
## [19] hdf5r_1.3.10 R6_2.5.1 Seurat_5.1.0
## [22] SeuratExtend_1.0.0 SeuratObject_5.0.2 sp_2.1-4
## [25] SeuratExtendData_0.2.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 rstudioapi_0.16.0 jsonlite_1.8.8
## [4] magrittr_2.0.3 spatstat.utils_3.0-4 farver_2.1.2
## [7] rmarkdown_2.27 fs_1.6.4 ragg_1.3.2
## [10] vctrs_0.6.5 ROCR_1.0-11 memoise_2.0.1
## [13] spatstat.explore_3.2-7 forcats_1.0.0 htmltools_0.5.8.1
## [16] haven_2.5.4 sass_0.4.9 sctransform_0.4.1
## [19] parallelly_1.37.1 KernSmooth_2.23-24 bslib_0.7.0
## [22] htmlwidgets_1.6.4 desc_1.4.3 ica_1.0-3
## [25] plyr_1.8.9 plotly_4.10.4 zoo_1.8-12
## [28] cachem_1.1.0 igraph_1.3.4 mime_0.12
## [31] lifecycle_1.0.4 pkgconfig_2.0.3 fastmap_1.2.0
## [34] fitdistrplus_1.1-11 future_1.33.2 shiny_1.8.1.1
## [37] digest_0.6.35 colorspace_2.1-0 patchwork_1.2.0
## [40] tensor_1.5 RSpectra_0.16-1 irlba_2.3.5.1
## [43] textshaping_0.4.0 labeling_0.4.3 progressr_0.14.0
## [46] fansi_1.0.6 spatstat.sparse_3.0-3 httr_1.4.7
## [49] polyclip_1.10-6 abind_1.4-5 compiler_4.4.0
## [52] withr_3.0.0 bit64_4.0.5 fastDummies_1.7.3
## [55] highr_0.11 MASS_7.3-60.2 tools_4.4.0
## [58] lmtest_0.9-40 httpuv_1.6.15 future.apply_1.11.2
## [61] goftest_1.2-3 glue_1.7.0 nlme_3.1-165
## [64] promises_1.3.0 grid_4.4.0 Rtsne_0.17
## [67] cluster_2.1.6 generics_0.1.3 gtable_0.3.5
## [70] spatstat.data_3.0-4 labelled_2.13.0 hms_1.1.3
## [73] data.table_1.15.4 utf8_1.2.4 spatstat.geom_3.2-9
## [76] RcppAnnoy_0.0.22 ggrepel_0.9.5 RANN_2.6.1
## [79] pillar_1.9.0 stringr_1.5.1 spam_2.10-0
## [82] RcppHNSW_0.6.0 later_1.3.2 splines_4.4.0
## [85] survival_3.7-0 bit_4.0.5 deldir_2.0-4
## [88] tidyselect_1.2.1 miniUI_0.1.1.1 pbapply_1.7-2
## [91] knitr_1.47 gridExtra_2.3 scattermore_1.2
## [94] xfun_0.44 mosaicCore_0.9.4.0 matrixStats_1.3.0
## [97] stringi_1.8.4 lazyeval_0.2.2 yaml_2.3.8
## [100] evaluate_0.23 codetools_0.2-20 tibble_3.2.1
## [103] cli_3.6.2 uwot_0.2.2 xtable_1.8-4
## [106] reticulate_1.37.0 systemfonts_1.1.0 munsell_0.5.1
## [109] jquerylib_0.1.4 Rcpp_1.0.12 globals_0.16.3
## [112] spatstat.random_3.2-3 png_0.1-8 parallel_4.4.0
## [115] pkgdown_2.0.9 dotCall64_1.1-1 listenv_0.9.1
## [118] ggridges_0.5.6 leiden_0.4.3.1 purrr_1.0.2