These functions convert a Seurat object and its associated Velocyto loom file(s) into an AnnData object and generate RNA velocity plots with scVelo. The AnnData object can be read from an h5ad file or reused from memory, and can be plotted in several styles. Together, the functions let you process data in R and run scVelo, which builds on Python's Scanpy library, for trajectory analysis without leaving the R session.

scVelo.SeuratToAnndata(
  seu,
  filename,
  velocyto.loompath,
  cell.id.match.table = NULL,
  prefix = NULL,
  postfix = "-1",
  conda_env = "seuratextend"
)

scVelo.Plot(
  load.adata = NULL,
  style = c("stream", "grid", "scatter"),
  basis = "umap_cell_embeddings",
  color = NULL,
  groups = NULL,
  palette = NULL,
  alpha = 0.15,
  arrow_size = 3,
  arrow_length = 2,
  dpi = 300,
  legend_fontsize = 9,
  figsize = c(7, 5),
  xlim = NULL,
  ylim = NULL,
  save = NULL,
  conda_env = "seuratextend"
)

Arguments

seu

The Seurat object containing single-cell RNA sequencing data that needs to be analyzed using scVelo.

filename

Path where the resulting AnnData object will be saved. This should be a path to an h5ad file.

velocyto.loompath

Path(s) to the Velocyto-generated loom file(s) that contain the RNA velocity data.

cell.id.match.table

An optional data frame for advanced users that maps cell IDs between the Seurat object and Velocyto loom file across multiple samples. It requires a strict format with three columns: cellid.seurat, cellid.velocyto, and velocyto.loompath, indicating the cell ID in the Seurat object, the corresponding cell ID in the Velocyto loom, and the loom file path for that sample, respectively. Default: NULL

prefix

Prefix prepended to cell IDs in the Seurat object, typically a sample or batch identifier, used to match them to the corresponding IDs in the Velocyto loom file. Default: NULL

postfix

Postfix appended to cell IDs in the Seurat object to match the corresponding IDs in the Velocyto loom file. Default: '-1'

conda_env

Name of the Conda environment where the Python dependencies for scVelo and Scanpy are installed. This environment is used to run Python code from R. Default: 'seuratextend'

load.adata

Path to a previously saved AnnData object (in h5ad format) which can be directly loaded to avoid re-running preprocessing. If NULL, reticulate will automatically use the existing AnnData object `adata` in the Python environment for plotting. Default: NULL.

style

Style of the velocity plot, allowing for different visual representations such as 'stream', 'grid', or 'scatter'. Default: c("stream", "grid", "scatter").

basis

The embedding to be used for plotting, typically 'umap_cell_embeddings' to represent UMAP reductions. Default: 'umap_cell_embeddings'.

color

The variable by which to color the plot, usually a categorical variable like cluster identifiers or a continuous variable reflecting gene expression levels. Default: NULL.

groups

Groups or clusters to highlight in the plot, useful for focusing on specific cell types or conditions within the dataset. Default: NULL.

palette

Color palette to use for differentiating between groups or clusters within the plot. Allows customization of aesthetic presentation. Default: NULL.

alpha

Opacity of the points in the plot, which can be adjusted to enhance visualization when dealing with densely packed points. Default: 0.15.

arrow_size

Size of the arrows representing RNA velocity vectors in the plot, relevant only when `style` is set to 'scatter'. This can be adjusted to make the arrows more or less prominent based on visualization needs. Default: 3.

arrow_length

Length of the arrows, which controls how far they extend from their origin points. Relevant only when `style` is set to 'scatter'. Adjusting it can make the direction and magnitude of cellular transitions easier to read. Default: 2.

dpi

Resolution of the saved plot in dots per inch, useful when preparing figures for publication or presentations. Default: 300.

legend_fontsize

Font size used in the plot legend. Default: 9.

figsize

Width and height of the plot in inches. Default: c(7, 5).

xlim

Limits for the x-axis, which can be set to focus on specific areas of the plot or to standardize across multiple plots. Default: NULL.

ylim

Limits for the y-axis, similar in use to `xlim` for focusing or standardizing the y-axis view. Default: NULL.

save

Path where the plot should be saved. If specified, the plot will be saved to the given location. Supports various file formats like PNG, PDF, SVG, etc. Default: NULL.
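
For the `cell.id.match.table` argument, the following is a minimal sketch of a table in the required three-column format, built in base R. The cell IDs, sample names, and loom paths are hypothetical placeholders; substitute the IDs from your own Seurat object and loom files.

```r
# Hypothetical example: two samples, two cells each.
# All IDs and paths are placeholders, not real data.
match_table <- data.frame(
  cellid.seurat = c(
    "sample1_AAACCCAAGAAACACT-1", "sample1_AAACCCAGTCGTTGTA-1",
    "sample2_AAACGAAAGCTAGTTC-1", "sample2_AAACGCTCAGGTTCCG-1"
  ),
  cellid.velocyto = c(
    "sample1:AAACCCAAGAAACACTx", "sample1:AAACCCAGTCGTTGTAx",
    "sample2:AAACGAAAGCTAGTTCx", "sample2:AAACGCTCAGGTTCCGx"
  ),
  velocyto.loompath = rep(
    c("path/to/sample1.loom", "path/to/sample2.loom"),
    each = 2
  ),
  stringsAsFactors = FALSE
)

# Sanity checks: exact column names and no duplicated Seurat cell IDs
stopifnot(
  identical(colnames(match_table),
            c("cellid.seurat", "cellid.velocyto", "velocyto.loompath")),
  !anyDuplicated(match_table$cellid.seurat)
)
print(match_table)
```

A table like this would be supplied to `scVelo.SeuratToAnndata()` through `cell.id.match.table = match_table`.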

Value

These functions do not return any object within R; instead, they prepare and store an AnnData object `adata` in the Python environment accessible via `reticulate`, and generate plots which can be viewed directly or saved to a file. The plots reflect the dynamics of RNA velocity in single-cell datasets.

Details

These functions cover both steps of the workflow: converting a Seurat object to an AnnData object and plotting it with scVelo. The primary metadata and dimension reduction data from the Seurat object are used to prepare the AnnData object, which is then used to generate the plots. `SeuratExtend` exposes scVelo plotting in R with a range of customization options, including plot style, color scheme, and group highlighting, so RNA velocity can be visualized without writing Python code directly.
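
To illustrate the ID matching that `prefix` and `postfix` describe, here is a small base R sketch. It assumes Seurat cell IDs of the form prefix + barcode + postfix; the IDs are hypothetical, and the stripping logic is an illustration rather than the package's internal implementation. A check like this can help confirm that the chosen `prefix` and `postfix` fit your cell names before running the conversion.

```r
# Hypothetical Seurat cell IDs of the form <prefix><barcode><postfix>
seurat_ids <- c("sample1_AAACCCAAGAAACACT-1", "sample1_AAACCCAGTCGTTGTA-1")
prefix  <- "sample1_"
postfix <- "-1"

# Check that every ID carries the expected prefix and postfix
has_affixes <- startsWith(seurat_ids, prefix) & endsWith(seurat_ids, postfix)

# Strip both affixes to recover the bare barcodes
barcodes <- substr(
  seurat_ids,
  nchar(prefix) + 1,
  nchar(seurat_ids) - nchar(postfix)
)

print(has_affixes)
print(barcodes)
```

If `has_affixes` contains any FALSE values, the affixes do not describe all cells, and a `cell.id.match.table` may be the more suitable route.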

Examples

library(Seurat)
library(SeuratExtend)

# Download the example Seurat Object
mye_small <- readRDS(url("https://zenodo.org/records/10944066/files/pbmc10k_mye_small_velocyto.rds", "rb"))

# Download the example velocyto loom file to a temporary folder
# (mode = "wb" keeps the binary loom file intact on Windows)
loom_path <- file.path(tempdir(), "pbmc10k_mye_small.loom")
download.file("https://zenodo.org/records/10944066/files/pbmc10k_mye_small.loom", loom_path, mode = "wb")

# Set up the path for saving the AnnData object in the HDF5 (h5ad) format
adata_path <- file.path(tempdir(), "mye_small.h5ad")

# Integrate Seurat Object and velocyto loom information into one AnnData object, which will be stored at the specified path.
scVelo.SeuratToAnndata(
  mye_small, # The downloaded example Seurat object
  filename = adata_path, # Path where the AnnData object will be saved
  velocyto.loompath = loom_path, # Path to the loom file
  prefix = "sample1_", # Prefix for cell IDs in the Seurat object
  postfix = "-1" # Postfix for cell IDs in the Seurat object
)

# Generate a default UMAP plot colored by 'cluster' and save it as a PNG file
scVelo.Plot(color = "cluster", save = "umap1.png", figsize = c(5,4))

# Generate a scatter style plot highlighting specific groups, using a custom color palette, with specified axis limits, and save it to a file
scVelo.Plot(
  style = "scatter",
  color = "cluster",
  groups = c("DC", "Mono CD14"),
  palette = color_pro(3, "light"),
  xlim = c(0, 10), ylim = c(0, 10),
  save = "umap2_specified_area.png"
)
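
The examples above plot from the `adata` object already held in the Python environment. As a hedged sketch of the `load.adata` route, the loop below draws one plot per style from a previously saved h5ad file. The file location and output file names are assumptions for illustration, and the plotting calls run only if SeuratExtend is installed and the file exists.

```r
# Assumed location of an h5ad file written earlier by scVelo.SeuratToAnndata()
adata_path <- file.path(tempdir(), "mye_small.h5ad")

styles <- c("stream", "grid", "scatter")
# Illustrative output file names, one per style
out_files <- paste0("velocity_", styles, ".png")

if (requireNamespace("SeuratExtend", quietly = TRUE) && file.exists(adata_path)) {
  for (i in seq_along(styles)) {
    SeuratExtend::scVelo.Plot(
      load.adata = adata_path, # Reload the saved AnnData object
      style = styles[i],
      color = "cluster",
      save = out_files[i]
    )
  }
} else {
  message("Skipping: SeuratExtend or the h5ad file is not available.")
}
```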