Integrating Spatial Transcriptomics and Single-Cell Data

This tutorial covers best practices for integrating spatial transcriptomics (ST) and single-cell RNA-seq (scRNA-seq) data with FADVI.

What input ST data should be used?

FADVI is NOT designed to deconvolve cell types from large-spot-based ST data. Instead, it takes high-resolution, single-cell-level ST data together with matched scRNA-seq data and outputs a joint representation of both, which can be used for downstream analyses such as clustering and visualization.

  • Supported technologies: 10X Genomics Visium HD, 10X Genomics Xenium, CosMx, Stereo-seq and other high-resolution ST platforms

  • Unsupported technologies: 10X Genomics Visium, and other large-spot-based ST data

Can I use unlabeled ST data?

  • Yes: FADVI can integrate unlabeled ST data with labeled scRNA-seq data.

  • However, labeled ST data is recommended for optimal integration performance.

(Alternative) Integrating unlabeled ST data with scRNA-seq data

Unlabeled ST data can be integrated directly with labeled scRNA-seq data, but the results may be suboptimal because the transcriptional profiles of the same cell type can differ substantially between technologies.
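Before concatenating the two datasets, it can help to check how many genes they share, because `ad.concat` keeps only the intersection of variables by default (`join="inner"`). The sketch below is a minimal illustration using plain Python lists; `gene_overlap` and the toy gene lists are hypothetical and stand in for `adata_sc.var_names` and `adata_st.var_names`.

```python
def gene_overlap(sc_genes, st_genes):
    """Return the shared genes (in scRNA-seq order) and the fraction of the ST panel they cover."""
    st_set = set(st_genes)
    shared = [g for g in sc_genes if g in st_set]
    frac = len(shared) / len(st_set) if st_set else 0.0
    return shared, frac


# Toy gene lists standing in for adata_sc.var_names and adata_st.var_names
sc_genes = ["CD3E", "CD4", "CD8A", "MS4A1", "LYZ", "NKG7"]
st_genes = ["CD3E", "MS4A1", "LYZ", "EPCAM"]

shared, frac = gene_overlap(sc_genes, st_genes)
print(f"{len(shared)} shared genes ({frac:.0%} of the ST panel): {shared}")
```

A low overlap suggests that the integration will rely on only a small part of the ST panel, which is worth knowing before training.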

import scanpy as sc
import anndata as ad
import fadvi

# Load your data

adata_sc = sc.read_h5ad("scRNAseq_data.h5ad")
adata_st = sc.read_h5ad("st_data.h5ad")

# Mark every ST cell as unlabeled
adata_st.obs["cell_type"] = "Unknown"

# Concatenate scRNA-seq and ST data; the "tech" column records each cell's source
adata = ad.concat({"scRNA-seq": adata_sc, "spatial": adata_st}, label="tech")

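# Set up AnnData: technology is the batch, cell type is the label
fadvi.FADVI.setup_anndata(
    adata,
    batch_key="tech",
    labels_key="cell_type",
    unlabeled_category="Unknown",
    layer="counts",  # assumes the counts are stored in adata.layers["counts"]
)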

# Initialize model with default parameters
model = fadvi.FADVI(adata)

# Train the model; 30 epochs should be sufficient for most datasets
model.train(max_epochs=30)

# Get latent representation
latent = model.get_latent_representation()
adata.obsm["X_fadvi_l"] = latent

The joint representation stored in adata.obsm["X_fadvi_l"] now covers both the ST and scRNA-seq cells and can be used for downstream analysis such as clustering and visualization.
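One possible downstream use of a joint representation is to give each unlabeled ST cell the majority label of its nearest labeled scRNA-seq neighbors. The sketch below is a generic k-nearest-neighbor vote written with NumPy, not FADVI's own prediction method; `knn_label_transfer` and the toy coordinates are hypothetical and only illustrate the idea.

```python
from collections import Counter

import numpy as np


def knn_label_transfer(latent, labels, unlabeled="Unknown", k=5):
    """Give each unlabeled cell the majority label of its k nearest labeled cells."""
    latent = np.asarray(latent, dtype=float)
    labels = np.asarray(labels, dtype=object)
    is_ref = np.array([lab != unlabeled for lab in labels], dtype=bool)
    if not is_ref.any():
        raise ValueError("at least one labeled cell is required")
    ref, ref_labels = latent[is_ref], labels[is_ref]
    k = min(k, len(ref))
    out = labels.copy()
    for i in np.flatnonzero(~is_ref):
        # Euclidean distance from this cell to every labeled cell
        dist = np.linalg.norm(ref - latent[i], axis=1)
        nearest = np.argsort(dist)[:k]
        out[i] = Counter(ref_labels[nearest]).most_common(1)[0][0]
    return out


# Toy 2-D latent space: two labeled groups plus two unlabeled cells
latent = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1],
          [0.05, 0.05], [5.05, 5.05]]
labels = ["T cell"] * 3 + ["B cell"] * 3 + ["Unknown"] * 2

pred = knn_label_transfer(latent, labels, k=3)
print(list(pred[-2:]))
```

With real data, the inputs would be the latent matrix and the `cell_type` column of the concatenated object. Transferred labels should be treated as provisional and checked against known marker genes.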

Next Steps

  • Explore Advanced Usage for more sophisticated use cases

  • Check the API Reference for detailed parameter descriptions

  • See example notebooks