Tutorials

Welcome to the scpviz tutorials. These guides walk you through the main steps of a single-cell or spatial proteomics workflow using pAnnData.

How to Use These Tutorials

Each tutorial is self-contained, with code snippets you can run in a Python environment, and includes:

Example code blocks you can copy into a Jupyter notebook.
Visual outputs to illustrate the results.
Tips and notes to explain recommended practices.

If you’re new to scpviz, start with Importing Data and work through the tutorials in order.

Workflow at a Glance

graph TB
    A["`Import data  
    (DIA-NN / PD)`"] --> B["`Parse metadata  
    (.obs from filenames)`"]
    B --> C["`Filter proteins/peptides  
    (≥2 unique peptides, sample queries)`"]
    C --> D["`Normalize  
    (global, reference feature)`"]
    D --> E["`Impute missing values  
    (KNN / group-wise)`"]
    E --> F["`Plotting  
    (abundance, PCA/UMAP, clustermap)`"]
    F --> G["`DE analysis  
    (mean vs. pairwise strategies)`"]
    G --> H["`Enrichment (STRING)  
    (GSEA / GO / PPI)`"]
    B --> I["`Export results`"]

    %% Optional side paths
    B -. "QC summaries" .-> F
    C -. "RS matrix checks" .-> F
    G -. "ranked/unranked lists" .-> H
    D .-> I
    F .-> I
    G .-> I
    H .-> I

1. Importing Data

Load DIA-NN or Proteome Discoverer (PD) reports into pAnnData.
Automatically parse sample metadata (.obs) from filenames.
Understand the prot, pep, and rs matrices.
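
As a rough illustration of the filename-parsing idea above, the sketch below splits run names on underscores to build a per-sample metadata table. The filenames, the field names, and the `parse_obs` helper are assumptions made up for this example and are not part of the scpviz API; only the Python standard library is used.

```python
from pathlib import PurePath

# Hypothetical raw-file names; real names depend on your acquisition scheme.
FILES = [
    "20240101_HeLa_ctrl_rep1.raw",
    "20240101_HeLa_ctrl_rep2.raw",
    "20240102_HeLa_drug_rep1.raw",
]

# Assumed meaning of each underscore-separated token.
FIELDS = ("date", "cellline", "treatment", "replicate")


def parse_obs(filenames, fields=FIELDS, sep="_"):
    """Return {sample_name: {field: value}} parsed from the file stems."""
    obs = {}
    for name in filenames:
        stem = PurePath(name).stem  # drop the directory and the extension
        tokens = stem.split(sep)
        if len(tokens) != len(fields):
            raise ValueError(
                f"{name!r}: expected {len(fields)} tokens, got {len(tokens)}"
            )
        obs[stem] = dict(zip(fields, tokens))
    return obs


obs = parse_obs(FILES)
for sample, meta in obs.items():
    print(sample, meta)
```

Raising on a token-count mismatch is a deliberate choice in this sketch: a silently misaligned metadata column is harder to spot later than an early error.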

2. Filtering and Normalization

Filter proteins by peptide support (e.g., ≥2 unique peptides).
Apply sample-level filters and advanced queries on .obs or .summary.
Normalize intensities by global scale, reference features, or other strategies.
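
The two ideas above can be sketched in plain Python: keep proteins with at least two unique peptides, then rescale each sample so that sample medians agree. The data, the variable names, and the choice of median scaling as the "global" strategy are assumptions for illustration; the scpviz functions may behave differently.

```python
from statistics import median

# Hypothetical protein table: unique-peptide counts and per-sample intensities.
# None marks a missing value.
unique_peptides = {"P1": 5, "P2": 1, "P3": 2}
intensities = {
    "P1": {"s1": 100.0, "s2": 220.0},
    "P2": {"s1": 50.0, "s2": None},
    "P3": {"s1": 300.0, "s2": 580.0},
}
samples = ["s1", "s2"]

# Step 1: keep proteins supported by at least two unique peptides.
MIN_UNIQUE = 2
kept = {
    prot: row
    for prot, row in intensities.items()
    if unique_peptides[prot] >= MIN_UNIQUE
}

# Step 2: global median scaling (one assumed reading of "global" normalization).
# Each sample is rescaled so its median equals the mean of all sample medians.
sample_median = {
    s: median(row[s] for row in kept.values() if row[s] is not None)
    for s in samples
}
target = sum(sample_median.values()) / len(sample_median)
normalized = {
    prot: {
        s: None if v is None else v * target / sample_median[s]
        for s, v in row.items()
    }
    for prot, row in kept.items()
}

for prot, row in normalized.items():
    print(prot, row)
```

Filtering before normalizing means that weakly supported proteins do not influence the per-sample scaling factors.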

3. Imputation

Handle missing values using KNN-based or group-wise strategies.
Summarize imputation statistics stored in pdata.stats.
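
To make the group-wise strategy concrete, here is a minimal sketch that fills each missing value with the median of the observed values from samples in the same group and counts how many values were filled. The `impute_groupwise` helper, the data, and the use of the median as the fill value are assumptions for this example; this is not the scpviz implementation, and the returned counts only loosely mirror the kind of summary kept in pdata.stats.

```python
from statistics import median

# Hypothetical data: one row per protein, one value per sample (None = missing).
groups = {"s1": "ctrl", "s2": "ctrl", "s3": "drug", "s4": "drug"}
data = {
    "P1": {"s1": 10.0, "s2": None, "s3": 20.0, "s4": 22.0},
    "P2": {"s1": None, "s2": None, "s3": 5.0, "s4": None},
}


def impute_groupwise(data, groups):
    """Fill each missing value with the median of observed values in its group.

    Values stay missing when a group has no observations for that protein.
    Returns the imputed table and the number of values filled per group.
    """
    imputed = {}
    n_filled = {g: 0 for g in set(groups.values())}
    for prot, row in data.items():
        new_row = dict(row)
        for group in n_filled:
            members = [s for s in row if groups[s] == group]
            observed = [row[s] for s in members if row[s] is not None]
            if not observed:
                continue  # nothing to borrow from; leave the gaps in place
            fill = median(observed)
            for s in members:
                if row[s] is None:
                    new_row[s] = fill
                    n_filled[group] += 1
        imputed[prot] = new_row
    return imputed, n_filled


imputed, n_filled = impute_groupwise(data, groups)
print(imputed)
print(n_filled)
```

Leaving a value missing when its group has no observations avoids inventing signal for a protein that was never detected in that condition.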

4. Plotting

Visualize abundance distributions with violin/box/strip plots.
Run PCA/UMAP embeddings with flexible coloring options.
Generate heatmaps and clustermaps with class annotations.

5. Differential Expression (DE)

Perform DE testing at the protein or peptide level.
Compare fold-change strategies (mean-based vs. pairwise median).
Export DE results for downstream use.
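
The difference between the two fold-change strategies can be seen on a toy example. The sketch below assumes that "mean-based" means the log2 ratio of group means and that "pairwise median" means the median of log2 ratios over every treated/control pair; the function names and intensities are hypothetical, and the exact definitions used by scpviz should be checked in its docstrings.

```python
import math
from itertools import product
from statistics import mean, median

# Hypothetical intensities for one protein in two conditions.
treated = [8.0, 16.0, 64.0]
control = [4.0, 4.0, 8.0]


def log2fc_mean(a, b):
    """log2 of the ratio of group means."""
    return math.log2(mean(a) / mean(b))


def log2fc_pairwise_median(a, b):
    """Median of log2 ratios over every (a_i, b_j) pair."""
    return median(math.log2(x / y) for x, y in product(a, b))


print(log2fc_mean(treated, control))
print(log2fc_pairwise_median(treated, control))
```

In this toy data the single high treated replicate pulls the mean-based estimate upward, while the pairwise median is less affected by it.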

6. Enrichment and Networks

Run GSEA and GO enrichment with STRING.
Explore protein–protein interaction networks.
Retrieve functional annotations for differentially expressed proteins.

For in-depth details, see the examples in each function's docstring in the API Reference.