Tutorials
Welcome to the scpviz tutorials.
These guides walk you through the main steps of a single-cell or spatial proteomics workflow using pAnnData.
How to Use These Tutorials
Each tutorial is self-contained and includes:
- Example code blocks you can copy into a Jupyter notebook.
- Visual outputs to illustrate the results.
- Tips and notes to explain recommended practices.
If you’re new to scpviz, start with Importing Data and work through the sequence.
Workflow at a Glance
graph TB
A["`Import data
(DIA-NN / PD)`"] --> B["`Parse metadata
(.obs from filenames)`"]
B --> C["`Filter proteins/peptides
(≥2 unique peptides, sample queries)`"]
C --> D["`Normalize
(global, reference feature)`"]
D --> E["`Impute missing values
(KNN / group-wise)`"]
E --> F["`Plotting
(abundance, PCA/UMAP, clustermap)`"]
F --> G["`DE analysis
(mean vs. pairwise strategies)`"]
G --> H["`Enrichment (STRING)
(GSEA / GO / PPI)`"]
B --> I["`Export results`"]
%% Optional side paths
B -. "QC summaries" .-> F
C -. "RS matrix checks" .-> F
G -. "ranked/unranked lists" .-> H
D .-> I
F .-> I
G .-> I
H .-> I
1. Importing Data
- Load DIA-NN or Proteome Discoverer (PD) reports into pAnnData.
- Automatically parse sample metadata (.obs) from filenames.
- Understand the prot, pep, and rs matrices.
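To make the idea of filename-derived metadata concrete, here is a minimal standalone sketch in plain Python. It does not use the scpviz API. The helper `parse_filename`, the file names, and the field names are all hypothetical, chosen only to show how delimiter-separated tokens in a file name can become per-sample metadata columns.

```python
from pathlib import Path


def parse_filename(path, fields, sep="_"):
    """Split a file's stem on `sep` and pair the tokens with `fields`.

    Hypothetical helper for illustration; not part of scpviz.
    """
    tokens = Path(path).stem.split(sep)
    if len(tokens) != len(fields):
        raise ValueError(
            f"{path!r}: expected {len(fields)} tokens, got {len(tokens)}"
        )
    return dict(zip(fields, tokens))


# Assumed naming scheme: date_cellline_condition_replicate.raw
files = ["20240101_HeLa_ctrl_rep1.raw", "20240101_HeLa_treated_rep1.raw"]
fields = ["date", "cellline", "condition", "replicate"]

# One metadata record per sample, keyed by the file stem.
obs = {Path(f).stem: parse_filename(f, fields) for f in files}

for sample, meta in obs.items():
    print(sample, meta)
```

Raising an error on a token-count mismatch, rather than silently padding, makes inconsistently named files visible early.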
2. Filtering and Normalization
- Filter proteins by peptide support (e.g. ≥2 unique peptides).
- Apply sample-level filters and advanced queries on .obs or .summary.
- Normalize intensities by global scale, reference features, or other strategies.
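The two core operations in this step can be sketched without any library. The example below is an illustration under stated assumptions, not the scpviz implementation. The toy data, the function names `filter_by_peptides` and `median_normalize`, and the choice of median scaling as the "global" normalization are all assumptions.

```python
from statistics import median

# Hypothetical toy data: unique peptide counts per protein, and
# per-sample protein intensities.
unique_peptides = {"P1": 5, "P2": 1, "P3": 2}
intensities = {
    "s1": {"P1": 100.0, "P2": 10.0, "P3": 50.0},
    "s2": {"P1": 200.0, "P2": 30.0, "P3": 100.0},
}


def filter_by_peptides(intensities, unique_peptides, min_unique=2):
    """Keep only proteins supported by at least `min_unique` unique peptides."""
    keep = {p for p, n in unique_peptides.items() if n >= min_unique}
    return {
        s: {p: v for p, v in row.items() if p in keep}
        for s, row in intensities.items()
    }


def median_normalize(intensities):
    """Scale each sample so that all sample medians match a common target."""
    sample_medians = {s: median(row.values()) for s, row in intensities.items()}
    target = median(sample_medians.values())
    return {
        s: {p: v * target / sample_medians[s] for p, v in row.items()}
        for s, row in intensities.items()
    }


filtered = filter_by_peptides(intensities, unique_peptides)
normalized = median_normalize(filtered)
print(normalized)
```

In this toy case the two samples differ only by a constant factor of two, so median scaling brings them to identical values. Real data will not align this cleanly.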
3. Imputation
- Handle missing values using KNN-based or group-wise strategies.
- Summarize imputation statistics stored in pdata.stats.
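As a rough illustration of the group-wise strategy, the sketch below fills each missing value with the median of the observed values from the same sample group and counts how many values were filled. It is plain Python, not the scpviz API. The function name `impute_groupwise`, the use of `None` for missing values, and the choice of the median as fill value are assumptions.

```python
from statistics import median


def impute_groupwise(values, groups):
    """Fill None entries with the median of observed values in the same group.

    Groups with no observed values are left untouched, since there is
    nothing to estimate from.
    """
    observed = {}
    for v, g in zip(values, groups):
        if v is not None:
            observed.setdefault(g, []).append(v)

    imputed, n_filled = [], 0
    for v, g in zip(values, groups):
        if v is None and g in observed:
            imputed.append(median(observed[g]))
            n_filled += 1
        else:
            imputed.append(v)
    return imputed, n_filled


# Hypothetical intensities for one protein across five samples.
values = [10.0, None, 14.0, None, None]
groups = ["ctrl", "ctrl", "ctrl", "treated", "treated"]

imputed, n_filled = impute_groupwise(values, groups)
print(imputed, n_filled)
```

Tracking a count such as `n_filled` is one simple form of the imputation summary mentioned above.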
4. Plotting
- Visualize abundance distributions with violin/box/strip plots.
- Run PCA/UMAP embeddings with flexible coloring options.
- Generate heatmaps and clustermaps with class annotations.
5. Differential Expression (DE)
- Perform DE testing at the protein or peptide level.
- Compare fold-change strategies (mean-based vs. pairwise median).
- Export DE results for downstream use.
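The difference between the two fold-change strategies is easiest to see on a small example. The sketch below is one plausible reading of the terms, not a statement of how scpviz computes them: "mean-based" is taken as the log2 ratio of group means, and "pairwise median" as the median of log2 ratios over every treated/control sample pair. Function names and data are hypothetical.

```python
from itertools import product
from math import log2
from statistics import mean, median


def log2fc_mean(a, b):
    """log2 of the ratio of group means."""
    return log2(mean(a) / mean(b))


def log2fc_pairwise_median(a, b):
    """Median of log2 ratios over all (a_i, b_j) sample pairs."""
    return median(log2(x / y) for x, y in product(a, b))


# Hypothetical intensities for one protein; the third treated
# sample is an outlier.
treated = [8.0, 8.0, 64.0]
control = [2.0, 2.0, 2.0]

print("mean-based:     ", log2fc_mean(treated, control))
print("pairwise median:", log2fc_pairwise_median(treated, control))
```

With one outlying sample, the mean-based estimate is pulled upward while the pairwise median stays at the value supported by most sample pairs, which is the usual argument for the more robust strategy.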
6. Enrichment and Networks
- Run GSEA and GO enrichment with STRING.
- Explore protein–protein interaction networks.
- Retrieve functional annotations for differentially expressed genes.
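For intuition about what a GO-style over-representation result means, the sketch below computes a one-sided hypergeometric p-value by hand. It is only a statistical illustration and does not call STRING or scpviz. The exact test and multiple-testing correction used by STRING may differ, and rank-based GSEA works differently. The function name and the numbers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background size, K: background proteins annotated with the term,
    n: size of the query list, k: annotated proteins in the query list.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Hypothetical numbers: 10 background proteins, 5 annotated with a term,
# and a query list of 5 proteins that are all annotated.
p_value = hypergeom_sf(k=5, N=10, K=5, n=5)
print(p_value)
```

A small p-value indicates that the term appears in the query list more often than expected from random draws out of the background.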
For in-depth details, see the examples in each function's docstring in the API Reference.