1 Introduction

SingleR is an automatic annotation method for single-cell RNA sequencing (scRNA-seq) data (Aran et al. 2019). Given a reference dataset of samples (single-cell or bulk) with known labels, it labels new cells from a test dataset based on similarity to the reference set. Specifically, for each test cell:

  1. We compute the Spearman correlation between its expression profile and that of each reference sample.
  2. We define the per-label score as a fixed quantile (by default, 0.8) of the distribution of correlations across the reference samples with that label.
  3. We repeat this for all labels and take the label with the highest score as the annotation for this cell.
  4. We optionally perform a fine-tuning step:
       • The reference dataset is subsetted to only include labels with scores close to the maximum.
       • Scores are recomputed using only marker genes for the subset of labels.
       • This is iterated until one label remains.
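
The first three steps can be sketched in base R on simulated data. This is only a toy illustration under stated assumptions, not SingleR's actual implementation: the reference matrix, the label names "A", "B" and "C", and the simulated test cell are all hypothetical, and the fine-tuning step is omitted.

```r
# Toy sketch of the scoring steps (not SingleR's actual implementation).
set.seed(100)
n.genes <- 200

# Hypothetical reference: 3 labels with 5 samples each, simulated as
# label-specific profiles plus noise.
labels <- rep(c("A", "B", "C"), each = 5)
profiles <- matrix(rnorm(n.genes * 3), ncol = 3,
    dimnames = list(NULL, c("A", "B", "C")))
ref <- profiles[, labels] +
    matrix(rnorm(n.genes * length(labels), sd = 0.5), nrow = n.genes)

# Hypothetical test cell that resembles label "B".
test.cell <- profiles[, "B"] + rnorm(n.genes, sd = 0.5)

# Step 1: Spearman correlation between the test cell and each reference sample.
rho <- cor(test.cell, ref, method = "spearman")[1, ]

# Step 2: per-label score, taken as the 0.8 quantile of that label's correlations.
scores <- tapply(rho, labels, quantile, probs = 0.8)

# Step 3: the label with the highest score becomes the annotation.
best <- names(scores)[which.max(scores)]
best
```

Using a quantile rather than the maximum or the mean makes the score less sensitive to a single outlying reference sample, while still allowing a label to score well when only part of its samples resemble the test cell.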

Automatic annotation provides a convenient way of transferring biological knowledge across datasets. In this manner, the work of interpreting clusters and defining marker genes only has to be done once.

2 Using the built-in references

SingleR provides several reference datasets (mostly derived from bulk RNA-seq or microarray data) through dedicated data retrieval functions. For example, to obtain reference data from the Human Primary Cell Atlas:

library(SingleR)
hpca.se <- HumanPrimaryCellAtlasData()
hpca.se
## class: SummarizedExperiment 
## dim: 19363 713 
## metadata(0):
## assays(1): logcounts
## rownames(19363): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
## rowData names(0):
## colnames(713): GSM112490 GSM112491 ... GSM92233 GSM92234
## colData names(2): label.main label.fine

We use this reference in the SingleR() function to annotate a scRNA-seq dataset from La Manno et al. (2016). For the sake of speed, we will only label the first 100 cells from this dataset.

library(scRNAseq)
hESCs <- LaMannoBrainData('human-es')
hESCs <- hESCs[,1:100]

# Restrict to common genes between test and reference data:
library(scater)
common <- intersect(rownames(hESCs), rownames(hpca.se))
hpca.se <- hpca.se[common,]
hESCs <- hESCs[common,]
hESCs <- logNormCounts(hESCs)

pred.hpca <- SingleR(test = hESCs, ref = hpca.se, labels = hpca.se$label.main)
pred.hpca
## DataFrame with 100 rows and 5 columns
##                                                                     scores
##                                                                   <matrix>
## 1772122_301_C02  0.118426779945786:0.179699807625087:0.157326274226517:...
## 1772122_180_E05  0.129708246318855:0.236277439793527:0.202370888668263:...
## 1772122_300_H02  0.158201338525345:0.250060222727419:0.211831550178353:...
## 1772122_180_B09   0.158778546217777:0.27716592787528:0.222681369744636:...
## 1772122_180_G04   0.138505219642345:0.236658649096383:0.19092437361406:...
## ...                                                                    ...
## 1772122_299_E07  0.145931041885859:0.241153701803065:0.217382763112476:...
## 1772122_180_D02  0.122983434596168:0.239181076829949:0.181221997276501:...
## 1772122_300_D09  0.129757310468164:0.233775092572195:0.196637664917917:...
## 1772122_298_F09  0.143118885460347:0.262267367714562:0.214329641867196:...
## 1772122_302_A11 0.0912854247387272:0.185945405472165:0.139232371863794:...
##                         first.labels                         tuning.scores
##                          <character>                           <DataFrame>
## 1772122_301_C02 Neuroepithelial_cell   0.18244020296249:0.0991115652997192
## 1772122_180_E05 Neuroepithelial_cell  0.137548373236792:0.0647133734667384
## 1772122_300_H02 Neuroepithelial_cell   0.275798157639906:0.136969040146444
## 1772122_180_B09 Neuroepithelial_cell 0.0851622797320583:0.0819878452425098
## 1772122_180_G04 Neuroepithelial_cell   0.198841544187094:0.101662168246495
## ...                              ...                                   ...
## 1772122_299_E07 Neuroepithelial_cell  0.176002520599547:0.0922503823656398
## 1772122_180_D02 Neuroepithelial_cell   0.196760862365318:0.112480486219438
## 1772122_300_D09 Neuroepithelial_cell 0.0816424287822026:0.0221368018363302
## 1772122_298_F09 Neuroepithelial_cell  0.187249853552379:0.0671892835266423
## 1772122_302_A11 Neuroepithelial_cell   0.156079956344163:0.105132159755961
##                               labels        pruned.labels
##                          <character>          <character>
## 1772122_301_C02 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_180_E05              Neurons              Neurons
## 1772122_300_H02 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_180_B09 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_180_G04 Neuroepithelial_cell Neuroepithelial_cell
## ...                              ...                  ...
## 1772122_299_E07 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_180_D02 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_300_D09 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_298_F09 Neuroepithelial_cell Neuroepithelial_cell
## 1772122_302_A11            Astrocyte            Astrocyte

Each row of the output DataFrame contains prediction results for a single cell. Labels are shown before fine-tuning (first.labels), after fine-tuning (labels) and after pruning (pruned.labels), along with the associated scores. We summarize the distribution of labels across our subset of cells:

table(pred.hpca$labels)
## 
##            Astrocyte Neuroepithelial_cell              Neurons 
##                   14                   81                    5

At this point, it is worth noting that SingleR is workflow/package agnostic. The above example uses SummarizedExperiment objects, but the same functions will accept any (log-)normalized expression matrix.

3 Using single-cell references

3.1 Setting up the data

Here, we will use two human pancreas datasets from the scRNAseq package. The aim is to use one pre-labelled dataset to annotate the other unlabelled dataset. First, we set up the Muraro et al. (2016) dataset to be our reference.

library(scRNAseq)
sceM <- MuraroPancreasData()

# One should normally do cell-based quality control at this point, but for
# brevity's sake, we will just remove the unlabelled libraries here.
sceM <- sceM[,!is.na(sceM$label)]
table(sceM$label)
## 
##      acinar       alpha        beta       delta        duct endothelial 
##         219         812         448         193         245          21 
##     epsilon mesenchymal          pp     unclear 
##           3          80         101           4
sceM <- logNormCounts(sceM)

We then set up our test dataset from Grun et al. (2016). To speed up this demonstration, we will subset to the first 100 cells.

sceG <- GrunPancreasData()
sceG <- sceG[,colSums(counts(sceG)) > 0]
sceG <- logNormCounts(sceG) 
sceG <- sceG[,1:100]

We then restrict to common genes:

common <- intersect(rownames(sceM), rownames(sceG))
sceM <- sceM[common,]
sceG <- sceG[common,]

3.2 Defining custom markers

The default marker definition in SingleR() is intended for references derived from bulk RNA-seq data. When using single-cell data as a reference, we suggest building your own marker list. This involves a series of pairwise comparisons between labels to define markers that distinguish each label from another, and is easy to perform with functions from scran. For example, we can perform pairwise t-tests and obtain the top 10 marker genes from each pairwise comparison.

library(scran)
out <- pairwiseTTests(logcounts(sceM), sceM$label, direction="up")
markers <- getTopMarkers(out$statistics, out$pairs, n=10)
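
To make the structure of such a pairwise marker list concrete, here is a minimal base R sketch on simulated data. It ranks genes by the difference in mean log-expression between two labels as a simplified stand-in for the t-tests above; the expression matrix, the gene names and the `toy.markers` object are hypothetical.

```r
# Toy sketch: pairwise marker lists from differences in mean log-expression.
# This is a simplified stand-in for the pairwise t-tests, not scran's method.
set.seed(42)
genes <- paste0("gene", 1:50)
labs <- rep(c("alpha", "beta", "delta"), each = 20)
expr <- matrix(rnorm(50 * 60), nrow = 50, dimnames = list(genes, NULL))

# Spike in label-specific upregulation so that each label has known markers.
expr[1:5, labs == "alpha"] <- expr[1:5, labs == "alpha"] + 5
expr[6:10, labs == "beta"] <- expr[6:10, labs == "beta"] + 5
expr[11:15, labs == "delta"] <- expr[11:15, labs == "delta"] + 5

# Mean expression per label (genes x labels).
means <- sapply(split(seq_along(labs), labs),
    function(i) rowMeans(expr[, i, drop = FALSE]))

# toy.markers[[x]][[y]] holds the top genes upregulated in x relative to y.
top.n <- 5
all.labs <- colnames(means)
toy.markers <- lapply(setNames(all.labs, all.labs), function(x) {
    others <- setdiff(all.labs, x)
    lapply(setNames(others, others), function(y) {
        lfc <- means[, x] - means[, y]
        names(sort(lfc, decreasing = TRUE))[seq_len(top.n)]
    })
})

toy.markers$alpha$beta
```

Each label gets one gene set per comparison, so a gene that separates alpha from beta but not alpha from delta is still retained for the comparison where it is informative.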

We then supply these genes to SingleR() directly via the genes= argument. A more focused gene set also allows annotation to be performed more quickly compared to the default approach.

pred <- SingleR(test=sceG, ref=sceM, labels=sceM$label, genes=markers)
table(pred$labels)
## 
##  acinar    beta   delta    duct      pp unclear 
##      59       4       1      34       1       1

In some cases, markers may only be available for specific labels rather than for pairwise comparisons between labels. This is accommodated by supplying a named list of character vectors to genes=. Note that this is likely to be less powerful than the list-of-lists approach, as information about pairwise differences is discarded.

label.markers <- lapply(markers, unlist, recursive=FALSE)
pred2 <- SingleR(test=sceG, ref=sceM, labels=sceM$label, genes=label.markers)
table(pred$labels, pred2$labels)
##          
##           acinar beta delta duct pp
##   acinar      53    0     0    6  0
##   beta         0    4     0    0  0
##   delta        0    0     1    0  0
##   duct         0    0     0   34  0
##   pp           0    0     0    0  1
##   unclear      0    0     0    1  0
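
The difference between the two marker formats can be seen on a small hand-written example. The nested list and its gene symbols below are purely illustrative and are not taken from the datasets above.

```r
# Hypothetical pairwise markers: nested.markers[[x]][[y]] = genes up in x versus y.
nested.markers <- list(
    alpha = list(beta = c("GCG", "TTR"), delta = c("GCG", "IRX2")),
    beta  = list(alpha = c("INS", "IAPP"), delta = c("INS", "MAFA"))
)

# Collapse to one character vector per label, dropping duplicated genes.
flat.markers <- lapply(nested.markers,
    function(x) unique(unlist(x, use.names = FALSE)))

flat.markers$alpha
## [1] "GCG"  "TTR"  "IRX2"
```

After flattening, it is no longer recorded which comparison each gene came from, which is the information loss described above.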

4 Annotation diagnostics

4.1 Based on the scores within cells

SingleR provides a few basic yet powerful visualization tools. plotScoreHeatmap() displays the scores for all cells across all reference labels, which allows users to inspect the confidence of the predicted labels across the dataset. We can also display clusters (or other metadata information) for each cell by setting clusters= or annotation_col=. In this case, we display which donor the cells came from and the labels assigned to each cell.

plotScoreHeatmap(pred, show.labels = TRUE,
    annotation_col=data.frame(donor=sceG$donor,
        row.names=rownames(pred)))

For this plot, the key point is to examine the spread of scores within each cell. Ideally, each cell (i.e., column) should have one score that is obviously larger than the rest, indicating that it is unambiguously assigned to a single label. A spread of similar scores for a given cell indicates that the assignment is uncertain.
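
The same idea can be checked numerically. The sketch below computes, for each cell, the gap between the best and second-best score. The `toy.scores` matrix is hypothetical; for a real result, the corresponding matrix would presumably be the scores field of the SingleR() output.

```r
# Hypothetical score matrix: rows are cells, columns are reference labels.
toy.scores <- rbind(
    cell1 = c(acinar = 0.62, beta = 0.21, duct = 0.25),  # one clear winner
    cell2 = c(acinar = 0.41, beta = 0.18, duct = 0.40)   # near tie
)

# Gap between the best and second-best score for each cell.
top.gap <- apply(toy.scores, 1, function(s) {
    s <- sort(s, decreasing = TRUE)
    s[[1]] - s[[2]]
})
top.gap
```

A small gap, as for cell2 here, corresponds to a column in the heatmap with two similarly coloured entries and flags an uncertain assignment.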

4.2 Based on the deltas across cells

The pruneScores() function will remove potentially poor-quality or ambiguous assignments. In particular, ambiguous assignments are identified based on the per-cell delta, i.e., the difference between the score for the assigned label and the median across all labels for each cell. Low deltas indicate that the assignment is uncertain, which is especially relevant if the cell’s true label does not exist in the reference. The exact threshold used for pruning is identified using an outlier-based approach that accounts for differences in the scale of the correlations in various contexts.

to.remove <- pruneScores(pred)
summary(to.remove)
##    Mode   FALSE 
## logical     100

By default, SingleR() will also report pruned labels where low-quality assignments are replaced with NA. However, the default pruning thresholds may not be appropriate for every dataset; see ?pruneScores for a more detailed discussion. We also provide plotScoreDistribution() to help determine whether the thresholds are appropriate. This displays the per-label distribution of the differences-from-median across cells, from which pruneScores() defines an appropriate threshold as 3 median absolute deviations (MADs) below the median.

plotScoreDistribution(pred, show = "delta.med", ncol = 3, show.nmads = 3)
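
The delta and the MAD-based threshold can be sketched in base R as follows. This is a simplified illustration on a hypothetical score matrix; pruneScores() may apply further criteria that are not reproduced here.

```r
# Hypothetical scores (cells x labels).
toy.scores <- rbind(
    c(0.60, 0.20, 0.25),
    c(0.58, 0.22, 0.24),
    c(0.61, 0.19, 0.26),
    c(0.59, 0.21, 0.23),
    c(0.62, 0.20, 0.27),
    c(0.30, 0.27, 0.28)   # ambiguous cell with similar scores for all labels
)
colnames(toy.scores) <- c("acinar", "beta", "duct")

# Assigned label: the column with the highest score in each row.
assigned <- colnames(toy.scores)[max.col(toy.scores, ties.method = "first")]

# Delta: score of the assigned label minus the median score across all labels.
delta <- apply(toy.scores, 1, max) - apply(toy.scores, 1, median)

# Per-label threshold: nmads MADs below the median delta for that label.
nmads <- 3
threshold <- tapply(delta, assigned, function(d) median(d) - nmads * mad(d))

# Cells whose delta falls below the threshold for their label are flagged.
to.prune <- delta < threshold[assigned]
to.prune
```

Defining the threshold relative to each label's own distribution means that labels with generally lower correlations are not penalized simply for having smaller deltas.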

If some tuning parameters must be adjusted, we can simply call pruneScores() directly with adjusted parameters. Here, we set labels to NA if they are to be discarded, which is also how SingleR() marks such labels in pruned.labels.

new.pruned <- pred$labels
new.pruned[pruneScores(pred, nmads=5)] <- NA
table(new.pruned, useNA="always")
## new.pruned
##  acinar    beta   delta    duct      pp unclear    <NA> 
##      59       4       1      34       1       1       0

4.3 Based on marker gene expression

Another simple yet effective diagnostic is to examine the expression of the marker genes for each label in the test dataset. We extract the identity of the markers from the metadata of the SingleR results and use them in the plotHeatmap() function from scater, as shown below for beta cell markers.

# Beta cell-related markers
plotHeatmap(sceG, order_columns_by=list(I(pred$labels)),
    features=unique(unlist(metadata(pred)$de.genes$beta))) 

We can perform this for all labels by wrapping this code in a loop, as shown below:

for (lab in names(metadata(pred)$de.genes)) {
    plotHeatmap(sceG, order_columns_by=list(I(pred$labels)), 
        features=unique(unlist(metadata(pred)$de.genes[[lab]]))) 
}

If a cell in the test dataset is confidently assigned to a particular label, we would expect it to have strong expression of that label’s markers. At the very least, it should exhibit upregulation of those markers relative to cells assigned to other labels. If this is not the case, some skepticism towards the quality of the assignments is warranted.
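
A numeric counterpart to the heatmap is to compare the average expression of a label's markers between cells assigned to that label and all other cells. The following base R sketch uses simulated data; `toy.expr`, `toy.labels` and `beta.markers` are hypothetical stand-ins for the log-expression matrix, the predicted labels and the marker set.

```r
set.seed(1)

# Hypothetical log-expression matrix (genes x cells) and predicted labels.
toy.labels <- rep(c("beta", "duct"), each = 10)
toy.expr <- matrix(rnorm(6 * 20, mean = 1), nrow = 6,
    dimnames = list(paste0("g", 1:6), NULL))

# Assumed markers for the "beta" label, upregulated in the simulated beta cells.
beta.markers <- c("g1", "g2")
toy.expr[beta.markers, toy.labels == "beta"] <-
    toy.expr[beta.markers, toy.labels == "beta"] + 3

# Average marker expression per cell, then summarized by assigned label.
marker.avg <- colMeans(toy.expr[beta.markers, , drop = FALSE])
by.label <- tapply(marker.avg, toy.labels, mean)
by.label
```

If the value for the label of interest is not clearly higher than for the other labels, the assignments for that label deserve a closer look.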

5 Available references

The legacy SingleR package provides RDA files that contain normalized expression values and cell type labels based on bulk RNA-seq, microarray and single-cell RNA-seq data from:

  • Blueprint (Martens and Stunnenberg 2013) and Encode (The ENCODE Project Consortium 2012),
  • the Human Primary Cell Atlas (Mabbott et al. 2013),
  • the murine ImmGen (Heng et al. 2008), and
  • a collection of mouse data sets downloaded from GEO (Benayoun et al. 2019).

For the first three references, the bulk RNA-seq and microarray data were obtained from pre-sorted cell populations, i.e., the cell labels of these samples were mostly derived from the respective sorting/purification strategy rather than from in silico prediction methods.

Three additional reference datasets from bulk RNA-seq and microarray data for immune cells have also been prepared. Each of these datasets was also obtained from pre-sorted cell populations.

The characteristics of each dataset are summarized below:

Data retrieval                      Organism  Samples  Sample types                            No. of main labels  No. of fine labels  Cell type focus
HumanPrimaryCellAtlasData()         human     713      microarrays of sorted cell populations  37                  157                 Non-specific
BlueprintEncodeData()               human     259      RNA-seq                                 24                  43                  Non-specific
DatabaseImmuneCellExpressionData()  human     1561     RNA-seq                                 5                   15                  Immune
NovershternHematopoieticData()      human     211      microarrays of sorted cell populations  17                  38                  Hematopoietic & Immune
MonacoImmuneData()                  human     114      RNA-seq                                 11                  29                  Immune
ImmGenData()                        mouse     830      microarrays of sorted cell populations  20                  253                 Hematopoietic & Immune
MouseRNAseqData()                   mouse     358      RNA-seq                                 18                  28                  Non-specific

Details for each dataset can be viewed on the corresponding help page (e.g. ?ImmGenData). The available sample types in each set are listed in the sections below.

BlueprintEncodeData Labels

label.main         label.fine
Neutrophils        Neutrophils
Monocytes          Monocytes
HSC                MEP
CD4+ T-cells       CD4+ T-cells
CD4+ T-cells       Tregs
CD4+ T-cells       CD4+ Tcm
CD4+ T-cells       CD4+ Tem
CD8+ T-cells       CD8+ Tcm
CD8+ T-cells       CD8+ Tem
NK cells           NK cells
B-cells            naive B-cells
B-cells            Memory B-cells
B-cells            Class-switched memory B-cells
HSC                HSC
HSC                MPP
HSC                CLP
HSC                GMP
Macrophages        Macrophages
CD8+ T-cells       CD8+ T-cells
Erythrocytes       Erythrocytes
HSC                Megakaryocytes
HSC                CMP
Macrophages        Macrophages M1
Macrophages        Macrophages M2
Endothelial cells  Endothelial cells
DC                 DC
Eosinophils        Eosinophils
B-cells            Plasma cells
Chondrocytes       Chondrocytes
Fibroblasts        Fibroblasts
Smooth muscle      Smooth muscle
Epithelial cells   Epithelial cells
Melanocytes        Melanocytes
Skeletal muscle    Skeletal muscle
Keratinocytes      Keratinocytes
Endothelial cells  mv Endothelial cells
Myocytes           Myocytes
Adipocytes         Adipocytes
Neurons            Neurons
Pericytes          Pericytes
Adipocytes         Preadipocytes
Adipocytes         Astrocytes
Mesangial cells    Mesangial cells

HumanPrimaryCellAtlasData Labels

label.main label.fine
DC DC:monocyte-derived:immature
DC DC:monocyte-derived:Galectin-1
DC DC:monocyte-derived:LPS
DC DC:monocyte-derived
Smooth_muscle_cells Smooth_muscle_cells:bronchial:vit_D
Smooth_muscle_cells Smooth_muscle_cells:bronchial
Epithelial_cells Epithelial_cells:bronchial
B_cell B_cell
Neutrophils Neutrophil
T_cells T_cell:CD8+_Central_memory
T_cells T_cell:CD8+
T_cells T_cell:CD4+
T_cells T_cell:CD8+_effector_memory_RA
T_cells T_cell:CD8+_effector_memory
T_cells T_cell:CD8+_naive
Monocyte Monocyte
Erythroblast Erythroblast
BM & Prog. BM
DC DC:monocyte-derived:rosiglitazone
DC DC:monocyte-derived:AM580
DC DC:monocyte-derived:rosiglitazone/AGN193109
DC DC:monocyte-derived:anti-DC-SIGN_2h
Endothelial_cells Endothelial_cells:HUVEC
Endothelial_cells Endothelial_cells:HUVEC:Borrelia_burgdorferi
Endothelial_cells Endothelial_cells:HUVEC:IFNg
Endothelial_cells Endothelial_cells:lymphatic
Endothelial_cells Endothelial_cells:HUVEC:Serum_Amyloid_A
Endothelial_cells Endothelial_cells:lymphatic:TNFa_48h
T_cells T_cell:effector
T_cells T_cell:CCR10+CLA+1,25(OH)2_vit_D3/IL-12
T_cells T_cell:CCR10-CLA+1,25(OH)2_vit_D3/IL-12
Gametocytes Gametocytes:spermatocyte
DC DC:monocyte-derived:A._fumigatus_germ_tubes_6h
Neurons Neurons:ES_cell-derived_neural_precursor
Keratinocytes Keratinocytes
Keratinocytes Keratinocytes:IL19
Keratinocytes Keratinocytes:IL20
Keratinocytes Keratinocytes:IL22
Keratinocytes Keratinocytes:IL24
Keratinocytes Keratinocytes:IL26
Keratinocytes Keratinocytes:KGF
Keratinocytes Keratinocytes:IFNg
Keratinocytes Keratinocytes:IL1b
HSC_-G-CSF HSC_-G-CSF
DC DC:monocyte-derived:mature
Monocyte Monocyte:anti-FcgRIIB
Macrophage Macrophage:monocyte-derived:IL-4/cntrl
Macrophage Macrophage:monocyte-derived:IL-4/Dex/cntrl
Macrophage Macrophage:monocyte-derived:IL-4/Dex/TGFb
Macrophage Macrophage:monocyte-derived:IL-4/TGFb
Monocyte Monocyte:leukotriene_D4
NK_cell NK_cell
NK_cell NK_cell:IL2
Embryonic_stem_cells Embryonic_stem_cells
Tissue_stem_cells Tissue_stem_cells:iliac_MSC
Chondrocytes Chondrocytes:MSC-derived
Osteoblasts Osteoblasts
Tissue_stem_cells Tissue_stem_cells:BM_MSC
Osteoblasts Osteoblasts:BMP2
Tissue_stem_cells Tissue_stem_cells:BM_MSC:BMP2
Tissue_stem_cells Tissue_stem_cells:BM_MSC:TGFb3
DC DC:monocyte-derived:Poly(IC)
DC DC:monocyte-derived:CD40L
DC DC:monocyte-derived:Schuler_treatment
DC DC:monocyte-derived:antiCD40/VAF347
Tissue_stem_cells Tissue_stem_cells:dental_pulp
T_cells T_cell:CD4+_central_memory
T_cells T_cell:CD4+_effector_memory
T_cells T_cell:CD4+_Naive
Smooth_muscle_cells Smooth_muscle_cells:vascular
Smooth_muscle_cells Smooth_muscle_cells:vascular:IL-17
BM BM
Platelets Platelets
Epithelial_cells Epithelial_cells:bladder
Macrophage Macrophage:monocyte-derived
Macrophage Macrophage:monocyte-derived:M-CSF
Macrophage Macrophage:monocyte-derived:M-CSF/IFNg
Macrophage Macrophage:monocyte-derived:M-CSF/Pam3Cys
Macrophage Macrophage:monocyte-derived:M-CSF/IFNg/Pam3Cys
Macrophage Macrophage:monocyte-derived:IFNa
Gametocytes Gametocytes:oocyte
Monocyte Monocyte:F._tularensis_novicida
Endothelial_cells Endothelial_cells:HUVEC:B._anthracis_LT
B_cell B_cell:Germinal_center
B_cell B_cell:Plasma_cell
B_cell B_cell:Naive
B_cell B_cell:Memory
DC DC:monocyte-derived:AEC-conditioned
Tissue_stem_cells Tissue_stem_cells:lipoma-derived_MSC
Tissue_stem_cells Tissue_stem_cells:adipose-derived_MSC_AM3
Endothelial_cells Endothelial_cells:HUVEC:FPV-infected
Endothelial_cells Endothelial_cells:HUVEC:PR8-infected
Endothelial_cells Endothelial_cells:HUVEC:H5N1-infected
Macrophage Macrophage:monocyte-derived:S._aureus
Fibroblasts Fibroblasts:foreskin
iPS_cells iPS_cells:skin_fibroblast-derived
iPS_cells iPS_cells:skin_fibroblast
T_cells T_cell:gamma-delta
Monocyte Monocyte:CD14+
Macrophage Macrophage:Alveolar
Macrophage Macrophage:Alveolar:B._anthacis_spores
Neutrophils Neutrophil:inflam
iPS_cells iPS_cells:PDB_fibroblasts
iPS_cells iPS_cells:PDB_1lox-17Puro-5
iPS_cells iPS_cells:PDB_1lox-17Puro-10
iPS_cells iPS_cells:PDB_1lox-21Puro-20
iPS_cells iPS_cells:PDB_1lox-21Puro-26
iPS_cells iPS_cells:PDB_2lox-5
iPS_cells iPS_cells:PDB_2lox-22
iPS_cells iPS_cells:PDB_2lox-21
iPS_cells iPS_cells:PDB_2lox-17
iPS_cells iPS_cells:CRL2097_foreskin
iPS_cells iPS_cells:CRL2097_foreskin-derived:d20_hepatic_diff
iPS_cells iPS_cells:CRL2097_foreskin-derived:undiff.
B_cell B_cell:CXCR4+_centroblast
B_cell B_cell:CXCR4-_centrocyte
Endothelial_cells Endothelial_cells:HUVEC:VEGF
iPS_cells iPS_cells:fibroblasts
iPS_cells iPS_cells:fibroblast-derived:Direct_del._reprog
iPS_cells iPS_cells:fibroblast-derived:Retroviral_transf
Endothelial_cells Endothelial_cells:lymphatic:KSHV
Endothelial_cells Endothelial_cells:blood_vessel
Monocyte Monocyte:CD16-
Monocyte Monocyte:CD16+
Tissue_stem_cells Tissue_stem_cells:BM_MSC:osteogenic
Hepatocytes Hepatocytes
Neutrophils Neutrophil:uropathogenic_E._coli_UTI89
Neutrophils Neutrophil:commensal_E._coli_MG1655
MSC MSC
Neuroepithelial_cell Neuroepithelial_cell:ESC-derived
Astrocyte Astrocyte:Embryonic_stem_cell-derived
Endothelial_cells Endothelial_cells:HUVEC:IL-1b
HSC_CD34+ HSC_CD34+
CMP CMP
GMP GMP
B_cell B_cell:immature
MEP MEP
Myelocyte Myelocyte
Pre-B_cell_CD34- Pre-B_cell_CD34-
Pro-B_cell_CD34+ Pro-B_cell_CD34+
Pro-Myelocyte Pro-Myelocyte
Smooth_muscle_cells Smooth_muscle_cells:umbilical_vein
iPS_cells iPS_cells:foreskin_fibrobasts
iPS_cells iPS_cells:iPS:minicircle-derived
iPS_cells iPS_cells:adipose_stem_cells
iPS_cells iPS_cells:adipose_stem_cell-derived:lentiviral
iPS_cells iPS_cells:adipose_stem_cell-derived:minicircle-derived
Fibroblasts Fibroblasts:breast
Monocyte Monocyte:MCSF
Monocyte Monocyte:CXCL4
Neurons Neurons:adrenal_medulla_cell_line
Tissue_stem_cells Tissue_stem_cells:CD326-CD56+
NK_cell NK_cell:CD56hiCD62L+
T_cells T_cell:Treg:Naive
Neutrophils Neutrophil:LPS
Neutrophils Neutrophil:GM-CSF_IFNg
Monocyte Monocyte:S._typhimurium_flagellin
Neurons Neurons:Schwann_cell

DatabaseImmuneCellExpressionData Labels

label.main     label.fine
B cells        B cells, naive
Monocytes      Monocytes, CD14+
Monocytes      Monocytes, CD16+
NK cells       NK cells
T cells, CD4+  T cells, CD4+, memory TREG
T cells, CD4+  T cells, CD4+, naive
T cells, CD4+  T cells, CD4+, naive, stimulated
T cells, CD4+  T cells, CD4+, naive TREG
T cells, CD4+  T cells, CD4+, TFH
T cells, CD4+  T cells, CD4+, Th1
T cells, CD4+  T cells, CD4+, Th1_17
T cells, CD4+  T cells, CD4+, Th17
T cells, CD4+  T cells, CD4+, Th2
T cells, CD8+  T cells, CD8+, naive
T cells, CD8+  T cells, CD8+, naive, stimulated
NovershternHematopoieticData Labels

label.main       label.fine
Basophils        Basophils
B cells          Naïve B cells
B cells          Mature B cells class able to switch
B cells          Mature B cells
B cells          Mature B cells class switched
CMPs             Common myeloid progenitors
Dendritic cells  Plasmacytoid Dendritic Cells
Dendritic cells  Myeloid Dendritic Cells
Eosinophils      Eosinophils
Erythroid cells  Erythroid_CD34+ CD71+ GlyA-
Erythroid cells  Erythroid_CD34- CD71+ GlyA-
Erythroid cells  Erythroid_CD34- CD71+ GlyA+
Erythroid cells  Erythroid_CD34- CD71lo GlyA+
Erythroid cells  Erythroid_CD34- CD71- GlyA+
GMPs             Granulocyte/monocyte progenitors
Granulocytes     Colony Forming Unit-Granulocytes
Granulocytes     Granulocytes (Neutrophilic Metamyelocytes)
Granulocytes     Granulocytes (Neutrophils)
HSCs             Hematopoietic stem cells_CD133+ CD34dim
HSCs             Hematopoietic stem cells_CD38- CD34+
Megakaryocytes   Colony Forming Unit-Megakaryocytic
Megakaryocytes   Megakaryocytes
MEPs             Megakaryocyte/erythroid progenitors
Monocytes        Colony Forming Unit-Monocytes
Monocytes        Monocytes
NK cells         Mature NK cells_CD56- CD16+ CD3-
NK cells         Mature NK cells_CD56+ CD16+ CD3-
NK cells         Mature NK cells_CD56- CD16- CD3-
NK T cells       NK T cells
B cells          Early B cells
B cells          Pro B cells
CD8+ T cells     CD8+ Effector Memory RA
CD8+ T cells     Naive CD8+ T cells
CD8+ T cells     CD8+ Effector Memory
CD8+ T cells     CD8+ Central Memory
CD4+ T cells     Naive CD4+ T cells
CD4+ T cells     CD4+ Effector Memory
CD4+ T cells     CD4+ Central Memory

MonacoImmuneData Labels

label.main       label.fine
CD8+ T cells     Naive CD8 T cells
CD8+ T cells     Central memory CD8 T cells
CD8+ T cells     Effector memory CD8 T cells
CD8+ T cells     Terminal effector CD8 T cells
T cells          MAIT cells
T cells          Vd2 gd T cells
T cells          Non-Vd2 gd T cells
CD4+ T cells     Follicular helper T cells
CD4+ T cells     T regulatory cells
CD4+ T cells     Th1 cells
CD4+ T cells     Th1/Th17 cells
CD4+ T cells     Th17 cells
CD4+ T cells     Th2 cells
CD4+ T cells     Naive CD4 T cells
Progenitors      Progenitor cells
B cells          Naive B cells
B cells          Non-switched memory B cells
B cells          Exhausted B cells
B cells          Switched memory B cells
B cells          Plasmablasts
Monocytes        Classical monocytes
Monocytes        Intermediate monocytes
Monocytes        Non classical monocytes
NK cells         Natural killer cells
Dendritic cells  Plasmacytoid dendritic cells
Dendritic cells  Myeloid dendritic cells
Neutrophils      Low-density neutrophils
Basophils        Low-density basophils
CD4+ T cells     Terminal effector CD4 T cells

ImmGenData Labels

label.main label.fine
Macrophages Macrophages (MF.11C-11B+)
Macrophages Macrophages (MF.ALV)
Monocytes Monocytes (MO.6+I-)
Monocytes Monocytes (MO.6+2+)
B cells B cells (B.MEM)
B cells B cells (B1A)
DC DC (DC.11B+)
DC DC (DC.11B-)
Stromal cells Stromal cells (DN.CFA)
Stromal cells Stromal cells (DN)
Eosinophils Eosinophils (EO)
Fibroblasts Fibroblasts (FRC.CAD11.WT)
Fibroblasts Fibroblasts (FRC.CFA)
Fibroblasts Fibroblasts (FRC)
Neutrophils Neutrophils (GN)
Endothelial cells Endothelial cells (LEC.CFA)
Endothelial cells Endothelial cells (LEC)
Macrophages Macrophages (MF)
T cells T cells (T.DP.69-)
T cells T cells (T.DP)
T cells T cells (T.DP69+)
Macrophages Macrophages (MF.F480HI.GATA6KO)
Macrophages Macrophages (MF.F480HI.CTRL)
T cells T cells (T.CD4.1H)
T cells T cells (T.CD4.24H)
T cells T cells (T.CD4.48H)
T cells T cells (T.CD4.5H)
T cells T cells (T.CD4.96H)
T cells T cells (T.CD4.CTR)
T cells T cells (T.CD8.1H)
T cells T cells (T.CD8.24H)
T cells T cells (T.CD8.48H)
T cells T cells (T.CD8.5H)
T cells T cells (T.CD8.96H)
T cells T cells (T.CD8.CTR)
Macrophages Macrophages (MFAR-)
Monocytes Monocytes (MO)
ILC ILC (ILC1.CD127+)
ILC ILC (LIV.ILC1.DX5-)
ILC ILC (LPL.NCR+ILC1)
ILC ILC (ILC2)
ILC ILC (LPL.NCR+ILC3)
ILC ILC (ILC3.LTI.CD4+)
ILC ILC (ILC3.LTI.CD4-)
ILC ILC (ILC3.LTI.4+)
NK cells NK cells (NK.CD127-)
ILC ILC (LIV.NK.DX5+)
ILC ILC (LPL.NCR+CNK)
Basophils Basophils (BA)
Epithelial cells Epithelial cells (Ep.5wk.MEC.Sca1+)
Epithelial cells Epithelial cells (Ep.5wk.MEChi)
Epithelial cells Epithelial cells (Ep.5wk.MEClo)
Epithelial cells Epithelial cells (Ep.8wk.CEC.Sca1+)
Epithelial cells Epithelial cells (Ep.8wk.CEChi)
Epithelial cells Epithelial cells (Ep.8wk.MEChi)
Epithelial cells Epithelial cells (Ep.8wk.MEClo)
Mast cells Mast cells (MC.ES)
Mast cells Mast cells (MC)
Mast cells Mast cells (MC.TO)
Mast cells Mast cells (MC.TR)
Mast cells Mast cells (MC.DIGEST)
Epithelial cells Epithelial cells (MECHI.GFP+.ADULT)
Epithelial cells Epithelial cells (MECHI.GFP+.ADULT.KO)
Epithelial cells Epithelial cells (MECHI.GFP-.ADULT)
Macrophages Macrophages (MF.480HI.NAIVE)
Macrophages Macrophages (MF.480INT.NAIVE)
T cells T cells (T.4EFF49D+11A+.D8.LCMV)
T cells T cells (T.4MEM49D+11A+.D30.LCMV)
T cells T cells (T.4NVE44-49D-11A-)
T cells T cells (T.8EFF.TBET+.OT1LISOVA)
T cells T cells (T.8EFF.TBET-.OT1LISOVA)
T cells T cells (T.8EFFKLRG1+CD127-.D8.LISOVA)
T cells T cells (T.8MEMKLRG1-CD127+.D8.LISOVA)
T cells T cells (T.4+8int)
T cells T cells (T.4FP3+25+)
T cells T cells (T.4int8+)
T cells T cells (T.4SP24-)
T cells T cells (T.4SP24int)
T cells T cells (T.4SP69+)
T cells T cells (T.8SP24-)
T cells T cells (T.8SP24int)
T cells T cells (T.8SP69+)
T cells T cells (T.DPbl)
T cells T cells (T.DPsm)
T cells T cells (T.ISP)
B cells B cells (B.FrE)
B cells B cells (B.FrF)
B cells B cells (preB.FrD)
B cells B cells (proB.FrBC)
B cells B cells (preB.FrC)
Stem cells Stem cells (SC.STSL)
T cells T cells (T.CD4+TESTNA)
T cells T cells (T.CD4+TESTDB)
B cells B cells (B.CD19CONTROL)
T cells T cells (T.CD4CONTROL)
T cells T cells (T.CD4TESTJS)
T cells T cells (T.CD4TESTCJ)
Stem cells Stem cells (SC.CD150-CD48-)
Tgd Tgd (Tgd.imm.vg2+)
Tgd Tgd (Tgd.imm.vg2)
Tgd Tgd (Tgd.mat.vg3)
Tgd Tgd (Tgd.mat.vg3.)
Tgd Tgd (Tgd)
Tgd Tgd (Tgd.vg2+.act)
Tgd Tgd (Tgd.vg2-.act)
Tgd Tgd (Tgd.vg2-)
B cells B cells (B.Fo)
B cells B cells (B.FRE)
B cells B cells (B.GC)
B cells B cells (B.MZ)
B cells B cells (B.T1)
B cells B cells (B.T2)
B cells B cells (B.T3)
B cells B cells (B1a)
B cells B cells (B1b)
DC DC (DC)
DC DC (DC.103+11B-)
DC DC (DC.8-4-11B+)
DC DC (DC.LC)
NK cells NK cells (NK.49CI+)
NK cells NK cells (NK.49CI-)
NK cells NK cells (NK.B2M-)
NK cells NK cells (NK.DAP10-)
NK cells NK cells (NK.DAP12-)
NK cells NK cells (NK.H+.MCMV1)
NK cells NK cells (NK.H+.MCMV7)
NK cells NK cells (NK.H+MCMV1)
NK cells NK cells (NK.MCMV7)
NK cells NK cells (NK)
NKT NKT (NKT.4+)
NKT NKT (NKT.4-)
NKT NKT (NKT.44+NK1.1+)
NKT NKT (NKT.44+NK1.1-)
NKT NKT (NKT.44-NK1.1-)
B cells B cells (preB.FRD)
B cells B cells (proB.CLP)
Stem cells Stem cells (proB.CLP)
B cells B cells (proB.FrA)
B cells B cells (proB.FRA)
B cells, pro B cells, pro (proB.FrA)
T cells T cells (T.4MEM)
T cells T cells (T.4Mem)
T cells T cells (T.4MEM44H62L)
T cells T cells (T.4Nve)
T cells T cells (T.4NVE)
T cells T cells (T.8EFF.OT1.D15.VSVOVA)
T cells T cells (T.8EFF.OT1.D5.VSVOVA)
T cells T cells (T.8EFF.OT1.VSVOVA)
T cells T cells (T.8EFF.OT1.D8.VSVOVA)
T cells T cells (T.8MEM)
T cells T cells (T.8Mem)
T cells T cells (T.8MEM.OT1.D106.VSVOVA)
T cells T cells (T.8EFF.OT1.D45VSV)
T cells T cells (T.8Nve)
T cells T cells (T.8NVE)
B cells B cells (proB.FRBC)
T cells T cells (T.4)
T cells T cells (T.4.Pa)
T cells T cells (T.4.PLN)
T cells T cells (T.4FP3-)
Tgd Tgd (Tgd.VG2+)
Tgd Tgd (Tgd.vg2+.TCRbko)
Tgd Tgd (Tgd.vg2-.TCRbko)
Tgd Tgd (Tgd.vg5+.act)
Tgd Tgd (Tgd.VG5+.ACT)
Tgd Tgd (Tgd.VG5+)
Tgd Tgd (Tgd.vg5-.act)
Tgd Tgd (Tgd.VG5-)
NK cells NK cells (NK.49H+)
NK cells NK cells (NK.49H-)
DC DC (DC.8+)
DC DC (DC.8-)
DC DC (DC.8-4-11B-)
DC DC (DC.PDC.8+)
DC DC (DC.PDC.8-)
Macrophages Macrophages (MF.II-480HI)
Macrophages Macrophages (MF.RP)
Macrophages Macrophages (MFIO5.II+480INT)
Macrophages Macrophages (MFIO5.II+480LO)
Macrophages Macrophages (MFIO5.II-480HI)
Macrophages Macrophages (MFIO5.II-480INT)
Monocytes Monocytes (MO.6C+II+)
Monocytes Monocytes (MO.6C+II-)
Monocytes Monocytes (MO.6C-II+)
Monocytes Monocytes (MO.6C-II-)
Monocytes Monocytes (MO.6C-IIINT)
T cells T cells (T.8EFF.OT1.D10LIS)
T cells T cells (T.8EFF.OT1.D10.LISOVA)
T cells T cells (T.8EFF.OT1.D15LIS)
T cells T cells (T.8EFF.OT1.D15.LISOVA)
T cells T cells (T.8EFF.OT1LISO)
T cells T cells (T.8EFF.OT1.LISOVA)
T cells T cells (T.8EFF.OT1.D8LISO)
T cells T cells (T.8EFF.OT1.D8.LISOVA)
T cells T cells (T.8MEM.OT1.D100.LISOVA)
T cells T cells (T.8MEM.OT1.D45.LISOVA)
T cells T cells (T.8NVE.OT1)
B cells B cells (B.FO)
Endothelial cells Endothelial cells (BEC)
Epithelial cells Epithelial cells (EP.MECHI)
Fibroblasts Fibroblasts (FI.MTS15+)
Fibroblasts Fibroblasts (FI)
Stromal cells Stromal cells (ST.31-38-44-)
Stem cells Stem cells (SC.LT34F)
Stem cells Stem cells (SC.MDP)
Stem cells Stem cells (SC.MEP)
Stem cells Stem cells (SC.MPP34F)
Stem cells Stem cells (SC.ST34F)
Stem cells Stem cells (SC.CDP)
Stem cells Stem cells (SC.CMP.DR)
Stem cells Stem cells (GMP)
Stem cells Stem cells (MLP)
Stem cells Stem cells (LTHSC)
T cells T cells (T.DN2-3)
T cells T cells (T.DN2)
T cells T cells (T.DN2A)
T cells T cells (T.DN2B)
T cells T cells (T.DN3-4)
T cells T cells (T.DN3A)
T cells T cells (T.DN3B)
T cells T cells (T.DN1-2)
T cells T cells (T.DN4)
Macrophages Macrophages (MF.103-11B+.SALM3)
Macrophages Macrophages (MF.103-11B+)
DC DC (DC.103-11B+24+)
Macrophages Macrophages (MF.103-11B+24-)
DC DC (DC.103-11B+F4-80LO.KD)
Macrophages Macrophages (MF.11CLOSER.SALM3)
Macrophages Macrophages (MF.11CLOSER)
Macrophages Macrophages (MF.103CLOSER)
Macrophages Macrophages (MF.II+480LO)
Neutrophils Neutrophils (GN.ARTH)
Neutrophils Neutrophils (GN.Thio)
Neutrophils Neutrophils (GN.URAC)
Macrophages Macrophages (MF.169+11CHI)
Macrophages Macrophages (MF.MEDL)
Macrophages Macrophages (MF.SBCAPS)
Microglia Microglia (Microglia)
T cells T cells (T.ETP)
Tgd Tgd (Tgd.imm.VG1+)
Tgd Tgd (Tgd.imm.VG1+VD6+)
Tgd Tgd (Tgd.mat.VG1+)
Tgd Tgd (Tgd.mat.VG1+VD6+)
Tgd Tgd (Tgd.mat.VG2+)
Tgd Tgd (Tgd.VG3+24AHI)
Tgd Tgd (Tgd.VG5+24AHI)
T cells T cells (T.8EFF.OT1.12HR.LISOVA)
T cells T cells (T.8EFF.OT1.24HR.LISOVA)
T cells T cells (T.8EFF.OT1.48HR.LISOVA)
T cells T cells (T.Tregs)
Tgd Tgd (Tgd.VG2+24AHI)
Tgd Tgd (Tgd.VG4+24AHI)
Tgd Tgd (Tgd.VG4+24ALO)

MouseRNAseqData Labels

label.main         label.fine
Adipocytes         Adipocytes
Neurons            aNSCs
Astrocytes         Astrocytes
Astrocytes         Astrocytes activated
Endothelial cells  Endothelial cells
Erythrocytes       Erythrocytes
Fibroblasts        Fibroblasts
Fibroblasts        Fibroblasts activated
Fibroblasts        Fibroblasts senescent
Granulocytes       Granulocytes
Macrophages        Macrophages
Microglia          Microglia
Microglia          Microglia activated
Monocytes          Monocytes
Neurons            Neurons
Neurons            Neurons activated
NK cells           NK cells
Neurons            NPCs
Oligodendrocytes   Oligodendrocytes
Neurons            qNSCs
T cells            T cells
Dendritic cells    Dendritic cells
Cardiomyocytes     Cardiomyocytes
Hepatocytes        Hepatocytes
B cells            B cells
Epithelial cells   Ependymal
Oligodendrocytes   OPCs
Macrophages        Macrophages activated

6 Combining predictions from different references

It is often useful to annotate the same test set against several reference datasets. The combineResults() function pools multiple predictions and, for each cell or cluster, retains the result from the prediction with the highest score:

hpca.se <- HumanPrimaryCellAtlasData()
bp.se <- BlueprintEncodeData()

hESCs <- LaMannoBrainData('human-es')
hESCs <- hESCs[,1:100]

# Restrict to common genes between reference datasets:
common.refs <- intersect(rownames(bp.se), rownames(hpca.se))
hpca.se <- hpca.se[common.refs,]
bp.se <- bp.se[common.refs,]

# Restrict to common genes between test and reference data:
common <- intersect(rownames(hESCs), rownames(hpca.se))
hpca.se <- hpca.se[common,]
bp.se <- bp.se[common,]
hESCs <- hESCs[common,]
hESCs <- logNormCounts(hESCs)

pred.hpca <- SingleR(test = hESCs, ref = hpca.se, labels = hpca.se$label.main)
table(pred.hpca$labels)
## 
##            Astrocyte Neuroepithelial_cell              Neurons 
##                   11                   82                    7
pred.bp <- SingleR(test = hESCs, ref = bp.se, labels = bp.se$label.main)
table(pred.bp$labels)
## 
##    Erythrocytes Mesangial cells         Neurons   Smooth muscle 
##               1               3              95               1
combined.preds <- combineResults(list(hpca = pred.hpca, bp = pred.bp))
table(combined.preds$labels)
## 
##            Astrocyte Neuroepithelial_cell              Neurons 
##                    4                   53                   43
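
The selection rule described above can be illustrated with a small base R sketch. This is not the implementation of combineResults(); it only mirrors the idea of keeping, for each cell, the label from whichever reference gives the highest score. The score matrices, the helper best_per_cell() and all values are made up for illustration.

```r
# Toy per-label score matrices (rows = cells, columns = labels) standing in
# for the scores obtained against two references. All values are made up.
scores.a <- rbind(
    cell1 = c(Astrocyte = 0.30, Neurons = 0.55),
    cell2 = c(Astrocyte = 0.62, Neurons = 0.40),
    cell3 = c(Astrocyte = 0.20, Neurons = 0.35)
)
scores.b <- rbind(
    cell1 = c(Neurons = 0.48, Erythrocytes = 0.10),
    cell2 = c(Neurons = 0.70, Erythrocytes = 0.15),
    cell3 = c(Neurons = 0.33, Erythrocytes = 0.41)
)

# Hypothetical helper: best label and its score for every cell.
best_per_cell <- function(scores) {
    idx <- max.col(scores, ties.method = "first")
    data.frame(
        label = colnames(scores)[idx],
        score = scores[cbind(seq_len(nrow(scores)), idx)],
        row.names = rownames(scores),
        stringsAsFactors = FALSE
    )
}

best <- list(a = best_per_cell(scores.a), b = best_per_cell(scores.b))

# For each cell, pick the reference whose best score is the highest.
top.scores <- sapply(best, function(x) x$score)
winner <- max.col(top.scores, ties.method = "first")

combined <- data.frame(
    label = ifelse(winner == 1, best$a$label, best$b$label),
    reference = names(best)[winner],
    row.names = rownames(scores.a),
    stringsAsFactors = FALSE
)
combined
```

Keeping the name of the winning reference alongside the label makes it easy to see how much each reference contributes to the combined annotation.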

7 Advanced use

Advanced users can split the SingleR() workflow into separate training and classification steps. Training (e.g., marker detection, assembly of nearest-neighbor indices) then only needs to be performed once, and the resulting data structures can be re-used across multiple classifications with different test datasets, provided that each test feature set is identical to or a superset of the features in the training set. For example:

trained <- trainSingleR(sceM, labels=sceM$label, genes=markers)
pred2b <- classifySingleR(sceG, trained)
table(pred$labels, pred2b$labels)
##          
##           acinar beta delta duct pp unclear
##   acinar      59    0     0    0  0       0
##   beta         0    4     0    0  0       0
##   delta        0    0     1    0  0       0
##   duct         0    0     0   34  0       0
##   pp           0    0     0    0  1       0
##   unclear      0    0     0    0  0       1
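
Before re-using a trained object with a new test dataset, it can help to confirm the feature requirement stated above. The base R sketch below shows one way to do so; the gene names and the helper covers_training() are hypothetical and are not part of SingleR.

```r
# Hypothetical gene sets: features retained at training time, and the
# features available in two candidate test datasets.
train.genes <- c("INS", "GCG", "SST", "PPY")
test.genes.ok <- c("GCG", "INS", "KRT19", "PPY", "SST")  # superset, different order
test.genes.bad <- c("INS", "GCG", "KRT19")               # lacks SST and PPY

# Hypothetical helper: TRUE if every training feature is present in the test set.
covers_training <- function(test.genes, train.genes) {
    all(train.genes %in% test.genes)
}

covers_training(test.genes.ok, train.genes)
covers_training(test.genes.bad, train.genes)

# Training features missing from the second test dataset.
setdiff(train.genes, test.genes.bad)
```

In practice, the two gene vectors would presumably be taken from the row names of the reference used for training and of each test dataset.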

Other efficiency improvements are possible through several arguments:

  • Switching to an approximate algorithm for the nearest-neighbor search in trainSingleR() via the BNPARAM= argument from the BiocNeighbors package.
  • Parallelizing the fine-tuning step in classifySingleR() with the BPPARAM= argument from the BiocParallel package.

These arguments can also be specified in the SingleR() command.

8 FAQs

How can I use this with my Seurat, SingleCellExperiment, or cell_data_set object?

SingleR is workflow-agnostic: all it needs is normalized counts. An example showing how to map its results back to common single-cell data objects is available in the README.
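
As a rough illustration of the mapping step, the base R sketch below attaches predicted labels to a table of per-cell metadata by matching on cell names. Both data frames are toy stand-ins (pred.toy for a SingleR result, meta for the per-cell metadata of whichever object is in use), and the column name SingleR.label is an arbitrary choice.

```r
# Toy stand-in for a SingleR result: one row per cell, with a 'labels' column.
pred.toy <- data.frame(
    labels = c("Neurons", "Astrocyte", "Neurons"),
    row.names = c("cellA", "cellB", "cellC"),
    stringsAsFactors = FALSE
)

# Toy stand-in for per-cell metadata, in a different order and with one
# extra cell that was never annotated.
meta <- data.frame(
    cluster = c(2, 1, 1, 3),
    row.names = c("cellC", "cellA", "cellD", "cellB")
)

# Align by cell name; cells without a prediction receive NA.
meta$SingleR.label <- pred.toy$labels[match(rownames(meta), rownames(pred.toy))]
meta
```

Matching on names rather than relying on row order guards against silent mislabelling when the test object has been subsetted or reordered after annotation.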

Where can I find reference sets appropriate for my data?

The scRNAseq package contains many single-cell datasets, with more being added continually. The ArrayExpress and GEOquery packages can be used to download any of the bulk or single-cell datasets in ArrayExpress or GEO, respectively.

9 Session information

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.10-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.10-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] knitr_1.26                  scran_1.14.5               
##  [3] scater_1.14.6               ggplot2_3.2.1              
##  [5] scRNAseq_2.0.2              SingleCellExperiment_1.8.0 
##  [7] SingleR_1.0.1               SummarizedExperiment_1.16.0
##  [9] DelayedArray_0.12.1         BiocParallel_1.20.0        
## [11] matrixStats_0.55.0          Biobase_2.46.0             
## [13] GenomicRanges_1.38.0        GenomeInfoDb_1.22.0        
## [15] IRanges_2.20.1              S4Vectors_0.24.1           
## [17] BiocGenerics_0.32.0         BiocStyle_2.14.2           
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-6                  bit64_0.9-7                  
##  [3] RColorBrewer_1.1-2            httr_1.4.1                   
##  [5] tools_3.6.1                   backports_1.1.5              
##  [7] R6_2.4.1                      irlba_2.3.3                  
##  [9] vipor_0.4.5                   DBI_1.1.0                    
## [11] lazyeval_0.2.2                colorspace_1.4-1             
## [13] withr_2.1.2                   tidyselect_0.2.5             
## [15] gridExtra_2.3                 bit_1.1-14                   
## [17] curl_4.3                      compiler_3.6.1               
## [19] BiocNeighbors_1.4.1           labeling_0.3                 
## [21] bookdown_0.16                 scales_1.1.0                 
## [23] rappdirs_0.3.1                stringr_1.4.0                
## [25] digest_0.6.23                 rmarkdown_2.0                
## [27] XVector_0.26.0                pkgconfig_2.0.3              
## [29] htmltools_0.4.0               highr_0.8                    
## [31] limma_3.42.0                  dbplyr_1.4.2                 
## [33] fastmap_1.0.1                 rlang_0.4.2                  
## [35] RSQLite_2.1.4                 shiny_1.4.0                  
## [37] DelayedMatrixStats_1.8.0      farver_2.0.1                 
## [39] dplyr_0.8.3                   RCurl_1.95-4.12              
## [41] magrittr_1.5                  BiocSingular_1.2.0           
## [43] GenomeInfoDbData_1.2.2        Matrix_1.2-18                
## [45] Rcpp_1.0.3                    ggbeeswarm_0.6.0             
## [47] munsell_0.5.0                 viridis_0.5.1                
## [49] lifecycle_0.1.0               edgeR_3.28.0                 
## [51] stringi_1.4.3                 yaml_2.2.0                   
## [53] zlibbioc_1.32.0               BiocFileCache_1.10.2         
## [55] AnnotationHub_2.18.0          grid_3.6.1                   
## [57] blob_1.2.0                    dqrng_0.2.1                  
## [59] promises_1.1.0                ExperimentHub_1.12.0         
## [61] crayon_1.3.4                  lattice_0.20-38              
## [63] locfit_1.5-9.1                zeallot_0.1.0                
## [65] pillar_1.4.2                  igraph_1.2.4.2               
## [67] glue_1.3.1                    BiocVersion_3.10.1           
## [69] evaluate_0.14                 BiocManager_1.30.10          
## [71] vctrs_0.2.1                   httpuv_1.5.2                 
## [73] gtable_0.3.0                  purrr_0.3.3                  
## [75] assertthat_0.2.1              xfun_0.11                    
## [77] rsvd_1.0.2                    mime_0.7                     
## [79] xtable_1.8-4                  later_1.0.0                  
## [81] viridisLite_0.3.0             pheatmap_1.0.12              
## [83] tibble_2.1.3                  AnnotationDbi_1.48.0         
## [85] beeswarm_0.2.3                memoise_1.1.0                
## [87] statmod_1.4.32                interactiveDisplayBase_1.24.0

References

Aran, D., A. P. Looney, L. Liu, E. Wu, V. Fong, A. Hsu, S. Chak, et al. 2019. “Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage.” Nat. Immunol. 20 (2):163–72.

Benayoun, Bérénice A., Elizabeth A. Pollina, Param Priya Singh, Salah Mahmoudi, Itamar Harel, Kerriann M. Casey, Ben W. Dulken, Anshul Kundaje, and Anne Brunet. 2019. “Remodeling of epigenome and transcriptome landscapes with aging in mice reveals widespread induction of inflammatory responses.” Genome Research 29:697–709. https://doi.org/10.1101/gr.240093.118.

Grün, D., M. J. Muraro, J. C. Boisset, K. Wiebrands, A. Lyubimova, G. Dharmadhikari, M. van den Born, et al. 2016. “De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data.” Cell Stem Cell 19 (2):266–77.

Heng, Tracy S.P., Michio W. Painter, Kutlu Elpek, Veronika Lukacs-Kornek, Nora Mauermann, Shannon J. Turley, Daphne Koller, et al. 2008. “The immunological genome project: Networks of gene expression in immune cells.” Nature Immunology 9 (10):1091–4. https://doi.org/10.1038/ni1008-1091.

La Manno, G., D. Gyllborg, S. Codeluppi, K. Nishimura, C. Salto, A. Zeisel, L. E. Borm, et al. 2016. “Molecular Diversity of Midbrain Development in Mouse, Human, and Stem Cells.” Cell 167 (2):566–80.

Mabbott, Neil A., J. K. Baillie, Helen Brown, Tom C. Freeman, and David A. Hume. 2013. “An expression atlas of human primary cells: Inference of gene function from coexpression networks.” BMC Genomics 14. https://doi.org/10.1186/1471-2164-14-632.

Martens, Joost H A, and Hendrik G. Stunnenberg. 2013. “BLUEPRINT: Mapping human blood cell epigenomes.” Haematologica 98:1487–9. https://doi.org/10.3324/haematol.2013.094243.

Monaco, Gianni, Bernett Lee, Weili Xu, Seri Mustafah, You Yi Hwang, Christophe Carré, Nicolas Burdin, et al. 2019. “RNA-Seq Signatures Normalized by mRNA Abundance Allow Absolute Deconvolution of Human Immune Cell Types.” Cell Reports 26 (6):1627–1640.e7. https://doi.org/10.1016/j.celrep.2019.01.041.

Muraro, M. J., G. Dharmadhikari, D. Grün, N. Groen, T. Dielen, E. Jansen, L. van Gurp, et al. 2016. “A Single-Cell Transcriptome Atlas of the Human Pancreas.” Cell Syst 3 (4):385–94.

Novershtern, Noa, Aravind Subramanian, Lee N. Lawton, Raymond H. Mak, W. Nicholas Haining, Marie E. McConkey, Naomi Habib, et al. 2011. “Densely Interconnected Transcriptional Circuits Control Cell States in Human Hematopoiesis.” Cell 144 (2):296–309. https://doi.org/10.1016/j.cell.2011.01.004.

Schmiedel, Benjamin J., Divya Singh, Ariel Madrigal, Alan G. Valdovino-Gonzalez, Brandie M. White, Jose Zapardiel-Gonzalo, Brendan Ha, et al. 2018. “Impact of Genetic Polymorphisms on Human Immune Cell Gene Expression.” Cell 175 (6):1701–1715.e16. https://doi.org/10.1016/j.cell.2018.10.022.

The ENCODE Project Consortium. 2012. “An integrated encyclopedia of DNA elements in the human genome.” Nature. https://doi.org/10.1038/nature11247.