CaseStudy_Bassez2021.Rmd

---
title: Exhausted CD8 T cells are associated with ICB responsiveness in breast cancer
date: "`r Sys.Date()`"
author: "M. Andreatta and S. Carmona"
output:
  rmdformats::readthedown:
    self-contained: true
    highlight: haddock
    thumbnails: false
    css: styles.css
knit: (function(input_file, encoding) {
  out_dir <- 'docs';
  rmarkdown::render(input_file, encoding=encoding,
  output_file=file.path(dirname(input_file), out_dir, 'Bassez_BC.html'))})
---

```{r setup, echo=FALSE, warning=FALSE}
#install.packages("rmdformats")
#Template markdown setup
library(knitr)
library(rmdformats)
library(formatR)

## Global options
options(max.print="75")
opts_chunk$set(echo=TRUE,
	             cache=TRUE,
               cache.lazy=FALSE,
               prompt=FALSE,
               tidy=TRUE,
               comment=NA,
               message=FALSE,
               warning=FALSE,
               dev='png')
opts_knit$set(width=75)

```

# Background

The single-cell dataset generated by [Bassez et al. (2021) Nature Medicine](https://www.nature.com/articles/s41591-021-01323-8) contains single-cell data from breast carcinoma biopsies from 29 patients taken immediately before (pre-treatment) and 9 +/- 2 days after (on-treatment) **receiving anti-PD1 immunotherapy**. Lacking an objective measure of clinical benefit following therapy, in this study the authors use T cell expansion following treatment as a proxy for response.

[Full Data Original Source](https://biokey.lambrechtslab.org/en/data-access)

[Case Study Dataset Internal link](https://figshare.com/ndownloader/files/47900725)

# Goal
To compare the T cell subtype composition between responders and non-responders using `ProjecTILs` and reference maps for [CD8 T cells](https://doi.org/10.6084/m9.figshare.21931875) and [CD4 T cells](https://doi.org/10.6084/m9.figshare.21981536).

# R Environment

```{r, message=FALSE, warning=F, results=FALSE}
library(renv)
renv::restore()

library(patchwork)
library(ggplot2)
library(reshape2)
library(Seurat)
library(ProjecTILs)
```

# scRNA-seq data preparation

```{r}
options(timeout = 5000)
ddir <- "input/Bassez_breast_data"

if (!file.exists(ddir)){
    dir.create(ddir)
    dataUrl <- "https://figshare.com/ndownloader/files/47900725"
    download.file(dataUrl, paste0(ddir,"/tmp.zip"))
    unzip(paste0(ddir,"/tmp.zip"),exdir = ddir)
    file.remove(paste0(ddir,"/tmp.zip"))
}
#Count matrices
f1 <- sprintf("%s/1864-counts_tcell_cohort1.rds", ddir)

cohort1 <- readRDS(f1)
dim(cohort1)

#Metadata
meta1 <- read.csv(sprintf("%s/1870-BIOKEY_metaData_tcells_cohort1_web.csv", ddir))
rownames(meta1) <- meta1$Cell

head(meta1)

data.seurat <- CreateSeuratObject(cohort1, project="Cohort1_IT", meta.data = meta1)
data.seurat <- NormalizeData(data.seurat)

```


Subset pre-treatment biopsies
```{r}
table(data.seurat$timepoint)
data.seurat <- subset(data.seurat, subset=timepoint=="Pre")
```

Subset T cells according to annotation by the authors
```{r}
table(data.seurat$cellSubType)
data.seurat <- subset(data.seurat, subset=cellSubType %in% c("NK_REST","Vg9Vd2_gdT","gdT","NK_CYTO"), invert=T)
```


We will remove samples that are too small and downsample large ones, in order to have similar contribution from all patients/samples. Downsampling large samples also speeds up downstream computations.

```{r}
ds <- 1000
min.cells <- 200

tab <- table(data.seurat$patient_id)
keep <- names(tab)[tab > min.cells]
data.seurat <- subset(data.seurat, patient_id %in% keep)


Idents(data.seurat) <- "patient_id"
data.seurat <- subset(data.seurat, cells = WhichCells(data.seurat,downsample = ds))
table(data.seurat$patient_id)
```

# ProjecTILs analysis

## Load reference maps

Here we will project the query data onto human reference TIL atlas for CD4 and CD8 T cells, with the goal to interpret T cell diversity in the context of reference T cell subtypes. First, load the reference maps:
```{r fig.height=4, fig.width=9}
download.file("https://figshare.com/ndownloader/files/41414556", destfile = "CD8T_human_ref_v1.rds")
ref.cd8 <- load.reference.map("CD8T_human_ref_v1.rds")

download.file("https://figshare.com/ndownloader/files/39012395", destfile = "CD4T_human_ref_v1.rds")
ref.cd4 <- load.reference.map("CD4T_human_ref_v1.rds")

a <- DimPlot(ref.cd8, cols=ref.cd8@misc$atlas.palette, label = T) +
  theme(aspect.ratio = 1) + ggtitle("CD8 T reference") + NoLegend()

b <- DimPlot(ref.cd4, cols=ref.cd4@misc$atlas.palette, label = T) +
  theme(aspect.ratio = 1) + ggtitle("CD4 T reference") + NoLegend()

a | b
```

Show key marker genes for the reference subtypes
```{r fig.height=4, fig.width=10}
DefaultAssay(ref.cd8) <- "RNA"
Idents(ref.cd8) <- "functional.cluster"

genes <- c("SELL","TCF7","LEF1","CCR7","S1PR1","LMNA","IL7R","GZMK",
           "FGFBP2",'FCGR3A','XCL1',"XCL2","CD200","CRTAM","GNG4","TOX",'PDCD1',"HAVCR2",
           "GZMB","PRF1","LAG3",'KLRB1','TRAV1-2')

DotPlot(ref.cd8, features = genes, cols="RdBu", scale=T, col.max=1.5) +
  theme(axis.text.x = element_text(angle = 45, hjust=1)) + ggtitle("CD8 T cell reference map")
```

```{r fig.height=4, fig.width=10}
DefaultAssay(ref.cd4) <- "RNA"
Idents(ref.cd4) <- "functional.cluster"

genes <- c("SELL","TCF7","S1PR1","KLF2","IL7R","CCL5","GZMK","EOMES",
           "TBX21","CXCR6","GZMA","GZMB","PRF1","GNLY",
           "CXCL13","PDCD1","TOX","LAG3","HAVCR2","KLRB1","IL17A","FOXP3","IL2RA")

DotPlot(ref.cd4, features = genes, cols="RdBu", scale=T, col.max=1.5) +
  theme(axis.text.x = element_text(angle = 45, hjust=1)) + ggtitle("CD4 T cell reference map")
```

## Reference-based analysis

To classify CD4 and CD8 T cell subtypes, we will project CD4 and CD8 T cells separately into the two reference maps. Running `ProjecTILs.classifier` sequentially on the two maps allows combining the subtype predictions for a complete annotation of T cells subtypes.

For optimal batch-effect correction, we recommend projecting each patient/batch separately (set `split.by="patient_id"`). For large samples, `Run.ProjecTILs` can take several minutes. Set the number of parallel jobs with *ncores* according to your computational resources.

```{r message=F, warnings=F}
#Classify CD8 T subtypes
ncores=8
DefaultAssay(ref.cd8) <- "integrated"
data.seurat <-  ProjecTILs.classifier(data.seurat, ref.cd8,
                                  ncores=ncores, split.by = "patient_id")
```

After running the classifier with the CD8T reference, we obtain CD8 T cell subtype annotations:
```{r}
table(data.seurat$functional.cluster, useNA="ifany")
```

We can now run the classifier on the same object with the CD4 T reference, specifying the parameter `overwrite=FALSE`. This adds labels for the CD4 T cells without replacing the CD8 T cell subtypes. Note that only cells that can be unequivocally assigned to one cell type are labeled; the remaining cells (including those that receive a label by more than one reference) are assigned to NA.
```{r message=F, warnings=F}
#Classify CD4 T subtypes
DefaultAssay(ref.cd4) <- "integrated"
data.seurat <-  ProjecTILs.classifier(data.seurat, ref.cd4,
                                  ncores=ncores, split.by = "patient_id",
                                  overwrite = FALSE)
```

The labels now contain both CD4T and CD8T subtypes:
```{r}
table(data.seurat$functional.cluster, useNA="ifany")
```


We can verify the expression profile using a panel of marker genes.
```{r fig.height=9, fig.width=13}
genes4radar <- c("CD4","CD8A","TCF7","CCR7","IL7R","LMNA","GZMA","GZMK","FGFBP2","XCL1","CD200",
                 "CRTAM","TOX","PDCD1","HAVCR2","PRF1","GLNY","GZMB","TRAV1-2","KLRB1","FOXP3")

plot.states.radar(ref=ref.cd8, query=data.seurat,
                  genes4radar = genes4radar,
                  min.cells=20)
plot.states.radar(ref=ref.cd4, query=data.seurat, genes4radar = genes4radar, min.cells=20)
```


Compare subtype composition of responders vs. non-responders (based on clonal expansion) on pre-therapy samples
```{r fig.height=4, fig.width=6.5}
a <- ref.cd8@misc$atlas.palette
b <- ref.cd4@misc$atlas.palette
cols <- c(a,b)

by.exp <- SplitObject(data.seurat, split.by = "expansion")

tb <-  table(factor(by.exp$E$functional.cluster, levels = names(cols)))
tb <- tb * 100/sum(tb)
tb.m <- reshape2::melt(tb)
colnames(tb.m) <- c("Cell_state", "Perc_cells")

p1 <- ggplot(tb.m, aes(x = Cell_state, y = Perc_cells, 
            fill = Cell_state)) + geom_bar(stat = "identity") + 
            theme_bw() + scale_fill_manual(values = cols) +
            theme(axis.text.x = element_blank(), legend.position = "none") +
            ggtitle("Responders") + ylim(0,60)


tb <-  table(factor(by.exp$NE$functional.cluster, levels = names(cols)))
tb <- tb * 100/sum(tb)
tb.m <- reshape2::melt(tb)
colnames(tb.m) <- c("Cell_state", "Perc_cells")

p2 <- ggplot(tb.m, aes(x = Cell_state, y = Perc_cells, 
            fill = Cell_state)) + geom_bar(stat = "identity") + 
            theme_bw() + scale_fill_manual(values = cols) +
            theme(axis.text.x = element_blank(), legend.position = "right") +
            ggtitle("Non-Responders") + ylim(0,60)

p1 | p2
```

The analysis shows that ICB-responding tumors have a higher proportion of CD8 exhausted (CD8.TEX), CD8 precursor exhausted (CD8.TPEX) and CD4 cytotoxic exhausted (CD4.CTL_Exh) T cells; on the other hand, CD4 Naive-like and CD8 TEMRA are more abundant in non-responders.

A more compact way of displaying enrichment/depletion of specific subtypes is to calculate and visualize **fold-changes** of TIL composition in **responders vs non-responders**

```{r fig.height=3.5, fig.width=5}
which.types <- table(data.seurat$functional.cluster) > 50  # disregard small populations
which.types <- names(which.types)[which.types]

states_all <- names(cols)

cols_use <- cols[which.types]

norm.c <- table(by.exp$NE$functional.cluster)/sum(table(by.exp$NE$functional.cluster))
norm.q <- table(by.exp$E$functional.cluster)/sum(table(by.exp$E$functional.cluster))

foldchange <- norm.q[which.types]/norm.c[which.types]
foldchange <- sort(foldchange, decreasing = T)

tb.m <- reshape2::melt(foldchange)
colnames(tb.m) <- c("Cell_state", "Fold_change")
ggplot(tb.m, aes(x = Cell_state, y = Fold_change, fill = Cell_state)) +
  geom_bar(stat = "identity") +
    scale_fill_manual(values = cols_use) + theme_bw() +
  geom_hline(yintercept = 1) + scale_y_continuous(trans = "log2") +
    theme(axis.text.x = element_blank(), legend.position = "left") +
  ggtitle("Responders vs. Non-responders")
```

Tumors in ICB-responders have a >10 fold-change increase in **CD4 exhausted CTL** and **CD8 exhausted T cells**.

# Correspondence analysis of subtype composition

Do patients cluster based on their subtype composition? are subtype composition "classes" related to response to ICB? we can perform correspondence analysis (CA) to try to answer these questions, and group patients based on their composition of T cell subtypes.

```{r fig.width=12, fig.height=7}
library("FactoMineR")
min.cells <- 30

#Only use samples with sufficient amount of cells
tab <- table(data.seurat$patient_id)
samples.use <- names(tab)[tab>min.cells]
data.seurat <- subset(data.seurat, subset = patient_id %in% samples.use)

subtype_comp_table <- prop.table(table(data.seurat$functional.cluster,
                                       data.seurat$patient_id), margin = 2)
res.CA <- CA(t(subtype_comp_table), graph = FALSE)
a <- plot(res.CA) + theme(aspect.ratio = 1) 

#Color by response
coords <- as.data.frame(res.CA$row$coord)
coords.ct <- as.data.frame(res.CA$col$coord)
coords.all <- rbind(coords[,1:5], coords.ct[,1:5])

is.R <- unique(data.seurat$patient_id[data.seurat$expansion=='E'])
is.NR <- unique(data.seurat$patient_id[data.seurat$expansion=='NE'])
coords[['Responder']] <- "n/a"
coords$Responder[rownames(coords) %in% is.R] <- "Responder"
coords$Responder[rownames(coords) %in% is.NR] <- "Non-responder"
coords[['Patient']] <- rownames(coords)

b <- ggplot(coords, aes(x = `Dim 1`, y = `Dim 2`, color = Responder)) +
  geom_point(size=4) + theme_bw() + theme(aspect.ratio = 1) +
  scale_color_manual(values=c("#E9E9E9","#fb0505","#08e579")) +
  xlim(min(coords.all[,'Dim 1']), max(coords.all[,'Dim 1'])) +
  ylim(min(coords.all[,'Dim 2']), max(coords.all[,'Dim 2']))

a | b
```

# Discussion

Our automated ProjecTILs analysis of this patient cohort indicate that compared to non-responders, early responders to PD-1 blockade (in terms of T cell expansion) have higher relative amounts of exhausted T cells (both CD8+ and CD4+) infiltrating their tumors, in agreement with previous studies (Daud et al. 2016; Thommen et al. 2018; Kumagai et al. 2020; Bassez et al. 2020).  This supports the notion that patients with a pre-existing tumor-specific T cell response in their tumors are more likely to respond to PD-1 blockade therapy. 

Because tumoricidal CD8 Tex are derived from Tpex (Siddiqui et al. 2019; Miller et al. 2019), the presence of CD8 Tex seems to be a good proxy for the presence of smaller populations of quiescent Tpex, that have the ability to proliferate and differentiate into Tex cells following PD-1 blockade (Siddiqui et al. 2019; Miller et al. 2019). Indeed we observed a small population of Tpex-like cells, with increased levels R vs RN. Because there are very few Tpex their gene profile might be less robust than others, but compared to Tex they display higher levels of Tpex-associated genes, such as *XCL1* and *CD200*, and lower levels of terminal-exhaustion markers such as *HAVCR2* and *ENTPD1*.

# Further reading

Dataset original publication -  [Bassez et al](https://www.nature.com/articles/s41591-021-01323-8)

ProjecTILs case studies - [INDEX](https://carmonalab.github.io/ProjecTILs_CaseStudies/) - [Repository](https://github.com/carmonalab/ProjecTILs_CaseStudies)

The ProjecTILs method [Andreatta et. al (2021) Nat. Comm.](https://www.nature.com/articles/s41467-021-23324-4) and [code](https://github.com/carmonalab/ProjecTILs)


# References

* Bassez, A., Vos, H., Van Dyck, L. et al. A single-cell map of intratumoral changes during anti-PD1 treatment of patients with breast cancer. Nat Med 27, 820–832 (2021). https://doi.org/10.1038/s41591-021-01323-8

* Daud, A.I., Loo, K., Pauli, M.L., Sanchez-Rodriguez, R., Sandoval, P.M., Taravati, K., Tsai, K., Nosrati, A., Nardo, L., Alvarado, M.D., Algazi, A.P., Pampaloni, M.H., Lobach, I.V., Hwang, J., Pierce, R.H., Gratz, I.K., Krummel, M.F., Rosenblum, M.D., 2016. Tumor immune profiling predicts response to anti-PD-1 therapy in human melanoma. J. Clin. Invest. 126, 3447–3452. https://doi.org/10.1172/JCI87324

* Gros, A., Robbins, P.F., Yao, X., Li, Y.F., Turcotte, S., Tran, E., Wunderlich, J.R., Mixon, A., Farid, S., Dudley, M.E., Hanada, K., Almeida, J.R., Darko, S., Douek, D.C., Yang, J.C., Rosenberg, S. a, 2014. PD-1 identifies the patient-specific in filtrating human tumors. J. Clin. Invest. 124, 2246–59. https://doi.org/10.1172/JCI73639.2246

* Miller, B.C., Sen, D.R., Al Abosy, R., Bi, K., Virkud, Y.V., LaFleur, M.W., Yates, K.B., Lako, A., Felt, K., Naik, G.S., Manos, M., Gjini, E., Kuchroo, J.R., Ishizuka, J.J., Collier, J.L., Griffin, G.K., Maleri, S., Comstock, D.E., Weiss, S.A., Brown, F.D., Panda, A., Zimmer, M.D., Manguso, R.T., Hodi, F.S., Rodig, S.J., Sharpe, A.H., Haining, W.N., 2019. Subsets of exhausted CD8+ T cells differentially mediate tumor control and respond to checkpoint blockade. Nat. Immunol. 20, 326–336. https://doi.org/10.1038/s41590-019-0312-6

* Siddiqui, I., Schaeuble, K., Chennupati, V., Fuertes Marraco, S.A., Calderon-Copete, S., Pais Ferreira, D., Carmona, S.J., Scarpellino, L., Gfeller, D., Pradervand, S., Luther, S.A., Speiser, D.E., Held, W., 2019. Intratumoral Tcf1+PD-1+CD8+ T Cells with Stem-like Properties Promote Tumor Control in Response to Vaccination and Checkpoint Blockade Immunotherapy. Immunity 50, 195–211.e10. https://doi.org/10.1016/J.IMMUNI.2018.12.021

* Thommen, D.S., Koelzer, V.H., Herzig, P., Roller, A., Trefny, M., Dimeloe, S., Kiialainen, A., Hanhart, J., Schill, C., Hess, C., Prince, S.S., Wiese, M., Lardinois, D., Ho, P.C., Klein, C., Karanikas, V., Mertz, K.D., Schumacher, T.N., Zippelius, A., 2018. A transcriptionally and functionally distinct pd-1 + cd8 + t cell pool with predictive potential in non-small-cell lung cancer treated with pd-1 blockade. Nat. Med. 24, 994. https://doi.org/10.1038/s41591-018-0057-z

* Kumagai, S., Togashi, Y., Kamada, T. et al. The PD-1 expression balance between effector and regulatory T cells predicts the clinical efficacy of PD-1 blockade therapies. Nat Immunol 21, 1346–1358 (2020). https://doi.org/10.1038/s41590-020-0769-3