project_SadeFeldman.Rmd

---
title: TIL contexture predicts checkpoint blockade response in melanoma patients
date: "`r Sys.Date()`"
author: "M. Andreatta and S. Carmona"
output:
  rmdformats::readthedown:
    self-contained: true
    highlight: haddock
    thumbnails: false
    css: styles.css
knit: (function(input_file, encoding) {
  out_dir <- 'docs';
  rmarkdown::render(input_file, encoding=encoding,
  output_file=file.path(dirname(input_file), out_dir, 'SadeFeldman_ortho.html'))})
---


```{r setup, echo=FALSE, warning=FALSE}
#install.packages("rmdformats")
#Template markdown setup
library(knitr)
library(rmdformats)
library(formatR)

## Global options
options(max.print="75")
opts_chunk$set(echo=TRUE,
	             cache=TRUE,
               cache.lazy=FALSE,
               prompt=FALSE,
               tidy=TRUE,
               comment=NA,
               message=FALSE,
               warning=FALSE,
               dev='png')
opts_knit$set(width=75)

```


In this case study, we will use ProjecTILs to interpret **human** scRNA-seq T cell data in the context of a **murine TIL atlas**. We are going to use the single-cell dataset generated by Sade-Feldman [GSE120575](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120575) to illustrate **interspecies mapping** of human cells on a stable murine atlas using gene orthologs.

# Background

It remains unclear why some patients respond to checkpoint blockade therapy while others do not.

In the study by [Sade-Feldman et al. (2018) Cell](https://pubmed.ncbi.nlm.nih.gov/30388456/), the authors characterize transcriptional profiles of immune cells from melanoma patients before and after immune checkpoint blockade, with the goal to identify factors that associate with success or failure of checkpoint therapy. They reported that the balance between two CD8 T cell states found in tumor tissue (essentially, memory-like vs exhausted-like) is linked to tumor regression following checkpoint blockade. And in particular, that the frequency of TCF7+ CD8+ T cells in tumors predicts response and better survival.

The single-cell expression data [GSE120575](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120575) consists of CD45+ single cells from 48 tumor biopsies of melanoma patients before or after checkpoint inhibitors treatment, sequenced using the Smart-seq2 protocol. Meta-data on response to therapy (Responder vs. Non-responder) is also available on the same GEO identifier.


# R Environment
Check & load R packages
```{r message=F, warning=F, results=F, cache=F}
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS="true")

library(renv)
renv::restore()

library(Seurat)
library(ggplot2)
library(ProjecTILs)
library(gridExtra)
library(reshape2)
```

# scRNA-seq data preparation

Download the count matrix and metadata from Gene Expression Omnibus (GEO), and store as Seurat object.
```{r message=F, warning=F, cache=F}
cached.object <- "SadeFeldman.seurat.rds"

if(!file.exists(cached.object)){
  
  library(GEOquery)
  geo_acc <- "GSE120575"
  datadir <- "input/SadeFeldman"
  gse <- getGEO(geo_acc)
  
  series <- paste0(geo_acc, "_series_matrix.txt.gz")
  
  system(paste0("mkdir -p ", datadir))
  getGEOSuppFiles(geo_acc,baseDir=datadir)
  
  ##Load expression matrix and metadata
  exp.mat <- read.delim(sprintf("%s/%s/GSE120575_Sade_Feldman_melanoma_single_cells_TPM_GEO.txt.gz", datadir, geo_acc), header=F, sep="\t")
  genes <- exp.mat[c(-1,-2),1]
  cells <- as.vector(t(exp.mat[1,2:16292]))
  samples <- as.factor(t(exp.mat[2,2:16292]))
  
  exp.mat <- exp.mat[c(-1,-2), 2:16292]
  colnames(exp.mat) <- cells
  rownames(exp.mat) <- genes
  
  meta <- read.delim(sprintf("%s/%s/GSE120575_patient_ID_single_cells.txt.gz", datadir, geo_acc), header = T, sep = "\t", skip = 19, nrows = 16291)
  meta <- meta[,1:7]
  
  treat <- factor(ifelse(grepl("Post", samples),'Post','Pre'))
  response <- factor(meta$characteristics..response)
  therapy <- factor(meta$characteristics..therapy)
  
  ##Create Seurat object and add meta data
  query.object <- CreateSeuratObject(counts = exp.mat, project = "SadeFeldman", min.cells = 10)
  rm(exp.mat)
  query.object@meta.data$Sample <- samples
  query.object@meta.data$Time <- treat
  query.object@meta.data$Response <- response
  query.object@meta.data$Therapy <- therapy
  
  saveRDS(query.object, file=cached.object)
} else {
  query.object <- readRDS(cached.object)
}
```

Some basic statistics - cells per group (Pre vs. Post, Therapy, Responder vs. Non-responder)
```{r}
query.object <- UpdateSeuratObject(query.object)
table(query.object$Time)
table(query.object$Therapy)
table(query.object$Response)
```

Select only baseline (Pre-treatment) samples
```{r}
query.object <- subset(query.object, subset = Time == "Pre")
```


# Reference projection

Load reference TIL map - if it's not present in the working directory, it will be downloaded from the repository
```{r}
ref <- load.reference.map()
```

Run ProjecTILs projection algorithm - note that human genes will be converted to mouse orthologs. Also, non-T cells are automatically detected and removed before projection.
```{r message=FALSE, warning=FALSE}
query.projected <- Run.ProjecTILs(query.object, ref=ref, split.by = "Sample", ncores=4)
```

Plot global projection of human TIL data over the reference in UMAP space.
```{r, cache=F}
plot.projection(ref, query.projected, linesize = 0.5, pointsize = 0.5)
```

Interestingly, expression of marker genes of projected human TILs correspond fairly well to those of the murine reference map: e.g. Pdcd1, Havcr2 and Entpd1 expression in terminally exhausted CD8 TILs (CD8_Tex), co-expression of Pdcd1, Tox,  and Tcf7 (at a modest level) in the CD8_Tpex state with a higher expression of Ifng; high expression of Tcf7, Ccr7 with low expression of Pdcd1 or cytotoxic molecules in the Naive-like states (that might include central memory cells); Cxcr5 and Tox coexpression in follicular helper T cells; Foxp3, Tox, Havcr2, Entpd1 in regulatory CD4 T cells, etc.
```{r, fig.width=16, fig.height=10, cache=FALSE}
query.list <- SplitObject(query.projected, split.by = "Response")
plot.states.radar(ref, query=query.list, min.cells = 50, 
                  genes4radar = c("Foxp3","Cd4","Cd8a","Tcf7","Ccr7","Gzmb","Pdcd1",
                                  "Havcr2","Tox","Entpd1","Cxcr5","Ifng","Cxcl13","Xcl1","Itgae"))

```

Note that projections and comparisons are performed in the ortholog space of murine genes - to check the names of human-mouse orthologs you can examine the conversion table for genes of interest:
```{r}
data(Hs2Mm.convert.table)
which.genes <- c("TCF7","GZMB","CD8B","PDCD1","ITGAE")

Hs2Mm.convert.table[Hs2Mm.convert.table$Gene.HS %in% which.genes, ]
```

# Response to therapy

Now to the interesting part.

Let's visualize the projection and TIL contexture of tumors that responded vs. did not respond to immune checkpoint blockade:
```{r fig.height=12, fig.width=12, fig.align="center",  cache=FALSE}
query.list <- SplitObject(query.projected, split.by = "Response")

pll <- list()

pll[[1]] <- plot.projection(ref, query.list[["Responder"]],
                            linesize = 0.5, pointsize = 0.5) + ggtitle("Responder") + NoLegend()
pll[[2]] <- plot.statepred.composition(ref, query.list[["Responder"]],
                                       metric="Percent") +ggtitle("Responder") + ylim(0,40)
pll[[3]] <- plot.projection(ref, query.list[["Non-responder"]],
                            linesize = 0.5, pointsize = 0.5) + ggtitle("Non-responder") + NoLegend()
pll[[4]] <- plot.statepred.composition(ref, query.list[["Non-responder"]],
                                       metric="Percent") + ggtitle("Non-responder") + ylim(0,40)

grid.arrange(grobs=pll, ncol=2, nrow=2, widths=c(1.5,1))
```


In non-responders, there is a clear enrichment in the terminally exhausted CD8 state (CD8_Tex), while responders are enriched in Naive-like states.

To better examine differences in TIL contexture between responders and non-responders, we can visualize the fold-change of T cell state frequency between the two groups:
```{r fig.heigh=8, fig.width=5, fig.align="center",  cache=FALSE}

which.types <- table(query.projected$functional.cluster)>20

stateColors_func <- c("#edbe2a","#A58AFF","#53B400","#F8766D","#00B6EB","#d1cfcc","#FF0000","#87f6a5","#e812dd")
states_all <- levels(ref$functional.cluster)
names(stateColors_func) <- states_all
cols_use <- stateColors_func[names(which.types)][which.types]

#Responder vs non Responder
query.projected$functional.cluster <- factor(query.projected$functional.cluster, levels=states_all)
query.list <- SplitObject(query.projected, split.by = "Response")

norm.c <- table(query.list[["Non-responder"]]$functional.cluster)/sum(table(query.list[["Non-responder"]]$functional.cluster))
norm.q <- table(query.list[["Responder"]]$functional.cluster)/sum(table(query.list[["Responder"]]$functional.cluster))

foldchange <- norm.q[which.types]/norm.c[which.types]
foldchange <- sort(foldchange,decreasing = T)

tb.m <- reshape2::melt(foldchange)
colnames(tb.m) <- c("Cell_state","Fold_change")
pll <- list()
ggplot(tb.m, aes(x=Cell_state, y=Fold_change, fill=Cell_state)) + geom_bar(stat="identity") +
  scale_fill_manual(values=cols_use) + geom_hline(yintercept = 1) + scale_y_continuous(trans='log2')  + ggtitle("Responder vs. Non-responder") + theme_bw() +
  theme(axis.text.x=element_blank(), legend.position="left")

```

Indeed, CD4 and CD8 Naive-like states are the most enriched in responders compared to non-responders, while terminally exhausted Entpd1+ Havcr2+ CD8 TILs are the most under-represented, confirming the observation of the original paper that a higher frequency of TCF7+ CD8 TILs is associated with response to immune checkpoint therapy. 

# Conclusions

Taking advantage of the ortholog mapping functionality of `ProjecTILs`, we have illustrated how to effortlessly analyze human scRNA-seq data in the context of a reference murine TIL map. Gene expression profiles confirmed that T cells are accurately projected in major CD4+ and CD8+ categories, as well as in more specific subtypes (CD8_Tex, CD8_Tpex, Naive-like, Follicular helper, Th1, and T regulatory CD4+ cells). 

Comparison of transcriptional profiles and cell states of TILs at baseline from responding vs non-responding melanoma patients confirmed the original observation by Sade-Feldman et al. that the **frequency of TCF7+ CD8 TILs correlates with checkpoint therapy responsiveness**.

However, the TCF7+ CD8 TIL population associated to response seems to correspond to a **naive-like state and not to the (PD-1+ TOX+) Precursor exhausted state** previously characterized in murine cancer and chronic infection models (Siddiqui et al. 2019; Miller et al. 2019). 

Moreover, this naive-like TIL population is **unlikely to be tumor-specific**, especially in the absence of strong evidence for clonal expansion, as the frequency of tumor-reactive cells among PD-1- CD8 TILs has been shown to be very low in melanoma tumors (Gros et al. 2014).

This might seem at odds with other studies showing that the presence of PD-1+ 'exhausted-like' CD8 T cells are predictive of response to checkpoint blockade in melanoma and non-small cell lung cancer (Daud et al. 2016; Thommen et al. 2018). However, in this dataset most tumors from both responders and non-responders did contain a pool of exhausted-like TILs expressing PD-1 (PDCD1), CD39 (ENTPD1), CXCL13, CD103 (ITGAE), even if its frequency was higher among non-responders.

Therefore, the presence of a pool of PD-1+ CD8 T cells in the tumor might be required, but not sufficient, for therapeutic success. Other factors, including total amount of CD8 TIL infiltration, spatial distribution, etc. might be correlated with the presence of naive-like CD8 TILs and with improved response. For example, we also observed that responding tumors had higher frequencies of Th1-like and naive-like CD4 T cells and lower frequency of regulatory CD4 T cells, which might also contribute to the improved anti-tumor response following immunotherapy.

In summary, ProjecTILs analysis of human T cell data in the context of a stable atlas provides a stable framework to compare samples across groups and conditions, and gives a more complete picture of the T cell states that are associated with immunotherapy response. While we have shown that robust ortholog signals can be extracted by projection of human data onto a reference mouse atlas, human-mouse mapping will also be beneficial towards the construction of stable human atlases, where inter-individual variability represents a major hurdle. Human T cell reference maps are now also available for [CD8 T cells](https://doi.org/10.6084/m9.figshare.21931875) and [CD4 T cells](https://doi.org/10.6084/m9.figshare.21981536) - you may experiment with analyzing these data also in the context of these maps (see also [this case study](https://carmonalab.github.io/ProjecTILs_CaseStudies/Bassez_BC.html)).


# Further reading

Dataset original publication - [Sade-Feldman et al. (2018) Cell](https://pubmed.ncbi.nlm.nih.gov/30388456/)

ProjecTILs case studies - [INDEX](https://carmonalab.github.io/ProjecTILs_CaseStudies/) - [Repository](https://github.com/carmonalab/ProjecTILs_CaseStudies)

The ProjecTILs method [Andreatta et. al (2021) Nat. Comm.](https://www.nature.com/articles/s41467-021-23324-4) and [code](https://github.com/carmonalab/ProjecTILs)


# References


* Daud, A.I., Loo, K., Pauli, M.L., Sanchez-Rodriguez, R., Sandoval, P.M., Taravati, K., Tsai, K., Nosrati, A., Nardo, L., Alvarado, M.D., Algazi, A.P., Pampaloni, M.H., Lobach, I.V., Hwang, J., Pierce, R.H., Gratz, I.K., Krummel, M.F., Rosenblum, M.D., 2016. Tumor immune profiling predicts response to anti-PD-1 therapy in human melanoma. J. Clin. Invest. 126, 3447–3452. https://doi.org/10.1172/JCI87324

* Gros, A., Robbins, P.F., Yao, X., Li, Y.F., Turcotte, S., Tran, E., Wunderlich, J.R., Mixon, A., Farid, S., Dudley, M.E., Hanada, K., Almeida, J.R., Darko, S., Douek, D.C., Yang, J.C., Rosenberg, S. a, 2014. PD-1 identifies the patient-specific in filtrating human tumors. J. Clin. Invest. 124, 2246–59. https://doi.org/10.1172/JCI73639.2246

* Miller, B.C., Sen, D.R., Al Abosy, R., Bi, K., Virkud, Y.V., LaFleur, M.W., Yates, K.B., Lako, A., Felt, K., Naik, G.S., Manos, M., Gjini, E., Kuchroo, J.R., Ishizuka, J.J., Collier, J.L., Griffin, G.K., Maleri, S., Comstock, D.E., Weiss, S.A., Brown, F.D., Panda, A., Zimmer, M.D., Manguso, R.T., Hodi, F.S., Rodig, S.J., Sharpe, A.H., Haining, W.N., 2019. Subsets of exhausted CD8+ T cells differentially mediate tumor control and respond to checkpoint blockade. Nat. Immunol. 20, 326–336. https://doi.org/10.1038/s41590-019-0312-6

* Siddiqui, I., Schaeuble, K., Chennupati, V., Fuertes Marraco, S.A., Calderon-Copete, S., Pais Ferreira, D., Carmona, S.J., Scarpellino, L., Gfeller, D., Pradervand, S., Luther, S.A., Speiser, D.E., Held, W., 2019. Intratumoral Tcf1+PD-1+CD8+ T Cells with Stem-like Properties Promote Tumor Control in Response to Vaccination and Checkpoint Blockade Immunotherapy. Immunity 50, 195–211.e10. https://doi.org/10.1016/J.IMMUNI.2018.12.021

* Thommen, D.S., Koelzer, V.H., Herzig, P., Roller, A., Trefny, M., Dimeloe, S., Kiialainen, A., Hanhart, J., Schill, C., Hess, C., Prince, S.S., Wiese, M., Lardinois, D., Ho, P.C., Klein, C., Karanikas, V., Mertz, K.D., Schumacher, T.N., Zippelius, A., 2018. A transcriptionally and functionally distinct pd-1 + cd8 + t cell pool with predictive potential in non-small-cell lung cancer treated with pd-1 blockade. Nat. Med. 24, 994. https://doi.org/10.1038/s41591-018-0057-z