Introduction
The goal of this tutorial is to show an example where dynamic logic model is used to predict drug response.
The tutorial is based on the paper
Eduati et al (2017) Drug resistance mechanisms in colorectal cancer dissected with cell type-specific dynamic logic models. Cancer Research. DOI: 10.1158/0008-5472.CAN-17-0078
On Github: https://github.com/saezlab/CRC-pathway-biomarkers
We investigate here the drug-response of colorectal cancer cell lines. For this, we use drug response data and a signaling dataset.
The Genomics of Drug Sensitivity in Cancer (GDSC), https://www.cancerrxgene.org/ offers drug response data for more than a 1000 human cancer cell lines, for hundreds of drugs. A small part of these data is can be found in "./data/IC50_GDSC.csv".
The perturbation dataset contains the short time signaling response of 14 colorectal cancer cell lines, where 14 phosphoproteins are measured under 43 perturbation conditions (5 stimuli, 7 inhibitors).
First, we construct signaling models based on the perturbation data to the cell lines, here we use the CNORode modelling package. In the next step, we will associate model features to drug response to see why certain cell lines respond to certain drugs and others do not. Here we use a linear modeling framework.
CNORode
CNORode is a member of the CellNOptR logic based tool family. It can translate the network to ordinary differential equation (ODE) model, fit the model parameters to data and make predictions.
Dependencies
# To install the packages from Bioconductor you will need the package BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager", repos = "https://ftp.gwdg.de/pub/misc/cran/")
}
# installs CellNOptR, CNORode and MEIGOR from Bioconductor
if (!requireNamespace("CellNOptR", quietly = TRUE)) {
BiocManager::install("CellNOptR")
}
if (!requireNamespace("CNORode", quietly = TRUE)) {
BiocManager::install("CNORode")
}
if (!requireNamespace("MEIGOR", quietly = TRUE)) {
BiocManager::install("MEIGOR")
}
# installs data manipulation packages from the tidyverse
if (!requireNamespace("dplyr", quietly = TRUE)) {
install.packages("dplyr", repos = "https://ftp.gwdg.de/pub/misc/cran/")
}
if (!requireNamespace("readr", quietly = TRUE)) {
install.packages("readr", repos = "https://ftp.gwdg.de/pub/misc/cran/")
}
if (!requireNamespace("tidyr", quietly = TRUE)) {
install.packages("tidyr", repos = "https://ftp.gwdg.de/pub/misc/cran/")
}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
install.packages("ggplot2", repos = "https://ftp.gwdg.de/pub/misc/cran/")
}
Make sure to import the libraries
library(CellNOptR)
library(CNORode)
library(MEIGOR)
library(dplyr)
library(tidyr)
library(readr)
library(ggplot2)
PART I: DRUG response exploration
IC50 <- read_csv("./data/tutorial_3/IC50_GDSC.csv") %>%
rename("cell_line" = "...1")
print(IC50)
## # A tibble: 14 × 31
## cell_line Gefitinib RDEA119 `CI-1040` Afatinib `Nutlin-3a` `PD-0325901`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 CAR1 1.36 1.34 3.27 -0.831 5.51 -0.235
## 2 CCK81 -0.659 NA 2.74 -1.61 0.551 -1.72
## 3 COLO320HSR 1.46 3.78 4.15 1.22 4.26 0.658
## 4 DIFI -3.26 0.00908 1.61 -5.31 4.44 -1.97
## 5 HCT116 1.03 -0.510 1.46 -0.0704 1.14 -1.56
## 6 HT115 0.444 0.946 2.96 -0.640 4.55 -1.41
## 7 HT29 1.91 -2.24 0.230 0.849 4.68 -3.74
## 8 SKCO1 NA -1.49 NA NA 1.97 -4.58
## 9 SNUC2B 0.375 1.55 1.94 0.113 3.45 -1.44
## 10 SNUC5 2.13 0.246 1.79 0.739 5.01 -3.26
## 11 SW1116 NA NA NA NA NA NA
## 12 SW1463 NA NA NA NA NA NA
## 13 SW620 1.51 -0.802 -0.104 1.36 4.28 -3.73
## 14 SW837 NA NA NA NA NA NA
## # … with 24 more variables: SB590885 <dbl>, AZD6244 <dbl>, BMS-536924 <dbl>,
## # Cetuximab <dbl>, HG-5-88-01 <dbl>, (5Z)-7-Oxozeaenol <dbl>,
## # Trametinib <dbl>, Dabrafenib <dbl>, Afatinib (rescreen) <dbl>,
## # AZD6244.1 <dbl>, RDEA119 (rescreen) <dbl>, BMS-754807 <dbl>, OSI-906 <dbl>,
## # BMS-345541 <dbl>, Ruxolitinib <dbl>, LY317615 <dbl>, XMD14-99 <dbl>,
## # CP724714 <dbl>, CH5424802 <dbl>, EKB-569 <dbl>, OSI-930 <dbl>,
## # QL-XI-92 <dbl>, EX-527 <dbl>, THZ-2-102-1 <dbl>
IC50 %>%
pivot_longer(names_to = "drug_name", values_to = "log_IC50", -cell_line) %>%
ggplot() +
geom_point(aes(drug_name, log_IC50, col = cell_line)) +
coord_flip() +
theme_bw() +
ggtitle("Drug response of 14 CRC cell lines")
## Warning: Removed 49 rows containing missing values (geom_point).
Form the raw IC50 values we can see that there are some drugs that are more effective (Trametinib) than others, like XMD14-99. There are also cell-line differences, for example, DIFI shows stronger sensitivity to Afatinib than any other cell lines. What could be the reason for this?
PART II: cell-line models
The goal of part II is to build a cell-line specific model from the perturbation data using CNORode.
This model is an ordinary differential equation (ODE) model, where the equation for each state (\(x_A\)) can be written as \[\frac{dx_A}{dt} = \tau_A(B(f_1(x),f_2(x),...)-x_A) \] here
\(f_i(x)\) represents the incoming edges on node \(A\) with a transfer function. This transfer function typically has an S-shape. \[ f(x) = \frac{x^n}{x^n+k^n}\]
\(B\) is a Boolean homologue function. This is responsible to combine the incoming edges with the OR and AND gates. For example, an OR gate is represented by \(x_1\cdot x_2\).
\(\tau\) is a time parameter, that tells how fast node A adapts to the input.
the model has free parameters: a \(\tau\) for each node, and (\(k\),\(n\)) for each edge. These are found by optimisation.
The main differences are that in ODE models the states are continuous values, therefore it is quantitative, not only qualitative like a Boolean model. Further, here we haVe to find the specific edge and node parameters.
What do we need to build and simulate a differential equation model?
- the equations are derived from the network graph
- inputs: given in the MIDAS description
- Initial conditions for each state in each experiment
In this example, the baseline is set to 0.5. A value of 1 means full activation and 0 means full inhibition of the node.
Perturbation data
The following heatmap shows an overview on the perturbation data. The first block outlines the combinations of treatment. Then each other block represents the response of a cell line. Different columns within a block shows the different phosphoprotein markers.
In the tutorial we make a single model for the first cell line COLO320HSR.
Model a single cell line
Similarly to the previous tutorial with CellNopt, here we also start by importing a prior knowledge network and the perturbation data in MIDAS format.
# load Prior Knowledge Network (PKN)
pknmodel <- readSIF("./data/tutorial_3/PKN.sif")
# load normalised perturbation data
# select MIDAS file for the desired cell line
MIDASfile <- "./data/tutorial_3/processed/MIDAS/MD-COLO320HSR_Ktuned_v1_n4_all_noEGFRi_CNORode.csv"
Mydata <- readMIDAS(MIDASfile = MIDASfile, verbose = FALSE)
cnolist <- makeCNOlist(Mydata, subfield = FALSE)
## [1] "Please be aware that if you only have some conditions at time zero (e.g.only inhibitor/no inhibitor), the measurements for these conditions will be copied across matching conditions at t=0"
cnolist$valueStimuli[cnolist$valueStimuli == 0] <- 0.5
Show the network first
plotModel(pknmodel, cnolist)
As in CellNOPt:
- green nodes are stimulated in some experiments
- blue nodes are measured
- white nodes are modelled, but not measured
- red nodes or red bordered nodes are occasionally inhibited
- black edges represents activation, red T-shaped arrows represents inhibition
Then the data in CellNOpt format:
plotCNOlist(cnolist)
The data is very large, therefore is hard to see the details, but we can notice as some nodes increases their activity at the final time (30 mins).
# compress the network (no expansion, only OR gates are considered)
model <- preprocessing(
data = cnolist, model = pknmodel,
compression = TRUE, expansion = FALSE
)
## [1] "The following species are measured: AKT, EGFR, ERK, GSK3, IkBa, JNK, MEK, PI3K, RPS6, RSKp90, SMAD2, cJun, mTOR, p38, TAK1, TGFRb, IRS1, M3K1, MP2K4, S6K, TBK1, IKK, RASK, BRAF, PAK1"
## [1] "The following species are stimulated: PLX, EGF, HGF, IGF1, TGFb, TNFa"
## [1] "The following species are inhibited: IKK, MEK, PI3K, BRAF, TAK1, TBK1, mTOR"
## [1] "The following species are not observable and/or not controllable: "
# set initial parameters
# (here parameters 'k' and 'tau' are optimised and 'n' fixed to 3)
ode_parameters <- createLBodeContPars(model,
LB_n = 1, LB_k = 0, LB_tau = 0,
UB_n = 3, UB_k = 1, UB_tau = 1,
default_n = 3,
default_k = 0.5,
default_tau = 0.01,
opt_n = FALSE, opt_k = TRUE, opt_tau = TRUE,
random = TRUE
)
# PLX -> BRAF is an artificial regulation used to model paradoxical effect of
# PLX4720, which works as selective BRAF inhibitor in cell-lines where BRAF is
# mutated in V600E (i.e. HT29 and SNUC5 in our panel), but induces a paradoxical
# activation of wild type BRAF cells (modeled as stimulus on those cell lines)
ode_parameters$parValues[which(ode_parameters$parNames == "PLX_k_BRAF")] <- 0.5
ode_parameters$index_opt_pars <- setdiff(
ode_parameters$index_opt_pars,
which(ode_parameters$parNames == "PLX_k_BRAF")
)
## Parameter Optimization
# essm
paramsSSm <- defaultParametersSSm()
paramsSSm$local_solver <- "DHC"
paramsSSm$maxtime <- 30
# 36000;
paramsSSm$transfer_function <- 4
The actual optimisation takes around 10 mins, instead of the 30 sec. So, instead of running it here, we just load the results:
opt_pars <- parEstimationLBode(cnolist, model,
method = "essm",
ode_parameters = ode_parameters,
paramsSSm = paramsSSm
)
# write_rds(opt_pars,"data/tutorial_3/opt_pars_30sec.RDS")
opt_pars <- read_rds("data/tutorial_3/opt_pars_30sec.RDS")
Plot the fit of the model:
sim_res <- CNORode::plotLBodeFitness(cnolist, model, ode_parameters = opt_pars)