This package provides a foundation for the PharmacoGx, RadioGx and ToxicoGx packages. It is not intended for standalone use, only as a dependency for the aforementioned software. Its existence allows abstracting generic definitions, method definitions and class structures common to all three of the Gx suite packages.
Load the package:
library(CoreGx)
library(Biobase)
library(SummarizedExperiment)
The CoreSet class is intended as a general purpose data structure for storing multiomic treatment response data. Extensions of this class have been customized for their respective fields of study. For example, the PharmacoSet class inherits from the CoreSet and is specialized for storing and analyzing drug sensitivity and perturbation experiments on cancer cell lines together with associated multiomic data for each sample treatment. The RadioSet class serves a role similar to the PharmacoSet with radiation instead of drug treatments. Finally, the ToxicoSet class is used to store toxicity data for healthy human and rat hepatocytes along with the associated multiomic profile for each treatment.
getClass("CoreSet")
## Class "CoreSet" [package "CoreGx"]
##
## Slots:
##
## Name: sensitivity annotation molecularProfiles cell
## Class: list_or_LongTable list list_or_MAE data.frame
##
## Name: datasetType perturbation curation
## Class: character list list
The annotation slot holds the CoreSet name, the original constructor call, and a range of metadata about the R session in which the constructor was called. This allows easy comparison of CoreSet versions across time and ensures the code used to generate a CoreSet is documented and reproducible.
The molecularProfiles slot contains a list of SummarizedExperiment objects, one for each multi-omic molecular data type available for a given experiment. Each SummarizedExperiment holds the feature and sample annotations for its data type.
The cell slot contains a data.frame with annotations for the cell lines used in the sensitivity and/or perturbation slots.
The datasetType slot contains a character vector indicating the experiment type the CoreSet contains.
The sensitivity slot contains a list of raw, curated and meta data for sensitivity experiments.
The perturbation slot contains a list of raw, curated and meta data for perturbation experiments.
The curation slot contains a list of ground truth curations for sample identifiers such as cell line names/ids, tissue names/ids, drug names/ids, etc. This slot assists in curating across experiment and molecular profile slots to ensure consistent nomenclature.
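To make the role of the annotation slot more concrete, the sketch below shows one way a constructor could record its own call and session metadata. It is only an illustration under stated assumptions: the ToySet class, its constructor and the field names in the list are hypothetical and are not the CoreGx implementation.

```r
library(methods)

# Hypothetical toy class with an annotation slot (not the CoreSet definition)
setClass("ToySet", representation(annotation = "list", data = "list"))

# Hypothetical constructor that stores provenance alongside the data
ToySet <- function(name, ...) {
    cl <- match.call()  # capture the call that created the object
    new("ToySet",
        annotation = list(
            name = name,
            dateCreated = date(),         # creation time as a character string
            sessionInfo = sessionInfo(),  # R version and attached packages
            call = cl
        ),
        data = list(...))
}

tSet <- ToySet("Example", values = 1:3)

# The stored call documents how the object was built
tSet@annotation$call
class(tSet@annotation$sessionInfo)
```

Keeping the call and session information with the object means that two versions of the same dataset can be compared without access to the script that produced them.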
The class provides a set of standardized accessor methods which allow easy curation, annotation and retrieval of data associated with a specific treatment response experiment. All accessors are implemented as generics to allow new methods to be defined on classes inheriting from the CoreSet.
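As a minimal sketch of why implementing accessors as generics matters, the example below uses only the methods package. The classes ParentSet and ChildSet and the generic toyName are hypothetical stand-ins for the CoreSet, an inheriting class and one of its accessors.

```r
library(methods)

# Hypothetical parent and child classes
setClass("ParentSet", representation(name = "character"))
setClass("ChildSet", representation(extra = "numeric"), contains = "ParentSet")

# An accessor implemented as a generic, with a method on the parent class only
setGeneric("toyName", function(object) standardGeneric("toyName"))
setMethod("toyName", signature("ParentSet"), function(object) object@name)

child <- new("ChildSet", name = "demo", extra = 1)

# No ChildSet method exists, so dispatch falls back to the ParentSet method
toyName(child)
```

Because dispatch follows the class hierarchy, an inheriting class gets working accessors for free and only needs to define methods where its behaviour differs.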
methods(class="CoreSet")
## [1] annotation annotation<- cellInfo
## [4] cellInfo<- cellNames cellNames<-
## [7] curation curation<- datasetType
## [10] datasetType<- dateCreated dateCreated<-
## [13] fNames fNames<- featureInfo
## [16] featureInfo<- mDataNames mDataNames<-
## [19] molecularProfiles molecularProfiles<- molecularProfilesSlot
## [22] molecularProfilesSlot<- name name<-
## [25] pertNumber pertNumber<- phenoInfo
## [28] phenoInfo<- sensNumber sensNumber<-
## [31] sensitivityInfo sensitivityInfo<- sensitivityMeasures
## [34] sensitivityMeasures<- sensitivityProfiles sensitivityProfiles<-
## [37] sensitivityRaw sensitivityRaw<- sensitivitySlot
## [40] sensitivitySlot<- show subsetByFeature
## [43] subsetBySample subsetByTreatment treatmentInfo
## [46] treatmentInfo<- treatmentNames treatmentNames<-
## [49] updateObject
## see '?methods' for accessing help and source code
We have provided a sample CoreSet in this package. In the code below we load the example cSet and demonstrate a few of the accessor methods.
data(clevelandSmall_cSet)
clevelandSmall_cSet
## <CoreSet>
## Name: Cleveland
## Date Created: Wed Oct 25 17:38:42 2017
## Number of cell lines: 10
## Molecular profiles:
## A MultiAssayExperiment object of 2 listed
## experiments with user-defined names and respective classes.
## Containing an ExperimentList class object of length 2:
## [1] rna: SummarizedExperiment with 1000 rows and 9 columns
## [2] rnaseq: SummarizedExperiment with 1000 rows and 9 columns
## Treatment response:
## <LongTable>
## dim: 9 10
## assays(2): sensitivity profiles
## rownames(9): radiation:1:1 radiation:1:2 ... radiation:8:1 radiation:10:1
## rowData(3): drug1id drug1dose replicate_id
## colnames(10): CHP-212 IMR-32 KP-N-S19s ... SK-N-SH SNU-245 SNU-869
## colData(2): cellid rn
## metadata(1): experiment_metadata
Access a specific molecular profile:
mProf <- molecularProfiles(clevelandSmall_cSet, "rna")
mProf[seq_len(5), seq_len(5)]
## NIECE_P_NCLE_RNA3_HG-U133_PLUS_2_G10_296152
## ENSG00000000003 10.280970
## ENSG00000000005 3.647436
## ENSG00000000419 11.883769
## ENSG00000000457 7.515721
## ENSG00000000460 7.808139
## GILDS_P_NCLE_RNA11_REDO_HG-U133_PLUS_2_G02_587654
## ENSG00000000003 10.304971
## ENSG00000000005 4.895494
## ENSG00000000419 11.865191
## ENSG00000000457 7.187144
## ENSG00000000460 7.789921
## BUNDS_P_NCLE_RNA5_HG-U133_PLUS_2_B11_419860
## ENSG00000000003 9.596987
## ENSG00000000005 3.793174
## ENSG00000000419 12.498285
## ENSG00000000457 8.076655
## ENSG00000000460 8.456691
## SILOS_P_NCLE_RNA9_HG-U133_PLUS_2_A04_523474
## ENSG00000000003 8.620860
## ENSG00000000005 3.674918
## ENSG00000000419 11.674671
## ENSG00000000457 6.790332
## ENSG00000000460 6.663846
## WATCH_P_NCLE_RNA8_HG-U133_PLUS_2_B04_474582
## ENSG00000000003 9.866551
## ENSG00000000005 3.748959
## ENSG00000000419 12.228260
## ENSG00000000457 7.292420
## ENSG00000000460 8.869378
Access cell-line metadata:
cInfo <- cellInfo(clevelandSmall_cSet)
cInfo[seq_len(5), seq_len(5)]
## cellid tissueid CellLine Primarysite Histology
## SK-N-FI SK-N-FI autonomic_ganglia SKNFI autonomic_ganglia neuroblastoma
## IMR-32 IMR-32 autonomic_ganglia IMR32 autonomic_ganglia neuroblastoma
## SK-N-AS SK-N-AS autonomic_ganglia SKNAS autonomic_ganglia neuroblastoma
## CHP-212 CHP-212 autonomic_ganglia CHP212 autonomic_ganglia neuroblastoma
## KP-N-S19s KP-N-S19s autonomic_ganglia KPNSI9S autonomic_ganglia neuroblastoma
Access sensitivity data:
sensProf <- sensitivityProfiles(clevelandSmall_cSet)
sensProf[seq_len(5), seq_len(5)]
## AUC_published AUC_recomputed alpha beta SF2
## radiation_CHP-212 1.833 1.3440925 0.4615728 0 0.4160333
## radiation_IMR-32 0.634 0.5490508 0.8098218 0 0.0824000
## radiation_KP-N-S19s 1.477 2.3251731 0.2974555 0 0.2469118
## radiation_MHH-NB-11 0.545 0.6260366 0.7520926 0 0.0652000
## radiation_NB1 0.774 0.7906043 0.6551282 0 0.2169389
For more information about the accessor methods available for the CoreSet class, please see the class?CoreSet help page.
Given that the CoreSet class is intended for extension, we will show some examples of how to define a new class based on it and implement new methods for the generics provided for the CoreSet class.
Here we will define a new class, the DemoSet, with an additional slot, the demoSlot. We will then view the available methods for this class as well as define new S4 methods on it.
DemoSet <- setClass("DemoSet",
    representation(demoSlot="character"),
    contains="CoreSet")
getClass("DemoSet")
## Class "DemoSet" [in ".GlobalEnv"]
##
## Slots:
##
## Name: demoSlot sensitivity annotation molecularProfiles
## Class: character list_or_LongTable list list_or_MAE
##
## Name: cell datasetType perturbation curation
## Class: data.frame character list list
##
## Extends: "CoreSet"
Here we can see the class extending CoreSet has all of the same slots as the original CoreSet, plus the new slot we defined: demoSlot.
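The relationship between a class and its extension can also be inspected programmatically. The sketch below uses hypothetical ParentSet and ChildSet classes rather than the CoreSet itself. It also shows that an object of the parent class can be passed unnamed to new() to fill the inherited slots.

```r
library(methods)

# Hypothetical parent class and an extension with one additional slot
setClass("ParentSet", representation(name = "character", cell = "data.frame"))
setClass("ChildSet", representation(extra = "character"), contains = "ParentSet")

parent <- new("ParentSet", name = "demo",
    cell = data.frame(cellid = c("A", "B")))

# An unnamed parent-class object supplies the inherited slots;
# named arguments fill the slots specific to the child class
child <- new("ChildSet", parent, extra = "note")

slotNames("ChildSet")             # the new slot plus the inherited ones
is(child, "ParentSet")            # a ChildSet is also a ParentSet
extends("ChildSet", "ParentSet")  # the same check at the class level
child@name                        # inherited slot copied from the parent object
```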
We can see which methods are available for this new class.
methods(class="DemoSet")
## [1] annotation annotation<- cellInfo
## [4] cellInfo<- cellNames cellNames<-
## [7] curation curation<- datasetType
## [10] datasetType<- dateCreated dateCreated<-
## [13] fNames fNames<- featureInfo
## [16] featureInfo<- mDataNames mDataNames<-
## [19] molecularProfiles molecularProfiles<- molecularProfilesSlot
## [22] molecularProfilesSlot<- name name<-
## [25] pertNumber pertNumber<- phenoInfo
## [28] phenoInfo<- sensNumber sensNumber<-
## [31] sensitivityInfo sensitivityInfo<- sensitivityMeasures
## [34] sensitivityMeasures<- sensitivityProfiles sensitivityProfiles<-
## [37] sensitivityRaw sensitivityRaw<- sensitivitySlot
## [40] sensitivitySlot<- show subsetByFeature
## [43] subsetBySample subsetByTreatment treatmentInfo
## [46] treatmentInfo<- treatmentNames treatmentNames<-
## [49] updateObject
## see '?methods' for accessing help and source code
We see that all the accessors defined for the CoreSet are also defined for the inheriting DemoSet. These methods all assume the inherited slots have the same structure as in the CoreSet. If this is not true, for example, if molecularProfiles holds ExpressionSets instead of SummarizedExperiments, we can redefine existing methods as follows:
clevelandSmall_dSet <- DemoSet(clevelandSmall_cSet)
class(clevelandSmall_dSet@molecularProfiles[['rna']])
## [1] "SummarizedExperiment"
## attr(,"package")
## [1] "SummarizedExperiment"
expressionSets <- lapply(molecularProfilesSlot(clevelandSmall_dSet), as, 'ExpressionSet')
molecularProfilesSlot(clevelandSmall_dSet) <- expressionSets
# Now this will error
tryCatch({molecularProfiles(clevelandSmall_dSet, 'rna')},
    error=function(e)
        print(paste("Error: ", e$message)))
## [1] "Error: unable to find an inherited method for function 'assay' for signature '\"ExpressionSet\", \"numeric\"'"
Since we changed the data in the molecularProfiles slot of the DemoSet, the original method from CoreGx no longer works. Thus we get an error when trying to access that slot. To fix this we will need to set a new S4 method for the molecularProfiles generic function defined in CoreGx.
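The override pattern can be seen in isolation with a minimal sketch. The classes ParentSet and ChildSet and the generic toyProfile are hypothetical and not part of CoreGx. The parent method is assumed to expect matrices, while the child object stores data.frames.

```r
library(methods)

setClass("ParentSet", representation(profiles = "list"))
setClass("ChildSet", contains = "ParentSet")

setGeneric("toyProfile", function(object, name) standardGeneric("toyProfile"))

# The parent method assumes every element of the slot is a matrix
setMethod("toyProfile", signature("ParentSet"), function(object, name) {
    x <- object@profiles[[name]]
    if (!is.matrix(x)) stop("expected a matrix in the profiles slot")
    x
})

child <- new("ChildSet",
    profiles = list(rna = data.frame(s1 = c(1, 2), s2 = c(3, 4))))

# The inherited method rejects the data.frame stored in the child object
inherited <- tryCatch(toyProfile(child, "rna"),
    error = function(e) conditionMessage(e))
inherited

# A method defined on ChildSet takes precedence over the inherited one
setMethod("toyProfile", signature("ChildSet"), function(object, name) {
    as.matrix(object@profiles[[name]])
})

toyProfile(child, "rna")
```

The fix for the DemoSet follows the same idea: a method defined on the inheriting class replaces the inherited behaviour for that class only.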
setMethod(molecularProfiles,
    signature("DemoSet"),
    function(object, mDataType) {
        # Return the sample annotations of the requested ExpressionSet
        pData(object@molecularProfiles[[mDataType]])
    })
This new method is now called whenever we use the molecularProfiles method on a DemoSet. Since the new method uses ExpressionSet accessor methods instead of SummarizedExperiment accessor methods, we now expect to be able to access the data in our modified slot.
# Now we test our new method
mProf <- molecularProfiles(clevelandSmall_dSet, 'rna')
head(mProf)[seq_len(5), seq_len(5)]
## samplename
## NIECE_P_NCLE_RNA3_HG-U133_PLUS_2_G10_296152 NIECE_p_NCLE_RNA3_HG-U133_Plus_2_G10_296152
## GILDS_P_NCLE_RNA11_REDO_HG-U133_PLUS_2_G02_587654 GILDS_p_NCLE_RNA11_Redo_HG-U133_Plus_2_G02_587654
## BUNDS_P_NCLE_RNA5_HG-U133_PLUS_2_B11_419860 BUNDS_p_NCLE_RNA5_HG-U133_Plus_2_B11_419860
## SILOS_P_NCLE_RNA9_HG-U133_PLUS_2_A04_523474 SILOS_p_NCLE_RNA9_HG-U133_Plus_2_A04_523474
## WATCH_P_NCLE_RNA8_HG-U133_PLUS_2_B04_474582 WATCH_p_NCLE_RNA8_HG-U133_Plus_2_B04_474582
## filename
## NIECE_P_NCLE_RNA3_HG-U133_PLUS_2_G10_296152 NIECE_p_NCLE_RNA3_HG-U133_Plus_2_G10_296152.CEL.gz
## GILDS_P_NCLE_RNA11_REDO_HG-U133_PLUS_2_G02_587654 GILDS_p_NCLE_RNA11_Redo_HG-U133_Plus_2_G02_587654.CEL.gz
## BUNDS_P_NCLE_RNA5_HG-U133_PLUS_2_B11_419860 BUNDS_p_NCLE_RNA5_HG-U133_Plus_2_B11_419860.CEL.gz
## SILOS_P_NCLE_RNA9_HG-U133_PLUS_2_A04_523474 SILOS_p_NCLE_RNA9_HG-U133_Plus_2_A04_523474.CEL.gz
## WATCH_P_NCLE_RNA8_HG-U133_PLUS_2_B04_474582 WATCH_p_NCLE_RNA8_HG-U133_Plus_2_B04_474582.CEL.gz
## chiptype
## NIECE_P_NCLE_RNA3_HG-U133_PLUS_2_G10_296152 HG-U133_Plus_2
## GILDS_P_NCLE_RNA11_REDO_HG-U133_PLUS_2_G02_587654 HG-U133_Plus_2
## BUNDS_P_NCLE_RNA5_HG-U133_PLUS_2_B11_419860 HG-U133_Plus_2
## SILOS_P_NCLE_RNA9_HG-U133_PLUS_2_A04_523474 HG-U133_Plus_2
## WATCH_P_NCLE_RNA8_HG-U133_PLUS_2_B04_474582 HG-U133_Plus_2
## hybridization.date
## NIECE_P_NCLE_RNA3_HG-U133_PLUS_2_G10_296152 07/15/08
## GILDS_P_NCLE_RNA11_REDO_HG-U133_PLUS_2_G02_587654 2010-05-21
## BUNDS_P_NCLE_RNA5_HG-U133_PLUS_2_B11_419860 12/19/08
## SILOS_P_NCLE_RNA9_HG-U133_PLUS_2_A04_523474 2009-12-08
## WATCH_P_NCLE_RNA8_HG-U133_PLUS_2_B04_474582 2009-08-14
## hybridization.hour
## NIECE_P_NCLE_RNA3_HG-U133_PLUS_2_G10_296152 12:54:10
## GILDS_P_NCLE_RNA11_REDO_HG-U133_PLUS_2_G02_587654 16:45:06Z
## BUNDS_P_NCLE_RNA5_HG-U133_PLUS_2_B11_419860 11:43:19
## SILOS_P_NCLE_RNA9_HG-U133_PLUS_2_A04_523474 20:44:59Z
## WATCH_P_NCLE_RNA8_HG-U133_PLUS_2_B04_474582 17:15:45Z
We can see our new method works! In order to finish updating the methods for our new class, we would have to redefine all the methods which access the modified slot.
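One way to keep track of which accessors still rely on inherited implementations is to ask the methods package directly. The sketch below is an illustration with hypothetical classes (ParentSet, ChildSet) and generics (toyProfile, toyFeatures). It relies on existsMethod(), which reports only methods defined explicitly for the given signature.

```r
library(methods)

setClass("ParentSet", representation(profiles = "list"))
setClass("ChildSet", contains = "ParentSet")

setGeneric("toyProfile", function(object) standardGeneric("toyProfile"))
setGeneric("toyFeatures", function(object) standardGeneric("toyFeatures"))

# Both accessors are defined for the parent class
setMethod("toyProfile", signature("ParentSet"),
    function(object) object@profiles)
setMethod("toyFeatures", signature("ParentSet"),
    function(object) names(object@profiles))

# Only toyProfile has been redefined for the child class so far
setMethod("toyProfile", signature("ChildSet"),
    function(object) rev(object@profiles))

accessors <- c("toyProfile", "toyFeatures")

# TRUE only where a method is defined directly on ChildSet
redefined <- vapply(accessors, existsMethod, logical(1), signature = "ChildSet")
redefined

# Accessors that still use the inherited ParentSet implementation
accessors[!redefined]
```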
However, additional work needs to be done to define accessors for the new demoSlot. Since no generics are available in CoreGx to access this slot, we need to first define a generic, then implement methods which dispatch on the 'DemoSet' class to retrieve data in the slot.
# Define generic for setter method
setGeneric('demoSlot<-', function(object, value) standardGeneric('demoSlot<-'))
## [1] "demoSlot<-"
# Define a setter method
setReplaceMethod('demoSlot',
    signature(object='DemoSet', value="character"),
    function(object, value) {
        object@demoSlot <- value
        return(object)
    })
# Let's add something to our demoSlot
demoSlot(clevelandSmall_dSet) <- c("This", "is", "the", "demoSlot")
# Define generic for getter method
setGeneric('demoSlot', function(object, ...) standardGeneric("demoSlot"))
## [1] "demoSlot"
# Define a getter method
setMethod("demoSlot",
    signature("DemoSet"),
    function(object) {
        paste(object@demoSlot, collapse=" ")
    })
# Test our getter method
demoSlot(clevelandSmall_dSet)
## [1] "This is the demoSlot"
Now you should have all the knowledge you need to extend the CoreSet class for use in other treatment-response experiments!
For more information about this package and the possibility of collaborating on its extension please contact benjamin.haibe.kains@utoronto.ca.
sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] knitr_1.36 data.table_1.14.2
## [3] CoreGx_1.6.0 SummarizedExperiment_1.24.0
## [5] Biobase_2.54.0 GenomicRanges_1.46.0
## [7] GenomeInfoDb_1.30.0 IRanges_2.28.0
## [9] S4Vectors_0.32.0 MatrixGenerics_1.6.0
## [11] matrixStats_0.61.0 BiocGenerics_0.40.0
## [13] formatR_1.11 BiocStyle_2.22.0
##
## loaded via a namespace (and not attached):
## [1] lsa_0.73.2 bitops_1.0-7
## [3] BumpyMatrix_1.2.0 SnowballC_0.7.0
## [5] tools_4.1.1 backports_1.2.1
## [7] bslib_0.3.1 DT_0.19
## [9] utf8_1.2.2 R6_2.5.1
## [11] KernSmooth_2.23-20 DBI_1.1.1
## [13] colorspace_2.0-2 gridExtra_2.3
## [15] tidyselect_1.1.1 compiler_4.1.1
## [17] shinyjs_2.0.0 DelayedArray_0.20.0
## [19] bookdown_0.24 slam_0.1-48
## [21] sass_0.4.0 caTools_1.18.2
## [23] scales_1.1.1 checkmate_2.0.0
## [25] relations_0.6-10 stringr_1.4.0
## [27] digest_0.6.28 rmarkdown_2.11
## [29] XVector_0.34.0 pkgconfig_2.0.3
## [31] htmltools_0.5.2 highr_0.9
## [33] fastmap_1.1.0 limma_3.50.0
## [35] htmlwidgets_1.5.4 rlang_0.4.12
## [37] shiny_1.7.1 visNetwork_2.1.0
## [39] jquerylib_0.1.4 generics_0.1.1
## [41] jsonlite_1.7.2 BiocParallel_1.28.0
## [43] gtools_3.9.2 dplyr_1.0.7
## [45] RCurl_1.98-1.5 magrittr_2.0.1
## [47] GenomeInfoDbData_1.2.7 Matrix_1.3-4
## [49] Rcpp_1.0.7 munsell_0.5.0
## [51] fansi_0.5.0 lifecycle_1.0.1
## [53] piano_2.10.0 stringi_1.7.5
## [55] yaml_2.2.1 zlibbioc_1.40.0
## [57] gplots_3.1.1 grid_4.1.1
## [59] parallel_4.1.1 promises_1.2.0.1
## [61] shinydashboard_0.7.2 crayon_1.4.1
## [63] lattice_0.20-45 pillar_1.6.4
## [65] fgsea_1.20.0 igraph_1.2.7
## [67] marray_1.72.0 fastmatch_1.1-3
## [69] glue_1.4.2 evaluate_0.14
## [71] BiocManager_1.30.16 MultiAssayExperiment_1.20.0
## [73] vctrs_0.3.8 httpuv_1.6.3
## [75] gtable_0.3.0 purrr_0.3.4
## [77] assertthat_0.2.1 ggplot2_3.3.5
## [79] xfun_0.27 mime_0.12
## [81] xtable_1.8-4 later_1.3.0
## [83] tibble_3.1.5 sets_1.0-19
## [85] cluster_2.1.2 ellipsis_0.3.2