optimalFlowData is a package containing 40 simulated flow cytometry datasets, saved as data frames, used for testing and developing examples for the package optimalFlow, based on the results in del Barrio et al. (2019).
The simulated cytometries are based on data that come from flow cytometry measurements obtained following the Euroflow protocols and kindly provided by Centro de Investigación del Cancer (CIC) in Salamanca, Spain. The artificial cytometries mimic 31 cytometries from healthy individuals and 9 cytometries from patients with different types of cancer.
Installation procedure:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("optimalFlowData")
library(optimalFlowData)
head(Cytometry1)
## CD19/TCRgd:PE Cy7-A LOGICAL CD38:APC H7-A LOGICAL CD3:APC-A LOGICAL
## 1 3808 5814 3060
## 2 2848 5584 3472
## 3 3008 5103 2814
## 4 3283 5270 2660
## 5 3600 6056 4136
## 6 3271 6091 3876
## CD4+CD20:PB-A LOGICAL CD45:PO-A LOGICAL CD56+IgK:PE-A LOGICAL
## 1 2449 4620 5498
## 2 2388 4391 6064
## 3 2017 4440 5350
## 4 1877 4339 5634
## 5 1198 4666 5963
## 6 2422 4506 5791
## CD5:PerCP Cy5-5-A LOGICAL CD8+IgL:FITC-A LOGICAL FSC-A LINEAR
## 1 3112 5200 1780
## 2 1691 5491 2104
## 3 3531 4982 2034
## 4 2358 5010 1762
## 5 2272 5246 2033
## 6 2161 5094 1657
## SSC-A Exp-SSC Low Population ID (name)
## 1 1626 Basophils
## 2 1769 Basophils
## 3 1721 Basophils
## 4 1211 Basophils
## 5 1913 Basophils
## 6 1398 Basophils
We can create a database of gated cytometries. For simplicity and ease of visualisation, we choose only 4 cell types. To build the example database we also select only some of the cytometries, as is usual in machine learning, where a subset of the data serves as the learning set.
database <- buildDatabase(
    dataset_names = paste0('Cytometry', c(2:5, 7:9, 12:17, 19, 21)),
    population_ids = c('Monocytes', 'CD4+CD8-', 'Mature SIg Kappa', 'TCRgd-'))
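The index vector passed to buildDatabase selects 15 of the 40 cytometries. The following base R sketch shows how those indices map to dataset names and which names are left out of the database. It assumes the datasets are named Cytometry1 to Cytometry40. Treating the remainder as a held-out set is only an illustration, not something the package prescribes, and the names learning_ids and held_out_ids are hypothetical.

```r
# Indices used in the buildDatabase() call above
learning_ids <- c(2:5, 7:9, 12:17, 19, 21)

# Assumption: the package ships datasets Cytometry1 ... Cytometry40
all_ids <- 1:40
held_out_ids <- setdiff(all_ids, learning_ids)

# Dataset names for each group
learning_names <- paste0("Cytometry", learning_ids)
held_out_names <- paste0("Cytometry", held_out_ids)

length(learning_names)  # number of cytometries in the database
length(held_out_names)  # number of cytometries left out
```

The held-out names could, for instance, be used later to check how well a procedure trained on the database handles cytometries it has not seen.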
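The first cytometry of the database can be plotted in a 3-dimensional subspace (columns 4, 3 and 9), with points coloured by cell type (column 11):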
pairs(database[[1]][,c(4, 3, 9)], col = droplevels(database[[1]][, 11]))
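Numeric column indices are compact but easy to misread. The sketch below re-enters by hand the column names printed by head(Cytometry1), so that it runs without the package. It shows which variables the indices 4, 3 and 9 select, and that column 11 holds the population label. The vector name marker_names is hypothetical; with the package loaded, colnames(Cytometry1) should give the same information directly.

```r
# Column names copied from the head(Cytometry1) output above
marker_names <- c(
  "CD19/TCRgd:PE Cy7-A LOGICAL", "CD38:APC H7-A LOGICAL", "CD3:APC-A LOGICAL",
  "CD4+CD20:PB-A LOGICAL", "CD45:PO-A LOGICAL", "CD56+IgK:PE-A LOGICAL",
  "CD5:PerCP Cy5-5-A LOGICAL", "CD8+IgL:FITC-A LOGICAL", "FSC-A LINEAR",
  "SSC-A Exp-SSC Low", "Population ID (name)"
)

# Variables shown in the pairs() plot, in plotting order
plotted <- marker_names[c(4, 3, 9)]

# Column used to colour the points
label_column <- marker_names[11]

print(plotted)
print(label_column)
```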
The diagnosis for each cytometry is obtained as follows:
help("cytometry.diagnosis") # for an explanation of the abbreviations
cytometry.diagnosis
## [1] "HD" "HD" "HD" "HD" "HD" "HD" "HD" "HD" "HD"
## [10] "HD" "HD" "HD" "HD" "HD" "HD" "HD" "HD" "HD"
## [19] "HD" "HD" "HD" "MCL" "MCL" "FL" "MCL" "LPL" "CLL"
## [28] "CLL" "HD" "HD" "HD" "HD" "HD" "DLBCL" "HCL" "HD"
## [37] "HD" "HD" "HD" "HD"
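The printed vector can be summarised to check it against the description of the data (31 healthy, 9 with cancer). This minimal sketch re-enters the output above by hand so that it runs without the package; the variable name diagnosis is hypothetical, and with the package loaded one would use cytometry.diagnosis instead.

```r
# Diagnoses copied from the printed cytometry.diagnosis output above
diagnosis <- c(
  rep("HD", 21),
  "MCL", "MCL", "FL", "MCL", "LPL", "CLL", "CLL",
  rep("HD", 5),
  "DLBCL", "HCL",
  rep("HD", 5)
)

# Counts per diagnosis label
print(table(diagnosis))

# Number of healthy-donor cytometries and positions of the others
n_healthy <- sum(diagnosis == "HD")
non_healthy_ids <- which(diagnosis != "HD")

print(n_healthy)
print(paste0("Cytometry", non_healthy_ids))
```

Note that positions 1 to 21 are all labelled "HD", so every cytometry selected for the example database above comes from a healthy individual.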