phenoptr includes a flexible mechanism for selecting cells (i.e. rows) from a cell seg table. The mechanism is implemented in select_rows. Row selection may be used directly via select_rows and ordinary subsetting operations. It is also used indirectly by calling functions that support it, such as count_within and spatial_distribution_report.
The return value from select_rows is a boolean (logical) vector whose length is the number of rows of the given cell seg table. You use this returned value to select rows of the table.
The syntax select_rows uses to specify phenotypes is very flexible. This flexibility comes at a cost in complexity. Most common phenotype combinations can also be specified using parse_phenotypes, which supports a friendlier syntax.
This tutorial uses select_rows and count_within to give examples of the phenotype specifications these functions accept.
The simplest selector is just the name of a single phenotype. This example selects the rows containing CK+ cells. The same syntax works with both select_rows and count_within.
library(phenoptr)
csd <- sample_cell_seg_data
rows <- select_rows(csd, 'CK+')
sum(rows) # The number of selected rows
## [1] 2257
# Select just the desired rows by subsetting
ck <- csd[rows, ]
dim(ck)
## [1] 2257  199
This example counts CD8+ cells within 15 microns of CK+ cells.
dst <- distance_matrix(csd) # Compute this just once and re-use it
count_within(csd, from='CK+', to='CD8+', radius=15, dst=dst)
## # A tibble: 1 x 5
##   radius from_count to_count from_with within_mean
##    <dbl>      <int>    <int>     <int>       <dbl>
## 1     15       2257      228       193       0.115
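To make the output columns concrete, here is a small base R sketch of the same kind of count. It assumes that from_with is the number of 'from' cells with at least one 'to' cell within the radius, and that within_mean is the average number of 'to' cells within the radius per 'from' cell; check the count_within help page to confirm. The coordinates are made up, and this is not how phenoptr computes the result.

```r
# Made-up cell positions in microns; not taken from the sample data.
from_xy <- rbind(c(0, 0), c(100, 100), c(10, 0))  # stand-ins for CK+ cells
to_xy   <- rbind(c(5, 0), c(12, 0), c(200, 200))  # stand-ins for CD8+ cells
radius  <- 15

# Pairwise Euclidean distances: one row per 'from' cell,
# one column per 'to' cell.
d <- sqrt(outer(from_xy[, 1], to_xy[, 1], `-`)^2 +
          outer(from_xy[, 2], to_xy[, 2], `-`)^2)

# Number of 'to' cells within the radius of each 'from' cell.
within <- rowSums(d <= radius)

from_count  <- nrow(from_xy)     # number of 'from' cells
to_count    <- nrow(to_xy)       # number of 'to' cells
from_with   <- sum(within > 0)   # 'from' cells with a 'to' cell nearby (assumed meaning)
within_mean <- mean(within)      # mean nearby 'to' cells per 'from' cell (assumed meaning)
```

With these toy points, two of the three 'from' cells have neighbours within 15 microns, and each of those two has two neighbours.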
Double positive (or more) cells can be selected by including multiple names in a list. Selectors in a list are combined with AND.
Multiple phenotypes may be selected together by including each name in a character vector (not a list!). Names in a vector are combined with OR.
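The difference between the two combining rules can be sketched in base R on a toy table. This only illustrates the AND / OR semantics described above; the table, its pdl1 column, and the values are made up, and the code does not call or reproduce select_rows.

```r
# Toy stand-in for a cell seg table; columns and values are made up.
toy <- data.frame(
  Phenotype = c('CK+', 'CD8+', 'FoxP3+', 'CK+', 'other'),
  pdl1 = c(4.2, 0.5, 1.1, 2.0, 3.5),
  stringsAsFactors = FALSE
)

# Character vector: a row matches if it has ANY of the names (OR).
either <- toy$Phenotype %in% c('CD8+', 'FoxP3+')

# List: a row matches only if EVERY selector is satisfied (AND).
selectors <- list(toy$Phenotype == 'CK+', toy$pdl1 > 3)
both <- Reduce(`&`, selectors)

which(either)  # rows 2 and 3
which(both)    # row 1
```

Both results are logical vectors with one entry per row, so either can be used directly for subsetting, for example toy[both, ].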
For example, to select cells phenotyped as either CD8+ or FoxP3+, use the selector c('CD8+', 'FoxP3+').
This example selects this combination. Note the call to select_rows has been combined with the subsetting of csd.
tcells <- csd[select_rows(csd, c('CD8+', 'FoxP3+')), ]
dim(tcells)
## [1] 456 199
count_within(csd, from='CK+', to=c('CD8+', 'FoxP3+'), radius=15, dst=dst)
## # A tibble: 1 x 5
##   radius from_count to_count from_with within_mean
##    <dbl>      <int>    <int>     <int>       <dbl>
## 1     15       2257      456       354       0.206
This type of grouping is an either / or selection. The count_within example above counts the number of T cells (CD8+ or FoxP3+) within 15 microns of a CK+ cell. If you want separate counts for CD8+ and FoxP3+ cells, call count_within once for each phenotype.
For more flexibility, select_rows supports selection using any valid R expression. Expressions are written using one-sided formulas. The formulas are evaluated in the context of the cell seg table so they may reference any column of the table.
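For example, to select cells with PDL1 expression greater than 3, use the expression ~`Entire Cell PDL1 (Opal 520) Mean`>3. In this example, the column name is Entire Cell PDL1 (Opal 520) Mean. Because the name contains spaces and parentheses, it must be wrapped in backticks inside the formula.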
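The idea of evaluating a one-sided formula in the context of a table can be sketched in base R as follows. This is a minimal illustration with a made-up three-row table, not phenoptr's actual implementation, which may evaluate formulas differently.

```r
# Toy table; check.names = FALSE keeps the column name with spaces intact.
toy <- data.frame(
  Phenotype = c('CK+', 'CD8+', 'CK+'),
  `Entire Cell PDL1 (Opal 520) Mean` = c(4.2, 0.5, 2.0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# A one-sided formula; backticks quote the non-syntactic column name.
f <- ~`Entire Cell PDL1 (Opal 520) Mean` > 3

# For a one-sided formula, f[[2]] is the right-hand side expression.
# Evaluate it with the table's columns visible, falling back to the
# environment where the formula was created for anything else.
rows <- eval(f[[2]], envir = toy, enclos = environment(f))
rows  # TRUE FALSE FALSE
```

The result is again a logical vector with one entry per row, which is why expressions and phenotype names can be mixed freely in a selector.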
Expressions and phenotype names may be combined in a list. This example selects CK+ cells with PDL1 > 3.
rows <- select_rows(csd, list('CK+', ~`Entire Cell PDL1 (Opal 520) Mean`>3))
ck_pdl1 <- csd[rows, ]
dim(ck_pdl1)
## [1] 531 199
count_within(csd, from=list('CK+', ~`Entire Cell PDL1 (Opal 520) Mean`>3), to='CD8+', radius=15, dst=dst)
## # A tibble: 1 x 5
##   radius from_count to_count from_with within_mean
##    <dbl>      <int>    <int>     <int>       <dbl>
## 1     15        531      228        86       0.228
A few things to note about formula expressions:
If the data was read using read_cell_seg_data(path, remove_units=TRUE) (the default), the column names in the table will be abbreviated compared to the names in the file.
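Because a formula must use the column name exactly as it appears in the table, it can help to look the name up rather than type it from memory. The sketch below uses base R only. Of the names in the cols vector, only the PDL1 one appears in this tutorial; the others are made up for illustration.

```r
# Example column names; only the PDL1 name comes from this tutorial.
cols <- c('Phenotype',
          'Entire Cell PDL1 (Opal 520) Mean',
          'Entire Cell CD8 (Opal 540) Mean')

# Find the column(s) whose name mentions PDL1.
pdl1_col <- grep('PDL1', cols, value = TRUE, fixed = TRUE)
pdl1_col

# Build the one-sided formula from the exact name, quoting it in backticks.
f <- as.formula(paste0('~`', pdl1_col, '` > 3'))
f
```

With a real cell seg table, the same lookup would be grep('PDL1', names(csd), value = TRUE).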
Several functions in phenoptr operate on pairs of phenotypes and have arguments pairs and phenotype_rules. For example, see spatial_distribution_report. These functions build on select_rows to allow flexible selection of pairs of phenotypes.
In the simplest usage, the names in pairs are the names of phenotypes in the cell seg data. In this case, pairs just lists the desired phenotypes. For example, to pair CK+ cells with CD8+ cells, use the argument
pairs <- list(c('CK+', 'CD8+'))
For a single pair, a list is not required so this can be simplified to
pairs <- c('CK+', 'CD8+')
For multiple pairs, list each pair separately. For example, to pair CK+ cells first with CD8+ cells and then with CD68+ cells, use the argument
pairs <- list(c('CK+', 'CD8+'), c('CK+', 'CD68+'))
You may want to define a new phenotype using grouping or expressions as shown in the “Selecting phenotypes” sections above. To do this, use the phenotype_rules argument to associate a select_rows rule with a name; then use the new name in the pairs argument.
For example, to create a T Cell phenotype which matches either the CD8+ or the FoxP3+ phenotype, and pair it with a PDL1+ CK+ phenotype which applies a threshold to tumor cells, use these arguments:
pairs <- c('PDL1+ CK+', 'T Cell')
phenotype_rules <- list(
  'PDL1+ CK+'=list('CK+', ~`Entire Cell PDL1 (Opal 520) Mean`>3),
  'T Cell'=c('CD8+', 'FoxP3+'))
phenotype_rules only needs to include phenotypes which are not in the cell seg data. For example, to extend the previous example to include a pairing from PDL1+ CK+ to CD68+ cells, where CD68+ is an existing phenotype, extend the pairs argument without changing phenotype_rules.
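A sketch of what the extended arguments might look like is shown below, with two pairs and the same rules as before. The rule_for helper is hypothetical and is not part of phenoptr. It only illustrates the idea that a name with no entry in phenotype_rules is treated as an existing phenotype name.

```r
# Two pairs: the second one uses CD68+, which needs no rule.
pairs <- list(c('PDL1+ CK+', 'T Cell'),
              c('PDL1+ CK+', 'CD68+'))

# Unchanged from the previous example.
phenotype_rules <- list(
  'PDL1+ CK+' = list('CK+', ~`Entire Cell PDL1 (Opal 520) Mean` > 3),
  'T Cell' = c('CD8+', 'FoxP3+'))

# Hypothetical helper (not a phenoptr function): return the rule
# registered for a name, or the name itself if no rule is defined.
rule_for <- function(name, rules) {
  if (name %in% names(rules)) rules[[name]] else name
}

rule_for('T Cell', phenotype_rules)  # the character vector rule
rule_for('CD68+', phenotype_rules)   # falls back to 'CD68+'
```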