For each cell in a single sample, find the distance from that cell to the nearest neighboring cell of each of the provided phenotypes.

find_nearest_distance(csd, phenotypes = NULL, dst = NULL)

Arguments

csd

A data frame with Cell X Position, Cell Y Position and Phenotype columns, such as the result of calling read_cell_seg_data.

phenotypes

Optional list of phenotypes to include. If omitted, unique_phenotypes(csd) will be used.

dst

Optional distance matrix. If provided, this should be distance_matrix(csd). Not used if rtree is available.

Value

A tibble containing a Distance to <phenotype> column and Cell ID <phenotype> column for each phenotype. Columns will contain NA values where there is no other cell of the phenotype.
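To make the shape of the return value concrete, here is a minimal base R sketch of a brute-force nearest-neighbor search on made-up data. It is not the package's implementation: `nearest_sketch` and the `toy` data are hypothetical, and the presence of a `Cell ID` column in the input is an assumption. The sketch returns a plain data frame rather than a tibble.

```r
# Toy data using the column names described above; the values are made up.
toy <- data.frame(
  `Cell ID` = 1:4,
  `Cell X Position` = c(0, 3, 0, 10),
  `Cell Y Position` = c(0, 4, 1, 10),
  Phenotype = c("CD8+", "CD8+", "CK+", "other"),
  check.names = FALSE
)

# Hypothetical helper, not part of the package: brute-force nearest neighbors.
nearest_sketch <- function(cells, phenotypes = sort(unique(cells$Phenotype))) {
  xy <- as.matrix(cells[, c("Cell X Position", "Cell Y Position")])
  dst <- as.matrix(dist(xy))  # full n x n matrix of Euclidean distances
  diag(dst) <- NA             # a cell is never its own nearest neighbor
  out <- list()
  for (p in phenotypes) {
    cols <- which(cells$Phenotype == p)
    sub <- dst[, cols, drop = FALSE]
    # For each row, the index of the closest cell of phenotype p, or NA if none
    idx <- apply(sub, 1, function(d) {
      if (all(is.na(d))) NA_integer_ else which.min(d)
    })
    out[[paste("Distance to", p)]] <- sub[cbind(seq_len(nrow(sub)), idx)]
    out[[paste("Cell ID", p)]] <- cells$`Cell ID`[cols][idx]
  }
  do.call(data.frame, c(out, list(check.names = FALSE)))
}

result <- nearest_sketch(toy)
result$`Distance to CD8+`  # 5, 5, 1, sqrt(85)
result$`Cell ID CD8+`      # 2, 1, 1, 2
result$`Distance to CK+`   # NA in row 3: cell 3 is the only CK+ cell
```

Cells 3 and 4 are the only members of their phenotypes, so their own `Distance to` and `Cell ID` entries for that phenotype are NA, matching the behavior described above.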

Details

If the rtree package is available, this will use a fast, memory-efficient algorithm capable of processing fields with many thousands of cells. Otherwise, a simple distance matrix algorithm is used. The simple algorithm requires at least 8 * (number of cells)^2 bytes of memory, which becomes prohibitive as the number of cells grows.
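The memory formula above is easy to evaluate for a given field size. The following sketch simply applies 8 * n^2; the helper name `distance_matrix_bytes` is hypothetical, and the result is a lower bound that ignores any temporary copies R may make.

```r
# Lower bound on memory for a full n x n matrix of doubles (8 bytes each).
# Hypothetical helper for illustration only.
distance_matrix_bytes <- function(n_cells) 8 * n_cells^2

n <- c(6072, 20000, 100000)
estimate <- data.frame(
  cells = n,
  GiB = round(distance_matrix_bytes(n) / 2^30, 2)
)
estimate
#>    cells   GiB
#> 1   6072  0.27
#> 2  20000  2.98
#> 3 100000 74.51
```

A field the size of the sample data (6,072 cells) fits comfortably, while a 100,000-cell field would need roughly 75 GiB for the matrix alone.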

Install the rtree package from GitHub using the command devtools::install_github('akoyabio/rtree').
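One way to check in advance which algorithm is likely to be used is to test whether rtree is installed. This is a sketch using base R's `requireNamespace`; how the function itself detects rtree is not shown here and may differ.

```r
# TRUE if the rtree package is installed and its namespace can be loaded
has_rtree <- requireNamespace("rtree", quietly = TRUE)

if (has_rtree) {
  message("rtree is available; the fast algorithm should be used.")
} else {
  message("rtree not found; expect the distance matrix algorithm.")
}
```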

Examples

# Compute distance columns
csd <- sample_cell_seg_data
nearest <- find_nearest_distance(csd)
dplyr::glimpse(nearest)
#> Rows: 6,072
#> Columns: 10
#> $ `Distance to CD68+`  <dbl> 29.529646, 30.269622, 38.082148, 36.674242, 15.62~
#> $ `Cell ID CD68+`      <int> 108, 99, 41, 262, 41, 99, 4949, 217, 69, 99, 108,~
#> $ `Distance to CD8+`   <dbl> 18.03469, 50.09241, 64.37585, 67.57403, 60.00208,~
#> $ `Cell ID CD8+`       <int> 101, 189, 5068, 423, 5068, 189, 182, 128, 188, 18~
#> $ `Distance to CK+`    <dbl> 36.830694, 21.377558, 109.317199, 3.605551, 105.0~
#> $ `Cell ID CK+`        <int> 192, 166, 5127, 45, 636, 58, 68, 4943, 209, 166, ~
#> $ `Distance to FoxP3+` <dbl> 16.347783, 24.909837, 40.140379, 30.870698, 26.40~
#> $ `Cell ID FoxP3+`     <int> 117, 138, 214, 229, 214, 138, 437, 229, 102, 138,~
#> $ `Distance to other`  <dbl> 10.307764, 6.800735, 8.062258, 19.811613, 5.59017~
#> $ `Cell ID other`      <int> 76, 36, 84, 57, 50, 40, 43, 142, 49, 71, 60, 85, ~

# Make a combined data frame including original data and distance columns
csd <- cbind(csd, find_nearest_distance(csd))

if (FALSE) {
# If `merged` is a data frame containing cell seg data from multiple fields,
# this code will create a new `tibble` with distance columns computed
# for each `Sample Name` in the data.
merged_with_distance <- merged %>%
  dplyr::group_by(`Sample Name`) %>%
  dplyr::do(dplyr::bind_cols(., find_nearest_distance(.)))
}