R/distance_funcs.R
find_nearest_distance.Rd
For each cell in a single sample, find the distances from the cell to the nearest neighbor cells in each of the provided phenotypes.
find_nearest_distance(csd, phenotypes = NULL, dst = NULL)
csd: A data frame with Cell X Position, Cell Y Position and Phenotype columns, such as the result of calling read_cell_seg_data.
phenotypes: Optional list of phenotypes to include. If omitted, unique_phenotypes(csd) will be used.
dst: Optional distance matrix. If provided, this should be distance_matrix(csd). Not used if the rtree package is available.
A tibble containing a Distance to <phenotype> column and a Cell ID <phenotype> column for each phenotype. Columns will contain NA values where there is no other cell of the phenotype.
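To make the shape of the result concrete, here is a minimal base-R sketch of a distance-matrix approach on toy data. It is not phenoptr's implementation: the helper nearest_by_matrix and the toy data frame are hypothetical, and only the column naming and the NA behavior follow the description above.

```r
# Hypothetical helper for illustration only; not part of phenoptr.
nearest_by_matrix <- function(csd, phenotypes = unique(csd$Phenotype)) {
  xy <- as.matrix(csd[, c("Cell X Position", "Cell Y Position")])
  dst <- as.matrix(dist(xy))  # dense n x n Euclidean distance matrix
  diag(dst) <- NA             # a cell is never its own nearest neighbor
  out <- list()
  for (p in phenotypes) {
    cols <- which(csd$Phenotype == p)
    sub <- dst[, cols, drop = FALSE]
    # Column index of the nearest cell of phenotype p, or NA if there is none
    idx <- apply(sub, 1, function(d) {
      if (all(is.na(d))) NA_integer_ else which.min(d)
    })
    out[[paste("Distance to", p)]] <- sub[cbind(seq_len(nrow(sub)), idx)]
    out[[paste("Cell ID", p)]] <- csd$`Cell ID`[cols[idx]]
  }
  data.frame(out, check.names = FALSE)
}

# Toy data (assumed values): two CD8+ cells and a single CK+ cell
toy <- data.frame(
  `Cell ID` = 1:3,
  `Cell X Position` = c(0, 3, 6),
  `Cell Y Position` = c(0, 4, 8),
  Phenotype = c("CD8+", "CD8+", "CK+"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
result <- nearest_by_matrix(toy)
print(result)
```

In this toy case the only CK+ cell has no other CK+ cell to measure against, so its Distance to CK+ and Cell ID CK+ entries are NA.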
If the rtree package is available, this will use a fast, memory-efficient algorithm capable of processing fields with many thousands of cells. Otherwise, a simple distance matrix algorithm is used. The simple algorithm requires at least 8 * (number of cells)^2 bytes of memory, which becomes prohibitive as the number of cells becomes large.
Install the rtree package from GitHub using the command devtools::install_github('akoyabio/rtree').
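As a rough guide to when the distance matrix approach stops being practical, the 8 * (number of cells)^2 lower bound can be evaluated directly. The helper name dense_matrix_bytes and the larger cell counts below are illustrative assumptions; 6,072 is the number of rows in the sample data used in the examples.

```r
# Lower bound on memory for a dense n x n matrix of doubles (8 bytes each).
# Hypothetical helper for illustration; not part of phenoptr.
dense_matrix_bytes <- function(n_cells) 8 * n_cells^2

n_cells <- c(6072, 50000, 200000)
estimate <- data.frame(
  cells = n_cells,
  gigabytes = dense_matrix_bytes(n_cells) / 1e9
)
print(estimate)

# Check whether the rtree package is installed in this R library
has_rtree <- requireNamespace("rtree", quietly = TRUE)
```

By this estimate the sample field needs roughly 0.3 GB, a 50,000-cell field needs 20 GB, and a 200,000-cell field needs 320 GB, before counting any working copies.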
See also compute_all_nearest_distance, which applies this function to a (possibly merged) data file.
Other distance functions: compute_all_nearest_distance(), count_touching_cells(), count_within_batch(), count_within_many(), count_within(), distance_matrix(), spatial_distribution_report(), subset_distance_matrix()
# Compute distance columns
csd <- sample_cell_seg_data
nearest <- find_nearest_distance(csd)
dplyr::glimpse(nearest)
#> Rows: 6,072
#> Columns: 10
#> $ `Distance to CD68+` <dbl> 29.529646, 30.269622, 38.082148, 36.674242, 15.62~
#> $ `Cell ID CD68+` <int> 108, 99, 41, 262, 41, 99, 4949, 217, 69, 99, 108,~
#> $ `Distance to CD8+` <dbl> 18.03469, 50.09241, 64.37585, 67.57403, 60.00208,~
#> $ `Cell ID CD8+` <int> 101, 189, 5068, 423, 5068, 189, 182, 128, 188, 18~
#> $ `Distance to CK+` <dbl> 36.830694, 21.377558, 109.317199, 3.605551, 105.0~
#> $ `Cell ID CK+` <int> 192, 166, 5127, 45, 636, 58, 68, 4943, 209, 166, ~
#> $ `Distance to FoxP3+` <dbl> 16.347783, 24.909837, 40.140379, 30.870698, 26.40~
#> $ `Cell ID FoxP3+` <int> 117, 138, 214, 229, 214, 138, 437, 229, 102, 138,~
#> $ `Distance to other` <dbl> 10.307764, 6.800735, 8.062258, 19.811613, 5.59017~
#> $ `Cell ID other` <int> 76, 36, 84, 57, 50, 40, 43, 142, 49, 71, 60, 85, ~
# Make a combined data frame including original data and distance columns
csd <- cbind(csd, find_nearest_distance(csd))
if (FALSE) {
  # If `merged` is a data frame containing cell seg data from multiple fields,
  # this code will create a new `tibble` with distance columns computed
  # for each `Sample Name` in the data.
  merged_with_distance <- merged %>%
    dplyr::group_by(`Sample Name`) %>%
    dplyr::do(dplyr::bind_cols(., find_nearest_distance(.)))
}