Count the number of from cells having a to cell within radius microns in tissue category category. Compute the average number of to cells within radius of from cells.

count_within(csd, from, to, radius, category = NA, dst = NULL)

Arguments

csd

A data frame with Cell X Position, Cell Y Position and Phenotype columns, such as the result of calling read_cell_seg_data.

from, to

Selection criteria for the from cells and the to cells. Accepts all formats accepted by select_rows.

radius

The radius or radii, in microns, to search within.

category

Optional tissue category to restrict both from and to.

dst

Optional distance matrix corresponding to csd, produced by calling distance_matrix.

Value

A tibble with five columns and one row for each value in radius:

radius

The value of radius for this row.

from_count

The number of from cells found in csd.

to_count

The number of to cells found in csd.

from_with

The number of from cells with a to cell within radius.

within_mean

The average number of to cells found within radius microns of each from cell.

Details

For each from cell, count the number of to cells within radius microns. Report the number of from cells that have at least one to cell within radius as from_with. Report the average number of to cells per from cell as within_mean.
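
As a rough illustration of the calculation described above, the following base R sketch computes from_with and within_mean for a handful of made-up cells. It is not the package's implementation: the cells data frame, its coordinates, and the choice to treat a cell exactly at the radius as "within" are all assumptions made for the example.

```r
# Toy data: positions in microns and a phenotype per cell (hypothetical values)
cells <- data.frame(
  x = c(0, 3, 10, 11, 30, 50),
  y = c(0, 4, 0, 0, 0, 50),
  phenotype = c('CK+', 'CD68+', 'CK+', 'CD68+', 'CD68+', 'CK+')
)
radius <- 9

# Pairwise Euclidean distances between all cells
dst <- as.matrix(dist(cells[, c('x', 'y')]))

is_from <- cells$phenotype == 'CK+'
is_to <- cells$phenotype == 'CD68+'

# Rows are from cells, columns are to cells
from_to <- dst[is_from, is_to, drop = FALSE]

# Number of to cells within radius of each from cell
# (assumption: a cell exactly at the radius counts as within)
per_from <- rowSums(from_to <= radius)

from_count <- length(per_from)   # number of from cells
from_with <- sum(per_from > 0)   # from cells with at least one to cell nearby
within_mean <- mean(per_from)    # average number of to cells per from cell
```

Here the three CK+ cells have 1, 2 and 0 CD68+ cells within 9 microns, so from_with is 2 and within_mean is 1.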

count_within counts cells within a single field. It will give an error if run on a merged cell seg data file. To count cells in a merged file, use group_by and do to call count_within for each sample in the merged file. See the Examples.

There are some subtleties to the count calculation.

  • It is not symmetric in from and to. For example the number of tumor cells with a macrophage within 25 microns is not the same as the number of macrophages with a tumor cell within 25 microns.

  • from_count*within_mean is not the number of distinct to cells within radius of a from cell. A to cell that is near several from cells is counted once for each of them.

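  • Surprisingly, from_count*within_mean is symmetric in from and to. It is the number of (from, to) pairs that lie within radius of each other, which does not depend on which phenotype is counted first.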
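
These points can be checked on a small made-up data set. The sketch below uses a hypothetical helper, toy_count, as a simplified stand-in for the counting step; it is not part of phenoptr, and the coordinates are invented. One CD68+ cell sits among three CK+ cells and a second CD68+ cell is far away.

```r
# Hypothetical toy positions in microns
cells <- data.frame(
  x = c(0, 2, 0, 1, 100),
  y = c(0, 0, 2, 1, 100),
  phenotype = c('CK+', 'CK+', 'CK+', 'CD68+', 'CD68+')
)
dst <- as.matrix(dist(cells[, c('x', 'y')]))

# Simplified stand-in for the counting step; not phenoptr's implementation
toy_count <- function(from, to, radius) {
  near <- dst[cells$phenotype == from, cells$phenotype == to, drop = FALSE] <= radius
  per_from <- rowSums(near)
  c(from_count = length(per_from),
    from_with = sum(per_from > 0),
    within_mean = mean(per_from))
}

a <- toy_count('CK+', 'CD68+', radius = 5)
b <- toy_count('CD68+', 'CK+', radius = 5)

# from_with differs: 3 CK+ cells have a CD68+ cell nearby,
# but only 1 CD68+ cell has a CK+ cell nearby
a[['from_with']]
b[['from_with']]

# The products agree: both are the number of (CK+, CD68+) pairs within radius
pairs_a <- a[['from_count']] * a[['within_mean']]
pairs_b <- b[['from_count']] * b[['within_mean']]
```

The same relationship holds, up to rounding, in the sample data shown in the Examples: 417 * 0.288 and 2257 * 0.0532 are both about 120.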

To aggregate within_mean across multiple samples (e.g. by Slide ID), see the examples below.
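
The aggregation pools the underlying counts rather than averaging the per-field means, so that fields with more from cells carry more weight. A minimal sketch of the arithmetic, using a hypothetical fields data frame with invented values for two fields at a single radius:

```r
# Hypothetical per-field results for one slide at a single radius
fields <- data.frame(
  from_count = c(10, 90),
  within_mean = c(2.0, 1.0)
)

# Total number of (from, to) pairs within radius, summed over fields
within <- sum(fields$from_count * fields$within_mean)

# Pooled mean: each field is weighted by its number of from cells
pooled_mean <- within / sum(fields$from_count)

# Unweighted mean of the per-field means, shown only for comparison
naive_mean <- mean(fields$within_mean)
```

With these values the pooled mean is 110 / 100 = 1.1, while the unweighted mean of the two fields is 1.5.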

If category is specified, all reported values are for cells within the given tissue category. If category is NA, values are reported for the entire data set.

radius may be a vector with multiple values.

Examples

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
csd <- sample_cell_seg_data

# Find the number of macrophages with a tumor cell within 10 or 25 microns
count_within(csd, from='CD68+', to='CK+', radius=c(10, 25))
#> # A tibble: 2 x 5
#>   radius from_count to_count from_with within_mean
#>    <dbl>      <int>    <int>     <int>       <dbl>
#> 1     10        417     2257        89       0.288
#> 2     25        417     2257       274       3.28 

# Find the number of tumor cells with a macrophage within 10 or 25 microns
count_within(csd, from='CK+', to='CD68+', radius=c(10, 25))
#> # A tibble: 2 x 5
#>   radius from_count to_count from_with within_mean
#>    <dbl>      <int>    <int>     <int>       <dbl>
#> 1     10       2257      417       109      0.0532
#> 2     25       2257      417       664      0.606 

if (FALSE) {
# If 'merged' is a merged cell seg file, this will run count_within for
# each field:
distances <- merged %>% group_by(`Slide ID`, `Sample Name`) %>%
  do(count_within(., from='CK+', to='CD68+', radius=c(10, 25)))

# This will aggregate the fields by Slide ID:
distances %>% group_by(`Slide ID`, radius) %>%
  summarize(within=sum(from_count*within_mean, na.rm=TRUE),
            from_count=sum(from_count),
            to_count=sum(to_count),
            from_with=sum(from_with),
            within_mean=within/from_count) %>%
  select(-within)
}