read_cell_seg_data makes it easier to use data from Akoya Biosciences' inForm program. It reads data files written by inForm 2.0 and later and does useful cleanup on the result.

read_cell_seg_data(
path = NA,
pixels_per_micron = getOption("phenoptr.pixels.per.micron"),
remove_units = TRUE,
col_select = NULL
)

## Arguments

path

Path to the file to read, or NA to use a file chooser.

pixels_per_micron

Conversion factor to microns (default 2 pixels/micron, the resolution of 20x MSI fields taken on Vectra Polaris and Vectra 3). Set to NA to skip conversion. Set to 'auto' to read from an associated component_data.tif file.

remove_units

If TRUE (default), remove the unit name from expression columns.

col_select

Optional column selection expression. May be one of:

• NULL - retain all columns

• "phenoptrReports" - retain only columns needed by functions in the phenoptrReports package.

• A quoted list of one or more selection expressions, like in dplyr::select() (see example).

## Value

A tibble containing the cleaned-up data set.

## Details

read_cell_seg_data reads single-field tables, merged tables, and consolidated tables, and does useful cleanup on the data:

• Removes columns that are all NA. These are typically unused summary columns.

• Converts percent columns to numeric fractions.

• Converts pixel distances to microns. The conversion factor may be specified as a parameter, by setting options(phenoptr.pixels.per.micron), or by reading an associated component_data.tif file.

• Optionally removes units from expression names.

• If the file contains multiple sample names, a tag column is created containing a minimal, unique tag for each sample. This is useful when a short name is needed, for example in chart legends.
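
The first three cleanup steps above can be illustrated on a toy table in base R. This is only a sketch of the idea, not the code read_cell_seg_data actually runs: the column names, the percent format, and the rule for spotting position columns are all assumptions made for the example.

```r
# Toy stand-in for an inForm table; column names are illustrative only.
csd <- data.frame(
  `Sample Name`     = c("S1", "S1", "S2"),
  `Cell X Position` = c(100, 250, 400),        # pixels
  `Cell Y Position` = c(50, 80, 120),          # pixels
  `Confidence`      = c("98.5%", "87%", "100%"),
  `Total Cells`     = c(NA, NA, NA),           # unused summary column
  check.names = FALSE,
  stringsAsFactors = FALSE
)

pixels_per_micron <- 2  # the documented default

# 1. Drop columns that are entirely NA.
all_na <- vapply(csd, function(col) all(is.na(col)), logical(1))
csd <- csd[, !all_na, drop = FALSE]

# 2. Convert percent strings to numeric fractions (assumed "12.3%" format).
is_pct <- vapply(
  csd,
  function(col) is.character(col) && all(grepl("%$", col)),
  logical(1)
)
csd[is_pct] <- lapply(
  csd[is_pct],
  function(col) as.numeric(sub("%$", "", col)) / 100
)

# 3. Convert pixel positions to microns (assumed: names containing "Position").
pos_cols <- grep("Position", names(csd), value = TRUE)
csd[pos_cols] <- lapply(csd[pos_cols], function(col) col / pixels_per_micron)
```

After these steps the toy table has no all-NA column, Confidence holds fractions between 0 and 1, and the position columns are in microns.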

If pixels_per_micron='auto', read_cell_seg_data looks for a component_data.tif file in the same directory as path. If found, pixels_per_micron is read from the file and the cell coordinates are offset to the correct spatial location.
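
Before requesting 'auto', it can be useful to check whether a component file is actually present. The sketch below assumes the component file shares the cell seg file's name with `_cell_seg_data.txt` replaced by `_component_data.tif`; that pattern, the helper name `guess_component_path`, and the example path are assumptions for illustration, not something this page specifies.

```r
# Hypothetical helper: guess the component file that sits next to a
# cell seg data file. The naming pattern is an assumption.
guess_component_path <- function(path) {
  sub("_cell_seg_data\\.txt$", "_component_data.tif", path)
}

# Made-up example path; no such file is expected to exist.
path <- file.path("export", "Sample_[1234,5678]_cell_seg_data.txt")
component_path <- guess_component_path(path)

# Ask for 'auto' only when the component file is present; otherwise fall
# back to the session option, or to the documented default of 2.
pixels_per_micron <- if (file.exists(component_path)) {
  "auto"
} else {
  getOption("phenoptr.pixels.per.micron", default = 2)
}
```

The resulting value could then be passed as the pixels_per_micron argument.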

If col_select is "phenoptrReports", only columns normally needed by phenoptrReports are read. This can dramatically reduce the time to read a file and the memory required to store the results.

Specifically, passing col_select='phenoptrReports' omits:

• Component stats other than mean expression

• Shape stats other than area

• Path, Processing Region ID, Category Region ID, Lab ID, Confidence, and columns which are normally blank.
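
To make the effect of these rules concrete, the sketch below applies them to a handful of made-up column names using base R regular expressions. The names and patterns are assumptions chosen for illustration; the actual selection logic in read_cell_seg_data may differ.

```r
# Made-up column names resembling an inForm export.
cols <- c(
  "Sample Name", "Path", "Phenotype", "Confidence", "Lab ID",
  "Cell X Position", "Cell Y Position",
  "Entire Cell Area (pixels)", "Entire Cell Compactness",
  "Entire Cell DAPI Mean", "Entire Cell DAPI Min", "Entire Cell DAPI Max"
)

# Columns dropped by name.
by_name <- grepl(
  "^(Path|Lab ID|Confidence|Processing Region ID|Category Region ID)$",
  cols
)

# Component stats other than mean expression (assumed suffixes).
other_stats <- grepl(" (Min|Max|Std Dev|Total)$", cols)

# Shape stats other than area (assumed names).
other_shape <- grepl("Compactness|Axis", cols)

kept <- cols[!(by_name | other_stats | other_shape)]
print(kept)
```

In this toy case half of the columns are dropped, which suggests why reading a large merged file with this option can be much faster and lighter on memory.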

## See also

Other file readers: get_field_info(), list_cell_seg_files(), read_components(), read_maps()

## Examples

path <- sample_cell_seg_path()
csd <- read_cell_seg_data(path)

# count all the phenotypes in the data
table(csd$Phenotype)
#>
#>  CD68+   CD8+    CK+ FoxP3+  other
#>    417    228   2257    228   2942

# Read only columns needed by phenoptrReports