Use a Gaussian mixture model (with random parameters) to generate a training dataset from the reference single-cell data.

Usage

BuildTrainingSet(
  count,
  latent,
  max.iter = 5000,
  max.cent = 5,
  step = ifelse(max.iter <= 10000, max.iter, 10000),
  dims = 10,
  min.cent = 1,
  n = round(ncol(count)/2),
  sigma_min_cells = NULL,
  sigma_max_cells = NULL,
  verbose = FALSE
)
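
Two defaults in the signature above are computed from other inputs rather than fixed. The base R sketch below evaluates those two expressions on toy values to show what they resolve to. The helper names `default_step` and `default_n` and the matrix `toy_count` are hypothetical and exist only for illustration; the expressions themselves are copied from the signature.

```r
# Hypothetical helpers: they only re-evaluate the default expressions
# shown in the Usage signature; they are not part of the package.
default_step <- function(max.iter) ifelse(max.iter <= 10000, max.iter, 10000)
default_n    <- function(count) round(ncol(count) / 2)

default_step(5000)   # equals max.iter, because max.iter <= 10000
default_step(25000)  # capped at 10000

# Toy count matrix: 3 features x 8 cells (features x cells, as documented)
toy_count <- matrix(0, nrow = 3, ncol = 8)
default_n(toy_count)  # half the number of cells: 4
```

Note that `round()` in R rounds halves to the nearest even integer, so an odd number of cells may round down or up.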

Arguments

count

single-cell count matrix (features x cells)

latent

matrix of single-cell latent space (cells x dims)

max.iter

size of the training dataset (default = 5000)

max.cent

maximum number of centers in the Gaussian mixture (default = 5)

step

manually parallelize building the training dataset (default = max.iter when max.iter <= 10000, otherwise 10000)

dims

number of dimensions from latent (default = 10)

min.cent

minimum number of centers in the Gaussian mixture (default = 1)

n

number of cells to be chosen to create the training dataset (default is half the number of cells in the count matrix)

sigma_min_cells

minimum number of cells that should be captured by the standard deviation of the Gaussian (default = NULL)

sigma_max_cells

maximum number of cells that should be captured by the standard deviation of the Gaussian (default = NULL)

verbose

logical indicating whether to print progress (default = FALSE)
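
Because count is features x cells and latent is cells x dims, the two inputs should agree on the number of cells. The sketch below is one way to check that before calling BuildTrainingSet. The helper `check_inputs` and the toy matrices are hypothetical and not part of the package, and the checks on `dims` and on the center bounds are assumptions inferred from the argument descriptions rather than documented requirements.

```r
# Hypothetical pre-flight check; not part of the package.
check_inputs <- function(count, latent, dims = 10, min.cent = 1, max.cent = 5) {
  stopifnot(
    length(dim(count)) == 2,        # count is 2-dimensional
    length(dim(latent)) == 2,       # latent is 2-dimensional
    ncol(count) == nrow(latent),    # one latent row per cell in count
    dims <= ncol(latent),           # assumed: cannot use more dims than latent has
    min.cent >= 1,                  # assumed: at least one center
    min.cent <= max.cent            # assumed: bounds are ordered
  )
  invisible(TRUE)
}

set.seed(1)
toy_count  <- matrix(rpois(20 * 30, lambda = 2), nrow = 20, ncol = 30)  # 20 features x 30 cells
toy_latent <- matrix(rnorm(30 * 10), nrow = 30, ncol = 10)              # 30 cells x 10 dims

check_inputs(toy_count, toy_latent)  # passes silently
```

Passing a latent matrix with a different number of rows than count has columns makes the helper stop with an error.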

Value

ConDecon object with training data

Examples

data(counts_gps)
data(latent_gps)

# For this example, set max.iter = 50 to shorten the run time
TrainingSet = BuildTrainingSet(count = counts_gps, latent = latent_gps, max.iter = 50)
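
For intuition about what "a Gaussian mixture model with random parameters" over a latent space can look like, here is a minimal base R sketch. It is not the package's internal code: the function `random_mixture_probs`, the isotropic kernel without its normalizing constant, the choice of centers at randomly picked cells, and the arbitrary range for the standard deviations are all assumptions made for illustration. Only the roles of min.cent and max.cent are taken from the argument descriptions.

```r
# Illustrative only: NOT the package's implementation. All names are hypothetical.
set.seed(42)
latent <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)  # 200 cells x 10 dims

random_mixture_probs <- function(latent, min.cent = 1, max.cent = 5) {
  # Random number of centers between min.cent and max.cent (inclusive)
  k <- min.cent + sample.int(max.cent - min.cent + 1, 1) - 1
  # Assumption: place each center at a randomly chosen cell
  centers <- latent[sample.int(nrow(latent), k), , drop = FALSE]
  # Random mixture weights that sum to 1
  weights <- runif(k)
  weights <- weights / sum(weights)
  # Assumption: isotropic Gaussians with an arbitrary range of standard deviations
  sigmas <- runif(k, min = 0.5, max = 2)

  dens <- numeric(nrow(latent))
  for (j in seq_len(k)) {
    d2 <- rowSums(sweep(latent, 2, centers[j, ])^2)  # squared distance to center j
    dens <- dens + weights[j] * exp(-d2 / (2 * sigmas[j]^2))
  }
  dens / sum(dens)  # normalize to a probability distribution over cells
}

p <- random_mixture_probs(latent)
length(p)  # one probability per cell
```

Each call draws new random parameters, so repeating it yields a different distribution over cells every time.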