fit-methods {Rtreemix} | R Documentation |
Function for fitting a mutagenetic trees mixture model to a given dataset data. The dataset and the number of trees K have to be specified.
The function estimates a K-oncogenetic trees mixture model from the
specified data by using an EM-like learning algorithm. The first tree
component of the model has a star topology and is referred to as the
noise component.
## S4 method for signature 'RtreemixData, numeric':
fit(data, K, ...)
data |
An RtreemixData object giving the dataset used for
learning the trees mixture model. |
K |
An integer larger than 0 specifying the number of
branchings in the mixture model. |
... |
no.start.sol is an integer larger than 0 specifying the number of starting solutions for the k-means
algorithm. The default value is 100.
eps is a numeric giving the minimum conditional probability required for an edge to be included. The
default value is 0.01.
weighing is a logical specifying whether to use special
weights log(Pr(v)) for the edges (root, v). The default value is FALSE.
equal.edgeweights is a logical specifying whether to use
equal edge weights in the noise component. The default value is
TRUE. When you have few data samples, always use the default value (TRUE)
to ensure nonzero probabilities for all possible patterns (sets of events).
seed is a positive integer specifying the random generator
seed. The default value is -1, in which case the current time is used to seed the
random generator.
noise is a logical indicating the presence of a noise
(star) component in the fitted mixture model. It is mostly relevant
for models with a single tree component, since it is assumed that mixture models with
at least two components always have the noise as a first component.
|
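The optional arguments described above are passed by name after data and K. The following is a minimal sketch, assuming the Rtreemix package is installed and that generate and sim behave as in the Examples section of this page; the values shown simply restate the documented defaults, except seed, which is set to an arbitrary fixed value here for illustration.

```r
library(Rtreemix)

# Simulate a small dataset from a randomly generated model
# (same calls as in the Examples section, with smaller sizes).
rand.mod <- generate(K = 2, no.events = 7, noise.tree = TRUE, prob = c(0.2, 0.8))
data <- sim(model = rand.mod, no.draws = 200)

# Fit a 2-component mixture, spelling out every optional argument.
mod <- fit(data = data, K = 2,
           no.start.sol = 100,        # starting solutions for k-means
           eps = 0.01,                # minimum conditional probability for an edge
           weighing = FALSE,          # no special log(Pr(v)) weights on (root, v)
           equal.edgeweights = TRUE,  # equal edge weights in the noise component
           seed = 42,                 # fixed seed instead of the time-based default
           noise = TRUE)              # first component is the star (noise) tree

numTrees(mod)
```

Fixing seed to a positive integer, rather than leaving it at -1, is the documented way to avoid the time-based seeding of the random generator.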
When K = 1 and noise = FALSE a single mutagenetic tree is fit to the data. When K = 1 and noise = TRUE a star mutagenetic tree is fit to the data. If K > 1 the first mutagenetic tree is always the star, i.e. the case K > 1 and noise = FALSE is not possible.
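The three valid combinations of K and noise can be sketched as follows. This assumes the Rtreemix package is installed; the simulated dataset and the object names single.tree, star.only and mix are illustrative only.

```r
library(Rtreemix)

# Illustrative data simulated from a random 2-component model.
rand.mod <- generate(K = 2, no.events = 7, noise.tree = TRUE, prob = c(0.2, 0.8))
data <- sim(model = rand.mod, no.draws = 200)

# K = 1, noise = FALSE: a single mutagenetic tree.
single.tree <- fit(data = data, K = 1, noise = FALSE)

# K = 1, noise = TRUE: a star mutagenetic tree only.
star.only <- fit(data = data, K = 1, noise = TRUE)

# K > 1: the first component is always the star (noise) tree.
mix <- fit(data = data, K = 2, noise = TRUE)

# Inspect the first (star) component of the mixture.
getTree(object = mix, k = 1)
```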
The method returns an RtreemixModel object that represents the K-trees mixture model learned from the given dataset.
When you have too few data samples, always use the default value TRUE
for equal.edgeweights. This ensures that all possible
patterns (sets of events) have non-zero probabilities. Otherwise the
fitting procedure will not complete and an error is raised.
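When fitting many small datasets in a loop, it can be convenient to keep a failed fit from stopping the whole script. The following is a minimal sketch using base R's tryCatch; safe_fit is a hypothetical helper, not part of Rtreemix, and it assumes the failure is signalled as an ordinary R error.

```r
library(Rtreemix)

# Hypothetical helper (not part of Rtreemix): always fit with
# equal.edgeweights = TRUE and return NULL instead of stopping
# if fit() signals an error. Do not pass equal.edgeweights via '...'.
safe_fit <- function(data, K, ...) {
  tryCatch(
    fit(data = data, K = K, equal.edgeweights = TRUE, ...),
    error = function(e) {
      message("fit() failed: ", conditionMessage(e))
      NULL
    }
  )
}

# A deliberately small simulated dataset.
rand.mod <- generate(K = 2, no.events = 7, noise.tree = TRUE, prob = c(0.2, 0.8))
small.data <- sim(model = rand.mod, no.draws = 30)

mod <- safe_fit(small.data, K = 2, noise = TRUE)
if (!is.null(mod)) show(mod)
```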
Jasmina Bogojeska
Learning multiple evolutionary pathways from cross-sectional data, N. Beerenwinkel et al.
RtreemixData-class, RtreemixModel-class, generate-methods, bootstrap-methods, confIntGPS-methods
library(Rtreemix)

## Create an RtreemixData object from a randomly generated RtreemixModel object.
rand.mod <- generate(K = 3, no.events = 9, noise.tree = TRUE, prob = c(0.2, 0.8))
data <- sim(model = rand.mod, no.draws = 300)
show(data)

## Create an RtreemixModel object by fitting a model to the given data.
mod <- fit(data = data, K = 3, equal.edgeweights = TRUE, noise = TRUE)
show(mod)

## See the number of tree components in the mixture model.
numTrees(mod)

## See the weights of the branchings from the fitted mixture model.
Weights(mod)

## See a specific tree component k.
getTree(object = mod, k = 2)