Section II
Additional Statistical Methods
The ANN supervised learning
algorithms have been described previously (ref. 2). To
determine the performance of each ANN model,
a confidence threshold was built for
each diagnostic subtype using a modification
of the method described by Khan et al. (ref. 3). Models
were built with two possible outputs: subgroup
and non-subgroup. Three ANN models were built
by 3-fold cross-validation using only
samples in the training set. The training
set samples were then shuffled and three additional
ANN models were built. This model-building
process was repeated 100 times.
An empirical probability distribution for
the ANN output node value was summarized using
only nodal values greater than 0.5 to determine
the 95% confidence threshold. For each individual
sample in the training set, the 100 validation
subtype nodal values were averaged, and the sample
was assigned to the subgroup only when its
average subtype nodal value was greater than
the 95% confidence threshold. Similarly, the
nodal values for each test set sample were averaged,
and the sample was assigned to a subgroup only when
its average nodal value exceeded the 95% confidence
threshold defined on the training set.
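
The thresholding and assignment steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names, the dictionary layout and the example values are hypothetical. The text does not state which quantile of the empirical distribution defines the 95% confidence threshold, so the sketch assumes the 5th percentile of the pooled nodal values above 0.5 and exposes that choice as a parameter.

```python
import statistics


def empirical_threshold(validation_outputs, floor=0.5, quantile=0.05):
    """Pool nodal values above `floor` and return an empirical quantile.

    ASSUMPTION: the source does not say how the 95% threshold is read off
    the distribution; here it is the nearest-rank 5th percentile, so that
    roughly 95% of the pooled values lie at or above the threshold.
    """
    pooled = sorted(
        value
        for values in validation_outputs.values()
        for value in values
        if value > floor
    )
    if not pooled:
        raise ValueError("no nodal values above the floor")
    index = min(len(pooled) - 1, int(quantile * len(pooled)))
    return pooled[index]


def assign_to_subgroup(outputs, threshold):
    """Average each sample's nodal values and compare with the threshold."""
    return {
        sample: statistics.mean(values) > threshold
        for sample, values in outputs.items()
    }


# Hypothetical validation outputs (3 repetitions per sample instead of 100).
training_outputs = {
    "s1": [0.9, 0.95, 0.85],
    "s2": [0.6, 0.4, 0.7],
    "s3": [0.1, 0.2, 0.05],
}
threshold = empirical_threshold(training_outputs)
training_calls = assign_to_subgroup(training_outputs, threshold)

# Test set samples are judged against the threshold from the training set.
test_outputs = {"t1": [0.8, 0.75], "t2": [0.55, 0.5]}
test_calls = assign_to_subgroup(test_outputs, threshold)
```

Samples whose averaged nodal value does not exceed the threshold are left unassigned rather than forced into a subgroup.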