MCM (Meta-Classification Model)
The MCM project aims to create meta-classifiers built on top of the COVE
prediction algorithms. To classify a DNA/RNA sequence, we train an EMM for
each sub-pattern length (1, 2, 3, 4, 5). Each EMM trained this way can be
treated as a sequence classifier. The training data are representative
sequences in the target set.
The output of this EMM classifier can be crisp class labels (1 for class
and 0 for non-class) or continuous-valued probabilities (soft class
labels). Clearly, short sub-pattern lengths will not capture potential
interactions between nucleotides that are relatively far apart, yet those
interactions may be important for RNA secondary structures. On the other hand, although large
sub-pattern lengths can capture long-range interactions, they will also
result in high-dimensional TCGR feature spaces,
which generally pose difficulties for classifiers, especially when training
data are limited. Because of such trade-offs, it is often unclear which
sub-pattern length will be optimal. Therefore, instead of trying to select
the single best EMM classifier, we propose to use a classifier-fusion
framework to combine the outputs of EMM classifiers using different
sub-pattern lengths.
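As an illustration of the fusion idea, the following minimal Python sketch combines the outputs of five classifiers, one per sub-pattern length. The text above does not say which fusion rule MCM uses, so the mean rule for soft labels and the majority vote for crisp labels are assumptions chosen for simplicity. The function names, the 0.5 threshold, and the example probabilities are likewise hypothetical.

```python
from statistics import mean


def fuse_soft(probabilities):
    """Mean rule: average the soft class labels of all classifiers."""
    return mean(probabilities)


def fuse_crisp(labels):
    """Majority vote over crisp 0/1 labels; a tie goes to non-class (0)."""
    return 1 if 2 * sum(labels) > len(labels) else 0


# Hypothetical soft outputs of five EMM classifiers,
# keyed by sub-pattern length (made-up values, not real results).
soft_outputs = {1: 0.55, 2: 0.70, 3: 0.80, 4: 0.65, 5: 0.40}

# Fuse the continuous-valued probabilities directly.
fused_probability = fuse_soft(list(soft_outputs.values()))

# Alternatively, threshold each output at 0.5 (assumed) and vote.
crisp_labels = [1 if p >= 0.5 else 0 for p in soft_outputs.values()]
fused_label = fuse_crisp(crisp_labels)

print(fused_probability, fused_label)
```

In this made-up example the classifier for length 5 disagrees with the other four, yet the fused decision still follows the majority, so no single sub-pattern length has to be picked in advance.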
Publications:
1. Yuhang Wang, Margaret H. Dunham, James A. Waddle, and Monnie
McGee, “Classifier Fusion for Poorly-Differentiated Tumor Classification
using Both Messenger RNA and MicroRNA Expression Profiles,” accepted by the
2006 Computational Systems Bioinformatics Conference (CSB 2006), Stanford,
California, 2006.