MCM (Meta-Classification Model)

 

The MCM project targets the creation of meta-classifiers built on top of the COVE prediction algorithms. To classify a DNA/RNA sequence, we examine one EMM for each sub-pattern length (1, 2, 3, 4, 5). Each EMM is trained on representative sequences from the target set, and an EMM thus trained can be regarded as a sequence classifier. Its output can be either a crisp class label (1 for class, 0 for non-class) or a continuous-valued probability (a soft class label). Short sub-pattern lengths cannot capture interactions between nucleotides that are relatively far apart, yet those interactions may be important for RNA secondary structure. Large sub-pattern lengths can capture long-range interactions, but they also produce high-dimensional TCGR feature spaces, which generally pose difficulties for classifiers, especially when training data are limited. Because of this trade-off, it is often unclear which sub-pattern length is optimal. Therefore, instead of trying to select the single best EMM classifier, we propose a classifier-fusion framework that combines the outputs of EMM classifiers using different sub-pattern lengths.
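The fusion idea can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the project itself: the fusion rule is a plain average of the soft class labels followed by a 0.5 threshold, the per-length probabilities are made-up numbers, and the names `feature_space_size` and `fuse_soft_labels` are hypothetical. The count of possible sub-patterns (4 to the power of the length) is shown only to indicate how quickly the feature space can grow; the actual dimensionality of the TCGR feature space is not stated here.

```python
from typing import Sequence, Tuple

ALPHABET = "ACGT"  # DNA alphabet; for RNA, T would be replaced by U


def feature_space_size(length: int) -> int:
    """Number of distinct sub-patterns of the given length over a 4-letter alphabet."""
    return len(ALPHABET) ** length


def fuse_soft_labels(probabilities: Sequence[float],
                     threshold: float = 0.5) -> Tuple[float, int]:
    """Average the soft class labels, then threshold into a crisp label.

    Simple averaging is an assumed fusion rule, chosen only for illustration.
    """
    if not probabilities:
        raise ValueError("need at least one classifier output")
    fused = sum(probabilities) / len(probabilities)
    return fused, int(fused >= threshold)


# Hypothetical soft outputs of five classifiers, keyed by sub-pattern length.
outputs = {1: 0.55, 2: 0.62, 3: 0.71, 4: 0.48, 5: 0.40}

fused, label = fuse_soft_labels(list(outputs.values()))

for length in outputs:
    print(f"length {length}: {feature_space_size(length)} possible sub-patterns, "
          f"soft label {outputs[length]:.2f}")
print(f"fused probability {fused:.3f}, crisp label {label}")
```

In this made-up example the classifiers for lengths 4 and 5 vote against the class on their own, while the fused decision still assigns the class label; no single sub-pattern length has to be chosen in advance.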

 

Publications:

1. Yuhang Wang, Margaret H. Dunham, James A. Waddle, and Monnie McGee, “Classifier Fusion for Poorly-Differentiated Tumor Classification using Both Messenger RNA and MicroRNA Expression Profiles,” accepted by the 2006 Computational Systems Bioinformatics Conference (CSB 2006), Stanford, California, 2006.