HykGene User Manual

Yuhang Wang
2004

How to install

HykGene is a mixture of Matlab, Java and Perl code. Please make sure your computer system meets the following requirements.

System requirements:

Follow these steps to install HykGene:

  1. Download hykgene.zip to your computer and unzip it to a directory, for example C:\tools\hykgene.
  2. Add the HykGene directory to the Matlab path. If you don't know how to do this, refer to Matlab's user manual.
  3. Add the HykGene directory to the Java CLASSPATH environment variable.
  4. Download and install PRTools 3.1.7 for Matlab, then add the PRTools directory to the Matlab path. Later versions may also work, but I haven't tested them.
  5. If you would like to use the SOM clustering option, download and install the SOM Toolbox for Matlab, then add the SOM Toolbox directory to the Matlab path.
  6. Download and install Weka. Add weka.jar to the Java CLASSPATH environment variable.
  7. That's it!
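
One way to confirm that step 6 took effect is to ask the JVM whether it can load a Weka class. The following is a minimal sketch and not part of HykGene: the class name ClasspathCheck is made up for illustration, and weka.core.Instances is used as the probe only because it is a core Weka class.

```java
// Hypothetical helper, not part of HykGene: reports whether a class
// can be loaded from the current Java CLASSPATH.
class ClasspathCheck {

    // Returns true if the named class is visible to the JVM.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Probe class chosen for illustration; pass another class name as the first argument.
        String probe = args.length > 0 ? args[0] : "weka.core.Instances";
        if (isOnClasspath(probe)) {
            System.out.println(probe + " found on the CLASSPATH");
        } else {
            System.out.println(probe + " NOT found; check the CLASSPATH setting");
        }
    }
}
```

Compile it with javac and run it from the same shell in which you intend to start Matlab, so that both see the same CLASSPATH value.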

How to run HykGene

Input files: HykGene needs two files as input: a gene expression file in the ARFF format and a gene information file in tab-delimited text format. Examples of the two files are AMLALL.arff and AMLALLgeneinfo.txt. The gene information file has three columns: the first lists the feature names as used in the ARFF file, the second lists the corresponding probe IDs, and the third lists the corresponding gene descriptions. You can prepare these files using Microsoft Excel.
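
The three-column layout of the gene information file can be checked before running HykGene. Below is a minimal sketch under stated assumptions: the class GeneInfoCheck is invented for illustration and is not part of HykGene, and the check covers only the layout described above (three tab-separated columns with a non-empty feature name in the first).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of HykGene.
class GeneInfoCheck {

    // Splits one line into tab-separated columns; the -1 limit keeps trailing empty columns.
    static String[] columns(String line) {
        return line.split("\t", -1);
    }

    // True if the line has three columns and a non-empty feature name.
    static boolean isValidRow(String line) {
        String[] cols = columns(line);
        return cols.length == 3 && !cols[0].trim().isEmpty();
    }

    public static void main(String[] args) throws IOException {
        // Assumed default file name; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "AMLALLgeneinfo.txt";
        // ISO-8859-1 decodes any byte sequence, which is enough for counting tabs.
        List<String> lines = Files.readAllLines(Paths.get(path), StandardCharsets.ISO_8859_1);
        int bad = 0;
        for (int i = 0; i < lines.size(); i++) {
            if (!isValidRow(lines.get(i))) {
                System.out.println("Line " + (i + 1) + ": expected 3 tab-separated columns");
                bad++;
            }
        }
        System.out.println(bad == 0 ? "Layout looks fine" : bad + " problem line(s) found");
    }
}
```

A check like this is mainly useful after exporting from Excel, where a stray tab inside a gene description silently adds a fourth column.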

Typical usage:

hykgene(arfffile, geneinfofile, m, rankmethod, classifier, cvfold, clustermethod, selectedgenes)

where

For example, you can type in Matlab:
hykgene('AMLALL', 'AMLALLgeneinfo.txt', 100, 'chi2', 'svm', 72, 'hc', 'amlallhkgenes.xls')

As a result, you will get two files: "amlallhkgenes.xls", which contains the marker genes picked by HykGene, and "AMLALLchi2top50genes.xls", which contains all of the top-50 genes as ranked by the chi-squared gene ranking method.

Frequently Asked Questions

  1. I use R instead of Matlab. Will you release an R version?
    Yes, we are also moving to R. A pure Java version is also planned.
  2. I have some data in .xls files. How can I convert it to ARFF?
    It depends on how the data is organized in the .xls files. The easiest way we have found is to save the data in CSV format first, and then run the following from the command line (with weka.jar on the CLASSPATH):

    java weka.core.converters.CSVLoader filename.csv > filename.arff

  3. Where can I find some more gene expression files in the ARFF format?
    You can download some of them from the excellent Kent Ridge Bio-medical Data Set Repository.
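
To illustrate what the CSV-to-ARFF conversion in question 2 produces, here is a minimal sketch of the ARFF layout written without Weka. It is not a replacement for weka.core.converters.CSVLoader. It assumes a header row, unquoted comma-separated values, attribute names without spaces, numeric expression columns, and a nominal class label in the last column. The class name CsvToArffSketch and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not part of HykGene or Weka.
class CsvToArffSketch {

    // Builds ARFF text from CSV lines; the first line is the header.
    static String toArff(String relation, List<String> csvLines) {
        String[] header = csvLines.get(0).split(",", -1);
        int classIdx = header.length - 1; // assumption: class label is the last column

        Set<String> labels = new LinkedHashSet<>(); // distinct labels, first-seen order
        List<String[]> rows = new ArrayList<>();
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            String[] cells = line.split(",", -1);
            if (cells.length != header.length) {
                throw new IllegalArgumentException("Row has wrong column count: " + line);
            }
            for (int i = 0; i < cells.length; i++) cells[i] = cells[i].trim();
            rows.add(cells);
            labels.add(cells[classIdx]);
        }

        StringBuilder out = new StringBuilder();
        out.append("@relation ").append(relation).append("\n\n");
        for (int i = 0; i < classIdx; i++) {
            out.append("@attribute ").append(header[i].trim()).append(" numeric\n");
        }
        out.append("@attribute ").append(header[classIdx].trim())
           .append(" {").append(String.join(",", labels)).append("}\n\n");
        out.append("@data\n");
        for (String[] cells : rows) {
            out.append(String.join(",", cells)).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample: two expression features and a class label.
        List<String> csv = Arrays.asList("gene1,gene2,class", "0.5,1.2,ALL", "0.7,0.9,AML");
        System.out.print(toArff("example", csv));
    }
}
```

For real data, the Weka command shown in question 2 is the safer route, since it handles quoting, missing values, and attribute type detection that this sketch ignores.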