rSWTi-DCNDen User Manual

Yuhang Wang
2007

How to install

First please make sure your computer system meets the following requirements:

Follow these steps to install rSWTi-DCNDen:

  1. Download MCRInstaller.exe to your computer and run it.
  2. Download dcnden.zip and unzip it to a directory. For example, C:\tools\dcnden.

How to run rSWTi-DCNDen

Input files: rSWTi-DCNDen needs two files as input: a DNA copy number data file and a chromosome information file. Both are plain text files delimited by tabs or white space. Here are examples of the data file and the chromosome information file: example.dcn and example_chrlen.ini.

In the data file, the first column lists the names (or numbers) of the chromosomes; the second column lists the physical distances of the probes from the beginning of the chromosome; and the third column lists the observed DNA copy number data as log2 ratios at the corresponding probe loci.

The current version of the chromosome information file is very simple: the first column lists the names (or numbers) of the chromosomes, and the second column lists the length of each chromosome. In future versions of the chromosome information file, we will include more information, such as cytobands.
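The two file layouts described above can be illustrated with a short Python sketch. It is not part of rSWTi-DCNDen; the function names parse_dcn and parse_chrinfo are hypothetical, and the sketch assumes that every non-blank line has the columns described above and that there is no header line.

```python
from typing import Dict, List, Tuple


def parse_dcn(text: str) -> List[Tuple[str, float, float]]:
    """Parse data-file text into (chromosome, position, log2 ratio) records."""
    records = []
    for line in text.splitlines():
        fields = line.split()  # splits on tabs or any run of white space
        if not fields:
            continue  # skip blank lines
        records.append((fields[0], float(fields[1]), float(fields[2])))
    return records


def parse_chrinfo(text: str) -> Dict[str, float]:
    """Parse chromosome-information text into {chromosome name: length}."""
    lengths = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        lengths[fields[0]] = float(fields[1])
    return lengths


# Small inline samples (assumed layout: one record per line, no header).
sample_dcn = "1 92527 0.06571727\n1\t799686\t-0.023100856\n"
sample_ini = "1 247249719\n2 242951149\n"

records = parse_dcn(sample_dcn)
lengths = parse_chrinfo(sample_ini)
```

Chromosome names are kept as strings so that names such as X or Y are handled the same way as numbers.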

Usage for the command-line tool:

dcndenoise_cmd inputdatafile chrinfofile outputfile dointerp level

where

For example, you can type into the Command Prompt on Windows XP:
dcndenoise_cmd example.dcn example_chrlen.ini example_denoised.dcn 1 4

As a result, you will get the "example_denoised.dcn" file containing denoised data.

Usage for the GUI tool:

The interface is self-explanatory. By default, the software uses the mean physical distance between neighboring probe loci as the distance between pseudo-markers. You can change this default by entering a number other than 1 (for example, 2) into the edit box. If you encounter "out of memory" problems, you can reduce the amount of memory the software requires by specifying a larger number here, for example, 5.

After loading the data file and the chromosome information file, click the "Denoise" button to denoise the data. This denoises the data on ALL chromosomes. You can view the denoised and original data on any one chromosome, or on all chromosomes, by using the drop-down menu under "View Data". To save the denoised data to a file, click the "Save Denoised Data As..." button.
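To see why a larger number in the edit box lowers memory use, here is a rough Python sketch. It assumes that the number acts as a multiplier on the mean distance between neighboring probes and that pseudo-markers are spread evenly along a chromosome; both are readings of the description above, not confirmed details of the software. The function names are hypothetical.

```python
from typing import Sequence


def mean_probe_gap(positions: Sequence[float]) -> float:
    """Mean physical distance between neighboring probe loci."""
    ordered = sorted(positions)
    if len(ordered) < 2:
        raise ValueError("need at least two probe positions")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def pseudo_marker_spacing(positions: Sequence[float], factor: float = 1.0) -> float:
    """Assumed spacing: edit-box number times the mean probe gap."""
    return factor * mean_probe_gap(positions)


def rough_marker_count(chrom_length: float, spacing: float) -> int:
    """Rough number of evenly spaced pseudo-markers (an assumption)."""
    return int(chrom_length // spacing) + 1


# Probe positions taken from the FAQ excerpt in this manual.
positions = [92527, 799686, 1340156, 1783589, 2117391, 2152308]

default_spacing = pseudo_marker_spacing(positions)           # edit box = 1
coarse_spacing = pseudo_marker_spacing(positions, factor=5)  # edit box = 5

default_count = rough_marker_count(247249719, default_spacing)
coarse_count = rough_marker_count(247249719, coarse_spacing)
```

Under these assumptions, a larger factor gives a wider spacing and therefore fewer pseudo-markers per chromosome, which is consistent with the lower memory use described above.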

Frequently Asked Questions

1. Why am I getting this error: "Attempted to access (:,1); index out of bounds because size()=[0,0]."? What is wrong?
Answer: Most likely, your .dcn file is incompatible with your .ini file (the chromosome information file). For example, suppose your .dcn file contains lines like

1 92527 0.06571727
1 799686 -0.023100856
1 1340156 -0.029567067
1 1783589 -0.04096103
1 2117391 0.197468593
1 2152308 0.191745703

This means that you are using 1, 2, 3, ... as chromosome names. You MUST then use the same set of chromosome names in your .ini file, for example:

1 247249719
2 242951149
3 199501827

If you use different names in your .ini file, the software cannot match the data to the chromosomes, and you will get the error message above.
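One way to catch this mismatch before running the software is to compare the chromosome names in the two files yourself. The Python sketch below is only an illustration, not part of rSWTi-DCNDen; the function names are hypothetical, and it assumes the first column of each file holds the chromosome name.

```python
from typing import List, Set


def first_column_names(text: str) -> Set[str]:
    """Collect the distinct first-column values of a white-space-delimited file."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def missing_chromosomes(dcn_text: str, ini_text: str) -> List[str]:
    """Chromosome names used in the data file but absent from the .ini file."""
    return sorted(first_column_names(dcn_text) - first_column_names(ini_text))


dcn_text = "1 92527 0.06571727\n1 799686 -0.023100856\n"
good_ini = "1 247249719\n2 242951149\n3 199501827\n"
bad_ini = "chr1 247249719\nchr2 242951149\n"  # names differ from the data file

ok = missing_chromosomes(dcn_text, good_ini)
problem = missing_chromosomes(dcn_text, bad_ini)
```

An empty result means every chromosome name in the data file also appears in the chromosome information file.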

2. What can I do if my computer runs out of memory?
Answer: To reduce the software's memory usage, you can specify a larger number, for example 5, in the edit box in front of "mean physical distance between neighboring probe loci". Or you can turn off interpolation altogether. In our experiments on synthetic data, our software still performed better than MODWT even without interpolation.
In our experience, we were able to run the software on data from NimbleGen 500k whole-genome arrays on a PC with 2 GB of RAM. But on another NimbleGen 500k tiling array covering chromosome X only, the software ran out of memory even with interpolation turned off. This is because the wavelet analysis is done on each chromosome separately. For the whole-genome arrays, the number of probes on chromosome 1 is about 20k, which is manageable. For the tiling array, all 500k probes are on a single chromosome (chromosome X), which is why the software runs out of memory.
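Because each chromosome is analyzed separately, the probe count of the largest chromosome is what matters for memory. A quick Python sketch for counting probes per chromosome in a data file follows; it is an illustration only, and the function name probes_per_chromosome is hypothetical.

```python
from collections import Counter


def probes_per_chromosome(dcn_text: str) -> Counter:
    """Count data lines per chromosome name (first column of the data file)."""
    counts = Counter()
    for line in dcn_text.splitlines():
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts


# Tiny made-up sample: three probes on chromosome 1, one on chromosome X.
sample = "1 100 0.1\n1 200 -0.2\n1 300 0.0\nX 150 0.3\n"

counts = probes_per_chromosome(sample)
largest_name, largest_count = counts.most_common(1)[0]
```

If the largest count is far above the roughly 20k probes per chromosome mentioned above, memory problems are more likely.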