Welcome to LRTag

LRTag is a tool for efficiently selecting a near-optimal set of SNP markers across multiple populations. It is named "LRTag" because the underlying algorithm is based on Lagrangian relaxation. The algorithm is described in detail in our paper, "Efficient Algorithms for genome-wide tagSNP selection across populations via the linkage disequilibrium criterion", which can be accessed online at http://www.lifesciencessociety.org/CSB2007/toc/67.2007.html

To use LRTag, the data for each population have to be preprocessed into the following two files:

  • ppl.maf:  a file that lists the SNP markers of interest and their minor-allele frequencies, e.g. CEU.maf.
  • ppl.lod:  a file that lists the pairwise lod scores, e.g. CEU.lod.

Once the data have been preprocessed into the required format, run the following command to execute LRTag:

LRTag.exe   lod_cutoff   maf_cutoff   ppl1.lod   ppl1.maf   ppl2.lod   ppl2.maf   ...

where lod_cutoff and maf_cutoff are the thresholds for the lod score and the minor-allele frequency, respectively. Only markers with a minor-allele frequency greater than or equal to maf_cutoff are considered. A marker m1 can be tagged by another marker m2 if the lod score for the pair (m1, m2) exceeds lod_cutoff.
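
The effect of the two cutoffs can be illustrated with a minimal Python sketch. It is not LRTag's implementation and it does not read the .maf or .lod files. The dictionaries maf and lod, the marker names, and the function name build_tagging_relation are all hypothetical, and the sketch assumes that the lod score of a pair applies in both directions.

```python
def build_tagging_relation(maf, lod, maf_cutoff, lod_cutoff):
    """Apply the two cutoffs to in-memory data (illustrative only).

    maf: dict mapping marker name -> minor-allele frequency (hypothetical layout)
    lod: dict mapping (marker, marker) pair -> lod score (hypothetical layout)
    Returns (eligible, taggers), where taggers[m] is the set of
    markers that can tag marker m.
    """
    # Keep only markers whose minor-allele frequency is >= maf_cutoff.
    eligible = {m for m, freq in maf.items() if freq >= maf_cutoff}

    taggers = {m: set() for m in eligible}
    for (m1, m2), score in lod.items():
        # Pairs that involve a filtered-out marker are ignored.
        if m1 not in eligible or m2 not in eligible:
            continue
        # The score must strictly exceed lod_cutoff.
        if score > lod_cutoff:
            # Assumption: the pairwise score is symmetric.
            taggers[m1].add(m2)
            taggers[m2].add(m1)
    return eligible, taggers


# Made-up example data, not taken from any real population.
maf = {"snpA": 0.30, "snpB": 0.04, "snpC": 0.05, "snpD": 0.20}
lod = {("snpA", "snpC"): 3.5, ("snpA", "snpB"): 9.0, ("snpC", "snpD"): 2.0}
eligible, taggers = build_tagging_relation(maf, lod, maf_cutoff=0.05, lod_cutoff=2.0)
```

In this example snpB is dropped because its frequency is below maf_cutoff, snpC is kept because its frequency equals the cutoff, and the pair (snpC, snpD) does not qualify because its score equals lod_cutoff rather than exceeding it.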

LRTag prints to the screen the solution (i.e. the set of tag SNPs), the cost of the solution, and a lower bound on the cost of the optimal solution. Comparing the lower bound with the cost of the solution shows how far the solution can be from an optimum. According to our experimental results, in most cases the solution is optimal or close to optimal.
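
One way to read the two reported numbers is as an optimality gap. The sketch below is an illustration only: the function name optimality_gap and the example values are hypothetical and are not LRTag output, and it assumes that the cost and the lower bound are positive numbers.

```python
def optimality_gap(solution_cost, lower_bound):
    """Return (absolute_gap, relative_gap) between a solution and a lower bound.

    Since lower_bound <= optimal cost <= solution_cost, the solution is
    at most relative_gap above the optimal cost.
    """
    if lower_bound <= 0:
        raise ValueError("lower_bound must be positive")
    if solution_cost < lower_bound:
        raise ValueError("solution_cost cannot be below a valid lower bound")
    absolute_gap = solution_cost - lower_bound
    relative_gap = absolute_gap / lower_bound
    return absolute_gap, relative_gap


# Made-up numbers for illustration.
absolute, relative = optimality_gap(solution_cost=105, lower_bound=100)
```

A gap of zero means the solution is provably optimal. A positive gap bounds how far from optimal the solution can be, although the solution may still be optimal if the lower bound is not tight.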

Download

LRTag can be downloaded from here

Copyright

LRTag is free for academic use only. For questions about the tool, please contact yonghui@cs.ucr.edu.