TESeeker

Citation

Kennedy, R.C., Unger, M.F., Christley, S., Collins, F.H., Madey, G.R., "An automated homology-based approach for identifying transposable elements," BMC Bioinformatics, 12(1):130, May 2011. (Highly Accessed)

Background

Transposable elements (TEs) are a type of repetitive sequence that has been found in nearly all eukaryotic genomes. First discovered and analyzed by McClintock in the 1950s, TEs have the ability to move about and replicate within a genome. Due to their mobile and replicative nature, TEs often occupy large portions of genomes. This prevalence poses a major difficulty in sequence assembly, as repeat regions are prone to misassembly. TEs can also affect host genomes in a number of ways. They are believed to play a major role in genome evolution, as they can insert themselves into, mutate, and move genes, thereby influencing gene expression, causing gene variation, and transferring genetic material.

With the number of sequenced genomes rising rapidly, the need to identify the TEs within them grows as well. The ability to do this automatically and effectively, in a manner similar to the methods used for genes, is of increasing importance. This document describes how to use TESeeker, the implementation of our approach, to identify high-quality consensus TEs.

Implementation

TESeeker implements an automated homology-based approach for identifying transposable elements and is available as a VirtualBox appliance in the Open Virtualization Format (OVF). To run smoothly, TESeeker requires at least 5 GB of free hard disk space and at least 1.5 GB of RAM on the host machine. The virtual appliance can dynamically allocate up to 40 GB of hard disk space.
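Before importing the appliance, it can help to confirm that the host meets these minimums. The following Python script is a minimal sketch of such a check and is not part of TESeeker. The function names (check_host, can_hold_full_disk, detect_ram_gb) are hypothetical, and the choice of binary gigabytes (1024^3 bytes) and of the current directory as the disk to inspect are assumptions. Only the thresholds (5 GB, 1.5 GB, 40 GB) come from the requirements stated above.

```python
import os
import shutil

GB = 1024 ** 3  # assumption: binary gigabytes

# Thresholds taken from the requirements stated above.
MIN_FREE_DISK_GB = 5.0   # minimum free host disk space
MIN_RAM_GB = 1.5         # minimum host RAM
MAX_VM_DISK_GB = 40.0    # the appliance's disk may grow up to this size


def check_host(free_disk_gb, ram_gb):
    """Return a list of problems; an empty list means both minimums are met."""
    problems = []
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"free disk {free_disk_gb:.1f} GB is below {MIN_FREE_DISK_GB} GB"
        )
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {ram_gb:.1f} GB is below {MIN_RAM_GB} GB")
    return problems


def can_hold_full_disk(free_disk_gb):
    """True if the host could hold the virtual disk at its maximum size."""
    return free_disk_gb >= MAX_VM_DISK_GB


def detect_ram_gb():
    """Best-effort RAM probe; returns None where os.sysconf is unavailable."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / GB
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    # Assumption: the appliance will be stored on the current directory's disk.
    free_gb = shutil.disk_usage(".").free / GB
    ram_gb = detect_ram_gb()
    if ram_gb is None:
        print("Could not detect RAM on this platform; check it manually.")
        ram_gb = MIN_RAM_GB  # skip the RAM check rather than guess
    for problem in check_host(free_gb, ram_gb):
        print("Problem:", problem)
    if not can_hold_full_disk(free_gb):
        print("Note: the virtual disk may outgrow the free space on this disk.")
```

Run the script from the directory where the appliance will be stored. An empty report means both stated minimums are met; the final note is only a warning, since the virtual disk grows on demand and may never reach 40 GB.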

TESeeker is released under the GNU General Public License (GPL) v3.