Maize TE DB

From Purdue Genomics Database Facility

Revision as of 20:18, 12 August 2008 by Westerm (Talk | contribs)

Note

August 2008. We have removed most of the content of the initial web page, since it described the programming done during the summer of 2007 by the SanMiguel group and was out of date. This Wiki is now being repositioned as an internal wiki for the entire Maize project.

Request an account in order to edit the Wiki

http://wiki.genomics.purdue.edu/index.php/Special:RequestAccount

About the Maize transposable element database project

<The header of this section needs to be written>

Personnel

  • Phillip SanMiguel, PI. pmiguel@purdue.edu
  • Rick Westerman, staff programmer. westerman@purdue.edu

<This section needs to be fleshed out>

Time line

<Do we need this section? If so, please add to it.>

Data mining tools used

Phillip has a data mining tools page.

Similar projects and databases

We are not the first to create a transposable element web site. “RepBase” (http://www.girinst.org/repbase/index.html) is probably the oldest and most developed site. It covers a much wider range of elements and species than we are targeting, but it is probably the web site to judge our work against. The site requires a free registration, and it is well worth registering to see what it offers. It may be particularly instructive to search for Phillip's entries (use the keyword SanMiguel), to search by an element name (use the keyword 'opie'), and/or to browse for maize. Note how the data entries relate to each other. Also note how useful (or not) the web site is for finding information. While you shouldn't submit any actual information to the site, you can try out the applet or standalone version of the formatting tool and see what you like or do not like about it.

A new Wikiposon site (http://www.bioinformatics.org/wikiposon/doku.php?id=main) can be an interesting read.


Another web site is the TIGR plant repeat database (http://www.tigr.org/tdb/e2k1/plant.repeats/). No registration is required for this site. If you concentrate on searching or retrieving the “Zea” repeats, you will be looking at the maize data. Note the ease of use (or lack thereof) of this site, and likewise of the information retrieved. Where is the metadata?

The MIPS resource (http://mips.gsf.de/proj/plant/webapp/recat/) is also interesting. It covers many species. Once again, there are search and browse functions that are useful to look at.

TREP (http://wheat.pw.usda.gov/ITMI/Repeats/index.shtml) is a non-maize site, but it is informative because wheat, like maize, has a lot of repeats.

Major Transposable Elements in Zea mays

According to the Genome Sequencing Center of Washington University in St. Louis, repetitive DNA sequences make up 88% of the Zea mays genome. Most of those repetitive sequences are retrotransposons. Retrotransposons replicate by the so-called "copy and paste" mechanism: instead of the whole transposon being cut out of its site in the DNA and moved to another location, the element is transcribed into RNA, reverse transcriptase produces a cDNA copy of that RNA, and the enzyme integrase inserts the cDNA at another site in the genome. In plants, the majority of retrotransposons have long terminal repeats (LTRs), which flank the retrotransposon genes on both sides and are identical to each other at the time of insertion. These LTRs are used both to identify retrotransposons and to work out the order in which they entered the genome. By analyzing transposons that have inserted inside other transposons (called nested transposons), we can determine the relative order of insertion: the inserted element must have arrived after the element it interrupts. Furthermore, if we know the substitution rate, we can estimate how long a retrotransposon has been in the genome. This is possible because the two LTRs start out identical and then accumulate substitutions independently, so the more dissimilarities there are between the two LTRs of one element, the longer it has been in the genome.
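The dating idea can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function names are hypothetical, the two LTRs are assumed to be already aligned (gap columns are skipped), the Jukes-Cantor correction is one common choice for turning the observed mismatch proportion into an estimated number of substitutions per site, and the substitution rate in the usage lines is a placeholder, not a value taken from this page. The age is taken as T = K / (2r), with the factor of 2 reflecting that both LTRs diverge from their common starting sequence.

```python
import math


def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Proportion of mismatched sites between two pre-aligned LTRs.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor estimate of substitutions per site from proportion p."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(ltr5: str, ltr3: str, rate: float) -> float:
    """Estimated years since insertion, T = K / (2 * rate).

    rate is substitutions per site per year and must be supplied by the
    caller; no particular value is implied by this sketch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    k = jukes_cantor(ltr_divergence(ltr5, ltr3))
    return k / (2.0 * rate)


# Toy usage: two made-up 10 bp "LTRs" that differ at one site,
# with a placeholder rate chosen purely for illustration.
EXAMPLE_RATE = 1e-8
print(insertion_age_years("ACGTACGTAC", "ACGTACGTAT", EXAMPLE_RATE))
```

Real LTRs are hundreds to thousands of bases long and would need to be aligned first; the estimate is only as good as the substitution rate that is plugged in.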

Within the subclass of LTR retrotransposons, there are two main superfamilies, Copia and Gypsy. They differ in the order of the proteins encoded by the POL open reading frame (ORF). Gypsy elements sometimes encode env, an envelope protein found in retroviruses (in one instance env has also been found in a copia-like element). The env protein allows the RNA to be packaged and removed from the cell by exocytosis. Exocytosis of Gypsy transposable elements has not yet been documented in plants.

The main copia-like elements are Opie and Ji/PREM-2, both of which are highly repetitive sequences. Opie is the most common transposable element in the Zea mays genome: it is present in 30,000 copies and makes up 10-15% of the large maize genome. Two other copia-like elements, Fourf and Victim, are less common. They are low copy number elements, with their sequences repeated only hundreds of times.

The main gypsy-like elements are Huck, Grande, and Cinful/Zeon. These are highly repetitive sequences. Cinful is present about 18,000 times and makes up almost 9% of the Zea mays genome. Reina is a low copy number retroelement.
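As a rough sanity check on the copy numbers and genome fractions quoted above, the implied average length per copy can be computed as (genome fraction × genome size) / copy number. The sketch below assumes a maize genome size of about 2.4 Gb, which is an illustrative figure not stated on this page; the function and constant names are hypothetical, and "almost 9%" for Cinful is rounded up to 9%.

```python
# Assumption: approximate maize genome size in base pairs (not from this page).
ASSUMED_GENOME_SIZE_BP = 2.4e9


def implied_mean_copy_length(copies: int, genome_fraction: float,
                             genome_size_bp: float = ASSUMED_GENOME_SIZE_BP) -> float:
    """Average bp per copy implied by a copy number and a genome fraction."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    if not 0.0 <= genome_fraction <= 1.0:
        raise ValueError("genome_fraction must be between 0 and 1")
    return genome_fraction * genome_size_bp / copies


# Copy numbers and fractions as quoted in the text above.
quoted = {
    "Opie (10% of genome)": (30_000, 0.10),
    "Opie (15% of genome)": (30_000, 0.15),
    "Cinful (9% of genome)": (18_000, 0.09),
}

for name, (copies, fraction) in quoted.items():
    length = implied_mean_copy_length(copies, fraction)
    print(f"{name}: about {length:,.0f} bp per copy")
```

If the implied length per copy differs a lot from the known length of an intact element, that suggests the copy number, the genome fraction, or the assumed genome size needs a second look.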

(All the citations that should be in this section are from papers by Phillip SanMiguel.)

Research groups