RNA Papers

From Purdue Genomics Database Facility




RNA Structure Matching

Finding motifs in real or predicted RNA structures. This section is very out of date; anyone who wants to update it is welcome to.

New/Interesting

  • Xu et al., 2007 RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment
  • Reeder et al., 2007 Locomotif: from graphical motif description to RNA motif search
  • Hamada et al., 2006 Mining frequent stem patterns from unaligned RNA sequences

My list of papers to come

Gribskov 17:31, 29 October 2007 (EDT)

  • 1999 Bouthinon
  • 1999 Hein
  • 2000 Chen & Maizel
  • 2001 Gorodkin
  • 2002 Foget
  • 2002 Hu
  • 2003 Cai
  • 2003 Duarte
  • 2003 Knudsen
  • 2003 Nam
  • 2003 Perriquet
  • 2004 Dowell & Eddy
  • 2004 Doshi
  • 2004 Gan
  • 2004 Fan review
  • 2004 Gardner
  • 2004 Hofacker 1
  • 2004 Hofacker 2
  • 2004 Ji
  • 2004 Liu
  • 2004 Pavesi
  • 2004 Reinert
  • 2005 Holmes
  • 2005 Freybult
  • 2005 Havgaard
  • 2005 Haynes
  • 2005 Lescoute
  • 2005 Liu
  • 2005 Reeder
  • 2005 Siebert
  • 2006 Ding
  • 2006 Steffen
  • 2006 Uzilov

New (relatively) from pubmed

These need to be characterized.

1: Veksler-Lublinsky I, Ziv-Ukelson M, Barash D, Kedem K. A structure-based flexible search method for motifs in RNA. J Comput Biol. 2007 Sep;14(7):908-26. PMID: 17803370 [PubMed - in process]

3: Wexler Y, Zilberstein C, Ziv-Ukelson M. A study of accessible motifs and RNA folding complexity. J Comput Biol. 2007 Jul-Aug;14(6):856-72. PMID: 17691898 [PubMed - indexed for MEDLINE]

8: McQuisten KA, Peek AS. Identification of sequence motifs significantly associated with antisense activity. BMC Bioinformatics. 2007 Jun 7;8:184. PMID: 17555590 [PubMed - indexed for MEDLINE]

9: Thompson WA, Newberg LA, Conlan S, McCue LA, Lawrence CE. The Gibbs Centroid Sampler. Nucleic Acids Res. 2007 Jul 1;35(Web Server issue):W232-7. Epub 2007 May 5. PMID: 17483517 [PubMed - indexed for MEDLINE]

10: Weeks KE, Chuzhanova NA, Donnison IS, Scott IM. Evolutionary hierarchies of conserved blocks in 5'-noncoding sequences of dicot rbcS genes. BMC Evol Biol. 2007 Apr 2;7:51. PMID: 17407546 [PubMed - indexed for MEDLINE]

11: Sugahara J, Yachie N, Sekine Y, Soma A, Matsui M, Tomita M, Kanai A. SPLITS: a new program for predicting split and intron-containing tRNA genes at the genome level. In Silico Biol. 2006;6(5):411-8. PMID: 17274770 [PubMed - indexed for MEDLINE]

12: Horesh Y, Amir A, Michaeli S, Unger R. RNAMAT: an efficient method to detect classes of RNA molecules and their structural features. Conf Proc IEEE Eng Med Biol Soc. 2004;4:2869-72. PMID: 17270876 [PubMed]

14: Huang X, Ali H. High sensitivity RNA pseudoknot prediction. Nucleic Acids Res. 2007;35(2):656-63. Epub 2006 Dec 19. PMID: 17179177 [PubMed - indexed for MEDLINE]

15: Wan XF, Lin G, Xu D. Rnall: an efficient algorithm for predicting RNA local secondary structural landscape in genomes. J Bioinform Comput Biol. 2006 Oct;4(5):1015-31. PMID: 17099939 [PubMed - indexed for MEDLINE]

16: Bailey JM, Tapprich WE. Structure of the 5' nontranslated region of the coxsackievirus b3 genome: Chemical modification and comparative sequence analysis. J Virol. 2007 Jan;81(2):650-68. Epub 2006 Nov 1. PMID: 17079314 [PubMed - indexed for MEDLINE]

17: Hiller M, Pudimat R, Busch A, Backofen R. Using RNA secondary structures to guide sequence motif finding towards single-stranded regions. Nucleic Acids Res. 2006;34(17):e117. Epub 2006 Sep 20. PMID: 16987907 [PubMed - indexed for MEDLINE]

18: Baird SD, Turcotte M, Korneluk RG, Holcik M. Searching for IRES. RNA. 2006 Oct;12(10):1755-85. Epub 2006 Sep 6. Review. PMID: 16957278 [PubMed - indexed for MEDLINE]

20: Jones NC, Pevzner PA. Comparative genomics reveals unusually long motifs in mammalian genomes. Bioinformatics. 2006 Jul 15;22(14):e236-42. PMID: 16873477 [PubMed - indexed for MEDLINE]

21: Thebault P, de Givry S, Schiex T, Gaspin C. Searching RNA motifs and their intermolecular contacts with constraint networks. Bioinformatics. 2006 Sep 1;22(17):2074-80. Epub 2006 Jul 4. PMID: 16820426 [PubMed - indexed for MEDLINE]

22: Das D, Nahle Z, Zhang MQ. Adaptively inferring human transcriptional subnetworks. Mol Syst Biol. 2006;2:2006.0029. Epub 2006 Jun 6. PMID: 16760900 [PubMed - indexed for MEDLINE]

23: Lemieux S, Major F. Automated extraction and classification of RNA tertiary structure cyclic motifs. Nucleic Acids Res. 2006 May 5;34(8):2340-6. Print 2006. PMID: 16679452 [PubMed - indexed for MEDLINE]

24: Anwar M, Nguyen T, Turcotte M. Identification of consensus RNA secondary structures using suffix arrays. BMC Bioinformatics. 2006 May 5;7:244. PMID: 16677380 [PubMed - indexed for MEDLINE]

25: McCauley S, Hein J. Using hidden Markov models and observed evolution to annotate viral genomes. Bioinformatics. 2006 Jun 1;22(11):1308-16. Epub 2006 Apr 13. PMID: 16613911 [PubMed - indexed for MEDLINE]

26: Haynes T, Knisley D, Seier E, Zou Y. A quantitative analysis of secondary RNA structure using domination based parameters on trees. BMC Bioinformatics. 2006 Mar 3;7:108. PMID: 16515683 [PubMed - indexed for MEDLINE]

29: Yao Z, Weinberg Z, Ruzzo WL. CMfinder--a covariance model based RNA motif finding algorithm. Bioinformatics. 2006 Feb 15;22(4):445-52. Epub 2005 Dec 15. PMID: 16357030 [PubMed - indexed for MEDLINE]

30: Laserson U, Gan HH, Schlick T. Predicting candidate genomic sequences that correspond to synthetic functional RNA motifs. Nucleic Acids Res. 2005 Oct 27;33(18):6057-69. Print 2005. PMID: 16254081 [PubMed - indexed for MEDLINE]

31: Havgaard JH, Lyngso RB, Gorodkin J. The FOLDALIGN web server for pairwise structural RNA alignment and mutual motif search. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W650-3. PMID: 15980555 [PubMed - indexed for MEDLINE]

32: Saetrom O, Snove O Jr, Saetrom P. Weighted sequence motifs as an improved seeding step in microRNA target prediction algorithms. RNA. 2005 Jul;11(7):995-1003. Epub 2005 May 31. PMID: 15928346 [PubMed - indexed for MEDLINE]

33: La D, Livesay DR. Predicting functional sites with an automated algorithm suitable for heterogeneous datasets. BMC Bioinformatics. 2005 May 13;6:116. PMID: 15890082 [PubMed - indexed for MEDLINE]

35: Liu J, Wang JT, Hu J, Tian B. A method for aligning RNA secondary structures and its application to RNA motif detection. BMC Bioinformatics. 2005 Apr 7;6:89. PMID: 15817128 [PubMed - indexed for MEDLINE]


38: Peng X, Karuturi RK, Miller LD, Lin K, Jia Y, Kondu P, Wang L, Wong LS, Liu ET, Balasubramanian MK, Liu J. Identification of cell cycle-regulated genes in fission yeast. Mol Biol Cell. 2005 Mar;16(3):1026-42. Epub 2004 Dec 22. PMID: 15616197 [PubMed - indexed for MEDLINE]

39: Wadley LM, Pyle AM. The identification of novel RNA structural motifs using COMPADRES: an automated approach to structural discovery. Nucleic Acids Res. 2004 Dec 17;32(22):6650-9. Print 2004. PMID: 15608296 [PubMed - indexed for MEDLINE]


41: Ohler U, Yekta S, Lim LP, Bartel DP, Burge CB. Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification. RNA. 2004 Sep;10(9):1309-22. PMID: 15317971 [PubMed - indexed for MEDLINE]

42: Fera D, Kim N, Shiffeldrim N, Zorn J, Laserson U, Gan HH, Schlick T. RAG: RNA-As-Graphs web resource. BMC Bioinformatics. 2004 Jul 6;5:88. PMID: 15238163 [PubMed - indexed for MEDLINE]

43: Lambert A, Fontaine JF, Legendre M, Leclerc F, Permal E, Major F, Putzer H, Delfour O, Michot B, Gautheret D. The ERPIN server: an interface to profile-based RNA motif identification. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W160-5. PMID: 15215371 [PubMed - indexed for MEDLINE]

44: Pavesi G, Mauri G, Stefani M, Pesole G. RNAProfile: an algorithm for finding conserved secondary structure motifs in unaligned RNA sequences. Nucleic Acids Res. 2004 Jun 15;32(10):3258-69. Print 2004. PMID: 15199174 [PubMed - indexed for MEDLINE]

47: Ding Y, Lawrence CE. A statistical sampling algorithm for RNA secondary structure prediction. Nucleic Acids Res. 2003 Dec 15;31(24):7280-301. PMID: 14654704 [PubMed - indexed for MEDLINE]

49: Worthey EA, Schnaufer A, Mian IS, Stuart K, Salavati R. Comparative analysis of editosome proteins in trypanosomatids. Nucleic Acids Res. 2003 Nov 15;31(22):6392-408. PMID: 14602897 [PubMed - indexed for MEDLINE]

50: Duarte CM, Wadley LM, Pyle AM. RNA structure comparison, motif search and discovery using a reduced representation of RNA conformational space. Nucleic Acids Res. 2003 Aug 15;31(16):4755-61. PMID: 12907716 [PubMed - indexed for MEDLINE]

RNA Papers to read

Graph algorithms and methodologies

Very Brief Summary (Aditi) :: Definitions- 1. A vertex cover of an undirected graph is a set of vertices such that every edge of the graph has at least one endpoint in the set. 2. The complement of a vertex cover is an independent set: no two of its vertices are adjacent, so it induces an edgeless subgraph of the original graph.
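The two definitions can be checked directly on a small example. The sketch below is illustrative only: the toy graph and the function names are assumptions made here, not taken from the paper or from XIOS.

```python
def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)


def is_independent_set(edges, vertices):
    """True if no edge joins two members of `vertices`."""
    vertices = set(vertices)
    return not any(u in vertices and v in vertices for u, v in edges)


# Hypothetical toy graph: path a-b-c-d plus the extra edge b-d.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
nodes = {"a", "b", "c", "d"}

cover = {"b", "c"}          # touches every edge
complement = nodes - cover  # {"a", "d"}

print(is_vertex_cover(edges, cover))          # cover check
print(is_independent_set(edges, complement))  # complement check
```

Because every edge has an endpoint in the cover, no edge can lie entirely inside the complement, which is why the complement is always independent.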

For two given graphs G1 and G2, the authors propose the following strategy to find a maximum common subgraph: (1) Find a vertex cover C of size k in G1. (2) Match C against vertices in G2 (exhaustive search by brute force with backtracking). (3) Extend each candidate subgraph of G2 matched in step 2 by adding vertices from the complement of C (the independent set of G1) that form compatible pairs with previously unmatched vertices of G2.
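Step 2 can be sketched as a plain backtracking search. This is a simplified illustration, not the authors' implementation: it assumes the whole cover C is matched (rather than subsets of it), that a match must preserve both adjacency and non-adjacency among cover vertices, and that graphs are given as adjacency dictionaries. The function name and the toy graphs are hypothetical. Step 3 (extension by the independent set) is not shown.

```python
def match_cover(adj1, adj2, cover):
    """Yield injective maps from `cover` (vertices of G1) to vertices of G2
    that preserve adjacency and non-adjacency among the cover vertices."""
    order = sorted(cover)

    def extend(i, mapping, used):
        if i == len(order):
            yield dict(mapping)  # copy, because `mapping` is mutated below
            return
        u = order[i]
        for w in adj2:
            if w in used:
                continue
            # u -> w is compatible if it agrees with every earlier assignment
            if all((v in adj1[u]) == (mapping[v] in adj2[w]) for v in mapping):
                mapping[u] = w
                used.add(w)
                yield from extend(i + 1, mapping, used)
                del mapping[u]  # backtrack
                used.discard(w)

    yield from extend(0, {}, set())


# Hypothetical G1: path a-b-c-d plus edge b-d; {b, c} is a vertex cover.
adj1 = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
# Hypothetical G2: triangle x-y-z with a pendant vertex w attached to x.
adj2 = {"x": {"y", "z", "w"}, "y": {"x", "z"}, "z": {"x", "y"}, "w": {"x"}}

matches = list(match_cover(adj1, adj2, {"b", "c"}))
for m in matches:
    print(m)
```

Since b and c are adjacent in G1, each candidate places them on an edge of G2, in either orientation.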

Some issues that this method may have with our current approach:

1. We match edges, not vertices. But we may come up with a suitable node labeling scheme.

2. The search for common subgraphs seems to start from a vertex cover of a graph. But most of our XIOS graphs have one or more nodes that are connected to almost every other vertex. Thus, vertex covers of current XIOS graphs could be too small (just 1 node for RNase P!).

3. This method requires computing a minimum vertex cover (an NP-complete problem!) of both graphs. The best algorithm so far for this runs in O(1.274^k), where k is the size of the vertex cover.
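Issues 2 and 3 can both be seen in a small experiment. The sketch below uses the textbook bounded search tree for vertex cover, which branches on the two endpoints of an uncovered edge and runs in roughly O(2^k); it is not the O(1.274^k) algorithm mentioned above. The function names and the star-shaped test graph are assumptions made for illustration, standing in for a graph with one hub node.

```python
def vertex_cover_at_most_k(edges, k):
    """Return a vertex cover with at most k vertices, or None if none exists.

    Any cover must contain an endpoint of the first remaining edge,
    so branch on both endpoints and recurse with budget k - 1.
    """
    if not edges:
        return set()
    if k == 0:
        return None
    u, v = edges[0]
    for pick in (u, v):
        remaining = [e for e in edges if pick not in e]
        sub = vertex_cover_at_most_k(remaining, k - 1)
        if sub is not None:
            return sub | {pick}
    return None


def min_vertex_cover(edges):
    """Try k = 0, 1, 2, ... until a cover of size k is found."""
    for k in range(len(edges) + 1):
        cover = vertex_cover_at_most_k(edges, k)
        if cover is not None:
            return cover


# Hypothetical hub graph: one node connected to every other node.
star = [("hub", "s1"), ("hub", "s2"), ("hub", "s3"), ("hub", "s4")]
print(min_vertex_cover(star))      # the hub alone covers every edge

# A triangle needs two vertices.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
print(len(min_vertex_cover(triangle)))
```

When the hub alone covers every edge, the cover gives the matching step almost nothing to anchor on, which is the concern raised in issue 2.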

The authors say that their method has better worst-case behavior, as long as k is small. But it's still exponential in m/3, where m is the size of the smaller graph.

RNA secondary structure prediction and reports of identified RNA structures

--Brief Summary (Aditi):

Does pre-mRNA form secondary structure in vivo? Yes, many (if not all) pre-mRNAs do (examples given). Structure formation is affected by the proteins that coat RNA in vivo, mainly hnRNP proteins, which prevent RNA from folding. Statistical analysis of mRNA coding regions suggested that calculated mRNA folding is more stable than expected by chance, so codon bias may favor the existence of mRNA structures (ref 87); this was refuted by ref 107 using different statistical tools and genes.

Does mRNA sec str affect binding of splicing factors (and vice versa)? Yes. Splicing factors recognize well-defined target regions in mRNA even in the absence of strong sequence conservation in these target regions. The authors conclude that both sequence and structure are important.

RNA sec str at the 5’ and 3’ splice sites and branch points: these three conserved regions define an exon. The authors outline 2 main mechanisms by which structure may influence splicing: 1. the presence of structural elements may render some splice sequences inaccessible to splicing factors. 2. secondary structure can vary the relative distance between the splicing sequences -> can influence splice site usage/efficiency.

Does RNA sec str affect intronic or exonic enhancer/silencer elements? Yes, in the two ways mentioned above (many examples given). The authors show via examples that sec str formation affects the availability of pre-mRNA regions.

Conclusion:: many pre-mRNA regions can fold into well-defined structures in vivo. This provides an additional regulatory mechanism (such as bringing two elements together). Also, too much structure can interfere with later splicing processes. Thus, the formation of structure in pre-mRNA may be evolutionarily controlled.


--Brief Summary (Aditi): The alternative splicing of tau exon 10 in mammals is at least partially regulated by a stem-loop structure that forms at the 5’ splice site in the pre-mRNA. Stem-loop formation competes with the binding of splicing factors. Thus, stem-loop formation at the 5’ splice site blocks definition of exon 10 in tau mRNA -> the exon gets skipped (not included) in the mRNA. Disruption of this stem-loop structure -> splicing factors can bind -> increased incorporation of exon 10 in the mRNA. The tau isoform made from this mRNA has four microtubule-binding domains -> proposed to cause the neural degeneration observed in dementia and Parkinson's.


Research Groups