Genome comparison tools are prone to errors

Sunday, May 30, 2010

Genome Comparison Tools Found to Be Susceptible to Slip-Ups

ScienceDaily (May 29, 2010) — You might call it comparing apples and oranges, but lining up different species' genomes is common practice in evolutionary research. Scientists can see how species have evolved, pinpoint which sections of DNA are similar between species, meaning they probably are crucial to the animals' survival, or sketch out evolutionary trees in places where the fossil record is spotty.

But the tools used to align genomes from different species have serious quality-control issues, according to a study published online this week in the journal Nature Biotechnology.

"We discovered that there's a disturbingly low level of agreement between genome alignments produced by different tools," said corresponding author Martin Tompa, a UW professor of computer science and engineering and of genome sciences. "What this should suggest to biologists is that they should be very cautious about trusting these alignments in their entirety."

This is especially true when comparing distantly related species, and in regions of the genome that do not code for a protein, he said.

Aligning genomes, while simple in theory, is difficult in practice. Aligning more than two sequences becomes much harder with every additional sequence. At the scale of a mammal's entire genome, all of its genetic code, finding the optimal alignment of many genomes is far beyond the capabilities of any computer, Tompa said.
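To make the scaling concrete, here is a minimal sketch, assuming the textbook dynamic-programming formulation of optimal alignment (the article does not say which formulation Tompa had in mind, and the tools in the study avoid it through shortcuts). The function names `nw_score` and `dp_cells` and the scoring values are illustrative choices, not taken from the paper. Aligning two sequences optimally fills a two-dimensional table; the direct generalization to k sequences fills a k-dimensional one, so the number of cells multiplies with each added sequence.

```python
from math import prod


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of two sequences (Needleman-Wunsch).

    Only the previous row of the table is kept, so memory stays O(len(b)),
    but the running time is still proportional to len(a) * len(b).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def dp_cells(lengths):
    """Cells in the full dynamic-programming table for sequences of these lengths."""
    return prod(n + 1 for n in lengths)


if __name__ == "__main__":
    print(nw_score("GATTACA", "GATACA"))
    # Table size for 2, 3 and 4 sequences of 1,000 bases each.
    for k in (2, 3, 4):
        print(k, dp_cells([1000] * k))
```

Even at 1,000 bases per sequence the table grows by a factor of about a thousand with each added sequence, which suggests why exact optimization is out of reach at genome scale.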

Various software tools instead use strategic shortcuts.

"At a high level the tools are very similar," Tompa said. "They make different decisions at the lower, more detailed levels, and those decisions seem to have widespread effect on the outcome."

The new paper compared the alignments from a previous study in which four research teams each took the same 1 percent of the human genome and aligned it to the genomes of 27 other vertebrate animals, ranging from mouse to elephant.

"This is a marvelous dataset," Tompa said. "It's a very large-scale multiple sequence alignment, done by four expert teams using four different tools, all of them working on the same input sequences."

However, the new study found that the resulting alignments were quite different. The authors also compared the coverage of each tool, meaning how much of the human DNA it was able to match to each other species, as well as what fraction of alignments were suspiciously close to a random match.
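As a rough illustration of these two quantities, the sketch below assumes a deliberately simplified representation: each tool's alignment of human to one other species is a dictionary mapping a human position to the position it was aligned with in that species. The function names, the data, and the definition of agreement used here are assumptions for illustration only; the paper's actual measures are not described in this excerpt.

```python
def coverage(alignment, human_length):
    """Fraction of human positions that the tool aligned to the other species."""
    if human_length <= 0:
        raise ValueError("human_length must be positive")
    return len(alignment) / human_length


def agreement(aln_a, aln_b):
    """Among human positions aligned by both tools, the fraction that both
    tools aligned to the same position in the other species."""
    shared = aln_a.keys() & aln_b.keys()
    if not shared:
        return 0.0
    same = sum(1 for pos in shared if aln_a[pos] == aln_b[pos])
    return same / len(shared)


if __name__ == "__main__":
    # Hypothetical toy data: a 10-base human region aligned by two tools.
    tool_a = {0: 10, 1: 11, 2: 12, 3: 13}
    tool_b = {1: 11, 2: 12, 3: 20, 4: 21}
    print(coverage(tool_a, 10), coverage(tool_b, 10))
    print(agreement(tool_a, tool_b))
```

In this toy case both tools cover the same share of the human region, yet they disagree on one of the three positions they both align, which is the kind of discrepancy the study reports at much larger scale.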
...

Read more here: Science Daily

+++++

Comparative assessment of methods for aligning multiple genome sequences

Xiaoyu Chen & Martin Tompa

Nature Biotechnology (2010) doi:10.1038/nbt.1637

Received 21 December 2009; accepted 27 April 2010; published online 23 May 2010

Abstract

Multiple sequence alignment is a difficult computational problem. There have been compelling pleas for methods to assess whole-genome multiple sequence alignments and compare the alignments produced by different tools. We assess the four ENCODE alignments, each of which aligns 28 vertebrates on 554 Mbp of total input sequence. We measure the level of agreement among the alignments and compare their coverage and accuracy. We find a disturbing lack of agreement among the alignments not only in species distant from human, but even in mouse, a well-studied model organism. Overall, the assessment shows that Pecan produces the most accurate or nearly most accurate alignment in all species and genomic location categories, while still providing coverage comparable to or better than that of the other alignments in the placental mammals. Our assessment reveals that constructing accurate whole-genome multiple sequence alignments remains a significant challenge, particularly for noncoding regions and distantly related species.

Affiliations


Department of Computer Science and Engineering, Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
Xiaoyu Chen & Martin Tompa

Contributions

X.C., design, implementation, experimentation, analysis; M.T., design, analysis.

Competing financial interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to:
Martin Tompa (tompa@cs.washington.edu)


+++++

Professors, researchers, and students at public and private universities with access to the CAPES/Periódicos site can read this Nature Biotechnology article, and 22,440 other scientific publications, free of charge.

+++++