
DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment

Researchers at WID are continually publishing premier research in top publications. Here, we feature some of the most important and transformative scientific publications from our community.


Technological advances in DNA sequencing are revolutionizing biology by making vast amounts of sequence data available. To make sense of this deluge, biologists often “align” sequences so that they can be readily compared. These comparisons are important for homology detection, recognition of evolutionarily important sites, and a number of other biological endeavors. Alignment works fairly well for a small number of sequences, but its accuracy has not kept pace with the large number of sequences now available.

Example of an empirically determined structural alignment of two lactate dehydrogenase proteins (1a5z and 1ldn).


In a recent study, Erik Wright, a graduate student in the Systems Biology theme at WID, investigated how to rapidly generate large, accurate biological sequence alignments. He found that alignments can scale when structural predictions are incorporated into the alignment process. Although the primary sequence (ACTG…) diverges greatly between organisms, its corresponding 2D and 3D structure is often highly conserved. By aligning structures and sequences simultaneously, large alignments of diverse sequences can be built with accuracy similar to that of small alignments. This discovery will help biologists keep up with the rapid increase in new DNA sequences.
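
To make the idea of aligning structure and sequence together concrete, here is a minimal sketch in Python. It is not the DECIPHER algorithm described in the paper. It is a textbook global alignment (Needleman-Wunsch) in which a pair of positions earns a small bonus when their predicted structure labels agree. The function name `align_with_structure`, the per-residue label strings (for example `H` for helix, `E` for strand, `C` for coil), and all score values are assumptions chosen for illustration.

```python
from typing import Tuple


def align_with_structure(
    seq_a: str, ss_a: str, seq_b: str, ss_b: str,
    match: int = 2, mismatch: int = -1, gap: int = -2, ss_bonus: int = 1,
) -> Tuple[str, str, int]:
    """Globally align two sequences, rewarding agreement of structure labels.

    ss_a and ss_b hold one (hypothetical) predicted structure label per
    residue. All scoring values are illustrative, not taken from the paper.
    """
    if len(seq_a) != len(ss_a) or len(seq_b) != len(ss_b):
        raise ValueError("each sequence needs one structure label per residue")
    n, m = len(seq_a), len(seq_b)

    def pair_score(i: int, j: int) -> int:
        # Sequence term plus a bonus when the structure labels agree.
        s = match if seq_a[i] == seq_b[j] else mismatch
        if ss_a[i] == ss_b[j]:
            s += ss_bonus
        return s

    # Fill the dynamic-programming table with a linear gap penalty.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i - 1, j - 1),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i - 1, j - 1):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    # Toy input: three residues against two, all labelled as helix.
    print(align_with_structure("MKV", "HHH", "MV", "HH"))
```

In this sketch, two positions whose residues differ can still score well if their structure labels match, which is one simple way a conserved structure can guide the alignment of divergent sequences.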

The full paper was published in BMC Bioinformatics.

