A greedy algorithm for aligning DNA sequences

For aligning DNA sequences that differ only by sequencing errors, or by equivalent errors from other sources, a greedy algorithm can be much faster than traditional dynamic programming approaches and yet produce an alignment that is guaranteed to be theoretically optimal.

We introduce a new greedy alignment algorithm with particularly good performance and show that it computes the same alignment as does a certain dynamic programming algorithm, while executing over 10 times faster on appropriate data. An implementation of this algorithm is currently used in a program that assembles the UniGene database at the National Center for Biotechnology Information.
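To make the contrast between greedy and dynamic programming alignment concrete, the sketch below computes unit-cost edit distance in two ways: with a full dynamic programming table, and with a greedy method that extends runs of matches along diagonals and only branches when it is forced to spend an edit. This is not the algorithm introduced here, which may use a different scoring scheme and pruning rule. It is a simplified illustration of the general furthest-reaching-diagonal idea, and the function names `dp_edit_distance` and `greedy_edit_distance` are hypothetical.

```python
def dp_edit_distance(a: str, b: str) -> int:
    """Reference: full dynamic programming, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or mismatch
        prev = cur
    return prev[-1]


def greedy_edit_distance(a: str, b: str) -> int:
    """Greedy sketch: for d = 0, 1, 2, ... track, per diagonal k = j - i,
    the furthest row i reachable in `a` using at most d edits."""
    m, n = len(a), len(b)

    def slide(i: int, k: int) -> int:
        # Greedily follow free matches along diagonal k.
        j = i + k
        while i < m and j < n and a[i] == b[j]:
            i += 1
            j += 1
        return i

    target = n - m  # diagonal that contains the end cell (m, n)
    prev = {0: slide(0, 0)}
    if target == 0 and prev[0] >= m:
        return 0

    d = 0
    while True:
        d += 1
        cur = {}
        for k in range(-d, d + 1):
            best = -1
            if k in prev:
                best = max(best, prev[k] + 1)      # mismatch
            if k - 1 in prev:
                best = max(best, prev[k - 1])      # insertion (consume b[j])
            if k + 1 in prev:
                best = max(best, prev[k + 1] + 1)  # deletion (consume a[i])
            if best < 0:
                continue
            i = min(best, m, n - k)  # stay inside both sequences
            if i < 0 or i + k < 0:
                continue             # diagonal lies outside the grid
            cur[k] = slide(i, k)
        if cur.get(target, -1) >= m:
            return d
        prev = cur


if __name__ == "__main__":
    read = "ACGTTGCAACGTAGCTAGCTAGGCTA"
    ref = "ACGTTGCAACGAAGCTAGCTGGCTA"  # one substitution, one deletion
    print(dp_edit_distance(read, ref), greedy_edit_distance(read, ref))
```

Both functions return the same distance, but the greedy version only examines diagonals within d of the main diagonal, where d is the number of differences found so far. When two sequences differ by just a few errors, that is a small fraction of the cells a full table would fill.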

Recent advances in genome engineering technologies based on the CRISPR-associated RNA-guided endonuclease Cas9 are enabling the systematic interrogation of mammalian genome function. Analogous to the search function in modern word processors, Cas9 can be guided to specific locations within complex genomes by a short RNA search string.

Using this system, DNA sequences within the endogenous genome and their functional outputs are now easily edited or modulated in virtually any organism of choice. Cas9-mediated genetic perturbation is simple and scalable, empowering researchers to elucidate the functional organization of the genome at the systems level and establish causal linkages between genetic variations and biological phenotypes.

In this Review, we describe the development of Cas9 and its use in a variety of research and translational applications, while highlighting challenges and future directions. Derived from a remarkable microbial defense system, Cas9 is driving innovative applications from basic biology to biotechnology and medicine.

Efficient folding of many newly synthesized proteins depends on assistance from molecular chaperones, which serve to prevent protein misfolding and aggregation in the crowded environment of the cell. Nascent chain–binding chaperones, including trigger factor, Hsp70, and prefoldin, stabilize elongating chains on ribosomes in a nonaggregated state.

Folding in the cytosol is achieved either on controlled chain release from these factors or after transfer of newly synthesized proteins to downstream chaperones, such as the chaperonins. These are large, cylindrical complexes that provide a central compartment for a single protein chain to fold unimpaired by aggregation. Understanding how the thousands of different proteins synthesized in a cell use this chaperone machinery has profound implications for biotechnology and medicine.