DNA sequencing is the process of determining the order of nucleotides in a DNA strand. DNA sequencing can be done on the entire genome, or on small, targeted regions of the genome, providing answers to a number of genomics research questions.
DNA contains the instructions needed to make living organisms. One of the biggest achievements of the modern era is the sequencing of genetic material, that is, deciphering the order of millions to billions of As, Ts, Cs, and Gs in a sample. Unlocking the secrets of the genome through DNA sequencing can provide answers to a variety of questions, from the simple workings of biology to the complexities of disease. Through DNA sequencing, researchers can identify and better understand important genomic elements, from functional gene sequences to critical regulatory elements, not only in the human genome but also in plant, animal, and microbial genomes.
Generally, DNA sequencing requires fragmenting long DNA strands into shorter fragments, running those fragments through a sequencing reaction, and then piecing the resulting sequence data back together in the correct order.
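The "piecing back together" step can be illustrated with a toy example. The Python sketch below is not how production assemblers work; it is a minimal greedy approach that assumes error-free fragments from a single strand and repeatedly merges the two fragments with the longest suffix-to-prefix overlap. The function names and the min_len parameter are hypothetical, chosen only for this illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that is a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Merge fragments greedily by largest overlap; return the resulting contigs.

    Toy assumptions: no sequencing errors, no reverse complements, no repeats.
    """
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to join; remaining reads stay separate
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    # Three overlapping fragments of the made-up sequence ATGCGTACGTTAGC
    print(greedy_assemble(["GTACGTTA", "GTTAGC", "ATGCGTAC"]))
```

Real genomes contain repeats and reads contain errors, so practical assemblers rely on more robust graph-based methods and, where available, alignment to a reference genome.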
There are two main sequencing approaches: Sanger sequencing and next-generation sequencing.
Sanger sequencing, also known as first-generation sequencing, was used to sequence the first human genome over a period of 13 years. Sanger sequencing relies on chemicals called dideoxynucleotides, also called chain-terminating nucleotides, which carry unique fluorescent tags: one for A, one for T, one for C, and one for G. Once a dideoxynucleotide is added to the growing nucleotide chain, no other nucleotides can be added. The reaction therefore produces many partially complete fragments of different lengths, each ending in a tagged dideoxynucleotide. These fragments are separated by length in a process called capillary electrophoresis. Because the fragments partially overlap and differ in length by single nucleotides, the precise DNA sequence is determined by reading the fluorescent tags in order, from the shortest fragment to the longest.
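The read-out logic can be sketched in a few lines of Python. This is a simplified model, not a simulation of the actual chemistry: it assumes exactly one terminated fragment for every position of the newly synthesized strand, and it represents each fragment only by its length and the tag on its final nucleotide. The function names are hypothetical.

```python
import random


def chain_terminated_fragments(strand: str, seed: int = 0) -> list[tuple[int, str]]:
    """Model the reaction products as (fragment_length, terminal_tag) pairs.

    Assumption: one fragment terminates at every position of `strand`.
    The list is shuffled because the reaction yields fragments in no order.
    """
    fragments = [(n, strand[n - 1]) for n in range(1, len(strand) + 1)]
    random.Random(seed).shuffle(fragments)
    return fragments


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Sort fragments by length (the role of electrophoresis), then read the tags."""
    return "".join(tag for _, tag in sorted(fragments))


if __name__ == "__main__":
    products = chain_terminated_fragments("GATTACA")
    print(read_sequence(products))  # recovers the strand from the shuffled fragments
```

Sorting by length stands in for capillary electrophoresis, and reading the terminal tags from shortest to longest reproduces the order of nucleotides in the synthesized strand.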
Sanger sequencing is slow and cumbersome and has limited ability to identify gene variants and mutations, especially if they are outnumbered by normal copies of a gene. Next-generation sequencing (NGS), also called massively parallel or high-throughput sequencing, is much faster and cheaper, and because it reads each region many times it is better at detecting variants that are present in only a small fraction of copies. To put it in perspective, an entire human genome can be sequenced in a matter of days with next-generation sequencing, not years. Illumina sequencing by synthesis is a common approach to NGS that relies on the generation of DNA strand “clusters”: multiple copies of the same DNA fragment bound to a solid surface called a flow cell. These clusters are generated by a process called “bridge amplification,” essentially a PCR on a chip in which a single DNA strand attached to the solid surface is amplified repeatedly. During sequencing, as nucleotides are added to the strands in a cluster, a fluorescent signal specific to each nucleotide is emitted and captured. Many clusters, and therefore many DNA fragments, can be synthesized and sequenced on a single flow cell, making the method massively parallel and high-throughput.
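To make the "massively parallel" idea concrete, the toy Python sketch below calls bases for several clusters at once. It assumes a simplified four-channel model in which every cluster produces one intensity value per nucleotide in each cycle and the brightest channel determines the base; the data layout and the call_bases function are assumptions made for illustration, not Illumina's actual base-calling software.

```python
CHANNELS = "ACGT"  # assumed channel order for this toy model


def call_bases(intensities: list[list[tuple[float, float, float, float]]]) -> list[str]:
    """Turn per-cycle, per-cluster signals into one read per cluster.

    intensities[cycle][cluster] holds four signal strengths in CHANNELS order.
    """
    n_clusters = len(intensities[0])
    reads = [[] for _ in range(n_clusters)]
    for cycle in intensities:
        # Every cluster is read in the same cycle: this is the parallel step.
        for cluster_id, signal in enumerate(cycle):
            strongest = max(range(len(CHANNELS)), key=lambda ch: signal[ch])
            reads[cluster_id].append(CHANNELS[strongest])
    return ["".join(read) for read in reads]


if __name__ == "__main__":
    # Two cycles, two clusters, made-up intensity values
    signals = [
        [(0.9, 0.1, 0.0, 0.0), (0.0, 0.1, 0.1, 0.8)],
        [(0.1, 0.7, 0.1, 0.1), (0.2, 0.1, 0.6, 0.1)],
    ]
    print(call_bases(signals))
```

Each sequencing cycle extends every read by one base, so the number of reads grows with the number of clusters on the flow cell rather than with the number of separate reactions.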
Aside from the time and money savings, there are several additional benefits to DNA sequencing with NGS: