AAT Bioquest

What are the basic steps involved in next-generation sequencing?

Posted February 9, 2024


Answer

The four basic steps of next-generation sequencing (NGS) are sample extraction, library preparation, DNA sequencing, and data analysis.

  1. Sample extraction isolates the nucleic acid to be sequenced. NGS can be applied to any sample containing DNA or RNA, such as cell cultures, formalin-fixed paraffin-embedded (FFPE) tissues, saliva, and blood. Extraction protocols differ by sample type, and each is designed to yield abundant, high-quality nucleic acid. After extraction, the quantity and quality of the DNA or RNA are assessed. Purity is commonly expressed as the A260/280 ratio (absorbance at 260 nm divided by absorbance at 280 nm), with "pure" DNA typically reading about 1.8 and "pure" RNA about 2.0.
  2. Library preparation involves two main steps: amplification and adapter ligation. If RNA is the starting material, it is first converted to cDNA by reverse transcription. PCR amplification then generates a pool of DNA or cDNA fragments (known as amplicons) of a size compatible with the chosen sequencing system. Next, adapter ligation attaches specific oligonucleotide sequences to the ends of the amplicons, allowing them to bind to the sequencing flow cell. When multiple samples are sequenced in a single run, each sample’s amplicons are tagged with a unique barcode. The barcoded libraries can then be pooled for a single sequencing run and are later demultiplexed (separated by sample) during data analysis.
  3. For DNA sequencing, the prepared library is loaded onto an NGS platform (e.g., Illumina), which sequences many fragments in parallel. The sequencer reads each fragment one nucleotide at a time. The number of reads produced depends on the sequencing platform and kit used. On Illumina instruments, each library fragment binds to oligonucleotides on the flow cell and is amplified, forming millions to billions of clonal clusters. Fluorescently labeled nucleotides are then used to build a complementary strand for each fragment. As each nucleotide is added, the sequencer images the flow cell and records the emission from every cluster. The wavelength and intensity of the fluorescent emission identify the base that was added, and from this the sequence of each template is determined.
  4. During data analysis, software is used to interpret the sequencing data. First, the reads are filtered on criteria such as amplicon size, quality, and agreement between paired ends. Next, the filtered reads are aligned (mapped) to a reference genome. Finally, the reads, whether assembled or raw, are compared with a reference sequence or with reads from another sample to identify variants associated with disease states, among other factors. If the reads have been aligned to a reference genome, variant annotation is used to link the variants to known genes or regulatory sequences.
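
A few of the checks described in the steps above can be sketched in Python. This is a minimal illustration, not how any particular sequencer or analysis package works. All function names are hypothetical, and it assumes each read carries its barcode at the start of the sequence, barcodes are matched exactly, and quality strings use the common Phred+33 FASTQ encoding. The thresholds are placeholders rather than recommended values.

```python
from collections import defaultdict


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/280 absorbance ratio used as a purity estimate."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def demultiplex(reads, barcodes):
    """Group reads by sample using an exact match on a leading barcode.

    reads: iterable of (sequence, quality_string) tuples.
    barcodes: dict mapping barcode sequence -> sample name.
    Assumption: the barcode sits at the start of the read. It is trimmed
    from both the sequence and the quality string once matched.
    """
    by_sample = defaultdict(list)
    for seq, qual in reads:
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                n = len(barcode)
                by_sample[sample].append((seq[n:], qual[n:]))
                break
        else:
            # No barcode matched this read.
            by_sample["unassigned"].append((seq, qual))
    return dict(by_sample)


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def quality_filter(reads, min_mean_q: float = 20.0, min_len: int = 1):
    """Keep reads that meet a minimum length and mean quality."""
    return [
        (seq, qual)
        for seq, qual in reads
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q
    ]


if __name__ == "__main__":
    reads = [
        ("ACGTTTGACC", "IIIIIIIIII"),  # barcode ACGT, high quality
        ("TGCAGGGAAA", "IIII!!!!!!"),  # barcode TGCA, low-quality insert
        ("NNNNACGT", "!!!!!!!!"),      # no recognizable barcode
    ]
    barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
    for sample, sample_reads in demultiplex(reads, barcodes).items():
        print(sample, quality_filter(sample_reads))
```

In practice, barcodes are often read separately from the insert and matched with some tolerance for sequencing errors, so dedicated demultiplexing and filtering tools are normally used instead of code like this.
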
Additional resources

Next-generation sequencing and its clinical application

Next Generation Sequencing (NGS)

MagaDye™ 535-ddGTP