Ideation, Iteration, Innovation At the Frontier of Genome Editing (pt. 1 of 3)

Prime Editing is the latest breakthrough in genome editing and adds a potential tool to the CRISPR-Cas9 toolkit

David Fu
9 min read · May 9, 2020
From an article on gene editing from The Conversation

Contents

  1. Introducing to the World: Prime Editing (pt. 1)
  2. DNA and RNA: A Refresher (pt. 1)
  3. Editing The Genome: Enter CRISPR-Cas9 (pt. 1)
  4. A Complementary Technique to CRISPR-Cas9: Base Editing (pt. 2)
  5. An Evolved Version of Base Editing: Prime Editing (pt. 2)
  6. Why This Is Groundbreaking: Word Processors (pt. 3)
  7. A Final Word On Ideation, Iteration and Innovation (pt. 3)
  8. 🔑 Takeaways (pt. 3)

Skip ahead to pt. 2 or pt. 3.

1. Introducing to the World: Prime Editing

In October 2019, researchers from Harvard Professor David R. Liu’s lab published a paper in Nature (full article accessible via NCBI) announcing a new genome editing technique, which they called ‘Prime Editing’ to distinguish it from ‘Base Editing’ and from classic CRISPR-Cas9 editing techniques.

Liu had previously co-founded two companies, Editas Medicine and Beam Therapeutics, to accelerate research and development of treatments using genome editing techniques. When the paper was published, he also launched a separate company called Prime Medicine to work collaboratively with Beam Therapeutics to research and develop treatments based on Prime Editing.

Why is this groundbreaking?

Technically, Liu’s research group has developed a new way to edit genomes that is fully versatile in the kind of edit (insertion, deletion, substitution), can make larger edits than before (from 1 up to 44 base pairs for insertion and from 1 up to 80 base pairs for deletion), and does so with more precision, more efficiency and fewer unwanted modifications or byproducts. It uses mechanisms that work in both dividing (mitotic) and, most notably, nondividing (post-mitotic) cells.

Broadly, this presents new tools for advancing 1) the pace of biological research, by enabling the creation of precise genetic modifications, 2) the development of treatments and therapeutics for diseases affecting millions worldwide (from sickle-cell disease to Tay-Sachs disease to genetic blindness to cancer) and 3) the modification of plants for higher-yielding or more resistant food crops.

In order to truly appreciate and understand how Prime Editing works and how it was developed, we’ll start with a refresher on DNA/RNA, the use of classic CRISPR-Cas9 in Genome Editing, a complementary technique leveraging Cas9 known as Base Editing and finally Prime Editing. In addition, I share what makes this groundbreaking through an analogy and conclude by discussing the broader process of innovation.

2. DNA and RNA: A Refresher

Our genome is made up of DNA (deoxyribonucleic acid), which is structurally a double helix composed of building blocks (‘nucleotides’). Each building block is made up of three parts: a phosphate group, a sugar group (deoxyribose) and a nitrogenous base. There are four different bases in DNA: Adenine, Thymine, Guanine, Cytosine, or A T G C. Often, we just use these letters to denote the entire nucleotide.

A double helix refers to two strands or chains of nucleotides — the nucleotides on each chain are linked to each other (connected via the sugar/phosphate part of the nucleotide to form the helical ‘backbone’). The two strands are connected to each other through complementary ‘base pairs.’ A is always supposed to be paired with T (and vice versa), and C is supposed to be paired with G (and vice versa), with the bonds between the base pairs linking the chains to form a double helix. (We sometimes call the process of connecting or disconnecting the base pairs zipping or unzipping the double helix.)

A beautiful rendition via Nature, with the left showing high level and right showing detailed structure.

It is an elegant structure: the ‘backbone’, once constructed, is not meant to break apart easily (though we will soon discuss an enzyme that is able to do so, and how we started using it), while the complementary base pairs are bonded more weakly, which makes it easier for DNA to replicate. To replicate, the two strands are unzipped from each other by an enzyme (‘helicase’), and another enzyme (‘DNA polymerase’) uses a single strand as a template, reads each nucleotide one at a time (say a G), selects that nucleotide’s complement (a C), and adds it to a growing chain. Thus, through this methodical process and the information encoded via complementary base pairs, DNA can be replicated efficiently and with fidelity.
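To make the template idea concrete, here is a minimal sketch in Python of a toy ‘polymerase’ that walks a template strand and emits each base’s partner. The function names are my own for illustration. It also ignores a detail not covered above: the two strands run in opposite directions, which is why a second helper returns the reverse complement.

```python
# Watson-Crick pairing rules for DNA: A<->T and C<->G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize_complement(template: str) -> str:
    """Read the template one base at a time and append the paired base."""
    new_strand = []
    for base in template.upper():
        if base not in DNA_COMPLEMENT:
            raise ValueError(f"not a DNA base: {base!r}")
        new_strand.append(DNA_COMPLEMENT[base])
    return "".join(new_strand)


def reverse_complement(template: str) -> str:
    """Complement strand written in its own reading direction (reversed)."""
    return synthesize_complement(template)[::-1]


if __name__ == "__main__":
    print(synthesize_complement("GATTACA"))
    print(reverse_complement("GATTACA"))
```

Because the pairing rules are symmetric, taking the complement of the complement returns the original strand, which is the sense in which either strand fully encodes the other.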

Just as code is a set of instructions your computer follows to execute certain functions or programs, DNA is a set of instructions your cells follow to execute certain functions or programs of life. One of the key ways it does this is that every 3 bases (a ‘codon’) encode an ‘amino acid’ (of which there are 20 kinds), the underlying building block of proteins (of which enzymes are a subclass).
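Since we are comparing DNA to code, the three-bases-per-amino-acid rule can be sketched as a lookup table. The snippet below is a simplified illustration, not how a cell does it: it reads the coding strand of DNA directly (skipping the RNA step), uses the standard genetic code with one-letter amino acid names, and stops at the first stop codon (written `*`). The names `CODON_TABLE` and `translate` are mine.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# Each letter is a one-letter amino acid code; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Read bases three at a time and look up the amino acid for each triplet."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon: the protein ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))  # ATG, GCC, then the stop codon TAA
```

Note that 64 possible triplets map onto only 20 amino acids plus stop signals, so several different triplets encode the same amino acid.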

RNA is another important component used in genome editing, so I want to give an overview of it as well. Although often single-stranded and using a ribose sugar rather than deoxyribose (hence RNA rather than DNA), RNA works remarkably similarly to DNA. It is also composed of nucleotides with complementary bases (although, as shown in the diagram below, it uses Uracil U instead of Thymine T). Thus, for RNA, A pairs with U, while C still pairs with G.

The key is that RNA base pairs are complementary to DNA base pairs, and one of RNA’s functions is as an intermediary template in protein synthesis. It’s rather like a cache or RAM — when making a specific protein, you don’t want to have to unzip and use the entire DNA (the ROM), so you unzip the relevant section, create an RNA template (data then stored in cache or RAM for active APIs or programs), and the RNA template is then used to synthesize the desired protein. That RNA is complementary to DNA is key (U pairs with A, A pairs with T, C pairs with G, G pairs with C), as RNA is used in the various genome editing techniques as a guide to target the precise sequence of DNA that one desires to edit.

This beautiful diagram comparing and contrasting DNA vs. RNA comes from ThoughtCo.
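The RNA-to-DNA pairing rules described above (U with A, A with T, C with G, G with C) fit in a few lines of Python. This is a sketch under simplifying assumptions: it ignores strand direction, and the names `transcribe` and `rna_pairs_with` are hypothetical helpers of mine.

```python
# Which RNA base is laid down opposite each DNA template base.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_dna: str) -> str:
    """Build the RNA strand that is complementary to a DNA template."""
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())


def rna_pairs_with(rna: str, dna: str) -> bool:
    """True if every RNA base is the complement of the DNA base opposite it."""
    return len(rna) == len(dna) and transcribe(dna) == rna.upper()


if __name__ == "__main__":
    rna = transcribe("TACGGA")
    print(rna)
    print(rna_pairs_with(rna, "TACGGA"))
```

The second helper is the property genome editing relies on: a guide RNA ‘recognizes’ a stretch of DNA exactly when this kind of base-by-base pairing holds.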

It is with these four distinct DNA bases (A T C G) that the entire spice of life is encoded. The human genome is over 3 billion base pairs in length, and technological advances have made it steadily cheaper and faster for us to sequence a person’s genome and edit it as well.

3. Editing The Genome: Enter CRISPR-Cas9

There have been various techniques and methods researched, created and tested over the years, but the foremost one that has entered popular consciousness is CRISPR. Starting in 2012, scientists began publishing work showing that they had repurposed and modified an existing natural mechanism (CRISPR) to increase the effectiveness and efficiency of genome editing (and with potential to be cheaper) relative to alternatives like ZFNs or TALENs.

CRISPR: refers to Clustered Regularly Interspaced Short Palindromic Repeats, which are part of a defense mechanism that bacteria evolved to protect themselves from invaders (like viruses and plasmids). This is a section of the bacterium’s genome where it stores short, unique sequences of DNA from viruses or plasmids it has identified as invaders. This is the equivalent of saving a mug shot of every foe they’ve ever come across so they can spot each one quicker in the future (as Megan Molteni put it in this Wired article).

CRISPR Associated Protein 9 (Cas9): refers to a nuclease (an enzyme that can cleave DNA chains) that uses RNA generated from CRISPR to target and destroy the DNA of invading viruses or plasmids. The bacterium uses this RNA identifier to hunt down any matching virus, much like a sheriff riding out with a wanted poster of a criminal (as Daniel Binkoski put it in this Medium piece).

Scientists hypothesized that they could use this machinery if they engineered the right RNA sequences to identify the specific locations in the DNA they wanted to modify, and then trigger editing via the cell’s own repair pathways.

TARGETING

A key step (first published Aug 2012) was showing that in certain CRISPR-Cas systems, CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) together direct Cas9 to bind to and introduce double-strand breaks (DSBs) in targeted DNA. Further, the same scientists showed that they could engineer a combined crRNA and tracrRNA (later dubbed a single guide RNA, or sgRNA) that also directs Cas9 to create targeted DSBs.

The exact mechanism by which Cas9 uses the sgRNA to target DNA and cleave at the right point involves a protospacer and a protospacer adjacent motif (PAM). The sgRNA is engineered with a base sequence (the ‘spacer,’ 20 nucleotides in length per the diagram below) complementary to the specific protospacer sequence of the desired target DNA, so that it ‘directs’ Cas9 to bind to that site through complementary base pairing. In the DNA sequence, each protospacer is always associated with a PAM. The Cas9 endonuclease makes a double-strand break ~3 base pairs (bp) upstream of the PAM, which is NGG (N being any base) for the most commonly used Cas9, derived from Streptococcus pyogenes.

L: 2013 “Genome engineering using the CRISPR-Cas9 system”. R: 2016 “Insert, remove or replace: A highly advanced genome editing system using CRISPR/Cas9”
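The targeting rule above (a 20-base protospacer immediately followed by an NGG PAM, with the cut about 3 bp before the PAM) translates naturally into a pattern search. The sketch below is illustrative only. It scans just the strand it is given (a real design tool would also scan the opposite strand and score each candidate), and `TargetSite` and `find_spcas9_sites` are names I made up.

```python
import re
from typing import List, NamedTuple


class TargetSite(NamedTuple):
    protospacer: str  # 20 bases directly before the PAM
    pam: str          # the NGG motif itself
    cut_index: int    # the break falls just before dna[cut_index]


def find_spcas9_sites(dna: str, spacer_len: int = 20) -> List[TargetSite]:
    """List every protospacer + NGG site on the given strand."""
    dna = dna.upper()
    # A lookahead lets us report overlapping candidate sites.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    sites = []
    for match in pattern.finditer(dna):
        pam_start = match.start() + spacer_len
        # Assumption from the text: the cut is ~3 bp upstream of the PAM.
        sites.append(TargetSite(match.group(1), match.group(2), pam_start - 3))
    return sites


if __name__ == "__main__":
    example = "GATTACAGATTACAGATTAC" + "TGG" + "AAA"
    for site in find_spcas9_sites(example):
        print(site)
```

The PAM requirement is what limits where Cas9 can cut: a sequence with no GG in the right place yields no candidate sites at all.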

EDITING

Once a double-strand break is created, it triggers one of two pathways by which a cell will respond to repair damaged DNA, each of which can be used to achieve a desired editing outcome.

NHEJ: non-homologous end joining. In the absence of a repair template, the cell does this by default: the DSB is reconnected (re-ligated) through the NHEJ process, leaving scars in the form of insertions or deletions (‘indels’). This can be used to knock out a gene, because indels occurring within a coding exon can lead to frameshift mutations and premature stop codons. Both take effect when the gene’s RNA is translated into protein: the former yields a garbled and thus unusable protein, the latter a truncated and thus unusable one. Multiple DSBs can be used to cause larger deletions in the genome.

HDR: homology-directed repair. Although it typically occurs at lower and substantially more variable frequencies than NHEJ, HDR can be used to generate precise, defined modifications at the target DNA (or locus) when a repair template is also introduced. The repair template can be either a conventional double-stranded DNA targeting construct with homology arms flanking the insertion sequence, or a single-stranded DNA oligonucleotide (ssODN). The former allows for large modifications, including insertion of reporter genes such as fluorescent proteins or antibiotic resistance markers (GFP is an example and contains 238 amino acids, meaning a minimum of 714 bp). The latter provides an effective and simple method for making small edits in the genome, such as the introduction of single-nucleotide point mutations.

From 2013 “Genome engineering using the CRISPR-Cas9 system”.
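To see why a small NHEJ indel can knock out a gene, it helps to watch the reading frame shift. The sketch below is a toy example with a made-up 18-base sequence: deleting a single base changes every downstream triplet and happens to create an early stop, while deleting three bases removes one amino acid but keeps the frame. The table is the standard genetic code; `translate` and `delete_bases` are illustrative helpers of mine.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate triplets until the first stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_bases(seq: str, start: int, count: int) -> str:
    """Mimic a deletion-type indel: drop `count` bases beginning at `start`."""
    return seq[:start] + seq[start + count:]


if __name__ == "__main__":
    gene = "ATGGCATTGACCGGTTAA"  # made-up sequence: M A L T G, then stop
    print("original:       ", translate(gene))
    print("1-base deletion:", translate(delete_bases(gene, 3, 1)))
    print("3-base deletion:", translate(delete_bases(gene, 3, 3)))
```

Any indel whose length is not a multiple of three shifts the frame in this way, which is why NHEJ scars inside a coding exon so often disable the gene.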

As the authors describe in the above piece, the advantage of CRISPR-Cas9 over other techniques to date is customization, higher targeting efficiency and the ability to target multiple genomic loci or edits simultaneously. Further, scientists have engineered versions of Cas9 to work in mammalian cells and to create desired changes (e.g., generating engineered eukaryotic cells carrying specific mutations via both NHEJ and HDR and rapid generation of transgenic mice with multiple modified alleles via direct injection of sgRNA and mRNA encoding Cas9 into embryos) to enable exploration and research on other issues. The authors also describe considerations for identifying the target selection, construction and delivery of the sgRNA.

Limitations:

  1. As mentioned in the diagram, HDR is generally active only in dividing cells (e.g., skin cells), as opposed to those that are nondividing in adults (e.g., neurons or brain cells). This is a fundamental limitation of using CRISPR-Cas9 in these types of cells for more precise edits that require HDR.
  2. Since some introduced double-strand breaks will be repaired by NHEJ, undesired insertions or deletions (indels) can happen at the target site. This is an inherent limitation of techniques based on DSBs.
  3. The targeting mechanism is quite complex and requires careful consideration (as documented here; guides can be designed using the resources listed here). Imperfect targeting can often lead to off-target mutagenesis and thus unwanted changes to DNA.
  4. Production of nucleases can be laborious and delivery to cells can be challenging.
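The off-target problem in the third limitation can be illustrated with a simple mismatch count: slide the spacer along a sequence and flag every window that matches it closely but not exactly. This is a deliberately naive sketch. It ignores the PAM, the opposite strand and the fact that mismatch position matters, the sequences are toy-sized and made up, and `count_mismatches` and `find_near_matches` are names of my own.

```python
from typing import List, Tuple


def count_mismatches(a: str, b: str) -> int:
    """Number of positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(
    genome: str, spacer: str, max_mismatches: int
) -> List[Tuple[int, int]]:
    """Return (position, mismatches) for every window close to the spacer."""
    genome, spacer = genome.upper(), spacer.upper()
    hits = []
    for start in range(len(genome) - len(spacer) + 1):
        window = genome[start:start + len(spacer)]
        mismatches = count_mismatches(window, spacer)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    toy_genome = "CCGATTACATTGATGACATT"
    # Position 2 is the intended target; position 11 differs by one base.
    print(find_near_matches(toy_genome, "GATTACA", max_mismatches=1))
```

In a genome of over 3 billion base pairs, near matches like the second hit are the places where a guide may bind and cut unintentionally, which is why guide design tools search for them up front.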

For the 2nd and 3rd limitations, work has been done to reduce off-target effects and to use a Cas9 nickase that only nicks (i.e., cuts one strand) rather than creating a double-strand break, in order to reduce errors and increase efficiency (e.g., as described in this 2016 paper).

Continued in pt. 2 and pt. 3.

David Fu

davidfu.co | Ever-evolving, global ed & innovation entrepreneur | CEO Streetlight Schools | expansion lead 4.0 Schools | ex-i-banker | Joburg Global Shaper @WEF