Title:
METHOD TO GENERATE SAFER BIOTECHNOLOGIES FOR CRISPR-BASED GENE EDITING AND THERAPEUTICS
Document Type and Number:
WIPO Patent Application WO/2024/077291
Kind Code:
A2
Abstract:
In general, the present disclosure is directed to methods and systems for selecting and/or designing crRNA oligonucleotide sequences which display lower promiscuity for inducing off-target mutations in DNA. The example methods and systems utilize a CRISPR effector system to induce selective damage in oligonucleotide sequences included in a multi-plasmid system.

Inventors:
JOSEPHS ERIC (US)
NICHOLAS ASHLEY (US)
DIMIG HILLARY (US)
ROESING MIRANDA (US)
Application Number:
PCT/US2023/076357
Publication Date:
April 11, 2024
Filing Date:
October 09, 2023
Assignee:
THE UNIV OF NORTH CAROLINA AT GREENSBORO (US)
International Classes:
C12N15/74; A61K31/711
Attorney, Agent or Firm:
WIMBISH, J., Clinton (US)
Claims:
CLAIMS

What is claimed:

1. A process for selecting crRNA sequences for CRISPR gene editing comprising: identifying a target oligonucleotide sequence; transferring the target oligonucleotide sequence to a first plasmid, wherein the first plasmid further comprises an oligonucleotide sequence encoding a toxin; obtaining a second plasmid, wherein the second plasmid comprises an oligonucleotide sequence encoding a CRISPR effector; identifying a plurality of off-target oligonucleotide sequences; obtaining a plurality of third plasmids; transferring each off-target oligonucleotide sequences into a different third plasmid from the plurality of third plasmids, wherein each third plasmid from the plurality of third plasmids further comprises an oligonucleotide sequence encoding a crRNA; transfecting an organism with the first plasmid, the second plasmid, and the plurality of third plasmids, wherein, the crRNA sequence comprises an x-gRNA structure, x portion including n-m nucleotides covalently linked to a gRNA portion, and the gRNA portion comprises a sequence of nucleotides complementary to a portion of the target oligonucleotide sequence.

2. The process of claim 1, wherein the target oligonucleotide sequence is selected from a mutation or variant sequence that results in phenotypic abnormality.

3. The process of claim 1, wherein the target oligonucleotide sequence includes a disease causing or promoting sequence.

4. The process of claim 3, wherein the disease is cancer.

5. The process of claim 3, wherein the disease is sickle cell disease.

6. The process of claim 2, wherein at least one of the plurality of off-target oligonucleotide sequences are healthy alleles.

7. The process of claim 1, wherein the CRISPR effector is Cas9.

8. The process of claim 1, wherein the organism is Escherichia coli.

9. The process of claim 1, wherein the x portion is configured to form a secondary structure with the gRNA portion.

10. The process of claim 9, wherein the secondary structure is a hairpin.

11. The process of claim 1, wherein the x portion is determined using a recombinant library.

12. A system for selecting crRNA sequences for CRISPR gene editing comprising: a first plasmid, wherein the first plasmid comprises a target oligonucleotide sequence and a first oligonucleotide sequence encoding a toxin; a second plasmid, wherein the second plasmid comprises a second oligonucleotide sequence encoding a CRISPR effector; a plurality of third plasmids, wherein each third plasmid of the plurality of third plasmids comprises an off-target sequence and a third oligonucleotide sequence encoding a crRNA, wherein, the crRNA sequence comprises an x-gRNA structure, x portion including n-m nucleotides covalently linked to a gRNA portion, and the gRNA portion comprises a sequence of nucleotides complementary to a portion of the target oligonucleotide sequence.

13. The system of claim 12, wherein the target oligonucleotide sequence is selected from a mutation or variant sequence that results in phenotypic abnormality.

14. The system of claim 12, wherein the CRISPR effector is Cas9.

15. The system of claim 12, wherein the x portion is configured to form a secondary structure with the gRNA portion.

16. The system of claim 15, wherein the secondary structure is a hairpin.

17. The system of claim 12, wherein the x portion includes 8-12 nucleotides.

18. A process for selecting crRNA sequences for CRISPR gene editing comprising: identifying a target oligonucleotide sequence; transferring the target oligonucleotide sequence to a first plasmid, wherein the first plasmid further comprises an oligonucleotide sequence encoding a toxin; obtaining a second plasmid, wherein the second plasmid comprises an oligonucleotide sequence encoding a CRISPR effector; identifying a plurality of off-target oligonucleotide sequences; obtaining a plurality of third plasmids; transferring each off-target oligonucleotide sequences into a different third plasmid from the plurality of third plasmids, wherein each third plasmid from the plurality of third plasmids further comprises an oligonucleotide sequence encoding a crRNA and an antibiotic resistance gene; transfecting more than one organism with the first plasmid, the second plasmid, and the plurality of third plasmids, wherein, the crRNA sequence comprises an x-gRNA structure, x portion including 8-12 nucleotides covalently linked to a gRNA portion, and the gRNA portion comprises a sequence of nucleotides complementary to a portion of the target oligonucleotide sequence.

19. The process of claim 18, further comprising inducing expression of the toxin and the CRISPR effector, such that only those organisms where the CRISPR effector induces double strand breaks at the target oligonucleotide sequence survive toxin induction.

20. The process of claim 18, further comprising inducing expression of the CRISPR effector and exposing the organisms to an antibiotic to which the antibiotic resistance gene confers resistance, such that only those organisms where the CRISPR effector fails to induce double strand breaks at the off-target sequences survive the antibiotic exposure.

Description:
METHOD TO GENERATE SAFER BIOTECHNOLOGIES FOR CRISPR-BASED GENE EDITING AND THERAPEUTICS

RELATED APPLICATION DATA

[0001] The present application claims priority pursuant to 35 U.S.C. § 119(e) to United States Provisional Patent Application Number 63/414,112 filed October 7, 2022 which is incorporated herein by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0002] This invention was made with government support under contract nos. 1 R35G M, 133483-01, and 12718261 awarded by National Institute of Health, National Institute of General Medical Sciences, respectively. The government has certain rights in the invention.

SEQUENCE LISTING

[0003] An electronic sequence listing (060026-00064.xml; size 117 KB; date of creation October 4, 2023) submitted herewith is incorporated by reference in its entirety.

FIELD

[0004] The present disclosure relates to methods for selecting and/or designing crRNAs for use in applications such as gene editing to mitigate off-target mutation.

BACKGROUND

[0005] Class II CRISPR effectors like Cas9, Cas12, and Cas13 are endonucleases that use a modular segment of their RNA cofactors known as CRISPR RNAs (crRNAs) or guide RNAs (gRNAs) to recognize and trigger the degradation of nucleic acids with a sequence complementary to that segment (FIG. 1). These diverse enzymes are derived from a bacterial and archaeal defensive response to invasive plasmids and viruses and, because of their ability to be easily redirected to nucleic acid with different sequences by simply changing the sequence composition of that short portion of their gRNAs called their ‘spacer,’ they have been reappropriated over the past several years for a number of different biotechnological applications, most notably in precision gene editing. During precision gene editing, a CRISPR effector is transfected into a human cell and directed to introduce a double strand break (DSB) into the genomic DNA at a specific targeted sequence; genomic mutations are then introduced at those sites as a result of mutagenic DSB repair. These technologies have experienced widespread adoption for biomedical research and possess a number of emerging therapeutic applications as well.

SUMMARY

[0006] Generally, the present disclosure is directed to methods and systems for selecting and/or designing crRNA oligonucleotide sequences which display lower promiscuity for inducing off-target mutations in DNA. The example methods and systems utilize a CRISPR effector system to induce selective damage in oligonucleotide sequences included in a multiplasmid system.

[0007] One aspect of example methods and systems includes an organism for hosting the selection. In one example implementation, the organism is a bacterium such as E. coli.

[0008] In some example implementations of the disclosure, the crRNA oligonucleotide sequences to be selected for are encoded on separate plasmids which are introduced (e.g., transfected) into the organism. Each separate plasmid also includes an off-target sequence. In this manner, reading of the separate plasmids by the bacterium transcription and/or translation machinery will produce the crRNA oligonucleotide sequences and their affinity for off-target sequences will over time select for crRNA oligonucleotide sequences that display low promiscuity, for example by degrading plasmids having off-target sequences for which the crRNA displays affinity/binds.

[0009] Additionally, to select for crRNA nucleotide sequences that display affinity for a target sequence, a target plasmid, including the target oligonucleotide sequence and encoding a toxin is transfected or otherwise present in the bacterium. Based on this design, crRNA sequences which are able to induce breaks in the target plasmid will prevent expression of the toxin, leading to survival of the bacterium.

[0010] Thus the combination of a library of plasmids encoding different crRNA sequences and a target plasmid including the target oligonucleotide sequence can be used to select for crRNA which are effective at degrading a target oligonucleotide sequence and therefore preventing expression of the toxin. Concurrently, the combination of the crRNA with an off-target sequence can be used to select for crRNA sequences that display low promiscuity for the off-target sequence.

[0011] Another aspect of example implementations includes a CRISPR effector. Many endonucleases and variants have been developed to effect gene editing. For implementations of the disclosure, the CRISPR effector is used to induce damage in a double stranded DNA plasmid. Any CRISPR effector which can induce such DNA damage can be used. To include the CRISPR effector in example methods and systems, the CRISPR effector may be encoded in an effector plasmid, which is transduced or otherwise present in the bacterium.

[0012] In one example selection assay, the separate plasmids encoding the crRNA oligonucleotide sequences are transfected into a bacterium, where the bacterium includes a first plasmid having a target oligonucleotide sequence and the first plasmid also encodes a toxin.

[0013] As used herein, oligonucleotide sequences that “encode” a product have sequences that, upon transcription/translation by the bacterium machinery, produce the product (e.g., the toxin and/or an RNA sequence).

[0014] The present disclosure is directed to methods of crRNA design and nucleic acid sequences derived therefrom. In particular, the present disclosure provides methods for designing and/or selecting oligonucleotide sequences displaying reduced promiscuity for off-target mutation in gene editing applications using a CRISPR effector system.

[0015] An example aspect of the present disclosure can include a selection assay for plasmids encoding crRNA sequences. In an example implementation, the plasmids include an oligonucleotide sequence which upon transcription produces a crRNA having the general sequence x-gRNA, where x is a sequence of n-m nucleotides. In some example implementations, the sequence x is capable of forming a secondary structure (e.g., hairpin) with the gRNA portion of the x-gRNA. In other implementations, the sequence x can be determined at random (e.g., using a recombinant library). The gRNA portion can include a sequence of nucleotides complementary to at least a portion of a target sequence (e.g., a mutant or mutated gene).

[0016] As used herein, complementary nucleotides are based on Watson-Crick base pairing rules. More particularly, Adenine (A) can form a base pair with Thymine (T), and Guanine (G) forms a base pair with Cytosine (C). For RNA bases, Thymine is replaced by uracil (U). Using this framework, a complementary sequence to ATGC would be TACG for DNA and UACG for RNA.

[0017] In some example implementations, the plasmids encoding crRNA sequences can each have the same gRNA portion while having different x sequences. In this manner, the assay can be used to determine a leading oligonucleotide sequence x that can reduce promiscuity of the gRNA for off-target mutations.
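The base-pairing rules of paragraph [0016] can be illustrated with a minimal Python sketch. The function name `complement` and its `rna` flag are hypothetical and chosen only for illustration; the sketch pairs bases position by position, as in the ATGC example, and does not reverse the sequence.

```python
# Watson-Crick pairs as stated in the text; U is accepted as input so that
# RNA sequences can also be complemented (illustrative assumption).
DNA_COMPLEMENT = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def complement(seq, rna=False):
    """Return the position-by-position complement of seq.

    With rna=True, thymine in the result is replaced by uracil.
    """
    comp = "".join(DNA_COMPLEMENT[base] for base in seq.upper())
    return comp.replace("T", "U") if rna else comp


print(complement("ATGC"))            # DNA complement
print(complement("ATGC", rna=True))  # RNA complement
```

Under these rules the two calls reproduce the TACG and UACG examples given in the paragraph.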

[0018] Alternatively, in certain implementations, the plasmids encoding gRNA sequences can each have different gRNA portions while having the same x sequences.

[0019] In one example implementation, the plasmid selection may be determined iteratively. For instance, a process according to the present disclosure can include determining an x sequence by conducting a selection assay where the crRNA sequences have the same gRNA. As the plasmids encoding the crRNA are selected for both efficacy and reduced promiscuity, the remaining plasmids (e.g., after a certain number of bacterium growth cycles) can be isolated and sequenced to determine a selected x oligonucleotide sequence. A second selection assay can be performed to determine a gRNA sequence by conducting the assay where the crRNA sequences each include the selected x oligonucleotide sequence while including different gRNA sequences. As should be understood, this process can be performed in any order (e.g., to determine a selected gRNA sequence then a selected x oligonucleotide sequence or vice versa).

[0020] An example embodiment of the present disclosure can also include a process for crRNA selection. One aspect of the process can include identifying a target genetic sequence. The target genetic sequence can be selected from any organism. In a preferred embodiment, the target genetic sequence is selected from a mammal. In a further preferred embodiment, the target genetic sequence is selected from a mutation or variant sequence that results in phenotypic abnormality. For instance, the target genetic sequence can include a disease (e.g., cancer or sickle cell disease) causing and/or promoting sequence.

[0021] Another aspect of an example selection process can include obtaining a first plasmid including the target genetic sequence, where the first plasmid further includes an oligonucleotide sequence encoding a toxin.

[0022] A still further aspect of the example selection process can also include obtaining a second plasmid, where the second plasmid includes an oligonucleotide sequence encoding a CRISPR effector system (e.g., cas9).

[0023] The example selection process can also include identifying one or more off-target sequences (e.g., healthy alleles).

[0024] In an example selection process, the process may include obtaining a plurality of third plasmids, each of the third plasmids including a distinct off-target sequence of the one or more off-target sequences, and where the third plasmid includes an oligonucleotide sequence further encoding a crRNA sequence.

[0025] In some implementations, the selection assay can also include transfecting the organism with the first plasmid, the second plasmid, and/or the plurality of third plasmids.

[0026] According to embodiments of the present disclosure, the crRNA sequence can comprise an x-gRNA structure including an ‘x’ oligonucleotide portion covalently attached to a gRNA oligonucleotide portion, wherein the gRNA oligonucleotide portion is an oligonucleotide sequence substantially complementary to a portion of the target oligonucleotide sequence.

[0027] In some implementations, identifying the target sequence can include determining a sequence position for each of one or more protospacer motifs based at least in part on the CRISPR effector, where each of the one or more protospacer motifs includes an adjacent sequence of nucleotides; assigning at least one sequence position as a protospacer position; and identifying the two or more target sequences as a sequence of nucleotides immediately downstream (toward the 3' end) of the protospacer position.

[0028] For certain example methods the CRISPR effector can be enAsCas12a. For certain example methods the CRISPR effector can be Cas9.

[0029] In some example methods the one or more protospacer motifs are from the group consisting of: NGG, NAG, TTYN, CTTV, RTTC, TATM, CTCC, TCCC, TACA, RTTS, TATA, TGTV, ANCC, CVCC, TGCC, GTCC, TTAC, or combinations thereof.

[0030] In some example methods, the one or more protospacer motifs are from the group consisting of: NGG, NAG, TTYN, CTTV, RTTC, TATM, CTCC, TCCC, TACA, or combinations thereof.

[0031] In some example methods, the CRISPR effector is Cas9 and the one or more protospacer motifs are from the group consisting of: NGG, NAG, or combinations thereof.
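The motif-based target identification described in paragraphs [0027] to [0031] can be sketched in Python as follows. This is only an illustration under stated assumptions: the motif letters are interpreted with the standard IUPAC nucleotide codes (for example V = A, C, or G and M = A or C), the 20-nucleotide target length is assumed, and the function names are hypothetical. The sketch follows the downstream convention stated in paragraph [0027], which corresponds to effectors such as Cas12a whose motif lies 5' of the target.

```python
import re

# Standard IUPAC nucleotide codes for the letters appearing in the motifs
# (interpretation assumed for this sketch).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "M": "[AC]",
    "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate a motif such as 'TTYN' into a regular expression."""
    return "".join(IUPAC[base] for base in motif)


def find_targets(dna, motif, target_len=20):
    """Return (motif_position, target) pairs for every motif occurrence.

    The target is the target_len nucleotides immediately downstream
    (toward the 3' end) of the motif; occurrences too close to the end
    of the sequence to yield a full-length target are skipped.
    """
    # A lookahead is used so that overlapping motif occurrences are found.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    hits = []
    for match in pattern.finditer(dna):
        start = match.start() + len(motif)
        target = dna[start:start + target_len]
        if len(target) == target_len:
            hits.append((match.start(), target))
    return hits
```

Calling `find_targets(sequence, "TTYN")` on a DNA string returns each motif position together with the candidate target that follows it; the same call with other motifs from the lists above scans for those motifs instead.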

BRIEF DESCRIPTION OF THE FIGURES

[0032] Figure 1 illustrates a diagram depicting how CRISPR effectors recognize DNA complementary to their RNA cofactors to introduce double strand breaks (DSBs) at those sites. Genetic mutations result from non-conservative repair of those sites.

[0033] Figure 2 illustrates a diagram depicting crRNA structures in combination with a CRISPR effector to show extending the crRNA (x-gRNA) outside the DNA-targeting region (gray “spacer”) can introduce internal interactions that inhibit full invasion of off-target sequences.

[0034] Figure 3 illustrates a graph (left) and an image of a predicted secondary structure of the crRNA (right). The x-gRNA with unusual predicted structure performs best.

[0035] Figure 4 illustrates a schematic depicting an example embodiment of the disclosure. X-gRNA libraries are co-transformed with a toxic plasmid and x-gRNA screened for cleavage activity (vs. a toxic plasmid with target site) and specificity (no activity vs. off-target on the guide plasmid). (Right). Only active and specific x-gRNAs will survive for rescreening and analysis.

[0036] Figure 5 illustrates images showing that, using the regular EMX1 sgRNA, Cas9 is active and can cleave the toxin plasmid: bacteria survive (activity positive selection, left column images). If there is a known EMX1 off-target introduced on the guide plasmid, self-cleavage will cause the bacteria to die (specificity negative selection, right column images).

[0037] Figure 6 illustrates a gel. Three randomly picked colonies from the EMX1 x-gRNA library after screening (as in Figure 5, top right image) were sequenced. The x-gRNAs from those colonies were synthesized to test Cas9 cleavage activity vs DNA with the EMX1 target (ON) and a known off-target (OFF1) in vitro. All three screened x-gRNAs exhibited strong activity vs. on-target DNA (comparable to sgRNA) but abolished activity at the off-target site, which differs by only 2 PAM-distal nucleotides.

[0038] Figure 7 illustrates a graph displaying in vitro cleavage activity of the top five extended gRNAs generated from the invention and their activity at the target site and four sites where the regular gRNA is known to exhibit significant off-target activity. N=3 independent trials.

[0039] Figure 8 illustrates a graph depicting the ratio of Cas9 activity at oncogenic mutations (KRAS G12D) vs healthy gene sequence (KRAS WT) based on data from Table 1.

[0040] Figure 9A is a schematic of the process of the present invention. SECRETS screen is provided to identify short 5’-nucleotide extensions to gRNAs that increase Cas9 gene editing specificity for a specific target with known off-targets. Simplified schematics of the three SECRETS plasmids: (i) high-copy number plasmid with inducible toxin and target sequence of interest to select for “on-target” activity, (ii) medium copy plasmid for inducible expression of Cas9, and (iii) low-copy plasmid for gRNA expression and counterselection for “off-target” activity with kanamycin resistance cassette and a known “off-target” sequence for the gRNA.

[0041] Figure 9B is a schematic of the process of the present invention. SECRETS screen is provided to identify short 5’-nucleotide extensions to gRNAs that increase Cas9 gene editing specificity for a specific target with known off-targets. (Upper) a standard gRNA has 20 nt “spacer” (DNA-targeting) segment while (lower) an extended gRNA (x-gRNA) has an additional 8-12 nt to the 5’-end of the spacer.

[0042] Figure 9C is a schematic of the process of the present invention. SECRETS screen is provided to identify short 5’-nucleotide extensions to gRNAs that increase Cas9 gene editing specificity for a specific target with known off-targets. During SECRETS selection, E. coli expressing standard gRNAs that exhibit promiscuous activity are unable to survive.

[0043] Figure 9D is a schematic of the process of the present invention. SECRETS screen is provided to identify short 5’-nucleotide extensions to gRNAs that increase Cas9 gene editing specificity for a specific target with known off-targets. Members of large, randomized x-gRNA libraries can be screened simultaneously using SECRETS for 5’-extensions that result in efficient on-target Cas9 activity and minimized off-target activity that can be used for highly-specific gene editing applications at that target.

[0044] Figure 10A are images depicting results of the SECRETS method. SECRETS is highly selective for on-target activity and counter-selective for off-target activity. SECRETS screen with pSECRETS-A, an empty toxin plasmid (pSECRETS-C with no target), and pSECRETS-B with an EMX1 sgRNA after plating on non-selective (left) vs selective (right) conditions.

[0045] Figure 10B are images depicting results of the SECRETS method. SECRETS is highly selective for on-target activity and counter-selective for off-target activity. SECRETS screen with pSECRETS-A, pSECRETS-C with EMX1 ON, and pSECRETS-B with an EMX1 sgRNA on non-selective (left) vs selective (right) conditions.

[0046] Figure 10C are images depicting results of the SECRETS method. SECRETS is highly selective for on-target activity and counter-selective for off-target activity. SECRETS screen with pSECRETS-A, pSECRETS-C with EMX1 ON, and pSECRETS-B with an EMX1 sgRNA and EMX1 OFF1 on non-selective (left) vs selective (right) conditions.

[0047] Figure 10D are images depicting results of the SECRETS method. SECRETS is highly selective for on-target activity and counter-selective for off-target activity. SECRETS screen with pSECRETS-A, pSECRETS-C with EMX1 ON, and pSECRETS-B with an EMX1 x-gRNA library (N8) and EMX1 OFF1 on non-selective (left) vs selective (right) conditions.

[0048] Figure 11A is a graphical representation of the selection and validation of x-gRNAs. The top five x-gRNAs recovered from two SECRETS screen replicates for the EMX1 gRNA were identified via next-generation sequencing (NGS) for in vitro validation.

[0049] Figure 11B is a graphical representation of the selection and validation of x-gRNAs. All five x-gRNAs exhibit remarkable on-target activity (comparable or greater than “enhanced specificity” Cas9 variant eCas9) and greatly reduced off-target activities across four known off-target sites for the standard x-gRNA during in vitro cleavage assays using purified Cas9 and in vitro-transcribed (x-)gRNAs. n = 3 replicates each. Error bars = standard error of the mean.

[0050] Figure 11C is a graphical representation of the selection and validation of x-gRNAs. Highly-specific x-gRNAs could also be readily identified from SECRETS screens for gRNAs for FANCF and VEGFA targets, n > 4 each. Error bars = standard error of the mean.

[0051] Figure 11D is a schematic representation of the selection and validation of x-gRNAs. CHANGE-seq5 reveals that x-gRNAs identified through SECRETS exhibit minimal and greatly-reduced genome-wide off-target activity, with no novel off-target activity identified relative to the standard gRNA. n = 2 biological replicates each.

[0052] Figure 11E is a schematic and graphical representation of the selection and validation of x-gRNAs. x-gRNAs identified through SECRETS exhibit robust gene editing activity following transfection of purified Cas9 ribonucleoprotein (RNP) complexes. sg = standard gRNA, dCas9 = catalytically inactive Cas9, eCas9 = enhanced specificity Cas9, NGS = next-generation (Illumina) sequencing, T7E1 = T7E1 mutation detection assay, n = 3 biological replicates, 2 NGS replicates, and 3 T7E1 replicates. Error bars = binomial confidence (NGS) and standard error of the mean (T7E1).

[0053] Figure 12 are representative gel images of Cas9 RNP cleavage activity with EMX1 sgRNA and x-gRNAs.

[0054] Figure 13A is a representation of a gel image showing trials using FANCF.

[0055] Figure 13B is a representation of a gel image showing trials using VEGF.

[0056] Figure 13C is a graphical representation of in vitro cleavage of FANCF target and top off-target sites by Cas9 and eCas9 using WT and hairpin gRNAs.

[0057] Figure 13D is a graphical representation of in vitro cleavage of VEGF target and top off-target sites by Cas9 and eCas9 using WT and hairpin gRNAs.

[0058] Figure 14A is a schematic representation of modifications to the SECRETS plasmids to expand the 5’-extension space and selection strength. Examples of potential modifications to the SECRETS plasmids.

[0059] Figure 14B is a gel image of modifications to the SECRETS plasmids to expand the 5’-extension space and selection strength. EMX1 x-gRNA libraries with 8 random nucleotides and 4 potential (tetraloop) sequences between N8 and the spacer (UUCG, GCAA, CUUG, or no tetraloop sequence) were screened against EMX1 OFF1 using the SECRETS assay (4^8 × 4 = 262,144 possible 5’-extensions). 3 colonies of the survivors were picked at random and sequenced using colony PCR and Sanger sequencing then characterized using in vitro cleavage assays of purified RNPs, and all 3 exhibited extremely strong on-target activity (comparable to sgRNA) and essentially no off-target activity at EMX1 OFF1. EMX1 sgRNA is SEQ ID NO: 105 - GAGUCCGAGCAGAAGAAGAA; EMX1 xt1-gRNA is SEQ ID NO: 106 - UUGAGUAUUUCGGAGUCCGAGCAGAAGAAGAA; EMX1 xt2-gRNA is SEQ ID NO: 107 - CGCUCUAUGAGUCCGAGCAGAAGAAGAA; and EMX1 xt3-gRNA is SEQ ID NO: 108 - CUUGAGUUGCAAGAGUCCGAGCAGAAGAAGAA.
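The library-size arithmetic and the structure of the three recovered x-gRNAs in paragraph [0059] can be checked with a short Python sketch. The sequences are copied from SEQ ID NOs: 105-108 as listed above; the function names are hypothetical, and splitting each 5’-extension into 8 random nucleotides followed by an optional tetraloop is one reading of the library design described in the paragraph, not a statement of how the sequences were analyzed.

```python
# Spacer (SEQ ID NO: 105) and recovered x-gRNAs (SEQ ID NOs: 106-108).
SPACER = "GAGUCCGAGCAGAAGAAGAA"
X_GRNAS = {
    "xt1": "UUGAGUAUUUCGGAGUCCGAGCAGAAGAAGAA",
    "xt2": "CGCUCUAUGAGUCCGAGCAGAAGAAGAA",
    "xt3": "CUUGAGUUGCAAGAGUCCGAGCAGAAGAAGAA",
}
# Four linker options between N8 and the spacer; "" means no tetraloop.
TETRALOOPS = ["UUCG", "GCAA", "CUUG", ""]
N_RANDOM = 8


def library_size(n_random=N_RANDOM, n_linkers=len(TETRALOOPS)):
    """Number of possible 5'-extensions: 4 bases per random position."""
    return 4 ** n_random * n_linkers


def split_extension(x_grna, spacer=SPACER):
    """Split an x-gRNA into (random N8 part, tetraloop) ahead of the spacer."""
    if not x_grna.endswith(spacer):
        raise ValueError("x-gRNA does not end with the spacer")
    extension = x_grna[: len(x_grna) - len(spacer)]
    random_part, linker = extension[:N_RANDOM], extension[N_RANDOM:]
    if len(random_part) != N_RANDOM or linker not in TETRALOOPS:
        raise ValueError("extension does not match the N8 + tetraloop design")
    return random_part, linker


print(library_size())
for name, seq in X_GRNAS.items():
    print(name, split_extension(seq))
```

Each listed x-gRNA ends with the 20-nt EMX1 spacer, and its extension is 8 or 12 nucleotides long, consistent with the 8-12 nt range given for x-gRNAs.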

[0060] Figure 15A is a representation of two gel images showing trials using EMXI.

[0061] Figure 15B is a representation of two gel images showing trials using FANCF.

[0062] Figure 16 is a graphical representation of in vitro cleavage of human gene HBB and top off-target site by Cas9 with WT or extended guide RNA. X-1, X-2, and X-3 are extended gRNA prepared according to the method of the present invention and compared to WT sgRNA to treat sickle cell disease. ON refers to the on-target sequence and OT1 refers to an off-target sequence. Error bars represent standard deviation.

DETAILED DESCRIPTION

[0063] In general, the present disclosure is directed to methods and systems for selection and/or design of crRNA sequences to perform gene editing using a CRISPR effector. An example aspect of the present disclosure can include selecting for crRNA sequences displaying low promiscuity while maintaining efficacy for degrading a target sequence.

[0064] For implementations of the present disclosure, the CRISPR effector can include any CRISPR effector that can be implemented as part of a CRISPR system to result in breakage of nucleotide oligomers such as RNA or DNA. Some non-limiting examples of CRISPR effectors that can be used in embodiments of the disclosure include enAsCas12a (Cas12a), RfxCas13d (Cas13d), and/or SpyCas9 (Cas9).

[0065] In accordance with example implementations of the present disclosure, an example embodiment can include a process for selecting crRNA sequences for CRISPR gene editing. Aspects of the process can include: identifying a target oligonucleotide sequence; transferring the target oligonucleotide sequence to a first plasmid, wherein the first plasmid further comprises an oligonucleotide sequence encoding a toxin; obtaining a second plasmid, wherein the second plasmid comprises an oligonucleotide sequence encoding a CRISPR effector (e.g., cas9); identifying a plurality of off-target oligonucleotide sequences; obtaining a plurality of third plasmids; transferring each off-target oligonucleotide sequences in a different third plasmid from the plurality of third plasmids, wherein each third plasmid from the plurality of third plasmids further comprises an oligonucleotide sequence encoding a crRNA; transfecting an organism with the first plasmid, the second plasmid, and the plurality of third plasmids, wherein, the crRNA sequence comprises an x-gRNA form, x portion including n-m nucleotides covalently linked to a gRNA portion, and the gRNA portion comprises a sequence of nucleotides complementary to a portion of the target oligonucleotide sequence.

[0066] A further example embodiment can include a system for selecting gRNA sequences for CRISPR gene editing. The example system can include: a first plasmid, where the first plasmid includes a target oligonucleotide sequence and a first oligonucleotide sequence encoding a toxin, a second plasmid, where the second plasmid comprises a second oligonucleotide sequence encoding a CRISPR effector (e.g., cas9), and a plurality of third plasmids, wherein each third plasmid of the plurality of third plasmids comprises an off-target sequence and a third oligonucleotide sequence encoding a crRNA, wherein,

[0067] The crRNA has an x-gRNA form, x portion including n-m nucleotides covalently linked to a gRNA portion, and the gRNA portion comprises a sequence of nucleotides complementary to a portion of the target oligonucleotide sequence.

[0068] As discussed, certain CRISPR effectors may display preferred recognition and/or binding to certain protospacer motifs. For instance, using a CRISPR effector of the present disclosure, the one or more protospacer motifs can include one or more from the group: TTYN, CTTV, RTTC, TATM, CTCC, TCCC, TACA, RTTS, TATA, TGTV, ANCC, CVCC, TGCC, GTCC, TTAC, or combinations thereof. In some implementations, the one or more protospacer motifs can include a subset of this group. For example, in certain embodiments, the one or more protospacer motifs are from the group: TTYN, CTTV, RTTC, TATM, CTCC, TCCC, TACA, or combinations thereof. More particularly, some embodiments can include identifying target sequences that occur downstream of the position of one or more of these protospacer motifs in the viral genome. As used herein, protospacer motifs are provided as nucleotide sequences: A - adenosine, C - cytosine, T - thymidine, G - guanosine, U - uridine, N - any nucleotide, R - adenosine or guanosine, S - guanosine or cytosine, Y - a pyrimidine (C, T, or U).

[0069] The present invention will be better understood with reference to the following nonlimiting examples.

EXAMPLES

[0070] The present examples provide aspects of embodiments of the present disclosure. These examples are not meant to limit embodiments solely to such examples herein, but rather to illustrate some possible implementations.

EXAMPLE 1

Principle

[0071] A goal of therapeutic gene editing is to be able to make any genetic or epigenetic change that may treat a genetic disorder with both high efficiency and extreme precision and specificity. Achieving extreme specificity would not only eliminate clinical risk for off-target mutational events, but would also present new therapeutic opportunities of CRISPR gene editing like allele-specific gene editing. As such, what is most exciting beyond the demonstration of feasibility for additional (specially designed) nucleotides to improve specificity is that the now available vast 'design space' of x-gRNA sequences creates unprecedented potential to control CRISPR effector behavior and specificity. While the targeting sequence remains fully intact, the sequences of the extra nucleotides could be made to form base-pairing secondary structures with different segments of the target-binding region with tunable strengths, which suggests that x-gRNAs could be made where the secondary structure is displaced specifically upon invasion of a 'disease' allele but not at the healthy allele, or generally inhibits invasion/interaction at off-targets (FIG. 2).
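The idea that extra 5’ nucleotides can base-pair with a chosen segment of the target-binding region can be illustrated with a minimal Python sketch. It is not the design procedure of the disclosure: the function names are hypothetical, only Watson-Crick pairs are counted (G-U wobble pairs and loop geometry are ignored), and the segment length is used as a crude stand-in for the tunable pairing strength mentioned above.

```python
# Watson-Crick RNA pairs only; wobble pairs are ignored in this sketch.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def pairing_extension(spacer, start, length):
    """Extension able to fold back onto spacer[start:start + length].

    Two RNA strands pair in antiparallel orientation, so the extension is
    the reverse complement of the chosen spacer segment.
    """
    return reverse_complement_rna(spacer[start:start + length])


def count_pairs(extension, segment):
    """Count Watson-Crick pairs when the extension folds back on the segment."""
    return sum(
        RNA_COMPLEMENT[x] == s for x, s in zip(reversed(extension), segment)
    )


spacer = "GAGUCCGAGCAGAAGAAGAA"  # EMX1 spacer used as an example input
extension = pairing_extension(spacer, start=0, length=6)
print(extension, count_pairs(extension, spacer[:6]))
```

Choosing a different `start` or `length` targets a different segment of the spacer or changes the number of pairs, which is the kind of design freedom the paragraph describes.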

Initial Results

[0072] While it was found that x/hp-gRNAs could significantly outperform the state-of-the-art (FIG. 3), as other x/hp-gRNAs were manually screened, it was found that some did not make targeting more specific; others also inhibited activity on-target; and still others (controls) that were not predicted to increase specificity sometimes did. Ultimately, with the small sample size of x-gRNAs, overarching design rules could not be defined to predict de novo increases in specificity.

Approach

[0073] To empirically identify x-gRNAs that are both functional (mutational activity on-target) and specific (lower activity off-target or at 'healthy' alleles, as appropriate), a new approach was developed and validated to screen hundreds of thousands of x-gRNA variants by modifying an Escherichia coli-based selection method that has been used to isolate Cas9 and Cas12 variants with improved specificity and activity during directed evolution. The selection is based on the idea that in E. coli, cleavage or induction of DSBs in a plasmid results in plasmid degradation rather than repair. CRISPR activity on-target can therefore be positively selected for via degradation of a toxic plasmid that contains the targeted sequence, and CRISPR activity off-target can be negatively selected against by introducing the off-target sequences on another plasmid with antibiotic resistance genes and using antibiotics (FIG. 4).
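The dual selection described above can be summarized as a small truth table. The following Python sketch is a deliberately simplified, all-or-nothing model (real cleavage is probabilistic and depends on copy number and induction conditions); the function name `survives` is hypothetical.

```python
def survives(on_target_cleaved, off_target_cleaved):
    """Simplified model of the selection: a cell survives only if the
    toxin plasmid (carrying the on-target site) is cleaved and degraded,
    while the antibiotic-resistance plasmid (carrying the off-target
    site) is left intact."""
    toxin_plasmid_removed = on_target_cleaved
    resistance_plasmid_kept = not off_target_cleaved
    return toxin_plasmid_removed and resistance_plasmid_kept


if __name__ == "__main__":
    # Enumerate all four combinations of on-/off-target cleavage.
    for on_target in (True, False):
        for off_target in (True, False):
            outcome = "survives" if survives(on_target, off_target) else "dies"
            print(f"on-target cleaved={on_target}, "
                  f"off-target cleaved={off_target}: {outcome}")
```

Under this model, only a guide that is active on-target and inactive off-target yields a surviving colony, which is the behavior the screen selects for.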

[0074] A plasmid was introduced for the tightly-controlled expression of S. pyogenes Cas9 as well as a medium copy-number plasmid for the tightly-controlled expression of E. coli toxin CcdB into which a “target” sequence (ON) from the human EMX1 gene was cloned (FIG. 5). A third, low-copy-number plasmid expresses a well-characterized EMX1 sgRNA to the EMX1 target, but also contains a known EMX1 off-target sequence (OFF1) where Cas9 with this sgRNA exhibits significant cleavage activity (FIG. 5). After transformation and simultaneous expression of CcdB, Cas9, and the sgRNA, this screen simultaneously exhibited strong positive selection of Cas9 activity on target (able to degrade the toxin plasmid and survive; FIG. 5, left column images) and extremely strong negative selection of off-target activity (unable to survive if the gRNA plasmid contained the OFF1 sequence; FIG. 5, right column images). This selectivity was further confirmed using 14 other known EMX1 off-targets (containing up to 4 mismatches) on the gRNA plasmid: 14 of the 15 off-targets tested (OFF13 being the only exception) exhibited significant negative selection relative to gRNA plasmids with no off-target (lacking a PAM sequence).
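The off-targets above are described by their number of mismatches relative to the target. As a minimal sketch, mismatches between two equal-length protospacers can be counted as follows; the sequences in the usage example are hypothetical placeholders, not the EMX1 sequences, and `count_mismatches` is an illustrative name.

```python
def count_mismatches(target, candidate):
    """Count position-by-position differences between two protospacers
    of equal length (a Hamming distance; no insertions or deletions)."""
    if len(target) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(target.upper(), candidate.upper()))


if __name__ == "__main__":
    # Hypothetical 20-nt sequences used only to exercise the function.
    on_target = "ACGTACGTACGTACGTACGT"
    off_target = "ACGTACGTACGAACGTACGA"
    print(count_mismatches(on_target, off_target))
```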

[0075] An EMX1 "x-gRNA library" was then generated by cloning randomized sets of 8 nts upstream of the gRNA sequence (sets of 65,536 unique x-gRNA sequences) together with one of three different 4 nt tetraloop sequences (as had previously been done to protect the x-gRNA from degradation and promote interactions with the targeting segment), or no tetraloop. Unlike when the EMX1 sgRNA plasmid was used (FIG. 5), when the EMX1 x-gRNA library was introduced into the screening protocol, many survivors were found. Three survivor colonies were picked for Sanger sequencing of their x-gRNA, and the sequenced x-gRNAs were tested in vitro for biochemical activity using purified Cas9 on DNA containing ON and OFF1 (FIG. 6). As expected, the EMX1 sgRNA exhibited cleavage activity at both ON and OFF1 in vitro, but all three selected x-gRNAs exhibited activity comparable to the sgRNA at the ON site while effectively abolishing cleavage activity at OFF1. Therefore, with no human "design" component or biophysical intuition necessary, the screen allowed de novo identification of several high-activity, high-specificity x-gRNAs from a library of several hundred thousand, where most nt extensions would ordinarily be expected to either have no effect on off-target activity or be deleterious to on-target activity.
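The library size quoted above follows from simple counting. The sketch below assumes that each of the four loop options (three tetraloops or none) is combined with every randomized 8 nt extension, which is one reading of the text; the constant and function names are hypothetical.

```python
from itertools import product

EXTENSION_LENGTH = 8   # randomized nucleotides per extension
N_LOOP_OPTIONS = 4     # assumption: three tetraloops plus "no tetraloop"

# Four possible bases at each of eight positions.
extensions_per_set = 4 ** EXTENSION_LENGTH
library_size = extensions_per_set * N_LOOP_OPTIONS


def iter_extensions(length=EXTENSION_LENGTH):
    """Lazily enumerate every randomized extension of the given length."""
    for combo in product("ACGT", repeat=length):
        yield "".join(combo)


if __name__ == "__main__":
    print(extensions_per_set)   # unique extensions per set
    print(library_size)         # total variants under the stated assumption
    print(next(iter_extensions()))
```

With these assumptions the total is 4 x 65,536 = 262,144 variants, consistent with "several hundred thousand".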

[0076] Next-generation sequencing was performed on the survivors and the top five extensions in terms of abundance were picked. Their activity was then screened in vitro at the target site and top four known off-target sites for EMX1.
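Ranking survivors by abundance can be sketched as a simple count over the extension sequences recovered from the sequencing reads. This is an illustrative sketch only: it assumes the extension has already been extracted from each read, and the function name `top_extensions` and the example reads are hypothetical.

```python
from collections import Counter


def top_extensions(extension_reads, n=5):
    """Return the n most abundant extension sequences with their counts,
    most abundant first."""
    return Counter(extension_reads).most_common(n)


if __name__ == "__main__":
    # Hypothetical reads; real input would come from NGS of survivors.
    reads = ["AAAAAAAA"] * 3 + ["CCCCCCCC"] * 2 + ["GGGGGGGG"]
    for extension, count in top_extensions(reads, n=2):
        print(extension, count)
```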

[0077] As seen in FIG. 7, these extensions allowed the Cas9 to cleave its intended target with similar efficiency as the regular gRNA, while effectively eliminating activity at all four of the top known off-targets for the EMX1 gRNA. The x-gRNAs generated through this screening approach of the present invention are therefore broadly more specific than the regular gRNAs and likely to be significantly safer for therapeutic applications. Similarly, extensions may be used with other CRISPR effectors, such as Cas12. Extensions may be on the 3’ side of the crRNA, the 5’ side of the crRNA, or combinations thereof. For example, with a Cas12 CRISPR effector, the extension may be on the 3’ side of the crRNA.

EXAMPLE 2

Target Mutation in KRAS

[0078] KRAS is the most highly mutated oncogene in many cancers of the lung, pancreas, and colon (a driver of up to 20% of all cancer) and in many cases carries a negative prognosis. Oncogenic KRAS is considered essentially 'undruggable', meaning its mutationally activated and cancer-causing forms are, with one exception, unable to be inhibited by traditional pharmaceuticals. These activating mutations occur as a spectrum of SNVs at the hotspot codons 12 and 13 that have not been very easily targeted by CRISPR, or only with poor specificity. To test the limits of the present invention, a screen was performed to see if Cas9 could be used to cleave DNA with a single nucleotide difference that causes cancer (KRAS G12D) while minimizing activity on the healthy sequence. Before the screen, the regular gRNA targeted the healthy and cancer-causing mutant sequences essentially equally (FIG. 8). After the screen, gRNAs with hairpins shifted the ratio of cleavage activity at the oncogenic mutation sequence vs. the healthy sequence to nearly 2:1. Cleavage efficiencies are shown in Table 1.

Table 1 - Cleavage efficiency at cancer-causing (left) and healthy (right) sequences for the oncogene KRAS. N=4 independent trials.

[0079] This example shows that the method of the invention is capable of generating gRNAs with increased "specificity" in targeting disease-causing mutations even under very difficult and stringent conditions (a single nucleotide difference).

EXAMPLE 3

Principle

[0080] For a CRISPR guide RNA (gRNA) with a specific target but activity at known “off-target” sequences, a method was presented to screen hundreds of thousands of gRNA variants with short, randomized 5’ nucleotide extensions near its DNA-targeting segment. This modification can increase Cas9 gene editing specificity by orders of magnitude with certain 5’-extension sequences, via some as-yet-unknown mechanism that makes de novo design of the extension sequence difficult to perform manually. The screen robustly identifies extended gRNAs (x-gRNAs) that have been counter-selected against activity at those off-target sites and that exhibit significantly enhanced Cas9 specificity for their intended targets.

[0081] CRISPR effector Cas9 from Streptococcus pyogenes (SpyCas9) has emerged over the past several years as a powerful biotechnological tool that also holds tremendous therapeutic potential in the treatment of genetic diseases. This potential arises from the ability of CRISPR effectors to use a modular segment of their RNA co-factor, their guide RNA or ‘gRNA’, to recognize DNA sequences complementary to its ‘spacer’ segment and introduce targeted mutations into the DNA at those sites. However, oftentimes a gRNA for a specific target can cause the Cas9 nuclease to introduce “off-target” double-strand breaks (DSBs) and mutations at similar nucleotide sequences that are also present in that genome, and the possibility of unintended or uncontrolled Cas9-induced mutational events raises significant concerns for those therapeutic applications. In particular, it is becoming increasingly important to recognize that individuals may carry unique or “personal” off-target sequences for a therapeutic gRNA as a result of genetic variations that exist between people and/or across different populations, and that these unique off-target sequences must be accounted for in an era of personalized medicine.

[0082] Although there have not been any approaches developed to directly limit activity at those specifically-identified off-target sequences for a gRNA of interest, there are a few ways to reduce “off-target” activity overall and increase the specificity of CRISPR systems in general. For example, these general approaches include reducing cellular exposure to Cas9 nucleases or selectively inhibiting Cas9 nuclease activity altogether, as well as using engineered, “high fidelity” or “enhanced specificity” Cas9 variants such as eCas9. eCas9 effectors have amino acid substitutions designed to reduce their overall affinity for DNA in a way that decreases the probability that the effectors’ latent nuclease domains will become activated at sequences with imperfect complementarity to their gRNA. These engineered Cas9 variants tend to exhibit nuclease activity that has been reduced overall by up to several orders of magnitude. Modification of the gRNA itself has also been found to modulate Cas9 specificity: gRNAs with chemically modified bases, phosphates, or sugars can exhibit increased specificity overall relative to unmodified gRNAs, although the optimal combination of modifications for a specific target/off-target can be difficult to predict de novo. Removing a few nucleotides from the 5’ end of the DNA-targeting segment of the gRNA (from 20 nt to 17 or 18 nt) to generate truncated gRNAs (‘tru-gRNAs’) can also decrease off-target activity, an effect likely caused by a general destabilization of gRNA/DNA interactions when the spacers are shortened.
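For illustration, the truncation that produces a tru-gRNA spacer amounts to dropping nucleotides from the 5’ end of the 20 nt spacer. The Python sketch below uses a hypothetical placeholder spacer and a hypothetical function name, `truncate_spacer`.

```python
def truncate_spacer(spacer, new_length=17):
    """Shorten a spacer by removing nucleotides from its 5' end,
    keeping the 3'-most new_length nucleotides."""
    if not 0 < new_length <= len(spacer):
        raise ValueError("new_length must be between 1 and len(spacer)")
    return spacer[len(spacer) - new_length:]


if __name__ == "__main__":
    # Hypothetical 20-nt spacer written 5' to 3'.
    spacer = "ACGTACGTACGTACGTACGT"
    print(truncate_spacer(spacer, 18))
    print(truncate_spacer(spacer, 17))
```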

[0083] Recently, it was found that adding short nucleotide extensions (~6 to ~16 nts) to the 5’-end of the gRNA next to its DNA-targeting ‘spacer’ segment (FIG. 9B), especially those that were predicted to form ‘hairpin’ or secondary structures with the spacer designed to interfere with gRNA interactions at specific off-target sequences, could significantly reduce Cas9 off-target activity while maintaining on-target mutational efficiencies. On average, the specificity in targeting during gene editing for these “hairpin-gRNAs” or “hp-gRNAs” increased 50-fold (and up to 200-fold) relative to gene editing using standard gRNAs, and this approach worked in diverse CRISPR effectors for multiple target sites each. While those hp-gRNAs could significantly outperform the state-of-the-art, the 5’-extended sequences were each designed and tested one-at-a-time, manually, for each targeted sequence and set of off-targets. At the time, it was also found that some of the tested 5’-extensions did not effectively reduce off-target activity; others also significantly inhibited on-target activity; and still others (controls) that were not predicted to increase specificity occasionally did. Because there are a very large number of possible short 5’-extensions and because, in principle, different 5’-extensions for the same spacer sequence can be fine-tuned or optimized to limit activity vs. specific off-target sequences, the inability to predict de novo which of those sequences will increase the specificity of an associated Cas9/gRNA ribonucleoprotein (RNP) has so far limited their utility in practice in eliminating the risk of off-target mutation during gene editing.

Materials and Methods

DNA oligonucleotides, dsDNA, and plasmids:

[0084] DNA sequences for all oligonucleotides, dsDNA fragments, and plasmids are listed in Tables 2, 3, and 4, respectively.

Table 2 - Oligonucleotide and primer sequences

Table 3 - dsDNA fragments used to clone pSECRETS plasmids

Table 4 - Plasmid Sequences

pSECRETS-A (SEQ ID NO: 99):

GACGTCTTAAGACCCACTTTCACATTTAAGTTGTTTTTCTAATCCGCATATGATCAAT

TCAAGGCCGAATAAGAAGGCTGGCTCTGCACCTTGGTGATCAAATAATTCGATAGCT

TGTCGTAATAATGGCGGCATACTATCAGTAGTAGGTGTTTCCCTTTCTTCTTTAGCG A

CTTGATGCTCTTGATCTTCCAATACGCAACCTAAAGTAAAATGCCCCACAGCGCTGA

GTGCATATAATGCATTCTCTAGTGAAAAACCTTGTTGGCATAAAAAGGCTAATTGAT

TTTCGAGAGTTTCATACTGTTTTTCTGTAGGCCGTGTACCTAAATGTACTTTTGCTC C

ATCGCGATGACTTAGTAAAGCACATCTAAAACTTTTAGCGTTATTACGTAAAAAATC

TTGCCAGCTTTCCCCTTCTAAAGGGCAAAAGTGAGTATGGTGCCTATCTAACATCTC

AATGGCTAAGGCGTCGAGCAAAGCCCGCTTATTTTTTACATGCCAATACAATGTAGG

CTGCTCTACACCTAGCTTCTGGGCGAGTTTACGGGTTGTTAAACCTTCGATTCCGAC C

TCATTAAGCAGCTCTAATGCGCTGTTAATCACTTTACTTTTATCTAAACGAGACATA C

TCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGAT A

CATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCG

AAAAGTGCCACCTGACGTCCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTAT

CGAAATTTCCCTATCAGTGATAGAGATTGACATCCCTATCAGTGATAGAGATACTGA

GCACATCAGCAGGACGCACTGACCAGGGAGACCCAAGCTTGCCACCATGGTGTACC

CCTACGACGTGCCCGACTACGCCGAATTGCCTCCAAAAAAGAAGAGAAAGGTAGGG

ATCCGAACCATGGATAAGAAATACTCAATAGGCTTAGATATCGGCACAAATAGCGT

CGGATGGGCGGTGATCACTGATGAATATAAGGTTCCGTCTAAAAAGTTCAAGGTTCT

GGGAAATACAGACCGCCACAGTATCAAAAAAAATCTTATAGGGGCTCTTTTATTTGA

CAGTGGAGAGACAGCGGAAGCGACTCGTCTCAAACGGACAGCTCGTAGAAGGTATA

CACGTCGGAAGAATCGTATTTGTTATCTACAGGAGATTTTTTCAAATGAGATGGCGA

AAGTAGATGATAGTTTCTTTCATCGACTTGAAGAGTCTTTTTTGGTGGAAGAAGACA

AGAAGCATGAACGTCATCCTATTTTTGGAAATATAGTAGATGAAGTTGCTTATCATG

AGAAATATCCAACTATCTATCATCTGCGAAAAAAATTGGTAGATTCTACTGATAAAG

CGGATTTGCGCTTAATCTATTTGGCCTTAGCGCATATGATTAAGTTTCGTGGTCATT T

TTTGATTGAGGGAGATTTAAATCCTGATAATAGTGATGTGGACAAACTATTTATCCA

GTTGGTACAAACCTACAATCAATTATTTGAAGAAAACCCTATTAACGCAAGTGGAGT

AGATGCTAAAGCGATTCTTTCTGCACGATTGAGTAAATCAAGACGATTAGAAAATCT

CATTGCTCAGCTCCCCGGTGAGAAGAAAAATGGCTTATTTGGGAATCTCATTGCTTT

GTCATTGGGTTTGACCCCTAATTTTAAATCAAATTTTGATTTGGCAGAAGATGCTAA ATTACAGCTTTCAAAAGATACTTACGATGATGATTTAGATAATTTATTGGCGCAAAT

TGGAGATCAATATGCTGATTTGTTTTTGGCAGCTAAGAATTTATCAGATGCTATTTT A

CTTTCAGATATCCTAAGAGTAAATACTGAAATAACTAAGGCTCCCCTATCAGCTTCA

ATGATTAAACGCTACGATGAACATCATCAAGACTTGACTCTTTTAAAAGCTTTAGTT

CGACAACAACTTCCAGAAAAGTATAAAGAAATCTTTTTTGATCAATCAAAAAACGG

ATATGCAGGTTATATTGATGGGGGAGCTAGCCAAGAAGAATTTTATAAATTTATCAA

ACCAATTTTAGAAAAAATGGATGGTACTGAGGAATTATTGGTGAAACTAAATCGTG

AAGATTTGCTGCGCAAGCAACGGACCTTTGACAACGGCTCTATTCCCCATCAAATTC

ACTTGGGTGAGCTGCATGCTATTTTGAGAAGACAAGAAGACTTTTATCCATTTTTAA

AAGACAATCGTGAGAAGATTGAAAAAATCTTGACTTTTCGAATTCCTTATTATGTTG

GTCCATTGGCGCGTGGCAATAGTCGTTTTGCATGGATGACTCGGAAGTCTGAAGAAA

CAATTACCCCATGGAATTTTGAAGAAGTTGTCGATAAAGGTGCTTCAGCTCAATCAT

TTATTGAACGCATGACAAACTTTGATAAAAATCTTCCAAATGAAAAAGTACTACCAA

AACATAGTTTGCTTTATGAGTATTTTACGGTTTATAACGAATTGACAAAGGTCAAAT

ATGTTACTGAAGGAATGCGAAAACCAGCATTTCTTTCAGGTGAACAGAAGAAAGCC

ATTGTTGATTTACTCTTCAAAACAAATCGAAAAGTAACCGTTAAGCAATTAAAAGAA

GATTATTTCAAAAAAATAGAATGTTTTGATAGTGTTGAAATTTCAGGAGTTGAAGAT

AGATTTAATGCTTCATTAGGTACCTACCATGATTTGCTAAAAATTATTAAAGATAAA

GATTTTTTGGATAATGAAGAAAATGAAGATATCTTAGAGGATATTGTTTTAACATTG

ACCTTATTTGAAGATAGGGAGATGATTGAGGAAAGACTTAAAACATATGCTCACCTC

TTTGATGATAAGGTGATGAAACAGCTTAAACGTCGCCGTTATACTGGTTGGGGACGT

TTGTCTCGAAAATTGATTAATGGTATTAGGGATAAGCAATCTGGCAAAACAATATTA

GATTTTTTGAAATCAGATGGTTTTGCCAATCGCAATTTTATGCAGCTGATCCATGAT G

ATAGTTTGACATTTAAAGAAGACATTCAAAAAGCACAAGTGTCTGGACAAGGCGAT

AGTTTACATGAACATATTGCAAATTTAGCTGGTAGCCCTGCTATTAAAAAAGGTATT

TTACAGACTGTAAAAGTTGTTGATGAATTGGTCAAAGTAATGGGGCGGCATAAGCC

AGAAAATATCGTTATTGAAATGGCACGTGAAAATCAGACAACTCAAAAGGGCCAGA

AAAATTCGCGAGAGCGTATGAAACGAATCGAAGAAGGTATCAAAGAATTAGGAAGT

CAGATTCTTAAAGAGCATCCTGTTGAAAATACTCAATTGCAAAATGAAAAGCTCTAT

CTCTATTATCTCCAAAATGGAAGAGACATGTATGTGGACCAAGAATTAGATATTAAT

CGTTTAAGTGATTATGATGTCGATCACATTGTTCCACAAAGTTTCCTTAAAGACGAT T CAATAGACAATAAGGTCTTAACGCGTTCTGATAAAAATCGTGGTAAATCGGATAAC

GTTCCAAGTGAAGAAGTAGTCAAAAAGATGAAAAACTATTGGAGACAACTTCTAAA

CGCCAAGTTAATCACTCAACGTAAGTTTGATAATTTAACGAAAGCTGAACGTGGAG

GTTTGAGTGAACTTGATAAAGCTGGTTTTATCAAACGCCAATTGGTTGAAACTCGCC

AAATCACTAAGCATGTGGCACAAATTTTGGATAGTCGCATGAATACTAAATACGATG

AAAATGATAAACTTATTCGAGAGGTTAAAGTGATTACCTTAAAATCTAAATTAGTTT

CTGACTTCCGAAAAGATTTCCAATTCTATAAAGTACGTGAGATTAACAATTACCATC

ATGCCCATGATGCGTATCTAAATGCCGTCGTTGGAACTGCTTTGATTAAGAAATATC

CAAAACTTGAATCGGAGTTTGTCTATGGTGATTATAAAGTTTATGATGTTCGTAAAA

TGATTGCTAAGTCTGAGCAAGAAATAGGCAAAGCAACCGCAAAATATTTCTTTTACT

CTAATATCATGAACTTCTTCAAAACAGAAATTACACTTGCAAATGGAGAGATTCGCA

AACGCCCTCTAATCGAAACTAATGGGGAAACTGGAGAAATTGTCTGGGATAAAGGG

CGAGATTTTGCCACAGTGCGCAAAGTATTGTCCATGCCCCAAGTCAATATTGTCAAG

AAAACAGAAGTACAGACAGGCGGATTCTCCAAGGAGTCAATTTTACCAAAAAGAAA

TTCGGACAAGCTTATTGCTCGTAAAAAAGACTGGGATCCAAAAAAATATGGTGGTTT

TGATAGTCCAACGGTAGCTTATTCAGTCCTAGTGGTTGCTAAGGTGGAAAAAGGGA

AATCGAAGAAGTTAAAATCCGTTAAAGAGTTACTAGGGATCACAATTATGGAAAGA

AGTTCCTTTGAAAAAAATCCGATTGACTTTTTAGAAGCTAAAGGATATAAGGAAGTT

AAAAAAGACTTAATCATTAAACTACCTAAATATAGTCTTTTTGAGTTAGAAAACGGT

CGTAAACGGATGCTGGCTAGTGCCGGAGAATTACAAAAAGGAAATGAGCTGGCTCT

GCCAAGCAAATATGTGAATTTTTTATATTTAGCTAGTCATTATGAAAAGTTGAAGGG

TAGTCCAGAAGATAACGAACAAAAACAATTGTTTGTGGAGCAGCATAAGCATTATT

TAGATGAGATTATTGAGCAAATCAGTGAATTTTCTAAGCGTGTTATTTTAGCAGATG

CCAATTTAGATAAAGTTCTTAGTGCATATAACAAACATAGAGACAAACCAATACGT

GAACAAGCAGAAAATATTATTCATTTATTTACGTTGACGAATCTTGGAGCTCCCGCT

GCTTTTAAATATTTTGATACAACAATTGATCGTAAACGATATACGTCTACAAAAGAA

GTTTTAGATGCCACTCTTATCCATCAATCCATCACTGGTCTTTATGAAACACGCATT G

ATTTGAGTCAGCTAGGAGGTGACTAACTCGAGTAAGGATCTCCAGGCATCAAATAA

AACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGA

ACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA C

CTAGGGATATATTCCGCTTCCTCGCTCACTGACTCGCTACGCTCGGTCGTTCGACTG C GGCGAGCGGAAATGGCTTACGAACGGGGCGGAGATTTCCTGGAAGATGCCAGGAAG

ATACTTAACAGGGAAGTGAGAGGGCCGCGGCAAAGCCGTTTTTCCATAGGCTCCGC

CCCCCTGACAAGCATCACGAAATCTGACGCTCAAATCAGTGGTGGCGAAACCCGAC

AGGACTATAAAGATACCAGGCGTTTCCCCCTGGCGGCTCCCTCGTGCGCTCTCCTGT

TCCTGCCTTTCGGTTTACCGGTGTCATTCCGCTGTTATGGCCGCGTTTGTCTCATTC C

ACGCCTGACACTCAGTTCCGGGTAGGCAGTTCGCTCCAAGCTGGACTGTATGCACGA

ACCCCCCGTTCAGTCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAA

CCCGGAAAGACATGCAAAAGCACCACTGGCAGCAGCCACTGGTAATTGATTTAGAG

GAGTTAGTCTTGAAGTCATGCGCCGGTTAAGGCTAAACTGAAAGGACAAGTTTTGGT

GACTGCGCTCCTCCAAGCCAGTTACCTCGGTTCAAAGAGTTGGTAGCTCAGAGAACC

TTCGAAAAACCGCCCTGCAAGGCGGTTTTTTCGTTTTCAGAGCAAGAGATTACGCGC

AGACCAAAACGATCTCAAGAAGATCATCTTATTAATCAGATAAAATATTTCTAGATT

TCAGTGCAATTTATCTCTTCAAATGTAGCACCTGAAGTCAGCCCCATACGATATAAG

TTGTTACTAGTGCTTGGATTCTCACCAATAAAAAACGCCCGGCGGCAACCGAGCGTT

CTGAACAAATCCAGATGGAGTTCTGAGGTCATTACTGGATCTATCAACAGGAGTCCA

AGCGAGCTCGATATCAAATTACGCCCCGCCCTGCCACTCATCGCAGTACTGTTGTAA

TTCATTAAGCATTCTGCCGACATGGAAGCCATCACAAACGGCATGATGAACCTGAAT

CGCCAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCCCATGGTGAAAAC

GGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAATCAAAACTGGTGAAACTCA

CCCAGGGATTGGCTGAGACGAAAAACATATTCTCAATAAACCCTTTAGGGAAATAG

GCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGG

AAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAAA

ACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCC

ATACGAAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAGGCCGG

ATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTG A

ACGGTCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTA

CGATGCCATTGGGATATATCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTA G

CTTCCTTAGCTCCTGAAAATCTCGATAACTCAAAAAATACGCCCGGTAGTGATCTTA

TTTCATTATGGTGAAAGTTGGAACCTCTTACGTGCCGATCAACGTCTCATTTTCGCC A

GATATC

pSECRETS-B (SEQ ID NO: 100):

GACGTCTTAAGACCCACTTTCACATTTAAGTTGTTTTTCTAATCCGCATATGATCAA T

TCAAGGCCGAATAAGAAGGCTGGCTCTGCACCTTGGTGATCAAATAATTCGATAGCT

TGTCGTAATAATGGCGGCATACTATCAGTAGTAGGTGTTTCCCTTTCTTCTTTAGCG A

CTTGATGCTCTTGATCTTCCAATACGCAACCTAAAGTAAAATGCCCCACAGCGCTGA

GTGCATATAATGCATTCTCTAGTGAAAAACCTTGTTGGCATAAAAAGGCTAATTGAT

TTTCGAGAGTTTCATACTGTTTTTCTGTAGGCCGTGTACCTAAATGTACTTTTGCTC C

ATCGCGATGACTTAGTAAAGCACATCTAAAACTTTTAGCGTTATTACGTAAAAAATC

TTGCCAGCTTTCCCCTTCTAAAGGGCAAAAGTGAGTATGGTGCCTATCTAACATCTC

AATGGCTAAGGCGTCGAGCAAAGCCCGCTTATTTTTTACATGCCAATACAATGTAGG

CTGCTCTACACCTAGCTTCTGGGCGAGTTTACGGGTTGTTAAACCTTCGATTCCGAC C

TCATTAAGCAGCTCTAATGCGCTGTTAATCACTTTACTTTTATCTAAACGAGACATA C

TCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGAT A

CATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCG

AAAAGTGCCACCTGACGTCCTCTGGCTAACTAGAGAACCCACTGCTTACTGGCTTAT

CGAAATTTCCCTATCAGTGATAGAGATTGACATCCCTATCAGTGATAGAGATACTGA

GCACGTGAGACCCATGCCATAGCGTTGTTTAGGGATAACAGGGTAATACTGTCCACA

CAATCTGCCCTGGTCTCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGT

CCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTGGCCGGCATGGTCC

CAGCCTCCTCGCTGGCGCCGGCTGGGCAACATGCTTCGGCATGGCGAATGGGACGA

CATCACCTCCCACAACGAAGACTACACCATCGTTGAACAGTACGAACGTGCTGAAG

GTCGTCACTCCACCGGTGCTTAAGGATCCAAACTCGAGTAAGGATCTCCAGGCATCA

AATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTC

GGTGAACGCTCTCTACTAGAGTCACACTGGCTCACCTTCGGGTGGGCCTTTCTGCGT

TTATACCTAGGGTACGGGTTTTGCTGCCCGCAAACGGGCTGTTCTGGTGTTGCTAGT T

TGTTATCAGAATCGCAGATCCGGCTTCAGCCGGTTTGCCGGCTGAAAGCGCTATTTC

TTCCAGAATTGCCATGATTTTTTCCCCACGGGAGGCGTCACTGGCTCCCGTGTTGTC G

GCAGCTTTGATTCGATAAGCAGCATCGCCTGTTTCAGGCTGTCTATGTGTGACTGTT G

AGCTGTAACAAGTTGTCTCAGGTGTTCAATTTCATGTTCTAGTTGCTTTGTTTTACT G

GTTTCACCTGTTCTATTAGGTGTTACATGCTGTTCATCTGTTACATTGTCGATCTGT TC

ATGGTGAACAGCTTTGAATGCACCAAAAACTCGTAAAAGCTCTGATGTATCTATCTT TTTTACACCGTTTTCATCTGTGCATATGGACAGTTTTCCCTTTGATATGTAACGGTGA

ACAGTTGTTCTACTTTTGTTTGTTAGTCTTGATGCTTCACTGATAGATACAAGAGCC A

TAAGAACCTCAGATCCTTCCGTATTTAGCCAGTATGTTCTCTAGTGTGGTTCGTTGT T

TTTGCGTGAGCCATGAGAACGAACCATTGAGATCATACTTACTTTGCATGTCACTCA

AAAATTTTGCCTCAAAACTGGTGAGCTGAATTTTTGCAGTTAAAGCATCGTGTAGTG

TTTTTCTTAGTCCGTTATGTAGGTAGGAATCTGATGTAATGGTTGTTGGTATTTTGT C

ACCATTCATTTTTATCTGGTTGTTCTCAAGTTCGGTTACGAGATCCATTTGTCTATC T

AGTTCAACTTGGAAAATCAACGTATCAGTCGGGCGGCCTCGCTTATCAACCACCAAT

TTCATATTGCTGTAAGTGTTTAAATCTTTACTTATTGGTTTCAAAACCCATTGGTTA A

GCCTTTTAAACTCATGGTAGTTATTTTCAAGCATTAACATGAACTTAAATTCATCAA G

GCTAATCTCTATATTTGCCTTGTGAGTTTTCTTTTGTGTTAGTTCTTTTAATAACCA CT

CATAAATCCTCATAGAGTATTTGTTTTCAAAAGACTTAACATGTTCCAGATTATATT T

TATGAATTTTTTTAACTGGAAAAGATAAGGCAATATCTCTTCACTAAAAACTAATTC

TAATTTTTCGCTTGAGAACTTGGCATAGTTTGTCCACTGGAAAATCTCAAAGCCTTT A

ACCAAAGGATTCCTGATTTCCACAGTTCTCGTCATCAGCTCTCTGGTTGCTTTAGCT A

ATACACCATAAGCATTTTCCCTACTGATGTTCATCATCTGAGCGTATTGGTTATAAG T

GAACGATACCGTCCGTTCTTTCCTTGTAGGGTTTTCAATCGTGGGGTTGAGTAGTGC C

ACACAGCATAAAATTAGCTTGGTTTCATGCTCCGTTAAGTCATAGCGACTAATCGCT

AGTTCATTTGCTTTGAAAACAACTAATTCAGACATACATCTCAATTGGTCTAGGTGA

TTTTAATCACTATACCAATTGAGATGGGCTAGTCAATGATAATTACTAGTCCTTTTC C

CGGGTGATCTGGGTATCTGTAAATTCTGCTAGACCTTTGCTGGAAAACTTGTAAATT

CTGCTAGACCCTCTGTAAATTCCGCTAGACCTTTGTGTGTTTTTTTTGTTTATATTC AA

GTGGTTATAATTTATAGAATAAAGAAAGAATAAAAAAAGATAAAAAGAATAGATCC

CAGCCCTGTGTATAACTCACTACTTTAGTCAGTTCCGCAGTATTACAAAAGGATGTC

GCAAACGCTGTTTGCTCCTCTACAAAACAGACCTTAAAACCCTAAAGGCTTAAGTAG

CACCCTCGCAAGCTCGGGCAAATCGCTGAATATTCCTTTTGTCTCCGACCATCAGGC

ACCTGAGTCGCTGTCTTTTTCGTGACATTCAGTTCGCTGCGCTCACGGCTCTGGCAG T

GAATGGGGGTAAATGGCACTACAGGCGCCTTTTATGGATTCATGCAAGGAAACTAC

CCATAATACAAGAAAAGCCCGTCACGGGCTTCTCAGGGCGTTTTATGGCGGGTCTGC

TATGTGGTGCTATCTGACTTTTTGCTGTTCAGCAGTTCCTGCCCTCTGATTTTCCAG TC

TGACCACTTCGGATTATCCCGTGACAGGTCATTCAGACTGGCTAATGCACCCAGTAA GGCAGCGGTATCATCAACAGGCTTACCCGTCTTACTGTCCCTAGTGCTTGGATTCTC

ACCAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAGATGGAGTT

CTGAGGTCATTACTGGATCTATCAACAGGAGTCCAAGCGAGCTCTCGAACCCCAGA

GTCCCGCTCAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGCGCTGCGAATCG

GGAGCGGCGATACCGTAAAGCACGAGGAAGCGGTCAGCCCATTCGCCGCCAAGCTC

TTCAGCAATATCACGGGTAGCCAACGCTATGTCCTGATAGCGGTCCGCCACACCCAG

CCGGCCACAGTCGATGAATCCAGAAAAGCGGCCATTTTCCACCATGATATTCGGCA

AGCAGGCATCGCCATGGGTCACGACGAGATCCTCGCCGTCGGGCATGCGCGCCTTG

AGCCTGGCGAACAGTTCGGCTGGCGCGAGCCCCTGATGCTCTTCGTCCAGATCATCC

TGATCGACAAGACCGGCTTCCATCCGAGTACGTGCTCGCTCGATGCGATGTTTCGCT

TGGTGGTCGAATGGGCAGGTAGCCGGATCAAGCGTATGCAGCCGCCGCATTGCATC

AGCCATGATGGATACTTTCTCGGCAGGAGCAAGGTGAGATGACAGGAGATCCTGCC

CCGGCACTTCGCCCAATAGCAGCCAGTCCCTTCCCGCTTCAGTGACAACGTCGAGCA

CAGCTGCGCAAGGAACGCCCGTCGTGGCCAGCCACGATAGCCGCGCTGCCTCGTCC

TGCAGTTCATTCAGGGCACCGGACAGGTCGGTCTTGACAAAAAGAACCGGGCGCCC

CTGCGCTGACAGCCGGAACACGGCGGCATCAGAGCAGCCGATTGTCTGTTGTGCCC

AGTCATAGCCGAATAGCCTCTCCACCCAAGCGGCCGGAGAACCTGCGTGCAATCCA

TCTTGTTCAATCATGCGAAACGATCCTCATCCTGTCTCTTGATCAGATCATGATCCC C

TGCGCCATCAGATCCTTGGCGGCAAGAAAGCCATCCAGTTTACTTTGCAGGGCTTCC

CAACCTTACCAGAGGGCGCCCCAGCTGGCAATTCC

pSECRETS-C (p11-LacY-wtx1, commercially available at n2t.net/addgene:69056) (SEQ ID NO: 101):

TCGATGCATAATGTGCCTGTCAAATGGACGAAGCAGGGATTCTGCAAACCCTATGCT

ACTCCGTCAAGCCGTCAATTGTCTGATTCGTTACCAATTATGACAACTTGACGGCTA

CATCATTCACTTTTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC

ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAACATTGCGACC GA

CGGTGGCGATAGGCATCCGGGTGGTGCTCAAAAGCAGCTTCGCCTGGCTGATACGTT

GGTCCTCGCGCCAGCTTAAGACGCTAATCCCTAACTGCTGGCGGAAAAGATGTGAC

AGACGCGACGGCGACAAGCAAACATGCTGTGCGACGCTGGCGATATCAAAATTGCT

GTCTGCCAGGTGATCGCTGATGTACTGACAAGCCTCGCGTACCCGATTATCCATCGG TGGATGGAGCGACTCGTTAATCGCTTCCATGCGCCGCAGTAACAATTGCTCAAGCAG

ATTTATCGCCAGCAGCTCCGAATAGCGCCCTTCCCCTTGCCCGGCGTTAATGATTTG

CCCAAACAGGTCGCTGAAATGCGGCTGGTGCGCTTCATCCGGGCGAAAGAACCCCG

TATTGGCAAATATTGACGGCCAGTTAAGCCATTCATGCCAGTAGGCGCGCGGACGA

AAGTAAACCCACTGGTGATACCATTCGCGAGCCTCCGGATGACGACCGTAGTGATG

AATCTCTCCTGGCGGGAACAGCAAAATATCACCCGGTCGGCAAACAAATTCTCGTCC

CTGATTTTTCACCACCCCCTGACCGCGAATGGTGAGATTGAGAATATAACCTTTCAT

TCCCAGCGGTCGGTCGATAAAAAAATCGAGATAACCGTTGGCCTCAATCGGCGTTA

AACCCGCCACCAGATGGGCATTAAACGAGTATCCCGGCAGCAGGGGATCATTTTGC

GCTTCAGCCATACTTTTCATACTCCCGCCATTCAGAGAAGAAACCAATTGTCCATAT

TGCATCAGACATTGCCGTCACTGCGTCTTTTACTGGCTCTTCTCGCTAACCAAACCG G

TAACCCCGCTTATTAAAAGCATTCTGTAACAAAGCGGGACCAAAGCCATGACAAAA

ACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATTTGCAC

GGCGTCACACTTTGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACC T

GACGCTTTTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGCG A

TTGAAAACGATGCAGTTTAAGGTTTACACCTATAAAAGAGAGAGCCGTTATCGTCTG

TTTGTGGATGTACAGAGTGATATTATTGACACGCCCGGGCGACGGATGGTGATCCCC

CTGGCCAGTGCACGTCTGCTGTCAGATAAAGTCTCCCGTGAACTTTACCCGGTGGTG

CATATCGGGGATGAAAGCTGGCGCATGATGACCACCGATATGGCCAGTGTGCCGGT

CTCCGTTATCGGGGAAGAAGTGGCTGATCTCAGCCACCGCGAAAATGACATCAAAA

ACGCCATTAACCTGATGTTTTGGGGAATATAATCTAGCATTACGCTAGGGATAACAG

GGTAATATCACGCTCTAGACATACGGCATGCAAGCTTGGCTGTTTTGGCGGATGAGA

GAAGATTTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAAC

AGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCA

GAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGG

GAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGT

TTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCG

GATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGGACGCCCGCCATA

AACTGCCAGGCATCAAATTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTT

TCTACAAACTCTTTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGA G

ACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTC AACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCT

CACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGT

GGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGA

AGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATC

CCGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATG

ACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTA

AGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTT

CTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGA

TCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACG

ACGAGCGTGACACCACGATGCCTGCAGCAATGGCAACAACGTTGCGCAAACTATTA

ACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCG

GATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCT

GATAAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCC

AGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTAT

GGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGT

AACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTACGCGCCCTGTAGC

GGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGC

CAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGC C

GGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCT T

TACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTTCACGTAGTGGGCCAT

CGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTG G

ACTCTTGTTCCAAACTTGAACAACACTCAACCCTATCTCGGGCTATTCTTTTGATTT A

TAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAA

TTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTAAAAGGATCTAGGTGAA

GATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTG A

GCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGC

GTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCG

GATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATA

CCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTA

GCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGC

GATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCA GCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCT

ACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAA

GGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCA

CGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCC

ACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGA

AAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTC A

CATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGA G

TGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGA

GGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTC

ACACCGCATACGTACGATTTAAATAGGCCTGACTCACTATAGGGAGACCGGAATTCC

CTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATG

TGAGTTAGCTCACTCATTAGGGACCCCGGGCTTTACACTTTATGCTTCCGGCTCGTA T

GTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATG

ATTACGGATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGTAAAACCCGGGCGTT

ACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCAGGCGTAATAAGGAA

AGGATCCATGTACTATTTGAAAAACACAAACTTTTGGATGTTCGGTTTATTCTTTTT C

TTTTACTTTTTTATCATGGGAGCCTACTTCCCGTTTTTCCCGATTTGGCTACATGAT AT

CAACCATATCAGCAAAAGTGATACGGGTATTATTTTTGCCGCTATTTCTCTGTTCTC G

CTATTATTCCAACCGCTGTTTGGTCTGCTTTCTGACAAACTCGGTCTACGCAAATAC C

TGCTGTGGATTATTACCGGCATGTTAGTGATGTTTGCGCCGTTCTTTATTTTTATCT TC

GGGCCACTGCTGCAGTACAACATTTTAGTAGGATCGATTGTTGGTGGTATTTATCTA

GGCTTTAGTTTTAACGCCGGTGCGCCAGCAGTAGAGGCATTTATTGAGAAAGTCAGC

CGGCGCAGTAATTTCGAATTTGGTCGCGCGCGGATGTTTGGCAGTGTTGGCTGGGCG

CTGGTTGCCTCGATTGTCGGGATCATGTTCACCATTAATAATCAGTTTGTTTTCTGG C

TGGGCTCTGGCAGTTGTCTCATCCTCGCCGTTTTACTCTTTTTCGCCAAAACGGACG C

GCCCTCGAGTGCCACGGTTGCCAATGCGGTAGGTGCCAACCATTCGGCATTTAGCCT

TAAGCTGGCACTGGAACTGTTCAGACAGCCAAAACTGTGGTTTTTGTCACTGTATGT

TATTGGCGTTTCCTCCACCTACGATGTTTTTGACCAACAGTTTGCTAATTTCTTTAC TT

CGTTCTTTGCTACCGGTGAACAGGGTACCCGCGTATTTGGCTACGTAACGACAATGG

GCGAATTACTTAACGCCTCGATTATGTTCTTTGCGCCACTGATCATTAATCGCATCG G

TGGGAAGAATGCCCTGCTGCTGGCTGGCACTATTATGTCTGTACGTATTATTGGCTC ATCGTTCGCCACCTCAGCGCTGGAAGTGGTTATTCTGAAAACGCTGCATATGTTTGA AGTACCGTTCCTGCTGGTGGGCTCCTTTAAATATATTACTAGTCAGTTTGAAGTGCGT TTTTCAGCGACGATTTATCTGGTCAGTTTCAGCTTCTTTAAGCAACTGGCGATGATTT TTATGTCTGTACTGGCGGGCAATATGTATGAAAGCATAGGTTTCCAAGGCGCTTATC TGGTGCTGGGTCTGGTGGCGCTGGGCTTCACCTTAATTTCCGTGTTCACGCTTAGCGG CCCGGGCCCGCTTTCCCTGCTGCGTCGTCAGGTGAATGAAGTCGCTTAAAGGCC

Cell lines and E. coli strains

[0085] All cloning was performed using New England Biolabs (NEB) 10-beta cells (NEB #C3020K) or TOP10 (Invitrogen #C404010) cells, and all SECRETS assays were performed in Stbl2 cells (Invitrogen #10268019), grown at 30°C. A549 (ATCC CCL-185™) human lung epithelial cells were obtained from ATCC (American Type Culture Collection).

Cloning SECRETS plasmids and x-gRNA libraries:

SECRETS plasmids:

[0086] Three plasmids were generated for the validation of the SECRETS protocol: pSECRETS-A (medium copy p15A ori, chloramphenicol resistance, aTc-inducible Cas9 expression; FIG. 9A ii), pSECRETS-B (low copy SC101 ori, kanamycin resistance, aTc-inducible gRNA expression, and a site for “off-target” sequence; FIG. 9A iii), and pSECRETS-C (p11.LacY.wtx1 plasmid (Addgene #69056); high copy pBR322 ori, ampicillin resistance, arabinose-inducible/glucose-suppressed ccdB toxin, also containing an additional site for “target” sequence; FIG. 9A i). p11-LacY-wtx1 was a gift from Huimin Zhao (Addgene plasmid # 69056; n2t.net/addgene:69056; RRID:Addgene_69056).

[0087] To clone pSECRETS-A, the Cas9 gene, PCR-amplified from pwtCas9-bacteria (Addgene #44250), and a gBlock (purchased from Integrated DNA Technologies [IDT]) containing an anhydrotetracycline (aTc)-inducible promoter (pLTetO-1) were inserted via HiFi Assembly (HiFi Assembly Kit, NEB #E5520S) into a PCR’ed fragment of plasmid pBbA2c-RFP (Addgene #35326) to replace the red fluorescent protein (RFP) gene. pwtCas9-bacteria was a gift from Stanley Qi (Addgene plasmid # 44250; n2t.net/addgene:44250; RRID:Addgene_44250). pBbA2c-RFP was a gift from Jay Keasling (Addgene plasmid # 35326; n2t.net/addgene:35326; RRID:Addgene_35326).

[0088] To clone pSECRETS-B, the gRNA cassette for aTc-induced expression was constructed by inserting, via HiFi Assembly, a PCR’ed fragment of TetR from pBbA2c-RFP (Addgene #35326) and a gBlock (IDT) containing the pLTetO-1 promoter and a Golden Gate cassette (dual BsaI restriction sites) near a Cas9 tracrRNA-crRNA fusion to replace the RFP and LacI genes in a PCR’ed fragment of pBbS2k-RFP (Addgene #35330). The Golden Gate assembly cassettes of the resulting plasmid (called pSECRETS-B) could then be used to clone spacer sequences or x-gRNA libraries for gRNA expression after BsaI digestion and ligation of short phosphorylated annealed oligos or HiFi Assembly of single-stranded oligos, respectively. pBbS2k-RFP was a gift from Jay Keasling (Addgene plasmid # 35330; n2t.net/addgene:35330; RRID:Addgene_35330).

[0089] To clone pSECRETS-B derivatives containing off-target sequences and standard gRNAs or x-gRNA libraries, single-stranded oligonucleotides (oPools) were synthesized by IDT containing the off-target sequence, the pLTetO-1 promoter, a 5’-extension library (for example, N8 random nucleotides) immediately upstream of the spacer, and the spacer sequence. These oligos were of the form of SEQ ID NO: 104:

5’-

CCACTGCTTACTGGCTTATCGGAAGGGATCGTCCTGACCCCG

[Off-target sequence, 20 nt + 3 nt PAM]

CCCCCTCCGTGGAGAAAATTTCCCTATCAGTGATAGAGATTGACATCCCTATCAGTGATAGAGATACTGAGCAC

[5’-extension library, for example: NNNNNNNN]

[20 nt spacer sequence]

GTTTTAGAGCTAGAAATAGCAAG

-3’.
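The oligo layout above can be expressed as a small helper that concatenates the fixed and variable segments. This is only an illustrative sketch: the function name `build_library_oligo`, the constant names, and the choice to represent the randomized extension as a run of `N` characters are assumptions made here, not part of the disclosed method. The fixed segments are copied from the oligo form given above.

```python
# Fixed segments copied from the oligo form shown above (SEQ ID NO: 104).
# The constant names are hypothetical labels chosen for this sketch.
UPSTREAM_SEGMENT = "CCACTGCTTACTGGCTTATCGGAAGGGATCGTCCTGACCCCG"
MIDDLE_SEGMENT = (
    "CCCCCTCCGTGGAGAAAATTTCCCTATCAGTGATAGAGATTGACATCCCTATCAGTG"
    "ATAGAGATACTGAGCAC"
)
DOWNSTREAM_SEGMENT = "GTTTTAGAGCTAGAAATAGCAAG"


def _check_dna(seq, expected_length, label):
    """Validate that seq is an A/C/G/T string of the expected length."""
    seq = seq.upper()
    if len(seq) != expected_length:
        raise ValueError(f"{label} must be {expected_length} nt, got {len(seq)}")
    if set(seq) - set("ACGT"):
        raise ValueError(f"{label} contains characters other than A, C, G, T")
    return seq


def build_library_oligo(off_target_with_pam, spacer, extension_length=8):
    """Return the 5'->3' oligo template with an N-run for the extension library.

    off_target_with_pam: 20 nt off-target sequence plus 3 nt PAM (23 nt).
    spacer: 20 nt spacer sequence of the gRNA.
    extension_length: number of randomized 5'-extension positions (8 for N8).
    """
    if extension_length < 0:
        raise ValueError("extension_length must be non-negative")
    off_target = _check_dna(off_target_with_pam, 23, "off-target + PAM")
    spacer = _check_dna(spacer, 20, "spacer")
    return (
        UPSTREAM_SEGMENT
        + off_target
        + MIDDLE_SEGMENT
        + "N" * extension_length
        + spacer
        + DOWNSTREAM_SEGMENT
    )
```

Representing the library as `N` positions mirrors how degenerate oligo pools are usually written when ordering; whether a given vendor accepts this exact notation should be checked separately.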

[0090] These inserts were PCR amplified with primers:

SECRETS-FwdUSER 5’-AGCAAG\deoxyU\TAAAATAAGGCTAGTCCG-3’ and

SECRETS-RevUSER 5’-ACTTGC\deoxyU\ATTTCTAGCTCTAAAAC-3’,

where \deoxyU\ denotes a deoxyuracil modification, and the plasmid pSECRETS-B was PCR amplified with primers:

Bv3-FwdUSER 5’-AGCAAG\deoxyU\TAAAATAAGGCTAGTCCG-3’ and

B-RevUSER 5’-AGTGGG\deoxyU\TCTCTAGTTAGCCAGAG-3’.

[0091] These inserts were then cloned into the pSECRETS-B cassette via USER cloning (NEB #M5505S). To maintain library diversity, after transformation, E. coli was recovered in 1 mL SOC media for 1 hour without selection, then 0.5 mL of the media was reinoculated into 7 mL LB with kanamycin (50 µg/mL) and grown overnight. 5 mL of the transformants were then centrifuged and miniprepped. pSECRETS-B plasmids were sequenced using variants of primers:

SECRETS-BSeq5 5’-[NGS adapter][barcode]-GAGCGGATACATATTTGAATG-3’

SECRETS-BSeq3 5’-[NGS adapter][barcode]-AAGTTGATAACGGACTAGCC-3’

[0092] The pSECRETS-C plasmids containing desired on-target sequences were constructed via HiFi Assembly into the p11-LacY-wtx1 plasmid (Addgene #69056), which was double digested with XbaI and SphI, with a dsDNA fragment containing a target site (20 bp + PAM), 15 bp genomic context sequences flanking the target, and overhang sequences matching the digested plasmid. These fragments were constructed by PCR amplifying short oligos with the form: 5’- ATAACAGGGTAATATCACGC

[15 bp upstream genomic sequence context] +

[20 bp target sequence + 3 bp PAM] +

[15 bp downstream genomic sequence context]

AAGCTTGGCTGTTTTGGCGG -3’

[0093] E. coli strains containing pSECRETS-C plasmids were grown in media supplemented with glucose (glu) to suppress leaky expression from the arabinose-inducible promoter until selection.

The SECRETS protocol and analysis

Validating selection strength using standard gRNAs

[0094] For validation, pSECRETS-C itself or pSECRETS-C containing the EMX1 target site and flanking sequences (pSECRETS-C-EMX1); pSECRETS-A; and pSECRETS-B expressing the standard EMX1 gRNA (pSECRETS-B-EMX1-gRNA) or pSECRETS-B-EMX1-gRNA also containing EMX1 OT1 were electroporated sequentially into electrocompetent NEB 10-beta E. coli cells and recovered in SOC media. For the last transformation with pSECRETS-B, recovery media was supplemented with 10 ng/mL aTc for pre-induction of sgRNA and Cas9. Following recovery for 1 hr, cells were plated on LB agar plates under selective (aTc, arabinose, chloramphenicol, kanamycin) and non-selective (glucose, chloramphenicol, kanamycin, ampicillin) conditions and incubated for 24 hours.

Selection of extended g-RNAs (SECRETS protocol)

[0095] pSECRETS-B plasmids containing x-gRNA libraries and the off-target site were screened similarly to the validation experiments, with a few changes. E. coli cells were transformed in two steps instead of three: E. coli strains containing pSECRETS-A plasmids were electroporated with the corresponding pSECRETS-B and pSECRETS-C plasmids simultaneously (75 ng of each DNA). Following recovery, cells were centrifuged at 4°C and the supernatant was replaced with fresh LB before inoculating 0.5 mL of the culture into 7 mL liquid LB under selective or non-selective conditions and growing overnight. After miniprep (NEB T1010L) of the resulting cultures, samples were PCR amplified across the gRNA segment and prepared for Illumina next-generation sequencing using variants of primers SECRETS-BSeq5 and SECRETS-BSeq3.

Analysis of SECRETS outcomes

[0096] Small-scale (at least 50,000 reads) next-generation sequencing (Amplicon-EZ; Azenta Inc.) was performed on samples from the SECRETS assay. Custom code was written in MATLAB (Mathworks; Natick, MA) to extract and count the 5’-extensions from the x-gRNA sequence of each read; however, in principle, a short piece of code can be written to the same effect following the approach found in: Maxwell, C. S., Jacobsen, T., Marshall, R., Noireaux, V. & Beisel, C. L. A detailed cell-free transcription-translation-based assay to decipher CRISPR protospacer-adjacent motifs. Methods 143, 48-57 (2018). In short, the extension sequences were found by identifying reads with perfect identity to 20 bp of the gRNA promoter and to the 20 bp spacer sequence; if the spacing between the two matched the expected length of the extension sequence, the intervening sequence was recorded. Each unique 5’-extension was counted per sample, normalized to the total number of reads in that sample, and averaged across technical replicates (n = 2). The normalized number of reads per 5’-extension was then averaged across biological replicates (n = 2) and sorted from most prevalent to least, and the top five most prevalent 5’-extensions per gRNA were selected for further characterization.
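The extraction and ranking steps described above can be sketched in Python as follows. This is not the MATLAB code referred to in the text; it is a minimal illustration under stated assumptions. The function names are hypothetical, reads are assumed to be plain strings already oriented 5’ to 3’, and which 20 bp of the promoter serves as the anchor is left to the caller because the text does not specify it.

```python
from collections import Counter


def extract_extension(read, promoter_anchor, spacer, extension_length):
    """Return the 5'-extension found in a read, or None if the read is rejected.

    A read is accepted only if it contains an exact copy of promoter_anchor
    and an exact copy of spacer separated by exactly extension_length bases.
    """
    start = read.find(promoter_anchor)
    if start == -1:
        return None
    ext_start = start + len(promoter_anchor)
    ext_end = ext_start + extension_length
    if read[ext_end:ext_end + len(spacer)] != spacer:
        return None
    return read[ext_start:ext_end]


def normalized_counts(reads, promoter_anchor, spacer, extension_length=8):
    """Count each extension and divide by the total number of reads in the sample."""
    counts = Counter()
    for read in reads:
        ext = extract_extension(read, promoter_anchor, spacer, extension_length)
        if ext is not None:
            counts[ext] += 1
    total = len(reads)
    if total == 0:
        return {}
    return {ext: n / total for ext, n in counts.items()}


def top_extensions(replicates, k=5):
    """Average normalized counts across replicates and return the k most prevalent.

    replicates: list of dicts as returned by normalized_counts. An extension
    missing from a replicate is treated as zero in that replicate (an
    assumption of this sketch).
    """
    if not replicates:
        return []
    extensions = set().union(*replicates)
    means = {
        ext: sum(rep.get(ext, 0.0) for rep in replicates) / len(replicates)
        for ext in extensions
    }
    # Sort by descending mean; ties are broken alphabetically for determinism.
    ranked = sorted(means.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

The same `top_extensions` helper can be applied twice, first to technical replicates and then to the resulting per-sample averages across biological replicates, to follow the two-level averaging described in the paragraph.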

In vitro validation of x-gRNAs

Cas9 ribonucleoprotein (RNP) generation:

[0097] DNA oligos of sgRNAs and x-gRNAs were designed according to the EnGen sgRNA Synthesis Kit (NEB #E3322) to add a 5’-T7 RNA polymerase promoter sequence and a 3’-Cas9 crRNA sequence, and were purchased from IDT, then resuspended to a stock concentration of 100 µM. If the (x-)gRNA did not have the initial 5’-dG necessary for T7 RNA polymerase transcription, one was added in the DNA oligo sequence. For sgRNA synthesis, oligos were diluted 100x (1 µM), then used with the EnGen sgRNA Synthesis Kit per the manufacturer’s instructions. Cas9 RNPs were formed following the IDT Alt-R CRISPR-Cas9 System - In vitro cleavage of target DNA with ribonucleoprotein complex protocol (Option 2). Cas9 enzyme (Sigma Aldrich, #CAS9PROT-250UG), eCas9 enzyme (Sigma Aldrich #ESPCAS9PRO-50UG), or dCas9 enzyme (IDT Alt-R® S.p. dCas9 Protein V3 #1081066) and sgRNA were combined in equimolar amounts in phosphate buffered saline, pH 7.4 (PBS; ThermoFisher, #10010023) and incubated at room temperature for 10 minutes. Following incubation, RNPs were stored at -80°C or immediately used for in vitro digestion reactions.

In vitro digestion reactions:

[0098] Three hundred (300) bp DNA targets containing the target sequence ~200 bp from one end and the flanking genomic context were synthesized by Twist Bioscience, PCR amplified using the provided universal primers, purified, and resuspended in nuclease-free water to 100 nM. Three technical replicates of each reaction were assembled in the following order: 7 µL nuclease-free water, 1 µL target DNA substrate (100 nM), 1 µL 10x Cas9 Nuclease Reaction buffer (200 mM HEPES, 1 M NaCl, 50 mM MgCl2, 1 mM EDTA (pH 6.5 at 25°C)), 1 µL Cas9-RNP (1 mM), then incubated for 1 hour at 37°C followed by proteinase K digestion (1 µL - 56°C for 10 minutes; ThermoFisher, #EO0491). Products were resolved on a 3% agarose gel stained with SYBR Gold and analyzed using ImageJ.

Evaluation of gene activity of transfected Cas9/x-gRNAs into human cell lines

[0099] Cells were transfected using the Lipofectamine CRISPRMAX Cas9 Transfection Reagent (ThermoFisher #CMAX00003) kit. Prior to transfection, A549 human lung epithelial cells were plated in 24-well plates at 25% confluency in Dulbecco’s Modified Eagle’s Medium - DMEM (ATCC #30-2002) + 10% Fetal Bovine Serum - FBS (ATCC 30-2020) + 1% Penicillin-Streptomycin solution (ATCC 30-2300) and incubated for 24 hours at 37°C + 5% CO2. Following incubation, the media was removed and cells were washed with 1x PBS and replaced with fresh DMEM + 10% FBS. Cas9 RNP complexes were formed in a 1:1.2 molar ratio of Cas9 protein to sgRNA with Cas9 Plus reagent to a total volume of 25 µL in Opti-MEM Reduced Serum Medium (ThermoFisher #31985070) per reaction (n = 3). RNPs were added to a mix of 25 µL Opti-MEM I and 1.5 µL CRISPRMAX reagent per reaction, and following a 10 minute room temperature incubation, 50 µL was added to each well. Cells were then incubated at 37°C + 5% CO2 for 48 hours.

Analysis of gene editing outcomes:

[0100] Cells were processed as follows using the GeneArt Genomic Cleavage Detection Kit (ThermoFisher #A24372). Cell media was collected in a 1.5 mL Eppendorf tube. Remaining attached cells were washed with PBS, then detached using TrypLE Express (ATCC 30-2300) trypsin and transferred to the corresponding Eppendorf tube for centrifugation at 1,200 x g at 4°C to pellet cells. Once the supernatant was discarded, pellets were resuspended in cell lysis buffer with protein degrader (supplied in kit) and incubated at 68°C then 95°C to lyse cells. Crude cell lysate was mixed with forward and reverse primer (10 µM), AmpliTaq Gold 360 master mix, and nuclease-free water for direct PCR amplification of the region of interest, followed by agarose gel electrophoresis to confirm the expected PCR product length. Heteroduplexes of the PCR products were formed by mixing with 10x detection buffer and heating samples to 95°C, cooling to 85°C at 2°C/sec, then to 25°C at 0.1°C/sec. Detection enzyme was added to the samples and incubated at 37°C, then fragments were resolved on a 3% agarose gel stained with SYBR Gold. Fluorescence was measured through ImageJ, intensity was normalized by the length of the DNA fragments, and fraction cleaved was determined using the following equation:

[sum of cleaved band intensities/(sum of cleaved and parental band intensities)] x 100%
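As a worked illustration of the equation above, the following Python sketch computes the percentage cleaved from band intensities that are first normalized by fragment length, as described in the preceding paragraph. The function name and the (intensity, length) input format are assumptions chosen for this example; the actual quantification was done in ImageJ.

```python
def percent_cleaved(cleaved_bands, parental_bands):
    """Compute 100 * cleaved / (cleaved + parental) from gel band measurements.

    Each band is an (intensity, length_bp) pair. Intensities are divided by
    fragment length before summing, so that longer fragments, which bind more
    dye, do not dominate the ratio.
    """
    def normalized_sum(bands):
        total = 0.0
        for intensity, length_bp in bands:
            if length_bp <= 0:
                raise ValueError("fragment length must be positive")
            total += intensity / length_bp
        return total

    cleaved = normalized_sum(cleaved_bands)
    parental = normalized_sum(parental_bands)
    if cleaved + parental == 0:
        raise ValueError("no band intensity measured")
    return 100.0 * cleaved / (cleaved + parental)
```

For example, two cleaved bands of (200, 200 bp) and (100, 100 bp) together with one parental band of (600, 300 bp) each contribute a length-normalized value of 1, 1, and 2 respectively, so the function returns 50.0.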

[0101] Two technical replicates of samples were also prepared for Illumina next-generation amplicon sequencing, with editing efficiency determined using the CRISPResso2 pipeline.

Genome-wide off-target screens:

[0102] Genome-wide off-target editing was measured using CHANGE-seq (See Lazzarotto, C. R. et al. CHANGE-seq reveals genetic and epigenetic effects on CRISPR-Cas9 genome-wide activity. Nature Biotechnology, 1-11 (2020)). The genomic DNA (human male/female mixed, Promega #G3041) purification steps were carried out using the NEB Monarch Genomic DNA Purification Kit. An agarose gel was used to visualize the tagmentation of the human genomic DNA with the transposase. For the PCR step after cleavage and USER enzyme treatment (step 25 of the supplemental information), NEB 2X Q5 Master Mix was used in place of 2X Kapa HiFi HotStart Ready Mix. In place of the MiSeq protocol described in the supplemental information, cleaved genomic DNA that had been barcoded and amplified via PCR for Illumina sequencing was sent for NGS Amplicon-EZ sequencing by Azenta, and analyzed using the CHANGE-seq computational pipeline (github.com/tsailabSJ/changeseq) with default settings (i.e., >4 reads minimum / hit).

Availability of materials:

[0103] pSECRETS-A and pSECRETS-B precursors (without gRNA spacers or off-targets) will be provided to the Addgene plasmid repository. pSECRETS-C precursor plasmids (without on-target sequences) are available from Addgene (Addgene plasmid # 69056; n2t.net/addgene:69056; RRID:Addgene_69056) as a gift from Huimin Zhao.

Results

[0104] To overcome this challenge, an experimental protocol is presented here to screen tens to hundreds of thousands of candidate 5’-extension sequences simultaneously, in order to efficiently and reliably identify novel extended gRNA sequences (x-gRNAs) that maintain robust Cas9 activity on-target while significantly increasing gene editing specificity by effectively eliminating activity even at known off-target sequences where conventional approaches to increasing Cas9 specificity may fail (FIG. 9). In this protocol, called “Selection of Extended CRISPR RNAs with Enhanced Targeting and Specificity” (SECRETS), the activities of Cas9 enzymes with a library of x-gRNA candidates are evaluated in parallel using an Escherichia coli strain that is strongly selective for their ability to stimulate Cas9 nuclease activity on-target and strongly counter-selective for activity at their off-targets. The SECRETS E. coli strain maintains three plasmids (FIG. 9A): (i) a high-copy plasmid that expresses the toxin ccdB in the presence of arabinose (ara) and also contains the target sequence of interest; (ii) a medium-copy plasmid that expresses Cas9 in the presence of anhydrotetracycline (aTc); and (iii) a low-copy plasmid that expresses the gRNA of interest, provides resistance to the antibiotic kanamycin (KanR), and also contains a known off-target sequence for the gRNA. DSBs induced by Cas9 at the target of interest in E. coli result in the degradation of the ccdB plasmid, allowing the bacteria to survive in the presence of arabinose, while DSBs induced by Cas9 at the known off-target result in the degradation of the gRNA plasmid and bacterial susceptibility to the antibiotic kanamycin (KanS). Hence, only gRNAs that exhibit robust activity at their intended targets (degrading the high-copy toxin plasmid) and low activity at their off-target sites (not degrading the low-copy gRNA plasmid) will survive a SECRETS screen (FIG. 9C).
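The dual selection logic described above can be summarized as a simple truth table. The sketch below is an idealized, all-or-nothing model written only to make the reasoning explicit; real cleavage is graded rather than binary, and the function name is hypothetical.

```python
def survives_secrets(cleaves_on_target, cleaves_off_target):
    """Idealized survival rule under selective conditions (aTc, arabinose, kanamycin).

    On-target cleavage degrades the high-copy ccdB toxin plasmid, which is
    required to survive arabinose induction. Off-target cleavage degrades the
    low-copy gRNA plasmid, which carries kanamycin resistance, so the cell
    must NOT cleave there in order to survive kanamycin.
    """
    toxin_plasmid_degraded = cleaves_on_target
    kanamycin_resistance_retained = not cleaves_off_target
    return toxin_plasmid_degraded and kanamycin_resistance_retained


# Enumerate the four idealized cases.
for on_target in (True, False):
    for off_target in (True, False):
        outcome = "survives" if survives_secrets(on_target, off_target) else "dies"
        print(f"on-target={on_target!s:5} off-target={off_target!s:5} -> {outcome}")
```

Only the combination of on-target activity without off-target activity survives, which is the phenotype the screen enriches for.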

[0105] This system was tested with a well-characterized gRNA for human gene EMX1, its target sequence (EMX1 ON), and another sequence found in the human genome where the Cas9/gRNA RNP complex is known to exhibit off-target activity (EMX1 OFF1) due to a two-nucleotide difference with the EMX1 ON target at positions where Cas9 effectors are especially susceptible to tolerating sequence divergence (FIG. 9A). After only 1 hr of Cas9/gRNA expression, followed by plating and overnight growth on LB with aTc, arabinose, chloramphenicol (cam), and kanamycin, strong suppression of E. coli growth was found with the standard EMX1 gRNA, but only when the EMX1 OFF1 sequence was present in the kanamycin resistance plasmid (FIG. 10).

[0106] As shown in FIG. 10A, the SECRETS screen is demonstrated with pSECRETS-A, an empty toxin plasmid (pSECRETS-C with no target), and pSECRETS-B with an EMX1 sgRNA after plating on non-selective (left) vs selective (right) conditions. As shown in FIG. 10B, the SECRETS screen is demonstrated with pSECRETS-A, pSECRETS-C with EMX1 ON, and pSECRETS-B with an EMX1 sgRNA on non-selective (left) vs selective (right) conditions. As shown in FIG. 10C, the SECRETS screen is demonstrated with pSECRETS-A, pSECRETS-C with EMX1 ON, and pSECRETS-B with an EMX1 sgRNA and EMX1 OFF1 on non-selective (left) vs selective (right) conditions. And finally, as shown in FIG. 10D, the SECRETS screen is demonstrated with pSECRETS-A, pSECRETS-C with EMX1 ON, and pSECRETS-B with an EMX1 x-gRNA library (N8) and EMX1 OFF1 on non-selective (left) vs selective (right) conditions.

[0107] If, instead of the standard gRNA for EMX1 (FIG. 9B top), a library of EMX1 x-gRNA variants with 8 randomized nucleotides (N8) appended to its 5’ end (FIG. 9B bottom) was introduced, numerous E. coli colonies survived the SECRETS protocol (FIG. 9D and FIG. 10), indicating that these x-gRNAs from the library demonstrate the high Cas9 activity and specificity required to survive. The pooled survivors were sequenced, and each of the top five most prevalent x-gRNA sequences in the surviving population (FIG. 11A) was tested for activity and specificity in vitro (FIG. 11B and FIG. 12, Table 5).

Table 5 - Extensions from SECRETS protocol for EMX1, FANCF, and VEGFA targets and top off-target (top five extension sequences labeled as in FIG. 11).

[0108] In vitro Cas9 digestion assays revealed that Cas9 with each of the five x-gRNAs identified from the SECRETS screen had significantly reduced off-target activity at EMX1 OFF1 compared to the standard EMX1 gRNA, effectively eliminating nuclease activity at the known off-target site, while exhibiting activity at EMX1 ON similar to that of the engineered “enhanced specificity” Cas9 variant eCas9. These five x-gRNAs identified through the SECRETS protocol also exhibited higher levels of specificity in general, eliminating off-target activity across three other known EMX1 off-targets (EMX1 OFF2 - OFF4, containing 2 to 4 differences with the EMX1 ON sequence) for Cas9, and reducing nuclease activity at all four off-target sequences even more than eCas9 with a standard gRNA (FIG. 11B and FIG. 12). In addition to x-gRNAs for EMX1, we were also able to readily identify multiple x-gRNAs using the SECRETS protocol for other targets, human genes VEGFA and FANCF (FIG. 12, FIG. 13A-D, and Table 5), with superior activity and specificity profiles (FIG. 11C), including in cases (VEGFA) where eCas9 was not able to significantly reduce activity at OFF1 sites.

[0109] To confirm that the x-gRNAs identified via SECRETS would remain active in human cell lines for gene editing, RNP complexes with Cas9 variants (wild-type Cas9; catalytically-inactive dCas9; or engineered enhanced-specificity eCas9) and gRNA variants (standard gRNAs or x-gRNAs) were transfected into A549 human lung epithelial cells, then T7E1 mutation detection assays and next-generation sequencing (NGS) were performed to quantify mutation rates at the EMX1 target. The Cas9 RNPs with x-gRNAs identified via SECRETS exhibited robust gene editing in A549 cells, and higher on-target mutation rates than eCas9 (FIG. 11E). It is noted that, even though they have additional 5’-nucleotides, hp-gRNAs had previously been found not to recognize or mutate any novel off-target sites compared to the standard gRNAs, likely because the 20 nt targeting segment of the hp-gRNA remains the same. To ensure that no new off-targets were introduced when using x-gRNAs identified from the SECRETS screen, a test of genome-wide off-target nuclease activity (CHANGE-seq) was also performed and, as expected, significant reductions of off-target cleavage activity genome-wide and no new off-targets were found (FIG. 11D). Therefore, these findings demonstrate that the SECRETS protocol can robustly identify multiple high-performance x-gRNA candidates with strong potential for specific gene editing applications in human cells that eliminate off-target activity at selected loci.

[0110] In the demonstrations above, randomized x-gRNA libraries of 65,536 variants (N8) were screened in the SECRETS protocol for each gRNA target. Enhanced x-gRNAs from more complex initial libraries (>250,000 5’-extension sequence variants) were also identified, including pooled libraries of N8 variants containing different additional 4 nt tetraloop motifs (N8+4) designed to promote interactions between the N8 segment and the DNA-targeting segment of the x-gRNA (FIG. 14A). Indeed, it was noted that the ‘space’ of potential 5’-extension sequences for x-gRNAs is quite large (FIG. 14); larger x-gRNA libraries of 4^10 or 4^12 (>1 - 16M) variants are expected to be quite readily generated and screened in E. coli for new target/off-target pairs of interest. The results here suggest that this space is also quite rich with high-performance x-gRNA variants, as multiple x-gRNAs that exhibited exceptional activity and specificity profiles were identified for several spacer sequences under relatively small-scale SECRETS screens (FIG. 14B, FIG. 15A-D). Furthermore, because hp-gRNAs were able to improve the specificity of diverse CRISPR effectors, including Cas9 from Staphylococcus aureus and various Cas12 effectors, the SECRETS protocol can be readily adapted to those systems as well.
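The library sizes quoted above follow directly from the four possible nucleotides at each randomized position. A minimal Python check, with a hypothetical helper name chosen for illustration:

```python
def library_size(randomized_positions):
    """Number of distinct sequences with A, C, G, or T at each randomized position."""
    if randomized_positions < 0:
        raise ValueError("number of positions must be non-negative")
    return 4 ** randomized_positions


for n in (8, 10, 12):
    print(f"N{n}: {library_size(n):,} variants")
```

This gives 65,536 variants for N8, 1,048,576 for N10, and 16,777,216 for N12, consistent with the range of roughly 1 to 16 million stated for the larger libraries.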

[0111] For many biotechnological or therapeutic applications, it is often desirable to direct a Cas9/gRNA RNP to a specific nucleotide target of interest (e.g., where there might be no flexibility to target nearby sites). If it is determined that there is a potential for off-target activity at certain sites in a specific sample or for a patient, there can be few options for limiting that activity at those specific off-targets. Here it is demonstrated that the SECRETS protocol can be used to robustly identify ultra-specific variants for those gRNAs of interest that have been explicitly counter-selected against activity at those off-target sites. This approach could be used to robustly generate x-gRNAs in a “design-free” way that effectively eliminates the need for individualized optimization, and has been experimentally streamlined for simplicity in cloning new target/off-target pairs on-demand into the screening plasmids and for ease of rapidly selecting enhanced x-gRNAs. Once any off-target activity has been suspected or characterized, the SECRETS screen therefore provides an accessible and reliable method to identify high-performance gRNA variants for specific targets of interest. It is expected that the continued output and development of this approach will allow for safer applications of advanced CRISPR gene editing approaches that require gRNAs with extreme specificity, such as SNP-targeting and/or allele-specific gene editing. As approaches to predicting and identifying novel off-target sequences at the level of individual patients become more sophisticated and routine, it is expected that methods like SECRETS, which can reliably and rapidly generate highly-specific and highly-active gRNA variants that effectively eliminate Cas9 activity at specific off-targets, will become increasingly important in applications of gene therapies in personalized medicine.

EXAMPLE 4

Extended gRNA for Treatment of Sickle Cell Disease

[0112] The HBB gene encodes the hemoglobin subunit beta protein, which, along with alpha globin, makes up the most common form of hemoglobin in adult humans. Thousands of naturally occurring variants of HBB exist, including a point mutation producing HbS, which causes sickle cell disease. Others have developed a gRNA gene therapy treatment (classic.clinicaltrials.gov/ct2/show/NCT04774536). Alternative extended gRNAs, denoted HBB X-1, HBB X-2, and HBB X-3, were developed using the methods of the present invention. These gRNAs were compared to WT sgRNA in FIG. 16 with:

On-target sequence, ON (SEQ ID NO: 102): CTTGCCCCACAGGGCAGTAACGG

Off-target sequence, OT1 (SEQ ID NO: 103): TCAGCCCCACAGGGCAGTAAGGG

[0113] OT1 is the top off-target sequence. The extended gRNAs show a better safety profile for gene therapy treatment of sickle cell disease than WT, with HBB X-2 displaying twice the on-target cleavage compared to WT, along with no off-target cleavage.
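To make the relationship between the ON and OT1 sequences listed above concrete, the following sketch reports the positions at which two equal-length sequences differ. The helper name and the 1-based, 5’ to 3’ numbering are choices made for this illustration only.

```python
def mismatch_positions(seq_a, seq_b):
    """Return 1-based positions (counted 5'->3') where two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [
        index + 1
        for index, (base_a, base_b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if base_a != base_b
    ]


HBB_ON = "CTTGCCCCACAGGGCAGTAACGG"   # SEQ ID NO: 102 (20 nt target + 3 nt PAM)
HBB_OT1 = "TCAGCCCCACAGGGCAGTAAGGG"  # SEQ ID NO: 103 (20 nt off-target + 3 nt PAM)

print(mismatch_positions(HBB_ON, HBB_OT1))
```

For these two sequences the differences fall at positions 1, 2, and 3 of the 20 nt protospacer, at the end distal to the PAM, and at position 21, which is the first nucleotide of the 3 nt PAM.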