

Title:
GTP CYCLOHYDROLASE-CLEAVING PROTEASES
Document Type and Number:
WIPO Patent Application WO/2024/050007
Kind Code:
A1
Abstract:
Aspects of the disclosure relate to Botulinum toxin X (BoNT X) protein variants. The variants provided herein have been evolved to cleave GTP cyclohydrolase 1 (GCH1). Some of the variants provided herein were evolved from a procaspase-1 cleaving polypeptide. Further aspects of the disclosure relate to nucleic acids encoding the GCH1 cleaving polypeptides described herein and expression vectors comprising the nucleic acids, as well as host cells and fusion proteins comprising the GCH1 cleaving polypeptides described herein, and kits comprising the GCH1 polypeptides, fusion proteins, nucleic acids, expression vectors, or host cells described herein. Further aspects of the disclosure relate to methods of producing BoNT X variants and methods of using the BoNT X protein variants, for example, to reduce pain.

Inventors:
LIU DAVID (US)
BLUM TRAVIS (US)
DONG MIN (US)
MANION JOHN (US)
HEMEZ COLIN (US)
MCCREARY JULIA (US)
Application Number:
PCT/US2023/031703
Publication Date:
March 07, 2024
Filing Date:
August 31, 2023
Assignee:
BROAD INST INC (US)
HARVARD COLLEGE (US)
CHILDRENS MEDICAL CENTER (US)
International Classes:
A61K38/00; C07K14/33; C12N9/52
Domestic Patent References:
WO2021011579A1 2021-01-21
WO2023081805A1 2023-05-11
WO2010028347A2 2010-03-11
WO2012088381A2 2012-06-28
WO2019040935A1 2019-02-28
WO2016077052A2 2016-05-19
WO2016168631A1 2016-10-20
WO2019056002A1 2019-03-21
Foreign References:
US9023594B2 2015-05-05
US9771574B2 2017-09-26
US201715713403A
US20090056194W 2009-09-08
US201061426139P 2010-12-22
US9394537B2 2016-07-19
US10336997B2 2019-07-02
US11214792B2 2022-01-04
US20110066747W 2011-12-22
US201461929378P 2014-01-20
US10179911B2 2019-01-15
US201916238386A 2019-01-02
US20150012022W 2015-01-20
US201562158982P 2015-05-08
US201562187669P 2015-07-01
US201462067194P 2014-10-22
US10920208B2 2021-02-16
US20180048134W 2018-08-27
US9267127B2 2016-02-23
US20150057012W 2015-10-22
US20160027795W 2016-04-15
US20180051557W 2018-09-18
US20200042016W 2020-07-14
US201313922812A 2013-06-20
Other References:
BLUM TRAVIS R. ET AL: "Phage-assisted evolution of botulinum neurotoxin proteases with reprogrammed specificity", SCIENCE, vol. 371, no. 6531, 19 February 2021 (2021-02-19), US, pages 803 - 810, XP093101906, ISSN: 0036-8075, DOI: 10.1126/science.abf5972
LATREMOLIERE ALBAN ET AL: "Reduction of Neuropathic and Inflammatory Pain through Inhibition of the Tetrahydrobiopterin Pathway", NEURON, vol. 86, no. 6, 17 June 2015 (2015-06-17), pages 1393 - 1406, XP029215126, ISSN: 0896-6273, DOI: 10.1016/J.NEURON.2015.05.033
RAWLINGS ET AL.: "MEROPS: the database of proteolytic enzymes, their substrates and inhibitors.", NUCLEIC ACIDS RES, vol. 42, 2014, pages D503 - D509
NAT COMMUN., vol. 8, 3 August 2017 (2017-08-03), pages 14130
MILLER ET AL., NATURE PROTOC, vol. 15, no. 12, December 2020 (2020-12-01), pages 4101 - 4127
ELIZABETH KUTTER; ALEXANDER SULAKVELIDZE: "Bacteriophages: Biology and Applications", December 2004, CRC PRESS
MARTHA R. J. CLOKIE; ANDREW M. KROPINSKI, BACTERIOPHAGES: METHODS AND PROTOCOLS, vol. 2
"Isolation, Characterization, and Interactions (Methods in Molecular Biology)", December 2008, HUMANA PRESS
ALTSCHUL, PROC. NATL. ACAD. SCI. USA, vol. 87, 1990, pages 2264 - 68
ALTSCHUL, PROC. NATL. ACAD. SCI. USA, vol. 90, 1993, pages 5873 - 77
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 10
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, no. 17, 1997, pages 3389 - 3402
ALTSCHUL, S F ET AL., NUC. ACIDS RES., vol. 25, 1997, pages 3389 - 3402
MILLER ET AL., NATURE PROTOC., vol. 15, no. 12, December 2020 (2020-12-01), pages 4101 - 4127
CHUNG ET AL., MOL THER., vol. 22, no. 5, May 2014 (2014-05-01), pages 952 - 963
SALAFFI ET AL., BEST PRACT RES CLIN RHEUMATOL., vol. 29, no. 1, pages 164 - 86
MULEY ET AL., CNS NEUROSCI THER., vol. 22, no. 2, February 2016 (2016-02-01), pages 88 - 101
CRUCCU ET AL., PLOS MED., vol. 6, no. 4, pages 1000045
BENNETT, N. J.; RAKONJAC, J.: "Unlocking of the filamentous bacteriophage virion during infection is mediated by the C domain of pIII.", JOURNAL OF MOLECULAR BIOLOGY, vol. 356, no. 2, 2006, pages 266 - 73, XP024950566, DOI: 10.1016/j.jmb.2005.11.069
"Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
Attorney, Agent or Firm:
MACDONALD, Kevin et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A GCH1 cleaving polypeptide comprising an amino acid sequence that is at least 70% identical to the amino acid sequence set forth in SEQ ID NO: 9 and comprises one or more amino acid substitutions at one or more positions recited in Tables 4 and 6.

2. The GCH1 cleaving polypeptide of claim 1, wherein the polypeptide comprises an amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO: 9.

3. The GCH1 cleaving polypeptide of claim 1 or 2 comprising one or more amino acid substitutions at a position selected from N59, N61, A73, A75, I102, I115, K164, A166, Y168, I175, K193, D199, I235, F248, N260, L262, F264, A277, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 9.

4. The GCH1 cleaving polypeptide of any one of claims 1 to 3 comprising 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 amino acid substitutions relative to SEQ ID NO: 9.

5. The GCH1 cleaving polypeptide of any one of claims 1 to 4, wherein the one or more amino acid substitutions are selected from N59D, N61S, A73T, A75V, I102L, I115V, K164E, A166T, Y168C, I175T, K193R, D199G, I235M, F248V, N260K, L262F, F264V, A277V, R324H, R354S, L364R, P368L, S395L, S413F, L428S, Y430N, Y430C, and N439T relative to SEQ ID NO: 9.

6. The GCH1 cleaving polypeptide of claim 5, wherein the GCH1 cleaving polypeptide further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

7. The GCH1 cleaving polypeptide of any one of claims 1 to 6 comprising the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T, wherein the GCH1 cleaving polypeptide further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

8. The GCH1 cleaving polypeptide of any one of claims 1 to 6 comprising the following amino acid substitutions relative to SEQ ID NO: 9: N61S, A73T, K164E, K193R, N260K, L262F, and S413F.

9. A GCH1 cleaving polypeptide comprising an amino acid sequence that is at least 60% identical to the sequence set forth in SEQ ID NO: 1 and comprises one or more amino acid substitutions at one or more positions recited in Tables 3 and 5.

10. The GCH1 cleaving polypeptide of claim 9, comprising one or more amino acid substitutions at a position selected from N59, N61, E72, A73, A75, I102, E113, I115, I119, D161, N164, A166, T167, Y168, Y171, P174, I175, K193, Y199, N210, A218, N235, S240, F248, K252, N260, L262, F264, A277, S280, Y314, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 1.

11. The GCH1 cleaving polypeptide of claim 9 or 10 comprising 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions relative to SEQ ID NO: 1.

12. The GCH1 cleaving polypeptide of any one of claims 8 to 11, wherein the one or more amino acid substitutions are selected from N59D, N61S, E72R, A73T, A75V, I102L, E113K, I115V, I119V, D161N, N164E, N164K, A166T, T167A, Y168C, Y171D, P174L, I175T, K193R, Y199D, Y199G, N210D, A218V, N235I, N235M, S240V, K252E, N260K, L262F, F264V, A277V, S280P, Y314S, R324H, R354S, L364R, P368L, S395L, S413F, L428S, Y430C, Y430N, and N439T relative to SEQ ID NO: 1.

13. The GCH1 cleaving polypeptide of claim 12, wherein the GCH1 cleaving polypeptide further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

14. The GCH1 cleaving polypeptide of any one of claims 8 to 13 comprising the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T, wherein the GCH1 cleaving polypeptide further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

15. The GCH1 cleaving polypeptide of any one of claims 8 to 13 comprising the following amino acid substitutions relative to SEQ ID NO: 1: N61S, E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, N260K, L262F, S280P, Y314S, and S413F.

16. The GCH1 cleaving polypeptide of any one of claims 1 to 15, wherein the polypeptide has at least 70% sequence identity to a sequence selected from SEQ ID NOs: 10-23.

17. The GCH1 cleaving polypeptide of any one of claims 1 to 16 comprising the amino acid sequence set forth in any one of SEQ ID NOs: 10-23.

18. The GCH1 cleaving polypeptide of any one of claims 1 to 17, wherein the polypeptide cleaves proteins comprising the amino acid cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4).

19. The GCH1 cleaving polypeptide of any one of claims 1 to 18, wherein the polypeptide cleaves intracellular GCH1.

20. The GCH1 cleaving polypeptide of claim 19, wherein the GCH1 comprises the sequence set forth in SEQ ID NO: 2.

21. The GCH1 cleaving polypeptide of any one of claims 1 to 20, wherein the polypeptide cleaves GCH1 with increased selectivity relative to procaspase-1.

22. The GCH1 cleaving polypeptide of any one of claims 1 to 21, wherein the polypeptide cleaves GCH1 with increased selectivity relative to VAMP1.

23. The GCH1 cleaving polypeptide of claim 19 or claim 22, wherein the increased selectivity comprises between 2-fold and 20,000-fold increased selectivity.

24. The GCH1 cleaving polypeptide of any one of claims 1 to 19, wherein the polypeptide does not cleave procaspase-1.

25. The GCH1 cleaving polypeptide of any one of claims 1 to 19, wherein the polypeptide does not cleave a VAMP1 protein.

26. The GCH1 cleaving polypeptide of claim 24, wherein procaspase-1 comprises the sequence set forth in SEQ ID NO: 5.

27. The GCH1 cleaving polypeptide of claim 25, wherein the VAMP1 protein comprises the sequence set forth in SEQ ID NO: 7.

28. The GCH1 cleaving polypeptide of any one of claims 1 to 27, wherein the polypeptide does not cleave a VAMP4, VAMP5, or Ykt6 protein.

29. The GCH1 cleaving polypeptide of any one of claims 1 to 28, further comprising a neurotoxin HCc domain, and/or a neurotoxin translocation domain (HCN).

30. A fusion protein comprising:

(i) the GCH1 cleaving polypeptide of any one of claims 1 to 29; and

(ii) a delivery domain.

31. The fusion protein of claim 30, wherein the delivery domain is a pleckstrin homology (PH) domain.

32. The fusion protein of claim 31, wherein the PH domain is a human phospholipase C delta (PLCδ) PH domain.

33. The fusion protein of claim 31, wherein the PH domain comprises an amino acid sequence that is at least 80% identical to a sequence set forth in SEQ ID NOs: 44-48.

34. The fusion protein of claim 31, wherein the PH domain comprises an amino acid sequence set forth in SEQ ID NOs: 44-48.

35. The fusion protein of claim 30, wherein the delivery domain is a BoNT X HC domain.

36. The fusion protein of any one of claims 30 to 35, wherein the GCH1 cleaving polypeptide and the delivery domain are directly connected.

37. The fusion protein of any one of claims 30 to 35, further comprising a linker.

38. The fusion protein of claim 37, wherein the linker comprises a peptide linker.

39. The fusion protein of claim 38, wherein the peptide linker comprises a glycine-rich linker, a proline-rich linker, glycine/serine-rich linker, and/or alanine/glutamic acid-rich linker.

40. A nucleic acid encoding the GCH1 cleaving polypeptide of any one of claims 1 to 29 or the fusion protein of any one of claims 30 to 39.

41. The nucleic acid of claim 40, having at least 60% sequence identity to a nucleic acid sequence selected from SEQ ID NOs.: 25-41.

42. The nucleic acid sequence of claim 40 or 41, wherein the nucleic acid sequence is codon-optimized.

43. An expression vector comprising a nucleic acid encoding the GCH1 cleaving polypeptide of any one of claims 40 to 42.

44. The expression vector of claim 43, wherein the vector is a phage, plasmid, cosmid, bacmid, or viral vector.

45. The expression vector of claim 43 or 44, wherein the nucleic acid comprises the sequence set forth in any one of SEQ ID NOs: 25-41.

46. A host cell comprising the GCH1 cleaving polypeptide of any one of claims 1 to 29, the fusion protein of any one of claims 30 to 39, the nucleic acid of any one of claims 40 to 42, or the expression vector of any one of claims 43 to 45.

47. The host cell of claim 46, wherein the cell is a bacterial cell.

48. The host cell of claim 46, wherein the cell is an animal cell.

49. The host cell of claim 48, wherein the animal cell is a mammalian cell.

50. The host cell of claim 49, wherein the mammalian cell is a human cell.

51. The host cell of claim 46 or 47, wherein the host cell is an E. coli cell.

52. A method for cleaving GCH1 in a cell, the method comprising delivering to a cell the GCH1 cleaving polypeptide of any one of claims 1 to 29.

53. The method of claim 52, wherein the GCH1 comprises the cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4).

54. The method of claim 52 or 53, wherein the GCH1 comprises the amino acid sequence set forth in SEQ ID NO: 2.

55. The method of any one of claims 52 to 54, wherein the cell is in vitro.

56. The method of any one of claims 52 to 55, wherein the cell is a mammalian cell.

57. The method of claim 56, wherein the cell is a peripheral nerve cell.

58. The method of claim 57, wherein the cell is a neuron.

59. The method of claim 57 or 58, wherein the cell is a dorsal root ganglion (DRG) neuron.

60. The method of any one of claims 52 to 59, wherein the cell is in a subject.

61. The method of claim 60, wherein the subject is a mammal.

62. The method of claim 61, wherein the subject is a human.

63. The method of any one of claims 52 to 62, wherein the delivering results in cleavage of the GCH1 protein and subsequently, reduction of intracellular levels of tetrahydrobiopterin (BH4).

64. The method of any one of claims 52 to 63, wherein the delivering results in reduction of pain.

65. The method of claim 64, wherein the pain is chronic pain.

66. The method of claim 64, wherein the pain is neuropathic pain.

67. The method of claim 64, wherein the pain is inflammatory pain.

68. A method for reducing pain in a subject in need thereof, the method comprising administering to the subject the GCH1 cleaving polypeptide of any one of claims 1 to 29, the fusion protein of any one of claims 30 to 39, or the expression vector of any one of claims 43 to 45.

69. The method of claim 68, wherein the cell is a mammalian cell.

70. The method of claim 68 or 69, wherein the cell is a human cell.

71. The method of any one of claims 68 to 70, wherein the administering results in cleavage of GCH1 (SEQ ID NO: 2).

72. The method of any one of claims 68 to 71, wherein the pain is chronic pain.

73. The method of any one of claims 56 to 71, wherein the pain is neuropathic pain.

74. The method of any one of claims 56 to 71, wherein the pain is inflammatory pain.

75. A kit comprising a container housing the GCH1 cleaving polypeptide of any one of claims 1 to 29, the fusion protein of any one of claims 30 to 39, the nucleic acid of any one of claims 40 to 42, the expression vector of any one of claims 43 to 45, or the host cell of any one of claims 46 to 51.


Description:
GTP CYCLOHYDROLASE-CLEAVING PROTEASES

RELATED APPLICATIONS

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/402,841, filed August 31, 2022, the entire contents of which are incorporated herein by reference.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (B 119570146WO00-SEQ-CBD.xml; Size: 80,242 bytes; and Date of Creation: August 22, 2023) are incorporated herein by reference in their entirety.

FEDERALLY SPONSORED RESEARCH

This invention was made with government support under grant numbers R01EB027793, R01EB022376, and R35GM118062 awarded by the National Institutes of Health. The government has certain rights in the invention.

BACKGROUND

Over the last few decades, the medical community has witnessed a remarkable shift in the composition of pharmaceutical therapies from traditional small molecules to biomacromolecules (e.g., proteins, peptides, compositions of multiple proteins or peptides, nucleic acids). The growing number of macromolecular therapeutics is a result of their potential for highly specific interactions in biological systems and has been facilitated by improvements in molecular biology and biomolecule engineering. Despite their tremendous success, macromolecular therapies have been limited almost exclusively to extracellular targets due to the significant challenge of their controllable delivery into the cytoplasm. While a number of notable advances have been made in the area of macromolecular delivery, this critical problem remains a major barrier to the development and use of macromolecular therapeutics that address intracellular targets. As an alternative, several natural protein systems are capable of cytoplasmic self-delivery. However, the ability to reengineer these systems to imbue them with the necessary binding or catalytic activities and specificities for therapeutic effect is largely underexplored and underdeveloped at this time.

SUMMARY

Aspects of the disclosure relate to novel Botulinum neurotoxin (BoNT) protease variants evolved using directed evolution technologies, such as, for example, PACE and PANCE, to cleave GTP cyclohydrolase 1 (GCH1). GCH1 inhibition has been found to reduce chronic pain, such as neuropathic pain and inflammatory pain, by decreasing levels of tetrahydrobiopterin (BH4), which is a precursor for peripheral neuropathic and inflammatory pain signals. However, the broad use of BH4 in non-dorsal root ganglia (DRG) tissues may lead to systemic toxicity, which limits the therapeutic use of GCH1 inhibition. Cell-specific engagement is needed to reduce BH4 levels within the peripheral nervous system (PNS) without reducing normal BH4 levels in other systems (e.g., in brain or endothelial cells). In some embodiments, cleavage of GCH1 in a cell (e.g., GCH1 present in DRG neurons) by the BoNT protease variants described herein inactivates GCH1 and results in a reduction of intracellular levels of BH4 below pathological pain levels.

BoNT proteases are attractive candidates for evolution because BoNTs provide a built-in cytosolic delivery mechanism, which allows BoNTs to cleave intracellular targets (e.g., GCH1). In some embodiments, BoNT X proteases are evolved to cleave novel substrates that are not native to wild-type BoNT X proteases. In some embodiments, BoNT X proteases are evolved to cleave GCH1. In some embodiments, BoNT X proteases are evolved to cleave human GCH1 (SEQ ID NO: 2). In some embodiments, BoNT X proteases are first evolved to cleave procaspase-1 (e.g., SEQ ID NO: 5), and then the evolved BoNT protease variants (e.g., BoNT X(3015)8, SEQ ID NO: 9) are further evolved to cleave GCH1. In some embodiments, evolved BoNT protease variants that cleave GCH1 are described herein, and are also referred to as “GCH1 cleaving polypeptides”. In some embodiments, the GCH1 comprises a cleavage sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the GCH1 comprises the cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the polypeptide cleaves GCH1 in a cell. In some embodiments, the GCH1 comprises the amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, evolved BoNT protease variants have a reduced selectivity for their native substrates or starting substrates. For example, the native substrates of wild-type BoNT X include VAMP1 (SEQ ID NO: 7), VAMP2, VAMP3, VAMP4, VAMP5, and Ykt6. In some embodiments, a VAMP1 substrate that is cleaved by wild-type BoNT X comprises the following cleavage sequence: TSNRRLQQTQAQVEEVVDIIRVNVDKVLERDQKLSELDDRADALQAGASQFESSAAKLKR (SEQ ID NO: 8). The starting substrate of the BoNT X protease variant BoNT X(3015)8 described herein is procaspase-1 (e.g., SEQ ID NO: 5). In some embodiments, the cleavage sequence of the starting substrate procaspase-1 is NLSLPTTEEFEDDAIK (SEQ ID NO: 6). In some embodiments, the evolved BoNT protease variants have reduced selectivity for procaspase-1. In some embodiments, the evolved BoNT protease variants do not cleave procaspase-1. In some embodiments, the evolved BoNT protease variants have reduced selectivity for VAMP1, VAMP4, VAMP5, or Ykt6. In some embodiments, the evolved BoNT protease variants do not cleave VAMP1, VAMP4, VAMP5, or Ykt6.

In some aspects, the disclosure provides a GCH1 cleaving polypeptide comprising an amino acid sequence that is at least 70% (e.g., at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%) identical to the sequence set forth in SEQ ID NO: 9 and comprises one or more amino acid substitutions at one or more positions recited in Tables 4 and 6. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set forth in SEQ ID NO: 9.
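The percent-identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only, assuming that the two sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted over aligned columns. The function names are hypothetical, and the identity definition that governs this disclosure may differ (for example, an alignment-program-based definition).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns (assumed definition, see lead-in).

    Columns in which both sequences have a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 70.0
) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not SEQ ID NO: 9.
    reference = "ACDEFGHIKL"
    variant = "ACDEFGHIKV"
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant, 70.0))
```

The sequences in the usage lines are invented placeholders; applying the sketch to a real comparison would require the sequences from the sequence listing and an alignment step that is outside this sketch.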

In some embodiments, the GCH1 cleaving polypeptide comprises one or more amino acid substitutions at a position selected from N59, N61, A73, A75, I102, I115, K164, A166, Y168, I175, K193, D199, I235, F248, N260, L262, F264, A277, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 9. In some embodiments, the GCH1 cleaving polypeptide comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or 14 amino acid substitutions relative to SEQ ID NO: 9. In some embodiments, the one or more amino acid substitutions are selected from N59D, N61S, A73T, A75V, I102L, I115V, K164E, A166T, Y168C, I175T, K193R, D199G, I235M, F248V, N260K, L262F, F264V, A277V, R324H, R354S, L364R, P368L, S395L, S413F, L428S, Y430N, Y430C, and N439T relative to SEQ ID NO: 9. In some embodiments, a GCH1 cleaving polypeptide having an N439T mutation relative to SEQ ID NO: 9 further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, the GCH1 cleaving polypeptide comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T. In some embodiments, a GCH1 cleaving polypeptide comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, the GCH1 cleaving polypeptide comprises the following amino acid substitutions relative to SEQ ID NO: 9: N61S, A73T, K164E, K193R, N260K, L262F, and S413F.
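The substitution labels used above (e.g., A166T) follow the common convention of original residue, 1-based position, and replacement residue. The following is a minimal sketch of reading such labels and applying them to a reference sequence. All names are hypothetical, the example sequence is a toy placeholder rather than SEQ ID NO: 9, and the sketch is not a description of how the variants of the disclosure were produced.

```python
import re
from typing import Iterable, NamedTuple


class Substitution(NamedTuple):
    original: str      # residue in the reference sequence
    position: int      # 1-based position in the reference sequence
    replacement: str   # residue in the variant


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'A166T' into its three parts."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(sequence: str, labels: Iterable[str]) -> str:
    """Return a copy of `sequence` with each labelled substitution applied.

    Each label is checked against the unmodified reference sequence, so a
    label whose original residue does not match the reference is rejected.
    """
    residues = list(sequence)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        if not 0 <= index < len(sequence):
            raise ValueError(f"{label}: position outside the sequence")
        if sequence[index] != sub.original:
            raise ValueError(
                f"{label}: reference has {sequence[index]} at {sub.position}"
            )
        residues[index] = sub.replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_reference = "MKNAT"  # placeholder, not a sequence from the disclosure
    print(parse_substitution("A166T"))
    print(apply_substitutions(toy_reference, ["N3D", "A4T"]))
```

Checking the original residue against the reference is a design choice that catches numbering mismatches, for example when a label written relative to SEQ ID NO: 1 is applied to a sequence numbered relative to SEQ ID NO: 9.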

In some aspects, the disclosure provides a GCH1 cleaving polypeptide comprising an amino acid sequence that is at least 60% (e.g., at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%) identical to the sequence set forth in SEQ ID NO: 1 and comprises one or more amino acid substitutions at one or more positions recited in Tables 3 and 5. In some embodiments, the GCH1 cleaving polypeptide provided herein further comprises one or more amino acid substitutions at a position selected from N59, N61, E72, A73, A75, I102, E113, I115, I119, D161, N164, A166, T167, Y168, Y171, P174, I175, K193, Y199, N210, A218, N235, S240, F248, K252, N260, L262, F264, A277, S280, Y314, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 1. In some embodiments, the GCH1 cleaving polypeptide comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions relative to SEQ ID NO: 1. In some embodiments, the one or more amino acid substitutions are selected from N59D, N61S, E72R, A73T, A75V, I102L, E113K, I115V, I119V, D161N, N164E, N164K, A166T, T167A, Y168C, Y171D, P174L, I175T, K193R, Y199D, Y199G, N210D, A218V, N235I, N235M, S240V, K252E, N260K, L262F, F264V, A277V, S280P, Y314S, R324H, R354S, L364R, P368L, S395L, S413F, L428S, Y430C, Y430N, and N439T relative to SEQ ID NO: 1. In some embodiments, a GCH1 cleaving polypeptide having an N439T mutation relative to SEQ ID NO: 1 further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, the GCH1 cleaving polypeptide comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T. In some embodiments, a GCH1 cleaving polypeptide comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, the GCH1 cleaving polypeptide comprises the following amino acid substitutions relative to SEQ ID NO: 1: N61S, E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, N260K, L262F, S280P, Y314S, and S413F.

In some embodiments, the GCH1 cleaving polypeptide has at least 70% sequence identity (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or more identity) to a sequence selected from SEQ ID NOs: 10-23. In some embodiments, the GCH1 cleaving polypeptide comprises the amino acid sequence set forth in any one of SEQ ID NOs: 10-23. In some embodiments, the GCH1 cleaving polypeptide cleaves proteins comprising a cleavage sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the GCH1 cleaving polypeptide cleaves proteins comprising the cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the polypeptide cleaves intracellular GCH1. In some embodiments, GCH1 comprises the sequence set forth in SEQ ID NO: 2. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity (e.g., 2-fold, 5-fold, 10-fold, 50-fold, 100-fold, etc.) relative to a procaspase-1 protein. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity of between 2-fold and 20,000-fold relative to a procaspase-1 protein. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a procaspase-1 protein. In some embodiments, the polypeptide does not cleave procaspase-1. In some embodiments, procaspase-1 comprises the sequence set forth in SEQ ID NO: 5. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity (e.g., 2-fold, 5-fold, 10-fold, 100-fold, etc.) relative to a VAMP1 protein. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity of between 2-fold and 20,000-fold relative to a VAMP1 protein. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a VAMP1 protein. In some embodiments, the polypeptide does not cleave a VAMP1 protein. In some embodiments, the VAMP1 protein comprises the sequence set forth in SEQ ID NO: 7. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity (e.g., 2-fold, 5-fold, 10-fold, 100-fold, etc.) relative to a VAMP4, VAMP5, or Ykt6 protein. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity of between 2-fold and 20,000-fold relative to a VAMP4, VAMP5, or Ykt6 protein. In some embodiments, the GCH1 cleaving polypeptide cleaves GCH1 with increased selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a VAMP4, VAMP5, or Ykt6 protein. In some embodiments, the polypeptide does not cleave a VAMP4, VAMP5, or Ykt6 protein. In some embodiments, the GCH1 cleaving polypeptide described herein further comprises a neurotoxin HCc domain (also called the C-terminal domain), and/or a neurotoxin HCN domain (also called the N-terminal domain or the translocation domain). In some embodiments, the HCc domain is the cell surface receptor-binding domain and the HCN domain mediates translocation of a BoNT light chain (LC) (e.g., an evolved BoNT LC) across the endosomal membrane of the cell.
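The fold-selectivity ranges recited above can be illustrated with simple arithmetic. This section does not state how selectivity is measured, so the following sketch rests on an assumption: fold selectivity is taken as the ratio of a cleavage-activity measure on GCH1 to the same measure on an off-target substrate (e.g., procaspase-1 or VAMP1). The function names and the numbers in the usage lines are hypothetical and are not data from the disclosure.

```python
def fold_selectivity(target_activity: float, off_target_activity: float) -> float:
    """Ratio of activity on the target to activity on an off-target substrate.

    Assumed definition for illustration only. Both activities must be in the
    same units. Zero off-target activity is reported as infinity.
    """
    if target_activity < 0 or off_target_activity < 0:
        raise ValueError("activities must be non-negative")
    if off_target_activity == 0:
        return float("inf")
    return target_activity / off_target_activity


def within_fold_range(fold: float, low: float = 2.0, high: float = 20000.0) -> bool:
    """True if `fold` lies in the closed range [low, high]."""
    return low <= fold <= high


if __name__ == "__main__":
    # Invented numbers for illustration only.
    fold = fold_selectivity(target_activity=500.0, off_target_activity=5.0)
    print(fold, within_fold_range(fold))
```

A variant with no detectable off-target cleavage falls outside any finite fold range under this sketch, which corresponds to the separate "does not cleave" embodiments rather than to the fold-range embodiments.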

In some aspects, the disclosure provides fusion proteins comprising the GCH1 cleaving polypeptide provided herein and a delivery domain. In some embodiments, the delivery domain is a pleckstrin homology (PH) domain. In some embodiments, the PH domain is a human phospholipase C delta (PLCδ) PH domain. In some embodiments, the PH domain has an amino acid sequence that is at least 80% (e.g., at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, at least 99.5%, at least 99.9%, or more) identical to a sequence set forth in SEQ ID NOs: 44-48. In some embodiments, the PH domain comprises an amino acid sequence set forth in SEQ ID NOs: 44-48. In some embodiments, the delivery domain is a BoNT X HC domain. In some embodiments, the fusion protein further comprises a linker between the GCH1 cleaving polypeptide and the delivery domain. In some embodiments, the linker is or comprises a peptide linker. In some embodiments, the peptide linker is or comprises a glycine-rich linker, a proline-rich linker, glycine/serine-rich linker, and/or alanine/glutamic acid-rich linker.

In some aspects, the disclosure provides a nucleic acid encoding the GCH1 cleaving polypeptide provided herein or the fusion protein provided herein. In some embodiments, the nucleic acid has at least 60% sequence identity (e.g., at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or more) to a nucleic acid sequence selected from SEQ ID NOs: 25-41. In some embodiments, the nucleic acid sequence is codon-optimized. In some embodiments, the nucleic acid sequence is codon-optimized for enhanced expression in desired cells (e.g., increased expression in a particular cell type relative to a wild-type nucleic acid sequence encoding a GCH1 cleaving polypeptide). In some embodiments, the nucleic acid sequence is codon-optimized for expression in mammalian cells (e.g., human cells).

In some aspects, the disclosure provides an expression vector comprising a nucleic acid encoding a GCH1 cleaving polypeptide provided herein. In some embodiments, the vector is a phage, plasmid, cosmid, bacmid, or viral vector. In some embodiments, the vector is a viral vector. In some embodiments, the viral vector is a lentiviral vector. In some embodiments, the nucleic acid comprises or consists of the sequence set forth in any one of SEQ ID NOs: 25-41.

In some aspects, the disclosure provides a host cell comprising the GCH1 cleaving polypeptide provided herein, the fusion protein provided herein, the nucleic acid provided herein, or the expression vector provided herein. In some embodiments, the cell is a bacterial cell. In some embodiments, the host cell is an E. coli cell. In some embodiments, the cell is an animal cell. In some embodiments, the animal cell is a mammalian cell. In some embodiments, the mammalian cell is a human cell.

Some aspects of this disclosure provide methods for using a GCH1 cleaving polypeptide, a fusion protein, or an expression vector provided herein.

In some aspects, the disclosure provides using a GCH1 cleaving polypeptide provided herein to cleave GCH1 in a cell, wherein the use comprises delivering to a cell a GCH1 cleaving polypeptide provided herein. In some embodiments, the use comprises contacting a GCH1 cleaving polypeptide provided herein with GCH1 in an intracellular environment.

In some aspects, the disclosure provides methods for cleaving GCH1 in a cell, the method comprising delivering to a cell a GCH1 cleaving polypeptide provided herein. In some embodiments, the GCH1 cleaving polypeptide contacts GCH1 in an intracellular environment. In some embodiments, the GCH1 comprises a cleavage sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the GCH1 comprises the cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the GCH1 comprises the amino acid sequence set forth in SEQ ID NO: 2. In some embodiments, the cell is in vitro. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is a peripheral nerve cell. In some embodiments, the cell is a neuron. In some embodiments, the cell is a dorsal root ganglion (DRG) neuron. In some embodiments, the cell is in a subject. In some embodiments, the subject is a mammal. In some embodiments, the subject is a human. In some embodiments, delivering the GCH1 cleaving polypeptide to the cell results in cleavage of GCH1 and, subsequently, reduction of intracellular levels of tetrahydrobiopterin (BH4). In some embodiments, delivering the GCH1 cleaving polypeptide to the cell results in inactivation of GCH1. In some embodiments, delivering the GCH1 cleaving polypeptide to the cell results in reduction of pain (e.g., chronic pain, neuropathic pain, and/or inflammatory pain). In some embodiments, the pain is chronic pain. In some embodiments, the pain is neuropathic pain. In some embodiments, the pain is inflammatory pain.

In some aspects, the disclosure provides using a GCH1 cleaving polypeptide, a fusion protein, or an expression vector provided herein to cleave GCH1 in a cell to reduce pain in a subject in need thereof, wherein the use comprises administering to the subject a GCH1 cleaving polypeptide, a fusion protein, or an expression vector provided herein.

In some aspects, the disclosure provides methods for reducing pain in a subject in need thereof comprising administering to the subject a GCH1 cleaving polypeptide, a fusion protein, or an expression vector provided herein. In some embodiments, the GCH1 cleaving polypeptide, the fusion protein, or the expression vector is administered locally. In some embodiments, the GCH1 cleaving polypeptide, the fusion protein, or the expression vector is administered systemically. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the administering of the GCH1 cleaving polypeptide, the fusion protein, or the expression vector to the subject results in the GCH1 cleaving polypeptide entering the cell. In some embodiments, the administering of the GCH1 cleaving polypeptide, the fusion protein, or the expression vector to the subject results in cleavage of GCH1 (SEQ ID NO: 2). In some embodiments, the pain is chronic pain. In some embodiments, the pain is neuropathic pain. In some embodiments, the pain is inflammatory pain.

In some aspects, the disclosure provides a kit comprising a container housing the GCH1 cleaving polypeptide, the fusion protein, the nucleic acid, the expression vector, or the host cell provided herein. It should be appreciated that the foregoing concepts, and additional concepts discussed below, may be arranged in any suitable combination, as the present disclosure is not limited in this respect. Further, other advantages and novel features of the present disclosure will become apparent from the following detailed description of various nonlimiting embodiments when considered in conjunction with the accompanying figures.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1A shows a schematic depicting the crystal structure of GCH1.

FIG. 1B shows a schematic depicting the crystal structure of exemplary GCH1 target sites.

FIG. 2 shows representative data assessing the starting activity of BoNT X protease on GCH1 target sites. Sequences shown correspond to SEQ ID NOs: 50 (Ykt6), 3 (GCH1(80-94)), 51 (VAMP1), and 4 (GCH1(11-126)).

FIGs. 3A-3B show representative data indicating that PANCE evolution yielded BoNT protease variants with robust propagation at GCH1 target site 2. FIG. 3A shows a comparison of the target site sequences for procaspase-1, the starting substrate, and GCH1, the novel substrate. Phage titer is shown for the seven passages of PANCE evolution performed on three replicates. Sequences shown correspond to SEQ ID NOs: 52 (procaspase-1) and 4 (GCH1(site2)). FIG. 3B shows data from an activity assay on BoNT X protease variants from PANCE. OD normalized luminescence values were used to reflect proteolytic activity. BoNT X(3015)8, the starting protease in this evolution, was a positive control showing select activity on procaspase-1, its substrate. Catalytically impaired dBoNT/F is unable to perform proteolysis and was used as a negative control. Isolated phage demonstrated activity on both procaspase-1 and the novel substrate, GCH1, with greater activity on GCH1. The BoNT X 8(6715-1214)2.4 variant yields robust activity on GCH1.

FIG. 4 shows sequence analysis of BoNT X 8(6715-1214) variants following PANCE evolution. Fourteen positions (dotted residues) showed convergent mutations relative to BoNT X(3015)8 (SEQ ID NO: 9). Gray shaded residues are substitutions that arose from the previous evolution steps. SEQ ID NO: 53 (TNNGDFQHGIAQP) is shown.

FIGs. 5A-5B show in vitro cleavage assay data demonstrating that evolved proteases cleave GCH1. FIG. 5A is a gel showing isolation of the BoNT X 8(6715-1214)2.4 variant. FIG. 5B shows isolation of procaspase-1 and GCH1 substrates (left gel) and shows the evolved protease, BoNT X 8(6715-1214)2.4, incubated with procaspase-1 and GCH1 at 50 nM (right gel). The evolved protease BoNT X 8(6715-1214)2.4 shows cleavage of the GCH1 target site in vitro, but retains cleavage of the procaspase-1 starting substrate.

FIG. 6 shows sequence analysis following PANCE evolution of BoNT X 8(6715-1214) variants. There are twenty-eight total positions with mutations relative to wild-type BoNT X, and fourteen positions (dotted residues) showed convergent mutations relative to BoNT X(3015)8. Gray shaded residues are substitutions that arose from the previous evolution steps and represent mutations relative to wild-type BoNT X (SEQ ID NO: 1). SEQ ID NO: 53 (TNNGDFQHGIAQP) is shown.

FIG. 7 shows representative data indicating that PANCE evolution using simultaneous positive selection for GCH1 cleavage and negative selection against procaspase-1 cleavage yielded BoNT/X variants that are specific for the cleavage of GCH1. The figure shows data from an activity assay on BoNT X protease variants from PANCE. OD normalized luminescence values were used to reflect proteolytic activity. BoNT X was used in simultaneous positive and negative selection PANCE/PACE to yield a procaspase-cleaving BoNT X variant, BoNT X(3015)8, which served as the basis for GCH1-cleaving BoNT X evolutions. Positive selection only yielded BoNT X 8(6715-1214)2.4fs (abbreviated “X(1214)2.4fs” in the figure), which retains cleavage activity on the procaspase-1 substrate. Simultaneous positive selection for GCH1 cleavage and negative selection against procaspase-1 cleavage yielded BoNT X variants X(n001)B9, X(n001)B9fs, X(n002)A1, X(n002)A1fs, X(n002)A2, and X(n002)A2fs. Variants 8(6715-1214)2.4fs, X(n001)B9fs, X(n002)A1fs, and X(n002)A2fs include a frameshift mutation (1-nucleotide deletion at residue 439), which appends a tail to the C-terminus of the protein. Reversion of this frameshift yields variants 8(6715-1214)2.4, X(n001)B9, X(n002)A1, and X(n002)A2, respectively. The frameshift was shown to have a negligible effect on the activity of the protease variants.

FIG. 8 shows in vitro cleavage assay data demonstrating that evolved BoNT X variant X(n002)A2 cleaves GCH1 and does not cleave the starting substrate procaspase-1 after both positive and negative selection PANCE. BoNT X 8(6715-1214)2.4 (abbreviated “X(1214)2.4” in the figure), which has only undergone positive selection PANCE for cleavage of GCH1, retains activity on its starting substrate procaspase-1.

FIG. 9 shows in vitro cleavage assay data demonstrating that evolved BoNT X variant BoNT X 8(6715-1214)2.4 (abbreviated “X(1214)2.4” in the figure) cleaves full-length GCH1 purified from E. coli.

DEFINITIONS

The term “protein,” as used herein, refers to a polymer of amino acid residues linked together by peptide bonds. The term, as used herein, refers to proteins, polypeptides, and peptides of any size, structure, or function. Typically, a protein will be at least three amino acids long but is generally longer than 50 amino acids in length. A protein may refer to an individual protein or a collection of proteins. Inventive proteins preferably contain only natural amino acids, although non-natural amino acids (i.e., compounds that do not occur in nature but that can be incorporated into a polypeptide chain; see, for example, cco.caltech.edu/~dadgrp/Unnatstruct.gif, which displays structures of non-natural amino acids that have been successfully incorporated into functional ion channels) and/or amino acid analogs as are known in the art may alternatively be employed. Also, one or more of the amino acids in an inventive protein may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. A protein may also be a single molecule or may be a multi-molecular complex. A protein may be just a fragment of a naturally occurring protein or peptide. A protein may be naturally occurring, recombinant, or synthetic, or any combination of these.

The term “peptide”, as used herein, refers to a short, contiguous chain of amino acids linked to one another by peptide bonds. Generally, a peptide ranges from about 2 amino acids to about 50 amino acids in length (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 amino acids in length) but may be longer in the case of a polypeptide. In some embodiments, a peptide is a fragment or portion of a larger protein, for example comprising one or more domains of a larger protein. Peptides may be linear (e.g., branched, unbranched, etc.) or cyclic (e.g., form one or more closed rings). A “polypeptide”, as used herein, refers to a longer (e.g., between about 50 and about 100), continuous, unbranched peptide chain.

The term “pleckstrin homology domain” or “PH domain,” as used herein, refers to a polypeptide of roughly 100-120 amino acids in length that binds phosphatidylinositol lipids within biological membranes (e.g., phosphatidylinositol (3,4,5)-trisphosphate and phosphatidylinositol (4,5)-bisphosphate) and proteins, such as the βγ-subunits of heterotrimeric G proteins, and protein kinase C. Generally, PH domains function in recruiting and trafficking proteins to different cellular and intracellular membranes. PH domains are found in proteins across several organisms, for example, humans, yeast (e.g., S. cerevisiae) and nematodes (e.g., C. elegans). Hundreds of proteins in humans alone include PH domains. Sequences of PH domains are known in the art, for example, as described by European Molecular Biology Lab Protein Family (Pfam) database entry “PF00169” and InterPro database entry IPR001849.

The term “protease,” as used herein, refers to an enzyme that catalyzes the hydrolysis of a peptide (amide) bond linking amino acid residues together within a protein. The term embraces naturally occurring, evolved, and engineered proteases. Many proteases are known in the art. Proteases can be classified by their catalytic residue, and classes of proteases include, without limitation, serine proteases (serine alcohol), threonine proteases (threonine secondary alcohol), cysteine proteases (cysteine thiol), aspartate proteases (aspartate carboxylic acid), glutamic acid proteases (glutamate carboxylic acid), and metalloproteases (metal ion, e.g., zinc). The structures in parentheses in the preceding sentence correlate to the respective catalytic moiety of the proteases of each class. Some proteases are highly promiscuous and cleave a wide range of protein substrates, e.g., trypsin or pepsin. Other proteases are highly specific, and only cleave substrates with a specific target sequence. Some blood clotting proteases such as, for example, thrombin, and some viral proteases, such as, for example, HCV or TEV protease, are highly specific proteases. To give but another example, Botulinum neurotoxin (BoNT) proteases generally cleave specific SNARE proteins (e.g., synaptosome-associated proteins (SNAP25), syntaxin proteins, vesicle-associated membrane proteins (VAMPs)). Proteases that cleave in a specific manner typically bind to multiple amino acid residues of their substrate. Suitable proteases and protease cleavage sites, also sometimes referred to as “protease substrates,” “protein substrates,” or “amino acid substrates,” will be apparent to those of skill in the art and include, without limitation, proteases listed in the MEROPS database, accessible at merops.sanger.ac.uk and described in Rawlings et al., (2014) MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res 42, D503-D509, the entire contents of each of which are incorporated herein by reference. The disclosure is not limited in this respect.

The term “GTP cyclohydrolase 1” or “GCH1,” as used herein, refers to a protein encoded by the GCH1 gene. GTP cyclohydrolase 1 is the first and rate-limiting enzyme in de novo biosynthesis of tetrahydrobiopterin (BH4). In the initiating step of BH4 biosynthesis, GCH1 catalyzes the conversion of GTP into 7,8-dihydroneopterin triphosphate. Cleavage of the GCH1 protein inactivates the protein. Cleavage of GCH1 results in reduced chronic pain, such as neuropathic pain and inflammatory pain, by decreasing intracellular levels of BH4. In some embodiments, the GCH1 protein is a human GCH1 protein comprising the sequence set forth in SEQ ID NO: 2. In some embodiments, a BoNT X variant cleaves a target sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, a BoNT X variant cleaves a target sequence comprising ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, a BoNT X variant cleaves a target sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence SSLGENPQRQGLLKT (SEQ ID NO: 3). In some embodiments, a BoNT X variant cleaves a target sequence comprising SSLGENPQRQGLLKT (SEQ ID NO: 3).
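By way of illustration only, the percent-identity thresholds recited above can be checked with a short script. The following is a minimal sketch that assumes an ungapped, position-by-position comparison of two equal-length sequences; the disclosure does not specify an alignment method, and practical identity calculations typically use an alignment tool. The function names and the example candidate sequence are hypothetical and are not taken from the disclosure.

```python
# Reference cleavage sequence quoted in the text (SEQ ID NO: 4).
SEQ_ID_NO_4 = "ETISDVLNDAIFDEDH"


def count_substitutions(candidate: str, reference: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(candidate) != len(reference):
        # Assumption of this sketch: no gaps, so lengths must match.
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(a != b for a, b in zip(candidate, reference))


def percent_identity(candidate: str, reference: str) -> float:
    """Percent of positions that are identical to the reference."""
    matches = len(reference) - count_substitutions(candidate, reference)
    return 100.0 * matches / len(reference)


def meets_threshold(candidate: str, reference: str, threshold: float) -> bool:
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical candidate with two substitutions (first and last residues).
candidate = "ATISDVLNDAIFDEDA"
print(count_substitutions(candidate, SEQ_ID_NO_4))   # 2
print(percent_identity(candidate, SEQ_ID_NO_4))      # 87.5
print(meets_threshold(candidate, SEQ_ID_NO_4, 85))   # True
print(meets_threshold(candidate, SEQ_ID_NO_4, 90))   # False
```

Under this ungapped assumption, a 16-residue sequence such as SEQ ID NO: 4 tolerates at most three substitutions at the 80% threshold (13 of 16 positions is 81.25%).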

The term “tetrahydrobiopterin” or “BH4,” as used herein, refers to an essential cofactor for nitric oxide synthase (NOS), tryptophan hydroxylase, phenylalanine hydroxylase, and tyrosine hydroxylase, making it indispensable for the synthesis of serotonin, dopamine, epinephrine, norepinephrine, and nitric oxide. BH4 is critical for several biological systems including pain. The production of BH4 within the dorsal root ganglion plays a critical role in pain signaling. BH4 is a precursor for peripheral neuropathic and inflammatory pain signals and thus is an intrinsic regulator of pain sensitivity and chronicity. Intracellular levels of BH4 are determined by three metabolic pathways: de novo, salvage, and recycling. De novo biosynthesis involves a series of reactions involving three enzymes: GCH1, 6-pyruvoyl tetrahydrobiopterin synthase (PTS), and sepiapterin reductase (SPR). GCH1, as the rate-limiting enzyme for BH4 biosynthesis, is essential to the production of BH4, and GCH1 levels determine the intracellular concentration of BH4.

The term “procaspase,” as used herein, refers to an inactive zymogen protease. Procaspases undergo dimerization or oligomerization, followed by cleavage for activation. During the activation process, the interdomain linker (IDL) is cleaved into small and large subunits, which then associate with each other to form an active caspase. Procaspase-1 is a zymogen protease, which is the inactive precursor to caspase-1. Procaspase-1 comprises three domains, a caspase activation and recruitment domain (CARD), a large subunit (p20), and a small subunit (p10), separated by two linkers, the CARD linker and IDL linker. The IDL linker separates the small and large subunits, and the CARD linker separates the CARD and the large subunit. Procaspase-1 has three endogenous cleavage sites, D119, D297, and D316. Cleavage of procaspase-1 at the IDL results in production of active caspase-1, which can initiate pyroptotic cell death of cells (e.g., cancer cells).

The term “Botulinum neurotoxin (BoNT) protease,” as used herein, refers to a protease derived from, or having at least 70% sequence identity to (e.g., at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or more identity to) a Botulinum neurotoxin (BoNT), for example, a BoNT derived from a bacterium of the genus Clostridium (e.g., C. botulinum). Structurally, BoNT proteins comprise two conserved domains, a “heavy chain” (HC) and a “light chain” (LC). The LC comprises a zinc metalloprotease domain responsible for the catalytic activity of the protein. The HC serves as a delivery vehicle and mediates entry into the neuron cytosol, where disulfide bond reduction releases the LC. The HC typically comprises an HCc domain, which is responsible for binding to neuronal cells, and an HCN domain, which mediates translocation of the protein into a cell. Examples of BoNT HC domains are represented by the amino acid sequences set forth in SEQ ID NOs.: 42 and 43 below.

BoNT X HCN Domain

SLLNGCIEVENKDLFLISNKDSLNDINLSEEKIKPETTVFFKDKLPPQDITLSNY DFTEANSIPSISQQNILERNEELYEPIRNSLFEIKTIYVDKLTTFHFLEAQNIDESIDSS KI RVELTDSVDEALSNPNKVYSPFKNMSNTINSIETGITSTYIFYQWLRSIVKDFSDETGK IDVIDKSSDTLAIVPYIGPLLNIGNDIRHGDFVGAIELAGITALLEYVPEFTIPILVGLE VI GGELAREQVEAIVNNALDKRDQKWAEVYNITKAQWWGTIHLQINTRLAHTYKALS RQANAIKMNMEFQLANYKGNIDDKAKIKNAISETEILLNKSVEQAMKNTEKFMIKLS NSYLTKEMIPKVQDNLKNFDLETKKTLDKFIKEKEDILGTNLSSSLRRKVSIRLNKNIA FDINDIPFSEFDDLINQYK (BoNT X HCN, translocation domain; SEQ ID NO.: 42)

BoNT X HCc Domain

NEIEDYEVLNLGAEDGKIKDLSGTTSDINIGSDIELADGRENKAIKIKGSENSTI KIAMNKYLRFSATDNFSISFWIKHPKPTNLLNNGIEYTLVENFNQRGWKISIQDSKLI WYLRDHNNSIKIVTPDYIAFNGWNLITITNNRSKGSIVYVNGSKIEEKDISSIWNTEVD DPIIFRLKNNRDTQAFTLLDQFSIYRKELNQNEVVKLYNYYFNSNYIRDIWGNPLQYN KKYYLQTQDKPGKGLIREYWSSFGYDYVILSDSKTITFPNNIRYGALYNGSKVLIKNS KKLDGLVRNKDFIQLEIDGYNMGISADRFNEDTNYIGTTYGTTHDLTTDFEIIQRQEK YRNYCQLKTPYNIFHKSGLMSTETSKPTFHDYRDWVYSSAWYFQNYENLNLRKHTK TNWYFIPKDEGWDED (BoNT X HCc, Binding domain; SEQ ID NO.: 43)

There are eight serotypes of BoNTs, denoted BoNT A-G and X. BoNT serotypes A, C, and E cleave synaptosome-associated protein (SNAP25). BoNT serotype C has also been observed to cleave syntaxin. BoNT serotypes B, D, F, and G cleave vesicle-associated membrane proteins (VAMPs). BoNT X was more recently discovered and seems to show a more promiscuous substrate profile than the other serotypes. BoNT X has the lowest sequence identity with other BoNT serotypes and is also not recognized by antisera against known BoNT serotypes. BoNT X is similar to the other BoNT serotypes, however, in cleaving vesicle-associated membrane proteins (VAMP) 1, 2 and 3, but does so at a novel site (Arg66-Ala67 in VAMP2). Lastly, BoNT X is the only toxin that also cleaves non-canonical substrates VAMP4, VAMP5, and Ykt6 (Nat Commun. 2017 Aug 3;8: 14130. doi: 10.1038/ncomms14130, ncbi.nlm.nih.gov/pubmed/28770820). An example of a VAMP protein (e.g., VAMP1) that is cleaved by wild-type BoNT proteases (e.g., BoNT X) is represented by the amino acid sequence set forth in SEQ ID NO.: 7 below.

VAMP1 protein sequence

MSAPAQPPAEGTEGTAPGGGPPGPPPNMTSNRRLQQTQAQVEEVVDIIRVNVDKVLE RDQKLSELDDRADALQAGASQFESSAAKLKRKYWWKNCKMMIMLGAICAIIVVVIV RRG (SEQ ID NO: 7)

In some embodiments, a VAMP1 substrate that is cleaved by wild-type BoNT proteases comprises the following cleavage sequence: TSNRRLQQTQAQVEEVVDIIRVNVDKVLERDQKLSELDDRADALQAGASQFESSAA KLKR (SEQ ID NO: 8).

A wild-type BoNT protease refers to the amino acid sequence of a BoNT protease as it naturally occurs in Clostridium botulinum (C. botulinum). A non-limiting example of a wild-type BoNT X protease light chain sequence is represented by the amino acid sequence set forth in SEQ ID NO: 1.

The term “BoNT protease variant,” as used herein, refers to a protein (e.g., a BoNT protease) having one or more amino acid variations introduced into the amino acid sequence, e.g., as a result of application of PACE/PANCE or by genetic engineering (e.g., recombinant gene expression, gene synthesis, etc.), as compared to the amino acid sequence of a naturally-occurring or wild-type BoNT protein (e.g., SEQ ID NO: 1). Amino acid sequence variations may include one or more mutated residues within the amino acid sequence of the protease, e.g., as a result of a substitution of one amino acid for another, deletions of one or more amino acids (e.g., a truncated protein), insertions of one or more amino acids, or any combination of the foregoing. In some embodiments, the BoNT protease variants described herein comprise an evolved BoNT LC. In some embodiments, the BoNT protease variants described herein do not require an additional domain (e.g., HC domain or PH domain). In certain embodiments, a BoNT protease variant cleaves a different target protein (e.g., has broadened or different substrate specificity) relative to a wild-type BoNT protease. For example, in some embodiments, a BoNT X protease variant is a GCH1 cleaving protease that cleaves a GCH1 cleavage sequence (e.g., a target sequence within a GCH1 protein) or a GCH1 protein. In some embodiments, a BoNT X variant cleaves a target sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, a BoNT X variant cleaves a target sequence having between 1 and 5 (e.g., 1, 2, 3, 4, 5) amino acid substitutions (e.g., mutations) relative to SEQ ID NO: 4. In some embodiments, a BoNT X variant cleaves a target sequence comprising ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, a BoNT X variant cleaves a target sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence SSLGENPQRQGLLKT (SEQ ID NO: 3).
In some embodiments, a BoNT X variant cleaves a target sequence having between 1 and 5 (e.g., 1, 2, 3, 4, 5) amino acid substitutions (e.g., mutations) relative to SEQ ID NO: 3. In some embodiments, a BoNT X variant cleaves a target sequence comprising SSLGENPQRQGLLKT (SEQ ID NO: 3). In some embodiments, cleavage of the target GCH1 results in reduction of intracellular levels of tetrahydrobiopterin (BH4). In some embodiments, cleavage of the target GCH1 results in the reduction of pain (e.g., chronic, inflammatory, and/or neuropathic pain). In some embodiments, a BoNT X variant comprises a C-terminal extension. The term “C-terminal extension,” as used herein, refers to a polypeptide sequence not normally present in a wild-type BoNT that extends beyond a mutation due to a frameshift. In some embodiments, the length of the C-terminal extension is about 5-30 amino acids in length. In some embodiments, the length of the C-terminal extension is 12 amino acids in length. In some embodiments, the C-terminal extension is positioned after the substituted amino acid causing a frameshift mutation.
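By way of illustration only, the way a single-nucleotide deletion can produce a C-terminal extension may be sketched as follows. The example uses the standard genetic code and a toy DNA sequence that is purely hypothetical; it is not any sequence of the disclosure, and the deletion position is chosen only so that the shifted reading frame runs past the original stop codon. The helper names are likewise assumptions made for this sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from position 0 until a stop codon or the end of the sequence.

    A trailing incomplete codon is ignored.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_nucleotide(dna: str, index: int) -> str:
    """Return the sequence with the nucleotide at `index` removed."""
    return dna[:index] + dna[index + 1:]


# Hypothetical toy gene: ATG GCT AAA TAA, followed by downstream bases.
TOY_GENE = "ATGGCTAAATAAGGTTGA"
shifted = delete_nucleotide(TOY_GENE, 6)

print(translate(TOY_GENE))  # MAK
print(translate(shifted))   # MANKV
```

In this toy case the deletion shifts the reading frame so that the original stop codon is no longer read in frame, and translation continues into downstream sequence, appending residues that are absent from the unshifted product.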

The term “VAMP,” as used interchangeably herein with the term “Vesicle-associated membrane protein,” refers to proteins belonging to the SNARE protein family, and these proteins share structural similarity. Different proteins make up the collection VAMP1, VAMP2, VAMP3, VAMP4, VAMP5, VAMP6, VAMP7, and VAMP8 and are mostly involved in vesicle fusion. For example, VAMP1 and VAMP2 proteins are expressed in brain and are constituents of the synaptic vesicles, where they participate in neurotransmitter release; VAMP3 is expressed and participates in regulated and constitutive exocytosis as a constituent of secretory granules and secretory vesicles; VAMP4 is involved in transport out of the Golgi apparatus; VAMP5 and VAMP7 participate in constitutive exocytosis; VAMP5 is a constituent of secretory vesicles; VAMP7 is also found both in secretory granules and endosomes; and VAMP8 is part of endocytosis and is found in early endosomes. VAMP8 is also involved in exocytosis in pancreatic acinar cells.

The term “continuous evolution,” as used herein, refers to an evolution process, in which a population of nucleic acids encoding a protein of interest (e.g., BoNT) is subjected to multiple rounds of: (a) replication, (b) mutation (or modification of the nucleic acids in the population), and (c) selection to produce a desired evolved product, for example, a novel nucleic acid encoding a novel protein with a desired activity, wherein the multiple rounds of replication, mutation, and selection can be performed without investigator interaction, and wherein the processes (a)-(c) can be carried out simultaneously. Typically, the evolution procedure is carried out in vitro, for example, using cells in culture as host cells (e.g., bacterial cells). During a continuous evolution process, the population of nucleic acids replicates in a flow of host cells, e.g., a flow through a lagoon. In general, a continuous evolution process provided herein relies on a system in which a gene of interest is provided in a nucleic acid vector that undergoes a life-cycle including replication in a host cell and transfer to another host cell, wherein a critical component of the life-cycle is deactivated, and reactivation of the component is dependent upon a desired variation in an amino acid sequence of a protein encoded by the gene of interest.

In some embodiments, the gene of interest (e.g., a gene encoding a BoNT protease, such as BoNT X or variants thereof) is transferred from cell to cell in a manner dependent on the activity of the gene of interest. In some embodiments, the transfer vector is a virus infecting cells, for example, a bacteriophage or a retroviral vector. In some embodiments, the viral vector is a phage vector that infects bacterial host cells. In some embodiments, the transfer vector is a conjugative plasmid transferred from a donor bacterial cell to a recipient bacterial cell.

In some embodiments, the nucleic acid vector comprising the gene of interest (e.g., a gene encoding a BoNT protease, such as BoNT X or variants thereof) is a phage, a viral vector, or naked DNA (e.g., a mobilization plasmid). In some embodiments, transfer of the gene of interest from cell to cell is via infection, transfection, transduction, conjugation, or uptake of naked DNA, and efficiency of cell-to-cell transfer (e.g., transfer rate) is dependent on an activity of a product encoded by the gene of interest. For example, in some embodiments, the nucleic acid vector is a phage harboring the gene of interest and the efficiency of phage transfer (via infection) is dependent on an activity of the gene of interest in that a protein required for the generation of phage particles (e.g., pIII for M13 phage) is expressed in the host cells only in the presence of the desired activity of the gene of interest, for example, cleavage of a target amino acid sequence or target nucleic acid sequence.

Some embodiments provide a continuous evolution system, in which a population of viral vectors comprising a gene of interest to be evolved replicates in a flow of host cells, e.g., a flow through a lagoon (e.g., evolution vessel), wherein the viral vectors are deficient in a gene (e.g., full-length pIII gene) encoding a protein that is essential for the generation of infectious viral particles, and wherein that gene is in the host cell under the control of a conditional promoter that can be activated by a gene product encoded by the gene of interest (e.g., gene encoding a BoNT protease, such as BoNT X or variants thereof), or a mutated version thereof. In some embodiments, the activity of the conditional promoter depends on a desired function of a gene product encoded by the gene of interest (e.g., gene encoding X of interest). Viral vectors, in which the gene of interest (e.g., gene encoding a BoNT protease, such as BoNT X or variants thereof) has not acquired a desired function as a result of a variation of amino acids introduced into the gene product protein sequence, will not activate the conditional promoter, or may only achieve minimal activation, while any mutations introduced into the gene of interest that confer the desired function will result in activation of the conditional promoter. Since the conditional promoter controls an essential protein for the viral life cycle, e.g., pIII, activation of this promoter directly corresponds to an advantage in viral spread and replication for those vectors that have acquired an advantageous mutation.
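By way of illustration only, the selection principle described above (vectors whose gene product has the desired activity propagate more efficiently) can be sketched as a toy deterministic enrichment model. This is a simplification made for illustration and is not a model from the disclosure: it omits mutation, treats each variant's propagation per round as proportional to a single hypothetical activity value, and renormalizes the population each round. All variant names, activity values, and starting frequencies are assumptions.

```python
def selection_round(frequencies: dict, activities: dict) -> dict:
    """One round of activity-dependent propagation followed by renormalization.

    Each variant's share of the next generation is proportional to its current
    frequency multiplied by its (hypothetical) activity value.
    """
    weights = {v: frequencies[v] * activities[v] for v in frequencies}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


def run_rounds(frequencies: dict, activities: dict, rounds: int) -> dict:
    """Apply `rounds` consecutive selection rounds."""
    for _ in range(rounds):
        frequencies = selection_round(frequencies, activities)
    return frequencies


# Hypothetical starting population dominated by a nearly inactive variant.
start = {"inactive": 0.98, "weak": 0.019, "active": 0.001}
# Hypothetical relative propagation efficiencies (arbitrary units).
activity = {"inactive": 0.01, "weak": 0.2, "active": 1.0}

final = run_rounds(start, activity, rounds=7)
for name, freq in final.items():
    print(f"{name}: {freq:.6f}")
```

With these assumed values, the initially rare active variant comes to dominate the population within a few rounds, which is the qualitative behavior the conditional-promoter design is intended to produce.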

The term “flow,” as used herein in the context of host cells, refers to a stream of host cells, wherein fresh host cells are being introduced into a host cell population, for example, a host cell population in a lagoon, remain within the population for a limited time, and are then removed from the host cell population. In a simple form, a host cell flow may be a flow through a tube, or a channel, for example, at a controlled rate. In some embodiments, a flow of host cells is directed through a lagoon that holds a volume of cell culture media and comprises an inflow and an outflow. The introduction of fresh host cells may be continuous or intermittent and removal may be passive, e.g., by overflow, or active, e.g., by active siphoning or pumping. Removal further may be random, for example, if a stirred suspension culture of host cells is provided, removed liquid culture media will contain freshly introduced host cells as well as cells that have been a member of the host cell population within the lagoon for some time. Even though, in theory, a cell could escape removal from the lagoon indefinitely, the average host cell will remain only for a limited period of time within the lagoon, which is determined mainly by the flow rate of the culture media (and suspended cells) through the lagoon.
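By way of illustration only, the dependence of host cell residence time on flow rate can be sketched under the assumption of an ideal, continuously stirred lagoon of constant volume, in which removal is random as described above. Under that assumption, the mean residence time equals the lagoon volume divided by the flow rate, and the fraction of cells still present after a given time decays exponentially. The function names and the example volume and flow rate are hypothetical.

```python
import math


def mean_residence_time(volume_ml: float, flow_ml_per_h: float) -> float:
    """Mean residence time (hours) in an ideal well-mixed vessel: V / F."""
    if flow_ml_per_h <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ml / flow_ml_per_h


def fraction_remaining(hours: float, volume_ml: float, flow_ml_per_h: float) -> float:
    """Fraction of cells present at time 0 that are still in the vessel.

    Assumes random removal from a well-mixed vessel, giving exp(-t * F / V).
    """
    return math.exp(-hours * flow_ml_per_h / volume_ml)


# Hypothetical lagoon: 40 mL volume with an 80 mL/h flow.
print(mean_residence_time(40.0, 80.0))      # 0.5
print(fraction_remaining(0.5, 40.0, 80.0))  # exp(-1), about 0.368
```

The exponential tail reflects the statement above that a cell could, in theory, escape removal indefinitely, while the average cell remains only for a period set by the flow rate.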

Since the viral vectors replicate in a flow of host cells, in which fresh, uninfected host cells are provided while infected cells are removed, multiple consecutive viral life cycles can occur without investigator interaction, which allows for the accumulation of multiple advantageous mutations in a single evolution experiment.

The term “phage-assisted continuous evolution” (also used interchangeably herein with “PACE”), as used herein, refers to continuous evolution that employs phage as viral vectors. The general concept of PACE technology has been described, for example, in U.S. Patent No. 9,023,594, issued May 5, 2015; U.S. Patent No. 9,771,574, issued September 26, 2017; U.S. Patent Application Serial No. 15/713,403, filed September 22, 2017; International PCT Application PCT/US2009/056194, filed September 8, 2009, published as WO 2010/028347 on March 11, 2010; U.S. Provisional Patent Application Serial No. 61/426,139, filed December 22, 2010; U.S. Patent No. 9,394,537, issued July 19, 2016; U.S. Patent No. 10,336,997, issued July 2, 2019; U.S. Patent No. 11,214,792, issued January 4, 2022; International PCT Application PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012; U.S. Provisional Patent Application Serial No. 61/929,378, filed January 20, 2014; U.S. Patent No. 10,179,911, issued January 15, 2019; U.S. Patent Application Serial No. 16/238,386, filed January 2, 2019; International PCT Application PCT/US2015/012022, filed January 20, 2015; U.S. Provisional Patent Application Serial No. 62/158,982, filed May 8, 2015; U.S. Provisional Patent Application Serial No. 62/187,669, filed July 1, 2015; U.S. Provisional Patent Application Serial No. 62/067,194, filed October 22, 2014; U.S. Patent No. 10,920,208, issued February 16, 2021; International PCT Application PCT/US2018/048134, filed August 27, 2018, published as WO 2019/040935 on February 28, 2019; U.S. Patent No. 9,267,127, issued February 23, 2016; International PCT Application PCT/US2015/057012, filed October 22, 2015, published as WO 2016/077052 on May 19, 2016; International PCT Application PCT/US2016/027795, filed April 15, 2016, published as WO 2016/168631 on October 20, 2016; International PCT Application PCT/US2009/056194, filed September 8, 2009, published as WO 2010/028347 on March 11, 2010; International PCT Application PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012; and International PCT Application PCT/US2018/051557, filed September 18, 2018, published as WO 2019/056002 on March 21, 2019, the entire contents of each of which are incorporated herein by reference.

The term “non-continuous evolution,” as used herein, refers to an evolution procedure in which a population of nucleic acids encoding a protein of interest (e.g., BoNT) is subjected to multiple rounds of: (a) replication, (b) mutation (or modification of the primary sequence of nucleotides of the nucleic acids in the population), and (c) selection to produce a desired evolved product, for example, a novel nucleic acid encoding a novel protein with a desired activity, wherein the multiple rounds of replication, mutation, and selection require investigator intervention to move the process from one phase to another. Non-continuous evolution is similar to continuous evolution in that it uses the same selection principles, but it is performed using serial dilutions instead of under continuous flow. A non-continuous evolution process may be used as a lower stringency alternative to a continuous evolution process.

The term “phage-assisted non-continuous evolution” (also used interchangeably herein with “PANCE”), as used herein, refers to non-continuous evolution that employs phage as viral vectors. The general concept of PANCE technology has been described, for example, in Miller et al., Nature Protoc 2020 Dec;15(12):4101-4127, and International PCT Application PCT/US2020/042016, filed July 14, 2020, published as WO 2021/011579 on January 21, 2021, the entire contents of each of which are incorporated herein by reference. PANCE uses the same selection principles as PACE, but it is performed through serial dilution instead of under continuous flow. PANCE is less stringent than PACE because more time is allowed for phage propagation. PANCE may be performed in multiwell plates, which enables parallel evolution towards many different targets or many replicates of the same evolution.

The term “nucleic acid,” as used herein, refers to a polymer of nucleotides. The polymer may include natural nucleosides (i.e., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine), nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, 4-acetylcytidine, 5-(carboxyhydroxymethyl)uridine, dihydrouridine, methylpseudouridine, 1-methyl adenosine, 1-methyl guanosine, N6-methyl adenosine, and 2-thiocytidine), chemically modified bases, biologically modified bases (e.g., methylated bases), intercalated bases, modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, 2'-O-methylcytidine, arabinose, and hexose), or modified phosphate groups (e.g., phosphorothioates and 5'-N-phosphoramidite linkages).

An “isolated nucleic acid” generally refers to a nucleic acid that is: (i) amplified in vitro by, for example, polymerase chain reaction (PCR); (ii) recombinantly produced by molecular cloning; (iii) purified, as by restriction endonuclease cleavage and gel electrophoretic fractionation, or column chromatography; or (iv) synthesized by, for example, chemical synthesis. An isolated nucleic acid is one which is readily manipulatable by recombinant DNA techniques known in the art. Thus, a nucleotide sequence contained in a vector in which 5' and 3' restriction sites are known or for which polymerase chain reaction (PCR) primer sequences have been disclosed is considered isolated, but a nucleic acid sequence existing in its native state in its natural host is not. An isolated nucleic acid may be substantially purified but need not be. For example, a nucleic acid that is isolated within a cloning or expression vector is not pure in that it may comprise only a tiny percentage of the material in the cell in which it resides. Such a nucleic acid is isolated, however, as the term is used herein because it is readily manipulatable by standard techniques known to those of ordinary skill in the art. As used herein with respect to proteins or peptides, the term “isolated” refers to a protein or peptide that has been isolated from its natural environment or artificially produced (e.g., by chemical synthesis, by recombinant DNA technology, etc.).

The term “gene of interest” or “gene encoding a protein (e.g., BoNT protease, such as BoNT X or variants thereof) of interest,” as used herein, refers to a nucleic acid construct comprising a nucleotide sequence encoding a gene product (e.g., a BoNT protease such as BoNT X or variants thereof) of interest (e.g., for its properties, either desirable or undesirable) to be evolved in a continuous evolution process as described herein. The term includes any variations of a gene of interest that are the result of a continuous evolution process according to methods described herein (e.g., increased expression, decreased expression, modulated or changed activity, modulated or changed specificity). For example, in some embodiments, a gene of interest is a nucleic acid construct comprising a nucleotide sequence encoding a BoNT protease, such as BoNT X or variants thereof, to be evolved, cloned into a viral vector, for example, a phage genome, so that the expression of the encoding sequence is under the control of one or more promoters in the viral genome. In some embodiments, a gene of interest is a nucleic acid construct comprising a nucleotide sequence encoding a BoNT protease, such as BoNT X or variants thereof, to be evolved and a promoter operably linked to the encoding sequence. When cloned into a viral vector, for example, a phage genome, the expression of the encoding sequence of such genes of interest is under the control of the heterologous promoter and, in some embodiments, may also be influenced by one or more promoters in the viral genome.

The term “vector,” as used herein, refers to any genetic element, such as a plasmid, phage, transposon, cosmid, chromosome, artificial chromosome, virus, virion, etc., which is capable of replication when associated with the proper control elements, and which can transfer gene sequences between cells.

The term “viral vector,” as used herein, refers to a nucleic acid (or isolated nucleic acid) comprising a viral genome that, when introduced into a suitable host cell, can be replicated and packaged into viral particles able to transfer the viral genome into another host cell. The term viral vector extends to vectors comprising truncated or partial viral genomes. For example, in some embodiments, a viral vector is provided that lacks a gene encoding a protein essential for the generation of infectious viral particles or for viral replication. In suitable host cells, for example, host cells comprising the lacking gene under the control of a conditional promoter, however, such truncated viral vectors can replicate and generate viral particles able to transfer the truncated viral genome into another host cell. In some embodiments, the viral vector is a phage, for example, a filamentous phage (e.g., an M13 phage). In some embodiments, a viral vector, for example, a phage vector, is provided that comprises a gene of interest to be evolved.

The term “host cell,” as used herein, refers to a cell that can host a viral vector useful for a continuous evolution process as provided herein. A cell can host a viral vector if it supports expression of genes of the viral vector, replication of the viral genome, and/or the generation of viral particles. One criterion to determine whether a cell is a suitable host cell for a given viral vector is to determine whether the cell can support the viral life cycle of a wild-type viral genome that the viral vector is derived from. For example, if the viral vector is a modified M13 phage genome, as provided in some embodiments described herein, then a suitable host cell would be any cell that can support the wild-type M13 phage life cycle. Suitable host cells for viral vectors useful in continuous evolution processes are well known to those of skill in the art, and the invention is not limited in this respect.

In some embodiments, modified viral vectors are used in continuous evolution processes as provided herein. In some embodiments, such modified viral vectors lack a gene required for the generation of infectious viral particles. In some such embodiments, a suitable host cell is a cell comprising the gene required for the generation of infectious viral particles, for example, under the control of a constitutive or a conditional promoter (e.g., in the form of an accessory plasmid, as described herein). In some embodiments, the viral vector used lacks a plurality of viral genes. In some such embodiments, a suitable host cell is a cell that comprises a helper construct providing the viral genes required for the generation of viral particles. A cell is not required to actually support the life cycle of a viral vector used in the methods provided herein. For example, a cell comprising a gene required for the generation of infectious viral particles under the control of a conditional promoter may not support the life cycle of a viral vector that does not comprise a gene of interest able to activate the promoter, but it is still a suitable host cell for such a viral vector. In some embodiments, the viral vector is a phage, and the host cell is a bacterial cell. In some embodiments, the host cell is an E. coli cell. Suitable E. coli host strains will be apparent to those of skill in the art, and include, but are not limited to, New England Biolabs (NEB) Turbo, Top10F’, DH12S, ER2738, ER2267, XL1-Blue MRF’, and DH10B. In some embodiments, the strain of E. coli used is known as S1030 (available from Addgene). In some embodiments, the strain of E. coli used to express proteins is BL21(DE3). These strain names are art recognized, and the genotype of these strains has been well characterized. It should be understood that the above strains are exemplary only, and that the invention is not limited in this respect.

The term “fresh,” as used herein interchangeably with the terms “non-infected” or “uninfected” in the context of host cells, refers to a host cell that has not been infected by a viral vector comprising a gene of interest as used in a continuous evolution process provided herein. A fresh host cell can, however, have been infected by a viral vector unrelated to the vector to be evolved or by a vector of the same or a similar type but not carrying the gene of interest.

The term “promoter” refers to a nucleic acid molecule with a sequence recognized by the cellular transcription machinery and able to initiate transcription of a downstream gene. A promoter can be constitutively active, meaning that the promoter is always active in a given cellular context, or conditionally active, meaning that the promoter is only active under specific conditions. For example, a conditional promoter may only be active in the presence of a specific protein that connects a protein associated with a regulatory element in the promoter to the basic transcriptional machinery, or only in the absence of an inhibitory molecule. A subclass of conditionally active promoters are inducible promoters that require the presence of a small molecule “inducer” for activity. Examples of inducible promoters include, but are not limited to, arabinose-inducible promoters, Tet-on promoters, and tamoxifen-inducible promoters. A variety of constitutive, conditional, and inducible promoters are well known to the skilled artisan, and the skilled artisan will be able to ascertain a variety of such promoters useful in carrying out the instant invention, which is not limited in this respect.

The term “phage,” as used herein interchangeably with the term “bacteriophage,” refers to a virus that infects bacterial cells. Typically, phages consist of an outer protein capsid enclosing genetic material. The genetic material can be ssRNA, dsRNA, ssDNA, or dsDNA, in either linear or circular form. Phages and phage vectors are well known to those of skill in the art and non-limiting examples of phages that are useful for carrying out the methods provided herein are λ (Lysogen), T2, T4, T7, T12, R17, M13, MS2, G4, P1, P2, P4, Phi X174, N4, Φ6, and Φ29. In certain embodiments, the phage utilized in the present invention is M13. Additional suitable phages and host cells will be apparent to those of skill in the art, and the invention is not limited in this aspect. For an exemplary description of additional suitable phages and host cells, see Elizabeth Kutter and Alexander Sulakvelidze: Bacteriophages: Biology and Applications. CRC Press; 1st edition (December 2004), ISBN: 0849313368; Martha R. J. Clokie and Andrew M. Kropinski: Bacteriophages: Methods and Protocols, Volume 1: Isolation, Characterization, and Interactions (Methods in Molecular Biology) Humana Press; 1st edition (December 2008), ISBN: 1588296822; Martha R. J. Clokie and Andrew M. Kropinski: Bacteriophages: Methods and Protocols, Volume 2: Molecular and Applied Aspects (Methods in Molecular Biology) Humana Press; 1st edition (December 2008), ISBN: 1603275649; all of which are incorporated herein in their entirety by reference for disclosure of suitable phages and host cells as well as methods and protocols for isolation, culture, and manipulation of such phages.

In some embodiments, the phage is a filamentous phage. In some embodiments, the phage is an M13 phage. M13 phages are well known to those in the art and the biology of M13 phages has been studied extensively. Wild-type M13 phage particles comprise a circular, single-stranded genome of approximately 6.4 kb. In certain embodiments, the wild-type genome of an M13 phage includes eleven genes, gI-gXI, which, in turn, encode the eleven M13 proteins, pI-pXI, respectively. gVIII encodes pVIII, also often referred to as the major structural protein of the phage particles, while gIII encodes pIII, also referred to as the minor coat protein, which is required for infectivity of M13 phage particles, whereas gIII-neg encodes an antagonistic protein to pIII.

The term “selection phage,” as used herein interchangeably with the term “selection plasmid,” refers to a modified phage that comprises a gene of interest to be evolved and lacks a full-length gene encoding a protein required for the generation of infectious phage particles. For example, some M13 selection phages provided herein comprise a nucleic acid sequence encoding a BoNT protease, such as BoNT X or variants thereof, to be evolved, e.g., under the control of an M13 promoter, and lack all or part of a phage gene encoding a protein required for the generation of infectious phage particles, e.g., gI, gII, gIII, gIV, gV, gVI, gVII, gVIII, gIX, or gX, or any combination thereof. For example, some M13 selection phages provided herein comprise a nucleic acid sequence encoding a BoNT protease, such as BoNT X or variants thereof, to be evolved, e.g., under the control of an M13 promoter, and lack all or part of a gene encoding a protein required for the generation of infective phage particles, e.g., the gIII gene encoding the pIII protein.

The term “helper phage,” as used herein interchangeably with the terms “helper phagemid” and “helper plasmid,” refers to a nucleic acid construct comprising a phage gene required for the phage life cycle, or a plurality of such genes, but lacking a structural element required for genome packaging into a phage particle. For example, a helper phage may provide a wild-type phage genome lacking a phage origin of replication. In some embodiments, a helper phage is provided that comprises a gene required for the generation of phage particles, but lacks a gene required for the generation of infectious particles, for example, a full-length pIII gene. In some embodiments, the helper phage provides only some, but not all, genes required for the generation of phage particles. Helper phages are useful to allow modified phages that lack a gene required for the generation of phage particles to complete the phage life cycle in a host cell. Typically, a helper phage will comprise the genes required for the generation of phage particles that are lacking in the phage genome, thus complementing the phage genome. In the continuous evolution context, the helper phage typically complements the selection phage, but both lack a phage gene required for the production of infectious phage particles.

The term “replication product,” as used herein, refers to a nucleic acid that is the result of viral genome replication by a host cell. This includes any viral genomes synthesized by the host cell from a viral genome inserted into the host cell. The term includes nonmutated as well as mutated replication products.

The term “accessory plasmid,” as used herein, refers to a plasmid comprising a gene required for the generation of infectious viral particles under the control of a conditional promoter. In the context of continuous evolution described herein, the conditional promoter of the accessory plasmid is typically activated by a function of the gene of interest to be evolved. Accordingly, the accessory plasmid serves the function of conveying a competitive advantage (in the case of positive selection) to those viral vectors in a given population of viral vectors that carry a gene of interest able to activate the conditional promoter. Only viral vectors carrying an “activating” gene of interest will be able to induce expression of the gene required to generate infectious viral particles in the host cell, and, thus, allow for packaging and propagation of the viral genome in the flow of host cells. Vectors carrying nonactivating versions of the gene of interest, on the other hand, will not induce expression of the gene required to generate infectious viral vectors, and, thus, will not be packaged into viral particles that can infect fresh host cells.

In some embodiments, the conditional promoter of the accessory plasmid is a promoter the transcriptional activity of which can be regulated over a wide range, for example, over 2, 3, 4, 5, 6, 7, 8, 9, or 10 orders of magnitude by the activating function, for example, function of a protein encoded by the gene of interest. In some embodiments, the level of transcriptional activity of the conditional promoter depends directly on the desired function of the gene of interest. This allows for starting a continuous evolution process with a viral vector population comprising versions of the gene of interest that only show minimal activation of the conditional promoter. In the process of continuous evolution, any mutation in the gene of interest that increases activity of the conditional promoter directly translates into higher expression levels of the gene required for the generation of infectious viral particles, and, thus, into a competitive advantage over other viral vectors carrying minimally active or loss-of-function versions of the gene of interest.
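By way of illustration only, the competitive advantage described above can be sketched with a toy model in which each viral vector variant produces infectious progeny in proportion to how strongly its gene of interest activates the conditional promoter. This is a simplified, deterministic sketch rather than a model taken from this disclosure; the function name, variant labels, and activity values are hypothetical.

```python
def propagate(freqs: dict, activity: dict, generations: int) -> dict:
    """Return variant frequencies after repeated rounds of positive selection.

    freqs:    variant name -> starting fraction of the vector population
    activity: variant name -> relative activation of the conditional promoter
              (hypothetical units; assumed proportional to infectious progeny)
    """
    current = dict(freqs)
    for _ in range(generations):
        # Each variant contributes progeny in proportion to frequency * activity.
        weighted = {v: current[v] * activity[v] for v in current}
        total = sum(weighted.values())
        if total == 0:
            raise ValueError("no variant produces infectious progeny")
        # Renormalize so the fractions again sum to 1.
        current = {v: w / total for v, w in weighted.items()}
    return current


# Hypothetical example: a rare variant with 10-fold higher promoter activation.
start = {"minimal": 0.999, "improved": 0.001}
act = {"minimal": 1.0, "improved": 10.0}
after = propagate(start, act, generations=3)
```

In this hypothetical example, a variant that begins at 0.1% of the population and activates the promoter ten times more strongly makes up about half of the population after three generations.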

The term “mutagen,” as used herein, refers to an agent that induces mutations or increases the rate of mutation in a given biological system, for example, a host cell, to a level above the naturally-occurring level of mutation in that system. Some exemplary mutagens useful for continuous evolution procedures are provided elsewhere herein and other useful mutagens will be evident to those of skill in the art. Useful mutagens include, but are not limited to, ionizing radiation, ultraviolet radiation, base analogs, deaminating agents (e.g., nitrous acid), intercalating agents (e.g., ethidium bromide), alkylating agents (e.g., ethylnitrosourea), transposons, bromine, azide salts, psoralen, benzene, 3-chloro-4-(dichloromethyl)-5-hydroxy-2(5H)-furanone (MX) (CAS no. 77439-76-0), O,O-dimethyl-S-(phthalimidomethyl)phosphorodithioate (phos-met) (CAS no. 732-11-6), formaldehyde (CAS no. 50-00-0), 2-(2-furyl)-3-(5-nitro-2-furyl)acrylamide (AF-2) (CAS no. 3688-53-7), glyoxal (CAS no. 107-22-2), 6-mercaptopurine (CAS no. 50-44-2), N-(trichloromethylthio)-4-cyclohexane-1,2-dicarboximide (captan) (CAS no. 133-06-2), 2-aminopurine (CAS no. 452-06-2), methyl methane sulfonate (MMS) (CAS no. 66-27-3), 4-nitroquinoline 1-oxide (4-NQO) (CAS no. 56-57-5), N4-aminocytidine (CAS no. 57294-74-3), sodium azide (CAS no. 26628-22-8), N-ethyl-N-nitrosourea (ENU) (CAS no. 759-73-9), N-methyl-N-nitrosourea (MNU) (CAS no. 820-60-0), 5-azacytidine (CAS no. 320-67-2), cumene hydroperoxide (CHP) (CAS no. 80-15-9), ethyl methanesulfonate (EMS) (CAS no. 62-50-0), N-ethyl-N-nitro-N-nitrosoguanidine (ENNG) (CAS no. 4245-77-6), N-methyl-N-nitro-N-nitrosoguanidine (MNNG) (CAS no. 70-25-7), 5-diazouracil (CAS no. 2435-76-9), and t-butyl hydroperoxide (BHP) (CAS no. 75-91-2). Additional mutagens can be used in continuous evolution procedures as provided herein, and the invention is not limited in this respect.

Ideally, a mutagen is used at a concentration or level of exposure that induces a desired mutation rate in a given host cell or viral vector population, but is not significantly toxic to the host cells used within the average time frame a host cell is exposed to the mutagen or the time a host cell is present in the host cell flow before being replaced by a fresh host cell.

The term “mutagenesis plasmid,” as used herein, refers to a plasmid comprising a gene encoding a gene product that acts as a mutagen. In some embodiments, the gene encodes a DNA polymerase lacking a proofreading capability. In some embodiments, the gene is a gene involved in the bacterial SOS stress response, for example, a UmuC, UmuD', or RecA gene. In some embodiments, the gene is a GATC methylase gene, for example, a deoxyadenosine methylase (dam methylase) gene. In some embodiments, the gene is involved in binding of hemimethylated GATC sequences, for example, a seqA gene. In some embodiments, the gene is involved with repression of mutagenic nucleobase export, for example, emrR. In some embodiments, the gene is involved with inhibition of uracil DNA-glycosylase, for example, a Uracil Glycosylase Inhibitor (ugi) gene. In some embodiments, the gene is involved with deamination of cytidine (e.g., a cytidine deaminase from Petromyzon marinus), for example, cytidine deaminase 1 (CDA1). In some embodiments, the mutagenesis-promoting gene is under the control of an inducible promoter. In some embodiments, a bacterial host cell population is provided in which the host cells comprise a mutagenesis plasmid in which a dnaQ926, UmuC, UmuD', and RecA expression cassette is controlled by an arabinose-inducible promoter. In some such embodiments, the population of host cells is contacted with the inducer, for example, arabinose, in an amount sufficient to induce an increased rate of mutation. In some embodiments, the mutagenesis plasmid is an MP4 mutagenesis plasmid or an MP6 mutagenesis plasmid. The MP4 and MP6 mutagenesis plasmids are described, for example, in PCT Application PCT/US2016/027795, filed April 15, 2016, published as WO 2016/168631 on October 20, 2016, the content of which is incorporated herein in its entirety. The MP4 mutagenesis plasmid comprises the following genes: dnaQ926, dam, and seqA. The MP6 mutagenesis plasmid comprises the following genes: dnaQ926, dam, seqA, emrR, Ugi, and CDA1.

The term “cell,” as used herein, refers to a cell derived from an individual organism, for example, from a mammal. A cell may be a prokaryotic cell or a eukaryotic cell. In some embodiments, the cell is a eukaryotic cell, for example, a human cell, a mouse cell, a dog cell, a cat cell, a horse cell, a guinea pig cell, a pig cell, a hamster cell, a non-human primate (e.g., monkey) cell, etc. In some embodiments, a cell is obtained from a subject having pain. In some embodiments, a cell is obtained from a subject having chronic pain, neuropathic pain, and/or inflammatory pain. In some embodiments, the cell is in a subject (e.g., the cell is in vivo). In some embodiments, the cell is intact (e.g., the outer membrane of the cell, such as the plasma membrane, is intact or not permeabilized).

The term “intracellular environment,” as used herein, refers to the aqueous biological fluid (e.g., cytosol or cytoplasm) forming the microenvironment contained by the outer membrane of a cell. For example, in a subject, an intracellular environment may include the cytoplasm of a cell or cells of a target organ or tissue (e.g., the nucleoplasm of the nucleus of a cell). In another example, an intracellular environment is the cytoplasm of a cell or cells surrounded by cell culture growth media housed in an in vitro culture vessel, such as a cell culture plate or flask.

The term “subject,” as used herein, refers to an individual organism, for example, an individual mammal. In some embodiments, the subject is a human. In some embodiments, the subject is a non-human mammal. In some embodiments, the subject is a non-human primate. In some embodiments, the subject is a rodent (e.g., mouse, rat, hamster, guinea pig, etc.). In some embodiments, the subject is a sheep, a goat, a cow, a cat, or a dog. In some embodiments, the subject is a vertebrate, an amphibian, a reptile, a fish, an insect, a fly, or a nematode. In some embodiments, the subject is genetically engineered, e.g., a genetically engineered non-human subject. The subject may be of any sex and at any stage of development. In some embodiments, the subject suffers from chronic pain, inflammatory pain, and/or neuropathic pain.

The “percent identity” of two amino acid sequences may be determined using algorithms or computer programs, for example, the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-77, 1993. Such an algorithm is incorporated into various computer programs, for example NBLAST and XBLAST programs (version 2.0) of Altschul et al., J. Mol. Biol. 215:403-10, 1990. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the protein molecules of interest. Where gaps exist between two sequences, Gapped BLAST can be utilized as described in Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. BLAST nucleotide searches can be performed with the NBLAST nucleotide program parameters set, e.g., for score=100, wordlength=12 to obtain nucleotide sequences homologous to a nucleic acid molecule described herein. BLAST protein searches can be performed with the XBLAST program parameters set, e.g., to score=50, wordlength=3 to obtain amino acid sequences homologous to a protein molecule described herein. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul, S F et al., (1997) Nuc. Acids Res. 25:3389-3402. Alternatively, PSI-BLAST can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI-BLAST programs, the default parameters of the respective programs (e.g., of XBLAST and NBLAST) can be used (see, e.g., National Center for Biotechnology Information (NCBI) on the worldwide web, ncbi.nlm.nih.gov).
Another specific, nonlimiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, 1988, CABIOS 4:11-17. Such an algorithm is incorporated in the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically only exact matches are counted.
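By way of illustration only, the counting of exact matches in a percent identity calculation can be sketched as follows for two sequences that have already been aligned (for example, by one of the programs named above). The sketch is not the BLAST or ALIGN implementation; the function name, the use of "-" as the gap character, and the choice of denominator are assumptions, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Only exact residue matches are counted. Columns that are gaps ("-") in both
    sequences are ignored. If count_gaps is False, columns with a gap in either
    sequence are also left out of the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information
        is_gap = x == "-" or y == "-"
        if is_gap and not count_gaps:
            continue  # skip gapped column when gaps are not allowed to count
        compared += 1
        if not is_gap and x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned peptides: 5 exact matches, 1 mismatch, 1 gapped column.
with_gaps = percent_identity("ACDE-FG", "ACDQKFG")                       # 5 of 7 columns
without_gaps = percent_identity("ACDE-FG", "ACDQKFG", count_gaps=False)  # 5 of 6 columns
```

The two calls show how the same alignment yields different values depending on whether gapped columns count toward the denominator, which is why the convention used should be stated alongside any reported percent identity.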

DETAILED DESCRIPTION

Aspects of the disclosure relate to compositions and methods for cleaving intracellular protein targets (e.g., GCH1). Cleavage of intracellular GCH1 (e.g., GCH1 present in DRG neurons) by the BoNT protease variants provided herein results in a reduction of intracellular levels of BH4 below pathological pain levels, thereby decreasing pain. By targeting intracellular GCH1 (e.g., peripherally, e.g., in DRG), pain can be reduced without affecting normal levels of BH4 in other cells, organs, and/or systems (e.g., in the brain or endothelial cells). In some embodiments, the GCH1 being targeted for proteolysis is human GCH1 comprising the sequence set forth in SEQ ID NO: 2. The BoNT protease variants (e.g., GCH1 cleaving polypeptides) provided herein may be used to treat pain in a human subject.

Some aspects of this disclosure are based on the recognition that certain directed evolution technologies, for example, PACE and PANCE, can be employed to alter the target site of a protease and to create protein variants that cleave intracellular proteins (e.g., GTP cyclohydrolase 1 (GCH1)). The evolution includes positive and negative selection systems that bias evolution of a BoNT protease towards production of evolved protein variants (e.g., BoNT X protease variants) that cleave GCH1. In some embodiments, protein variants described herein are evolved from wild-type Botulinum toxin (BoNT) proteases, for example, BoNT X. In certain embodiments, protein variants described herein are evolved from a procaspase-1 cleaving polypeptide (e.g., BoNT X(3015)8). For example, in some embodiments, BoNT X proteases are first evolved to cleave procaspase-1 (e.g., SEQ ID NO: 5), and the evolved BoNT protease variants (e.g., BoNT X(3015)8, SEQ ID NO: 9) are further evolved to cleave GTP cyclohydrolase 1 (GCH1) (e.g., SEQ ID NO: 2). In some embodiments, the GCH1-cleaving polypeptides cleave target sequences found in GCH1 (e.g., SEQ ID NO: 4 or SEQ ID NO: 3). Proteases may require many successive mutations to remodel complex networks of contacts with protein substrates and are thus not readily manipulated by conventional, iterative evolution methods. Continuous evolution strategies, which require little or no researcher intervention between generations, are therefore well-suited to evolve proteases, such as BoNT proteases, e.g., BoNT X or variants thereof. The ability of PACE and PANCE to perform the equivalent of hundreds of rounds of iterative evolution methods within days enables complex protease evolution experiments that are impractical with conventional methods. This disclosure provides data demonstrating the use of PACE and PANCE evolution to evolve BoNT proteases (e.g., BoNT X) to cleave GCH1.
As described in the Examples, wild-type BoNT X protease (SEQ ID NO: 1), which normally cleaves the VAMP1 target sequence (e.g., SEQ ID NO: 8), was first evolved by PACE and PANCE to cleave a target sequence (e.g., SEQ ID NO: 6) found in procaspase-1, which is not a native substrate of BoNT proteases. This evolved BoNT protease variant is referred to as BoNT X(3015)8 (SEQ ID NO: 9). BoNT X(3015)8 was then further evolved by PANCE to cleave a target sequence found in GCH1 (e.g., SEQ ID NO: 4), which is also not a native substrate of BoNT proteases.

After constructing a pathway of evolutionary stepping-stones and performing iterative evolutions using PANCE, it was observed that the resulting BoNT protease variants (e.g., BoNT X variants) contain up to 14 amino acid substitutions relative to the procaspase-1 cleaving polypeptide, BoNT X(3015)8 (SEQ ID NO: 9), and up to 28 amino acid substitutions relative to wild-type BoNT X protease (e.g., SEQ ID NO: 1), and cleave human GCH1 (e.g., SEQ ID NO: 2) at the intended target peptide bond. Together, the work described herein provides novel proteins resulting from directed evolution with changed substrate specificities and the ability to cleave proteins implicated in neuropathic and inflammatory pain signals in humans.

The evolution of a protease that can degrade a non-canonical target protein of interest often necessitates changing substrate sequence specificity at more than one position, and thus may require many generations of evolution. Continuous evolution strategies, which require little or no researcher intervention between generations, therefore are well-suited to evolve proteases capable of cleaving a target protein (e.g., GCH1) that differs substantially in sequence from the preferred substrate of a wild-type protease. In phage-assisted continuous evolution (PACE), a population of evolving selection phage (SP) is continuously diluted in a fixed-volume vessel by an incoming culture of host cells, e.g., E. coli. The SP is a modified phage genome in which the evolving gene of interest has replaced gene III (gIII), a gene essential for phage infectivity. If the evolving gene of interest possesses the desired activity, it will trigger expression of gene III from an accessory plasmid (AP) in the host cell, thus producing infectious progeny encoding active variants of the evolving gene. The mutation rate of the SP is controlled using an inducible mutagenesis plasmid (MP), such as MP6, which upon induction increases the mutation rate of the SP by >300,000-fold. Because the rate of continuous dilution is slower than phage replication but faster than E. coli replication, mutations only accumulate in the SP.
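By way of non-limiting illustration, the washout behavior underlying the continuous dilution described above can be sketched with a simple exponential model. The sketch below is an assumption-laden simplification, not a description of the actual PACE apparatus: the function name, the rate values, and the time span are all hypothetical, and phage titer is assumed to change at a net rate equal to the replication rate minus the dilution rate.

```python
import math


def phage_fold_change(replication_rate: float, dilution_rate: float, hours: float) -> float:
    """Fold change in phage titer for a simple exponential model.

    Assumes d(titer)/dt = (replication_rate - dilution_rate) * titer,
    with both rates expressed per hour. This is an illustrative
    simplification with hypothetical parameters.
    """
    return math.exp((replication_rate - dilution_rate) * hours)


# Hypothetical values: phage encoding an active variant replicate faster
# than the vessel is diluted; phage encoding an inactive variant do not
# produce infectious progeny and are only diluted.
active_fold = phage_fold_change(replication_rate=2.0, dilution_rate=1.0, hours=10.0)
inactive_fold = phage_fold_change(replication_rate=0.0, dilution_rate=1.0, hours=10.0)

print(f"active variant fold change:   {active_fold:.3g}")
print(f"inactive variant fold change: {inactive_fold:.3g}")
```

Under these assumed values, phage whose replication outpaces dilution increase in titer, while phage that cannot produce infectious progeny are progressively washed out of the vessel.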

The PACE technology has been described previously, for example, in U.S. Patent No. 9,023,594, issued May 5, 2015; U.S. Patent No. 9,771,574, issued September 26, 2017; U.S. Patent Application Serial No. 15/713,403, filed September 22, 2017; International PCT Application PCT/US2009/056194, filed September 8, 2009, published as WO 2010/028347, on March 11, 2010; U.S. Provisional Patent Application Serial No. 61/426,139, filed December 22, 2010; U.S. Patent No. 9,394,537, issued July 19, 2016; U.S. Patent No. 10,336,997, issued July 2, 2019; U.S. Patent No. 11,214,792, issued January 4, 2022; International PCT Application PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381, on June 28, 2012; U.S. Provisional Patent Application Serial No. 61/929,378, filed January 20, 2014; U.S. Patent No. 10,179,911, issued January 15, 2019; U.S. Patent Application Serial No. 16/238,386, filed January 2, 2019; International PCT Application PCT/US2015/012022, filed January 20, 2015; U.S. Provisional Patent Application Serial No. 62/158,982, filed May 8, 2015; U.S. Provisional Patent Application Serial No. 62/187,669, filed July 1, 2015; U.S. Provisional Patent Application Serial No. 62/067,194, filed October 22, 2014; U.S. Patent No. 10,920,208, issued February 16, 2021; International PCT Application PCT/US2018/048134, filed August 27, 2018, published as WO 2019/040935, on February 28, 2019; U.S. Patent No. 9,267,127, issued February 23, 2016; International PCT Application PCT/US2015/057012, filed October 22, 2015, published as WO 2016/077052, on May 19, 2016; International PCT Application PCT/US2016/027795, filed April 15, 2016, published as WO 2016/168631, on October 20, 2016; and International PCT Application PCT/US2018/051557, filed September 18, 2018, published as WO 2019/056002, on March 21, 2019, the entire contents of each of which are incorporated herein by reference.

The PACE system may also be adapted into the format of PANCE (phage-assisted non-continuous evolution), a non-continuous form of PACE in which cultures propagate phage in wells through multiple generations but undergo serial daily passaging in lieu of continuous flow, permitting a less stringent and more sensitive initial selection. PANCE has been described previously, for example, in Miller et al. Nature Protoc. 2020 Dec;15(12):4101-4127, and International PCT Application PCT/US2020/042016, published as WO 2021/011579, the entire contents of each of which are incorporated herein by reference. PACE and PANCE are useful in evolving BoNT proteases (e.g., BoNT X) to cleave intracellular targets (e.g., GCH1). For example, the evolution described herein includes positive and negative selection systems that bias evolution of a BoNT protease towards production of evolved protein variants (e.g., BoNT X protease variants) that cleave GCH1.

BoNT Protease Variants

BoNT protease variants (e.g., GCH1 cleaving polypeptides) disclosed herein are protein variants evolved from BoNT proteases to target a novel substrate (compared to a BoNT’s native or canonical substrate). In some embodiments, the BoNT protease variants have one or more amino acid variations introduced into the amino acid sequence, e.g., as a result of application of the PACE/PANCE methods or by genetic engineering, as compared to the amino acid sequence of a naturally-occurring or wild-type BoNT protein (e.g., SEQ ID NO: 1). Amino acid sequence variations may include one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more) mutated residues within the amino acid sequence of the protease, e.g., as a result of a substitution of one amino acid for another, the deletion of one or more amino acids (e.g., a truncated protein), the insertion of one or more amino acids, or any combination of the foregoing. In some embodiments, the amino acid sequence variations include one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more) mutated residues as a result of a substitution of one amino acid for another, relative to a wild-type BoNT protease (e.g., BoNT X) or a starting protease (e.g., a procaspase-1 cleaving polypeptide).

In some embodiments, a BoNT protease variant is evolved by phage-assisted continuous evolution (PACE) and/or phage-assisted non-continuous evolution (PANCE). In some embodiments, an evolved BoNT protease variant requires many generations (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 25, 50 or more generations) of evolution. In some embodiments, the disclosure provides variants of BoNT proteases that are derived from a procaspase-1 cleaving polypeptide (e.g., BoNT X(3015)8, SEQ ID NO: 9). For example, in some embodiments, a BoNT X protease was first evolved to cleave procaspase-1. In some embodiments, the procaspase-1 cleaving polypeptide (e.g., BoNT X(3015)8, SEQ ID NO: 9) was then further evolved to cleave GTP cyclohydrolase 1 (GCH1). In some embodiments, a BoNT protease (e.g., BoNT X protease) was evolved to cleave GCH1. In some embodiments, the BoNT protease variants (e.g., GCH1 cleaving polypeptides) comprise at least one amino acid variation at at least one of the positions selected from the group consisting of N59, N61, A73, A75, I102, I115, K164, A166, Y168, I175, K193, D199, I235, F248, N260, L262, F264, A277, R324, R354, L364, P368, S395, S413, L428, Y430, and N439, relative to SEQ ID NO: 9.

The variation in amino acid sequence generally results from a mutation, insertion, or deletion in a DNA coding sequence. In some embodiments, mutation of a DNA sequence results in a non-synonymous (i.e., conservative, semi-conservative, or radical) amino acid substitution. In some embodiments an insertion or deletion is an “in-frame” insertion or deletion that does not alter the reading frame of the resulting mutant protein.

The amount or level of variation between a starting protease (e.g., a procaspase-1 cleaving polypeptide, e.g., BoNT X(3015)8, SEQ ID NO: 9) and a BoNT protease variant (e.g., GCH1 cleaving polypeptide) provided herein can be expressed as the percent identity of the nucleic acid sequences or amino acid sequences between the two genes or proteins, respectively.

In some embodiments, the amount of variation between a starting protease (e.g., a procaspase-1 cleaving polypeptide, e.g., BoNT X(3015)8, SEQ ID NO: 9) and a BoNT protease variant (e.g., GCH1 cleaving polypeptide) provided herein is expressed as the percent identity at the amino acid sequence level. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is from about 60% to about 99.9% identical, about 70% to about 98% identical, about 75% to about 95% identical, about 80% to about 90% identical, about 85% to about 95% identical, or about 95% to about 99% identical to the sequence set forth in SEQ ID NO: 9. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises an amino acid sequence that is at least 60% identical to the sequence set forth in SEQ ID NO: 9. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises an amino acid sequence that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% identical to the sequence set forth in SEQ ID NO: 9.
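By way of non-limiting illustration, percent identity at the amino acid level may be computed as the fraction of matching positions between two sequences. The Python sketch below assumes two ungapped sequences of equal length that differ only by substitutions; sequences that differ by insertions, deletions, or terminal extensions would first need to be aligned with a sequence alignment tool. The function name and the 10-residue toy sequences are hypothetical and are not sequences of this disclosure.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent of positions at which two equal-length sequences match.

    Illustrative only: assumes an ungapped, position-by-position
    comparison (substitutions only, no insertions or deletions).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length for an ungapped comparison")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(ref == var for ref, var in zip(reference, variant))
    return 100.0 * matches / len(reference)


# Hypothetical toy sequences (not from the sequence listing).
toy_reference = "MKLNAYDGTS"
toy_variant = "MKLNTYDGTS"  # one substitution (A5T) out of 10 residues

print(percent_identity(toy_reference, toy_variant))  # 90.0
```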

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is about 70%, about 71%, about 72%, about 73%, about 74%, about 75%, about 76%, about 77%, about 78%, about 79%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 99.9% identical to the sequence set forth in SEQ ID NO: 9. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is about 70%, about 71%, about 72%, about 73%, about 74%, about 75%, about 76%, about 77%, about 78%, about 79%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 99.9% identical to the sequence set forth in SEQ ID NO: 9, and comprises an amino acid substitution at one or more of the following positions: N59, N61, A73, A75, I102, I115, K164, A166, Y168, I175, K193, D199, I235, F248, N260, L262, F264, A277, R324, R354, L364, P368, S395, S413, L428, Y430, and N439.

Some aspects of the disclosure provide BoNT protease variants (e.g., GCH1 cleaving polypeptides) having between about 80% and about 99.9% (e.g., about 80%, about 80.5%, about 81%, about 81.5%, about 82%, about 82.5%, about 83%, about 83.5%, about 84%, about 84.5%, about 85%, about 85.5%, about 86%, about 86.5%, about 87%, about 87.5%, about 88%, about 88.5%, about 89%, about 89.5%, about 90%, about 90.5%, about 91%, about 91.5%, about 92%, about 92.5%, about 93%, about 93.5%, about 94%, about 94.5%, about 95%, about 95.5%, about 96%, about 96.5%, about 97%, about 97.5%, about 98%, about 98.5%, about 99%, about 99.2%, about 99.4%, about 99.6%, about 99.8%, or about 99.9%) identity to the sequence set forth in SEQ ID NO: 9. In some embodiments, the BoNT protease variant (e.g., GCH1 cleaving polypeptide) is no more than 99.9% identical to the sequence set forth in SEQ ID NO: 9. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is between about 80% and about 99.9% (e.g., about 80%, about 80.5%, about 81%, about 81.5%, about 82%, about 82.5%, about 83%, about 83.5%, about 84%, about 84.5%, about 85%, about 85.5%, about 86%, about 86.5%, about 87%, about 87.5%, about 88%, about 88.5%, about 89%, about 89.5%, about 90%, about 90.5%, about 91%, about 91.5%, about 92%, about 92.5%, about 93%, about 93.5%, about 94%, about 94.5%, about 95%, about 95.5%, about 96%, about 96.5%, about 97%, about 97.5%, about 98%, about 98.5%, about 99%, about 99.2%, about 99.4%, about 99.6%, about 99.8%, or about 99.9%) identical to the sequence set forth in SEQ ID NO: 9, and comprises an amino acid substitution at one or more of the following positions: N59, N61, A73, A75, I102, I115, K164, A166, Y168, I175, K193, D199, I235, F248, N260, L262, F264, A277, R324, R354, L364, P368, S395, S413, L428, Y430, and N439.

Some aspects of the disclosure provide BoNT protease variants (e.g., GCH1 cleaving polypeptides) having between 1 and 15 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) amino acid substitutions (e.g., mutations) relative to SEQ ID NO: 9. Some aspects of the disclosure provide BoNT protease variants (e.g., GCH1 cleaving polypeptides) having more than 15 (e.g., 16, 17, 18, 19, 20, 25, 30, 35, 40, etc.) amino acid substitutions (e.g., mutations) relative to SEQ ID NO: 9. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) has 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 amino acid substitutions relative to SEQ ID NO: 9. The mutations disclosed herein are not exclusive of other mutations which may occur or be introduced. For example, a protease variant may have a mutation as described herein in addition to at least one mutation not described herein (e.g., 1, 2, 3, 4, 5, etc. additional mutations).

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises one or more amino acid substitutions at a position selected from N59, N61, A73, A75, I102, I115, K164, A166, Y168, I175, K193, D199, I235, F248, N260, L262, F264, A277, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 9. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises one or more amino acid substitutions selected from N59D, N61S, A73T, A75V, I102L, I115V, K164E, A166T, Y168C, I175T, K193R, D199G, I235M, F248V, N260K, L262F, F264V, A277V, R324H, R354S, L364R, P368L, S395L, S413F, L428S, Y430N, Y430C, and N439T relative to SEQ ID NO: 9. In some embodiments, a GCH1 cleaving polypeptide having a N439T mutation relative to SEQ ID NO: 9 further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).
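By way of non-limiting illustration, a substitution label such as A166T denotes the original residue (A), the 1-based position in the reference sequence (166), and the replacement residue (T). The Python sketch below shows one way such labels could be parsed and applied to a reference sequence, checking that the reference actually carries the stated original residue at each position. The function names and the 10-residue toy sequence are hypothetical; the toy sequence is not a sequence of this disclosure.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'A166T' into ('A', 166, 'T')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, labels: list[str]) -> str:
    """Return a copy of `sequence` with each labelled substitution applied."""
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"{label}: reference has {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy reference (not from the sequence listing).
toy_reference = "MKLNAYDGTS"
toy_variant = apply_substitutions(toy_reference, ["A5T", "Y6C"])
print(toy_variant)  # MKLNTCDGTS
```

Checking the original residue before substituting helps catch a label that was written against a different reference sequence (for example, a label numbered relative to SEQ ID NO: 1 applied to SEQ ID NO: 9).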

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitution relative to SEQ ID NO: 9: A166T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: D199G, I235M, F248V, P368L, and L428S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: D199G, I235M, F248V, P368L, L428S, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: D199G, I235M, F248V, P368L, L428S, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, Y168C, and A277V. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, Y168C, A277V, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, Y168C, A277V, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, I235M, and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, I235M, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, I235M, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A75V, A166T, A277V, and R354S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A75V, A166T, A277V, R354S, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A75V, A166T, A277V, R354S, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T and S395L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, S395L, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, S395L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A166T and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: I102L, A166T, R324H, P368L, and Y430C. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: I102L, A166T, R324H, P368L, Y430C, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: I102L, A166T, R324H, P368L, Y430C, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A73T, K164E, I175T, K193R, L262F, S413F, and Y430N. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A73T, K164E, I175T, K193R, L262F, S413F, Y430N, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: A73T, K164E, I175T, K193R, L262F, S413F, Y430N, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: N59D, A75V, I115V, A166T, I235M, L262F, F264V, L364R, and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: N59D, A75V, I115V, A166T, I235M, L262F, F264V, L364R, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: N59D, A75V, I115V, A166T, I235M, L262F, F264V, L364R, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: N61S, A73T, K164E, K193R, N260K, L262F, and S413F. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: N61S, A73T, K164E, K193R, N260K, L262F, S413F, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 9: N61S, A73T, K164E, K193R, N260K, L262F, S413F, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

In some embodiments, the disclosure provides variants of BoNT proteases that comprise at least one amino acid variation at at least one position relative to SEQ ID NO: 1. In some embodiments, the disclosure provides variants of BoNT proteases that comprise at least one amino acid variation in at least one of the positions selected from N59, N61, E72, A73, A75, I102, E113, I115, I119, D161, N164, A166, T167, Y168, Y171, P174, I175, K193, Y199, N210, A218, N235, S240, F248, K252, N260, L262, F264, A277, S280, Y314, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 1.

The amount or level of variation between a wild-type BoNT protease (e.g., BoNT X) and a BoNT protease variant (e.g., GCH1 cleaving polypeptide) provided herein can be expressed as the percent identity of the nucleic acid sequences or amino acid sequences between the two genes or proteins, respectively.

In some embodiments, the amount of variation between a wild-type BoNT protease (e.g., BoNT X) and a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is expressed as the percent identity at the amino acid sequence level. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is from about 50% to about 99.9% identical, about 60% to about 98% identical, about 75% to about 95% identical, about 80% to about 90% identical, about 85% to about 95% identical, or about 95% to about 99% identical to the sequence set forth in SEQ ID NO: 1. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises an amino acid sequence that is at least 50% identical to the sequence set forth in SEQ ID NO: 1. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises an amino acid sequence that is at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99% identical to the sequence set forth in SEQ ID NO: 1.

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is about 60%, about 61%, about 62%, about 63%, about 64%, about 65%, about 66%, about 67%, about 68%, about 69%, about 70%, about 71%, about 72%, about 73%, about 74%, about 75%, about 76%, about 77%, about 78%, about 79%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 99.9% identical to the sequence set forth in SEQ ID NO: 1. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is about 60%, about 61%, about 62%, about 63%, about 64%, about 65%, about 66%, about 67%, about 68%, about 69%, about 70%, about 71%, about 72%, about 73%, about 74%, about 75%, about 76%, about 77%, about 78%, about 79%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 99.9% identical to the sequence set forth in SEQ ID NO: 1, and comprises an amino acid substitution at one or more of the following positions: N59, N61, A73, E72, A75, I102, E113, I115, I119, D161, N164, A166, T167, Y168, Y171, P174, I175, K193, Y199, N210, A218, N235, S240, F248, K252, N260, L262, F264, A277, S280, Y314, R324, R354, L364, P368, S395, S413, L428, Y430, and N439.

Some aspects of the disclosure provide BoNT protease variants (e.g., GCH1 cleaving polypeptides) having between about 70% and about 99.9% (e.g., about 70%, about 70.5%, about 71%, about 71.5%, about 72%, about 72.5%, about 73%, about 73.5%, about 74%, about 74.5%, about 75%, about 75.5%, about 76%, about 76.5%, about 77%, about 77.5%, about 78%, about 78.5%, about 79%, about 79.5%, about 80%, about 80.5%, about 81%, about 81.5%, about 82%, about 82.5%, about 83%, about 83.5%, about 84%, about 84.5%, about 85%, about 85.5%, about 86%, about 86.5%, about 87%, about 87.5%, about 88%, about 88.5%, about 89%, about 89.5%, about 90%, about 90.5%, about 91%, about 91.5%, about 92%, about 92.5%, about 93%, about 93.5%, about 94%, about 94.5%, about 95%, about 95.5%, about 96%, about 96.5%, about 97%, about 97.5%, about 98%, about 98.5%, about 99%, about 99.2%, about 99.4%, about 99.6%, about 99.8%, or about 99.9%) identity to the sequence set forth in SEQ ID NO: 1. In some embodiments, the BoNT protease variant (e.g., GCH1 cleaving polypeptide) is no more than 99.9% identical to the sequence set forth in SEQ ID NO: 1.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) is between about 70% and about 99.9% (e.g., about 70%, about 70.5%, about 71%, about 71.5%, about 72%, about 72.5%, about 73%, about 73.5%, about 74%, about 74.5%, about 75%, about 75.5%, about 76%, about 76.5%, about 77%, about 77.5%, about 78%, about 78.5%, about 79%, about 79.5%, about 80%, about 80.5%, about 81%, about 81.5%, about 82%, about 82.5%, about 83%, about 83.5%, about 84%, about 84.5%, about 85%, about 85.5%, about 86%, about 86.5%, about 87%, about 87.5%, about 88%, about 88.5%, about 89%, about 89.5%, about 90%, about 90.5%, about 91%, about 91.5%, about 92%, about 92.5%, about 93%, about 93.5%, about 94%, about 94.5%, about 95%, about 95.5%, about 96%, about 96.5%, about 97%, about 97.5%, about 98%, about 98.5%, about 99%, about 99.2%, about 99.4%, about 99.6%, about 99.8%, or about 99.9%) identical to the sequence set forth in SEQ ID NO: 1, and comprises an amino acid substitution at one or more of the following positions: N59, N61, A73, E72, A75, I102, E113, I115, I119, D161, N164, A166, T167, Y168, Y171, P174, I175, K193, Y199, N210, A218, N235, S240, F248, K252, N260, L262, F264, A277, S280, Y314, R324, R354, L364, P368, S395, S413, L428, Y430, and N439.

Some aspects of the disclosure provide BoNT protease variants (e.g., GCH1 cleaving polypeptides) having between 1 and 30 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) amino acid substitutions (e.g., mutations) relative to SEQ ID NO: 1. Some aspects of the disclosure provide BoNT protease variants (e.g., GCH1 cleaving polypeptides) having more than 30 (e.g., 35, 40, 50, 60, etc.) amino acid substitutions (e.g., mutations) relative to SEQ ID NO: 1. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) has 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions relative to SEQ ID NO: 1. The mutations disclosed herein are not exclusive of other mutations which may occur or be introduced. For example, a protease variant may have a mutation as described herein in addition to at least one mutation not described herein (e.g., 1, 2, 3, 4, 5, etc. additional mutations).

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises one or more amino acid substitutions at a position selected from N59, N61, A73, E72, A75, I102, E113, I115, I119, D161, N164, A166, T167, Y168, Y171, P174, I175, K193, Y199, N210, A218, N235, S240, F248, K252, N260, L262, F264, A277, S280, Y314, R324, R354, L364, P368, S395, S413, L428, Y430, and N439 relative to SEQ ID NO: 1. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises one or more amino acid substitutions selected from N59D, N61S, E72R, A73T, A75V, I102L, E113K, I115V, I119V, D161N, N164E, N164K, A166T, T167A, Y168C, Y171D, P174L, I175T, K193R, Y199D, Y199G, N210D, A218V, N235I, N235M, S240V, F248V, K252E, N260K, L262F, F264V, A277V, S280P, Y314S, R324H, R354S, L364R, P368L, S395L, S413F, L428S, Y430C, Y430N, and N439T relative to SEQ ID NO: 1. In some embodiments, a GCH1 cleaving polypeptide having a N439T mutation relative to SEQ ID NO: 1 further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, and Y314S.
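By way of non-limiting illustration, a set of substitutions stated relative to a starting protease can be restated relative to the wild-type protease by composing the two sets position by position. The Python sketch below assumes, for illustration only, that the 16-substitution set recited immediately above describes the starting protease of SEQ ID NO: 9 relative to SEQ ID NO: 1; that correspondence is an assumption of this sketch. The function names are hypothetical.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def to_map(labels: list[str]) -> dict[int, tuple[str, str]]:
    """Map each 1-based position to (original residue, replacement residue)."""
    mapping = {}
    for label in labels:
        match = _SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"not a substitution label: {label!r}")
        mapping[int(match.group(2))] = (match.group(1), match.group(3))
    return mapping


def compose(parent_vs_wild_type: list[str], child_vs_parent: list[str]) -> list[str]:
    """Restate the child's substitutions relative to the wild-type sequence."""
    combined = to_map(parent_vs_wild_type)
    for position, (original, replacement) in to_map(child_vs_parent).items():
        if position in combined:
            wild_type_residue, parent_residue = combined[position]
            if parent_residue != original:
                raise ValueError(f"position {position}: parent has {parent_residue}, not {original}")
            if replacement == wild_type_residue:
                del combined[position]  # reversion to the wild-type residue
            else:
                combined[position] = (wild_type_residue, replacement)
        else:
            combined[position] = (original, replacement)
    return [f"{o}{p}{r}" for p, (o, r) in sorted(combined.items())]


# Assumed for illustration: starting protease relative to SEQ ID NO: 1.
starting_vs_wild_type = [
    "E72R", "E113K", "I119V", "D161N", "N164K", "T167A", "Y171D", "P174L",
    "Y199D", "N210D", "A218V", "N235I", "S240V", "K252E", "S280P", "Y314S",
]
# Substitutions stated relative to the starting protease.
variant_vs_starting = ["D199G", "I235M", "F248V", "P368L", "L428S"]

variant_vs_wild_type = compose(starting_vs_wild_type, variant_vs_starting)
print(", ".join(variant_vs_wild_type))
```

Under that assumption, D199G and I235M relative to the starting protease become Y199G and N235M relative to wild type, because positions 199 and 235 were already substituted (Y199D, N235I) in the starting protease.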

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, and Y314S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, T167A, Y171D, P174L, Y199G, N210D, A218V, N235M, S240V, F248V, K252E, S280P, Y314S, P368L, and L428S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, T167A, Y171D, P174L, Y199G, N210D, A218V, N235M, S240V, F248V, K252E, S280P, Y314S, P368L, L428S, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, T167A, Y171D, P174L, Y199G, N210D, A218V, N235M, S240V, F248V, K252E, S280P, Y314S, P368L, L428S, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y168C, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, A277V, S280P, and Y314S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y168C, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, A277V, S280P, Y314S, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y168C, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, A277V, S280P, Y314S, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235M, S240V, K252E, S280P, Y314S, and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235M, S240V, K252E, S280P, Y314S, P368L, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235M, S240V, K252E, S280P, Y314S, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, A75V, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, A277V, S280P, Y314S, and R354S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, A75V, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, A277V, S280P, Y314S, R354S, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, A75V, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, A277V, S280P, Y314S, R354S, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, and S395L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, S395L, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, S395L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, and Y314S. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, and N439T.
In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49). In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, I102L, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, R324H, P368L, and Y430C. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, I102L, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, R324H, P368L, Y430C, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, I102L, E113K, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235I, S240V, K252E, S280P, Y314S, R324H, P368L, Y430C, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, I175T, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, L262F, S280P, Y314S, S413F, and Y430N.

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, I175T, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, L262F, S280P, Y314S, S413F, Y430N, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, I175T, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, L262F, S280P, Y314S, S413F, Y430N, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: N59D, E72R, A75V, E113K, I115V, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235M, S240V, K252E, L262F, F264V, S280P, Y314S, L364R, and P368L. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: N59D, E72R, A75V, E113K, I115V, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235M, S240V, K252E, L262F, F264V, S280P, Y314S, L364R, P368L, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: N59D, E72R, A75V, E113K, I115V, I119V, D161N, N164K, A166T, T167A, Y171D, P174L, Y199D, N210D, A218V, N235M, S240V, K252E, L262F, F264V, S280P, Y314S, L364R, P368L, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: N61S, E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, N260K, L262F, S280P, Y314S, and S413F.

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: N61S, E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, N260K, L262F, S280P, Y314S, S413F, and N439T. In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises the following amino acid substitutions relative to SEQ ID NO: 1: N61S, E72R, A73T, E113K, I119V, D161N, N164E, T167A, Y171D, P174L, K193R, Y199D, N210D, A218V, N235I, S240V, K252E, N260K, L262F, S280P, Y314S, S413F, and N439T and further comprises a C-terminal extension comprising the sequence NNGDFQHGIAQP (SEQ ID NO: 49).
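The substitution notation used in the preceding paragraphs (wild-type residue, 1-based position, replacement residue, e.g., E72R) can be applied to a reference sequence mechanically. The following is a minimal illustrative sketch only, not a method of the disclosure. The function name `apply_substitutions` and the toy reference sequence are hypothetical, and SEQ ID NO: 1 is not reproduced here.

```python
import re

# Pattern for substitutions written as <wild-type><position><mutant>, e.g. "E72R".
SUB_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference, substitutions, c_terminal_extension=""):
    """Apply point substitutions (1-based positions) to a reference sequence.

    Hypothetical helper for illustration. Raises ValueError if a substitution
    is malformed, out of range, or names a wild-type residue that does not
    match the reference at that position.
    """
    residues = list(reference)
    for sub in substitutions:
        match = SUB_PATTERN.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference")
        if reference[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: reference has {reference[position - 1]} at position {position}"
            )
        residues[position - 1] = mutant
    # An optional C-terminal extension is appended after all substitutions.
    return "".join(residues) + c_terminal_extension


# Toy stand-in for a reference sequence (NOT SEQ ID NO: 1).
toy_reference = "MKEANYT"
variant = apply_substitutions(
    toy_reference,
    ["E3R", "N5D"],
    c_terminal_extension="NNGDFQHGIAQP",  # SEQ ID NO: 49 as given in the text
)
print(variant)
```

Checking the stated wild-type residue against the reference before substituting guards against off-by-one numbering errors when a substitution list is transcribed.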

In some embodiments, a BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises or consists of an amino acid sequence selected from SEQ ID NOs.: 10-23 as provided in Table 1. In some embodiments, a BoNT protease variant has at least 70% sequence identity (e.g., at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or more identity) to a sequence selected from SEQ ID NOs.: 10-23. In some embodiments, a BoNT protease variant comprises or consists of an amino acid sequence set forth in any one of SEQ ID NOs.: 10-23. In some embodiments, the BoNT protease variant (e.g., GCH1 cleaving polypeptide) is a BoNT X protease variant. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a non-natural or novel substrate (compared to its starting substrate, e.g., procaspase-1). In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves proteins comprising an amino acid cleavage sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves proteins comprising the amino acid cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves proteins comprising an amino acid cleavage sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to SSLGENPQRQGLLKT (SEQ ID NO: 3). In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves proteins comprising the amino acid cleavage sequence SSLGENPQRQGLLKT (SEQ ID NO: 3). In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a human GCH1 protein.
In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a human GCH1 protein comprising the sequence set forth in SEQ ID NO: 2.
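As a rough illustration of the percent-identity thresholds recited above, the sketch below compares two equal-length peptide sequences position by position. This is a simplification under stated assumptions: identity between sequences of different lengths is normally computed from an alignment, which is not shown, and the function name and the candidate sequence are hypothetical.

```python
def percent_identity(candidate, reference):
    """Position-wise percent identity for two equal-length sequences.

    Simplified illustration only; no alignment or gap handling is performed.
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must have equal length for this simple comparison")
    matches = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * matches / len(reference)


GCH1_SITE = "ETISDVLNDAIFDEDH"   # SEQ ID NO: 4 as given in the text (16 residues)
candidate = "ETISDVLNDAIFDEDN"   # hypothetical sequence differing at the last residue

identity = percent_identity(candidate, GCH1_SITE)
print(f"{identity:.2f}% identical; meets 80% threshold: {identity >= 80.0}")
```

For a 16-residue sequence, each mismatch costs 6.25 percentage points, so under this simple comparison a sequence with up to three mismatches (81.25%) would still meet an 80% threshold.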

In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves GCH1 with increased selectivity (e.g., 2-fold, 5-fold, 10-fold, 100-fold, etc.) relative to a procaspase-1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves GCH1 with increased selectivity of between 2-fold and 20,000-fold relative to a procaspase-1 protein. In some embodiments, a BoNT X protease variant (e.g., a GCH1 cleaving polypeptide) cleaves GCH1 with increased selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a procaspase-1 protein.

In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a procaspase-1 protein with reduced selectivity (e.g., 2-fold, 5-fold, 10-fold, 100-fold, etc.) relative to a GCH1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a procaspase-1 protein with reduced selectivity of between 2-fold and 20,000-fold relative to a GCH1 protein. In some embodiments, a BoNT X protease variant (e.g., a GCH1 cleaving polypeptide) cleaves a procaspase-1 protein with reduced selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a GCH1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) does not cleave a procaspase-1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) does not cleave a procaspase-1 protein comprising the sequence set forth in SEQ ID NO: 5.

In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves GCH1 with increased selectivity (e.g., 2-fold, 5-fold, 10-fold, 100-fold, etc.) relative to a VAMP1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves GCH1 with increased selectivity of between 2-fold and 20,000-fold relative to a VAMP1 protein. In some embodiments, a BoNT X protease variant (e.g., a GCH1 cleaving polypeptide) cleaves a GCH1 protein with increased selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a VAMP1 protein.

In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a VAMP1 protein with reduced selectivity (e.g., 2-fold, 5-fold, 10-fold, 100-fold, etc.) relative to a GCH1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) cleaves a VAMP1 protein with reduced selectivity of between 2-fold and 20,000-fold relative to a GCH1 protein. In some embodiments, a BoNT X protease variant (e.g., a GCH1 cleaving polypeptide) cleaves a VAMP1 protein with reduced selectivity of about 10-fold to about 100-fold, about 50-fold to about 500-fold, about 100-fold to about 1000-fold, about 500-fold to about 5000-fold, about 750-fold to about 10000-fold, or about 10000-fold to about 20000-fold relative to a GCH1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) does not cleave a VAMP1 protein. In some embodiments, a BoNT X protease variant (e.g., GCH1 cleaving polypeptide) does not cleave a VAMP1 protein comprising the sequence set forth in SEQ ID NO: 7.
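The fold-selectivity ranges above can be illustrated with a simple ratio. The text does not state how selectivity is quantified, so the sketch below assumes, purely for illustration, that it is the ratio of a measured cleavage activity on GCH1 to the same measurement on an off-target substrate (e.g., procaspase-1 or VAMP1). The function names and numerical values are hypothetical.

```python
def fold_selectivity(on_target_activity, off_target_activity):
    """Ratio of on-target to off-target activity (assumed definition, see lead-in)."""
    if on_target_activity < 0 or off_target_activity <= 0:
        raise ValueError("activities must be non-negative, with off-target greater than zero")
    return on_target_activity / off_target_activity


def within_range(fold, low=2.0, high=20000.0):
    """Check whether a fold value lies in a recited range (default 2-fold to 20,000-fold)."""
    return low <= fold <= high


# Hypothetical activity measurements in arbitrary but identical units.
gch1_activity = 50.0
procaspase1_activity = 0.5

fold = fold_selectivity(gch1_activity, procaspase1_activity)
print(f"{fold:.0f}-fold selectivity; within 2-fold to 20,000-fold: {within_range(fold)}")
```

A variant with no detectable off-target cleavage cannot be expressed as a finite ratio, which is why the function rejects an off-target activity of zero rather than returning infinity.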

In some embodiments, evolved BoNT protease variants as described herein may be expressed as a part of a full-length protein comprising a BoNT light chain (LC) and a BoNT heavy chain (HC). Typically, the catalytic protease domain is located in the light chain (LC) of the BoNT. Generally, a BoNT HC comprises a translocation domain (e.g., HCN) and a binding domain (e.g., HCc). In some embodiments, in a wild-type BoNT X HC, the translocation domain comprises SEQ ID NO.: 42. In some embodiments, in a wild-type BoNT X HC, the binding domain comprises SEQ ID NO.: 43. Without wishing to be bound by any particular theory, the binding domain binds to specific receptors typically found on the surface of a cell, and the translocation domain enables the BoNT protease variant to cross cellular membranes, resulting in intracellular delivery of the catalytic domain of the protease, where the BoNT LC cleaves target proteins (e.g., GCH1).

It should be appreciated that evolved BoNT protease variants described herein may comprise an evolved BoNT LC and a wild-type HC, or both an evolved BoNT LC and an evolved HC. In some embodiments, an evolved BoNT protease variant comprises a wild-type BoNT HC. In some embodiments, an evolved BoNT protease variant comprises a BoNT HC having one or more amino acid mutations relative to a wild-type BoNT HC. In some embodiments, an evolved BoNT protease variant comprises a wild-type BoNT LC. In some embodiments, an evolved BoNT protease variant comprises a BoNT LC having one or more amino acid mutations relative to a wild-type BoNT LC. In some embodiments, the receptor-binding domain of the BoNT HC has been replaced by a protein domain capable of binding to a cell surface receptor or ligand. In some embodiments, this protein domain may take the form of an antibody or fragment thereof, lectin, monobody, single-chain variable fragment (scFv), hormone, signaling factor, or other targeting moiety.

The HC and LC of a BoNT may be directly connected (e.g., expressed as a fusion protein) or indirectly connected (e.g., conjugated together or connected using one or more linking molecules).

Fusion Proteins

Aspects of the disclosure relate to fusion proteins. A fusion protein generally refers to a protein comprising a first peptide derived from a first protein that is linked in a contiguous chain to a second peptide derived from a second protein that is different than the first protein. The first and second peptides may be linked directly (e.g., the C-terminus of the first peptide may be directly linked, such as by a peptide bond, to the N-terminus of the second peptide, or vice versa) or indirectly (e.g., the first peptide and second peptide are joined by a linker, such as an amino acid or polymeric linker).

In some aspects, the disclosure provides fusion proteins comprising a BoNT X protease light chain variant of the GCH1 cleaving polypeptide disclosed herein linked to a delivery domain. In some embodiments, the delivery domain is a BoNT X HC domain. In some embodiments, the delivery domain is a pleckstrin homology (PH) domain. In some embodiments, the PH domain is a human PH domain. Examples of human PH domains are shown below (SEQ ID NOs: 44-48). In some embodiments, a PH domain comprises a human phospholipase C delta 1 (PLCδ1) PH domain. In some embodiments, a PH domain has an amino acid sequence that is at least 80% (e.g., at least 80%, 85%, 90%, 95%, 99%, etc.) identical to a sequence set forth in any one of SEQ ID NOs.: 44-48. Additional suitable delivery domains will be apparent to those of skill in the art, and the invention is not limited in this aspect. The disclosure contemplates fusion proteins comprising the GCH1 cleaving polypeptides described herein and any suitable delivery domain.

In some embodiments, the delivery domain and the BoNT X protease light chain variant (e.g., GCH1 cleaving polypeptide) are directly linked together (e.g., the two peptides are bonded together without an intervening linker sequence). In some embodiments, the C-terminus of the delivery domain is linked to the N-terminus of the BoNT X protease light chain variant (e.g., GCH1 cleaving polypeptide). In some embodiments, the BoNT X protease light chain variant (e.g., GCH1 cleaving polypeptide) is modified to lack an N-terminal methionine residue.

In some embodiments, a delivery domain is indirectly linked to a BoNT X protease light chain variant (e.g., GCH1 cleaving polypeptide) via a linker. A linker is generally a peptide linker, for example, a glycine-rich linker (e.g., a poly-glycine-serine linker) or a proline-rich linker (e.g., a poly-Pro linker). The length of the linker may vary. In some embodiments, a linker ranges from about two amino acids in length to about 50 amino acids in length. In some embodiments, a linker comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 amino acids. In some embodiments, a linker comprises more than 25 amino acids, for example 30, 35, 40, 45, or 50 amino acids. In some embodiments, a linker is a non-peptide linker (for example, a polypropylene linker, polyethylene glycol (PEG) linker, etc.). In some embodiments, the BoNT X protease light chain variant (e.g., GCH1 cleaving polypeptide) is catalytically active.
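The linkage options described above (direct or via a peptide linker, delivery-domain C-terminus joined to the light-chain N-terminus, optional removal of the light chain's N-terminal methionine) can be sketched at the sequence level as follows. This is an illustrative sketch only; the function name and the toy sequences are hypothetical and do not correspond to any SEQ ID NO in the disclosure.

```python
def assemble_fusion(delivery_domain, light_chain, linker="", drop_initial_met=True):
    """Concatenate delivery domain, optional peptide linker, and light chain.

    The delivery domain's C-terminus is joined to the light chain's
    N-terminus. If drop_initial_met is True and the light chain starts
    with methionine (M), that residue is removed first.
    """
    if drop_initial_met and light_chain.startswith("M"):
        light_chain = light_chain[1:]
    return delivery_domain + linker + light_chain


# Toy sequences for illustration only.
toy_delivery_domain = "NPDREGW"
toy_light_chain = "MKLEIN"
gly_ser_linker = "GGGGS" * 2  # a 10-residue poly-glycine-serine linker

direct = assemble_fusion(toy_delivery_domain, toy_light_chain)
linked = assemble_fusion(toy_delivery_domain, toy_light_chain, linker=gly_ser_linker)
print(direct)
print(linked)
```

Passing an empty linker models the directly linked embodiment, so a single function covers both the direct and the indirect case.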

Human PH Domains

Human phospholipase C delta 1 (PLCδ1) pleckstrin homology (PH) domain amino acid sequence (SEQ ID NO.: 44) MDSGRDFLTLHGLQDDEDLQALLKGSQLLKVKSSSWRRERFYKLQEDCKTIWQESR KVMRTPESQLFSIEDIQEVRMGHRTEGLEKFARDVPEDRCFSIVFKDQRNTLDLIAPSP ADAQHWVLGLHKIIHHSGSMDQRQKLQHWIHSCLRKADKNKDNKMSFKELQNFLK

Human cytohesin-1 PH domain amino acid sequence (SEQ ID NO.: 45) NPDREGWLLKLGGGRVKTWKRRWFILTDNCLYYFEYTTDKEPRGIIPLENLSIREVE DSKKPNCFELYIPDNKDQVIKACKTEADGRVVEGNHTVYRISAPTPEEKEEWIKCIKA AIS

Human cytohesin-2 PH domain amino acid sequence (SEQ ID NO.: 46) NPDREGWLLKLGGGRVKTWKRRWFILTDNCLYYFEYTTDKEPRGIIPLENLSIREVD DPRKPNCFELYIPNNKGQLIKACKTEADGRVVEGNHMVYRISAPTQEEKDEWIKSIQ AAVS

Human cytohesin-3 PH domain amino acid sequence (SEQ ID NO.: 47) NPDREGWLLKLGGGRVKTWKRRWFILTDNCLYYFEYTTDKEPRGIIPLENLSIREVE DPRKPNCFELYNPSHKGQVIKACKTEADGRVVEGNHVVYRISAPSPEEKEEWMKSIK ASIS

Human tyrosine-protein kinase BTK PH domain amino acid sequence (SEQ ID NO.: 48) AVILESIFLKRSQQKKKTSPLNFKKRLFLLTVHKLSYYEYDFERGRRGSKKGSIDVEKI TCVETVVPEKNPPPERQIPRRGEESSEMEQISIIERFPYPFQVVYDEGPLYVFSPTEELR KRWIHQLKNVIR

Nucleic Acids, Vectors, and Kits

In some aspects, provided herein is a nucleic acid encoding the GCH1 cleaving polypeptide disclosed herein. In some embodiments, the nucleic acid has at least 60% sequence identity (e.g., at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or at least 99.9% identity) to, or is identical to, a nucleic acid sequence selected from SEQ ID NOs.: 25-41. In some embodiments, the nucleic acid sequence is codon-optimized. In some embodiments, the nucleic acid is or comprises the sequence set forth in any one of SEQ ID NOs: 25-41. In some aspects, provided herein is a nucleic acid encoding a fusion protein disclosed herein.

In some aspects, provided herein is an expression vector comprising a nucleic acid encoding a GCH1 cleaving polypeptide disclosed herein. In some embodiments, the vector is a phage, plasmid, cosmid, bacmid, or viral vector. In some embodiments, the disclosure provides a vector for use in cleaving an intracellular protein (e.g., GCH1), comprising delivering to a cell the vector described herein, whereby the fusion protein contacts and cleaves the intracellular protein (e.g., GCH1) in the cell. Viral vectors include retroviruses, lentiviruses, adeno-associated virus, pox viruses, baculovirus, reoviruses, vaccinia viruses, herpes simplex viruses, Epstein-Barr viruses, and adenovirus vectors, for example. In some embodiments, the viral vector is a lentiviral vector. “Lentivirus” generally refers to a family of retroviruses that cause chronic and severe infections in mammalian species. Lentiviruses infect and integrate their genomes into dividing and non-dividing cells (e.g., neurons). Non-limiting examples of lentiviruses used for vectors include human immunodeficiency virus (HIV), simian immunodeficiency virus (SIV), feline immunodeficiency virus (FIV), equine infectious anemia virus (EIAV), bovine immunodeficiency virus (BIV) and caprine arthritis encephalitis virus (CAEV). In some embodiments, lentiviral TRs are derived from HIV (e.g., share at least 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 100% nucleic acid sequence identity with an HIV TR), for example, as described by Chung et al., Mol Ther. 2014 May; 22(5): 952-963.

In some aspects, provided herein is a kit comprising a container housing the GCH1 cleaving polypeptide, the nucleic acid, the fusion protein, the expression vector, or the host cell disclosed herein.

Methods of Use

Some aspects of this disclosure provide methods for using a BoNT variant provided herein (e.g., a GCH1 cleaving polypeptide). In some embodiments, such methods include contacting a protein (e.g., GCH1) comprising a target cleavage sequence (e.g., SEQ ID NO.: 4), for example, ex vivo, in vitro, or in vivo (e.g., in a subject), with the BoNT variant (e.g., GCH1 cleaving polypeptide).

In some aspects, provided herein are methods for cleaving intracellular GCH1 proteins using the BoNT protease variants, such as GCH1 cleaving polypeptides, described herein. In some embodiments, a method of cleaving intracellular GCH1 comprises delivering to a cell the GCH1 cleaving polypeptide disclosed herein. Upon delivery of the BoNT protease variant (e.g., GCH1 cleaving polypeptide) to a cell, the BoNT protease variant cleaves intracellular GCH1, thereby inactivating it. The BoNT protease variant can be any of the BoNT protease variants described herein. In some embodiments, the BoNT protease variant (e.g., GCH1 cleaving polypeptide) comprises or consists of an amino acid sequence selected from SEQ ID NOs.: 10-23. In some embodiments, a method for cleaving an intracellular GCH1 protein comprises delivering to a cell the GCH1 cleaving polypeptide disclosed herein. In some embodiments, the intracellular GCH1 protein to be cleaved has an amino acid sequence that is at least 80% (e.g., at least 80%, 85%, 90%, 95%, 99%, etc.) identical to a sequence set forth in SEQ ID NO.: 2. In some embodiments, the intracellular GCH1 protein to be cleaved comprises the sequence set forth in SEQ ID NO: 2. In some embodiments, the intracellular GCH1 comprises a cleavage sequence that is at least 80%, 85%, 90%, 95%, or 99% identical to the amino acid sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the intracellular GCH1 comprises the cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). In some embodiments, the intracellular GCH1 protein is a human GCH1 protein. In some embodiments, delivering the GCH1 cleaving polypeptide to the cell results in cleavage of the intracellular GCH1 protein, resulting in inactivation of GCH1.

In some embodiments, inactivation of GCH1 subsequently results in reduction of intracellular levels of tetrahydrobiopterin (BH4). In some embodiments, delivery of GCH1 cleaving polypeptides described herein results in subsequent inactivation of GCH1 and reduction of BH4. In some embodiments, inactivation of GCH1 and reduction of BH4 results in reduction of pain (e.g., chronic pain, neuropathic pain, inflammatory pain).

In some embodiments, delivery of GCH1 cleaving polypeptides described herein results in reduction of pain (e.g., chronic pain, neuropathic pain, inflammatory pain). In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cell is in the peripheral nervous system. In some embodiments, the cell is a peripheral nerve cell. In certain embodiments, the cell is a neuron. In some embodiments, the cell is a dorsal root ganglion (DRG) neuron. In some embodiments, the cell is a sensory neuron. In some embodiments, the cell is in vitro. In some embodiments, the cell is in vivo. In some embodiments, the cell is in a subject. In some embodiments, the subject is a mammal (e.g., a human or a non-human mammal). In some embodiments, the subject is a non-human mammal. In some embodiments, the subject is human. In some embodiments, the subject is a non-human primate. In some embodiments, the subject is a rodent (e.g., mouse, rat, hamster, guinea pig, etc.). In some embodiments, the subject is a sheep, a goat, a cow, a cat, or a dog. In some embodiments, the subject is a vertebrate, an amphibian, a reptile, a fish, an insect, a fly, or a nematode.

Aspects of the disclosure relate to BoNT protease variants that cleave intracellular proteins (e.g., GCH1) involved in certain biological systems, for example, pain. In some embodiments, the BoNT protease variants are capable of crossing the cellular membrane and entering the intracellular environment of neurons and neuronal cell types. In some embodiments, the intracellular protein is GCH1, which catalyzes the conversion of GTP into 7,8-dihydroneopterin triphosphate, in the initiating step of BH4 synthesis. The production of BH4 within the dorsal root ganglion plays a critical role in pain signaling because BH4 is a precursor for peripheral neuropathic and inflammatory pain signals. Cleavage of GCH1 results in reduced pain, such as chronic pain, neuropathic pain, and/or inflammatory pain by decreasing intracellular levels of BH4.

In some aspects, the disclosure provides methods for reducing pain in a subject in need thereof comprising contacting a cell of the subject with a BoNT protease variant (e.g., GCH1 cleaving polypeptide) provided herein (e.g., by administering the GCH1 cleaving polypeptide to the subject, e.g., locally or systemically), an expression vector encoding such a BoNT protease variant, or fusion protein comprising such a BoNT protease variant. In some embodiments, the pain is chronic pain. In some embodiments, the pain is acute pain. In some embodiments, the pain is neuropathic pain. In some embodiments, the pain is inflammatory pain. In some embodiments, the pain is nociceptive pain. In some embodiments, the methods provided herein comprise contacting the cell of a subject with a GCH1 cleaving polypeptide provided herein (e.g., by administering the GCH1 cleaving polypeptide to the subject, either locally or systemically). In some embodiments, the cell is a non-human mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cell is in the peripheral nervous system. In some embodiments, the cell is a peripheral nerve cell. In some embodiments, the cell is a neuron. In some embodiments, the cell is a dorsal root ganglion (DRG) neuron. In some embodiments, the cell is a sensory neuron. In some embodiments, the contacting (e.g., by administering the GCH1 cleaving polypeptide to the subject) results in the GCH1 cleaving polypeptide entering the cell. In some embodiments, the contacting (e.g., by administering the GCH1 cleaving polypeptide to the subject) results in cleavage of GCH1. In some embodiments, cleavage of GCH1 inactivates GCH1. In some embodiments, cleavage of GCH1 subsequently results in the reduction of intracellular levels of tetrahydrobiopterin (BH4).
In some embodiments, the cell is a mammalian cell (e.g., a human cell, a mouse cell, a dog cell, a cat cell, a horse cell, a guinea pig cell, a pig cell, a hamster cell, a non-human primate (e.g., monkey) cell, etc.).

In some embodiments, administering the GCH1 cleaving polypeptide to a subject reduces intracellular levels of tetrahydrobiopterin (BH4) by at least 25% (e.g., at least 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 90%, 95%, 99%, or 100%).
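The percent-reduction threshold above is plain arithmetic on a baseline and a post-treatment measurement. The sketch below shows that calculation under stated assumptions: the function names are hypothetical, and the BH4 values are made-up numbers in arbitrary units, not data from the disclosure.

```python
def percent_reduction(baseline, treated):
    """Percent decrease of `treated` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be greater than zero")
    return 100.0 * (baseline - treated) / baseline


def meets_threshold(baseline, treated, threshold=25.0):
    """True if the reduction is at least `threshold` percent (default 25%)."""
    return percent_reduction(baseline, treated) >= threshold


# Hypothetical intracellular BH4 measurements in arbitrary but identical units.
bh4_baseline = 8.0
bh4_treated = 6.0

reduction = percent_reduction(bh4_baseline, bh4_treated)
print(f"BH4 reduced by {reduction:.1f}%; at least 25%: {meets_threshold(bh4_baseline, bh4_treated)}")
```

The same calculation applies to any paired before-and-after measurement, provided both values are expressed in the same units.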

In some such embodiments, the GCH1 cleaving polypeptide, expression vector, or fusion protein is administered to the subject in an amount effective to result in a measurable decrease in pain. Chronic pain may be measured by any known method in the art (see for example, Salaffi et al., Best Pract Res Clin Rheumatol. 2015 Feb; 29(1): 164-86). In some embodiments, chronic pain is assessed by identifying the subject’s subjective intensity of pain using a pain scale. Inflammatory pain may be measured by any known method in the art (see for example, Muley et al., CNS Neurosci Ther. 2016 Feb; 22(2): 88-101). Neuropathic pain may be measured by any known method in the art (see for example, Cruccu et al., PLoS Med. 2009 Apr; 6(4): e1000045). Non-limiting examples of causes of pain include prior surgery or injury, nerve damage, arthritis, headaches (e.g., migraines), fibromyalgia, cancer, chronic fatigue syndrome, endometriosis, inflammatory bowel disease, interstitial cystitis, temporomandibular joint dysfunction, vulvodynia, multiple sclerosis, stomach ulcers, AIDS, Lyme disease, shingles, Epstein-Barr virus, hepatitis B and C, leprosy, diphtheria, gallbladder disease, autoimmune diseases (e.g., Sjögren’s syndrome, lupus, Guillain-Barré syndrome, chronic inflammatory demyelinating polyneuropathy, vasculitis), bone marrow disorders, kidney disease, liver disease, connective tissue disorders, and hypothyroidism.

In some embodiments, administering the GCH1 cleaving polypeptide to a subject results in a reduction of pain by at least 25% (e.g., at least 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 90%, 95%, 99%, or 100%). In some embodiments, administering the GCH1 cleaving polypeptide to the subject results in a decrease in pain score from 10 to 1, 9 to 1, 8 to 1, 7 to 1, 10 to 2, 9 to 2, 8 to 2, 7 to 2, 10 to 3, 9 to 3, or 8 to 3.

Table 1: Amino Acid Sequences

Table 2: Nucleic Acid Sequences

Engineering of BoNT Protease Variants using PACE and PANCE

Some aspects of this disclosure provide compositions, systems, and methods for evolving a BoNT protease (e.g., BoNT X protease). In some embodiments, a method of evolving a BoNT protease (e.g., BoNT X protease) is provided that comprises (a) contacting a population of host cells with a population of expression vectors comprising a gene encoding a BoNT protease (e.g., BoNT X protease) to be evolved. The expression vectors are typically deficient in at least one gene required for the transfer of the phage vector from one cell to another, e.g., a gene required for the generation of infectious phage particles. In some embodiments of the provided methods, (1) the host cells are amenable to transfer of the expression vector; (2) the expression vector allows for expression of the BoNT protease (e.g., BoNT X protease) in the host cell, can be replicated by the host cell, and the replicated expression vector can transfer into a second host cell; and (3) the host cell expresses a gene product encoded by the at least one gene for the generation of infectious phage particles of (a) in response to the activity of the protease (e.g., ability to cleave a target protein or amino acid sequence), and the level of gene product expression depends on the activity of the protease. The methods of protease evolution provided herein typically comprise (b) incubating the population of host cells under conditions allowing for mutation of the gene encoding the BoNT protease (e.g., BoNT X protease), and the transfer of the expression vector comprising the gene encoding the BoNT protease of interest (e.g., BoNT X protease) from host cell to host cell. 
The host cells are removed from the host cell population at a certain rate, e.g., at a rate that results in an average time a host cell remains in the cell population that is shorter than the average time a host cell requires to divide, but long enough for the completion of a life cycle (uptake, replication, and transfer to another host cell) of the expression vector. The population of host cells is replenished with fresh host cells that do not harbor the expression vector. In some embodiments, the rate of replenishment with fresh cells substantially matches the rate of removal of cells from the cell population, resulting in a substantially constant cell number or cell density within the cell population. The methods of protease evolution provided herein typically also comprise (c) isolating a replicated expression vector from the host cell population of step (b), wherein the replicated expression vector comprises a mutated version of the gene encoding the BoNT protease (e.g., BoNT X protease).
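The timing constraint on host cell removal described above can be illustrated with a short calculation. The following is a minimal sketch, assuming a well-mixed lagoon of fixed volume in which the mean residence time of a host cell equals the lagoon volume divided by the flow rate. The function names and any numerical values supplied to them (lagoon volume, flow rate, division time, life cycle time) are hypothetical and are not taken from this disclosure.

```python
def mean_residence_time_min(lagoon_volume_ml: float, flow_rate_ml_per_min: float) -> float:
    """Mean time (minutes) a host cell remains in the lagoon.

    Assumption: well-mixed vessel at constant volume, so the mean
    residence time is simply volume / flow rate.
    """
    if lagoon_volume_ml <= 0 or flow_rate_ml_per_min <= 0:
        raise ValueError("volume and flow rate must be positive")
    return lagoon_volume_ml / flow_rate_ml_per_min


def flow_rate_is_suitable(residence_min: float,
                          division_time_min: float,
                          life_cycle_min: float) -> bool:
    """Check the two conditions stated in the text.

    The residence time should be shorter than the average host cell
    division time, but long enough for one complete life cycle of the
    expression vector (uptake, replication, transfer).
    """
    return life_cycle_min <= residence_min < division_time_min
```

For instance, under these assumptions a hypothetical 40 mL lagoon with a 2 mL/min flow gives a mean residence time of 20 minutes, which would satisfy both conditions for a hypothetical 30 minute division time and a 15 minute vector life cycle.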

In some aspects, provided herein is a host cell comprising the GCH1 cleaving polypeptide disclosed herein, the fusion protein disclosed herein or the expression vector disclosed herein. In some embodiments, the host cell is a bacterial cell, fungal cell, or animal cell (e.g., mammalian cell). In some embodiments, the host cell is a bacterial cell. In some embodiments, the host cell is a fungal cell. In some embodiments, the host cell is an animal cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, a mammalian cell is a human cell. In some embodiments, a mammalian cell is a non-human primate cell, dog cell, cat cell, horse cell, guinea pig cell, pig cell, or mouse cell. In some embodiments, the host cell is an E. coli cell.

In some embodiments, the host cell used in the method of evolving a BoNT X protease further expresses a dominant negative gene product for the at least one gene for the generation of infectious phage particles, which exerts an antagonistic effect on infectious phage production in response to the canonical protease activity of native BoNT X. In some embodiments, expression of the dominant negative gene product is controlled by an undesired activity of a BoNT protease variant, for example cleavage of a starting substrate such as procaspase-1. In some embodiments, expression of the dominant negative gene product is controlled by an undesired activity of a BoNT protease variant, for example cleavage of a native substrate of BoNT X, such as VAMP1, VAMP4, VAMP5, or Ykt6.

Some embodiments provide a continuous evolution system (e.g., PACE), in which a population of viral vectors, e.g., M13 phage vectors, comprising a gene encoding a BoNT X protease to be evolved replicates in a flow of host cells, e.g., a flow through a lagoon, wherein the viral vectors are deficient in a gene encoding a protein that is essential for the generation of infectious viral particles, and wherein that gene is in the host cell under the control of a conditional promoter, the activity of which depends on the activity of the protease of interest. Some embodiments provide a non-continuous evolution system (e.g., PANCE), in which a population of viral vectors comprising a gene encoding a BoNT protease to be evolved replicates by undergoing serial daily passaging in lieu of continuous flow.

In some embodiments, transcription from the conditional promoter may be activated by cleavage of a fusion protein comprising a transcription factor and an inhibitory protein fused to the transcriptional activator via a linker comprising a target site of the protease. In some embodiments, the transcriptional activator is fused to an inhibitor that either directly inhibits or otherwise hinders the transcriptional activity of the transcriptional activator, for example, by directly interfering with DNA binding or transcription, by targeting the transcriptional activator for degradation through the host cell’s protein degradation machinery, or by directing export from the host cell or localization of the transcriptional activator into a compartment of the host cell in which it cannot activate transcription from its target promoter. In some embodiments, the inhibitor is fused to the transcriptional activator’s N-terminus. In some embodiments, it is fused to the activator’s C-terminus. Some embodiments of the protease PACE technology described herein utilize a “selection phage,” a modified phage that comprises a gene of interest to be evolved and lacks a full-length gene encoding a protein required for the generation of infectious phage particles. In some such embodiments, the selection phage serves as the vector that replicates and evolves in the flow of host cells. For example, some M13 selection phages provided herein comprise a nucleic acid sequence encoding a protease to be evolved, e.g., under the control of an M13 promoter, and lack all or part of a phage gene encoding a protein required for the generation of infectious phage particles, e.g., gI, gII, gIII, gIV, gV, gVI, gVII, gVIII, gIX, or gX, or any combination thereof. 
For example, some M13 selection phages provided herein comprise a nucleic acid sequence encoding a BoNT protease to be evolved, e.g., under the control of an M13 promoter, and lack all or part of a gene encoding a protein required for the generation of infectious phage particles, e.g., the gIII gene encoding the pIII protein.

One prerequisite for evolving proteases with a desired activity is to provide a selection system that confers a selective advantage to mutated protease variants exhibiting such an activity. The expression systems and fusion proteins comprising transcriptional activators in an inactive form that are activated by protease activity thus constitute an important feature of some embodiments of the protease PACE and PANCE technology provided herein.

In some embodiments, the transcriptional activator directly drives transcription from a target promoter. For example, in some such embodiments, the transcriptional activator may be an RNA polymerase. Suitable RNA polymerases and promoter sequences targeted by such RNA polymerases are well known to those of skill in the art. Exemplary suitable RNA polymerases include, but are not limited to, T7 polymerases (targeting T7 promoter sequences) and T3 RNA polymerases (targeting T3 promoter sequences). Additional suitable RNA polymerases will be apparent to those of skill in the art based on the instant disclosure, which is not limited in this respect.

In some embodiments, the transcriptional activator does not directly drive transcription, but recruits the transcription machinery of the host cell to a specific target promoter. Suitable transcriptional activators, such as, for example, Gal4 or fusions of the transactivation domain of the VP16 transactivator with DNA-binding domains, will be apparent to those of skill in the art based on the instant disclosure, and the disclosure is not limited in this respect. In some embodiments, it is advantageous to link protease activity to enhanced phage packaging via a transcriptional activator that is not endogenously expressed in the host cells in order to minimize leakiness of the expression of the gene required for the generation of infectious phage particles through the host cell basal transcription machinery. For example, in some embodiments, it is desirable to drive expression of the gene required for the generation of infectious phage particles from a promoter that is not or is only minimally active in host cells in the absence of an exogenous transcriptional activator, and to provide the exogenous transcriptional activator, such as, for example, T7 RNA polymerase, as part of the expression system linking protease (e.g., BoNT protease variant) activity to phage packaging efficiency. In some embodiments, the at least one gene for the generation of infectious phage particles is expressed in the host cells under the control of a promoter activated by the transcriptional activator, for example, under the control of a T7 promoter if the transcriptional activator is T7 RNA polymerase, and under the control of a T3 promoter if the transcriptional activator is T3 RNA polymerase, and so on.

In some embodiments, the protease evolution methods provided herein comprise an initial or intermittent phase of diversifying the population of vectors by mutagenesis, in which the cells are incubated under conditions suitable for mutagenesis of the gene encoding the protease in the absence of stringent selection or in the absence of any selection for evolved protease variants that have acquired a desired activity. Such low-stringency selection or no selection periods may be achieved by supporting expression of the gene for the generation of infectious phage particles in the absence of desired protease activity, for example, by providing an inducible expression construct comprising a gene encoding the respective packaging protein under the control of an inducible promoter and incubating under conditions that induce expression from the promoter, e.g., in the presence of the inducing agent. Suitable inducible promoters and inducible expression systems are described herein and in International PCT Application, PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012; and U.S. Patent No. 9,267,127, issued February 23, 2016; International PCT Application, PCT/US2015/057012, filed October 22, 2015, published as WO 2016/077052 on May 19, 2016; and, PCT/US2016/027795, filed April 15, 2016, published as WO 2016/168631 on October 20, 2016, the entire contents of each of which are incorporated herein by reference. Additional suitable promoters and inducible gene expression systems will be apparent to those of skill in the art based on the instant disclosure. In some embodiments, the method comprises a phase of stringent selection for a mutated protease version. 
If an inducible expression system is used to relieve selective pressure, the stringency of selection can be increased by removing the inducing agent from the population of cells in the lagoon, thus turning expression from the inducible promoter off, so that any expression of the gene required for the generation of infectious phage particles must come from the protease activity-dependent expression system.

One aspect of the PACE protease evolution methods provided herein is the mutation of the initially provided vectors encoding a protease of interest (e.g., BoNT). In some embodiments, the host cells within the flow of cells in which the vector replicates are incubated under conditions that increase the natural mutation rate. This may be achieved by exposing the host cells to a mutagen, such as certain types of radiation or a mutagenic compound, or by expressing genes known to increase the cellular mutation rate in the cells. Additional suitable mutagens will be known to those of skill in the art, and include, without limitation, those described in International PCT Application, PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012; and U.S. Patent No. 9,267,127, issued February 23, 2016; International PCT Application, PCT/US2015/057012, filed October 22, 2015, published as WO 2016/077052 on May 19, 2016; and, PCT/US2016/027795, filed April 15, 2016, published as WO 2016/168631 on October 20, 2016, the entire contents of each of which are incorporated herein by reference, and the disclosure is not limited in this respect.

In some embodiments, the host cells comprise the accessory plasmid encoding the at least one gene for the generation of infectious phage particles, e.g., of the M13 phage, encoding the protease to be evolved and a helper phage, and together, the helper phage and the accessory plasmid comprise all genes required for the generation of infectious phage particles. Accordingly, in some such embodiments, variants of the vector that do not encode a protease variant that can untether the inhibitor from the transcriptional activator will not efficiently be packaged, since they cannot effect an increase in expression of the gene required for the generation of infectious phage particles from the accessory plasmid. On the other hand, variants of the vector that encode a protease variant that can efficiently cleave the inhibitor from the transcriptional activator will effect increased transcription of the at least one gene required for the generation of infectious phage particles from the accessory plasmid and thus be efficiently packaged into infectious phage particles. In some embodiments, the protease PACE and PANCE methods provided herein further comprise a negative selection for undesired protease activity in addition to the positive selection for a desired protease activity. Such negative selection methods are useful, for example, in order to maintain protease specificity when increasing the cleavage efficiency of a protease directed towards a specific target site. This can avoid, for example, the evolution of proteases that show a generally increased protease activity, including an increased protease activity towards off-target sites, which is generally undesired in the context of therapeutic proteases.

In some embodiments, negative selection is applied during a continuous evolution process as described herein, by penalizing the undesired activities of evolved proteases. This is useful, for example, if the desired evolved protease is an enzyme with high specificity for a target site, for example, a protease with altered, but not broadened, specificity. In some embodiments, negative selection of an undesired activity, e.g., off-target protease activity, is achieved by causing the undesired activity to interfere with pIII production, thus inhibiting the propagation of phage genomes encoding gene products with an undesired activity. In some embodiments, expression of a dominant-negative version of pIII or expression of an antisense RNA complementary to the gIII RBS and/or gIII start codon is linked to the presence of an undesired protease activity. Suitable negative selection strategies and reagents useful for negative selection, such as dominant-negative versions of M13 pIII, are described herein and in International PCT Application, PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012; and U.S. Application, U.S.S.N. 13/922,812, filed June 20, 2013; International PCT Application, PCT/US2015/057012, filed October 22, 2015, published as WO 2016/077052 on May 19, 2016; and, PCT/US2016/027795, filed April 15, 2016, published as WO 2016/168631 on October 20, 2016, the entire contents of each of which are incorporated herein by reference.

In some embodiments, counter-selection against activity on non-target substrates is achieved by linking undesired evolved protease activities to the inhibition of phage propagation. In some embodiments, a dual selection strategy is applied during a continuous evolution experiment, in which both positive selection and negative selection constructs are present in the host cells. In some such embodiments, the positive and negative selection constructs are situated on the same plasmid, also referred to as a dual selection accessory plasmid. One advantage of using a simultaneous dual selection strategy is that the selection stringency can be fine-tuned based on the activity or expression level of the negative selection construct as compared to the positive selection construct. Another advantage of a dual selection strategy is that the selection is not dependent on the presence or the absence of a desired or an undesired activity, but on the ratio of desired and undesired activities, and, thus, the resulting ratio of pIII and pIII-neg that is incorporated into the respective phage particle.

For example, in some embodiments, the host cells comprise an expression construct encoding a dominant-negative form of the at least one gene for the generation of infectious phage particles, e.g., a dominant-negative form of the pIII protein (pIII-neg), under the control of an inducible promoter that is activated by a transcriptional activator other than the transcriptional activator driving the positive selection system. Expression of the dominant-negative form of the gene diminishes or completely negates any selective advantage an evolved phage may exhibit and thus dilutes or eradicates any variants exhibiting undesired activity from the lagoon.
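The ratio dependence of dual selection described above can be illustrated with a toy probability model. The following is a minimal sketch and not a model taken from this disclosure. It assumes (i) that each phage particle incorporates a fixed number of pIII-type copies (five is used here purely as an illustrative default), (ii) that each copy is drawn independently from the cellular pool in proportion to the relative abundance of pIII and pIII-neg, and (iii) that a single incorporated pIII-neg copy renders the particle noninfectious. The function name and parameters are hypothetical.

```python
def infectious_fraction(piii: float, piii_neg: float, copies_per_particle: int = 5) -> float:
    """Expected fraction of infectious particles in the toy model.

    piii, piii_neg: relative abundances (arbitrary units) of wild-type
    pIII and dominant-negative pIII-neg in the host cell.
    copies_per_particle: assumed number of pIII-type copies per particle.
    """
    if piii < 0 or piii_neg < 0 or piii + piii_neg <= 0:
        raise ValueError("abundances must be non-negative and not both zero")
    if copies_per_particle < 1:
        raise ValueError("copies_per_particle must be at least 1")
    wild_type_share = piii / (piii + piii_neg)
    # A particle is infectious only if every incorporated copy is wild-type pIII.
    return wild_type_share ** copies_per_particle
```

In this sketch the result depends only on the ratio of the two abundances, not on their absolute levels, which mirrors the statement that selection depends on the ratio of desired and undesired activities.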

For example, if the positive selection system comprises a T3 promoter driving the expression of the at least one gene for the generation of infectious phage particles, and an evolved variant of T7 RNA polymerase that transcribes selectively from the T3 promoter, fused to a T7-RNA polymerase inhibitor via a linker comprising a protease target site that is cleaved by a desired protease activity, the negative selection system uses an orthogonal RNA polymerase. For example, in some such embodiments, the negative selection system could be based on T7 polymerase activity, e.g., in that it comprises a T7 promoter driving the expression of a dominant-negative form of the at least one gene for the generation of infectious phage particles, and a T7 RNA polymerase fused to a T7-RNA polymerase inhibitor via a linker comprising a protease target site that is cleaved by an undesired protease activity. In some embodiments, the negative selection polymerase is a T7 RNA polymerase gene comprising one or more mutations that render the T7 polymerase able to transcribe from the T3 promoter but not the T7 promoter, for example: N67S, R96E, K98R, H176P, E207K, E222K, T375A, M401I, G675R, N748D, P759E, A798S, A819T, etc. In some embodiments the negative selection polymerase may be fused to a T7-RNA polymerase inhibitor via a linker comprising a protease target site that is cleaved by an undesired protease activity. In other embodiments, if the positive selection system comprises a T7 promoter driving the expression of the at least one gene for the generation of infectious phage particles, and an evolved variant of T3 RNA polymerase that transcribes selectively from the T7 promoter, fused to a T3-RNA polymerase inhibitor via a linker comprising a protease target site that is cleaved by a desired protease activity, the negative selection system uses an orthogonal RNA polymerase. 
For example, in some such embodiments, the negative selection system could be based on T3 polymerase activity, e.g., in that it comprises a T3 promoter driving the expression of a dominant-negative form of the at least one gene for the generation of infectious phage particles, and a T3 RNA polymerase fused to a T3-RNA polymerase inhibitor via a linker comprising a protease target site that is cleaved by an undesired protease activity. In some embodiments, the negative selection polymerase is a T3 RNA polymerase gene comprising one or more mutations that render the T3 polymerase able to transcribe from the T7 promoter but not the T3 promoter. In some embodiments the negative selection polymerase may be fused to a T3-RNA polymerase inhibitor via a linker comprising a protease target site that is cleaved by an undesired protease activity.
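The logic of pairing a positive selection circuit with an orthogonal negative selection circuit, as described above, can be summarized in a small data model. The following is a minimal sketch for illustration only. The class, field names, and labels are hypothetical, and the model treats cleavage and transcription as all-or-nothing events, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionCircuit:
    cleavage_site: str  # site in the linker tethering the inhibitor to the polymerase
    polymerase: str     # polymerase released when the linker is cleaved
    promoter: str       # promoter recognized by that polymerase
    output_gene: str    # gene driven by the promoter


# Illustrative labels following the first example in the text.
POSITIVE = SelectionCircuit("desired site", "T7 RNAP variant selective for T3 promoter",
                            "T3 promoter", "gene for infectious particles")
NEGATIVE = SelectionCircuit("undesired site", "T7 RNAP",
                            "T7 promoter", "dominant-negative form")


def promoters_are_orthogonal(circuits) -> bool:
    """Each circuit must use its own promoter so the outputs stay separate."""
    promoters = [c.promoter for c in circuits]
    return len(set(promoters)) == len(promoters)


def expressed_genes(circuits, cleaved_sites) -> set:
    """Genes switched on, given the set of linker sites a protease variant cleaves."""
    return {c.output_gene for c in circuits if c.cleavage_site in cleaved_sites}
```

In this simplified picture, a variant that cleaves only the desired site switches on only the gene required for infectious particles, while a variant that also cleaves the undesired site additionally switches on the dominant-negative form.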

When used together, such positive-negative PACE selection results in the evolution of proteases that exhibit the desired activity but not the undesired activity. In some embodiments, the undesired function is cleavage of an off-target protease cleavage site. In some embodiments, cleavage of GCH1 is positively selected (e.g., GCH1 is cleaved more efficiently), while cleavage of procaspase-1 and VAMP1 (e.g., a VAMP1 cleavage substrate sequence) is negatively selected (e.g., these substrates are cleaved less efficiently, or not at all). In some embodiments, the undesired function is cleavage of the linker sequence of the fusion protein outside of the protease cleavage site.

Some aspects of this invention provide or utilize a dominant negative variant of pIII (pIII-neg). These aspects are based on the recognition that a pIII variant that comprises the two N-terminal domains of pIII and a truncated, termination-incompetent C-terminal domain is not only inactive but is a dominant-negative variant of pIII. A pIII variant comprising the two N-terminal domains of pIII and a truncated, termination-incompetent C-terminal domain was described in Bennett, N. J.; Rakonjac, J., Unlocking of the filamentous bacteriophage virion during infection is mediated by the C domain of pIII. Journal of Molecular Biology 2006, 356 (2), 266-73; the entire contents of which are incorporated herein by reference. The dominant negative property of such pIII variants has been described in more detail in PCT Application PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012, the entire contents of which are incorporated herein by reference.

The pIII-neg variant as provided in some embodiments herein is efficiently incorporated into phage particles, but it does not catalyze the unlocking of the particle for entry during infection, rendering the respective phage noninfectious even if wild type pIII is present in the same phage particle. Accordingly, such pIII-neg variants are useful for devising a negative selection strategy in the context of PACE, for example, by providing an expression construct comprising a nucleic acid sequence encoding a pIII-neg variant under the control of a promoter comprising a recognition motif, the recognition of which is undesired. In some embodiments, pIII-neg is used in a positive selection strategy, for example, by providing an expression construct in which a pIII-neg encoding sequence is controlled by a promoter comprising a nuclease target site or a repressor recognition site, the recognition of either one is desired.

In some embodiments, a protease PACE or PANCE experiment according to methods provided herein is run for a time sufficient for at least 10, at least 20, at least 30, at least 40, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000, at least 1250, at least 1500, at least 1750, at least 2000, at least 2500, at least 3000, at least 4000, at least 5000, at least 7500, at least 10000, or more consecutive viral life cycles. In certain embodiments, the viral vector is an M13 phage, and the length of a single viral life cycle is about 10-20 minutes.
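The relationship between the number of consecutive viral life cycles and the duration of an experiment can be worked out directly from the stated life cycle length of about 10-20 minutes. The following is a minimal sketch, assuming that life cycles follow one another without interruption and that every cycle has the same length; the function names are hypothetical.

```python
def run_time_hours(n_cycles: int, minutes_per_cycle: float) -> float:
    """Total time in hours for n back-to-back life cycles of equal length."""
    if n_cycles < 0 or minutes_per_cycle <= 0:
        raise ValueError("n_cycles must be >= 0 and minutes_per_cycle > 0")
    return n_cycles * minutes_per_cycle / 60


def run_time_range_hours(n_cycles: int,
                         shortest_cycle_min: float = 10.0,
                         longest_cycle_min: float = 20.0) -> tuple:
    """(minimum, maximum) run time in hours for the stated cycle-length range."""
    return (run_time_hours(n_cycles, shortest_cycle_min),
            run_time_hours(n_cycles, longest_cycle_min))
```

Under these assumptions, 60 consecutive life cycles would correspond to roughly 10 to 20 hours, and the run time scales linearly with the number of cycles.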

In some embodiments, the host cells are contacted with the vector and/or incubated in suspension culture. For example, in some embodiments, bacterial cells are incubated in suspension culture in liquid culture media. Suitable culture media for bacterial suspension culture will be apparent to those of skill in the art, and the invention is not limited in this regard. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch, and Maniatis (Cold Spring Harbor Laboratory Press: 1989); Elizabeth Kutter and Alexander Sulakvelidze: Bacteriophages: Biology and Applications. CRC Press; 1st edition (December 2004), ISBN: 0849313368; Martha R. J. Clokie and Andrew M. Kropinski: Bacteriophages: Methods and Protocols, Volume 1: Isolation, Characterization, and Interactions (Methods in Molecular Biology) Humana Press; 1st edition (December 2008), ISBN: 1588296822; Martha R. J. Clokie and Andrew M. Kropinski: Bacteriophages: Methods and Protocols, Volume 2: Molecular and Applied Aspects (Methods in Molecular Biology) Humana Press; 1st edition (December 2008), ISBN: 1603275649; all of which are incorporated herein in their entirety by reference for disclosure of suitable culture media for bacterial host cell culture.

The protease PACE methods provided herein are typically carried out in a lagoon. Suitable lagoons and other laboratory equipment for carrying out protease PACE methods as provided herein have been described in detail elsewhere. See, for example, International PCT Application, PCT/US2011/066747, filed December 22, 2011, published as WO 2012/088381 on June 28, 2012, the entire contents of which are incorporated herein by reference. In some embodiments, the lagoon comprises a cell culture vessel comprising an actively replicating population of vectors, for example, phage vectors comprising a gene encoding the protease of interest (e.g., BoNT), and a population of host cells, for example, bacterial host cells. In some embodiments, the lagoon comprises an inflow for the introduction of fresh host cells into the lagoon and an outflow for the removal of host cells from the lagoon. In some embodiments, the inflow is connected to a turbidostat comprising a culture of fresh host cells. In some embodiments, the outflow is connected to a waste vessel or sink. In some embodiments, the lagoon further comprises an inflow for the introduction of a mutagen into the lagoon. In some embodiments that inflow is connected to a vessel holding a solution of the mutagen. In some embodiments, the lagoon comprises an inflow for the introduction of an inducer of gene expression into the lagoon, for example, of an inducer activating an inducible promoter within the host cells that drives expression of a gene promoting mutagenesis (e.g., as part of a mutagenesis plasmid), as described in more detail elsewhere herein. In some embodiments, that inflow is connected to a vessel comprising a solution of the inducer, for example, a solution of arabinose.

In some embodiments, a PACE method as provided herein is performed in a suitable apparatus as described herein. For example, in some embodiments, the apparatus comprises a lagoon that is connected to a turbidostat comprising a host cell as described herein. In some embodiments, the host cell is an E. coli host cell. In some embodiments, the host cell comprises an accessory plasmid as described herein, a helper plasmid as described herein, a mutagenesis plasmid as described herein, and/or an expression construct encoding a fusion protein as described herein, or any combination thereof. In some embodiments, the lagoon further comprises a selection phage as described herein, for example, a selection phage encoding a protease of interest. In some embodiments, the lagoon is connected to a vessel comprising an inducer for a mutagenesis plasmid, for example, arabinose. In some embodiments, the host cells are E. coli cells comprising the F’ plasmid, for example, cells of the genotype F’ proA+B+ Δ(lacIZY) zzf::Tn10(TetR)/ endA1 recA1 galE15 galK16 nupG rpsL ΔlacIZYA araD139 Δ(ara,leu)7697 mcrA Δ(mrr-hsdRMS-mcrBC) proBA::pir116 λ−.

Some aspects of this invention relate to host cells for continuous evolution processes and non-continuous processes as described herein. In some embodiments, a host cell is provided that comprises at least one viral gene encoding a protein required for the generation of infectious viral particles under the control of a conditional promoter, and a fusion protein comprising a transcriptional activator targeting the conditional promoter and fused to an inhibitor via a linker comprising a protease cleavage site. For example, some embodiments provide host cells for phage-assisted continuous evolution and phage-assisted non-continuous processes, wherein the host cell comprises an accessory plasmid comprising a gene required for the generation of infectious phage particles, for example, M13 gIII, under the control of a conditional promoter, as described herein. In some embodiments, the host cells comprise an expression construct encoding a fusion protein as described herein, e.g., on the same accessory plasmid or on a separate vector. In some embodiments, the host cell further provides any phage functions that are not contained in the selection phage, e.g., in the form of a helper phage. In some embodiments, the host cell provided further comprises an expression construct comprising a gene encoding a mutagenesis-inducing protein, for example, a mutagenesis plasmid as provided herein.

In some embodiments, modified viral vectors are used in continuous evolution processes and non-continuous evolution processes as provided herein. In some embodiments, such modified viral vectors lack a gene required for the generation of infectious viral particles. In some such embodiments, a suitable host cell is a cell comprising the gene required for the generation of infectious viral particles, for example, under the control of a constitutive or a conditional promoter (e.g., in the form of an accessory plasmid, as described herein). In some embodiments, the viral vector used lacks a plurality of viral genes. In some such embodiments, a suitable host cell is a cell that comprises a helper construct providing the viral genes required for the generation of infectious viral particles. A cell is not required to actually support the life cycle of a viral vector used in the methods provided herein. For example, a cell comprising a gene required for the generation of infectious viral particles under the control of a conditional promoter may not support the life cycle of a viral vector that does not comprise a gene of interest able to activate the promoter, but it is still a suitable host cell for such a viral vector.

In some embodiments, the host cell is a prokaryotic cell, for example, a bacterial cell. In some embodiments, the host cell is an E. coli cell. In some embodiments, the host cell is a eukaryotic cell, for example, a yeast cell, an insect cell, or a mammalian cell. The type of host cell will, of course, depend on the viral vector employed, and suitable host cell/viral vector combinations will be readily apparent to those of skill in the art.

In some embodiments, the viral vector is a phage and the host cell is a bacterial cell. In some embodiments, the host cell is an E. coli cell. Suitable E. coli host strains will be apparent to those of skill in the art, and include, but are not limited to, New England Biolabs (NEB) Turbo, Top10F’, DH12S, ER2738, ER2267, and XL1-Blue MRF’. These strain names are art recognized and the genotypes of these strains have been well characterized. It should be understood that the above strains are exemplary only and that the invention is not limited in this respect.

In some PACE embodiments, for example, in embodiments employing an M13 selection phage, the host cells are E. coli cells expressing the Fertility factor, also commonly referred to as the F factor, sex factor, or F-plasmid. The F-factor is a bacterial DNA sequence that allows a bacterium to produce a sex pilus necessary for conjugation and is essential for the infection of E. coli cells with certain phage, for example, with M13 phage. For example, in some embodiments, the host cells for M13-PACE are of the genotype F'proA+B+ Δ(lacIZY) zzf::Tn10(TetR)/ endA1 recA1 galE15 galK16 nupG rpsE ΔlacIZYA araD139 Δ(ara,leu)7697 mcrA Δ(mrr-hsdRMS-mcrBC) proBA::pir116.

Some of the embodiments, advantages, features, and uses of the technology disclosed herein will be more fully understood from the Examples below. The Examples are intended to illustrate some of the benefits of the present disclosure and to describe particular embodiments, but are not intended to exemplify the full scope of the disclosure and, accordingly, do not limit the scope of the disclosure.

EXAMPLES

Example 1. Evolution of BoNT X and BoNT X Protease Variant

This example describes evolution of a Botulinum neurotoxin (BoNT) protease to cleave GTP cyclohydrolase 1 (GCH1). Cleavage of intracellular GCH1 (e.g., GCH1 present in DRG neurons) results in a reduction of intracellular levels of BH4 below pathological pain levels.

The crystal structure of GCH1 is shown in FIG. 1A and a close-up of the crystal structure showing target sites is shown in FIG. 1B. Starting activity on two target sites of GCH1 was assessed (see FIG. 2). GCH1 site 1 corresponds to amino acid cleavage sequence SSLGENPQRQGLLKT (SEQ ID NO: 3). GCH1 site 2 corresponds to amino acid cleavage sequence ETISDVLNDAIFDEDH (SEQ ID NO: 4). OD-normalized luminescence values were used to reflect proteolytic activity. Isolated phage demonstrated greater activity on GCH1 site 2, and GCH1 site 2 was therefore selected as the target site.
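The OD-normalized readout described above can be sketched as follows. This is a minimal illustration only, assuming that normalization means dividing the raw luminescence of a culture by the optical density of the same culture. The function name and all numeric readings are hypothetical and are not data from this disclosure; only the two target site sequences are taken from the text.

```python
def od_normalized_luminescence(luminescence: float, od: float) -> float:
    """Return luminescence per unit optical density (assumed normalization)."""
    if od <= 0:
        raise ValueError("optical density must be positive")
    return luminescence / od


# Target site sequences as given in the text (SEQ ID NOs: 3 and 4).
GCH1_SITES = {
    "site 1": "SSLGENPQRQGLLKT",
    "site 2": "ETISDVLNDAIFDEDH",
}

# Hypothetical (luminescence, OD) readings, for illustration only.
readings = {"site 1": (1200.0, 0.5), "site 2": (9000.0, 0.5)}

# Normalize each reading, then pick the site with the highest normalized signal.
normalized = {
    site: od_normalized_luminescence(lum, od) for site, (lum, od) in readings.items()
}
best_site = max(normalized, key=normalized.get)
print(best_site, GCH1_SITES[best_site])
```

Dividing by optical density keeps a dense culture from appearing more active than a sparse one merely because it contains more cells.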

BoNT X was first evolved to cleave procaspase-1 by using PACE and PANCE. The resulting BoNT protease variant, BoNT X(3015)8, was further evolved to cleave GCH1 by using PANCE. Several rounds of PANCE were carried out. BoNT X variants that are selective for GCH1 were identified. Tables 3 and 4 show a summary of amino acid substitutions present in the BoNT X variants relative to wild-type BoNT X (Table 3) and relative to BoNT X(3015)8 (Table 4) (also see FIGs. 4 and 6). There are fourteen positions (orange residues) with convergent mutations from this evolution, relative to BoNT X(3015)8 (see FIGs. 4 and 6). Gray shaded residues are substitutions that arose from the previous evolution steps and represent mutations relative to wild-type BoNT X. There are twenty-eight positions with mutations relative to wild-type BoNT X (see FIG. 6).

PANCE evolution yielded BoNT protease variants with robust propagation at GCH1 target site 2. Target site sequences for procaspase-1, the starting substrate, and GCH1, the novel substrate, are shown in FIG. 3A. Phage titer is shown for the seven passages of PANCE evolution performed on three replicates. Data from an activity assay on BoNT X protease variants from PANCE is shown in FIG. 3B. OD-normalized luminescence values were used to reflect proteolytic activity. BoNT X(3015)8, the starting protease in this evolution, was a positive control showing select activity on procaspase-1, its substrate. Catalytically impaired dBoNT/F is unable to perform proteolysis and was used as a negative control. Isolated phage demonstrated activity on both procaspase-1 and the novel substrate, GCH1, with greater activity on GCH1. The GCH1-cleaving BoNT X variants are shown in Table 1 (SEQ ID NOs: 10-17). BoNT X 8(6715-1214)2.4 variant, which has amino acid substitutions A166T and P368L relative to BoNT X(3015)8, yields robust activity on GCH1.

In vitro assays were performed to assess the activity of the evolved protease. Briefly, 41 amino acid fragments from procaspase-1 or 25 amino acid fragments from GCH1 were expressed as a fusion protein with maltose binding protein (MBP) and glutathione S-transferase (GST), and subsequently isolated. Substrates were then incubated with 50 nM BoNT X 8(6715-1214)2.4 for 1 hour at 37 °C. Protein samples were then subjected to PAGE and protein bands were visualized via Coomassie staining. FIGs. 5A-5B show in vitro cleavage assay data demonstrating that the evolved protease BoNT X 8(6715-1214)2.4 cleaves GCH1.
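Substitution labels such as A166T and P368L encode the residue in the reference sequence, the 1-based position, and the replacement residue. The sketch below shows one way such labels could be parsed and applied to a reference sequence while checking that the stated original residue is actually present. The helper names and the six-residue toy sequence are hypothetical; the toy sequence is not BoNT X or any sequence from this disclosure.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # residue expected in the reference sequence
    position: int     # 1-based position in the reference sequence
    replacement: str  # residue present in the variant


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'A166T' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(reference: str, labels: list[str]) -> str:
    """Apply substitutions, verifying each original residue against the reference."""
    residues = list(reference)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        if not 0 <= index < len(residues) or residues[index] != sub.original:
            raise ValueError(f"{label} does not match the reference sequence")
        residues[index] = sub.replacement
    return "".join(residues)


# Labels taken from the text (relative to BoNT X(3015)8).
parsed = [parse_substitution(label) for label in ("A166T", "P368L")]

# Hypothetical toy reference, used only to demonstrate the mechanics.
toy_reference = "MKLAPG"
toy_variant = apply_substitutions(toy_reference, ["A4T", "P5L"])
print(parsed, toy_variant)
```

Checking the original residue before substituting catches a common bookkeeping error, namely applying labels numbered against one reference (for example, wild-type BoNT X) to a different one (for example, BoNT X(3015)8).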

Negative Selection

A negative selection strategy was developed to select for GCH1-cleaving polypeptides that do not cleave off-target proteins, for example procaspase-1. Briefly, expression of a dominant-negative form of pIII was coupled to activity of procaspase-1-cleaving protease variants. This is achieved using a T7 RNA polymerase fused to a T7 RNA polymerase inactivating protein via an amino acid linker that encodes the off-target protein, such as procaspase-1. Cleavage of the linker sequence triggers the expression of pIII-neg, leading to the production of noninfectious phage particles. Negative selection is performed with simultaneous positive selection. This is achieved using a T7 RNA polymerase with mutations that render the T7 polymerase able to transcribe from the T3 promoter but not the T7 promoter. This T3-activating polymerase is fused to an inactivating protein via an amino acid linker that encodes the on-target protein, such as GCH1. Cleavage of the linker sequence triggers the expression of pIII, leading to the production of infectious phage particles. PANCE and PACE can both be performed using simultaneous positive and negative selection. Simultaneous positive and negative selection of BoNT X variants that cleave GCH1 was performed (FIG. 7). The GCH1-cleaving BoNT X variants following positive and negative selection are shown in Table 1 (SEQ ID NOs: 18-23). The amino acid substitutions present in the variants relative to wild-type BoNT X are shown in Table 5 and relative to BoNT X(3015)8 are shown in Table 6.
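The logic of simultaneous positive and negative selection can be summarized as a truth table. The following is a deliberately simplified Boolean sketch, not a description of the actual selection circuit. It assumes that cleavage is all-or-nothing and that any expression of pIII-neg renders progeny phage noninfectious, whereas real selection pressure is quantitative. The function and variable names are hypothetical.

```python
from itertools import product


def phage_outcome(cleaves_on_target: bool, cleaves_off_target: bool) -> dict:
    """Model which pIII forms are expressed and whether progeny are infectious."""
    # Positive selection: cleaving the on-target linker (e.g., GCH1) releases
    # the T3-activating polymerase, which drives expression of pIII.
    expresses_piii = cleaves_on_target
    # Negative selection: cleaving the off-target linker (e.g., procaspase-1)
    # releases T7 RNA polymerase, which drives expression of pIII-neg.
    expresses_piii_neg = cleaves_off_target
    # Simplifying assumption: pIII-neg fully blocks infectivity when present.
    infectious = expresses_piii and not expresses_piii_neg
    return {
        "pIII": expresses_piii,
        "pIII-neg": expresses_piii_neg,
        "infectious": infectious,
    }


# Enumerate all four combinations of on-target and off-target cleavage.
truth_table = {
    (on, off): phage_outcome(on, off)["infectious"]
    for on, off in product([False, True], repeat=2)
}
for (on, off), infectious in truth_table.items():
    print(f"on-target={on!s:5} off-target={off!s:5} infectious={infectious}")
```

Under these assumptions, only a protease that cleaves the on-target linker and spares the off-target linker yields infectious phage, which is the behavior the dual selection is intended to enrich.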

In vitro assays were performed to assess the activity of evolved proteases. 500 nM of BoNT X, starting protease BoNT X(3015)8, and evolved variants BoNT X(1214)2.4 and BoNT X(n002)A2 were incubated with 5 µM of each of VAMP1, procaspase-1, and GCH1 substrates for 105 minutes at 37 °C. Additionally, 50 nM of BoNT X, starting protease BoNT X(3015)8, and evolved variants BoNT X(1214)2.4 and BoNT X(n002)A2 were incubated with 5 µM of each of VAMP1, procaspase-1, and GCH1 substrates for 60 minutes at 37 °C. Protein samples were then subjected to PAGE and protein bands were visualized via Coomassie staining. The results show that the evolved protease BoNT X(1214)2.4 cleaves both GCH1 and procaspase-1 after positive selection only and BoNT X(n002)A2 cleaves only GCH1 (and not VAMP1 or procaspase-1) after both positive and negative selection (see FIG. 8).
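The two assay conditions differ in how far the substrate is in molar excess over the protease. The short calculation below makes that explicit. It assumes the substrate concentration is 5 µM in both conditions (that is, that the unit printed with the substrate value is micromolar); the function name is hypothetical.

```python
from fractions import Fraction

NM_PER_UM = 1000  # nanomolar per micromolar


def substrate_per_enzyme(enzyme_nm: int, substrate_um: int) -> Fraction:
    """Return the molar excess of substrate over enzyme."""
    if enzyme_nm <= 0:
        raise ValueError("enzyme concentration must be positive")
    return Fraction(substrate_um * NM_PER_UM, enzyme_nm)


# (protease in nM, substrate in µM) for the two incubations described above.
conditions = {"105 min": (500, 5), "60 min": (50, 5)}

excess = {
    label: substrate_per_enzyme(enzyme, substrate)
    for label, (enzyme, substrate) in conditions.items()
}
for label, ratio in excess.items():
    print(f"{label}: substrate is in {ratio}-fold molar excess")
```

Under the stated assumption, the substrate is in 10-fold excess in the 105-minute incubation and 100-fold excess in the 60-minute incubation, so the second condition is the more stringent test of catalytic turnover.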

Table 3: Summary of substitutions in BoNT X variants relative to wild-type BoNT X (SEQ ID NO: 1).

Table 4: Summary of substitutions in BoNT X variants relative to BoNT X(3015)8 (SEQ ID NO: 9).

Table 5: Summary of substitutions in BoNT X variants following positive and negative selection relative to wild-type BoNT X (SEQ ID NO: 1).

Table 6: Summary of substitutions in BoNT X variants following positive and negative selection relative to BoNT X(3015)8 (SEQ ID NO: 9).

Example 2. Cleavage activity of evolved GCH1-cleaving proteases

To perform in vitro cleavage assays, genetic constructs comprising human GCH1 fused to maltose binding protein (MBP) on the N-terminal end of GCH1 and glutathione S-transferase (GST) on the C-terminal end of GCH1 were expressed in Escherichia coli BL21 cells via induction with 1 mM isopropyl-1-thio-galactopyranoside (IPTG) added to the growth media. After IPTG induction, cultures were incubated at 18 °C for 18-24 hours. Cells were centrifuged, resuspended in 10-20 ml of lysis buffer (20 mM HEPES pH 7.3, 200 mM NaCl, and EDTA-free protease inhibitor tablets), and lysed using a sonicator. Cell lysates were incubated with glutathione agarose to bind MBP-GCH1-GST, and protein was eluted from the agarose using 50 mM glutathione. Eluted protein was concentrated via centrifugation. Evolved GCH1-cleaving proteases, e.g., BoNT X(1214)2.4 (SEQ ID NO: 10), were purified using a similar protocol, using genetic constructs comprising protease-6x histidine fusions and using nickel or cobalt affinity resin instead of glutathione resin to bind protease from cell lysates.

1 µM of purified MBP-GCH1-GST protein was incubated with 40-1000 nM purified GCH1-cleaving protease, BoNT X(1214)2.4, for 1-24 hours at 37 °C in 30-50 µl volumes. To visualize cleavage products, the reaction was quenched with the addition of SDS-PAGE buffer and samples were run on polyacrylamide gels. Proteins were visualized via Coomassie staining. Cleavage products appear as lower molecular-weight bands within samples treated with protease. The results demonstrate that evolved GCH1-cleaving botulinum neurotoxin proteases are capable of cleaving full-length human GCH1 protein (see FIG. 9).
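Setting up a reaction of this kind comes down to the dilution relation C1·V1 = C2·V2. The sketch below works through one hypothetical setup, assuming a 1 µM final substrate concentration, a protease concentration in the nanomolar range, and a reaction volume in microliters. The stock concentrations (10 µM substrate, 2000 nM protease), the 200 nM protease target, and the 40 µl volume are illustrative choices, not values from this disclosure.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    if stock_conc <= 0 or final_conc < 0:
        raise ValueError("concentrations must be non-negative and the stock positive")
    if stock_conc < final_conc:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc * final_volume_ul / stock_conc


# Hypothetical reaction: 40 µl total volume.
reaction_volume_ul = 40.0

# Substrate: 1 µM final from a hypothetical 10 µM stock (units: µM).
substrate_ul = stock_volume_ul(1.0, 10.0, reaction_volume_ul)

# Protease: 200 nM final from a hypothetical 2000 nM stock (units: nM).
protease_ul = stock_volume_ul(200.0, 2000.0, reaction_volume_ul)

# The remainder of the volume is made up with reaction buffer.
buffer_ul = reaction_volume_ul - substrate_ul - protease_ul
print(substrate_ul, protease_ul, buffer_ul)
```

Each concentration pair only needs to share a unit with itself, which is why the substrate can be tracked in µM and the protease in nM without conversion.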

This procedure can be repeated using non-human GCH1 homologs (mouse, rat, primate) as well as other genetic construct configurations (GST on the N-terminal end of GCH1 and MBP on the C-terminal end of GCH1, or GST alone on either end of GCH1).

EQUIVALENTS AND SCOPE

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the present invention is not intended to be limited to the above description, but rather is as set forth in the appended claims. In the claims, articles such as “a,” “an,” and “the” may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention also includes embodiments in which more than one, or all, of the group members are present in, employed in, or otherwise relevant to a given product or process.

Furthermore, it is to be understood that the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, descriptive terms, etc., from one or more of the claims or from relevant portions of the description is introduced into another claim. For example, any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim. Furthermore, where the claims recite a composition, it is to be understood that methods of using the composition for any of the purposes disclosed herein are included, and methods of making the composition according to any of the methods of making disclosed herein or other methods known in the art are included, unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise.

Where elements are presented as lists, e.g., in Markush group format, it is to be understood that each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. It is also noted that the term “comprising” is intended to be open and permits the inclusion of additional elements or steps. It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements, features, steps, etc., certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements, features, steps, etc. For purposes of simplicity those embodiments have not been specifically set forth in haec verba herein. Thus, for each embodiment of the invention that comprises one or more elements, features, steps, etc., the invention also provides embodiments that consist or consist essentially of those elements, features, steps, etc. Where ranges are given, endpoints are included. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and/or the understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise. It is also to be understood that unless otherwise indicated or otherwise evident from the context and/or the understanding of one of ordinary skill in the art, values expressed as ranges can assume any subrange within the given range, wherein the endpoints of the subrange are expressed to the same degree of accuracy as the tenth of the unit of the lower limit of the range.

In addition, it is to be understood that any particular embodiment of the present invention may be explicitly excluded from any one or more of the claims. Where ranges are given, any value within the range may explicitly be excluded from any one or more of the claims. Any embodiment, element, feature, application, or aspect of the compositions and/or methods of the invention, can be excluded from any one or more claims. For purposes of brevity, all of the embodiments in which one or more elements, features, purposes, or aspects is excluded are not set forth explicitly herein.