Title:
IMPROVED ENGINEERED IMMUNOGLOBULINS AND BINDING FRAGMENTS
Document Type and Number:
WIPO Patent Application WO/2024/095130
Kind Code:
A1
Abstract:
Provided herein is an engineered immunoglobulin (Ig) variable domain comprising or capable of comprising at least one non-canonical disulfide bond formed between a first cysteine and a second cysteine, wherein the first and second cysteines are selected from a cysteine residue introduced at position 24, position 86, position 22, position 88, position 4, position 25, position 118, position 5, position 120, position 6, position 119, position 45, and/or position 100, wherein the position numbering refers to IMGT numbering, and wherein the first and second cysteines are not at the same position. Methods of making and using said engineered Ig variable domain(s) are also provided.

Inventors:
KIM DAE YOUNG (CA)
TANHA JAMSHID (CA)
Application Number:
PCT/IB2023/060921
Publication Date:
May 10, 2024
Filing Date:
October 30, 2023
Assignee:
NAT RES COUNCIL CANADA (CA)
International Classes:
C07K16/00; A61K39/395; C12N5/10; C12N15/13; C40B30/00; C40B40/02; C40B40/08
Attorney, Agent or Firm:
SMITH, Jessica et al. (CA)
Claims:
CLAIMS

1. An engineered immunoglobulin (Ig) variable domain comprising an introduced pair of cysteine residues comprising a first cysteine residue and a second cysteine residue, wherein: the first cysteine residue is at position 24 and the second cysteine residue is at position 86; the first cysteine residue is at position 22 and the second cysteine residue is at position 88; the first cysteine residue is at position 4 and the second cysteine residue is at position 25; the first cysteine residue is at position 4 and the second cysteine residue is at position 118; the first cysteine residue is at position 5 and the second cysteine residue is at position 120; the first cysteine residue is at position 6 and the second cysteine residue is at position 119; or the first cysteine residue is at position 45 and the second cysteine residue is at position 100, wherein the position numbering refers to IMGT numbering and wherein at least one disulfide bond is formed or is capable of forming between the first and second cysteine residues.

2. The engineered Ig variable domain of claim 1, wherein the variable domain is a heavy chain variable domain.

3. The engineered Ig variable domain of claim 1, wherein the variable domain is a VHH variable domain, a light chain variable domain, or a VNAR variable domain.

4. The engineered Ig variable domain of claim 1 or 2, wherein the engineered Ig variable domain comprises any one of the amino acid sequences set forth in SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602 or an amino acid sequence having at least 80% sequence identity to any one of the amino acid sequences set forth in SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the first and second cysteine residues are maintained.

5. The engineered Ig variable domain of any one of claims 1 to 4, wherein the pair of cysteine residues is introduced by mutating a nucleic acid molecule encoding the engineered Ig variable domain.

6. An antibody or antigen-binding fragment thereof comprising at least one engineered Ig variable domain of any one of claims 1 to 5.

7. The antibody or antigen-binding fragment thereof of claim 6, which is a single domain antibody, optionally a nanobody.

8. The antibody or antigen-binding fragment thereof of claim 6, which is an IgG1, IgG2, IgG3, IgG4, IgE, IgA, IgY, IgD, IgM, and/or IgNAR antibody.

9. The antibody or antigen-binding fragment thereof of claim 6, which is an antibody fragment selected from Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, dimers, minibodies, diabodies, and multimers thereof.

10. A polypeptide comprising a plurality of the engineered Ig variable domain of any one of claims 1 to 5.

11. A nucleic acid molecule encoding the engineered Ig variable domain of any one of claims 1 to 5, the antibody or antigen-binding fragment thereof of any one of claims 6 to 9, or the polypeptide of claim 10.

12. A vector comprising the nucleic acid molecule of claim 11.

13. A cell comprising the nucleic acid molecule of claim 11 or the vector of claim 12.

14. An animal comprising the nucleic acid molecule of claim 11 or the vector of claim 12.

15. A cell expressing the engineered Ig variable domain of any one of claims 1 to 5, the antibody or antigen-binding fragment thereof of any one of claims 6 to 9, the polypeptide of claim 10, or the nucleic acid molecule of claim 11.

16. An animal expressing the engineered Ig variable domain of any one of claims 1 to 5, the antibody or antigen-binding fragment thereof of any one of claims 6 to 9, the polypeptide of claim 10, or the nucleic acid molecule of claim 11.

17. A composition comprising the engineered Ig variable domain of any one of claims 1 to 5, the antibody or antigen-binding fragment thereof of any one of claims 6 to 9, the polypeptide of claim 10, the nucleic acid molecule of claim 11, or the vector of claim 12, and optionally comprising a pharmaceutically acceptable carrier and/or excipient.

18. A kit comprising the engineered Ig variable domain of any one of claims 1 to 5, the antibody or antigen-binding fragment thereof of any one of claims 6 to 9, the polypeptide of claim 10, the nucleic acid molecule of claim 11, or the vector of claim 12 and packaging material.

19. An engineered library comprising at least one nucleic acid molecule encoding at least one engineered Ig variable domain of any one of claims 1 to 5 or encoding at least one antibody or antigen-binding fragment thereof of any one of claims 6 to 9.

20. The engineered library of claim 19, wherein the at least one engineered Ig variable domain or the at least one antibody or antigen-binding fragment thereof is a plurality and wherein at least a subset of the plurality comprises different CDR sequences.

21. A method of making an antibody or antigen-binding fragment thereof with increased stability relative to a control, the method comprising generating an antibody or antigen-binding fragment thereof comprising at least one engineered Ig variable domain of any one of claims 1 to 5, wherein the control is an antibody or antigen-binding fragment thereof into which the pair of cysteine residues has not been introduced but is otherwise the same as the antibody or antigen-binding fragment thereof produced by the method.

22. A method of increasing stability of an antibody or antigen-binding fragment thereof relative to a control, the method comprising introducing into at least one variable domain of the antibody or antigen-binding fragment thereof a pair of cysteine residues, wherein the pair of cysteine residues is selected from: a cysteine residue at position 24 and a cysteine residue at position 86; a cysteine residue at position 22 and a cysteine residue at position 88; a cysteine residue at position 4 and a cysteine residue at position 25; a cysteine residue at position 4 and a cysteine residue at position 118; a cysteine residue at position 5 and a cysteine residue at position 120; a cysteine residue at position 6 and a cysteine residue at position 119; and a cysteine residue at position 45 and a cysteine residue at position 100, wherein: the position numbering refers to IMGT numbering, the cysteine residues introduced form or are capable of forming a disulfide bond, and the control is an antibody or antigen-binding fragment thereof into which the pair of cysteine residues has not been introduced but is otherwise the same as the antibody or antigen-binding fragment thereof produced by the method.

23. The method of claim 22, wherein the variable domain is a heavy chain variable domain.

24. The method of claim 22, wherein the variable domain is a VHH variable domain, a light chain variable domain, or a VNAR variable domain.

25. The method of claim 22 or 23, wherein the engineered Ig variable domain comprises any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602 or an amino acid sequence having at least 80%, at least 85%, at least 90%, or at least 95% sequence identity to any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the one or more than one cysteine pair is maintained.

26. The method of any one of claims 21 to 25, wherein the antibody or antigen-binding fragment thereof has increased thermostability and/or increased resistance to protease activity relative to the control.

27. A method of identifying one or more than one location in an engineered immunoglobulin (Ig) variable domain tolerant of disulfide bonds, the method comprising: a) generating a library of nucleic acid molecules encoding an engineered immunoglobulin (Ig) variable domain lacking a cysteine residue at position 23 and/or lacking a cysteine residue at position 104 and thus lacking a disulfide bond between position 23 and position 104, wherein at least one cysteine residue has been introduced into the amino acid sequence of each engineered Ig variable domain encoded by the nucleic acid molecules of the library; b) identifying Ig variable domains encoded by the nucleic acid molecules of the library which have formed one or more than one disulfide bond other than between positions 23 and 104; and c) identifying the positions between which the one or more than one disulfide bond is formed in each engineered Ig variable domain, wherein the position numbering refers to IMGT numbering.

28. A method of identifying engineered immunoglobulin (Ig) variable domain disulfide bonds that increase stability, the method comprising: a) generating a library of nucleic acid molecules encoding engineered immunoglobulin (Ig) variable domains lacking a cysteine residue at position 23 and/or lacking a cysteine residue at position 104 and thus lacking a disulfide bond between position 23 and position 104, wherein engineered pairs of cysteine residues have been introduced into the amino acid sequence of each engineered Ig variable domain encoded by the nucleic acid molecules of the library; b) identifying Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond other than between positions 23 and 104; c) determining the stability of the identified Ig variable domains, and d) identifying the positions between which the disulfide bonds are formed in each engineered Ig variable domain that has increased stability as compared to a control, wherein the position numbering refers to IMGT numbering.

29. The method of claim 27 or 28, wherein step a) comprises generating a library of nucleic acid molecules encoding an engineered immunoglobulin (Ig) variable domain lacking a cysteine residue at position 23 and/or lacking a cysteine residue at position 104.

30. The method of any one of claims 27 to 29, wherein the library is generated using Kunkel mutagenesis.

31. The method of any one of claims 27 to 30, wherein the Ig variable domains, encoded by the nucleic acid molecules of the library, that have formed at least one disulfide bond other than between positions 23 and 104 are identified using phage display.

32. The method of any one of claims 27 to 31, wherein identifying the positions between which the disulfide bonds are formed comprises next generation sequencing.

33. The method of claim 28, wherein the stability is determined by measuring thermostability and/or protease resistance of the engineered Ig variable domain.

34. The method of any one of claims 21 to 33, wherein the variable domain is a heavy chain variable domain.

35. The method of any one of claims 21 to 33, wherein the variable domain is a VHH variable domain, a light chain variable domain, or a VNAR variable domain.

36. The engineered library of claim 19, wherein the library comprises a plurality of nucleic acid molecules, each encoding an engineered Ig variable domain or an antibody or antigen-binding fragment thereof, wherein the Ig variable domains or the antibodies or antigen-binding fragments thereof encoded by the nucleic acid molecules comprise the same engineered cysteine residues but have different framework and/or CDR sequences.

Description:
IMPROVED ENGINEERED IMMUNOGLOBULINS AND BINDING FRAGMENTS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of United States provisional application no. 63/381,624, filed October 31, 2022, the contents of which are herein incorporated by reference in their entirety.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

[0002] The contents of the electronic sequence listing (2019-087-02_SL_27Oct2023.xml; Size: 573,615 bytes; and Date of Creation: October 27, 2023) are herein incorporated by reference in their entirety.

FIELD

[0003] The present disclosure relates to engineered immunoglobulins (Igs) and binding fragments thereof, and in particular to engineered Igs and antigen-binding fragments thereof that have improved stability comprising one or more non-canonical disulfide bond(s).

BACKGROUND

[0004] Intra- and inter-domain disulfide linkages play critical roles in the folding and stability of antibodies and other large, secreted macromolecules (1). Depending on isotype, human immunoglobulin (Ig) molecules contain between 14 and 23 disulfide linkages, including a highly conserved intradomain disulfide linkage connecting β-strands B and F at the core of each Ig domain (2). Incorporation of exogenous disulfide linkages into proteins can enhance their thermodynamic and kinetic stability (3), which, for therapeutic antibodies, is associated with improved manufacturability, safety, and efficacy (4).

[0005] Incorporation of engineered disulfide linkages spanning β-strands A and G (Cys4-Cys119 and Cys6-Cys119; IMGT numbering used throughout (45)) improved the thermostability of the Ig constant (C) CH2 domain (5). In Ig variable (V) domains, which form the antigen-combining sites of antibodies, the canonical B-F strand intradomain disulfide linkage is formed between Cys23-Cys104. Rarely, Ig V domains contain additional non-canonical disulfide linkages formed between Cys pairs located in framework regions (FRs): naturally occurring non-canonical linkages have been identified spanning Cys54-Cys78 (β-strands C’-D; 6,7) and Cys40-Cys55 (β-strands C-C’; 8,9), and a non-canonical linkage was successfully engineered between Cys39-Cys87 (β-strands C-E) based on structural modeling (7). The Ig V domains of some vertebrates (e.g., camelids; chickens; cows; sharks) contain non-canonical disulfide linkages at much higher frequencies (10), but these almost exclusively involve at least one Cys located in a complementarity-determining region (CDR). Incorporation of one or more non-canonical disulfide linkage(s) into the FRs of single or multiple Ig domains can additively improve the thermostability of both the individual domains and the whole IgG molecules containing them (11,12). Similar stabilizing effects have been observed for non-canonical disulfide linkages formed between positions within or structurally analogous to the CDR loops (13,14), but these cannot be tolerated in all Ig V domains without compromising antigen binding.

SUMMARY

[0006] Most immunoglobulin (Ig) domains bear only a single highly conserved canonical intradomain disulfide linkage formed between Cys23-Cys104, and incorporation of rare non-canonical disulfide linkages at other locations can enhance Ig domain stability. Here, the sequence tolerance of Ig variable (V) domain framework regions (FRs) to non-canonical disulfide linkages is exhaustively surveyed. This approach identified seven novel Cys pairs in VH FRs (Cys24-Cys86, Cys22-Cys88, Cys4-Cys25, Cys4-Cys118, Cys5-Cys120, Cys6-Cys119 and Cys45-Cys100 by IMGT numbering) whose presence putatively resulted in formation of non-canonical disulfide linkages (four intra- and three inter-β-sheet), rescuing domain folding and stability. None of the non-canonical disulfide linkages were present in the natural human VH repertoire. These data reveal an unexpected permissiveness of Ig V domains to non-canonical disulfide linkages at diverse locations in FRs, absent in the human repertoire, whose presence is compatible with antigen recognition and improves domain biophysical properties.

Accordingly, an aspect of the disclosure includes an engineered immunoglobulin (Ig) variable domain comprising an introduced pair of cysteine residues comprising a first cysteine residue and a second cysteine residue, wherein: the first cysteine residue is at position 24 and the second cysteine residue is at position 86; the first cysteine residue is at position 22 and the second cysteine residue is at position 88; the first cysteine residue is at position 4 and the second cysteine residue is at position 25; the first cysteine residue is at position 4 and the second cysteine residue is at position 118; the first cysteine residue is at position 5 and the second cysteine residue is at position 120; the first cysteine residue is at position 6 and the second cysteine residue is at position 119; or the first cysteine residue is at position 45 and the second cysteine residue is at position 100, wherein the position numbering refers to IMGT numbering and wherein at least one disulfide bond is formed or is capable of forming between the first and second cysteine residues.
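As an illustration only, the seven recited position pairs can be written down as a small lookup. The sketch below assumes a residue mapping keyed by IMGT position has already been produced by some numbering tool (not shown); the names `NON_CANONICAL_CYS_PAIRS` and `introduced_cys_pairs` are hypothetical and are not part of the disclosure. It reports which recited pairs have a cysteine at both positions; it does not establish that a disulfide bond actually forms.

```python
# The seven framework Cys pairs recited in the text (IMGT numbering).
NON_CANONICAL_CYS_PAIRS = frozenset({
    (24, 86), (22, 88), (4, 25), (4, 118), (5, 120), (6, 119), (45, 100),
})


def introduced_cys_pairs(residues_by_imgt):
    """Return the recited pairs for which both positions hold a cysteine.

    residues_by_imgt: dict mapping IMGT position (int) to a one-letter residue.
    Building this mapping requires an IMGT numbering step outside this sketch.
    """
    return sorted(
        (first, second)
        for (first, second) in NON_CANONICAL_CYS_PAIRS
        if residues_by_imgt.get(first) == "C" and residues_by_imgt.get(second) == "C"
    )


# Usage: a domain with the canonical Cys23-Cys104 plus an introduced Cys24-Cys86.
example = {23: "C", 24: "C", 86: "C", 104: "C"}
print(introduced_cys_pairs(example))
```

The canonical Cys23-Cys104 pair is deliberately not in the lookup, so it is never reported as an introduced pair.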

[0007] In an embodiment, the variable domain is a heavy chain variable domain. In another embodiment, the variable domain is a VHH variable domain, a light chain variable domain, or a VNAR variable domain.

[0008] In an embodiment, the engineered Ig variable domain comprises any one of the amino acid sequences set forth in SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602 or an amino acid sequence having at least 80% sequence identity to any one of the amino acid sequences set forth in SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the first and second cysteine residues are maintained.
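The recited SEQ ID NO list follows a regular pattern (blocks of seven consecutive numbers starting at 332 and repeating every eight), and the variant criterion combines an identity threshold with retention of the introduced cysteines. The sketch below illustrates both under stated assumptions: the text does not specify how sequence identity is computed, so identity here is a simple per-column count over two sequences that are assumed to be already aligned to equal length, and `cys_columns` (alignment column indices of the introduced cysteines) is a hypothetical input. All function names are illustrative.

```python
def listed_seq_ids():
    """Expand the recited ranges 332-338, 340-346, ..., 596-602."""
    ids = []
    for start in range(332, 597, 8):          # block starts: 332, 340, ..., 596
        ids.extend(range(start, start + 7))   # seven consecutive IDs per block
    return ids


def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length strings ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def qualifies(aligned_ref, aligned_var, cys_columns, threshold=80.0):
    """True if identity meets the threshold and the introduced cysteines are kept."""
    keeps_cysteines = all(aligned_var[i] == "C" for i in cys_columns)
    return keeps_cysteines and percent_identity(aligned_ref, aligned_var) >= threshold


print(len(listed_seq_ids()))
```

A production comparison would use a real alignment method; the per-column count is only meant to make the two-part condition (threshold plus maintained cysteines) concrete.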

[0009] In an embodiment, the pair of cysteine residues is introduced by mutating a nucleic acid molecule encoding the engineered Ig variable domain.

[0010] Another aspect of the disclosure includes an antibody or antigen-binding fragment thereof comprising at least one engineered Ig variable domain as described herein.

[0011] In an embodiment, the antibody or antigen-binding fragment thereof is a single domain antibody, optionally a nanobody. In another embodiment, the antibody or antigen-binding fragment thereof is an IgG1, IgG2, IgG3, IgG4, IgE, IgA, IgY, IgD, IgM, and/or IgNAR antibody. In an embodiment, the antibody or antigen-binding fragment thereof is an antibody fragment selected from Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, dimers, minibodies, diabodies, and multimers thereof.

[0012] Another aspect of the disclosure includes a polypeptide comprising a plurality of engineered Ig variable domains as described herein.

[0013] Another aspect of the disclosure includes a nucleic acid molecule encoding an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, or a polypeptide as described herein. The nucleic acid molecule may be comprised in a vector. The nucleic acid molecule may, for example, be incorporated into an expression cassette or expression vector for expression of the antibody binding fragment. Accordingly, an aspect of the disclosure includes an expression cassette or a vector, optionally an expression vector, comprising a nucleic acid molecule as described herein. Suitable expression vectors include but are not limited to cosmids, plasmids, or modified viruses (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses).

[0014] Another aspect of the disclosure includes a vector comprising a nucleic acid molecule as described herein.

[0015] Another aspect of the disclosure includes a cell comprising a nucleic acid molecule as described herein. In an embodiment, the cell is a mammalian cell. In a further embodiment, the cell is a human embryonic kidney (HEK) cell or a Chinese hamster ovary (CHO) cell.

[0016] Another aspect of the disclosure includes an animal comprising a nucleic acid molecule as described herein. In an embodiment, the animal is a non-human mammal or bird. In an embodiment, the animal is a transgenic or genetically engineered animal, including a progeny or descendant thereof that comprises the nucleic acid molecule.

[0017] Another aspect of the disclosure includes a cell expressing an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, a vector as described herein, or an expression cassette as described herein.

[0018] Another aspect of the disclosure includes an animal expressing an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, a vector as described herein, or an expression cassette as described herein. In an embodiment, the animal is a non-human mammal or bird. In an embodiment, the animal is a transgenic or genetically engineered animal, including a progeny or descendant thereof that comprises and expresses the engineered Ig variable domain, the antibody or binding fragment, the polypeptide, the nucleic acid molecule, the vector, or the expression cassette.

[0019] Another aspect of the disclosure includes a composition comprising an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, a vector as described herein, or an expression cassette as described herein, and optionally comprising a pharmaceutically acceptable carrier and/or excipient.

[0020] Another aspect of the disclosure includes a kit comprising an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, a vector as described herein, or an expression cassette as described herein and packaging material.

[0021] Another aspect of the disclosure includes an engineered library comprising at least one nucleic acid molecule encoding at least one engineered Ig variable domain as described herein or encoding at least one antibody or antigen-binding fragment thereof as described herein. In some embodiments, the at least one engineered Ig variable domain or the at least one antibody or antigen-binding fragment thereof is a plurality and wherein at least a subset of the plurality comprises different CDR sequences. In some embodiments, each nucleic acid molecule in the library encodes an engineered Ig variable domain as described herein. In other embodiments, only a subset of nucleic acid molecules in the library encode an engineered Ig variable domain as described herein, while other nucleic acid molecules in the library encode different Ig variable domains. In some embodiments, the library is a display library; such as a phage display library, a ribosome display library, a yeast display library, a mammalian cell display library, or a bacterial display library. In an embodiment, the library comprises a plurality of nucleic acid molecules, each encoding an engineered Ig variable domain or an antibody or antigen-binding fragment thereof, wherein the Ig variable domains or the antibodies or antigen-binding fragments thereof encoded by the nucleic acid molecules comprise the same engineered cysteine residues but have different framework and/or CDR sequences.

[0022] Another aspect of the disclosure includes a method of making an antibody or antigen-binding fragment thereof having increased stability relative to a control, the method comprising generating an antibody or antigen-binding fragment thereof comprising at least one of the engineered Ig variable domains as described herein, wherein the control is an antibody or antigen-binding fragment thereof into which the pair of cysteine residues has not been introduced but is otherwise the same as the antibody or antigen-binding fragment thereof produced by the method.

[0023] Another aspect of the disclosure includes a method of increasing stability of an antibody or antigen-binding fragment thereof relative to a control, the method comprising introducing into at least one variable domain of the antibody or antigen-binding fragment thereof a pair of cysteine residues, wherein the pair of cysteine residues is selected from: a cysteine residue at position 24 and a cysteine residue at position 86; a cysteine residue at position 22 and a cysteine residue at position 88; a cysteine residue at position 4 and a cysteine residue at position 25; a cysteine residue at position 4 and a cysteine residue at position 118; a cysteine residue at position 5 and a cysteine residue at position 120; a cysteine residue at position 6 and a cysteine residue at position 119; and a cysteine residue at position 45 and a cysteine residue at position 100, wherein: the position numbering refers to IMGT numbering, the cysteine residues introduced form or are capable of forming a disulfide bond, and the control is an antibody or antigen-binding fragment thereof into which the pair of cysteine residues has not been introduced but is otherwise the same as the antibody or antigen-binding fragment thereof produced by the method.

[0024] In some embodiments, the variable domain is a heavy chain variable domain. In other embodiments, the variable domain is a VHH variable domain, a light chain variable domain, or a VNAR variable domain.

[0025] In particular embodiments, the engineered Ig variable domain comprises any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602 or an amino acid sequence having at least 80%, at least 85%, at least 90%, or at least 95% sequence identity to any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the one or more than one cysteine pair is maintained.

[0026] In some embodiments, the resulting antibody or antigen-binding fragment thereof has increased thermostability and/or increased resistance to protease activity relative to the control.

[0027] Another aspect of the disclosure includes a method for identifying one or more than one location in an engineered immunoglobulin (Ig) variable domain tolerant of disulfide bonds, the method comprising a) generating a library of nucleic acid molecules encoding an engineered immunoglobulin (Ig) variable domain lacking a cysteine residue at position 23 and/or lacking a cysteine residue at position 104 and thus lacking a disulfide bond between position 23 and position 104, wherein at least one cysteine residue has been introduced into the amino acid sequence of each engineered Ig variable domain encoded by the nucleic acid molecules of the library; b) identifying Ig variable domains encoded by the nucleic acid molecules of the library which have formed one or more than one disulfide bond other than between positions 23 and 104; and c) identifying the positions between which the one or more than one disulfide bond is formed in each engineered Ig variable domain, wherein the position numbering refers to IMGT numbering.

[0028] Another aspect of the disclosure includes a method of identifying engineered immunoglobulin (Ig) variable domain disulfide bonds that increase stability or solubility of the Ig variable domain, the method comprising: a) generating a library of nucleic acid molecules encoding engineered immunoglobulin (Ig) variable domains lacking a cysteine residue at position 23 and/or lacking a cysteine residue at position 104 and thus lacking a disulfide bond between position 23 and position 104, wherein engineered pairs of cysteine residues have been introduced into the amino acid sequence of each engineered Ig variable domain encoded by the nucleic acid molecules of the library; b) identifying Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond other than between positions 23 and 104; c) determining the stability of the identified Ig variable domains, and d) identifying the positions between which the disulfide bonds are formed in each engineered Ig variable domain that has increased stability as compared to a control, wherein the position numbering refers to IMGT numbering. In an embodiment of the method, the stability is determined by measuring thermostability and/or protease resistance of the engineered Ig variable domain.
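Step a) of the identification methods can be pictured as an in silico enumeration of Cys pair variants on a template lacking the canonical cysteines. The sketch below is only that: a design-stage enumeration, not a mutagenesis protocol. The replacement residue for positions 23 and 104 (alanine here), the choice of scanned positions, and the function name `cys_pair_scan` are assumptions made for illustration and are not taken from the text.

```python
from itertools import combinations

CANONICAL_POSITIONS = (23, 104)  # canonical intradomain disulfide (IMGT numbering)


def cys_pair_scan(template, scan_positions, null_residue="A"):
    """Yield ((pos_a, pos_b), variant) for every Cys pair over scan_positions.

    template: dict mapping IMGT position (int) to a one-letter residue.
    null_residue: residue substituted for Cys23 and Cys104 (assumed, not from the text).
    """
    base = dict(template)
    for pos in CANONICAL_POSITIONS:
        if base.get(pos) == "C":
            base[pos] = null_residue          # remove the canonical disulfide
    # Scan only positions present in the template, never the canonical ones.
    positions = [p for p in sorted(set(scan_positions))
                 if p in base and p not in CANONICAL_POSITIONS]
    for pos_a, pos_b in combinations(positions, 2):
        variant = dict(base)
        variant[pos_a] = "C"
        variant[pos_b] = "C"
        yield (pos_a, pos_b), variant


# Usage with a toy five-position template (residues are placeholders).
toy_template = {4: "L", 23: "C", 25: "S", 104: "C", 118: "W"}
for pair, variant in cys_pair_scan(toy_template, [4, 25, 118]):
    print(pair, "".join(variant[p] for p in sorted(variant)))
```

Each yielded variant carries exactly one introduced pair, which matches the idea of screening pairs individually; selection and sequencing steps b) to d) are experimental and are not modeled here.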

[0029] In some embodiments, generating a library of nucleic acid molecules comprises generating a library of nucleic acid molecules encoding an engineered Ig variable domain lacking a cysteine residue at position 23 and/or lacking a cysteine residue at position 104. In some embodiments, the library is generated using Kunkel mutagenesis.

[0030] In some embodiments, the Ig variable domains, encoded by the nucleic acid molecules of the library, that have formed at least one disulfide bond other than between positions 23 and 104, are identified using phage display.

[0031] In some embodiments, identifying the positions between which the disulfide bonds are formed comprises next generation sequencing.

[0032] In some embodiments, the variable domain is a heavy chain variable domain.

[0033] In some embodiments, the variable domain is a VHH variable domain, a light chain variable domain, or a VNAR variable domain.

BRIEF DESCRIPTION OF DRAWINGS

[0034] An embodiment of the present disclosure will now be described in relation to the drawings in which:

[0035] Figures 1A - 1D. Fig. 1A provides a cartoon representation showing rescue of a destabilized Cys23-Cys104 null immunoglobulin VH domain by Cys pair scanning of FRs and enrichment of phage-displayed VHs bearing non-canonical disulfide linkages from libraries using protein A selection. Fig. 1B shows panning of fd phage displaying VHs or their Cys23-Cys104 null derivatives against protein A. Only VH413 C23-C104 null phage showed a significant difference in post-selection output phage titer compared with the WT VH-displaying phage (arrow heads). Pre, pre-selection; Post, post-selection. Fig. 1C shows the effect of introducing known canonical (Cys23-Cys104) and non-canonical (Cys40-Cys55 and Cys54-Cys78) disulfide linkages into VH413 C23-C104 null fd phage on output phage titer following protein A selection. “None” refers to unmodified VH413 C23-C104 null displaying fd phage (black bar and dotted line). The effects of a negative control Cys pair (Cys54-Cys87) were also assessed. Fig. 1D shows panning of test VH413 C23-C104 null Cys pair scan libraries (B-F, C-C’ and C’-D) against protein A. The percent frequencies of known non-canonical stabilizing disulfide linkages formed between these β-strands are indicated. Four rounds of panning were performed. The pannings in Figures 1B, 1C and 1D were performed at room temperature with no heating step. ‘Library’ denotes the unpanned libraries; cfu, colony-forming units.

[0036] Figures 2A and 2B. Fig. 2A shows the effect of introducing eight Cys pairs forming putative non-canonical FR disulfide linkages on the Tms of two VHs (VH413 and VH428) and two VHHs (A4.2 and A26.8). Tm was measured using differential scanning fluorimetry. Four negative control Cys pairs predicted not to form disulfide linkages were also assessed. Fig. 2B shows the effect of introducing seven Cys pairs forming putative non-canonical FR disulfide linkages on Ig V domain thermostability, other properties and antigen recognition. VHHs are shown as open squares, VHs as open circles, and VLs as X symbols. Expression (%) refers to the proportion of sdAbs expressing with adequate yields for subsequent experiments. Changes in sdAb Tm and fraction refolded (α-value) following introduction of FR Cys pairs were measured by circular dichroism. Changes in sdAb monomericity following introduction of FR Cys pairs were measured by SEC-MALS. Changes in the KDs of antigen-specific VHHs following introduction of FR Cys pairs were assessed by SPR. For changes (Δ) in biophysical properties, values above the null (ΔMonomer 0%, ΔTm 0°C, and Δα-value 0) indicate improvement, while for changes in binding affinity, values above the null (ΔKD 1) reflect weaker binding. Median values are indicated by black lines.

[0037] Figure 3 shows the effect of introducing Cys pairs forming putative non-canonical FR disulfide linkages on VHH pepsin resistance. The three Cys pairs conferring the highest increase in sdAb Tm (Cys24-Cys86, Cys4-Cys25 and Cys6-Cys119) were assessed as well as one Cys pair (Cys22-Cys88) that had an intermediate effect on Tm. The VHHs were digested with three concentrations of pepsin for 1 h at 37°C, and then the remaining undigested VHH was quantitated by SDS-PAGE and densitometry. Lines indicate averages for each Cys pair.

[0038] Figures 4A - 4E show the presence of non-canonical Cys residues in the human expressed VH repertoires of 17 individuals. Fig. 4A shows the frequency of non-canonical Cys residues in VH FRs and CDRs, disregarding IMGT positions 23 and 104. Fig. 4B shows the frequency of expressed VHs bearing the indicated number of Cys residues. This analysis includes Cys23/Cys104, and thus the majority of VHs bear two Cys residues. Fig. 4C provides a circos plot showing the relative frequencies and strands of non-canonical Cys pairs in the subset of expressed human VHs bearing >4 FR Cys residues. The size of each β-strand along the circle’s circumference reflects the frequency of non-canonical Cys, and the overlap shows the relative frequency of the second Cys in the pair. Data represent median values across 17 VH repertoires. Fig. 4D shows the most frequent β-strands bearing non-canonical Cys pairs among human VHs bearing >4 FR Cys residues. Fig. 4E shows the most frequent non-canonical Cys pairs among human VHs bearing >4 FR Cys residues. Lines represent medians. Note that y axes are log-scaled and zero values are not plotted.

[0039] Figure 5 shows LC-MS analysis of free sulfhydryl abundance in WT and Cys-engineered VHHs. For each category of VHH (WT or the indicated engineered Cys pair), the percentage of reduced and non-reduced protein analyzed bearing 0, 2, and 4 maleimide-PEG2-biotin labels is indicated (left y-axis; sum: 100%). The molar ratio of free sulfhydryl groups to total VHH protein is shown as a black dotted line (right y-axis). Error bars represent standard errors of the means. Abbreviations: G, guanidine-HCl; L, label (maleimide-PEG2-biotin); T, TCEP.

DETAILED DESCRIPTION

I. Definitions

[0040] As used herein, the following terms may have meanings ascribed to them below, unless specified otherwise. However, it should be understood that other meanings that are known or understood by those having ordinary skill in the art are also possible, and within the scope of the present disclosure. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

[0041] Unless otherwise defined, scientific and technical terms used in connection with the present disclosure shall have the meanings that are commonly understood by those of ordinary skill in the art. Generally, nomenclatures utilized in connection with, and techniques of, cell and tissue culture, molecular biology, and protein and oligonucleotide or polynucleotide chemistry and hybridization described herein are those well-known and commonly used in the art (see, e.g., Green, M. and Sambrook, J. (2012) Molecular Cloning: A Laboratory Manual. 4th Edition, Vol. II, Cold Spring Harbor Laboratory Press, New York.).

[0042] As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural references unless the content clearly dictates otherwise. Thus, for example, a composition comprising “a compound” includes a mixture of two or more compounds and the term "a cell" includes a single cell as well as a plurality or population of cells. It should also be noted that the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.

[0043] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the description. Ranges from any lower limit to any upper limit are contemplated. The upper and lower limits of these smaller ranges, which may independently be included in the smaller ranges, are also encompassed within the description, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the description.

[0044] As used in this application and claim(s), the word “consisting” and its derivatives, are intended to be close-ended terms that specify the presence of stated features, elements, components, groups, integers, and/or steps, and also exclude the presence of other unstated features, elements, components, groups, integers and/or steps.

[0045] In understanding the scope of the present disclosure, the term "comprising" and its derivatives, (such as "comprise" and "comprises"), "having" (and any form of having, such as "have" and "has"), "including" (and any form of including, such as "include" and "includes") or "containing" (and any form of containing, such as "contain" and "contains"), as used herein, are intended to be open ended terms that specify the presence of the stated features, elements, components, groups, integers, and/or steps, but do not exclude the presence of other unstated features, elements, components, groups, integers and/or steps. The foregoing also applies to words having similar meanings such as the terms, "including", "having" and their derivatives.

[0046] The terms "about", “substantially” and “approximately” as used herein mean a reasonable amount of deviation of the modified term such that the end result is not significantly changed. These terms of degree should be construed as including a deviation of at least ±5% or at least ±10% of the modified term if this deviation would not negate the meaning of the word it modifies.

[0047] The phrase "and/or," as used herein in the specification and in the claims, should be understood to mean "either or both" of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause, whether related or unrelated to those elements specifically identified.

[0048] As used herein in the specification and in the claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as "only one of" or "exactly one of" or, when used in the claims, "consisting of", will refer to the inclusion of exactly one element of a number or list of elements. In general, the term "or" as used herein shall only be interpreted as indicating exclusive alternatives (i.e., "one or the other but not both") when preceded by terms of exclusivity, such as "either," "one of," "only one of," or "exactly one of."

[0049] As used herein in the specification and in the claims, the phrase "at least one," in reference to a list of one or more element(s), should be understood to mean at least one element selected from any one or more of the element(s) in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified.

[0050] It should also be understood that, in certain methods described herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited unless the context indicates otherwise.

[0051] The term “complementarity determining region” or “CDR” as used herein refers to particular hypervariable regions of antibodies that are commonly presumed to contribute to epitope binding.

[0052] The term "antigen-binding fragment" as used herein refers to a part or portion of an antibody or antibody chain comprising at least a variable domain and fewer amino acid residues than an intact or complete antibody or antibody chain and which binds the antigen or competes with intact antibody. Exemplary antigen-binding fragments include Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibodies, diabodies, and multimers thereof, fusions comprising at least two antibody domains, and single domain antibodies such as VHHs, VLs, VHs and variable new antigen receptors (VNARs). Fragments can be obtained via chemical or enzymatic treatment of an intact or complete antibody or antibody chain. Fragments can also be obtained by recombinant means. Antibody binding fragments may be any class of immunoglobulins including: IgG, IgM, IgD, IgA, IgY or IgE; and any isotype thereof, including IgG1, IgG2 (e.g., IgG2a, IgG2b), IgG3 and IgG4.

[0053] The term “variable domain” as used herein refers to a polypeptide that is an immunoglobulin domain comprising three complementarity determining regions along with four framework regions (i.e., FR1, FR2, FR3, and FR4).

[0054] The term “cell” as used herein refers to a single cell or a plurality of cells.

[0055] A "conservative amino acid substitution" as used herein, is one in which one amino acid residue is replaced with another amino acid residue without abolishing the protein's desired properties. Suitable conservative amino acid substitutions can be made by substituting amino acids with similar hydrophobicity, polarity, and R-chain length for one another. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as alanine, isoleucine, valine, leucine or methionine for another, the substitution of one polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between glycine and serine, the substitution of one basic residue such as lysine, arginine or histidine for another, or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another. The phrase “conservative substitution” also includes the use of a chemically derivatized residue or non-natural amino acid in place of a non-derivatized residue provided that such polypeptide displays the requisite activity.

[0056] As used herein, the terms “peptide,” “polypeptide,” and “protein” refer to any chain of two or more natural or unnatural amino acid residues, regardless of post-translational modifications (e.g., glycosylation or phosphorylation). Included are proteins that are a single polypeptide chain and multisubunit proteins (e.g., composed of 2 or more polypeptide chains).

[0057] The term "sequence identity" as used herein refers to the percentage of sequence identity between two amino acid sequences or two nucleic acid sequences. To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = [number of identical overlapping positions] / [total number of positions] × 100%). The determination of percent identity between two sequences can also be accomplished using a mathematical algorithm. One non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. U.S.A. 87:2264-2268, modified as in Karlin and Altschul, 1993, Proc. Natl. Acad. Sci. U.S.A. 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al., 1990. BLAST nucleotide searches can be performed with the NBLAST nucleotide program parameters set, e.g., for score=100, wordlength=12 to obtain nucleotide sequences homologous to nucleic acid molecules of the present disclosure. BLAST protein searches can be performed with the XBLAST program parameters set, e.g., to score=50, wordlength=3 to obtain amino acid sequences homologous to a protein molecule of the present disclosure. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402. Alternatively, PSI-BLAST can be used to perform an iterated search which detects distant relationships between molecules. When utilizing BLAST, Gapped BLAST, and PSI-BLAST programs, the default parameters of the respective programs (e.g., of XBLAST and NBLAST) can be used (see, e.g., the NCBI website). Another non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, 1988, CABIOS 4:11-17. Such an algorithm is incorporated in the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically only exact matches are counted.
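As a non-limiting illustration of the percent identity formula above, the following minimal sketch counts exact matches between two sequences that have already been aligned (for example by one of the programs named above). It assumes that the aligned sequences have equal length, that gaps are written as "-", and that the total number of positions is taken to be the alignment length. The function name and these conventions are assumptions made for illustration and are not prescribed by the disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Only exact matches are counted; a position holding a gap in either
    sequence never counts as identical. The denominator is the alignment
    length (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

Other conventions for the denominator, such as the length of the shorter sequence or the number of ungapped columns, would give different values for the same alignment.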

[0058] The terms "nucleic acid" and "nucleic acid molecule", as used herein, are intended to include unmodified DNA or RNA or modified DNA or RNA. The nucleic acid molecules of the disclosure may contain one or more modified base(s) or DNA or RNA backbone(s) modified for stability or for other reasons. Unless otherwise indicated, standard IUPAC-IUB nomenclature is used herein. "Modified" bases include, for example, tritiated bases and unusual bases such as inosine. A variety of modifications can be made to DNA and RNA; thus "nucleic acid molecule" embraces chemically, enzymatically, or metabolically modified forms. The term "polynucleotide" shall have a corresponding meaning. The nucleic acid molecule can be either double stranded or single stranded, and represents the sense or antisense strand. Further, the term "nucleic acid molecule" includes the complementary nucleic acid sequences as well as codon optimized or synonymous codon equivalents. The term "isolated nucleic acid molecule" as used herein refers to a nucleic acid molecule substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical precursors, or other chemicals when chemically synthesized.

[0059] The term "vector" as used herein comprises any intermediary vehicle for a nucleic acid molecule which enables said nucleic acid molecule, for example, to be introduced into prokaryotic and/or eukaryotic cells and/or integrated into a genome, and includes plasmids, phagemids, bacteriophage or viral vectors such as retroviral based vectors, adeno-associated viral vectors and the like. The term "plasmid" as used herein generally refers to a construct of extrachromosomal genetic material, usually a circular DNA duplex, which can replicate independently of chromosomal DNA.

[0060] The term “pharmaceutically acceptable” means compatible with the treatment of animals, in particular, humans.

[0061] The definitions and embodiments described in particular sections are intended to be applicable to other embodiments herein described for which they are suitable as would be understood by a person skilled in the art.

[0062] The recitation of numerical ranges by endpoints herein includes all numbers and fractions subsumed within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.90, 4, and 5). It is also to be understood that all numbers and fractions thereof are presumed to be modified by the term "about".

II. Engineered Variable Domains, Antibodies, Nucleic Acids, Libraries, Cells, and Animals

[0063] An aspect of the disclosure includes an engineered immunoglobulin (Ig) variable domain comprising or capable of comprising at least one non-canonical disulfide bond wherein the at least one non-canonical disulfide bond is or can be formed between a cysteine residue at position 24 and a cysteine residue at position 86; a cysteine residue at position 22 and a cysteine residue at position 88; a cysteine residue at position 4 and a cysteine residue at position 25; a cysteine residue at position 4 and a cysteine residue at position 118; a cysteine residue at position 5 and a cysteine residue at position 120; a cysteine residue at position 6 and a cysteine residue at position 119; and/or a cysteine residue at position 45 and a cysteine residue at position 100; wherein the position numbering refers to IMGT numbering.
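For illustration only, the seven position pairs recited above can be represented as data and checked against a variable domain sequence. The sketch below assumes that the sequence has already been assigned IMGT numbering by a separate tool and is supplied as a mapping from IMGT position to one-letter residue. The names CYS_PAIRS and introduced_cys_pairs are hypothetical. The check reports only whether both cysteines of a pair are present; it does not establish that a disulfide bond has formed.

```python
# The seven Cys position pairs recited above (IMGT numbering).
CYS_PAIRS = (
    (24, 86),
    (22, 88),
    (4, 25),
    (4, 118),
    (5, 120),
    (6, 119),
    (45, 100),
)


def introduced_cys_pairs(numbered_seq):
    """List the recited pairs for which both positions hold a cysteine.

    numbered_seq: dict mapping IMGT position (int) -> one-letter residue.
    """
    return [
        (a, b)
        for a, b in CYS_PAIRS
        if numbered_seq.get(a) == "C" and numbered_seq.get(b) == "C"
    ]
```

A domain retaining the canonical Cys23 and Cys104 and additionally carrying Cys24 and Cys86 would be reported as carrying the (24, 86) pair only, since positions 23 and 104 are not among the recited pairs.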

[0064] The engineered immunoglobulin (Ig) variable domain “capable of comprising at least one non-canonical disulfide bond” refers to, for example, an engineered Ig variable domain that will form or does form (e.g., can be formed, capable of forming) a disulfide bridge between two cysteine residues under suitable conditions, e.g., oxidizing conditions. In some embodiments, the engineered immunoglobulin (Ig) variable domain is capable of comprising at least one non-canonical disulfide bond wherein the at least one non-canonical disulfide bond can be formed between two cysteine residues under oxidizing conditions. Suitable oxidizing conditions can include conditions found in an oxidizing portion or organelle of a cell, such as: the periplasmic space of a prokaryotic cell, the cytoplasm of a prokaryotic cell that has been engineered to have an oxidizing cytoplasm (e.g., an E. coli strain with a trxB and/or gor mutation), or the endoplasmic reticulum or Golgi apparatus of a eukaryotic cell. Suitable oxidizing conditions can also include in vitro conditions. For example, a denatured engineered Ig variable domain may be renatured using an oxidizing renaturation solution, as will be known to one skilled in the art. Typical renaturation solutions include a suitable buffer, buffer concentration and buffer pH; a suitable concentration of the protein to be renatured; stabilizer(s), such as L-Arg; and a suitable concentration of the redox pair. A renaturation solution may also include chaperones and enzymes to help with the folding of the protein and formation of correct disulfide linkages.

[0065] In some embodiments, the at least one non-canonical disulfide bond is formed within a single framework region. In some embodiments, the at least one non-canonical disulfide bond is formed between two different framework regions.

[0066] In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 24 and a cysteine residue at position 86. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 22 and a cysteine residue at position 88. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 4 and a cysteine residue at position 25. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 4 and a cysteine residue at position 118. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 5 and a cysteine residue at position 120. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 6 and a cysteine residue at position 119. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 45 and a cysteine residue at position 100.

[0067] In some embodiments, the variable domain is a heavy chain variable domain.

[0068] In some embodiments, the variable domain is a VHH domain.

[0069] In some embodiments, the variable domain is a light chain variable domain. In some embodiments, the light chain variable domain is of the lambda or kappa family.

[0070] In some embodiments, the variable domain is a VNAR variable domain.

[0071] It is expected that all Ig variable domains (irrespective of sequence) can be used herein, due to structural conservation of the variable Ig fold in different Ig variable domains. In some embodiments, the variable domain is human. In other embodiments, the variable domain is non-human, for example murine, rabbit, shark, camel, llama, alpaca, vicugna, guanaco or chicken.

[0072] In some embodiments, the engineered Ig variable domain comprises any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602 or an amino acid sequence having at least 70%, at least 80%, at least 85%, at least 90% or at least 95% sequence identity to any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the cysteine residue(s) introduced at position 24, position 86, position 22, position 88, position 4, position 25, position 118, position 5, position 120, position 6, position 119, position 45, and/or position 100 are maintained. In some embodiments, the engineered Ig variable domain comprises a conservatively substituted amino acid sequence of any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the cysteine residue(s) introduced at position 24, position 86, position 22, position 88, position 4, position 25, position 118, position 5, position 120, position 6, position 119, position 45, and/or position 100 are maintained. In some embodiments, the CDRs of the engineered Ig domain can be any suitable CDRs.

[0073] In some embodiments, the cysteine residue is introduced by mutating a nucleic acid molecule encoding the engineered Ig variable domain as further described below. In some embodiments, artificial gene synthesis, site-directed mutagenesis, overlap extension polymerase chain reaction (PCR), or genome editing technologies, such as CRISPR/Cas9, are used to mutate the nucleic acid molecule encoding the engineered Ig variable domain. In some embodiments, artificial gene synthesis is used to mutate the nucleic acid molecule encoding the engineered Ig variable domain.
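As a non-limiting illustration of how a cysteine may be introduced at the sequence level, for example when designing a construct for artificial gene synthesis, the sketch below replaces one codon of an in-frame coding sequence with a cysteine codon (TGC or TGT in the standard genetic code). The function name is hypothetical, and the 0-based codon index is an assumption of this sketch; the mapping from an IMGT position to a codon index depends on the particular domain and numbering and is not shown.

```python
# Cysteine codons in the standard genetic code.
CYS_CODONS = ("TGC", "TGT")


def introduce_cys(coding_dna, codon_index, codon="TGC"):
    """Return coding_dna with the codon at codon_index replaced by a Cys codon.

    coding_dna: in-frame coding sequence (length divisible by 3).
    codon_index: 0-based index of the codon to replace (sketch convention).
    """
    if codon not in CYS_CODONS:
        raise ValueError("codon must encode cysteine (TGC or TGT)")
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * codon_index
    if not 0 <= start < len(coding_dna):
        raise IndexError("codon index outside the coding sequence")
    return coding_dna[:start] + codon + coding_dna[start + 3:]
```

Calling the function twice, once for each position of a pair, yields a sequence encoding both introduced cysteines.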

[0074] Another aspect of the disclosure includes an antibody or antigen-binding fragment thereof comprising at least one engineered Ig variable domain as described herein.

[0075] In some embodiments, the antibody is human. In other embodiments, the antibody is non-human, for example murine, rabbit or chicken.

[0076] In some embodiments, the antibody is a single domain antibody. In some embodiments, the single domain antibody is a nanobody.

[0077] In some embodiments, the antibody is an IgG1, IgG2, IgG3, IgG4, IgE, IgA, IgD, IgY, IgM, and/or IgNAR.

[0078] In some embodiments, the antigen-binding fragment is selected from Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibody, diabody, and multimers thereof.

[0079] Another aspect of the disclosure includes a polypeptide comprising a plurality of engineered Ig variable domains as described herein. In some embodiments, the plurality comprises a plurality of the same engineered Ig variable domain. In some embodiments, the plurality comprises different engineered Ig variable domains.

[0080] In some embodiments, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein or an engineered Ig variable domain as described herein may be fused to an antibody fragment such as one or more than one Fc domain, Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibody, diabody, or multimers thereof. In some embodiments, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein or an engineered Ig variable domain as described herein may be fused to a single domain antibody, such as a VHH, VL, VH or VNAR.

[0081] In some embodiments, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, or an engineered Ig variable domain as described herein may be fused or conjugated to a cargo molecule, optionally via a linker or a chelator. The cargo molecule may be, for example, a diagnostic or therapeutic agent, which are known in the art. For example, a therapeutic agent may be a radioisotope, which may be used for radioimmunotherapy; a toxin, such as an immunotoxin; a cytokine, such as an immunocytokine; a cytotoxin; an apoptosis inducer; an enzyme; or any other suitable therapeutic molecule known in the art. A diagnostic agent may include, for example, a radioisotope, a paramagnetic label, a fluorophore, an affinity label, fluorescent protein or any other suitable agent that may be detected by imaging methods.

[0082] Another aspect of the disclosure includes a nucleic acid molecule encoding an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, or a polypeptide as described herein. The nucleic acid molecule may be comprised in a vector. The nucleic acid molecule may, for example, be incorporated into an expression cassette or expression vector for expression of the antibody or antigen-binding fragment thereof. Accordingly, an aspect of the present disclosure includes an expression cassette or a vector, optionally an expression vector, comprising a nucleic acid molecule as described herein. Suitable expression vectors include, but are not limited to, cosmids, plasmids, or modified viruses (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses).

[0083] Another aspect of the disclosure is a library comprising a plurality of nucleic acid molecules or vectors as described herein. In an embodiment, the nucleic acid molecules or vectors in the library may comprise different engineered cysteine residues, while otherwise having the same sequence. Such a library may be used, for example, to screen for disulfide bonds that increase stability of an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, or a polypeptide as described herein. In another embodiment, the nucleic acid molecules or vectors in the library may encode the same engineered cysteine residues, but encode different framework and/or CDR sequences. Such a library may, for example, be used to screen for an Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, or a polypeptide as described herein that binds an antigen or target of interest. In a particular embodiment, the library may be a phage display library.

[0084] Another aspect of the disclosure includes a vector comprising a nucleic acid molecule as described herein.

[0085] Another aspect of the disclosure includes a cell expressing an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, or a vector or expression cassette as described herein. The engineered cell can be prepared by introducing the nucleic acid molecule, expression cassette, or vector into a suitable host cell. Preferably, the host cell is suitable for producing large quantities of the expression cassette or the vector. As would be known to the skilled artisan, a vector compatible with the particular host cell is used.

[0086] The engineered cell can be generated using any host cell suitable for producing a polypeptide, for example suitable for producing an engineered Ig variable domain, an antibody and/or an antigen-binding fragment thereof. For example, to introduce a nucleic acid molecule and/or a vector into a cell, the host cell may be transfected, transformed or infected, depending upon the vector employed. Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For example, engineered Ig variable domains and antibodies and antigen-binding fragments thereof as described herein may be expressed in bacterial cells such as E. coli, insect cells (using baculovirus), yeast cells or mammalian cells. In some embodiments, the engineered cell is a human cell. In some embodiments, the engineered cell is a HEK cell. In some embodiments, the engineered cell is a CHO cell. The engineered cell may be used, for example, for in vitro production of an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, or a polypeptide as described herein.

[0087] Another aspect of the disclosure includes an animal engineered to comprise VDJ gene segments comprising a nucleic acid molecule encoding an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, or a polypeptide as described herein, which allows the animal to express the engineered Ig variable domain, the antibody or antigen-binding fragment thereof, or the polypeptide. The animal may be any species into which VDJ gene segments comprising the nucleic acid molecule may be introduced, for example by transgenesis or genetic engineering. Suitable animals include, for example, mice, rats, chickens, rabbits, cattle, pigs, sheep, and goats. In some embodiments, the animal may be used to produce heterologous antibodies, or antigen-binding fragments thereof, such as human, humanized, or camelid antibodies or antigen-binding fragments thereof.

[0088] Another aspect of the disclosure includes an engineered library comprising at least one nucleic acid molecule encoding at least one engineered Ig variable domain or at least one antibody or antigen-binding fragment thereof as described herein. In some embodiments, the at least one engineered Ig variable domain or the at least one antibody or antigen-binding fragment thereof is a plurality and at least a subset of the plurality comprises different CDR sequences. In some embodiments, the at least one engineered Ig variable domain is part of at least one antibody binding fragment, such as Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibody, diabody, or multimers thereof or a single domain antibody such as VH, VL, VHH, or VNAR.

[0089] In some embodiments, the engineered library is a phage display library. In some embodiments, the library has a diversity of a library in Table 1. In some embodiments, the library is a library as described in Example 1. In some embodiments, the engineered library is made using mutagenesis, optionally Kunkel mutagenesis. In some embodiments, the engineered library is made using the method described in Example 1. In some embodiments, the engineered library is a ribosome display library. In some embodiments, the engineered library is a yeast display library. In some embodiments, the engineered library is a mammalian cell display library. In some embodiments, the engineered library is a bacterial display library. In some embodiments, the engineered library is a VHH library. In some embodiments, the engineered library is an Ig light chain variable domain library. In some embodiments, the engineered library is an Ig heavy chain variable domain library.

[0090] In some embodiments, the library is generated using an antibody display technology. In some embodiments, the library is generated using phage display. In some embodiments, the library is generated using ribosome display. In some embodiments, the library is generated using yeast display. In some embodiments, the library is generated using mammalian cell display. In some embodiments, the library is generated using bacterial display.

[0091] In some embodiments, the library is generated using Kunkel mutagenesis. In some embodiments, the library is generated according to the methods provided in the Examples. In some embodiments, the library is generated using trinucleotide mutagenesis, PCR with degenerate oligonucleotides, random error-prone PCR mutagenesis, recombination, or any other methods known in the art for library generation.

[0092] An engineered Ig variable domain as described herein can be considered a variable domain scaffold for preparing desired stabilized antibodies. For example, the engineered variable domain, i.e., scaffold, can be used to prepare specific antibodies by grafting in desired CDRs.

III. Compositions

[0093] Another aspect of the disclosure includes a composition comprising an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, or a vector or expression cassette as described herein and optionally a pharmaceutically acceptable carrier and/or excipient.

[0094] In an embodiment the composition comprises a diluent. Suitable diluents for nucleic acid molecules include but are not limited to water, saline solutions and ethanol. Suitable diluents for polypeptides, including engineered Ig variable domains, antibodies or fragments thereof and/or cells, include but are not limited to saline solutions, pH buffered solutions and glycerol solutions or other solutions suitable for freezing polypeptides and/or cells.

[0095] The composition may be formulated for use or prepared for administration to a subject using pharmaceutically acceptable formulations known in the art. Conventional procedures and ingredients for the selection and preparation of suitable formulations are described, for example, in Remington's Pharmaceutical Sciences (2003 - 20th edition) and in The United States Pharmacopeia: The National Formulary (USP 24 NF19) published in 1999.

[0096] The compositions described herein can be prepared by per se known methods for the preparation of pharmaceutically acceptable compositions that can be administered to subjects such that an effective quantity of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle.

[0097] Pharmaceutical compositions include, without limitation, lyophilized powders or aqueous or non-aqueous sterile injectable solutions or suspensions, which may further contain antioxidants, buffers, bacteriostats and solutes that render the compositions substantially compatible with the tissues or the blood of an intended recipient. Other components that may be present in such compositions include, for example, water, surfactants (such as Tween), alcohols, polyols, glycerin and vegetable oils. Extemporaneous injection solutions and suspensions may be prepared from sterile powders, granules, tablets, or concentrated solutions or suspensions. The composition may be supplied, for example but not by way of limitation, as a lyophilized powder which is reconstituted with sterile water or saline prior to administration to a patient.

IV. Methods

[0098] Another aspect of the disclosure includes a method of making an antibody or antigen-binding fragment thereof with increased stability, the method comprising generating an antibody or antigen-binding fragment thereof comprising at least one of the engineered Ig variable domains described herein.

[0099] In some embodiments, the method comprises substituting one or more than one amino acid residue in a variable domain at position 24, 86, 22, 88, 4, 25, 118, 5, 120, 6, 119, 45, and/or 100, based on IMGT numbering, with a cysteine residue, wherein said substituting comprises providing a polynucleotide encoding an amino acid sequence of the modified antibody in which the one or more than one amino acid residue at position 24, 86, 22, 88, 4, 25, 118, 5, 120, 6, 119, 45, and/or 100 is a cysteine residue; and producing the antibody with increased stability.

[00100] In some embodiments, the antibody or antigen-binding fragment thereof is generated by expressing a nucleic acid molecule encoding the at least one engineered Ig variable domain described herein. In some embodiments, the nucleic acid molecule encoding the at least one engineered Ig variable domain described herein is generated using Kunkel mutagenesis. In some embodiments, the nucleic acid molecule encoding the at least one engineered Ig variable domain described herein is generated using the methods described in the Examples. In some embodiments, the nucleic acid molecule encoding the at least one engineered Ig variable domain described herein comprises a cysteine residue introduced by mutating a nucleic acid molecule encoding the engineered Ig variable domain as further described below. In some embodiments, artificial gene synthesis, site-directed mutagenesis, overlap extension PCR, or a genome editing technology, such as CRISPR/Cas9, is used to mutate the nucleic acid molecule encoding the engineered Ig variable domain. In some embodiments, artificial gene synthesis is used to mutate the nucleic acid molecule encoding the at least one engineered Ig variable domain.

[00101] Another aspect of the disclosure includes a method of increasing stability of an antibody or antigen-binding fragment thereof, the method comprising introducing into at least one variable domain of the antibody or antigen-binding fragment thereof a cysteine residue at position 24 and a cysteine residue at position 86; a cysteine residue at position 22 and a cysteine residue at position 88; a cysteine residue at position 4 and a cysteine residue at position 25; a cysteine residue at position 4 and a cysteine residue at position 118; a cysteine residue at position 5 and a cysteine residue at position 120; a cysteine residue at position 6 and a cysteine residue at position 119; and/or a cysteine residue at position 45 and a cysteine residue at position 100, wherein the position numbering refers to IMGT numbering, and wherein the cysteine residue(s) introduced form or are capable of forming a disulfide bond.
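
As a minimal sketch of the substitution step recited above, the following Python snippet introduces a cysteine pair into a variable domain represented as a mapping from IMGT position to residue. The dictionary representation, the function name `introduce_cys_pair`, and the toy poly-alanine input are illustrative assumptions and are not part of the disclosure; only the seven position pairs are taken from the text.

```python
# IMGT position pairs recited in the disclosure for introduced cysteines.
CYS_PAIRS = [(24, 86), (22, 88), (4, 25), (4, 118), (5, 120), (6, 119), (45, 100)]


def introduce_cys_pair(domain, pair):
    """Return a copy of `domain` with cysteine at both positions of `pair`.

    `domain` is a hypothetical representation: a dict mapping an IMGT
    position (int) to a one-letter amino acid code (str).
    """
    if pair not in CYS_PAIRS:
        raise ValueError(f"{pair} is not one of the recited position pairs")
    missing = [pos for pos in pair if pos not in domain]
    if missing:
        raise KeyError(f"domain has no residue at IMGT position(s) {missing}")
    mutated = dict(domain)  # leave the input untouched
    for pos in pair:
        mutated[pos] = "C"
    return mutated


# Toy input (an assumption, not a real sequence): poly-alanine over positions 1-128.
toy_domain = {pos: "A" for pos in range(1, 129)}
variant = introduce_cys_pair(toy_domain, (24, 86))
```

In practice the substitution would be made at the nucleic acid level, for example by site-directed mutagenesis or gene synthesis as described in [00100]; the sketch only illustrates the position bookkeeping.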

[00102] Another aspect of the disclosure includes a method of increasing stability of an antibody or antigen-binding fragment thereof, the method comprising introducing into at least one variable domain of the antibody or antigen-binding fragment thereof a first cysteine and a second cysteine, wherein the first and second cysteines are selected from a cysteine residue introduced at position 24, position 86, position 22, position 88, position 4, position 25, position 118, position 5, position 120, position 6, position 119, position 45, and/or position 100, wherein the position numbering refers to IMGT numbering, wherein the first and second cysteines are not at the same position, and wherein the cysteine residue(s) introduced form or are capable of forming a disulfide bond.

[00103] In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 24 and a cysteine residue at position 86. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 22 and a cysteine residue at position 88. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 4 and a cysteine residue at position 25. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 4 and a cysteine residue at position 118. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 5 and a cysteine residue at position 120. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 6 and a cysteine residue at position 119. In some embodiments, the at least one non-canonical disulfide bond is formed between a cysteine residue at position 45 and a cysteine residue at position 100.

[00104] In some embodiments, the variable domain is a heavy chain variable domain.

[00105] In some embodiments, the variable domain is a VHH domain.

[00106] In some embodiments, the variable domain is a light chain variable domain. In some embodiments, the light chain variable domain is of the lambda or kappa family.

[00107] In some embodiments, the variable domain is a human variable domain. In other embodiments, the variable domain is non-human, for example murine, rabbit or chicken.

[00108] In some embodiments, the engineered Ig variable domain comprises any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602 or an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90% or at least 95% sequence identity to any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the cysteine residue(s) introduced at position 24, position 86, position 22, position 88, position 4, position 25, position 118, position 5, position 120, position 6, position 119, position 45, and/or position 100 are maintained. In some embodiments, the engineered Ig variable domain comprises a conservatively substituted amino acid sequence of any one of the amino acid sequences of SEQ ID NOs: 332-338, 340-346, 348-354, 356-362, 364-370, 372-378, 380-386, 388-394, 396-402, 404-410, 412-418, 420-426, 428-434, 436-442, 444-450, 452-458, 460-466, 468-474, 476-482, 484-490, 492-498, 500-506, 508-514, 516-522, 524-530, 532-538, 540-546, 548-554, 556-562, 564-570, 572-578, 580-586, 588-594, and 596-602, wherein the cysteine residue(s) introduced at position 24, position 86, position 22, position 88, position 4, position 25, position 118, position 5, position 120, position 6, position 119, position 45, and/or position 100 are maintained. In some embodiments, the CDRs of the engineered Ig domain can be any suitable CDRs.

[00109] In some embodiments, the antibody or antigen-binding fragment thereof has increased thermostability and/or increased resistance to protease activity as compared to a control antibody or antigen-binding fragment. In some embodiments, the control is the same antibody or antigen-binding fragment comprising a non-engineered Ig variable domain (i.e., the control does not have the non-canonical disulfide bonds described herein introduced therein but is otherwise the same as the engineered antibody or antigen-binding fragment thereof).
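
The sequence identity embodiments of [00108], in which the introduced cysteine residues are maintained, can be illustrated with a short Python sketch. It assumes that both sequences are already numbered under a shared IMGT scheme, so that position-by-position comparison stands in for an alignment; a real identity calculation would use a defined alignment method. The function names, the dictionary representation, and the toy sequences are hypothetical.

```python
# IMGT positions at which cysteines may be introduced, per the disclosure.
INTRODUCED_CYS_POSITIONS = {24, 86, 22, 88, 4, 25, 118, 5, 120, 6, 119, 45, 100}


def percent_identity(reference, candidate):
    """Percent of reference positions at which the candidate has the same residue.

    Both arguments are dicts mapping IMGT position (int) to residue (str).
    Using the shared numbering as the alignment is a simplifying assumption.
    """
    if not reference:
        raise ValueError("reference sequence is empty")
    matches = sum(1 for pos, aa in reference.items() if candidate.get(pos) == aa)
    return 100.0 * matches / len(reference)


def introduced_cys_maintained(reference, candidate):
    """True if every listed position that is Cys in the reference is Cys in the candidate."""
    return all(
        candidate.get(pos) == "C"
        for pos in INTRODUCED_CYS_POSITIONS
        if reference.get(pos) == "C"
    )


def qualifies(reference, candidate, threshold=95.0):
    """Apply an identity threshold and the maintained-cysteine condition together."""
    return (percent_identity(reference, candidate) >= threshold
            and introduced_cys_maintained(reference, candidate))


# Toy sequences (assumptions): 100 positions, Cys introduced at 24 and 86.
ref_seq = {pos: "A" for pos in range(1, 101)}
ref_seq[24] = ref_seq[86] = "C"
framework_variant = dict(ref_seq, **{})
framework_variant[10] = "G"   # one framework change, cysteines kept
cys_lost_variant = dict(ref_seq)
cys_lost_variant[24] = "S"    # same identity level, but a cysteine is lost
```

With these toy inputs, `qualifies(ref_seq, framework_variant)` is true while `qualifies(ref_seq, cys_lost_variant)` is false, even though both variants differ from the reference at a single position.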

[00110] Another aspect of the disclosure includes a method for identifying one or more location(s) in engineered immunoglobulin (Ig) variable domains tolerant of disulfide bonds, the method comprising generating a library of nucleic acid molecules encoding engineered immunoglobulin (Ig) variable domains lacking a disulfide bond between a cysteine residue at position 23 and a cysteine residue at position 104, wherein at least one cysteine residue has been introduced into the amino acid sequence of each engineered Ig variable domain encoded by the nucleic acid molecules of the library; identifying Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond; and identifying the positions between which the at least one disulfide bond is formed in each engineered Ig variable domain, wherein the position numbering refers to IMGT numbering.

[00111] In some embodiments, the Ig variable domain is a heavy chain variable domain. In some embodiments, the Ig variable domain is a light chain variable domain. In some embodiments, the Ig variable domain is a VHH domain.

[00112] In some embodiments, the engineered Ig variable domain is part of at least one antibody binding fragment, such as Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibody, diabody, and multimers thereof or a single domain antibody such as VH, VL, VHH, or VNAR.

[00113] Another aspect of the disclosure includes a method of identifying engineered immunoglobulin (Ig) variable domain disulfide bonds that increase stability or solubility of the Ig variable domain, the method comprising: generating a library of nucleic acid molecules encoding engineered immunoglobulin (Ig) variable domains lacking a disulfide bond between a cysteine residue at position 23 and a cysteine residue at position 104, wherein engineered pairs of cysteine residues have been introduced into the amino acid sequence of each engineered Ig variable domain encoded by the nucleic acid molecules of the library; identifying Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond; determining the stability or solubility of the identified Ig variable domains, and identifying the positions between which the disulfide bonds are formed in each engineered Ig variable domain that has increased stability or solubility as compared to a control, wherein the position numbering refers to IMGT numbering.

[00114] In some embodiments, the engineered Ig variable domains are part of at least one antibody binding fragment, such as Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibody, diabody, or multimers thereof or a single domain antibody such as VH, VL, VHH, or VNAR.

[00115] In some embodiments, the library is generated using an antibody display technology. In some embodiments, the library is generated using phage display. In some embodiments, the library is generated using ribosome display. In some embodiments, the library is generated using yeast display. In some embodiments, the library is generated using mammalian cell display. In some embodiments, the library is generated using bacterial display.

[00116] In some embodiments, the library is generated using Kunkel mutagenesis. In some embodiments, the library is generated according to the methods provided in the Examples. In some embodiments, the library is generated using trinucleotide mutagenesis, PCR with degenerate oligonucleotides, random error-prone PCR mutagenesis, recombination, or any other methods known in the art for library generation.

[00117] In some embodiments, the engineered library is a VHH library. In some embodiments, the engineered library is an Ig light chain variable domain library. In some embodiments, the engineered library is an Ig heavy chain variable domain library. In some embodiments, the engineered library is an antibody binding fragment library, comprising nucleic acid molecules encoding antibody binding fragments such as Fab, Fab', F(ab')2, scFv, dsFv, ds-scFv, minibodies, diabodies, and multimers thereof. In some embodiments, the engineered library is a single domain antibody library, comprising nucleic acid molecules encoding single domain antibodies such as VHs, VLs, VHHs, or VNARs.

[00118] In some embodiments, the Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond other than at positions 23 and 104 are identified using panning of the library. In some embodiments, the Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond other than at positions 23 and 104 are identified according to the methods provided in the Examples. In some embodiments, the Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond other than at positions 23 and 104 are identified using negative selection to negatively select molecules with free Cys residues and/or molecules that form intermolecular S-S linkages (dimers). In some embodiments, the Ig variable domains encoded by the nucleic acid molecules of the library which have formed at least one disulfide bond other than at positions 23 and 104 are identified by screening all the generated Ig variable domains individually using high-throughput screening.

[00119] In some embodiments, the positions between which the disulfide bonds are formed are identified using next generation sequencing. In some embodiments, the positions between which the disulfide bonds are formed are identified according to the methods provided in the Examples.

[00120] In some embodiments, the stability is determined by measuring thermostability of the engineered Ig variable domain. In some embodiments, the stability is determined by measuring protease resistance. In some embodiments, stability is measured according to the methods provided in the Examples. In some embodiments, solubility is measured according to the methods provided in the Examples.

[00121] In some embodiments, the Ig variable domain is a heavy chain variable domain. In some embodiments, the Ig variable domain is a light chain variable domain. In some embodiments, the Ig variable domain is a VHH domain.

V. Kits

[00122] Another aspect of the disclosure includes a kit comprising an engineered Ig variable domain as described herein, an antibody or antigen-binding fragment thereof as described herein, a polypeptide as described herein, a nucleic acid molecule as described herein, a vector or expression cassette as described herein and/or a composition as described herein. In some embodiments, the kit further comprises packaging material (e.g., one or more vial(s)) and/or instructions for using the components included in the kit.

[00123] The definitions and embodiments described in particular sections are intended to be applicable to other embodiments herein described for which they are suitable as would be understood by a person skilled in the art. For example, in the following passages, different aspects of the disclosure are defined in more detail. Each aspect so defined may be combined with any other aspect or aspects unless clearly indicated to the contrary. In particular, any feature indicated as being preferred or advantageous may be combined with any other feature or features, including those indicated as being preferred or advantageous.

[00124] The above disclosure generally describes the present application. A more complete understanding can be obtained by reference to the following specific examples. These examples are described solely for the purpose of illustration and are not intended to limit the scope of the application. Changes in form and substitution of equivalents are contemplated as circumstances might suggest or render expedient. Although specific terms have been employed herein, such terms are intended in a descriptive sense and not for purposes of limitation.

[00125] The following non-limiting examples are illustrative of the present disclosure:

EXAMPLES

Example 1

[00126] Based on the serendipitous discovery of non-canonical disulfide linkages within Ig V domains in the past and an increasing acknowledgement of their importance in structural diversification of Ig repertoires (15,16), it was hypothesized that formation of such linkages at additional locations within Ig V domain FRs would be compatible with antigen binding. To investigate this possibility, an unbiased library approach was applied to identify Ig V domain FR positions permissive for non-canonical disulfide linkage formation. Surprisingly, it was discovered that diverse intra- and inter-strand disulfide linkages that are absent in the natural human repertoire could be accommodated within Ig V domain FRs, improving the stability of these domains without compromising antigen binding.

[00127] Starting from a destabilized human VH domain lacking the canonical Cys23-Cys104 disulfide linkage, phage-displayed libraries of engineered VHs bearing all possible pairwise combinations of Cys residues in neighboring β-strands of the Ig fold FRs were generated and screened. This approach identified seven novel Cys pairs in VH FRs (Cys24-Cys86, Cys22-Cys88, Cys4-Cys25, Cys4-Cys118, Cys5-Cys120, Cys6-Cys119 and Cys45-Cys100 by IMGT numbering) whose presence putatively resulted in formation of non-canonical disulfide linkages (four intra- and three inter-β-sheet), rescuing domain folding and stability. Introduction of a subset of these putative disulfide linkages (Cys22-Cys88, Cys24-Cys86, Cys4-Cys25 and Cys6-Cys119) into a diverse panel of VH, VL and VHH domains enhanced their thermostability and protease resistance without significantly impacting expression, solubility, or binding to cognate antigens. None of the non-canonical disulfide linkages were present in the natural human VH repertoire. These data reveal an unexpected permissiveness of Ig V domains to non-canonical disulfide linkages at diverse locations in FRs, absent in the human repertoire, whose presence is compatible with antigen recognition and improves domain biophysical properties. This represents the most complete assessment to date of the role of engineered non-canonical disulfide bonding within FRs in Ig V domain structure and function.

Results

Selection of VH domain for Cys pair scanning.

[00128] To identify novel non-canonical disulfide linkages within Ig V domains, a model system was needed in which domain folding was perturbed by abrogation of the canonical Cys23-Cys104 disulfide linkage (Cys23-Cys104 null) and which could be rescued by the stabilizing effects of non-canonical disulfide linkage formation elsewhere in the domain (Fig. 1A). Importantly, restoration of domain stability must be associated with a phenotype selectable by panning of phage-displayed libraries of Ig V domains, such as reactivity with bacterial Ig-binding proteins, e.g., protein L, protein A (17, 18). VH domains were deemed suitable for this purpose as the protein A binding site is a conformational epitope on VH domains formed from at least four different β-strands in FR1 and FR3 (18); thus, binding of unfolded or partially folded domains (i.e., Cys23-Cys104 null VHs with non-stabilizing non-canonical Cys pairs) cannot occur. As a result, such domains would be de-selected following panning of Cys pair scan VH domain libraries on protein A, while folded domains (i.e., Cys23-Cys104 null VHs containing stabilizing non-canonical Cys pairs) would be selected.

[00129] Fd phage displaying four different VHs (VHB82, VH413, VHM81 and VH428; 19, 20) and their Cys23-Cys104 null derivatives were generated. Following selection for binding to protein A, phage displaying the Cys23-Cys104 null variants of two of the VHs had equivalent titers to those displaying the wild-type (WT) VHs, indicating no loss of protein A binding in the destabilized VHs. Surprisingly, both the WT and Cys23-Cys104 null variants of VH428 were inefficiently selected by panning on protein A. However, the output titers of VH413-displaying phage were 3-4 logs higher than those of VH413 C23-C104 null-displaying phage following protein A selection (Fig. 1B), reflecting loss of stability and folding in the destabilized VH. Introduction of the well-characterized Cys54-Cys78 and Cys40-Cys55 non-canonical disulfide linkages, but not a negative control Cys pair at a location non-permissive to disulfide linkage formation (Cys54-Cys87), into phage-displayed VH413 C23-C104 null restored the titers of phage eluted from protein A selections to near-WT levels (Fig. 1C).

[00130] As proof-of-concept that VH413 C23-C104 null could be used to identify novel non-canonical disulfide linkages in Ig V domains, three test Cys pair scan libraries of phage-displayed VH413 C23-C104 null domains bearing all possible combinations of Cys pairs spanning β-strands B-F, C-C’ and C’-D were first constructed. Panning of the libraries rapidly enriched for VH413 C23-C104 null domains bearing the previously identified stabilizing disulfide linkages Cys23-Cys104 (B-F), Cys40-Cys55 (C-C’) and Cys54-Cys78 (C’-D) (Fig. 1D). Efficient selection of stabilized VHs from these libraries confirmed that introduction of both canonical and non-canonical disulfide linkages could rescue folding of the VH413 C23-C104 null domain and validated this strategy for discovery of novel non-canonical disulfide linkages.

Design and screening of VH Cys pair scan libraries.

[00131] To identify novel Ig V domain disulfide linkages without a priori assumptions of their locations, an additional 14 phage-displayed VH413 C23-C104 null libraries (distinct from the three test libraries described above) were constructed, each comprising mutant VHs bearing all possible combinations of Cys pairs in adjoining or opposing β-strands. Counting the three test libraries, in seven of the 17 libraries the VHs bore Cys pairs on adjoining β-strands that were putatively capable of forming intra-β-sheet disulfide linkages, and in 10 libraries the VHs bore Cys pairs on opposing β-strands that were putatively capable of forming inter-β-sheet disulfide linkages (Table 1). Sequencing of the libraries revealed that 70% of clones displayed VHs mutagenized as intended, while 14% displayed VHs bearing a single unpaired Cys and 16% displayed unmodified VH413 C23-C104 null. Comparison of library size versus theoretical diversity values revealed that for all 14 libraries, each Cys pair clone was present multiple times in its respective library. Cys pairs in the libraries were generally evenly distributed across all targeted positions except for library C”-D, in which Cys residues at the N-terminus of strand C” and the C-terminus of strand D predominated.

[00132] Table 1. Diversity of 17 Cys pair scan phage-displayed VH libraries

1 Total number of Cys pair combinations calculated based on the number of residues present on the β-strands.

2 Calculated as cfu × % of clones correctly mutagenized.
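
The two quantities defined in the Table 1 footnotes can be sketched in a few lines of Python. The sketch assumes that the theoretical diversity of a library is the product of the numbers of residues on the two scanned β-strands (one Cys on each strand), which is one reading of "all possible combinations of Cys pairs"; the function names and the example numbers are hypothetical and are not values from Table 1.

```python
def theoretical_diversity(residues_strand_a, residues_strand_b):
    """Number of Cys pair combinations with one Cys on each strand (assumed model)."""
    return residues_strand_a * residues_strand_b


def functional_library_size(cfu, fraction_correct):
    """Library size as cfu multiplied by the fraction of clones correctly mutagenized."""
    if not 0.0 <= fraction_correct <= 1.0:
        raise ValueError("fraction_correct must be between 0 and 1")
    return cfu * fraction_correct


def fold_coverage(cfu, fraction_correct, residues_strand_a, residues_strand_b):
    """How many times, on average, each Cys pair clone is represented in the library."""
    return (functional_library_size(cfu, fraction_correct)
            / theoretical_diversity(residues_strand_a, residues_strand_b))


# Hypothetical example: strands of 8 and 9 residues, 1e6 cfu, 70% correct clones.
example_diversity = theoretical_diversity(8, 9)
example_size = functional_library_size(1e6, 0.70)
example_coverage = fold_coverage(1e6, 0.70, 8, 9)
```

A fold coverage well above 1 corresponds to the observation in [00131] that each Cys pair clone was present multiple times in its respective library.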

[00133] The three test libraries described above (B-F, C-C’ and C’-D) were not included in library selections to prevent dominant selection of VHs bearing Cys23-Cys104, Cys54-Cys78 and Cys40-Cys55 disulfide linkages. The remaining six intra-β-sheet and eight inter-β-sheet libraries were pooled separately and panned on protein A. Panning of the six intra-β-sheet libraries was conducted both at room temperature and with a 30-min phage incubation at 50°C at the beginning of each round, while panning of the eight inter-β-sheet libraries was conducted only with the additional heating step. Sequencing of clones from the second, third and fourth rounds of both pannings revealed enrichment of VHs bearing five putative intra-β-sheet disulfide linkages (Cys4-Cys25, Cys19-Cys91, Cys22-Cys88, Cys24-Cys86 and Cys45-Cys100) and three putative inter-β-sheet disulfide linkages (Cys4-Cys118, Cys5-Cys120 and Cys6-Cys119). In early panning rounds, VHs bearing Cys pairs at other positions were identified at low frequencies, but in most cases disulfide linkage formation was structurally implausible (Tables 2 and 3).
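
Enrichment across panning rounds, as described above, amounts to tallying how often each Cys pair appears among the clones sequenced in each round. The following Python sketch shows one way to compute per-round frequencies; the input format, the function name, and the example clones are hypothetical and do not reproduce the data of Tables 2 and 3.

```python
from collections import Counter, defaultdict


def pair_frequencies(clones):
    """Return {round: {cys_pair: fraction of sequenced clones}}.

    `clones` is an iterable of (round_number, (pos1, pos2)) tuples, one per
    sequenced clone, with positions in IMGT numbering (assumed input format).
    """
    counts = defaultdict(Counter)
    for round_number, pair in clones:
        # Sort so that (25, 4) and (4, 25) count as the same pair.
        counts[round_number][tuple(sorted(pair))] += 1
    return {
        round_number: {pair: n / sum(counter.values()) for pair, n in counter.items()}
        for round_number, counter in counts.items()
    }


# Hypothetical sequencing output, for illustration only.
example_clones = [
    (2, (4, 25)), (2, (25, 4)), (2, (19, 91)),
    (3, (4, 25)),
]
example_frequencies = pair_frequencies(example_clones)
```

A pair whose fraction rises from one round to the next is being enriched by the selection; pairs seen only at low frequency in early rounds would drop out of later tallies.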

[00134] Table 2. Output of panning the 6 intra-β-sheet VH413 C23-C104 null Cys pair scan libraries.

a Disulfide linkage formation not structurally plausible.

Cys pairs shown in bold were deemed structurally plausible and were introduced into sdAbs and the resulting proteins characterized. means these Cys pairs were not observed following sequencing of 40 to 50 clones per panning round.

[00135] Table 3. Output of panning the 8 inter-β-sheet VH413 C23-C104 null Cys pair scan libraries.

a Disulfide linkage formation not structurally plausible.

b While disulfide linkage formation may be possible at these positions, they were not tested further due to resource limitations.

c Previously known Cys39-Cys87 stabilizing disulfide linkage (Saerens et al., 2008, J. Mol. Biol. 377:478-488).

Cys pairs shown in bold were deemed structurally plausible and were introduced into sdAbs and the resulting proteins characterized, except for those noted with footnote b. means these Cys pairs were not observed following sequencing of 40 to 50 clones per panning round. n.d., not determined.

Effects of novel disulfide linkages on Ig V domain biophysical properties.

[00136] As an initial screen, the effects of introducing the eight putative non-canonical disulfide linkages, along with several negative control Cys pairs predicted not to form disulfide linkages, were assessed on the biophysical properties of two VHs (VH413 and VH428) and their Cys23-Cys104 null variants as well as two VHHs (A4.2 and A26.8; 21). VH413 C23-C104 null and VH428 C23-C104 null could not be expressed in significant amounts, but introduction of Cys24-Cys86 and, surprisingly, the negative control Cys pair Cys47-Cys72, resulted in low expression of VHs with melting temperatures (Tms) between 40-50°C (data not shown). In this initial screen, introduction of seven of eight putative non-canonical disulfide linkages into VHs and VHHs bearing the canonical Cys23-Cys104 disulfide linkage resulted in Tm increases ranging from 1.5-28°C by differential scanning fluorometry (Fig. 2A). Exceptions included Cys19-Cys91, which resulted in no expression or decreased Tm in all cases, and Cys5-Cys120, which resulted in decreased Tm for one VH. As expected, negative control Cys pairs generally resulted in loss of VH/VHH expression or decreased Tm.

[00137] Based on these results, seven putative non-canonical disulfide linkages (Cys4-Cys25, Cys22-Cys88, Cys24-Cys86, Cys45-Cys100, Cys4-Cys118, Cys5-Cys120 and Cys6-Cys119) were introduced into a larger set of 12 VHs, 11 VLs and 11 VHHs. Three of the putative disulfide linkages (Cys45-Cys100, Cys4-Cys118, and Cys5-Cys120) were introduced into only the set of 11 VHHs. The VHs and VLs were non-antigen-specific (19, 20, 22) while the VHHs targeted Clostridioides difficile toxins A and B (21, 23), epidermal growth factor receptor (EGFR) (24), and insulin-like growth factor 1 receptor (IGF1R) (25). Introduction of all seven putative non-canonical disulfide linkages was generally well tolerated by most Ig V domains, except for Cys22-Cys88, which was successfully introduced into only 8/11 VLs and 5/12 VHs (Fig. 2B and Table 4). Three of the putative non-canonical disulfide linkages had only minor effects on Tm (Cys45-Cys100, Cys4-Cys118, Cys5-Cys120; average ΔTm 2.7°C, 1.5°C, and 4.2°C, respectively, by circular dichroism), in line with thermostability changes associated with FR mutation alone rather than disulfide linkage formation (26), and were not investigated further. By contrast, introduction of putative non-canonical disulfide linkages formed between Cys24-Cys86, Cys4-Cys25 and Cys6-Cys119 resulted in average increases in Ig V domain Tm of 12-15°C by circular dichroism, while introduction of the putative Cys22-Cys88 non-canonical disulfide linkage conferred an intermediate phenotype, increasing Ig V domain Tm by an average of 6.1°C with high variability. All four putative disulfide linkages generally conferred larger thermostability enhancements on VHs than on VHHs or VLs.
Increased thermostability conferred by these four putative disulfide linkages was achieved at the cost of variable but generally modest decreases in the thermal refolding efficiency (a-value) and monomericity of VHHs, while the impacts on solubility and refolding of VH and VL domains were more variable and less clear. The binding affinities of most antigen-specific VHHs were unaffected or only slightly weakened (ΔKD <5) by introduction of the four putative non-canonical disulfide linkages. Introduction of selected S-S linkages into the domains tested did not appear to abrogate protein A binding (for VHHs and VHs) or protein L binding (for VLs).

[00138] Table 4. Biophysical properties, thermal stability, and equilibrium dissociation constants (KDs) of wild-type sdAbs and sdAbs bearing putative non-canonical disulfide linkages.

n.d., not determined; n.a., not applicable; pA, protein A; pL, protein L

a Calculated from the sdAb primary amino acid sequence including c-Myc and His6 tags

b Calculated for the monomeric sdAb peak area from SEC-MALS data

c Calculated from SEC-MALS data

d Mean ± S.E.M. of duplicate circular dichroism measurements except for samples with limiting amounts

e Measured by surface plasmon resonance

Effects of novel disulfide linkages on Ig V domain protease resistance.

[00139] It was previously found that introduction of a non-canonical disulfide linkage formed between Cys54-Cys78 increased the pepsin resistance of VHHs (27) and VLs (22). To assess whether any of the putative non-canonical disulfide linkages identified in this study had similar effects, the pepsin, trypsin, and chymotrypsin resistance of WT VHHs and engineered variants bearing non-canonical disulfide linkages formed between Cys22-Cys88, Cys24-Cys86, Cys4-Cys25, and Cys6-Cys119 was tested. As observed previously for the Cys54-Cys78 linkage, the putative Cys24-Cys86 and Cys6-Cys119 non-canonical disulfide linkages significantly increased the pepsin resistance of the majority of VHHs tested (Fig. 3 and Table 5). By contrast, the putative Cys22-Cys88 and Cys4-Cys25 linkages only modestly increased resistance to low concentrations of pepsin. None of the four putative non-canonical disulfide linkages had any effect on trypsin and chymotrypsin resistance (data not shown).

[00140] Table 5. Percent VHH protein undigested following 1 h digestion with pepsin at 37°C.

Results represent the means ± SEM of three independent experiments. N.d., not determined.

Presence of novel non-canonical disulfide linkages in human Ig VH domain repertoires.

[00141] To determine whether the putative non-canonical disulfide linkages identified herein were present in the human antibody repertoire, Illumina MiSeq® sequencing was used to interrogate the peripheral blood resting memory B-cell VH repertoires of 17 individuals to an average depth of 10^5 reads (Table 6). Cys residues in Ig VH domains were most common in CDR-H3 (Fig. 4A), and VHs bearing an additional non-canonical Cys pair located in FRs were rare in the repertoire, representing <1% of all rearranged VHs (Fig. 4B). No VH sequences were identified bearing putative non-canonical disulfide linkages formed between Cys22-Cys88, Cys24-Cys86, Cys4-Cys25, Cys6-Cys119, or any of the other positions identified above. Non-canonical Cys pairs in FRs were distributed widely across various VH domain strands in locations unlikely to enable disulfide linkage formation due to distance in three-dimensional space (Fig. 4C-E). These results agree with those of a recent repertoire sequencing study in which putative non-canonical disulfide linkages formed between Cys residues in FRs were also not detected (28). Instead, most non-canonical Cys residues in the human VH repertoire lie in FR2 and FR3, where they may be able to form disulfide linkages with Cys residues in CDRs.

[00142] Table 6. Metrics for Illumina MiSeq® data analyzed in this study.

Confirmation of the presence of non-canonical disulfide linkages by mass spectrometry.

[00143] Disulfide linkage formation in Cys-engineered Ig V domains was evaluated by measuring the abundance of free sulfhydryl groups with liquid chromatography-mass spectrometry (LC-MS). WT and Cys-engineered VHHs bearing putative non-canonical disulfide linkages formed between Cys22-Cys88, Cys24-Cys86, Cys4-Cys25, and Cys6-Cys119 were denatured in 6 M guanidine-HCl, reduced with 5 mM tris(2-carboxyethyl)phosphine (TCEP), and then labeled with a 150-fold molar excess of maleimide-PEG2-biotin. Non-reduced (guanidine-HCl + m-PEG2-biotin) and unlabeled (guanidine-HCl only) proteins served as controls. Intact LC-MS analysis confirmed the presence of all four non-canonical disulfide linkages in Ig V domains, as shown by incorporation of only two maleimide-PEG2-biotin labels in a WT VHH bearing only the canonical Cys23-Cys104 disulfide linkage and up to four labels in all of the Cys-engineered VHHs following TCEP reduction (Fig. 5 and Table 7). As observed previously for the Cys54-Cys78 and Cys40-Cys55 non-canonical disulfide linkages, there was some evidence that the four novel non-canonical disulfide linkages were formed heterogeneously following introduction into VHHs bearing the canonical Cys23-Cys104 linkage. Unlike formation of the canonical Cys23-Cys104 linkage in a WT VHH or formation of the Cys4-Cys25 and Cys6-Cys119 non-canonical disulfide linkages, minor proportions of VHHs bearing the Cys22-Cys88 and Cys24-Cys86 linkages (1.0-12.7%) had two free sulfhydryl groups (indicating the presence of unpaired Cys residues) without reduction. Moreover, reduction of the canonical Cys23-Cys104 disulfide linkage in a WT VHH as well as reduction of the non-canonical Cys4-Cys25 disulfide linkage in Cys-engineered VHHs was ~90% efficient, while the Cys22-Cys88, Cys24-Cys86, and Cys6-Cys119 non-canonical disulfide linkages were not fully reduced in a subpopulation of VHHs under the conditions tested.

[00144] Table 7. Intact LC-MS analysis of free sulfhydryl abundance in wild-type and Cys-engineered VHHs

Abbreviations: GdnCl, guanidine-HCl; m-PEG2-biotin, maleimide-PEG2-biotin; TCEP, tris(2-carboxyethyl)phosphine; n.a., not applicable; WT, wild-type.

Discussion

[00145] The roles of intradomain disulfide linkages in Ig V domains are poorly understood. Even the canonical Cys23-Cys104 linkages present in all V H and V L domains are often dispensable for antigen binding and domain stability (29). The roles of non-canonical disulfide linkages, which typically involve one or more Cys residue(s) in CDRs, are even less clear. Such linkages are often germline encoded, with at least one Cys located in V, D, or J gene segments, and are present at relatively high frequencies in the Ig V domain repertoires of several non-human vertebrates, where they may play roles in expanding paratope diversity (15,16), structuring ultra-long CDR-H3 loops (30), reducing entropic penalties for antigen binding (31), and stabilizing the Ig V domain fold under extreme conditions (32). Of the three previously described intradomain non-canonical Ig V disulfide linkages in FRs (Cys54-Cys78, Cys40-Cys55, and Cys39-Cys87), Cys54-Cys78 was the result of somatic hypermutation in a camelid V H H (7), Cys40-Cys55 is encoded in the germline of several rabbit IGHV genes (8), and Cys39-Cys87 does not occur naturally (7). Thus, non-canonical disulfide linkages in FRs appear to occur naturally more rarely than inter-CDR loop disulfide linkages even though their impacts on Ig V domain biophysical properties can be dramatic.

[00146] Here, the sequence tolerance of Ig V domains to introduction of non-canonical disulfide linkages in FRs was explored. A Cys pair scanning phage library approach was used to identify such linkages (Fig. 1), reasoning that non-canonical disulfide linkages within FRs would be generalizable to a broader spectrum of Ig V domains than those formed between Cys residues in CDRs.

[00147] Screening of the Cys pair scan VH libraries identified seven novel Cys pairs putatively forming non-canonical disulfide linkages in FRs (intra-β-sheet: Cys4-Cys25, Cys22-Cys88, Cys24-Cys86 and Cys45-Cys100; inter-β-sheet: Cys4-Cys118, Cys5-Cys120 and Cys6-Cys119). Four of these seven putative non-canonical linkages (Cys22-Cys88, Cys24-Cys86, Cys4-Cys25, and Cys6-Cys119) showed significant enhancement of Tm and/or protease resistance when incorporated into panels of sdAbs bearing the canonical Cys23-Cys104 disulfide linkage (Figs. 2, 3). Both the Cys4-Cys119 and Cys6-Cys119 non-canonical linkages were previously shown to improve the thermostability of Ig C domains (5), and although Cys4-Cys119 was not identified in our screens and its impact on Ig V domain biophysical properties was not assessed, it is presumed that this linkage can form in Ig V domains as well. Intriguingly, other Cys pairs were identified whose introduction was tolerated nearby on strands A-G (Cys4-Cys118 and Cys5-Cys120); although these had only marginal effects on Tm, this may reflect a hotspot for disulfide linkage formation, and other combinations of Cys residues in this region may be capable of forming non-canonical disulfide linkages. Introduction of the seven Cys pairs identified in our screens into sdAbs was generally well tolerated, with minimal detrimental impact on antigen binding affinity with the potential exception of Cys4-Cys118, consistent with the location of the Trp118→Cys substitution at the base of the CDR3 loop. Introduction of all seven Cys pairs also resulted in modest decreases in the monomericity and thermal refolding efficiency of VHHs and more variable effects on the monomericity and thermal refolding of VHs and VLs. It is likely that these parameters are much more variable for sometimes-labile VH and VL domains, and that the non-canonical disulfide linkages identified in this study have similar effects on all Ig V domains.

[00148] Interrogation of the expressed human VH repertoires of 17 individuals using next-generation DNA sequencing did not detect any of the seven putative non-canonical disulfide linkages identified from our screens at any appreciable frequency (Fig. 4). Among the rare VHs bearing two non-canonical Cys residues in FRs (<1% of the repertoire), Cys pairs were widely distributed across β-strands and three-dimensional space, suggesting that their role did not involve disulfide linkage formation with one another. These FR Cys residues may instead form disulfide linkages with Cys in CDRs or may simply reflect somatic hypermutation of FRs. Interestingly, Cys residues were identified at relatively high frequencies at the N-terminus of strand G (positions 118/119), supporting the hypothesis that this region may be a hotspot for Cys residues and non-canonical disulfide linkage formation within Ig V domain FRs. A caveat to our repertoire sequencing studies was that non-canonical Cys residues could not be detected in the N-terminus of strand A or the C-terminus of strand G because these regions were primer-forced. Nevertheless, it is concluded that, as with the previously described Cys39-Cys87 linkage, the novel Ig V domain non-canonical disulfide linkages described here are not present in the human repertoire despite their ability to enhance domain biophysical properties.

[00149] In summary, it was found that Ig V domain FRs were unexpectedly permissive to non-canonical disulfide linkages at multiple positions. These linkages are absent from the human repertoire, yet their presence was compatible with domain folding and antigen recognition while improving thermostability and, in some cases, pepsin resistance. The results of this study more than double the previously known spectrum of non-canonical disulfide linkages discovered in Ig V domains, the majority of which were identified in naturally occurring antibodies. Because individual Ig V domains may differ in their tolerance of specific non-canonical disulfide linkages, this expanding panel of linkages provides a useful toolbox for stability engineering of antibodies. Clearly, the strong conservation of the canonical Cys23-Cys104 linkage does not preclude formation of other non-canonical linkages in Ig V domain FRs. This work also demonstrates the power of molecular engineering and library approaches to surpass the limitations of natural antibody repertoires.

Experimental Procedures

Cloning of V H genes into fd-tetGIIID.

[00150] VH genes were cloned into the fd-tetGIIID vector (33) using standard molecular biology techniques. Genes encoding Cys23-Cys104 null mutants of VHs were synthesized and cloned into pSJF2H by Genscript (Piscataway, NJ). Briefly, genes encoding VHB82, VH413 and VHM81 and their Cys23-Cys104 null derivatives were PCR-amplified from the pSJF2H vector using primers that introduced 5’ ApaLI and 3’ NotI sites (Table 8). The amplicons and fd-tetGIIID were double-digested with ApaLI and NotI (New England Biolabs, Ipswich, MA) and purified from 1.2% agarose gels using a QIAquick® gel extraction kit (QIAGEN, Hilden, Germany). Each insert (500 ng) was ligated with 100 ng of dephosphorylated fd-tetGIIID in 20-µL reactions containing 1 U T4 DNA ligase (Life Technologies, Carlsbad, CA) at 4°C overnight. Electrocompetent Escherichia coli TG1 cells were transformed with the ligation mixtures by electroporation and plated on 2xYT agar plates containing 12.5 µg/mL tetracycline (Sigma, St. Louis, MO). Following overnight incubation at 37°C, single colonies harboring fd-tetGIIID vectors with VH inserts were identified by colony PCR and DNA sequencing.

[00151] Table 8. Primers for subcloning of V H s from pSJF2H into fd-tetGIIID.

Restriction enzyme recognition sequences are shown in bold underlined text.

Preparation of dUTP-containing fd-tetGIIID-VH413 C23-C104 null ssDNA (dU-ssDNA)

[00152] dU-ssDNA was prepared as previously described with modifications (34,35). A single E. coli CJ236 (dut− ung−) colony was picked from a 2xYT agar plate containing 15 µg/mL chloramphenicol (Sigma) and used to inoculate a 5 mL 2xYT culture. The culture was grown at 37°C with 240 rpm shaking until an OD600 of 0.5 was reached, then cells (0.1 mL) were infected with fd-tetGIIID-VH413 C23-C104 null phage (10^12 cfu/mL, 10 µL) for 30 min at 37°C without shaking and plated on 2xYT agar plates containing 12.5 µg/mL tetracycline. Following overnight incubation at 37°C, a 10 mL 2xYT culture containing tetracycline (12.5 µg/mL) and chloramphenicol (15 µg/mL) was inoculated with a single colony and grown at 37°C with 240 rpm shaking until an OD600 of 1.0 was reached. Two 0.5 L 2xYT cultures containing the same antibiotics and 0.25 µg/mL uridine (Sigma) were inoculated with the starter culture and grown at 37°C with 240 rpm shaking. The next day, the cells were pelleted by centrifugation at 3,000xg/4°C for 15 min and phage were purified by polyethylene glycol precipitation (36). Successful incorporation of uridine into phage ssDNA was verified by comparative titering of the phage on E. coli TG1 and CJ236 cells. Finally, dU-ssDNA was extracted using the QIAprep Spin M13 Kit (QIAGEN) and resuspended in water. DNA was quantitated via absorbance at 260 nm using a ND-2000 spectrophotometer (Thermo-Fisher, Waltham, MA).

Generation of fd-tetGIIID-VH413 C23 C104 null Cys mutants

[00153] fd-tetGIIID-VH413 C23-C104 null was engineered to bear non-canonical disulfide linkages formed between Cys54-Cys78 and Cys40-Cys55 using Kunkel mutagenesis as previously described with modifications (34,35). In addition, a negative control Cys pair was introduced at locations tolerant to Cys residue introduction but known not to permit disulfide linkage formation (Cys54-Cys87). Briefly, 0.1 µg (~10 pmol) of each mutagenic oligonucleotide (Table 9) was phosphorylated for 2 h at 37°C in a 10-µL reaction containing 50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 5 mM dithiothreitol (DTT), 0.5 mM ATP and 2.5 U T4 polynucleotide kinase (New England Biolabs). Subsequently, 20 µL of each phosphorylated oligonucleotide pair (10 µL of each oligonucleotide) were annealed with 330 ng of fd-tetGIIID-VH413 C23-C104 null dU-ssDNA in a 50-µL reaction whose temperature was cycled as follows: 90°C, 3 min; 50°C, 3 min; and 20°C, 5 min. The annealed oligonucleotides were extended by adding 0.5 µL of 10 mM ATP, 1 µL of 10 mM dNTPs, 1 µL of 100 mM DTT, 2.5 U of T7 DNA polymerase (New England Biolabs) and 1.25 U of T4 DNA ligase (New England Biolabs) and letting covalently closed circular (CCC) heteroduplex DNA form at room temperature overnight. Formation of CCC heteroduplex DNA was verified by agarose gel electrophoresis. The reaction mixtures were purified using the QIAquick® PCR purification kit (QIAGEN) and DNA was quantitated via absorbance at 260 nm using a ND-2000 spectrophotometer. Electrocompetent E. coli TG1 cells were transformed with the CCC heteroduplex DNA by electroporation and plated on 2xYT agar plates containing 12.5 µg/mL tetracycline. Following overnight incubation at 37°C, single colonies harboring fd-tetGIIID vectors encoding VHs with the desired mutations were identified by DNA sequencing.

[00154] Table 9. Primers for Kunkel mutagenesis of fd-tetGIIID-VH413 C23-C104 null.

Generation of Cys pair scan V H libraries.

[00155] The 17 Cys pair scan phage-displayed libraries of fd-tetGIIID-VH413 C23-C104 null were generated by Kunkel mutagenesis using oligonucleotide pairs targeting specific β-strands essentially as described above. The sequences of all mutagenic oligonucleotides are provided in the sequence listing, with additional information provided in Table 10. For each library, the number of mutagenesis reactions was a × b, where a and b are the respective numbers of residues in the two β-strands being targeted. Mutagenic oligonucleotides (0.1 µg) were phosphorylated as described above, then phosphorylated oligonucleotides (either pairs of oligonucleotides each mutating one β-strand, or a single oligonucleotide mutating two β-strands; 10 µL each) were annealed with 330 ng of fd-tetGIIID-VH413 C23-C104 null dU-ssDNA in 50-µL reactions prepared in 96-well plates. T7 DNA polymerase and T4 DNA ligase were added to the reactions and the annealed oligonucleotides were extended overnight at room temperature to form CCC heteroduplex DNA. The individual CCC heteroduplex DNAs were pooled for each library and 25% of the total volume was purified using the QIAquick® PCR purification kit and quantitated via absorbance at 260 nm using a ND-2000 spectrophotometer. Electrocompetent E. coli TG1 cells were transformed with the CCC heteroduplex DNA by electroporation (five transformations per library) and plated on 2xYT agar plates containing 12.5 µg/mL tetracycline. Following overnight incubation at 37°C, 40-50 single colonies per library harboring fd-tetGIIID vectors encoding VHs were sequenced to determine the proportion correctly mutagenized and the functional library size.
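The library arithmetic described above can be sketched as follows. This is a minimal illustration only: the function names and the example strand lengths and cfu count are hypothetical, and the two formulas are taken from the text (a × b reactions per library; functional size as cfu multiplied by the fraction of clones correctly mutagenized).

```python
def n_cys_pair_combinations(a: int, b: int) -> int:
    """Theoretical diversity of one library: one Cys pair (and one
    mutagenesis reaction) per combination of a residue on the first
    strand with a residue on the second strand."""
    return a * b


def functional_library_size(cfu: float, fraction_correct: float) -> float:
    """Functional size = transformants (cfu) x fraction of sequenced
    clones that were mutagenized as intended."""
    return cfu * fraction_correct


def fold_coverage(cfu: float, fraction_correct: float, a: int, b: int) -> float:
    """How many times, on average, each Cys pair clone is represented."""
    return functional_library_size(cfu, fraction_correct) / n_cys_pair_combinations(a, b)


if __name__ == "__main__":
    # Hypothetical example values, not taken from Table 1.
    a, b = 7, 9            # residues on the two targeted strands
    cfu = 1.0e6            # transformants obtained for the library
    fraction_correct = 0.70
    print(n_cys_pair_combinations(a, b))
    print(functional_library_size(cfu, fraction_correct))
    print(fold_coverage(cfu, fraction_correct, a, b))
```

A fold coverage well above 1 corresponds to the statement that each Cys pair clone was present multiple times in its library.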

[00156] Table 10. Sequences of mutagenic oligonucleotides used to construct 17 Cys pair scan phage-displayed fd-tetGIIID-VH413 C23-C104 null libraries.

[00157] Table 11. Sequences of VHs, VHHs, VLs and Cys-engineered variants.

Panning of phage-displayed V H s and V H libraries against protein A

[00158] To prepare fd phage displaying VHs and their Cys-engineered variants (Table 11) as well as Cys pair scan libraries of VH413 C23-C104 null, E. coli TG1 cells transformed with the heteroduplex CCC DNAs above were transferred to 100 mL 2xYT cultures containing tetracycline (12.5 µg/mL) and grown overnight at 37°C with 240 rpm shaking. The next day, phage were purified by polyethylene glycol precipitation and resuspended in 2 mL of phosphate-buffered saline (PBS), pH 7.4. Phage concentrations were estimated spectrophotometrically using the formula: virions/mL = (A269 − A320) × 6 × 10^16 / (no. of bases in genome). Phage were aliquoted and stored at -20°C. For panning, wells of NUNC MaxiSorp™ microtiter plates (Thermo-Fisher) were coated overnight at 4°C with 5 µg of protein A in 100 µL of PBS. The next day, wells were blocked with 300 µL of PBS containing 2% (w/v) skim milk at 37°C for 2.5 h. Single VH-displaying phage clones or Cys pair scan libraries of phage-displayed VHs (10^12 cfu/mL, 100 µL in PBS containing 2% skim milk) were added to wells and incubated for 1.5 h. Pooled intra-β-sheet Cys pair scan libraries were panned in duplicate using either (i) phage prepared as above or (ii) phage heated at 50°C for 30 min, allowed to cool for 20 min at room temperature, then centrifuged at 16,000xg for 5 min prior to each round of selection. Pooled inter-β-sheet Cys pair scan libraries were panned in duplicate using phage subjected to heating and centrifugation as described. Wells were washed 15 times with PBS containing 0.05% (v/v) TWEEN®-20 (Sigma) and five times with PBS. Bound phage were eluted with 150 µL of 100 mM trimethylamine (Sigma) for 10 min with occasional mixing and neutralized with an equivalent volume of 1 M Tris-HCl, pH 7.5. Phage titers were determined by infection of mid-log phase E. coli TG1 cells (OD600 = 0.5) and plating on 2xYT agar plates containing 12.5 µg/mL tetracycline.
Phage were amplified by overnight growth of the infected cells in 100 mL 2xYT cultures containing tetracycline (12.5 µg/mL) at 37°C with 240 rpm shaking and purified from culture supernatants the next day for subsequent panning rounds.
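A spectrophotometric phage estimate of the kind mentioned above might be computed as in the sketch below. The specific form used here, (A269 − A320) × 6 × 10^16 divided by the number of bases in the genome, is the commonly used estimate for filamentous phage and is an assumption for this example; the function name, the dilution argument and the example readings are hypothetical.

```python
def virions_per_ml(a269: float, a320: float, genome_bases: int,
                   dilution_factor: float = 1.0) -> float:
    """Estimate filamentous phage concentration from absorbance.

    Assumption: the widely used relationship
        virions/mL = (A269 - A320) * 6e16 / (number of bases in genome),
    where A320 corrects for light scattering. dilution_factor scales the
    result back to the undiluted stock.
    """
    if genome_bases <= 0:
        raise ValueError("genome_bases must be positive")
    return (a269 - a320) * 6e16 / genome_bases * dilution_factor


if __name__ == "__main__":
    # Hypothetical readings for a 1:50 dilution and a hypothetical
    # genome length; substitute the measured values and vector size.
    print(f"{virions_per_ml(0.35, 0.05, 9000, dilution_factor=50):.2e}")
```

Because absorbance counts physical particles, the result is usually higher than the infectious titer (cfu/mL) determined by plating.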

Expression of V H H, V H and V L domains

[00159] The coding sequences of sdAbs were synthesized and directionally inserted between the EcoRI and BamHI sites of the pSJF2H expression vector by Genscript. Following transformation of E. coli TG1 cells, single colonies were grown in 10 mL 2xYT media containing 1% glucose and 100 µg/mL ampicillin for 4 h at 37°C with 250 rpm shaking. The pre-cultures were used to inoculate overnight cultures (37°C, 250 mL 2xYT medium, 250 rpm shaking) or 5-day cultures (25°C, 1 L fortified M9 minimal medium, 180 rpm shaking). Expression was induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) when the optical densities (600 nm) of 250 mL overnight cultures reached 0.4-0.5 and with 100 mL of 10x induction medium/0.1 mM IPTG after 30 h for 1 L M9 cultures (37). Periplasmic proteins were extracted by sucrose shock and His6-tagged sdAbs were purified by immobilized metal affinity chromatography on HisTrap™ HP affinity columns (Cytiva Life Sciences, Marlborough, MA) connected to an ÄKTA FPLC system (Cytiva Life Sciences). Protein yield and purity were assessed by spectrophotometry (280 nm) and SDS-PAGE.

Circular dichroism

[00160] Proteins (0.1 mg/mL) were dialyzed into 0.1 M sodium phosphate buffer the night before circular dichroism (CD) experiments. CD spectra and temperature-dependent ellipticity measurements were collected using a Jasco J-815 spectropolarimeter (Jasco, Easton, MD) equipped with a Peltier thermoelectric type temperature control system. Samples (0.2 mL) were placed into 1-mm pathlength cuvettes. Spectral scans over wavelengths from 190-250 nm were performed at 25°C with three accumulations, a scan speed of 200 nm/min, a digital integration time (DIT) of 1 s, and a bandwidth of 1 nm. Ellipticity was measured at wavelengths of 205-210 nm over a temperature range of 25-106°C with intervals of 0.5°C, a ramp rate of 2°C/min, a DIT of 2 s, and a bandwidth of 1 nm. The data were normalized to a percent scale and fitted to a Boltzmann distribution for calculation of T m . To determine refolding efficiencies, the scanned samples were cooled to 25°C and used to obtain second sigmoidal melting curves as described. Refolding efficiencies were calculated as the ratios of the normalized upper plateau ellipticity values for the second melting curves to those of the first melting curves.
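The curve analysis described above (percent normalization, a Boltzmann sigmoid fit for Tm, and a plateau ratio for refolding efficiency) can be sketched as follows. This is an illustration under stated assumptions: the coarse grid search stands in for whatever fitting software was actually used, and all function names and the synthetic data are hypothetical.

```python
import math


def boltzmann(t, tm, slope, bottom=0.0, top=100.0):
    """Boltzmann sigmoid: signal goes from bottom to top around tm."""
    return bottom + (top - bottom) / (1.0 + math.exp((tm - t) / slope))


def normalize_percent(values):
    """Rescale raw ellipticity readings to a 0-100 percent scale."""
    lo, hi = min(values), max(values)
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def fit_tm(temps, percent, slopes=(1.0, 1.5, 2.0, 3.0, 4.0), step=0.1):
    """Least-squares grid search over Tm and slope (assumed method)."""
    best = None
    t_min, t_max = min(temps), max(temps)
    n_steps = int(round((t_max - t_min) / step))
    for i in range(n_steps + 1):
        tm = t_min + i * step
        for s in slopes:
            sse = sum((boltzmann(t, tm, s) - p) ** 2
                      for t, p in zip(temps, percent))
            if best is None or sse < best[0]:
                best = (sse, tm, s)
    return best[1], best[2]


def refolding_efficiency(first_upper_plateau, second_upper_plateau):
    """Ratio of the second-melt upper plateau to the first-melt plateau."""
    return second_upper_plateau / first_upper_plateau


if __name__ == "__main__":
    # Synthetic melt: 25-106 degrees C in 0.5 degree steps, true Tm 70.
    temps = [25.0 + 0.5 * i for i in range(163)]
    raw = [-5.0 + 3.0 * boltzmann(t, 70.0, 2.0, 0.0, 1.0) for t in temps]
    tm, slope = fit_tm(temps, normalize_percent(raw))
    print(round(tm, 1), slope)
    print(refolding_efficiency(100.0, 80.0))
```

In practice a nonlinear least-squares routine would replace the grid search; the grid is used here only to keep the sketch dependency-free.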

Differential scanning fluorometry

[00161] Differential scanning fluorometry was performed using SYPRO® Orange (Life Technologies) essentially as described previously (19). SYPRO® Orange (5 µL) was added to sdAbs (45 µL; 1 mg/mL in PBS) in wells of 96-well thin-wall optical plates. A temperature ramp rate of 1°C/min was applied using an iQ™5 real-time PCR system (Bio-Rad, Hercules, CA) and thermal unfolding was monitored by measuring fluorescence (excitation and emission 490 and 575 nm, respectively) at 0.5°C intervals. Tms were calculated as the temperature at which the maximum rate of change in fluorescent signal [d(RFU)/dT] was achieved.
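The Tm definition above (the temperature of maximum d(RFU)/dT) can be illustrated with a simple finite-difference calculation. This is a minimal sketch: the function name and the synthetic unfolding curve are hypothetical, and real data would usually be smoothed before differentiation.

```python
import math


def tm_from_dsf(temps, rfu):
    """Return the temperature at which d(RFU)/dT is maximal.

    The derivative is approximated by the slope between consecutive
    readings and assigned to the midpoint of each interval.
    """
    if len(temps) != len(rfu) or len(temps) < 2:
        raise ValueError("need at least two paired readings")
    best_rate, best_temp = None, None
    for i in range(1, len(temps)):
        rate = (rfu[i] - rfu[i - 1]) / (temps[i] - temps[i - 1])
        midpoint = (temps[i] + temps[i - 1]) / 2.0
        if best_rate is None or rate > best_rate:
            best_rate, best_temp = rate, midpoint
    return best_temp


if __name__ == "__main__":
    # Synthetic unfolding transition centred at 65 degrees C,
    # sampled at 0.5 degree intervals as in the method above.
    temps = [25.0 + 0.5 * i for i in range(141)]
    rfu = [1000.0 / (1.0 + math.exp((65.0 - t) / 1.5)) for t in temps]
    print(tm_from_dsf(temps, rfu))
```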

Size exclusion chromatography-multiangle light scattering (SEC-MALS)

[00162] UPLC SEC-MALS was performed using an Acquity BEH-125 column (Waters, Milford, MA) connected to an Acquity UPLC™ H-Class Bio system (Waters) with miniDAWN™ MALS detector and Optilab® UT-rEX™ refractometer (Wyatt Technology, Santa Barbara, CA). The sdAbs (10-20 µg) were injected at 30°C in a PBS mobile phase at a flow rate of 0.4 mL/min. Weighted average molecular mass (MMALS) was calculated using a protein concentration determined from A280 measurements and extinction coefficients calculated from amino acid sequences. Data were processed using ASTRA 6.1 software (Wyatt Technology).

Surface plasmon resonance (SPR)

[00163] Prior to SPR experiments, all sdAbs were purified by size exclusion chromatography (SEC) using a Superdex™ 75 GL column (Cytiva Life Sciences) connected to an ÄKTA FPLC system. The mobile phase for SEC consisted of HBS-EP (10 mM HEPES, pH 7.4, containing 150 mM NaCl, 3 mM ethylenediaminetetraacetic acid and 0.005% surfactant P20; Cytiva Life Sciences). EGFR [Z03194-50, Genscript; 263 response units (RUs)], IGF1R (391-GR-050, R&D Systems, Minneapolis, MN; 364 RUs), C. difficile toxin A (Cedarlane, Burlington, Canada; 4778 RUs) and C. difficile toxin B fragment (aa 1751-2366; a generous gift from Ken Ng, University of Calgary; 1246 RUs) were immobilized on CM5 sensor chips by amine coupling in 10 mM sodium acetate, pH 3.5-4.5. An ethanolamine-blocked flow cell served as a reference. Six to eight different concentrations of sdAbs (25 pM to 7.4 pM, depending on KD) were injected over the surfaces at 25°C on a Biacore® 3000 instrument (Cytiva Life Sciences) at a flow rate of 20-40 µL/min with contact times of 120-300 s and dissociation times of 160-300 s. The EGFR and toxin A surfaces were regenerated with running buffer, while the IGF1R and toxin B surfaces were regenerated using 10 mM glycine, pH 1.5. Affinities were calculated by fitting the data to a 1:1 binding model or a steady-state binding model using BIAevaluation software version 4.1.1 (Cytiva Life Sciences).
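For readers unfamiliar with steady-state affinity fitting, the sketch below shows the standard 1:1 steady-state relationship, Req = Rmax × C / (KD + C), and a simple way to estimate KD from equilibrium responses. It is not the BIAevaluation algorithm: the log-spaced grid search, the function names and the synthetic data are assumptions made for illustration.

```python
import math


def steady_state_response(conc, kd, rmax):
    """Equilibrium response for 1:1 binding: Req = Rmax * C / (KD + C)."""
    return rmax * conc / (kd + conc)


def fit_steady_state(concs, responses, log_kd_min=-10.0, log_kd_max=-4.0,
                     step=0.01):
    """Estimate KD (M) and Rmax (RU) by least squares.

    For each candidate KD on a log-spaced grid, the best Rmax has a
    closed form (linear regression through the origin); the candidate
    with the smallest squared error is returned.
    """
    best = None
    n_steps = int(round((log_kd_max - log_kd_min) / step))
    for i in range(n_steps + 1):
        kd = 10.0 ** (log_kd_min + i * step)
        x = [c / (kd + c) for c in concs]
        rmax = sum(a * b for a, b in zip(x, responses)) / sum(a * a for a in x)
        sse = sum((rmax * a - b) ** 2 for a, b in zip(x, responses))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic data: six analyte concentrations (M), true KD = 100 nM.
    concs = [2.5e-8, 5e-8, 1e-7, 2e-7, 4e-7, 8e-7]
    responses = [steady_state_response(c, 1e-7, 100.0) for c in concs]
    kd, rmax = fit_steady_state(concs, responses)
    print(f"KD ~ {kd:.2e} M, Rmax ~ {rmax:.1f} RU")
```

Steady-state fitting is typically chosen when association and dissociation are too fast for reliable kinetic fitting, which is common for sdAb interactions with weaker affinities.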

[00164] For analyses of protein A and protein L binding, protein A (21184, Thermo-Fisher; 1387 RUs) and protein L (PI21189, Thermo-Fisher; 637 RUs) were immobilized on CM5 sensor chips by amine coupling in 10 mM sodium acetate, pH 4.5. SEC-purified sdAbs were prepared in HBS-EP (for Biacore 3000 experiments) or HBS-EP+ (for Biacore T200 experiments; identical to HBS-EP but containing 0.05% surfactant P20). An ethanolamine- or ovalbumin-blocked flow cell served as a reference. Single concentrations of VHHs, VHs (2.5-10 µM) or VLs (500 nM to 3 µM) were injected over the surfaces on Biacore 3000 or T200 instruments at a flow rate of 20 µL/min with a contact time of 120 s and a dissociation time of 300 s. The surfaces were regenerated using 10 mM glycine, pH 1.5. The data were fitted to a steady-state binding model using BIAevaluation Software v3.0 (Biacore T200) or v4.1.1 (Biacore 3000) and reported as yes/no binding.

Protease digestion assays

[00165] Digestions with pepsin, trypsin and chymotrypsin were performed essentially as described previously (27). Briefly, titrated working concentrations of enzymes were prepared in 1 mM HCl (pepsin) or PBS containing 10 mM CaCl2 (trypsin and chymotrypsin). Five micrograms of each sdAb were digested at 37°C for 1 h in 20-µL reactions with a final pH of 2.0 for pepsin digestions. Pepsin digestions were neutralized with 1 M NaOH and trypsin/chymotrypsin digestions were neutralized with a protease inhibitor cocktail (Sigma). The digested sdAbs were separated by SDS-PAGE and band densitometry was conducted using ImageJ v.1.53. Three independent digestions were conducted for each VHH.
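The densitometry results from such digestions can be summarized as in the sketch below. It assumes, for illustration only, that percent undigested protein is the digested-lane band intensity relative to an undigested control lane on the same gel; the function names and intensity values are hypothetical.

```python
import statistics


def percent_undigested(digested_band: float, control_band: float) -> float:
    """Band intensity after digestion as a percentage of the undigested
    control band (assumed normalization)."""
    if control_band <= 0:
        raise ValueError("control band intensity must be positive")
    return 100.0 * digested_band / control_band


def mean_and_sem(values):
    """Mean and standard error of the mean for replicate experiments."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem


if __name__ == "__main__":
    # Hypothetical densitometry readings from three independent digestions:
    # (digested band, undigested control band).
    replicates = [(4200.0, 10000.0), (3900.0, 9800.0), (4500.0, 10100.0)]
    percents = [percent_undigested(d, c) for d, c in replicates]
    mean, sem = mean_and_sem(percents)
    print(f"{mean:.1f} +/- {sem:.1f} % undigested")
```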

Mass spectrometry

[00166] Analyses of free sulfhydryl abundance using LC-MS were performed essentially as previously described (38). Briefly, WT or Cys-engineered sdAbs (60 µg) were buffer exchanged into 100 mM sodium acetate, pH 5.5, containing 6 M guanidine-HCl and 20 µg was set aside as the unlabeled control. Another 20 µg was reduced with 5 mM TCEP for 15 min at 37°C. Both the reduced and unreduced samples were then labeled with EZ-Link™ maleimide-PEG2-biotin (Thermo Fisher) for 1 h at room temperature using a 150:1 (mol label: mol protein) ratio for the unreduced samples and a 150:1 (mol label: mol Cys) ratio for the reduced samples. Samples were buffer exchanged again to remove excess label. LC-MS analysis of intact proteins was performed using a Dionex UltiMate 3000 HPLC instrument (Thermo Fisher) equipped with a POROS™ R2 (10 µm, 2.1 x 30 mm) column (Thermo Scientific) coupled to a LTQ-Orbitrap XL mass spectrometer equipped with an electrospray ionization source (Thermo Scientific). Approximately 5 µg of protein was injected at a flow rate of 3 mL/min with a column temperature of 80°C. The mobile phases were 0.1% formic acid in ddH2O (A) and acetonitrile (B). Proteins were eluted with a linear gradient of 10% to 75% mobile phase B over 3 min and split at 100 µL/min to the LTQ-Orbitrap XL. MS analysis was conducted in positive electrospray ionization mode using appropriate tune files for analysis of small intact proteins. Data were deconvoluted using the MaxEnt1 algorithm available through MassLynx software (Waters).
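The way label counts are read from deconvoluted intact masses can be sketched as follows. The per-label mass used here (about 525.6 Da for maleimide-PEG2-biotin) is an assumption taken from typical reagent specifications and should be checked against the product data sheet; the function names and example masses are hypothetical.

```python
# Assumed mass added per maleimide-PEG2-biotin label (Da); hypothetical
# default, verify against the reagent documentation.
LABEL_MASS_DA = 525.6


def count_labels(observed_mass: float, unlabeled_mass: float,
                 label_mass: float = LABEL_MASS_DA) -> int:
    """Number of labels implied by the mass shift of the intact protein.

    Each label reports one free sulfhydryl. The small mass gain from
    disulfide reduction (about 2 Da per bond) is absorbed by rounding.
    """
    return round((observed_mass - unlabeled_mass) / label_mass)


def expected_labels_after_reduction(n_disulfides: int) -> int:
    """Full reduction frees two Cys residues per disulfide linkage."""
    return 2 * n_disulfides


if __name__ == "__main__":
    unlabeled = 15000.0  # hypothetical deconvoluted mass (Da)
    # WT VHH (one canonical linkage) after reduction and labeling:
    print(count_labels(unlabeled + 2 * LABEL_MASS_DA + 2.0, unlabeled))
    # Cys-engineered VHH (two linkages) after reduction and labeling:
    print(count_labels(unlabeled + 4 * LABEL_MASS_DA + 4.0, unlabeled))
    # Non-reduced sample with intact linkages: no labels expected.
    print(count_labels(unlabeled, unlabeled))
```

A non-reduced sample that shows two labels, as reported for a minor fraction of some engineered VHHs, would indicate an unformed disulfide linkage.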

Illumina MiSeq sequencing

[00167] Sequencing of human expressed VH repertoires was conducted essentially as previously described (39). The study was approved by the Simon Fraser University Research Ethics Board (#2012s0182) and abided by the principles laid out in the Declaration of Helsinki. Peripheral blood mononuclear cells were isolated from the blood of healthy and HIV+ individuals by density gradient centrifugation, and resting memory B cells (CD21+IgG+CD19+CD27+CD10-CD20+) were obtained by fluorescence-activated cell sorting as described (40). Total cellular RNA was extracted and reverse transcribed into cDNA, and then expressed VH genes were amplified in two rounds of universal tag PCR. The resulting amplicons were pooled and purified by gel extraction followed by solid-phase reversible immobilization with AMPure® XP beads (Beckman-Coulter, Brea, CA). The pooled amplicons were quantitated using a Qubit 2.0 fluorometer (Life Technologies) and sequenced using a 500-cycle Reagent Kit V2 and a MiSeq® instrument (Illumina, San Diego, CA) with a 5% PhiX genomic DNA spike. Forward and reverse reads were merged using FLASH (41) with default parameters and quality filtered using the FASTX toolkit (42) with a stringency of Q30 over >95% of each read. The filtered data were processed using IMGT/HighV-QUEST (43) and analyzed using custom R scripts, available from the corresponding author by request. Circos analyses were performed using Circos Table Viewer (44).
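The quality criterion of Q30 over >95% of each read can be illustrated with the following minimal Python sketch. It is not the FASTX toolkit itself. The function names `passes_quality` and `filter_fastq`, the Phred+33 quality encoding, and the strict handling of the 95% boundary are assumptions made for this example.

```python
def passes_quality(quality_string, min_q=30, min_fraction=0.95, offset=33):
    """True if more than `min_fraction` of bases have Phred quality >= `min_q`.

    Assumption: qualities are Phred+33 encoded, so each character's
    ASCII code minus 33 is the Phred score.
    """
    if not quality_string:
        return False
    good = sum(1 for ch in quality_string if ord(ch) - offset >= min_q)
    return good / len(quality_string) > min_fraction


def filter_fastq(lines, **kwargs):
    """Yield (header, sequence, quality) for reads that pass the filter.

    `lines` is an iterable of FASTQ lines in the usual four-line record
    layout: header, sequence, '+' separator, quality string.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, sequence, _separator, quality = record
            if passes_quality(quality, **kwargs):
                yield header, sequence, quality
            record = []


# Two hypothetical merged reads: 'I' encodes Q40 and '#' encodes Q2.
demo = [
    "@read1", "ACGT" * 25, "+", "I" * 100,
    "@read2", "ACGT" * 25, "+", "I" * 90 + "#" * 10,
]
kept = [header for header, _seq, _qual in filter_fastq(demo)]
print(kept)
```

In this sketch the first read is retained because every base is at or above Q30, whereas the second is discarded because only 90% of its bases meet the threshold.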

[00168] While the present application has been described with reference to what are presently considered to be the preferred examples, it is to be understood that the application is not limited to the disclosed examples. To the contrary, the application is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

[00169] All publications, patents and patent applications are herein incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. Specifically, the sequences associated with each accession number provided herein, including for example accession numbers for proteins and/or nucleic acid molecules provided in the Tables or elsewhere, are incorporated by reference in their entirety.

[00170] The scope of the claims should not be limited by the preferred embodiments and examples but should be given the broadest interpretation consistent with the description as a whole.

REFERENCES

1. Bechtel, T.J., and Weerapana, E. (2017) From structure to redox: the diverse functional roles of disulfides and implications in disease. Proteomics. 17, 10.1002/pmic.201600391.

2. Liu, H., and May, K. (2012) Disulfide bond structures of IgG molecules: Structural variations, chemical modifications and possible impacts to stability and biological function. MAbs. 4, 17-23.

3. Dombkowski, A.A., Sultana, K.Z., and Craig, D.B. (2013) Protein disulfide engineering. FEBS Lett. 588, 206-212.

4. Willuda, J., Honegger, A., Waibel, R., Schubiger, P.A., Stahel, R., Zangemeister-Wittke, U., and Plückthun, A. (1999) Thermal stability is essential for tumor targeting of antibody fragments: Engineering of a humanized anti-epithelial glycoprotein-2 (epithelial cell adhesion molecule) single-chain Fv fragment. Cancer Res. 59, 5758-5767.

5. Gong, R., Vu, B.K., Feng, Y., Prieto, D.A., Dyba, M.A., Walsh, J.D., Prabakaran, P., Veenstra, T.D., Tarasov, S.G., Ishima, R., and Dimitrov, D.S. (2009) Engineered human antibody constant domains with increased stability. J. Biol. Chem. 284, 14203-14210.

6. Hagihara, Y., Mine, S., and Uegaki, K. (2007) Stabilization of an immunoglobulin fold domain by an engineered disulfide bond at the buried hydrophobic region. J. Biol. Chem. 282, 36489-36495.

7. Saerens, D., Conrath, K., Govaert, J., and Muyldermans, S. (2008) Disulfide bond introduction for general stabilization of immunoglobulin heavy-chain variable domains. J. Mol. Biol. 377, 478-488.

8. Pan, R., Sampson, J.M., Chen, Y., Vaine, M., Wang, S., Lu, S., and Kong, X.P. (2013) Rabbit anti-HIV-1 monoclonal antibodies raised by immunization can mimic the antigen-binding modes of antibodies derived from HIV-1-infected humans. J. Virol. 87, 10221-10231.

9. Kim, D.Y., Kandalaft, H., Hussack, G., Raphael, S., Ding, W., Kelly, J.F., Henry, K.A., and Tanha, J. (2019) Evaluation of a noncanonical Cys40-Cys55 disulfide linkage for stabilization of single-domain antibodies. Protein Sci. 28, 881-888.

10. de los Rios, M., Criscitiello, M.F., and Smider, V.V. (2015) Structural and genetic diversity in antibody repertoires from diverse species. Curr. Opin. Struct. Biol. 33, 27-41.

11. McConnell, A.D., Spasojevich, V., Macomber, J.L., Kapf, I.P., Chen, A., Sheffer, J.C., Berkebile, A., Horlick, R.A., Neben, S., King, D.J., and Bowers, P.M. (2013) An integrated approach to extreme thermostabilization and affinity maturation of antibody. Protein Eng. Des. Sel. 26, 151-164.

12. McConnell, A.D., Zhang, X., Macomber, J.L., Chau, B., Sheffer, J.C., Rahmanian, S., Hare, E., Spasojevic, V., Horlick, R.A., King, D.J., and Bowers, P.M. (2014) A general approach to antibody thermostabilization. MAbs. 6, 1274-1282.

13. Wozniak-Knopp, G., Stadlmann, J., and Rüker, F. (2012) Stabilisation of the Fc fragment of human IgG1 by engineered intradomain disulfide bonds. PLoS One. 7, e30083.

14. Zabetakis, D., Olson, M.A., Anderson, G.P., Legler, P.M., and Goldman, E.R. (2014) Evaluation of disulfide bond position to enhance the thermal stability of a highly stable single domain antibody. PLoS One. 9, e115405.

15. Wang, F., Ekiert, D.C., Ahmad, I., Yu, W., Zhang, Y., Bazirgan, O., Torkamani, A., Raudsepp, T., Mwangi, W., Criscitiello, M.F., Wilson, I.A., Schultz, P.G., and Smider, V.V. (2013) Reshaping antibody diversity. Cell. 153, 1379-1393.

16. Prabakaran, P., and Chowdhury, P.S. (2020) Landscape of non-canonical cysteines in human VH repertoire revealed by immunogenetic analysis. Cell Rep. 31, 107831.

17. Graille, M., Stura, E.A., Housden, N.G., Beckingham, J.A., Bottomley, S.P., Beale, D., Taussig, M.J., Sutton, B.J., Gore, M.G., and Charbonnier, J.B. (2001) Complex between Peptostreptococcus magnus protein L and a human antibody reveals structural convergence in the interaction modes of Fab binding proteins. Structure. 9, 679-687.

18. Graille, M., Stura, E.A., Corper, A.L., Sutton, B.J., Taussig, M.J., Charbonnier, J.B., and Silverman, G.J. (2000) Crystal structure of a Staphylococcus aureus protein A domain complexed with the Fab fragment of a human IgM antibody: Structural basis for recognition of B-cell receptors and superantigen activity. Proc. Natl. Acad. Sci. USA. 97, 5399-5404.

19. Henry, K.A., Kim, D.Y., Kandalaft, H., Lowden, M.J., Yang, Q., Schrag, J.D., Hussack, G., MacKenzie, C.R., and Tanha, J. (2017) Stability-diversity tradeoffs impose fundamental constraints on selection of synthetic human VH/VL single-domain antibodies from in vitro display libraries. Front. Immunol. 8, 1759.

20. To, R., Hirama, T., Arbabi-Ghahroudi, M., MacKenzie, R., Wang, P., Xu, P., Ni, F., and Tanha, J. (2005) Isolation of monomeric human VHs by a phage selection. J. Biol. Chem. 280, 41395-41403.

21. Hussack, G., Arbabi-Ghahroudi, M., van Faassen, H., Songer, J.G., Ng, K.K., MacKenzie, R., and Tanha, J. (2011) Neutralization of Clostridium difficile toxin A with single-domain antibodies targeting the cell receptor binding domain. J. Biol. Chem. 286, 8961-8976.

22. Kim, D.Y., To, R., Kandalaft, H., Ding, W., van Faassen, H., Luo, Y., Schrag, J.D., St-Amant, N., Hefford, M., Hirama, T., Kelly, J.F., MacKenzie, R., and Tanha, J. (2014) Antibody light chain variable domains and their biophysically improved versions for human immunotherapy. MAbs. 6, 219-235.

23. Hussack, G., Ryan, S., van Faassen, H., Rossotti, M., MacKenzie, C.R., and Tanha, J. (2018) Neutralization of Clostridium difficile toxin B with VHH-Fc fusions targeting the delivery and CROPS domains. PLoS One. 13, e0208978.

24. Bell, A., Wang, Z.J., Arbabi-Ghahroudi, M., Chang, T.A., Durocher, Y., Trojahn, U., Baardsnes, J., Jaramillo, M.J., Li, S., Baral, T.N., O’Connor-McCourt, M., MacKenzie, R., and Zhang, J. (2010) Differential tumor-targeting abilities of three single-domain antibody formats. Cancer Lett. 289, 81-90.

25. Stanimirovic, D., Kemmerich, K., Haqqani, A.S., Sulea, T., Arbabi-Ghahroudi, M., Massie, B., and Gilbert, R., inventors; National Research Council Canada, assignee (2015) Insulin-like growth factor 1 receptor-specific antibodies and uses thereof. United States patent US 10100117 B2.

26. Kunz, P., Flock, T., Soler, N., Zaiss, M., Vincke, C., Sterckx, Y., Kastelic, D., Muyldermans, S., and Hoheisel, J.D. (2017) Exploiting sequence and stability information for directing nanobody stability engineering. Biochim. Biophys. Acta Gen. Subj. 1861, 2196-2205.

27. Hussack, G., Hirama, T., Ding, W., MacKenzie, R., and Tanha, J. (2011) Engineered single-domain antibodies with high protease resistance and thermal stability. PLoS One. 6, e28218.

28. Prabakaran, P., and Chowdhury, P.S. (2020) Landscape of non-canonical cysteines in human VH repertoire revealed by immunogenetic analysis. Cell Rep. 31, 107831.

29. Hagihara, Y., and Saerens, D. (2014) Engineering disulfide bonds within an antibody. Biochim. Biophys. Acta. 1844, 2016-2023.

30. Haakenson, J.K., Deiss, T.C., Warner, G.F., Mwangi, W., Criscitiello, M.F., and Smider, V.V. (2019) A broad role for cysteines in bovine antibody diversity. ImmunoHorizons. 3, 478-487.

31. Govaert, J., Pellis, M., Deschacht, N., Vincke, C., Conrath, K., Muyldermans, S., and Saerens, D. (2012) Dual beneficial effect of interloop disulfide bond for single domain antibody fragments. J. Biol. Chem. 287, 1970-1979.

32. Mendoza, M.N., Jian, M., King, M.T., and Brooks, C.L. (2020) Role of a noncanonical disulfide bond in the stability, affinity, and flexibility of a VHH specific for the Listeria virulence factor InlB. Protein Sci. 29, 990-1003.

33. MacKenzie, C.R., and To, R. (1998) The role of valency in the selection of anticarbohydrate single-chain Fvs from phage display libraries. J. Immunol. Methods. 220, 39-49.

34. Henry, K.A., Kandalaft, H., Lowden, M.J., Rossotti, M.A., van Faassen, H., Hussack, G., Durocher, Y., Kim, D.Y., and Tanha, J. (2017) A disulfide-stabilized human VL single-domain antibody library is a source of soluble and highly thermostable binders. Mol. Immunol. 90, 190-196.

35. Rajan, S., and Sidhu, S.S. (2012) Simplified synthetic antibody libraries. Methods Enzymol. 502, 3-23.

36. Wickner, W. (1975) Asymmetric orientation of a phage coat protein in cytoplasmic membrane of Escherichia coli. Proc. Natl. Acad. Sci. USA. 72, 4749-4753.

37. Baral, T.N., MacKenzie, R., and Arbabi-Ghahroudi, M. (2013) Single-domain antibodies and their utility. Curr. Protoc. Immunol. 103, Unit 2.17.

38. Robotham, A.C., and Kelly, J.F. (2019) Detection and quantification of free sulfhydryls in monoclonal antibodies using maleimide labeling and mass spectrometry. MAbs. 11, 757-766.

39. Henry, K.A. (2018) Next-generation DNA sequencing of VH/VL repertoires: A primer and guide to applications in single-domain antibody discovery. Methods Mol. Biol. 1701, 425-446.

40. Henry, K.A. (2013) HIV-associated modulation of serological and immunogenetic features of humoral immunity. Ph.D. thesis, Simon Fraser University.

41. Magoc, T., and Salzberg, S.L. (2011) FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics. 27, 2957-2963.

42. Schmieder, R., and Edwards, R. (2011) Quality control and preprocessing of metagenomic datasets. Bioinformatics. 27, 863-864.

43. Alamyar, E., Duroux, P., Lefranc, M.P., and Giudicelli, V. (2012) IMGT® tools for the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J repertoires, polymorphisms, and IG mutations: IMGT/V-QUEST and IMGT/HighV-QUEST for NGS. Methods Mol. Biol. 882, 569-604.

44. Krzywinski, M.I., Schein, J.E., Birol, I., Connors, J., Gascoyne, R., Horsman, D., Jones, S.J., and Marra, M.A. (2009) Circos: An information aesthetic for comparative genomics. Genome Res. 19, 1639-1645.

45. Lefranc, M-P., Pommie, C., Ruiz, M., Giudicelli, V., Foulquier, E., Truong, L., Thouvenin-Contet, V., and Lefranc, G. (2003) IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains. Dev. Comp. Immunol. 27, 55-77.