SYSTEMS AND METHODS FOR SINGLE CELL DETECTION OF PROTEIN SECRETION

Title:

SYSTEMS AND METHODS FOR SINGLE CELL DETECTION OF PROTEIN SECRETION

Document Type and Number:

WIPO Patent Application WO/2024/097893

Kind Code:

Abstract:

Disclosed herein are fusion proteins and systems for detection of protein secretion. Also disclosed herein are methods for identifying at least one genomic region in a cell that modulates the secretion of a protein and methods for detecting the secretion level of a protein in a cell.

Inventors:

YATES JOSHUA D (US)
HILL JONATHON T (US)

Application Number:

PCT/US2023/078542

Publication Date:

May 10, 2024

Filing Date:

November 02, 2023

Assignee:

PIONEER BIOLABS LLC (US)

International Classes:

C12N9/48; C12Q1/37

Attorney, Agent or Firm:

COWIE, Ashley M. et al. (US)

Claims:

CLAIMS

What is claimed is:

1. A fusion protein comprising:

(a) a translocated protein domain comprising a signal peptide, wherein the signal peptide mediates translocation of at least a portion of the translocated protein domain across a membrane;

(b) a membrane association domain comprising a linker for linking the fusion protein to the membrane;

(c) a post-translational modification site, wherein the post-translational modification site is capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane; and

(d) an indicator domain, wherein the indicator domain confers a detectable characteristic to a cell comprising the fusion protein upon modification of the post-translational modification site, and wherein the indicator domain remains a constituent of the cell after export of the translocated protein domain.

2. The fusion protein of claim 1, wherein the translocated protein domain comprises a heterologous protein or an endogenous protein.

3. The fusion protein of claim 1, wherein the translocated protein domain comprises a carbohydrase, an alpha-amylase, a protease, or a subtilisin.

4. The fusion protein of claim 1, wherein the indicator domain is encoded by the nucleotide sequence of SEQ ID NO: 41 or SEQ ID NO: 43, or comprises the amino acid sequence of SEQ ID NO: 42 or SEQ ID NO: 44.

5. The fusion protein of claim 1, wherein the translocated protein domain is encoded by the nucleotide sequence of SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, or SEQ ID NO: 11, or comprises the amino acid sequence of SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 12.

6. The fusion protein of claim 1, further comprising a transmembrane domain.

7. The fusion protein of claim 1, further comprising a single-pass transmembrane domain.

8. The fusion protein of claim 6, wherein the transmembrane domain is encoded by the nucleotide sequence of SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, or SEQ ID NO: 21, or comprises the amino acid sequence of SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, or SEQ ID NO: 22.

9. The fusion protein of claim 1, wherein the indicator domain comprises a cytoplasmic protein, a transcription factor, an antibiotic metabolizing protein, a fluorescent protein, a split fluorescent protein, or a Förster resonance energy transfer fluorescent protein.

10. The fusion protein of claim 1, wherein the post-translational modification site comprises a tobacco etch virus protease cleavage site, an intramembrane protease cleavage site, a cleavage site for gamma secretase, a phosphorylation site, or a glycosylation site.

11. The fusion protein of claim 1, further comprising a cleavage site, wherein cleavage of the fusion protein at the cleavage site separates the translocated protein domain from the remainder of the fusion protein.

12. The fusion protein of claim 1, wherein the cell is a Saccharomyces cerevisiae cell, a Schizosaccharomyces pombe cell, a Pichia pastoris cell, a Yarrowia lipolytica cell, a Chinese hamster ovary cell, a murine myeloma cell, or a human embryonic kidney cell.

13. The fusion protein of claim 1, wherein the fusion protein is encoded by the nucleotide sequence of SEQ ID NO: 107, SEQ ID NO: 110, or SEQ ID NO: 113 or comprises the amino acid sequence of SEQ ID NO: 108, SEQ ID NO: 111, or SEQ ID NO: 114.

14. A polynucleotide that encodes the fusion protein of claim 1.

15. A method for identifying at least one genomic region in a cell that modulates the secretion of a protein comprising:

(a) generating at least one cell that expresses the fusion protein of claim 1;

(b) introducing an agent into the cell, wherein the agent modulates at least one genomic region of the cell; and

(c) measuring the detectable characteristic of the cell.

16. The method of claim 15, further comprising (d) isolating the cell.

17. The method of claim 16, further comprising (e) sequencing at least one genomic region of the cell.

18. The method of claim 15, wherein the agent is a chemical mutagen, an oligonucleotide, a CRISPR complex, or a CRISPR complex comprising a guide RNA and a Cas9 molecule.

19. The method of claim 15, wherein a population of cells is generated.

20. The method of claim 19, wherein at least one cell is isolated using fluorescence-activated cell sorting or through dilution and plating on a solid media.

21. The method of claim 19, wherein the agent comprises a CRISPR guide RNA library.

22. A system comprising a cell comprising:

(a) the fusion protein of claim 1; and

(b) a membrane localized protein comprising an enzymatic domain, wherein the enzymatic domain modulates the post-translational modification site of the fusion protein.

23. The system of claim 22, wherein the membrane localized protein is localized to the plasma membrane.

24. The system of claim 23, wherein the membrane localized protein is localized to the plasma membrane through S-palmitoylation.

25. The system of claim 22, wherein the modulation of the post-translational modification site comprises cleavage of the fusion protein.

26. The system of claim 22, wherein the membrane localized protein comprises a tobacco etch virus protease.

27. A method for detecting the secretion level of a protein in a cell comprising:

(a) generating at least one cell comprising a fusion protein comprising:

(i) a translocated protein domain comprising a signal peptide, wherein the signal peptide mediates translocation of at least a portion of the translocated protein domain across a membrane;

(ii) a membrane association domain comprising a linker for linking the fusion protein to the membrane;

(iii) a post-translational modification site, wherein the post-translational modification site is capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane; and

(iv) an indicator domain, wherein the indicator domain confers a detectable characteristic to a cell comprising the fusion protein upon modification of the post-translational modification site, and wherein the indicator domain remains a constituent of the cell after export of the translocated protein domain; and

(b) measuring the detectable characteristic conferred to the cell.

28. The method of claim 27, wherein generation of the cell comprises introducing a polynucleotide encoding the fusion protein into the cell.

29. The method of claim 27, wherein generation of the cell comprises introducing a polynucleotide encoding the indicator domain into the cell.

30. The method of claim 27, wherein measuring the detectable characteristic comprises use of a flow cytometer or a microscope.

31. The method of claim 27, wherein a population of cells is generated.

Description:

SYSTEMS AND METHODS FOR SINGLE CELL DETECTION OF PROTEIN SECRETION

CROSS-REFERENCE TO RELATED APPLICATION(S)

[0001] This application claims priority to U.S. Provisional Patent Application No. 63/421,953, filed on November , 2022, which is incorporated herein by reference in its entirety.

FIELD

[0002] This disclosure relates to fusion proteins for measuring protein secretion of an individual cell. Also disclosed herein are systems and methods including the fusion proteins for measuring protein secretion of an individual cell.

INTRODUCTION

[0003] The production of proteins is becoming an increasingly important process to many industries including biotherapeutics, food, and consumer products. The ability to engineer a cell to produce a heterologous protein and produce the protein in large quantities has found applications in many different industries. However, heterologous proteins have a wide range of properties that can impact the levels of secretion in different host organisms. These differences often result in low yield. The inability to express proteins at commercially viable quantities can restrict their usage in industrial applications. Modification of genomic regions in the host cell that modulate metabolism, transcription, translation, folding, and subcellular trafficking of a protein can increase production for a specific protein. However, these genes can vary depending on the host cell and the protein being expressed and can be difficult to predict. High-throughput methods to screen for genes modulating the production of a particular protein in a specific host cell are therefore useful tools in engineering strains for industrial applications.

[0004] In some high-throughput screening methods, yeast surface display of a heterologous protein can be used as a proxy for secretion of the protein. Aga1p and Aga2p are examples of proteins involved in making cell to cell contacts between yeast cells during mating. Because Aga2p can localize to the plasma membrane through the secretion pathway, fusing a heterologous protein to the C-terminus of Aga2p is one possible technique to localize a heterologous protein to the surface of a yeast cell. The Aga2p fusion protein can form two disulfide bonds with GPI-anchored Aga1p, resulting in the heterologous protein being attached to the surface of the cell. This system can be applied to many applications including the development of antibodies. Additionally, heterologous proteins fused to Aga2p can be used as a proxy for secretion levels of heterologous proteins. Various methods applying this technique can result in display of a heterologous protein on the surface of a yeast cell that can then be labeled by a fluorescent antibody. A library of genes encoded by plasmids can then be overexpressed in cells, a fluorescent antibody can be used to label the cells, and the cells can be sorted using fluorescence activated cell sorting (FACS) to isolate cells overexpressing proteins that modulate the fluorescent intensity signal of a labeled cell. The plasmids in those cells can then be sequenced to determine which overexpressed genes corresponded with the increase of fluorescent signal. There are several limitations with this technique. The display efficiency can be low, the 5' Aga2p fusion can alter translation efficiency, the Aga1p-Aga2p complex takes up space on the surface of the cell, fusion proteins can stabilize proteins when folding, and the technique can require antibody labeling before cells can be selected. Due to these limitations, multiple rounds of selection using a flow cytometer can be required and genes relevant to processes such as protein folding may be missed.

[0005] In other approaches, individual cells from a mutagenized population are captured in microfluidic droplets where each cell expresses a fluorescently labeled heterologous protein, resulting in a fluorescent droplet that can then be sorted using FACS. The genomes of clones can be sequenced to determine mutations that give rise to increased protein secretion. This approach also has limitations. The technique can require a machine capable of forming droplets, the optimal time after induction to measure protein secretion after droplet creation often needs to be determined, the environment of the droplet may differ from that of a bioreactor, the empty droplets required to capture individual cells decrease sorting efficiency, and the fusion protein can stabilize protein folding. Multiple rounds of selection are also often used.

[0006] Thus, there is a need for systems, methods, and compositions for measuring the level of protein secretion of an individual cell where a protein confers a detectable characteristic to a cell that varies depending on the level of heterologous protein secretion.

SUMMARY

[0007] In an aspect, the disclosure relates to a fusion protein comprising: (a) a translocated protein domain comprising a signal peptide, wherein the signal peptide mediates translocation of at least a portion of the translocated protein domain across a membrane; (b) a membrane association domain comprising a linker for linking the fusion protein to the membrane; (c) a post-translational modification site, wherein the post-translational modification site is capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane; and (d) an indicator domain, wherein the indicator domain confers a detectable characteristic to a cell comprising the fusion protein upon modification of the post-translational modification site, and wherein the indicator domain remains a constituent of the cell after export of the translocated protein domain. In an embodiment, the translocated protein domain comprises a heterologous protein or an endogenous protein. In another embodiment, the translocated protein domain comprises a carbohydrase, an alpha-amylase, a protease, or a subtilisin. In another embodiment, the indicator domain is encoded by the nucleotide sequence of SEQ ID NO: 41 or SEQ ID NO: 43, or comprises the amino acid sequence of SEQ ID NO: 42 or SEQ ID NO: 44. In another embodiment, the translocated protein domain is encoded by the nucleotide sequence of SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, or SEQ ID NO: 11, or comprises the amino acid sequence of SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 12. In another embodiment, a fusion protein described herein further comprises a transmembrane domain. In another embodiment, a fusion protein described herein further comprises a single-pass transmembrane domain. In another embodiment, the transmembrane domain is encoded by the nucleotide sequence of SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, or SEQ ID NO: 21, or comprises the amino acid sequence of SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, or SEQ ID NO: 22. In another embodiment, the indicator domain comprises a cytoplasmic protein, a transcription factor, an antibiotic metabolizing protein, a fluorescent protein, a split fluorescent protein, or a Förster resonance energy transfer fluorescent protein. In another embodiment, the post-translational modification site comprises a tobacco etch virus protease cleavage site, an intramembrane protease cleavage site, a cleavage site for gamma secretase, a phosphorylation site, or a glycosylation site. In another embodiment, a fusion protein described herein further comprises a cleavage site, wherein cleavage of the fusion protein at the cleavage site separates the translocated protein domain from the remainder of the fusion protein. In another embodiment, the cell is a Saccharomyces cerevisiae cell, a Schizosaccharomyces pombe cell, a Pichia pastoris cell, a Yarrowia lipolytica cell, a Chinese hamster ovary cell, a murine myeloma cell, or a human embryonic kidney cell. In another embodiment, the fusion protein is encoded by the nucleotide sequence of SEQ ID NO: 107, SEQ ID NO: 110, or SEQ ID NO: 113 or comprises the amino acid sequence of SEQ ID NO: 108, SEQ ID NO: 111, or SEQ ID NO: 114.

[0008] In a further aspect, the disclosure relates to a polynucleotide that encodes a fusion protein described herein.

[0009] Another aspect of the disclosure provides a method for identifying at least one genomic region in a cell that modulates the secretion of a protein comprising: (a) generating at least one cell that expresses a fusion protein described herein; (b) introducing an agent into the cell, wherein the agent modulates at least one genomic region of the cell; and (c) measuring the detectable characteristic of the cell. In an embodiment, the method further comprises (d) isolating the cell. In another embodiment, the method further comprises (e) sequencing at least one genomic region of the cell. In another embodiment, the agent is a chemical mutagen, an oligonucleotide, a CRISPR complex, or a CRISPR complex comprising a guide RNA and a Cas9 molecule. In another embodiment, a population of cells is generated. In another embodiment, at least one cell is isolated using fluorescence-activated cell sorting or through dilution and plating on a solid media. In another embodiment, the agent comprises a CRISPR guide RNA library.

[00010] Another aspect of the disclosure provides a system comprising a cell comprising: (a) a fusion protein described herein; and (b) a membrane localized protein comprising an enzymatic domain, wherein the enzymatic domain modulates the post-translational modification site of the fusion protein. In an embodiment, the membrane localized protein is localized to the plasma membrane. In another embodiment, the membrane localized protein is localized to the plasma membrane through S-palmitoylation. In another embodiment, the modulation of the post-translational modification site comprises cleavage of the fusion protein. In another embodiment, the membrane localized protein comprises a tobacco etch virus protease.

[00011] Another aspect of the disclosure provides a method for detecting the secretion level of a protein in a cell comprising: (a) generating at least one cell comprising a fusion protein comprising: (i) a translocated protein domain comprising a signal peptide, wherein the signal peptide mediates translocation of at least a portion of the translocated protein domain across a membrane; (ii) a membrane association domain comprising a linker for linking the fusion protein to the membrane; (iii) a post-translational modification site, wherein the post-translational modification site is capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane; and (iv) an indicator domain, wherein the indicator domain confers a detectable characteristic to a cell comprising the fusion protein upon modification of the post-translational modification site, and wherein the indicator domain remains a constituent of the cell after export of the translocated protein domain; and (b) measuring the detectable characteristic conferred to the cell. In an embodiment, generation of the cell comprises introducing a polynucleotide encoding the fusion protein into the cell. In another embodiment, generation of the cell comprises introducing a polynucleotide encoding the indicator domain into the cell. In another embodiment, measuring the detectable characteristic comprises use of a flow cytometer or a microscope. In another embodiment, a population of cells is generated.

[00012] The disclosure provides for other aspects and embodiments that will be apparent in light of the following detailed description and accompanying figures.

BRIEF DESCRIPTION OF THE DRAWINGS

[00013] FIG. 1 is a diagram showing an exemplary fusion protein at various subcellular compartments and the plasma membrane in a cell. The exemplary fusion protein has 4 components where: 1 represents a translocated protein domain configured to be exported from the cell, 2 represents a transmembrane domain, 3 represents a tobacco etch virus (TEV) protease cleavage site, 4 represents a split fluorescent protein domain configured to fluoresce upon cleavage of the TEV protease site. FIG. 1 shows the exemplary fusion protein moving from the nucleus/endoplasmic reticulum (ER) to the Golgi apparatus to the plasma membrane of the cell. At the plasma membrane of the cell, 5 represents a TEV protease anchored to the plasma membrane by S-palmitoylation. In the cytosol of the cell, 6 represents the split fluorescent protein domain and the fluorescent signal it confers to the cell after it has been cleaved from the exemplary fusion protein. In the plasma membrane of the cell, 7 represents the remaining portion of the exemplary fusion protein after cleavage.

[00014] FIG. 2 is a diagram showing various schematics of exemplary fusion proteins and several domains of each protein. Each rectangle represents a rigid alpha helix configured to prevent contacts between beta sheet 10 and beta sheet 11. Each small black circle represents a tobacco etch virus (TEV) protease cleavage site. Cleavage of a TEV site can result in assembly of the beta sheets and maturation of the chromophore.

[00015] FIGS. 3A-C are flowcharts showing methods of detecting secretion of a protein from a cell. FIG. 3A is a flowchart showing an exemplary method (“302”) for detecting the secretion level of a protein in a cell. In a first step (“304”) a cell is configured to export a fusion protein described herein. In a second step (“306”) the level of a detectable characteristic conferred to the cell is measured. FIG. 3B is a flowchart showing an exemplary method (“300”) for identifying genomic regions in a cell that modulate the secretion of a protein. In a first step (“305”) a cell is configured to express a fusion protein described herein. In a second step (“310”) an agent is introduced into the cell wherein the agent is configured to modulate at least one genomic region of the cell. In a third step (“315”) the level of a detectable characteristic conferred to the cell is measured. FIG. 3C is a flowchart showing an exemplary method (“320”) for identifying genomic regions in a cell that modulate the secretion of a protein. In a first step (“325”) a polynucleotide is generated that encodes a fusion protein as described herein that comprises a portion of a fluorescent protein configured to assemble upon cleavage of the cleavage site, wherein assembly results in maturation of a chromophore and the portion of a fluorescent protein is configured to remain a constituent of the cell after export of the translocated protein domain of the fusion protein. In a second step (“330”) the polynucleotide is ligated into a plasmid comprising homology sequences in a genomic region of the cell. In a third step (“335”) the plasmid is linearized. Alternatively, the polynucleotide encoding the fusion protein generated in 325 may be introduced into the cell without ligating the polynucleotide into a plasmid (330) and without linearization of the plasmid (335). In a fourth step (“340”) the linearized plasmid is introduced into the cell where homologous recombination takes place. 
In a fifth step (“345”) a population of cells is generated from the cell. In a sixth step (“350”) an agent configured to modulate at least one genomic region, such as a plurality of oligonucleotides that encode CRISPR Cas9 sgRNA sequences, is introduced into the population of cells. In a seventh step (“355”) the fusion protein is expressed in the cells. In an eighth step (“360”) cells in the population of cells are separated into two containers based on the level of fluorescent signal of each cell using fluorescence activated cell sorting (FACS). In a ninth step (“365”) genomic regions of at least one cell sorted into at least one of the containers are sequenced to determine genomic regions modulating the level of the secretion of the fusion protein.
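By way of illustration only, the sorting and sequencing steps of FIG. 3C (“360” and “365”) can be sketched in Python. The sketch assumes that each cell is summarized as a pair of an sgRNA identifier and a measured fluorescence value, that a single threshold separates the two containers, and that a pseudocounted high-to-low count ratio is used to rank guides. The function names, the threshold, the ratio, and the example values are hypothetical and are not taken from the disclosure.

```python
from collections import Counter

def sort_cells(cells, threshold):
    """Split (sgrna_id, fluorescence) pairs into low and high containers.

    The single-threshold gate is an assumption made for illustration.
    """
    low = [cell for cell in cells if cell[1] < threshold]
    high = [cell for cell in cells if cell[1] >= threshold]
    return low, high

def guide_enrichment(low, high, pseudocount=1.0):
    """Return a high/low count ratio per sgRNA (hypothetical ranking metric)."""
    low_counts = Counter(guide for guide, _ in low)
    high_counts = Counter(guide for guide, _ in high)
    guides = set(low_counts) | set(high_counts)
    # The pseudocount avoids division by zero for guides absent from one container.
    return {
        guide: (high_counts[guide] + pseudocount) / (low_counts[guide] + pseudocount)
        for guide in guides
    }

# Hypothetical measurements: (sgRNA identifier, fluorescence in arbitrary units).
example_cells = [
    ("g1", 900.0), ("g1", 850.0),
    ("g2", 100.0), ("g2", 120.0),
    ("g3", 500.0),
]
low_bin, high_bin = sort_cells(example_cells, threshold=500.0)
enrichment = guide_enrichment(low_bin, high_bin)
```

In this sketch, a guide with a ratio above 1 is found more often in the high-fluorescence container, which would flag the genomic region it targets as a candidate modulator of secretion.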

[00016] FIG. 4 is a diagram showing an exemplary fusion protein at various subcellular compartments and the plasma membrane in a cell. The exemplary fusion protein has 4 components where: 1 represents a translocated protein domain configured to be exported from the cell, 2 represents a transmembrane domain, 3 represents a tobacco etch virus (TEV) protease cleavage site, 4 represents a transcription factor configured to localize to the nucleus upon cleavage of the TEV protease site. FIG. 4 shows the exemplary fusion protein moving from the nucleus/endoplasmic reticulum (ER) to the Golgi apparatus to the plasma membrane of the cell. At the plasma membrane of the cell, 5 represents a TEV protease anchored to the plasma membrane by S-palmitoylation. In the cytosol of the cell, 6 represents the transcription factor after it has been cleaved from the exemplary fusion protein. Upon binding to a regulatory domain in the DNA a fluorescent protein is expressed and fluorescence can be detected. In the plasma membrane of the cell, 7 represents the remaining portion of the exemplary fusion protein after cleavage.

DETAILED DESCRIPTION

[00017] Described herein are systems, methods, and compositions for measuring the level of protein secretion of an individual cell. In various embodiments, a protein confers a detectable characteristic to a cell which varies depending on the level of heterologous protein secretion. In some embodiments, the protein confers a fluorescent signal when the heterologous protein is exported from the cell. In some embodiments, the methods also employ a CRISPR-based genome-wide screen to detect mutations in a high-throughput manner. In various embodiments, an oligonucleotide encoding a fusion protein is delivered to a cell wherein the fusion protein comprises a heterologous translocated protein domain, a single-pass transmembrane domain, a cleavage site for tobacco etch virus (TEV) protease, and a split fluorescent protein domain configured to fluoresce upon cleavage of the TEV protease site. In some embodiments, a TEV protease is anchored to the plasma membrane by S-palmitoylation wherein localization of the fusion protein to the plasma membrane can result in cleavage of the TEV cleavage site and fluorescence of the split fluorescent protein domain. In some embodiments, misfolding of the heterologous protein in the endoplasmic reticulum (ER) can result in endoplasmic-reticulum-associated protein degradation (ERAD) of the fusion protein wherein cleavage of the TEV cleavage site may not occur and a fluorescent signal may not be detected.
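The reporting logic described above can be summarized, purely as a simplified illustration, by the following Python sketch. It assumes a deterministic toy model in which a correctly folded fusion protein always reaches the plasma membrane and is always cleaved there by the anchored TEV protease, while a misfolded fusion protein is always removed by ERAD. Real cells behave probabilistically, and the names used here are hypothetical.

```python
from enum import Enum

class Fate(Enum):
    PLASMA_MEMBRANE = "plasma membrane"
    DEGRADED = "degraded by ERAD"

def fate_of(misfolded: bool) -> Fate:
    """Toy assumption: misfolded molecules are degraded, all others are trafficked."""
    return Fate.DEGRADED if misfolded else Fate.PLASMA_MEMBRANE

def is_fluorescent(misfolded: bool, tev_at_plasma_membrane: bool = True) -> bool:
    """A molecule reports only if it meets the anchored protease at the membrane."""
    return fate_of(misfolded) is Fate.PLASMA_MEMBRANE and tev_at_plasma_membrane

def cell_signal(folding_states, tev_at_plasma_membrane: bool = True) -> int:
    """Count reporting molecules in one cell; folding_states holds one bool per molecule."""
    return sum(
        is_fluorescent(misfolded, tev_at_plasma_membrane)
        for misfolded in folding_states
    )

# Hypothetical cell with five fusion protein molecules, two of them misfolded.
signal = cell_signal([False, False, True, False, True])
```

Under these assumptions the signal scales with the number of molecules that complete the secretory pathway, which is the property that lets fluorescence serve as a proxy for secretion.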

[00018] There are several advantages of the systems, methods, and compositions described herein over other methods and systems for detecting the level of protein secretion in a cell. The 5’ end of a transcript comprising the translational start site will have the same sequence as a secreted protein. Membrane separation of the heterologous protein and a fluorescent protein in some instances can be less likely to stabilize the heterologous protein. Cells can be taken directly from a bioreactor. No microfluidic equipment is required. The fusion protein takes up little space on the plasma membrane. The signal can be tuned using a degron sequence.

1. Definitions

[00019] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. The meaning and scope of the terms should be clear. In case of conflict, the present document, including definitions, takes precedence over any dictionary or extrinsic definition. Further, unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular. Preferred methods and materials are described below, although methods and materials similar or equivalent to those described herein can be used in practice or testing of the present invention. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety. The materials, methods, and examples disclosed herein are illustrative only and not intended to be limiting.

[00020] The terms “comprise(s),” “include(s),” “having,” “has,” “can,” “contain(s),” and variants thereof, as used herein, are intended to be open-ended transitional phrases, terms, or words that do not preclude the possibility of additional acts or structures. The singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. The present disclosure also contemplates other embodiments “comprising,” “consisting of,” and “consisting essentially of,” the embodiments or elements presented herein, whether explicitly set forth or not.

[00021] For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
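As an illustrative sketch only, the enumeration rule for numeric ranges can be expressed in Python. The sketch assumes that the endpoints are supplied as decimal strings and that the degree of precision is the number of decimal places written in those strings. The function name is hypothetical.

```python
from decimal import Decimal

def enumerate_range(low: str, high: str) -> list:
    """List every value from low to high at the precision of the endpoints."""
    lo, hi = Decimal(low), Decimal(high)
    # The step is one unit in the last decimal place written in either endpoint.
    exponent = min(lo.as_tuple().exponent, hi.as_tuple().exponent)
    step = Decimal((0, (1,), exponent))
    values = []
    current = lo
    while current <= hi:
        values.append(current)
        current += step
    return values

whole_numbers = [str(v) for v in enumerate_range("6", "9")]
tenths = [str(v) for v in enumerate_range("6.0", "7.0")]
```

Decimal arithmetic is used instead of binary floating point so that a value such as 6.1 is represented exactly.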

[00022] The term “about” or “approximately” as used herein as applied to one or more values of interest, refers to a value that is similar to a stated reference value, or within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, such as the limitations of the measurement system. In certain aspects, the term “about” refers to a range of values that fall within 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value). Alternatively, “about” can mean within 3 or more than 3 standard deviations, per the practice in the art. Alternatively, such as with respect to biological systems or processes, the term “about” can mean within an order of magnitude, preferably within 5-fold, and more preferably within 2-fold, of a value.

[00023] “Amino acid” as used herein refers to naturally occurring and non-natural synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code. Amino acids can be referred to herein by either their commonly known three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Amino acids include the side chain and polypeptide backbone portions.

[00024] “Coding sequence” or “encoding nucleic acid” as used herein means the nucleic acids (RNA or DNA molecule) that comprise a nucleotide sequence which encodes a protein. The coding sequence can further include initiation and termination signals operably linked to regulatory elements including a promoter and polyadenylation signal capable of directing expression in cells to which the nucleic acid is administered. The coding sequence may be codon optimized.

[00025] “Complement” or “complementary” as used herein with respect to a nucleic acid means Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. “Complementarity” refers to a property shared between two nucleic acid sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position will be complementary.
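For illustration, a minimal Python sketch of a complementarity check follows. It assumes Watson-Crick pairing only (Hoogsteen pairing and nucleotide analogs are not modeled), sequences written 5' to 3' as strings over A, C, G, T, and U, and full-length pairing. The names are hypothetical.

```python
# Watson-Crick pairs, with A pairing to either T (DNA) or U (RNA).
WATSON_CRICK_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
}

def are_complementary(strand_a: str, strand_b: str) -> bool:
    """Return True if the strands pair at every position when aligned antiparallel."""
    if len(strand_a) != len(strand_b):
        return False
    # Reversing the second strand places it antiparallel to the first.
    antiparallel_b = strand_b.upper()[::-1]
    return all(
        (a, b) in WATSON_CRICK_PAIRS
        for a, b in zip(strand_a.upper(), antiparallel_b)
    )
```

For example, `are_complementary("ATGC", "GCAT")` evaluates to True because reversing the second strand gives TACG, which pairs with ATGC at every position.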

[00026] “Fusion protein” as used herein refers to a chimeric protein created through the joining of two or more genes that originally coded for separate proteins. The translation of the fusion gene results in a single polypeptide with functional properties derived from each of the original proteins.

[00027] “Genetic construct” as used herein refers to the DNA or RNA molecules that comprise a polynucleotide that encodes a protein. The coding sequence includes initiation and termination signals operably linked to regulatory elements including a promoter and polyadenylation signal capable of directing expression in cells. As used herein, the term “expressible form” refers to gene constructs that contain the necessary regulatory elements operably linked to a coding sequence that encodes a protein such that when present in a cell, the coding sequence will be expressed.

[00028] The term “heterologous” as used herein refers to nucleic acid comprising two or more subsequences that are not found in the same relationship to each other in nature. For instance, a nucleic acid that is recombinantly produced typically has two or more sequences from unrelated genes synthetically arranged to make a new functional nucleic acid, for example, a promoter from one source and a coding region from another source. The two nucleic acids are thus heterologous to each other in this context. When added to a cell, the recombinant nucleic acids would also be heterologous to the endogenous genes of the cell. Similarly, a heterologous protein indicates that the protein comprises two or more subsequences that are not found in the same relationship to each other in nature (for example, a “fusion protein,” where the two subsequences are encoded by a single nucleic acid sequence).

[00029] “Identical” or “identity” as used herein in the context of two or more polynucleotide or polypeptide sequences means that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA, thymine (T) and uracil (U) may be considered equivalent. Identity may be determined manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0.
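
By way of non-limiting illustration, the percentage calculation described above can be sketched in Python. The sketch assumes the two sequences have already been optimally aligned by another tool and are supplied as equal-length strings in which `-` marks a gap or staggered end. The function name `percent_identity` and the `nucleic` flag are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, nucleic: bool = False) -> float:
    """Percent identity over a pre-aligned region.

    Positions where only one sequence has a residue (the other has '-')
    count toward the denominator but never toward the numerator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    if nucleic:
        # Treat thymine and uracil as equivalent when comparing DNA and RNA.
        a, b = a.replace("U", "T"), b.replace("U", "T")
    matched = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matched / len(a)
```

For instance, `percent_identity("ACGTAC", "ACGU--", nucleic=True)` counts four matched positions over a six-position region, or about 66.7%.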

[00030] “Nucleic acid” or “oligonucleotide” or “polynucleotide” as used herein means at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a polynucleotide also encompasses the complementary strand of a depicted single strand. Many variants of a polynucleotide may be used for the same purpose as a given polynucleotide. Thus, a polynucleotide also encompasses substantially identical polynucleotides and complements thereof. A single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions. Thus, a polynucleotide also encompasses a probe that hybridizes under stringent hybridization conditions. Polynucleotides may be single stranded or double stranded or may contain portions of both double stranded and single stranded sequence. The polynucleotide can be nucleic acid, natural or synthetic, DNA, genomic DNA, cDNA, RNA, or a hybrid, where the polynucleotide can contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including, for example, uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, and isoguanine. Polynucleotides can be obtained by chemical synthesis methods or by recombinant methods.

[00031] “Operably linked” as used herein means that expression of a gene is under the control of a promoter with which it is spatially connected. A promoter may be positioned 5' (upstream) or 3' (downstream) of a gene under its control. The distance between the promoter and a gene may be approximately the same as the distance between that promoter and the gene it controls in the gene from which the promoter is derived. As is known in the art, variation in this distance may be accommodated without loss of promoter function. Nucleic acid or amino acid sequences are “operably linked” (or “operatively linked”) when placed into a functional relationship with one another. For instance, a promoter or enhancer is operably linked to a coding sequence if it regulates, or contributes to the modulation of, the transcription of the coding sequence. Operably linked DNA sequences are typically contiguous, and operably linked amino acid sequences are typically contiguous and in the same reading frame. However, since enhancers generally function when separated from the promoter by up to several kilobases or more and intronic sequences may be of variable lengths, some polynucleotide elements may be operably linked but not contiguous. Similarly, certain amino acid sequences that are non-contiguous in a primary polypeptide sequence may nonetheless be operably linked due to, for example folding of a polypeptide chain. With respect to fusion polypeptides or protein and non-protein linkages such as a fusion protein linked to a membrane, the terms “operatively linked” and “operably linked” can refer to the fact that each of the components performs the same function in linkage to the other component as it would if it were not so linked.

[00032] A “peptide” or “polypeptide” is a linked sequence of two or more amino acids linked by peptide bonds. The polypeptide can be natural, synthetic, or a modification or combination of natural and synthetic. Peptides and polypeptides include proteins such as binding proteins, receptors, and antibodies. The terms “polypeptide”, “protein,” and “peptide” are used interchangeably herein. “Primary structure” refers to the amino acid sequence of a particular peptide. “Secondary structure” refers to locally ordered, three-dimensional structures within a polypeptide. These structures are commonly known as “domains”, for example, enzymatic domains, extracellular domains, transmembrane domains, pore domains, and cytoplasmic tail domains. “Domains” are portions of a polypeptide that form a compact unit of the polypeptide and are typically 15 to 350 amino acids long. Exemplary domains include domains with translocation activity, indicator activity, or membrane association activity. Typical domains are made up of sections of lesser organization such as stretches of beta-sheet and alpha-helices. “Tertiary structure” refers to the complete three-dimensional structure of a polypeptide monomer. “Quaternary structure” refers to the three-dimensional structure formed by the noncovalent association of independent tertiary units. A “motif” is a portion of a polypeptide sequence and includes at least two amino acids. A motif may be 2 to 20, 2 to 15, or 2 to 10 amino acids in length. In some embodiments, a motif includes 3, 4, 5, 6, or 7 sequential amino acids. A domain may be comprised of a series of the same type of motif.

[00033] “Promoter” as used herein means a synthetic or naturally derived molecule which is capable of conferring, activating, or enhancing expression of a nucleic acid in a cell. A promoter may comprise one or more specific transcriptional regulatory sequences to further enhance expression and/or to alter the spatial expression and/or temporal expression of same. A promoter may also comprise distal enhancer or repressor elements, which may be located as much as several thousand base pairs from the start site of transcription. A promoter may be derived from sources including viral, bacterial, fungal, plants, insects, and animals. A promoter may regulate the expression of a gene component constitutively, or differentially with respect to the cell, tissue, or organ in which expression occurs, or with respect to the developmental stage at which expression occurs, or in response to external stimuli such as physiological stresses, pathogens, metal ions, or inducing agents. Representative examples of promoters include the bacteriophage T7 promoter, bacteriophage T3 promoter, SP6 promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter, CMV I promoter, human U6 (hU6) promoter, CMV IE promoter, pTDH3 promoter, pCCW12 promoter, pPGK1 promoter, pHHF2 promoter, pTEF1 promoter, pTEF2 promoter, pHHF1 promoter, pHTB2 promoter, pRPL18B promoter, pALD6 promoter, pPAB1 promoter, pRET2 promoter, pRNR1 promoter, pSAC6 promoter, pRNR2 promoter, pPOP6 promoter, pRAD27 promoter, pPSP2 promoter, pREV1 promoter, pMFA1 promoter, pMFa2 promoter, pGAL1 promoter, and pCUP1 promoter.

[00034] The term “recombinant” when used with reference to, for example, a cell, nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein, or vector, has been modified by the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (naturally occurring) form of the cell or express a second copy of a native gene that is otherwise normally or abnormally expressed, under expressed, or not expressed at all.

[00035] “Substantially identical” can mean that a first and second amino acid or polynucleotide sequence are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical over a region of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or 1100 amino acids or nucleotides, respectively.

[00036] “Variant” used herein with respect to a polynucleotide means (i) a portion or fragment of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.

[00037] “Variant” with respect to a peptide or polypeptide means a peptide or polypeptide that differs in amino acid sequence by the insertion, deletion, or conservative substitution of amino acids, but retains at least one biological activity. Variant may also mean a protein with an amino acid sequence that is substantially identical to a referenced protein with an amino acid sequence that retains at least one biological activity. Representative examples of “biological activity” include the ability to be bound by a specific antibody or polypeptide or to promote an immune response. Variant can mean a functional fragment thereof. Variant can also mean multiple copies of a polypeptide. The multiple copies can be in tandem or separated by a linker. A conservative substitution of an amino acid, for example, replacing an amino acid with a different amino acid of similar properties (for example, hydrophilicity, degree and distribution of charged regions) is recognized in the art as typically involving a minor change. These minor changes may be identified, in part, by considering the hydropathic index of amino acids, as understood in the art (Kyte et al., J. Mol. Biol. 1982, 157, 105-132). The hydropathic index of an amino acid is based on a consideration of its hydrophobicity and charge. It is known in the art that amino acids of similar hydropathic indexes may be substituted and still retain protein function. For example, amino acids having hydropathic indexes within ±2 of each other may be substituted. The hydrophilicity of amino acids may also be used to reveal substitutions that would result in proteins retaining biological function. A consideration of the hydrophilicity of amino acids in the context of a peptide permits calculation of the greatest local average hydrophilicity of that peptide. Substitutions may be performed with amino acids having hydrophilicity values within ±2 of each other. Both the hydrophobicity index and the hydrophilicity value of amino acids are influenced by the particular side chain of that amino acid. Consistent with that observation, amino acid substitutions that are compatible with biological function are understood to depend on the relative similarity of the amino acids, and particularly the side chains of those amino acids, as revealed by the hydrophobicity, hydrophilicity, charge, size, and other properties.
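
By way of non-limiting illustration, the ±2 hydropathic index criterion can be sketched in Python using the Kyte-Doolittle scale from the cited reference. The function name `within_hydropathy_window` and its `window` parameter are hypothetical. The sketch considers the hydropathic index alone; it does not evaluate hydrophilicity, charge, size, or retention of biological activity.

```python
# Kyte-Doolittle hydropathic index values (Kyte et al., J. Mol. Biol. 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def within_hydropathy_window(original: str, replacement: str, window: float = 2.0) -> bool:
    """True if the two residues' hydropathic indexes differ by at most `window`."""
    a = KYTE_DOOLITTLE[original.upper()]
    b = KYTE_DOOLITTLE[replacement.upper()]
    return abs(a - b) <= window
```

Under this criterion, isoleucine to valine (4.5 versus 4.2) falls inside the window, whereas isoleucine to aspartate (4.5 versus -3.5) does not.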

[00038] “Vector” as used herein means a nucleic acid sequence containing an origin of replication. A vector may be a viral vector, bacteriophage, bacterial artificial chromosome, or yeast artificial chromosome. A vector may be a DNA or RNA vector. A vector may be a self-replicating extrachromosomal vector, and preferably, is a DNA plasmid. For example, the vector may encode a fusion protein described herein.

2. Fusion Protein

[00039] Provided herein are fusion proteins. The fusion proteins may include a translocated protein domain, a membrane association domain, a post-translational modification (PTM) site, and an indicator domain. A fusion protein may be encoded by the nucleotide sequence of SEQ ID NO: 107, SEQ ID NO: 110, or SEQ ID NO: 113, or variants thereof. A fusion protein may comprise the amino acid sequence of SEQ ID NO: 108, SEQ ID NO: 111, or SEQ ID NO: 114, or variants thereof.

[00040] The fusion proteins may also comprise a transmembrane domain such as a single-pass transmembrane domain. The transmembrane domain may be encoded by the nucleotide sequence of SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, or SEQ ID NO: 21, or variants thereof. The transmembrane domain may comprise the amino acid sequence of SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, or SEQ ID NO: 22, or variants thereof. Other transmembrane domains may include transmembrane domains from the Saccharomyces cerevisiae genes HKR1, MSB2, MID2, MTL1, WSC2, WSC3, WSC4, SLG1, OPY2, FAR10, SSO1, SSO2, PEP12, FUS1, SKG6, TOS2, AXL2, RAX2, SKG1, PMP1, PMP2, FET3, FET5, GMC1, SKN1, TRE2, TRE1, VPS70, YN60, YEH2, NPP1, NPP2, YO08A, FYV12, NAG1, PIN2, or YKY5 or variants thereof. The fusion proteins may further comprise a cleavage site. Cleavage of the fusion protein at the cleavage site may separate the translocated protein domain from the remainder of the fusion protein.

[00041] At least a portion of the translocated protein domain may be configured to be exported from a cell. The translocated protein domain may comprise a heterologous protein or an endogenous protein. The translocated protein domain may comprise a carbohydrase, an alpha-amylase, a protease, or a subtilisin. Other translocated protein domains may include insulin, beta-lactoglobulin, ovalbumin, albumin, antibodies, single-chain antibody fragments (scFv), membrane proteins, DNA polymerases, RNA polymerases, helicases, restriction enzymes, nucleases, single stranded DNA binding proteins, or fluorescent proteins. The carbohydrase, the alpha-amylase, the protease, and the subtilisin may be heterologous or endogenous. The translocated protein domain may be encoded by the nucleotide sequence of SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, or SEQ ID NO: 11, or variants thereof. The translocated protein domain may comprise the amino acid sequence of SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 12, or variants thereof.

[00042] A fusion protein may comprise a signal peptide. In particular, the translocated protein domain of the fusion protein may comprise a signal peptide. The signal peptide may mediate translocation of at least a portion of the translocated protein domain across a membrane. The signal peptide may be encoded by the nucleotide sequence of SEQ ID NO: 1, SEQ ID NO: 3, or variants thereof. The signal peptide may have the amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 4, or variants thereof. The fusion protein may comprise the signal peptide at the N-terminus of the fusion protein.

[00043] The membrane association domain may be capable of operably linking to a membrane. The membrane may be a membrane of the endoplasmic reticulum, a membrane of the Golgi apparatus, a vacuole membrane, or a plasma membrane. A fusion protein may comprise a linker. In particular, the membrane association domain may comprise a linker for linking the fusion protein to a membrane. The linker may be encoded by the nucleotide sequence of SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, or SEQ ID NO: 39, or variants thereof. The linker may have the amino acid sequence of SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID NO: 38, or SEQ ID NO: 40, or variants thereof. The fusion protein may comprise the linker between the translocated protein domain and the indicator domain.

[00044] The PTM site may be capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane. The PTM site may be modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane. The subcellular compartment may be the endoplasmic reticulum, the Golgi apparatus, or a vacuole of a cell. The PTM site may include a tobacco etch virus protease cleavage site, an intramembrane protease cleavage site, a cleavage site for gamma secretase, a phosphorylation site, or a glycosylation site.

[00045] The indicator domain may confer a detectable characteristic to a cell that comprises the fusion protein upon modification of the PTM site. The indicator domain may remain a constituent of the cell after export of the translocated protein domain. The indicator domain may include a cytoplasmic protein, a transcription factor, an antibiotic metabolizing protein, a fluorescent protein, a split fluorescent protein, or a Förster resonance energy transfer fluorescent protein. The indicator domain may be encoded by the nucleotide sequence of SEQ ID NO: 41 or SEQ ID NO: 43, or variants thereof. The indicator domain may comprise the amino acid sequence of SEQ ID NO: 42 or SEQ ID NO: 44, or variants thereof.

3. Fusion Protein-Based System

[00046] Provided herein are fusion protein-based systems. “Fusion protein-based system” and “system” may be used interchangeably herein. The system may be used to detect protein secretion. The system may include a fusion protein and a membrane localized protein comprising an enzymatic domain. The membrane localized protein comprising an enzymatic domain may be encoded by the nucleotide sequence of SEQ ID NO: 45 or have the amino acid sequence of SEQ ID NO: 46. The enzymatic domain of the membrane localized protein may modulate the PTM site of the fusion protein. Modulation may include cleavage, phosphorylation, SUMOylation, or glycosylation of the fusion protein. The membrane localized protein may be localized to the plasma membrane. The membrane localized protein may be localized to the plasma membrane through S-palmitoylation or myristoylation. The membrane localized protein may comprise a tobacco etch virus protease, a kinase, a SUMO-protein, or glycosylation proteins.

4. Genetic Constructs

[00047] A fusion protein described herein may be encoded by a polynucleotide. A fusion protein described herein may be encoded by or comprised within a genetic construct. The genetic construct, such as a plasmid or expression vector, may comprise a nucleic acid that encodes the fusion protein. A fusion protein-based system may be encoded by or comprised within a genetic construct. The genetic construct, such as a plasmid or expression vector, may comprise a nucleic acid that encodes the fusion protein-based system. The fusion protein and the membrane localized protein may be encoded by the same genetic construct or by different genetic constructs.

[00048] Genetic constructs may include polynucleotides such as vectors and plasmids. The genetic construct may be a plasmid. The plasmid may have the nucleotide sequence of SEQ ID NO: 106, SEQ ID NO: 109, SEQ ID NO: 112, or SEQ ID NO: 115. The vector may be an expression vector or system to produce protein by routine techniques and readily available starting materials, including those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Ed., Cold Spring Harbor (1989), which is incorporated fully by reference. The construct may be recombinant. The genetic construct may comprise regulatory elements for gene expression of the coding sequences of the nucleic acid. The regulatory elements may be a promoter, an enhancer, an initiation codon, a stop codon, or a polyadenylation signal.

5. Compositions

[00049] Further provided herein are compositions comprising the above-described genetic constructs or systems. In some embodiments, the composition may comprise about 1 ng to about 100 µg of DNA encoding the fusion protein-based system. The systems or genetic constructs as detailed herein, or at least one component thereof, may be formulated into a composition in accordance with standard techniques well known to those skilled in the art. The compositions can be formulated according to the mode of administration to be used. The compositions may be sterile, pyrogen free, and particulate free. An isotonic formulation may be used. Generally, additives for isotonicity may include sodium chloride, dextrose, mannitol, sorbitol, and lactose. In some cases, isotonic solutions such as phosphate buffered saline are preferred.

[00050] The composition may further comprise a pharmaceutically acceptable excipient. The pharmaceutically acceptable excipient may be functional molecules such as vehicles, adjuvants, carriers, or diluents. A “pharmaceutically acceptable carrier” may be a non-toxic, inert solid, semi-solid or liquid filler, diluent, encapsulating material or formulation auxiliary of any type. Pharmaceutically acceptable carriers include, for example, diluents, lubricants, binders, disintegrants, colorants, flavors, sweeteners, antioxidants, preservatives, glidants, solvents, suspending agents, wetting agents, surfactants, emollients, propellants, humectants, powders, pH adjusting agents, and combinations thereof. The pharmaceutically acceptable excipient may be a transfection facilitating agent, which may include surface active agents, such as immune-stimulating complexes (ISCOMS), Freund's incomplete adjuvant, LPS analog including monophosphoryl lipid A, muramyl peptides, quinone analogs, vesicles such as squalene and squalane, hyaluronic acid, lipids, liposomes, calcium ions, viral proteins, polyanions, polycations, or nanoparticles, or other known transfection facilitating agents. The transfection facilitating agent may be a polyanion, polycation, including poly-L-glutamate (LGS), or lipid. The transfection facilitating agent may be poly-L-glutamate.

6. Administration

[00051] The systems or genetic constructs as detailed herein, or at least one component thereof, may be administered or delivered to a cell. Methods of introducing a nucleic acid into a host cell are known in the art, and any known method can be used to introduce a nucleic acid (e.g., an expression construct) into a cell. Suitable methods include, for example, viral or bacteriophage infection, transfection, conjugation, protoplast fusion, polycation or lipid:nucleic acid conjugates, lipofection, electroporation, nucleofection, immunoliposomes, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery, lithium acetate (LiAc) and heat shock transformation, and the like. Several different buffers may be used, including BioRad electroporation solution, Sigma phosphate-buffered saline product #D8537 (PBS), Invitrogen OptiMEM I (OM), or Amaxa Nucleofector solution V (N.V.). Transfections may include a transfection reagent, such as Lipofectamine 2000.

[00052] Upon delivery of the presently disclosed systems or genetic constructs as detailed herein, or at least one component thereof, or the compositions comprising the same, into cells, the transfected cells may express the fusion protein or fusion protein-based system.

a. Cell Types

[00053] Any of the delivery methods detailed herein can be utilized with a myriad of cell types, including, but not limited to, a Saccharomyces cerevisiae cell, a Schizosaccharomyces pombe cell, a Pichia pastoris cell, a Yarrowia lipolytica cell, a Chinese hamster ovary cell, a murine myeloma cell, and a human embryonic kidney cell. Cells can be modified to isolate and expand clonal populations of cells that include a fusion protein or system described herein.

7. Methods

a. Methods for Identifying At Least One Genomic Region in a Cell that Modulates the Secretion of a Protein

[00054] Provided herein are methods for identifying at least one genomic region in a cell that modulates the secretion of a protein. The methods may include generating at least one cell that expresses a fusion protein described herein; introducing an agent into the cell, wherein the agent modulates at least one genomic region of the cell; and measuring the detectable characteristic of the cell. The agent may be a chemical mutagen, UV irradiation, an oligonucleotide, a CRISPR complex, or a CRISPR complex comprising a guide RNA and a Cas9 molecule. In some embodiments, the agent may be a CRISPR guide RNA library.

[00055] The method may further include isolating the cell. A cell may be isolated by fluorescence-activated cell sorting, manual isolation under a microscope, computer assisted optical tweezer isolation, or through dilution and plating on a solid media. The method may also further include sequencing at least one genomic region of the cell.

[00056] A population of cells may be generated by the method.

b. Methods for Detecting Secretion of a Protein

[00057] Provided herein are methods for detecting the secretion level of a protein in a cell. The methods may include generating at least one cell comprising a fusion protein as described herein and measuring a detectable characteristic described herein conferred to the cell.

[00058] Generation of the cell may include introducing a polynucleotide encoding the fusion protein into the cell. Generation of the cell may also include introducing a polynucleotide encoding the indicator domain into the cell.

[00059] Measuring the detectable characteristic may include use of an instrument of measurement such as a flow cytometer or a microscope. Antibiotic resistance may be measured by plating clones on plates with increasing amounts of antibiotic and counting colonies.

[00060] A population of cells may be generated by the method.

8. Kits

[00061] Provided herein is a kit, which may be used to detect protein secretion. The kit comprises genetic constructs or a composition comprising the same, for detection of protein secretion, as described above, and instructions for using said composition. In some embodiments, the kit comprises at least one fusion protein comprising the amino acid sequence of SEQ ID NO: 108, SEQ ID NO: 111, or SEQ ID NO: 114 or encoded by a polynucleotide sequence of SEQ ID NO: 107, SEQ ID NO: 110, or SEQ ID NO: 113, a complement thereof, a variant thereof, or fragment thereof, or fusion protein-based system comprising at least one fusion protein comprising the amino acid sequence of SEQ ID NO: 108, SEQ ID NO: 111, or SEQ ID NO: 114 or encoded by a polynucleotide sequence of SEQ ID NO: 107, SEQ ID NO: 110, or SEQ ID NO: 113, a complement thereof, a variant thereof, or fragment thereof and at least one membrane localized protein comprising the amino acid sequence of SEQ ID NO: 46 or encoded by a polynucleotide sequence of SEQ ID NO: 45, a complement thereof, a variant thereof, or fragment thereof, and instructions for using the fusion protein.

[00062] Instructions included in kits may be affixed to packaging material or may be included as a package insert. While the instructions are typically written on printed materials they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this disclosure. Such media include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. As used herein, the term “instructions” may include the address of an internet site that provides the instructions.

9. Examples

[00063] The foregoing may be better understood by reference to the following examples, which are presented for purposes of illustration and are not intended to limit the scope of the invention. The present disclosure has multiple aspects and embodiments, illustrated by the appended non-limiting examples.

Example 1

Materials and Methods

[00064] PCR Amplification and Purification of DNA Fragments. Genomic DNA from S. cerevisiae will be isolated using the DNeasy PowerLyzer Microbial Kit from Qiagen (Hilden, Germany) or plasmid DNA will be isolated using the Monarch® Plasmid Miniprep Kit from New England Biolabs (NEB, Ipswich, USA) to obtain template DNA for PCR amplification. DNA oligos and fragments, gBlocks™ and gBlocks HiFi Gene Fragments, will be synthesized by Integrated DNA Technologies, Inc. (IDT, Coralville, USA). A 2-step PCR using Q5® High-Fidelity DNA Polymerase (NEB) will be accomplished by designing primers with an annealing temperature of 64-72°C using the Tm Calculator from NEB and setting up the reaction according to the manufacturer’s recommendations. PCR fragments will be purified using the DNA Clean & Concentrator-5 Kit (Zymo Research, Irvine, USA).

[00065] Golden Gate Assembly and Cloning. T4 DNA Ligase and BsaI-HF®v2 (NEB) will be used in a single thermocycling reaction to assemble over 24 DNA fragments flanked by BsaI sites or DNA fragments cloned into plasmids containing BsaI sites. A standard reaction of 25 µL uses 0.5 µL of T4 DNA Ligase (2000 U/µL) and 1.5 µL of BsaI-HFv2 (20 U/µL) and about 75 ng of each fragment precloned into a plasmid. For DNA fragments that have not been cloned, molar ratios will be calculated based on the size of each fragment. A thermocycler will be programmed as follows: (5 min 37°C → 5 min 16°C) × 30 cycles followed by 5 min 60°C.
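
By way of non-limiting illustration, the size-based molar calculation and the length of the thermocycler program can be sketched in Python. The sketch assumes an average mass of 650 g/mol per base pair of double-stranded DNA, which is a common approximation and not a value stated herein. The function names and the 3000 bp reference plasmid used in the note below are hypothetical.

```python
# Assumption: average molar mass of one dsDNA base pair (g/mol).
AVG_BP_MASS = 650.0


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of dsDNA (ng) to an amount (pmol) from its length."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def pmol_to_ng(amount_pmol: float, length_bp: int) -> float:
    """Convert an amount of dsDNA (pmol) to a mass (ng) from its length."""
    return amount_pmol * length_bp * AVG_BP_MASS / 1000.0


def equimolar_masses(reference_ng: float, reference_bp: int,
                     fragment_lengths_bp: dict) -> dict:
    """Mass of each fragment (ng) that matches the moles of a reference DNA."""
    target_pmol = ng_to_pmol(reference_ng, reference_bp)
    return {name: pmol_to_ng(target_pmol, bp)
            for name, bp in fragment_lengths_bp.items()}


def program_minutes(cycles: int = 30, digest_min: float = 5,
                    ligate_min: float = 5, final_min: float = 5) -> float:
    """Total hold time of the cycling program, ignoring ramp times."""
    return cycles * (digest_min + ligate_min) + final_min
```

For instance, if 75 ng of a hypothetical 3000 bp plasmid is taken as the reference, an equimolar amount of a 600 bp uncloned fragment is about 15 ng, and the program as written holds for 305 minutes excluding ramping.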

[00066] The MoClo Yeast Toolkit (MoClo-YTK) is a collection of characterized DNA parts including promoters, coding sequences, terminators, origins of replication, and selectable markers, that uses golden gate assembly to quickly construct plasmids and can be ordered from Addgene (Watertown, USA).

[00067] Bacterial Transformation. Electrocompetent cells (10-beta Electrocompetent E. coli) will be purchased from NEB® and transformed following the manufacturer’s instructions. Briefly, cells and electroporation cuvettes are placed on ice until cells are thawed. Approximately 1 µL of DNA is added to 25 µL of cells and pipetted into a cuvette. After electroporation, 975 µL of outgrowth media is added to the cells and cells are incubated at 37°C for 1 hour before being plated on a selective media.

[00068] Yeast Transformation. Yeast cells will be transformed using the Lithium Acetate and PEG-3350 method suggested by the providers of the MoClo Yeast Toolkit. Briefly, cells are grown to an OD600 of approximately 0.8 in YPD. Cells are pelleted and washed once with water and twice with 100 mM Lithium Acetate and are vortexed with 2.4 mL of 50% PEG-3350, 360 µL of 1 M Lithium Acetate, 250 µL of salmon sperm DNA, and 500 µL of water. DNA is added to 100-350 µL of transformation mixture and incubated at 42°C for 25 min. Cells are incubated or pelleted and plated directly on agar plates containing the appropriate selection conditions. Plasmids with 5' and 3' genome homology arms for chromosomal integration are digested with NotI for 10 min prior to transformation to stimulate homologous recombination.
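
By way of non-limiting illustration, the arithmetic behind the transformation mixture can be sketched in Python. The sketch uses only the component volumes recited above and ignores the volume contributed by the cell pellet and the added DNA, which is an assumption. The names `MIX_UL`, `total_volume_ul`, `final_concentration`, and `reactions_supported` are hypothetical.

```python
# Component volumes of the transformation mixture, in microliters.
MIX_UL = {
    "PEG-3350 (50% stock)": 2400.0,
    "lithium acetate (1 M stock)": 360.0,
    "salmon sperm DNA": 250.0,
    "water": 500.0,
}


def total_volume_ul(mix: dict) -> float:
    """Sum of all component volumes."""
    return sum(mix.values())


def final_concentration(stock: float, component_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same units as `stock`."""
    return stock * component_ul / total_ul


def reactions_supported(total_ul: float, per_reaction_ul: float) -> int:
    """Whole number of transformations one batch of mixture can supply."""
    return int(total_ul // per_reaction_ul)


total = total_volume_ul(MIX_UL)                       # 3510.0 µL
peg_percent = final_concentration(50.0, 2400.0, total)
liac_mm = final_concentration(1000.0, 360.0, total)
```

Under these assumptions one batch totals 3510 µL, with PEG-3350 at roughly 34% and lithium acetate at roughly 103 mM, which is enough for 10 transformations at 350 µL each or 35 at 100 µL each.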

[00069] Yeast Strains. BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0) and BY4742 (MATα his3Δ1 leu2Δ0 lys2Δ0 ura3Δ0) are strains derived from the widely used S288C laboratory strain. These strains carry deletions that allow for selection using auxotrophic markers. BY4743 is a diploid cross of BY4741 and BY4742.

[00070] Confocal Microscopy and Cell Sorting. A confocal laser scanning microscope (Olympus FluoView FV1000) with a 60× (numerical aperture 1.2) oil-immersion objective will be used to obtain images of cells using an appropriate laser. Cells will be immobilized with hydrogel (Biomade, Groningen, Netherlands) between a glass slide and coverslip prior to imaging.

[00071] Fluorescence-activated cell sorting (FACS) of cells expressing fluorescent proteins will be carried out using the S3e Cell Sorter from Bio-Rad (Hercules, USA).

Example 2

Expression of Exemplary Fusion Proteins in Cells

[00072] The MoClo Yeast Tool Kit allows users to create custom parts through gene synthesis or PCR. The following PCR fragments will be generated using Q5 polymerase. Each reaction shows the corresponding forward and reverse primers as well as the template DNA (TABLE 1). The annealing temperature can be calculated for each of the primers using the NEB Tm Calculator. Extension times will be based on the length of the PCR fragments.

TABLE 1. PCR Fragments and Primers

[00073] The plasmid having the nucleotide sequence of SEQ ID NO: 106 will be constructed by Golden Gate assembly with the following MoClo parts and PCR products shown in TABLE 2:

TABLE 2. MoClo Parts and PCR Products for SEQ ID NO: 106

[00074] This plasmid expresses a fusion protein comprising an ER signal peptide, sfGFP, a WSC4 single-pass transmembrane linker, a tobacco etch virus (TEV) cleavage site, and mTurquoise2. After transformation of the plasmid and selection of clones using the His3 marker, protein expression will be induced using 0.01% galactose in the media. Cells expressing the fusion protein will be imaged on a confocal microscope using a 488 nm laser for sfGFP and a 434 nm laser for mTurquoise2. Emission wavelengths are 510 nm and 474 nm, respectively. Images will demonstrate localization of the fusion protein to the plasma membrane.
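The TEV cleavage site in this construct can be sanity-checked by translating its coding sequence. The following minimal sketch translates the DNA listed as SEQ ID NO: 29 using the standard genetic code; the helper name translate and the constant names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


TEV_SITE_DNA = "GAAAACCTTTATTTTCAGTCA"  # SEQ ID NO: 29

if __name__ == "__main__":
    print(translate(TEV_SITE_DNA))  # ENLYFQS
```

The translated motif, ENLYFQS, is the same stretch that appears within the transmembrane linker proteins such as SEQ ID NO: 32.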

[00075] The plasmid having the nucleotide sequence of SEQ ID NO: 115 will be constructed by Golden Gate assembly with the following MoClo parts and PCR products shown in TABLE 3:

TABLE 3. MoClo Parts and PCR Products for SEQ ID NO: 115

[00076] This plasmid expresses a fusion protein comprising TEV protease with the last 51 amino acids of Gap1p on the C terminus. The C terminus of Gap1p will be used to associate a protein with the plasma membrane through palmitoylation. This plasmid comprises URA3 5' and 3' homology arms and is designed to integrate into the chromosome. Homologous recombination efficiency is increased if the plasmid is digested with the restriction enzyme NotI before transformation into cells. Expression from this plasmid is constitutive and clones are selected using the LEU2 marker.

[00077] A strain with pJY_TEV_Gap1C integrated into the chromosome will be transformed with the plasmid having the nucleotide sequence of SEQ ID NO: 106 and expression will be induced using 0.01% galactose in the media. Confocal imaging of the resulting cells will demonstrate localization of sfGFP to the plasma membrane and mTurquoise2 in the cytoplasm of the cells. This will demonstrate that the TEV_Gap1C protein is able to cleave the cytoplasmic side of the fusion protein having the amino acid sequence of SEQ ID NO: 108.

[00078] While confocal imaging of the membrane and the cytoplasm is useful for detecting whether the translocated domain of the fusion protein has been exported from the cell, a different signal could be generated by replacing the mTurquoise2 domain with a transcription factor such as GAL4. The plasmid having the nucleotide sequence of SEQ ID NO: 109 can be constructed by Golden Gate assembly with the following MoClo parts and PCR products shown in TABLE 4:

TABLE 4. MoClo Parts and PCR Products for SEQ ID NO: 109

[00079] In order to generate a fluorescent signal when GAL4 is cleaved from the membrane, a reporter construct must be introduced into the strain with the plasmid having the nucleotide sequence of SEQ ID NO: 115 integrated into the chromosome. This will be accomplished by integrating the plasmid having the nucleotide sequence of SEQ ID NO: 104 into the HO locus. The plasmid is linearized with NotI before transformation and colonies are selected with G418. When GAL4 is expressed in the cell, the transcription factor binds to the UAS sequence and mScarlet-I will be expressed in the cytoplasm. The excitation and emission wavelengths for this protein are 569 nm and 593 nm, respectively.

[00080] Cells harboring chromosomal integrations of both the plasmid having the nucleotide sequence of SEQ ID NO: 115 and the plasmid having the nucleotide sequence of SEQ ID NO: 104 will be transformed with the plasmid having the nucleotide sequence of SEQ ID NO: 109, and expression of the fusion protein will be induced with 0.01% galactose in the media. Confocal imaging of the cells will demonstrate localization of sfGFP to the plasma membrane and mScarlet-I in the cytoplasm of the cells. This will demonstrate that the protein having the amino acid sequence of SEQ ID NO: 117 is able to cleave the cytoplasmic side of the fusion protein having the amino acid sequence of SEQ ID NO: 111 and that the GAL4 transcription factor is able to be imported into the nucleus of the cell and upregulate the transcription of mScarlet-I.

[00081] Beta-lactoglobulin is a commercially relevant protein that can be produced by microbial fermentation. The protein can be secreted into the media for easier downstream purification. By replacing the sfGFP domain in the previous construct with beta-lactoglobulin it will be possible to detect if beta-lactoglobulin is secreted from the cell. The plasmid having the nucleotide sequence of SEQ ID NO: 112 can be constructed by Golden Gate assembly with the following MoClo parts and PCR products shown in TABLE 5:

TABLE 5. MoClo Parts and PCR Products for SEQ ID NO: 112

[00082] Expressing this protein in a cell with both the plasmid having the nucleotide sequence of SEQ ID NO: 115 and the plasmid having the nucleotide sequence of SEQ ID NO: 104 chromosomal integrations will result in mScarlet-I in the cytoplasm. If the protein were to misfold in the endoplasmic reticulum (ER), it could be retro-translocated into the cytoplasm and be degraded through the endoplasmic-reticulum-associated protein degradation (ERAD) pathway. If only a small amount of beta-lactoglobulin is being exported from the cell, the amount of mScarlet-I in the cytoplasm would also be reduced and a correspondingly weaker fluorescent signal would be detected.

[00083] CRISPR screening is a powerful technique in functional genomics that has been applied to yeast biology (Momen-Roknabadi et al., Communications Biology, 2020; 3(1): 723). The ability to generate a fluorescent signal that is coupled to protein production of single cells will allow scientists to use genome-wide modification tools such as CRISPR to quickly identify genes that modulate expression of a protein such as beta-lactoglobulin. To carry out a CRISPR screen for genes that modulate expression of beta-lactoglobulin, a population of cells carrying the plasmids described herein would be generated. Next, a CRISPR library will be introduced into the cells. Cells will then be sorted on a cell sorter with a 569 nm laser and 593 nm detector. Cells with a fluorescent signal more than 2 standard deviations above the population mean will be collected. The sgRNA sequences will be PCR amplified and sequenced using high-throughput sequencing and will be compared to sgRNA sequences in the population as a whole to check for enrichment of certain genes. These genes will then be targets for modification to engineer a cell that produces higher levels of beta-lactoglobulin.
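The sorting gate and the enrichment comparison described above can be sketched in a few lines. This is a minimal illustration only. It assumes the gate is set at the population mean plus 2 sample standard deviations, and that enrichment is scored as a log2 ratio of sgRNA read fractions in the sorted cells versus the whole population, with a pseudocount to avoid division by zero. The function names, the pseudocount, and the scoring scheme are assumptions introduced here and are not specified in this disclosure.

```python
import math
import statistics


def gate_threshold(intensities, n_sd=2.0):
    """Return the fluorescence cutoff: population mean plus n_sd sample standard deviations."""
    mean = statistics.fmean(intensities)
    sd = statistics.stdev(intensities)
    return mean + n_sd * sd


def log2_enrichment(sorted_counts, population_counts, pseudocount=1.0):
    """Return log2(sorted read fraction / population read fraction) for each sgRNA.

    Both arguments map an sgRNA identifier to its read count. The pseudocount
    (an assumed choice) keeps guides with zero reads from producing infinities.
    """
    guides = set(sorted_counts) | set(population_counts)
    n_guides = len(guides)
    sorted_total = sum(sorted_counts.values()) + pseudocount * n_guides
    population_total = sum(population_counts.values()) + pseudocount * n_guides
    scores = {}
    for guide in guides:
        sorted_fraction = (sorted_counts.get(guide, 0) + pseudocount) / sorted_total
        population_fraction = (population_counts.get(guide, 0) + pseudocount) / population_total
        scores[guide] = math.log2(sorted_fraction / population_fraction)
    return scores


if __name__ == "__main__":
    # Hypothetical intensities and read counts, for illustration only.
    cutoff = gate_threshold([100, 120, 95, 110, 105, 400])
    print(f"collect cells above {cutoff:.1f}")
    scores = log2_enrichment({"sgRNA_A": 90, "sgRNA_B": 10}, {"sgRNA_A": 50, "sgRNA_B": 50})
    for guide, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
        print(guide, round(score, 2))
```

Guides with positive scores are over-represented among the collected cells and would point to candidate genes for follow-up. A real screen would typically also use replicates and a statistical test, which this sketch omits.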

***

[00084] The foregoing description of the specific aspects will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific aspects, without undue experimentation, without departing from the general concept of the present disclosure. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed aspects, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.

[00085] The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary aspects, but should be defined only in accordance with the following claims and their equivalents.

[00086] All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.

[00087] For reasons of completeness, various aspects of the invention are set out in the following numbered clauses:

[00088] Clause 1. A fusion protein comprising: (a) a translocated protein domain comprising a signal peptide, wherein the signal peptide mediates translocation of at least a portion of the translocated protein domain across a membrane; (b) a membrane association domain comprising a linker for linking the fusion protein to the membrane; (c) a post-translational modification site, wherein the post-translational modification site is capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane; and (d) an indicator domain, wherein the indicator domain confers a detectable characteristic to a cell comprising the fusion protein upon modification of the post-translational modification site, and wherein the indicator domain remains a constituent of the cell after export of the translocated protein domain.

[00089] Clause 2. The fusion protein of clause 1, wherein the translocated protein domain comprises a heterologous protein or an endogenous protein.

[00090] Clause 3. The fusion protein of clause 1 or clause 2, wherein the translocated protein domain comprises a carbohydrase, an alpha-amylase, a protease, or a subtilisin.

[00091] Clause 4. The fusion protein of any one of clauses 1-3, wherein the indicator domain is encoded by the nucleotide sequence of SEQ ID NO: 41 or SEQ ID NO: 43, or comprises the amino acid sequence of SEQ ID NO: 42 or SEQ ID NO: 44.

[00092] Clause 5. The fusion protein of any one of clauses 1-4, wherein the translocated protein domain is encoded by the nucleotide sequence of SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, or SEQ ID NO: 11, or comprises the amino acid sequence of SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 12.

[00093] Clause 6. The fusion protein of any one of clauses 1-5, further comprising a transmembrane domain.

[00094] Clause 7. The fusion protein of any one of clauses 1-6, further comprising a single-pass transmembrane domain.

[00095] Clause 8. The fusion protein of clause 6, wherein the transmembrane domain is encoded by the nucleotide sequence of SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, or SEQ ID NO: 21, or comprises the amino acid sequence of SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO: 20, or SEQ ID NO: 22.

[00096] Clause 9. The fusion protein of any one of clauses 1-8, wherein the indicator domain comprises a cytoplasmic protein, a transcription factor, an antibiotic metabolizing protein, a fluorescent protein, a split fluorescent protein, or a Förster resonance energy transfer fluorescent protein.

[00097] Clause 10. The fusion protein of any one of clauses 1-9, wherein the post-translational modification site comprises a tobacco etch virus protease cleavage site, an intramembrane protease cleavage site, a cleavage site for gamma secretase, a phosphorylation site, or a glycosylation site.

[00098] Clause 11. The fusion protein of any one of clauses 1-10, further comprising a cleavage site, wherein cleavage of the fusion protein at the cleavage site separates the translocated protein domain from the remainder of the fusion protein.

[00099] Clause 12. The fusion protein of any one of clauses 1-11, wherein the cell is a Saccharomyces cerevisiae cell, a Schizosaccharomyces pombe cell, a Pichia pastoris cell, a Yarrowia lipolytica cell, a Chinese hamster ovary cell, a murine myeloma cell, or a human embryonic kidney cell.

[000100] Clause 13. The fusion protein of any one of clauses 1-12, wherein the fusion protein is encoded by the nucleotide sequence of SEQ ID NO: 107, SEQ ID NO: 110, or SEQ ID NO: 113 or comprises the amino acid sequence of SEQ ID NO: 108, SEQ ID NO: 111, or SEQ ID NO: 114.

[000101] Clause 14. A polynucleotide that encodes the fusion protein of any one of clauses 1-13.

[000102] Clause 15. A method for identifying at least one genomic region in a cell that modulates the secretion of a protein comprising: (a) generating at least one cell that expresses the fusion protein of any one of clauses 1-13; (b) introducing an agent into the cell, wherein the agent modulates at least one genomic region of the cell; and (c) measuring the detectable characteristic of the cell.

[000103] Clause 16. The method of clause 15, further comprising (d) isolating the cell.

[000104] Clause 17. The method of clause 16, further comprising (e) sequencing at least one genomic region of the cell.

[000105] Clause 18. The method of any one of clauses 15-17, wherein the agent is a chemical mutagen, an oligonucleotide, a CRISPR complex, or a CRISPR complex comprising a guide RNA and a Cas9 molecule.

[000106] Clause 19. The method of any one of clauses 15-18, wherein a population of cells is generated.

[000107] Clause 20. The method of clause 19, wherein at least one cell is isolated using fluorescence-activated cell sorting or through dilution and plating on a solid media.

[000108] Clause 21. The method of clause 19, wherein the agent comprises a CRISPR guide RNA library.

[000109] Clause 22. A system comprising a cell comprising: (a) the fusion protein of any one of clauses 1-13; and (b) a membrane localized protein comprising an enzymatic domain, wherein the enzymatic domain modulates the post-translational modification site of the fusion protein.

[000110] Clause 23. The system of clause 22, wherein the membrane localized protein is localized to the plasma membrane.

[000111] Clause 24. The system of clause 23, wherein the membrane localized protein is localized to the plasma membrane through S-palmitoylation.

[000112] Clause 25. The system of any one of clauses 22-24, wherein the modulation of the post-translational modification site comprises cleavage of the fusion protein.

[000113] Clause 26. The system of any one of clauses 22-25, wherein the membrane localized protein comprises a tobacco etch virus protease.

[000114] Clause 27. A method for detecting the secretion level of a protein in a cell comprising: (a) generating at least one cell comprising a fusion protein comprising: (i) a translocated protein domain comprising a signal peptide, wherein the signal peptide mediates translocation of at least a portion of the translocated protein domain across a membrane; (ii) a membrane association domain comprising a linker for linking the fusion protein to the membrane; (iii) a post-translational modification site, wherein the post-translational modification site is capable of being modified when the translocated protein domain is localized to a distinct subcellular compartment or plasma membrane; and (iv) an indicator domain, wherein the indicator domain confers a detectable characteristic to a cell comprising the fusion protein upon modification of the post-translational modification site, and wherein the indicator domain remains a constituent of the cell after export of the translocated protein domain; and (b) measuring the detectable characteristic conferred to the cell.

[000115] Clause 28. The method of clause 27, wherein generation of the cell comprises introducing a polynucleotide encoding the fusion protein into the cell.

[000116] Clause 29. The method of clause 27 or clause 28, wherein generation of the cell comprises introducing a polynucleotide encoding the indicator domain into the cell.

[000117] Clause 30. The method of any one of clauses 27-29, wherein measuring the detectable characteristic comprises use of a flow cytometer or a microscope.

[000118] Clause 31. The method of any one of clauses 27-30, wherein a population of cells is generated.

SEQUENCES

SEQ ID NO: 1

ER Signal Peptide Pre_Ost1_Alpha_signal_DNA

ATGAGGCAGGTTTGGTTCTCTTGGATTGTGGGATTGTTCCTATGTTTTTTCAACGTGTCTTCTGCTGCTCCAGTCAACACTACAACAGAAGATGAAACGGCACAAATTCCGGCTGAAGCTGTCATCGGTTACTCAGATTTAGAAGGGGATTTCGATGTTGCTGTTTTGCCATTTTCCAACAGCACAAATAACGGGTTATTGTTTATAAATACTACTATTGCCAGCATTGCTGCTAAAGAAGAAGGGGTATCTCTCGAGAAAAGAGAGGCTGAAGCT

SEQ ID NO: 2

ER Signal Peptide Pre_Ost1_Alpha_signal_Protein

MRQVWFSWIVGLFLCFFNVSSAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA

SEQ ID NO: 3

ER Signal Peptide Pre_Pro_Alpha_Mating_Factor_Signal_DNA

ATGAGATTTCCTTCAATTTTTACTGCAGTTTTATTCGCAGCATCCTCCGCATTAGCTGCTCCAGTCAACACTACAACAGAAGATGAAACGGCACAAATTCCGGCTGAAGCTGTCATCGGTTACTTAGATTTAGAAGGGGATTTCGATGTTGCTGTTTTGCCATTTTCCAACAGCACAAATAACGGGTTATTGTTTATAAATACTACTATTGCCAGCATTGCTGCTAAAGAAGAAGGGGTATCTTTGGATAAAAGAGAGGCTGAAGCT

SEQ ID NO: 4

ER Signal Peptide Pre_Pro_Alpha_Mating_Factor_Signal_Protein

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYLDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLDKREAEA

SEQ ID NO: 5 sfGFP_DNA

ATGAGCAAAGGTGAAGAACTGTTTACCGGCGTTGTGCCGATTCTGGTGGAACTGGAT GGT

GATGTGAATGGCCATAAATTTAGCGTTCGTGGCGAAGGCGAAGGTGATGCGACCAAC GGT

AAACTGACCCTGAAATTTATTTGCACCACCGGTAAACTGCCGGTTCCGTGGCCGACC CTGG

TGACCACCCTGACCTATGGCGTTCAGTGCTTTAGCCGCTATCCGGATCATATGAAAC GCCA

TGATTTCTTTAAAAGCGCGATGCCGGAAGGCTATGTGCAGGAACGTACCATTAGCTT CAAA

GATGATGGCACCTATAAAACCCGTGCGGAAGTTAAATTTGAAGGCGATACCCTGGTG AACC

GCATTGAACTGAAAGGTATTGATTTTAAAGAAGATGGCAACATTCTGGGTCATAAAC TGGAA

TATAATTTCAACAGCCATAATGTGTATATTACCGCCGATAAACAGAAAAATGGCATC AAAGC

GAACTTTAAAATCCGTCACAACGTGGAAGATGGTAGCGTGCAGCTGGCGGATCATTA TCAG

CAGAATACCCCGATTGGTGATGGCCCGGTGCTGCTGCCGGATAATCATTATCTGAGC ACC

CAGAGCGTTCTGAGCAAAGATCCGAATGAAAAACGTGATCATATGGTGCTGCTGGAA TTTG

TTACCGCCGCGGGCATTACCCACGGTATGGATGAACTGTATAAA

SEQ ID NO: 6 sfGFP_Protein

MSKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTGKLPVPW PTLVTTL

TYGVQCFSRYPDHMKRHDFFKSAMPEGYVQERTISFKDDGTYKTRAEVKFEGDTLVN RIELKG

IDFKEDGNILGHKLEYNFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQ NTPIGDG

PVLLPDNHYLSTQSVLSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

SEQ ID NO: 7

Aspergillus_Tubingensis_Alpha_Amylase_DNA

ATGAGAGTGTCGACTTCAAGTATTGCCCTTGCTGTGTCCCTTTTTGGGAAGCTGGCC CTTG

GGCTGTCAGCTGCAGAATGGCGCACTCAATCCATCTACTTCCTTTTGACGGATCGGT TCGG

TAGGACGGACAATTCGACTACAGCTACGTGCAATACGGGTGACCAAATCTACTGTGG TGG

AAGTTGGCAAGGAATTATCAACCATCTGGACTATATCCAGGGCATGGGATTCACAGC TATC

TGGATCTCGCCTATCACTGAGCAGCTACCCCAGGATACTTCGGATGGTGAAGCCTAC CAT

GGATACTGGCAGCAGAAGATATACTATGTGAACTCCAACTTCGGCACGGCAGATGAT CTGA

AGTCCCTCTCCGATGCTCTTCACGCCCGCGGAATGTACCTCATGGTCGACGTCGTCC CTA

ACCACATGGGCTACGCAGGTAACGGCAACGATGTGGATTACAGCGTCTTCGACCCCT TCG

ACTCCTCCTCCTACTTCCATCCATACTGCCTCATCACAGATTGGGACAACTTGACCA TGGT

CCAAGACTGTTGGGAGGGTGACACCATCGTGTCTCTGCCAGATCTGAACACCACGGA AAC

CGCCGTGAGAACCATTTGGTACGATTGGGTAGCCGACCTGGTATCCAACTACTCAGT CGA

CGGCCTCCGTATCGACAGTGTCGAAGAAGTCGAACCCGACTTCTTCCCGGGCTACCA AGA

AGCAGCAGGAGTCTACTGCGTCGGTGAAGTCGACAACGGCAACCCTGCTCTCGACTG CCC

ATACCAAAAATATCTAGATGGTGTTCTCAACTATCCCATCTACTGGCAACTCCTCTA CGCCT

TTGAATCCTCCAGCGGCAGCATCAGCAACCTCTACAACATGATCAAATCCGTCGCCA GCGA

CTGCTCCGATCCGACCCTCCTGGGCAACTTTATCGAAAACCACGACAACCCCCGCTT CGC

CTCCTACACATCCGACTACTCCCAAGCCAAAAACGTCCTCAGCTACATCTTCCTCTC CGAC

GGCATCCCCATCGTCTACGCCGGCGAAGAACAGCACTACTCCGGCGGCGACGTGCCC TA

CAACCGCGAAGCTACCTGGCTATCAGGCTACGACACCTCCGCGGAGCTCTACACCTG GAT AGCCACCACAAACGCGATCCGGAAACTAGCTATCTCAGCAGACTCGGACTACATTACTTA C

AAGAACGACCCAATCTACACAGACAGCAACACCATCGCGATGCGCAAAGGCACCTCC GGC

TCCCAAATCATCACCGTCCTCTCCAACAAAGGCTCCTCCGGAAGCAGCTACACCCTC ACCC

TCAGCGGAAGCGGCTACACGTCCGGCACGAAGCTCATCGAAGCGTACACCTGCACGT CC

GTGACGGTGGACTCGAACGGGGATATCCCTGTGCCGATGGCTTCGGGATTACCTAGA GTT

CTCCTCCCTGCTTCGGTGGTTGATAGTTCTTCGCTTTGTGGGGGGAGTGGTAACACA ACCA

CGACCACAACTGCTGCTACCTCCACATCCAAAGCCACCACCTCCTCTTCTTCTTCTT CTGCT

GCTGCTACTACTTCTTCATCATGCACCGCAACAAGCACCACCCTCCCCATCACCTTC GAAG

AACTCGTCACCACTACCTACGGGGAAGAAGTCTACCTCAGCGGATCTATCTCCCAGC TCG

GAGAGTGGCATACGAGTGACGCGGTGAAGTTGTCCGCGGATGATTATACCTCGAGTA ACC

CCGAGTGGTCTGTTACTGTGTCGTTGCCGGTGGGGACGACCTTCGAGTATAAGTTTA TTAA

GGTCGATGAGGGTGGAAGTGTGACTTGGGAAAGTGATCCGAATAGGGAGTATACTGT GCC

TGAATGTGGGAGTGGGAGTGGGGAGACGGTGGTTGATACGTGGAGGTAG

SEQ ID NO: 8

Aspergillus_Tubingensis_Alpha_Amylase_Protein

MRVSTSSIALAVSLFGKLALGLSAAEWRTQSIYFLLTDRFGRTDNSTTATCNTGDQI YCGGSWQ

GIINHLDYIQGMGFTAIWISPITEQLPQDTSDGEAYHGYWQQKIYYVNSNFGTADDL KSLSDALH

ARGMYLMVDVVPNHMGYAGNGNDVDYSVFDPFDSSSYFHPYCLITDWDNLTMVQDCW EGDT

IVSLPDLNTTETAVRTIWYDWVADLVSNYSVDGLRIDSVEEVEPDFFPGYQEAAGVY CVGEVD

NGNPALDCPYQKYLDGVLNYPIYWQLLYAFESSSGSISNLYNMIKSVASDCSDPTLL GNFIENH

DNPRFASYTSDYSQAKNVLSYIFLSDGIPIVYAGEEQHYSGGDVPYNREATWLSGYD TSAELYT

WIATTNAIRKLAISADSDYITYKNDPIYTDSNTIAMRKGTSGSQIITVLSNKGSSGS SYTLTLSGSG

YTSGTKLIEAYTCTSVTVDSNGDIPVPMASGLPRVLLPASVVDSSSLCGGSGNTTTT TTAATST

SKATTSSSSSSAAATTSSSCTATSTTLPITFEELVTTTYGEEVYLSGSISQLGEWHT SDAVKLSA

DDYTSSNPEWSVTVSLPVGTTFEYKFIKVDEGGSVTWESDPNREYTVPECGSGSGET VVDTW

SEQ ID NO: 9

Insulin Precursor_DNA

TTTGTAAACCAGCATTTATGCGGGAGTCATCTGGTCGAGGCGTTGTATTTGGTATGT GGAG

AACGTGGCTTTTTTTATACACCGAAAACATCCGACGATGCTAAGGGAATCGTCGAAC AATG

TTGTACGTCTATCTGTTCCCTTTATCAACTAGAGAACTACTGTAAC

SEQ ID NO: 10

Insulin Precursor_Protein

FVNQHLCGSHLVEALYLVCGERGFFYTPKTSDDAKGIVEQCCTSICSLYQLENYCN

SEQ ID NO: 11

Beta lactoglobulin_DNA

CTTATAGTGACTCAAACCATGAAGGGACTGGACATCCAGAAGGTCGCAGGTACATGG TACT

CTCTTGCGATGGCGGCTTCTGACATTAGCCTGCTAGATGCGCAATCAGCTCCTTTAC GTGT

TTACGTCGAAGAGTTGAAGCCAACACCGGAAGGGGACCTGGAAATTCTTCTTCAAAA ATGG

GAGAATGGAGAGTGTGCGCAAAAAAAAATTATCGCTGAAAAGACAAAAATCCCTGCC GTCT

TCAAAATTGACGCGCTGAATGAGAATAAGGTATTGGTTTTAGATACCGACTATAAGA AATAC

CTTTTATTCTGCATGGAAAATAGTGCCGAGCCAGAGCAGTCCCTTGCTTGCCAATGT CTTG

TTCGTACACCCGAGGTGGACGATGAGGCCCTAGAGAAATTCGATAAGGCATTGAAAG CAC

TTCCAATGCACATAAGGCTAAGTTTCAACCCAACTCAGCTTGAGGAGCAGTGTCACA TC

SEQ ID NO: 12

Beta lactoglobulin_Protein

LIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENG

ECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQC LVRTPEVD

DEALEKFDKALKALPMHIRLSFNPTQLEEQCHI

SEQ ID NO: 13

WSC4_Transmembrane_Domain_DNA

AGCCCAGGGAAGATTGCTGCCACGTTCGTGGTAGTTGGCGTGGTCTGCTTAGTCATT ATCT

GCATCCTGATCTACCTGATCCATCACTACAGA

SEQ ID NO: 14

WSC4_Transmembrane_Domain_Protein

SPGKIAATFVVVGVVCLVIICILIYLIHHYR

SEQ ID NO: 15

WSC3_Transmembrane_Domain_DNA

GCCATTGCCGGTATTGTTATTGGTGTTGTGTTTGGCGTAATTTTTATTATTTTGATT CTATTG

TTCCTGATATGGAGGAGACGGAAATCG

SEQ ID NO: 16

WSC3_Transmembrane_Domain_Protein

AIAGIVIGVVFGVIFIILILLFLIWRRRKS

SEQ ID NO: 17

WSC2_Transmembrane_Domain_DNA

ATCGCAGGTGTCGTAGTAGGTGTGGTTTGTGGTACAGTTGCCTTGTTGGCTCTGGCG TTAT

TCTTTTTCGTATGGAAAAAACGTCGCCAA

SEQ ID NO: 18

WSC2_Transmembrane_Domain_Protein

IAGVVVGVVCGTVALLALALFFFVWKKRRQ

SEQ ID NO: 19

SLG1_Transmembrane_Domain_DNA

GTTGTAGGTGGTGTAGTGGGAGCCGTAGCCATTGCTCTTTGTATCTTGTTGATTGTC AGAC

ACATTAATATGAAGCGG

SEQ ID NO: 20

SLG1_Transmembrane_Domain_Protein

VVGGVVGAVAIALCILLIVRHINMKR

SEQ ID NO: 21

JAM1_Transmembrane_Domain_DNA

GTTGGCGTGATAGTGGCAGCAGTACTGGTAACGCTAATACTTTTAGGCATTCTTGTA TTTG

GAATCTGGTTCGCCTACTCAAGAGGTCAT

SEQ ID NO: 22

JAM1_Transmembrane_Domain_Protein

VGVIVAAVLVTLILLGILVFGIWFAYSRGH

SEQ ID NO: 23

Glycine_Serine_Linker_1_DNA

GGTGGCTCCACTTCCGGCGGGAGCGGGAGT

SEQ ID NO: 24

Glycine_Serine_Linker_1_Protein

GGSTSGGSGS

SEQ ID NO: 25

Glycine_Serine_Linker_2_DNA

ACCGGCTCCACTAGCGGTGGTTCCAGTACGGGCTCTGGAGGCTCTGGGAGTGGCACC

SEQ ID NO: 26

Glycine_Serine_Linker_2_Protein

TGSTSGGSSTGSGGSGSGT

SEQ ID NO: 27

Glycine_Serine_Linker_3_DNA

GGAACAGGCAGTGGCGGAAGTAGTACTGGCTCTACCGGC

SEQ ID NO: 28

Glycine_Serine_Linker_3_Protein

GTGSGGSSTGSTG

SEQ ID NO: 29

TEV_Protease_Cleavage_Site_DNA

GAAAACCTTTATTTTCAGTCA

SEQ ID NO: 30

TEV_Protease_Cleavage_Site_Protein

ENLYFQS

SEQ ID NO: 31

Transmembrane_WSC4_linker_DNA

GGTGGCTCCACTTCCGGCGGGAGCGGGAGTAGCCCAGGGAAGATTGCTGCCACGTTC GT

GGTAGTTGGCGTGGTCTGCTTAGTCATTATCTGCATCCTGATCTACCTGATCCATCA CTACA

GAACCGGCTCCACTAGCGGTGGTTCCAGTACGGGCTCTGGAGGCTCTGGGAGTGGCA CC

GAAAACCTTTATTTTCAGTCAGGAACAGGCAGTGGCGGAAGTAGTACTGGCTCTACC GGC

SEQ ID NO: 32

Transmembrane_WSC4_linker_Protein

GGSTSGGSGSSPGKIAATFVVVGVVCLVIICILIYLIHHYRTGSTSGGSSTGSGGSG SGTENLYF

QSGTGSGGSSTGSTG

SEQ ID NO: 33

Transmembrane_WSC3_linker_DNA

GGTGGCTCCACTTCCGGCGGGAGCGGGAGTGCCATTGCCGGTATTGTTATTGGTGTT GTG

TTTGGCGTAATTTTTATTATTTTGATTCTATTGTTCCTGATATGGAGGAGACGGAAA TCGAC

CGGCTCCACTAGCGGTGGTTCCAGTACGGGCTCTGGAGGCTCTGGGAGTGGCACCGA AA

ACCTTTATTTTCAGTCAGGAACAGGCAGTGGCGGAAGTAGTACTGGCTCTACCGGC

SEQ ID NO: 34

Transmembrane_WSC3_linker_Protein

GGSTSGGSGSAIAGIVIGVVFGVIFIILILLFLIWRRRKSTGSTSGGSSTGSGGSGSGTENLYFQSGTGSGGSSTGSTG

SEQ ID NO: 35

Transmembrane_WSC2_linker_DNA

GGTGGCTCCACTTCCGGCGGGAGCGGGAGTATCGCAGGTGTCGTAGTAGGTGTGGTT TG

TGGTACAGTTGCCTTGTTGGCTCTGGCGTTATTCTTTTTCGTATGGAAAAAACGTCG CCAAA

CCGGCTCCACTAGCGGTGGTTCCAGTACGGGCTCTGGAGGCTCTGGGAGTGGCACCG AA

AACCTTTATTTTCAGTCAGGAACAGGCAGTGGCGGAAGTAGTACTGGCTCTACCGGC

SEQ ID NO: 36

Transmembrane_WSC2_linker_Protein

GGSTSGGSGSIAGVVVGVVCGTVALLALALFFFVWKKRRQTGSTSGGSSTGSGGSGS GTENL

YFQSGTGSGGSSTGSTG

SEQ ID NO: 37

Transmembrane_SLG1_linker_DNA

GGTGGCTCCACTTCCGGCGGGAGCGGGAGTGTTGTAGGTGGTGTAGTGGGAGCCGTA GC

CATTGCTCTTTGTATCTTGTTGATTGTCAGACACATTAATATGAAGCGGACCGGCTC CACTA

GCGGTGGTTCCAGTACGGGCTCTGGAGGCTCTGGGAGTGGCACCGAAAACCTTTATT TTC

AGTCAGGAACAGGCAGTGGCGGAAGTAGTACTGGCTCTACCGGC

SEQ ID NO: 38

Transmembrane_SLG1_linker_Protein

GGSTSGGSGSVVGGVVGAVAIALCILLIVRHINMKRTGSTSGGSSTGSGGSGSGTEN LYFQSG

TGSGGSSTGSTG

SEQ ID NO: 39

Transmembrane_JAM1_linker_DNA

GGTGGCTCCACTTCCGGCGGGAGCGGGAGTGTTGGCGTGATAGTGGCAGCAGTACTG GT

AACGCTAATACTTTTAGGCATTCTTGTATTTGGAATCTGGTTCGCCTACTCAAGAGG TCATA

CCGGCTCCACTAGCGGTGGTTCCAGTACGGGCTCTGGAGGCTCTGGGAGTGGCACCG AA

AACCTTTATTTTCAGTCAGGAACAGGCAGTGGCGGAAGTAGTACTGGCTCTACCGGC

SEQ ID NO: 40

Transmembrane_JAM1_linker_Protein

GGSTSGGSGSVGVIVAAVLVTLILLGILVFGIWFAYSRGHTGSTSGGSSTGSGGSGS GTENLYF

QSGTGSGGSSTGSTG

SEQ ID NO: 41

Indicator Domain Gal4_DNA

ATGAAGCTACTGTCTTCTATCGAACAAGCATGCGATATTTGCCGACTTAAAAAGCTC AAGTG

CTCCAAAGAAAAACCGAAGTGCGCCAAGTGTCTGAAGAACAACTGGGAGTGTCGCTA CTC

TCCCAAAACCAAAAGGTCTCCGCTGACTAGGGCACATCTGACAGAAGTGGAATCAAG GCT

AGAAAGACTGGAACAGCTATTTCTACTGATTTTTCCTCGAGAAGACCTTGACATGAT TTTGA

AAATGGATTCTTTACAGGATATAAAAGCATTGTTAACAGGATTATTTGTACAAGATA ATGTGA

ATAAAGATGCCGTCACAGATAGATTGGCTTCAGTGGAGACTGATATGCCTCTAACAT TGAG

ACAGCATAGAATAAGTGCGACATCATCATCGGAAGAGAGTAGTAACAAAGGTCAAAG ACAG

TTGACTGTATCGATTGACTCGGCAGCTCATCATGATAACTCCACAATTCCGTTGGAT TTTAT

GCCCAGGGATGCTCTTCATGGATTTGATTGGTCTGAAGAGGATGACATGTCGGATGG CTT

GCCCTTCCTGAAAACGGACCCCAACAATAATGGGTTCTTTGGCGACGGTTCTCTCTT ATGT ATTCTTCGATCTATTGGCTTTAAACCGGAAAATTACACGAACTCTAACGTTAACAGGCTC CC GACCATGATTACGGATAGATACACGTTGGCTTCTAGATCCACAACATCCCGTTTACTTCA AA GTTATCTCAATAATTTTCACCCCTACTGCCCTATCGTGCACTCACCGACGCTAATGATGT TG TATAATAACCAGATTGAAATCGCGTCGAAGGATCAATGGCAAATCCTTTTTAACTGCATA TT AGCCATTGGAGCCTGGTGTATAGAGGGGGAATCTACTGATATAGATGTTTTTTACTATCA AA ATGCTAAATCTCATTTGACGAGCAAGGTCTTCGAGTCAGGTTCCATAATTTTGGTGACAG C CCTACATCTTCTGTCGCGATATACACAGTGGAGGCAGAAAACAAATACTAGCTATAATTT TC ACAGCTTTTCCATAAGAATGGCCATATCATTGGGCTTGAATAGGGACCTCCCCTCGTCCT T CAGTGATAGCAGCATTCTGGAACAAAGACGCCGAATTTGGTGGTCTGTCTACTCTTGGGA G ATCCAATTGTCCCTGCTTTATGGTCGATCCATCCAGCTTTCTCAGAATACAATCTCCTTC CC TTCTTCTGTCGACGATGTGCAGCGTACCACAACAGGTCCCACCATATATCATGGCATCAT T GAAACAGCAAGGCTCTTACAAGTTTTCACAAAAATCTATGAACTAGACAAAACAGTAACT GC

AGAAAAAAGTCCTATATGTGCAAAAAAATGCTTGATGATTTGTAATGAGATTGAGGA GGTTT CGAGACAGGCACCAAAGTTTTTACAAATGGATATTTCCACCACCGCTCTAACCAATTTGT TG AAGGAACACCCTTGGCTATCCTTTACAAGATTCGAACTGAAGTGGAAACAGTTGTCTCTT AT CATTTATGTATTAAGAGATTTTTTCACTAATTTTACCCAGAAAAAGTCACAACTAGAACA GGA TCAAAATGATCATCAAAGTTATGAAGTTAAACGATGCTCCATCATGTTAAGCGATGCAGC AC AAAGAACTGTTATGTCTGTAAGTAGCTATATGGACAATCATAATGTCACCCCATATTTTG CC TGGAATTGTTCTTATTACTTGTTCAATGCAGTCCTAGTACCCATAAAGACTCTACTCTCA AAC TCAAAATCGAATGCTGAGAATAACGAGACCGCACAATTATTACAACAAATTAACACTGTT CT GATGCTATTAAAAAAACTGGCCACTTTTAAAATCCAGACTTGTGAAAAATACATTCAAGT ACT GGAAGAGGTATGTGCGCCGTTTCTGTTATCACAGTGTGCAATCCCATTACCGCATATCAG T TATAACAATAGTAATGGTAGCGCCATTAAAAATATTGTCGGTTCTGCAACTATCGCCCAA TA CCCTACTCTTCCGGAGGAAAATGTCAACAATATCAGTGTTAAATATGTTTCTCCTGGCTC AG

TAGGGCCTTCACCTGTGCCATTGAAATCAGGAGCAAGTTTCAGTGATCTAGTCAAGC TGTT ATCTAACCGTCCACCCTCTCGTAACTCTCCAGTGACAATACCAAGAAGCACACCTTCGCA T CGCTCAGTCACGCCTTTTCTAGGGCAACAGCAACAGCTGCAATCATTAGTGCCACTGACC C CGTCTGCTTTGTTTGGTGGCGCCAATTTTAATCAAAGTGGGAATATTGCTGATAGCTCAT TG TCCTTCACTTTCACTAACAGTAGCAACGGTCCGAACCTCATAACAACTCAAACAAATTCT CA AGCGCTTTCACAACCAATTGCCTCCTCTAACGTTCATGATAACTTCATGAATAATGAAAT CA CGGCTAGTAAAATTGATGATGGTAATAATTCAAAACCACTGTCACCTGGTTGGACGGACC A AACTGCGTATAACGCGTTTGGAATCACTACAGGGATGTTTAATACCACTACAATGGATGA T GTATATAACTATCTATTCGATGATGAAGATACCCCACCAAACCCAAAAAAAGAG

SEQ ID NO: 42

Indicator Domain GAL4_Protein

MKLLSSIEQACDICRLKKLKCSKEKPKCAKCLKNNWECRYSPKTKRSPLTRAHLTEV ESRLERL EQLFLLIFPREDLDMILKMDSLQDIKALLTGLFVQDNVNKDAVTDRLASVETDMPLTLRQ HRISA TSSSEESSNKGQRQLTVSIDSAAHHDNSTIPLDFMPRDALHGFDWSEEDDMSDGLPFLKT DPN NNGFFGDGSLLCILRSIGFKPENYTNSNVNRLPTMITDRYTLASRSTTSRLLQSYLNNFH PYCPI VHSPTLMM LYN NQI EIASKDQWQI LFNCI LAIGAWCI EGESTDI DVFYYQNAKSH LTSKVFESGSI ILVTALHLLSRYTQWRQKTNTSYNFHSFSIRMAISLGLNRDLPSSFSDSSILEQRRRIWW SVYS WEIQLSLLYGRSIQLSQNTISFPSSVDDVQRTTTGPTIYHGIIETARLLQVFTKIYELDK TVTAEKS PICAKKCLMICNEIEEVSRQAPKFLQMDISTTALTNLLKEHPWLSFTRFELKWKQLSLII YVLRDF FTNFTQKKSQLEQDQNDHQSYEVKRCSIMLSDAAQRTVMSVSSYMDNHNVTPYFAWNCSY YL FNAVLVPIKTLLSNSKSNAENNETAQLLQQINTVLMLLKKLATFKIQTCEKYIQVLEEVC APFLLS QCAIPLPHISYNNSNGSAIKNIVGSATIAQYPTLPEENVNNISVKYVSPGSVGPSPVPLK SGASFS DLVKLLSNRPPSRNSPVTIPRSTPSHRSVTPFLGQQQQLQSLVPLTPSALFGGANFNQSG NIAD

SSLSFTFTNSSNGPNLITTQTNSQALSQPIASSNVHDNFMNNEITASKIDDGNNSKP LSPGWTD QTAYNAFGITTGMFNTTTMDDVYNYLFDDEDTPPNPKKE

SEQ ID NO: 43

Indicator Domain mTurquoise2_DNA

ATGGTTTCTAAAGGTGAAGAATTATTCACTGGTGTTGTCCCAATTTTGGTTGAATTA GATGG

TGATGTTAATGGTCACAAATTTTCTGTCTCCGGTGAAGGTGAAGGTGATGCTACTTA CGGT

AAATTGACCTTAAAATTTATTTGTACTACTGGTAAATTGCCAGTTCCATGGCCAACC TTAGTC

ACTACTTTATCTTGGGGTGTTCAATGTTTTGCAAGATACCCAGATCATATGAAACAA CATGA

CTTTTTCAAGTCTGCCATGCCAGAAGGTTATGTTCAAGAAAGAACTATTTTTTTCAA AGATG

ACGGTAACTACAAGACCAGAGCTGAAGTCAAGTTTGAAGGTGATACCTTAGTTAATA GAAT

CGAATTAAAAGGTATTGATTTTAAAGAAGATGGTAACATTTTAGGTCACAAATTGGA ATACA

ATTATTTCTCTGACAATGTTTACATCACTGCTGACAAACAAAAGAATGGTATCAAAG CTAACT

TCAAAATTAGACACAACATTGAAGATGGTGGTGTTCAATTAGCTGACCATTATCAAC AAAAT

ACTCCAATTGGTGATGGTCCAGTCTTGTTACCAGACAACCATTACTTATCCACTCAA TCTAA

GTTATCCAAAGATCCAAACGAAAAGAGGGACCACATGGTCTTGTTAGAATTTGTTAC TGCT

GCTGGTATTACCTTGGGTATGGATGAATTGTACAAAGGATCC

SEQ ID NO: 44

Indicator Domain mTurquoise2_Protein

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTT

LSWGVQCFARYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV NRIEL

KGIDFKEDGNILGHKLEYNYFSDNVYITADKQKNGIKANFKIRHNIEDGGVQLADHY QQNTPIGD

GPVLLPDNHYLSTQSKLSKDPNEKRDHMVLLEFVTAAGITLGMDELYKGS

SEQ ID NO: 45

Membrane-associated Tev_protease_DNA

GGAGAAAGCTTGTTTAAGGGGCCGCGTGATTACAACCCGATATCGAGCACCATTGTT CATT

TGACGAATGAATCTGATGGGCACACAACATCGTTGTATGGTATTGGATTTGGTCCCT TCAT

CATTACAAACAAGCACTTGTTTAGAAGAAATAATGGAACACTGGTGGTCCAATCACT ACATG

GTGTATTCAAGGTCAAGAACACCACGACTTTGCAACAACACCTCATTGATGGGAGGG ACAT

GATAATTATTCGCATGCCTAAGGATTTCCCACCATTTCCTCAAAAGCTGAAATTTAG AGAGC

CACAAAGGGAAGAGCGCATAGTCCTTGTGACAACCAACTTCCAAACTAAGAGCATGT CTAG

CATGGTGTCAGACACTAGTTCGACATTCCCTTCAGGAGATGGCATATTCTGGAAGCA TTGG

ATTCAAACCAAGGATGGGCAGTGTGGCAGTCCATTAGTATCAACTAGAGATGGGTTC ATTG

TTGGTATACACTCAGCATCGAATTTCACCAACACAAACAATTATTTCACAAGCGTGC CGAAA

AACTTCATGGAATTGTTGACAAATCAGGAGGCGCAGCAGTGGGTTAGTGGTTGGCGA TTAA

ATGCTGACTCAGTATTGTGGGGAGGCCATAAAGTTTTCATGGACAAACCTGAAGAGC CTTT

TCAGCCAGTTAAGGAAGCGACTCAACTCATGAAT

SEQ ID NO: 46

Membrane-associated Tev_protease_Protein

GESLFKGPRDYNPISSTIVHLTNESDGHTTSLYGIGFGPFIITNKHLFRRNNGTLVV QSLHGVFK

VKNTTTLQQHLIDGRDMIIIRMPKDFPPFPQKLKFREPQREERIVLVTTNFQTKSMS SMVSDTSS

TFPSGDGIFWKHWIQTKDGQCGSPLVSTRDGFIVGIHSASNFTNTNNYFTSVPKNFM ELLTNQE

AQQWVSGWRLNADSVLWGGHKVFMDKPEEPFQPVKEATQLMN

SEQ ID NO: 47 start_sTEV_F Primer

AGGTCTCATATGGGAGAAAGCTTGTTTAAGGGGCC

SEQ ID NO: 48 start_his_sTEV_F2 Primer

AGGTCTCATATGCATCATCATCATCATCATCATGGAGAAAGC

SEQ ID NO: 49 sfGFP_gsL1_R Primer

AGGTCTCACTACCGCTGCCTTTATACAGTTCATCCATACCGTGGG

SEQ ID NO: 50 sTEV_gsL1_R Primer

AGGTCTCACTACCGCTAGTGGAGCCGCGACGGCGACGACGATTC

SEQ ID NO: 51 gsL1_Gap1C_F Primer

AGGTCTCGGTAGTAGGAATTGGAAGCTTTTCATCCCAG

SEQ ID NO: 52

Gap1C_end_R Primer

AGGTCTCTGGATTTAACACCAGAAATTCCAGATTCTATACCATC

SEQ ID NO: 53 start_sfGFP_F Primer

AGGTCTCATATGAGCAAAGGTGAAGAACTGTTTACCG

SEQ ID NO: 54 gsL1_sfGFP_F Primer

AGGTCTCGGTAGTATGAGCAAAGGTGAAGAACTGTTTACCG

SEQ ID NO: 55 sfGFP_end_R Primer

AGGTCTCTGGATTTATTTATACAGTTCATCCATACCGTGGG

SEQ ID NO: 56 sfGFP_gsL2_R Primer

AGGTCTCACTTCCTTTATACAGTTCATCCATACCGTGGG

SEQ ID NO: 57 start_ER_F Primer

AGGTCTCATATGAGGCAGGTTTGGTTCTCTTGG

SEQ ID NO: 58

ER_GSL1_R Primer

AGGTCTCACTACCAGCTTCAGCCTCTCTTTTCTCGAG

SEQ ID NO: 59 gsL2_linker_F Primer

AGGTCTCGGAAGTTCCACTTCCGGCGGGAGC

SEQ ID NO: 60 linker_gsL3_R Primer

AGGTCTCACTGCCGCCGGTAGAGCCAGTACTACTTC

SEQ ID NO: 61 gsL3_Turq_F Primer

AGGTCTCGGCAGTATGGTTTCTAAAGGTGAAGAATTATTCACTGG

SEQ ID NO: 62

Turq_end_R Primer

AGGTCTCTGGATTTAGGATCCTTTGTACAATTCATCCATACCC

SEQ ID NO: 63

URA3_5_Prime_F

GGGCGGATTACTACCGTT

SEQ ID NO: 64

URA3_5_Prime_R

GTAATGTTATCCATGTGGGC

SEQ ID NO: 65

URA3_3_Prime_F

AGAGCACTTGAATCCACTGC

SEQ ID NO: 66

URA3_3_Prime_R

GATTTGGTTAGATTAGATATGGTTTC

SEQ ID NO: 67 pGAL1_F Primer

CAGTAACCTGGCCCCACAAACC

SEQ ID NO: 68 pPOP6_F Primer

CCTCGGGGTGACGTTTACTATTGG

SEQ ID NO: 69 pRPL18B_F Primer

CCAAACACGTTACCCGACCTCG

SEQ ID NO: 70 tADH1_R Primer

GCTATACCTGAGAAAGCAACCTGACC

SEQ ID NO: 71 tENO2_R Primer

GTGCATTATGCAATAGACAGCACGAGTC

SEQ ID NO: 72 gsL3_Gal4_F Primer

AGGTCTCGGCAGTATGAAGCTACTGTCTTCTATCGAACAAGC

SEQ ID NO: 73

Gal4_end_R Primer

AGGTCTCTGGATTTACTCTTTTTTTGGGTTTGGTGGGGTATC

SEQ ID NO: 74 gsL1_BLG_F Primer

AGGTCTCGGTAGTCTTATAGTGACTCAAACCATGAAGGGACTG

SEQ ID NO: 75

BLG_gsL2_R Primer

AGGTCTCACTTCCGATGTGACACTGCTCCTCAAGCTG

SEQ ID NO: 76

Gap1p_DNA

ATGAGTAATACTTCTTCGTACGAGAAGAATAATCCAGATAATCTGAAACACAATGGT ATTAC

CATAGATTCTGAGTTTCTAACTCAGGAGCCAATAACCATTCCCTCAAATGGCTCCGC TGTTT

CTATTGACGAAACAGGTTCAGGGTCCAAATGGCAAGACTTTAAAGATTCTTTCAAAA GGGT

AAAACCTATTGAAGTTGATCCTAATCTTTCAGAAGCTGAAAAAGTGGCTATCATCAC TGCCC

AAACTCCATTGAAGCACCACTTGAAGAATAGACATTTGCAAATGATTGCCATCGGTG GTGC

CATCGGTACTGGTCTGCTGGTTGGGTCAGGTACTGCACTAAGAACAGGTGGTCCCGC TTC

GCTACTGATTGGATGGGGGTCTACAGGTACCATGATTTACGCTATGGTTATGGCTCT GGGT

GAGTTGGCTGTTATCTTCCCTATTTCGGGTGGGTTCACCACGTACGCTACCAGATTT ATTG

ATGAGTCCTTTGGTTACGCTAATAATTTCAATTATATGTTACAATGGTTGGTTGTGC TACCAT

TGGAAATTGTCTCTGCATCTATTACTGTAAATTTCTGGGGTACAGATCCAAAGTATA GAGAT

GGGTTTGTTGCGTTGTTTTGGCTTGCAATTGTTATCATCAATATGTTTGGTGTCAAA GGTTA

TGGTGAAGCAGAATTCGTCTTTTCATTTATCAAGGTCATCACTGTTGTTGGGTTCAT CATCT

TAGGTATCATTCTAAACTGTGGTGGTGGTCCAACAGGTGGTTACATTGGGGGCAAGT ACTG

GCATGATCCTGGTGCCTTTGCTGGTGACACTCCAGGTGCTAAATTCAAAGGTGTTTG TTCT

GTCTTCGTCACCGCTGCCTTTTCTTTTGCCGGTTCAGAATTGGTTGGTCTTGCTGCC AGTG

AATCCGTAGAGCCTAGAAAGTCCGTTCCTAAGGCTGCTAAACAAGTTTTCTGGAGAA TCAC

CCTATTTTATATTCTGTCGCTATTAATGATTGGTCTTTTAGTCCCATACAACGATAA AAGTTT

GATTGGTGCCTCCTCTGTGGATGCTGCTGCTTCACCCTTCGTCATTGCCATTAAGAC TCAC

GGTATCAAGGGTTTGCCAAGTGTTGTCAACGTCGTTATCTTGATTGCCGTGTTATCT GTCG

GTAACTCTGCCATTTATGCATGTTCCAGAACAATGGTTGCCCTAGCTGAACAGAGAT TTCTG

CCAGAAATCTTTTCCTACGTTGACCGTAAGGGTAGACCATTGGTGGGAATTGCTGTC ACAT

CTGCATTCGGTCTTATTGCGTTTGTTGCCGCCTCCAAAAAGGAAGGTGAAGTTTTCA ACTG

GTTACTAGCCTTGTCTGGGTTGTCATCTCTATTCACATGGGGTGGTATCTGTATTTG TCACA

TTCGTTTCAGAAAGGCATTGGCCGCCCAAGGAAGAGGCTTGGATGAATTGTCTTTCA AGTC

TCCTACCGGTGTTTGGGGTTCCTACTGGGGGTTATTTATGGTTATTATTATGTTCAT TGCCC

AATTCTACGTTGCTGTATTCCCCGTGGGAGATTCTCCAAGTGCGGAAGGTTTCTTCG AAGC

TTATCTATCCTTCCCACTTGTTATGGTTATGTACATCGGACACAAGATCTATAAGAG GAATT

GGAAGCTTTTCATCCCAGCAGAAAAGATGGACATTGATACGGGTAGAAGAGAAGTCG ATTT

AGATTTGTTGAAACAAGAAATTGCAGAAGAAAAGGCAATTATGGCCACAAAGCCAAG ATGG

TATAGAATCTGGAATTTCTGGTGTTAA

SEQ ID NO: 77

Gap1p_Protein

MSNTSSYEKNNPDNLKHNGITIDSEFLTQEPITIPSNGSAVSIDETGSGSKWQDFKD SFKRVKPI

EVDPNLSEAEKVAIITAQTPLKHHLKNRHLQMIAIGGAIGTGLLVGSGTALRTGGPA SLLIGWGS

TGTMIYAMVMALGELAVIFPISGGFTTYATRFIDESFGYANNFNYMLQWLVVLPLEI VSASITVNF

WGTDPKYRDGFVALFWLAIVIINMFGVKGYGEAEFVFSFIKVITVVGFIILGIILNCG GGPTGGYIG

GKYWHDPGAFAGDTPGAKFKGVCSVFVTAAFSFAGSELVGLAASESVEPRKSVPKAA KQVFW

RITLFYILSLLMIGLLVPYNDKSLIGASSVDAAASPFVIAIKTHGIKGLPSVVNVVI LIAVLSVGNSAI

YACSRTMVALAEQRFLPEIFSYVDRKGRPLVGIAVTSAFGLIAFVAASKKEGEVFNW LLALSGLS

SLFTWGGICICHIRFRKALAAQGRGLDELSFKSPTGVWGSYWGLFMVIIMFIAQFYVAVF PVGD

SPSAEGFFEAYLSFPLVMVMYIGHKIYKRNWKLFIPAEKMDIDTGRREVDLDLLKQEIAE EKAIM

ATKPRWYRIWNFWC*

SEQ ID NO: 78

Gap1p_Plasma_Membrane_Association_Domain_DNA

AGGAATTGGAAGCTTTTCATCCCAGCAGAAAAGATGGACATTGATACGGGTAGAAGA GAAG

TCGATTTAGATTTGTTGAAACAAGAAATTGCAGAAGAAAAGGCAATTATGGCCACAA AGCCA

AGATGGTATAGAATCTGGAATTTCTGGTGT

SEQ ID NO: 79

Gap1p_Plasma_Membrane_Association_Domain_Protein

RNWKLFIPAEKMDIDTGRREVDLDLLKQEIAEEKAIMATKPRWYRIWNFWC
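
The pairing between a DNA entry and its protein entry in this listing can be checked by translating the DNA in frame. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first base; the names translate, CODON_TABLE, SEQ_78_DNA and SEQ_79_PROTEIN are hypothetical helpers introduced for illustration and are not part of the application. It compares SEQ ID NO: 78 with SEQ ID NO: 79 as printed above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (third base varies fastest). Assumption: no alternative codon table is used.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1. Whitespace is ignored; '*' marks a stop codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO: 78, copied as printed (line-wrap spaces left in place).
SEQ_78_DNA = """
AGGAATTGGAAGCTTTTCATCCCAGCAGAAAAGATGGACATTGATACGGGTAGAAGA GAAG
TCGATTTAGATTTGTTGAAACAAGAAATTGCAGAAGAAAAGGCAATTATGGCCACAA AGCCA
AGATGGTATAGAATCTGGAATTTCTGGTGT
"""

# SEQ ID NO: 79, copied as printed.
SEQ_79_PROTEIN = "RNWKLFIPAEKMDIDTGRREVDLDLLKQEIAEEKAIMATKPRWYRIWNFWC"

# Report whether the translated DNA agrees with the listed protein.
print(translate(SEQ_78_DNA) == SEQ_79_PROTEIN)
```

The same helper could be applied to the other DNA and protein pairs in the listing, for example SEQ ID NO: 76 and SEQ ID NO: 77, provided the line-wrap spaces are kept out of the comparison as done here.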

SEQ ID NO: 80 pYTK002_ConLS

TCGGTCTCACCCTGAATTCGCATCTAGATGGTAGAGCCACAAACAGCCGGTACAAGC AAC

GATCTCCAGGACCATCTGAATCATGCGCGGATGACACGAACTCACGACGGCGATCAC AGA

CATTAACCCACAGTACAGACACTGCGACAACGTGGCAATTCGTCGCAATACCGTCTC ACTG

AACTGGCCGATAATTGCAGACGAACGTGAGACCAGACCAATAAAAAACGCCCGGCGG CAA

CCGAGCGTTCTGAACAAATCCAGATGGAGTTCTGAGGTCATTACTGGATCTATCAAC AGGA

GTCCAAGCGAGCTCGATATCAAATTACGCCCCGCCCTGCCACTCATCGCAGTACTGT TGTA

ATTCATTAAGCATTCTGCCGACATGGAAGCCATCACAAACGGCATGATGAACCTGAA TCGC

CAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCCCATGGTGAAAACGGG GGC

GAAGAAGTTGTCCATATTGGCCACGTTTAAATCAAAACTGGTGAAACTCACCCAGGG ATTG

GCTGAAACGAAAAACATATTCTCAATAAACCCTTTAGGGAAATAGGCCAGGTTTTCA CCGTA

ACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAATCGTCGTGGTATTC ACTC

CAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAAAACGGTGTAACAAGGGTGAACA CTAT

CCCATATCACCAGCTCACCGTCTTTCATTGCCATACGAAATTCCGGATGAGCATTCA TCAG

GCGGGCAAGAATGTGAATAAAGGCCGGATAAAACTTGTGCTTATTTTTCTTTACGGT CTTTA

AAAAGGCCGTAATATCCAGCTGAACGGTCTGGTTATAGGTACATTGAGCAACTGACT GAAA

TGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTGGTATATCCAGT GATTT

TTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGATAACTCAAAAAATACGC CCGGT

AGTGATCTTATTTCATTATGGTGAAAGTTGGAACCTCTTACGTGCCCGATCAATCAT GACCA

AAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCA AAGG

ATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACC ACCGC

TACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAA CTGG

CTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCGTAGTTAGGCCA CCAC

TTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTG GCTG

CTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGG ATA

AGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAA C

GACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCC CGA

AGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCAC G

AGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCAC CTC

TGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAAC GCC

AGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTC TTTCC

TGCGTTATCCCCTGATTCTGTGGATAACCGTAG

SEQ ID NO: 81 pYTK017_pRPL18B

TCGGTCTCAAACGAAGAGGATGTCCAATATTTTTTTTAAGGAATAAGGATACTTCAA GACTA

GATTCCCCCCTGCATTCCCATCAGAACCGTAAACCTTGGCGCTTTCCTTGGGAAGTA TTCA

AGAAGTGCCTTGTCCGGTTTCTGTGGCTCACAAACCAGCGCGCCCGATATGGCTTTCTTT T

CACTTATGAATGTACCAGTACGGGACAATTAGAACGCTCCTGTAACAATCTCTTTGC AAATG

TGGGGTTACATTCTAACCATGTCACACTGCTGACGAAATTCAAAGTAAAAAAAAATG GGAC

CACGTCTTGAGAACGATAGATTTTCTTTATTTTACATTGAACAGTCGTTGTCTCAGC GCGCT

TTATGTTTTCATTCATACTTCATATTATAAAATAACAAAAGAAGAATTTCATATTCA CGCCCAA

GAAATCAGGCTGCTTTCCAAATGCAATTGACACTTCATTAGCCATCACACAAAACTC TTTCT

TGCTGGAGCTTCTTTTAAAAAAGACCTCAGTACACCAAACACGTTACCCGACCTCGT TATTT

TACGACAACTATGATAAAATTCTGAAGAAAAAATAAAAAAATTTTCATACTTCTTGC TTTTATT

TAAACCATTGAATGATTTCTTTTGAACAAAACTACCTGTTTCACCAAAGGAAATAGA AAGAAA

AAATCAATTAGAAGAAAACAAAAAACAAAAGATCTATGTGAGACCAGACCAATAAAA AACGC

CCGGCGGCAACCGAGCGTTCTGAACAAATCCAGATGGAGTTCTGAGGTCATTACTGG ATC

TATCAACAGGAGTCCAAGCGAGCTCGATATCAAATTACGCCCCGCCCTGCCACTCAT CGCA

GTACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAAGCCATCACAAACGGCATG ATGA

ACCTGAATCGCCAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCCCATG GTGA

AAACGGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAATCAAAACTGGTGAAAC TCAC

CCAGGGATTGGCTGAAACGAAAAACATATTCTCAATAAACCCTTTAGGGAAATAGGC CAGG

TTTTCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAATCG TCGT

GGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAAAACGGTGTAAC AAGG

GTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCCATACGAAATTCCGG ATGA

GCATTCATCAGGCGGGCAAGAATGTGAATAAAGGCCGGATAAAACTTGTGCTTATTT TTCTT

TACGGTCTTTAAAAAGGCCGTAATATCCAGCTGAACGGTCTGGTTATAGGTACATTG AGCA

ACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACGGTG GTATA

TCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGATAACTC AAAAAA

TACGCCCGGTAGTGATCTTATTTCATTATGGTGAAAGTTGGAACCTCTTACGTGCCC GATC

AATCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGT AGAA

AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAA ACAAA

AAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTT TCCG

AAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTAGCCG TAGT

TAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCC TGTT

ACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACG ATA

GTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAG CT

TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCG CCA

CGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAG GA

GAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGG TTT

CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTA TGG

AAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCT CACA

TGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTAG

SEQ ID NO: 82 pYTK024_pPOP6

TCGGTCTCAAACGTTCGTGCTTTGTGATAAAGTGTTTCACGTCATCCGACATGACTT CGTAG

TTATGGACTGAACTGTGTGGTGAGGTTCCATGATTTCTTAGGTCCAGCAGATACATG TCTCT

TCCCAATTTCTTGTTAAGGTTACGGCCAATGCTTCGGTTGTTGAGCTTGTTACCGAA TAAGC

CGTGAAGTATGATAATAGGTGGTCTTGGCTTCCCTTCATCCCCAGTTTTTACTGCAT CTCTC

TTGATTATGTCATATGAAAGGTCCAGTGGGACTTGCTTTTGTTGCAGCACCTTTGCT AATGA

ATGAAAGGCACATAGTGACTGCTTAAAAATGCAGGAACTTAAATTATTCCGAATGGT ATTTT

GTCTCACATATATTGTCCCATACTGTGCCAAGATCCCGGCTTTACCCAGTATCATCA TTGTA

CCGTTACCAATTCTCCTCGTATATCACGGTTAGTTTTTAAACCTCGGGGTGACGTTT ACTAT

TGGCGTACTAATATATTCTTATTTTCTTTTCTTTTTTGTTGGCAGTTTCAAGCAACA CATGTA

CTGGATAACCAACCCCCGCACGCTCTTGGAAAAAATTGAGAAGGCATCGGACACTTG CTG

ATGAGTATTTCGAAAAATTCCATGAAGATGAGGCCAAGATTGTTTGGAAGAGATTGA AAAGA

AGAAGAAGAAAAAAAGATAAAAGCAAATCAAAAGATCTATGTGAGACCAGACCAATAAAA AA

CGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAGATGGAGTTCTGAGGTCATTAC TGG

ATCTATCAACAGGAGTCCAAGCGAGCTCGATATCAAATTACGCCCCGCCCTGCCACT CATC

GCAGTACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAAGCCATCACAAACGGC ATGA

TGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCGCCTTGCGTATAATATTTGCCC ATGG

TGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAATCAAAACTGGTGA AACT

CACCCAGGGATTGGCTGAAACGAAAAACATATTCTCAATAAACCCTTTAGGGAAATA GGCC

AGGTTTTCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAAACTGCCGGAAA TCGT

CGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGGAAAACGGTGT AACA

AGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCCATACGAAATTC CGGA

TGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAGGCCGGATAAAACTTGTGCTTA TTTT

TCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTGAACGGTCTGGTTATAGGTAC ATTGA

GCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGGATATATCAACG GTGG

TATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAAATCTCGATA ACTCAA

AAAATACGCCCGGTAGTGATCTTATTTCATTATGGTGAAAGTTGGAACCTCTTACGT GCCC

GATCAATCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACC CCGT

AGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTT GCAAA

CAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTC TTTT

TCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCTTCTAGTGTA GCCG

TAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTA ATCC

TGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAA GAC

GATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGC CC

AGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAA AGC

GCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGA AC

AGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGT CGG

GTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAG CCT

ATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTT TGCT

CACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTAG

SEQ ID NO: 83 pYTK030_pGAL1

TCGGTCTCAAACGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCA GTAAT

ACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGA CAG

CCCTCCGACGGATGACTCTCCTCCGTGCGTCCTCGTCATCACCGGTCGCGTTCCTGA AAC

GCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTT TTAT

GGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGA ATCA

AATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAA TTAATC

AGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAAC CACTT

TAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAA GTATCAA

CAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAAGAT CTATGT

GAGACCAGACCAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAGA TGG

AGTTCTGAGGTCATTACTGGATCTATCAACAGGAGTCCAAGCGAGCTCGATATCAAA TTAC

GCCCCGCCCTGCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGACA TGGA

AGCCATCACAAACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCGC CTT

GCGTATAATATTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCCA CGTT

TAAATCAAAACTGGTGAAACTCACCCAGGGATTGGCTGAAACGAAAAACATATTCTC AATAA

ACCCTTTAGGGAAATAGGCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATATA TGTG

TAGAAACTGCCGGAAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGT TTGC

TCATGGAAAACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCT TTCA

TTGCCATACGAAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAGG CCG

GATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCT GAACG

GTCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGC CA

TTGGGATATATCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTT AGCTCC

TGAAAATCTCGATAACTCAAAAAATACGCCCGGTAGTGATCTTATTTCATTATGGTG AAAGT

TGGAACCTCTTACGTGCCCGATCAATCATGACCAAAATCCCTTAACGTGAGTTTTCG TTCCA

CTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCT GCGC

GTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCG GATC

AAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAA ATAC

TGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCC TACA

TACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGT CTTA

CCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGG GG

GGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTA CAG

CGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCG GT

AAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTG G

TATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGA TGCTC

GTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCT GG

CCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGG ATAAC

CGTAG

SEQ ID NO: 84 pYTK032_mTurquoise2

TCGGTCTCATATGGTTTCTAAAGGTGAAGAATTATTCACTGGTGTTGTCCCAATTTT GGTTG

AATTAGATGGTGATGTTAATGGTCACAAATTTTCTGTCTCCGGTGAAGGTGAAGGTG ATGC

TACTTACGGTAAATTGACCTTAAAATTTATTTGTACTACTGGTAAATTGCCAGTTCC ATGGCC

AACCTTAGTCACTACTTTATCTTGGGGTGTTCAATGTTTTGCAAGATACCCAGATCA TATGA

AACAACATGACTTTTTCAAGTCTGCCATGCCAGAAGGTTATGTTCAAGAAAGAACTA TTTTT

TTCAAAGATGACGGTAACTACAAGACCAGAGCTGAAGTCAAGTTTGAAGGTGATACC TTAG

TTAATAGAATCGAATTAAAAGGTATTGATTTTAAAGAAGATGGTAACATTTTAGGTC ACAAAT

TGGAATACAATTATTTCTCTGACAATGTTTACATCACTGCTGACAAACAAAAGAATG GTATC

AAAGCTAACTTCAAAATTAGACACAACATTGAAGATGGTGGTGTTCAATTAGCTGAC CATTA

TCAACAAAATACTCCAATTGGTGATGGTCCAGTCTTGTTACCAGACAACCATTACTT ATCCA

CTCAATCTAAGTTATCCAAAGATCCAAACGAAAAGAGGGACCACATGGTCTTGTTAG AATTT

GTTACTGCTGCTGGTATTACCTTGGGTATGGATGAATTGTACAAAGGATCCTGAGAC CAGA

CCAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAGATGGAGTTCT GAG

GTCATTACTGGATCTATCAACAGGAGTCCAAGCGAGCTCGATATCAAATTACGCCCC GCCC

TGCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAAGCCA TCAC

AAACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCGCCTTGCGTAT AATA

TTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAATC AAAA

CTGGTGAAACTCACCCAGGGATTGGCTGAAACGAAAAACATATTCTCAATAAACCCT TTAG

GGAAATAGGCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAA ACTG

CCGGAAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATG GAAA

ACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCC ATAC

GAAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAGGCCGGATAAA ACTT

GTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTGAACGGTCTG GTTAT

AGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGG ATATA

TCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAA AATCTC

GATAACTCAAAAAATACGCCCGGTAGTGATCTTATTTCATTATGGTGAAAGTTGGAA CCTCT

TACGTGCCCGATCAATCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAG CGTC

AGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAAT CTGCT

GCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGC TACC

AACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCT TCTAG

TGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCG CTCT

GCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGA

CTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTG CA

CACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TAT

GAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCA GG

GTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTAT AGT

CCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGG GGG

CGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGC TGG

CCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTAG

SEQ ID NO: 85 pYTK033_Venus

TCGGTCTCATATGTCTAAAGGTGAAGAATTATTCACTGGTGTTGTCCCAATTTTGGT TGAAT

TAGATGGTGATGTTAATGGTCACAAATTTTCTGTCTCCGGTGAAGGTGAAGGTGATG CTAC

TTACGGTAAATTGACCTTAAAATTGATTTGTACTACTGGTAAATTGCCAGTTCCATG GCCAA

CCTTAGTCACTACTTTAGGTTATGGTTTGCAATGTTTTGCTAGATACCCAGATCATA TGAAA

CAACATGACTTTTTCAAGTCTGCCATGCCAGAAGGTTATGTTCAAGAAAGAACTATT TTTTT

CAAAGATGACGGTAACTACAAGACCAGAGCTGAAGTCAAGTTTGAAGGTGATACCTT AGTT

AATAGAATCGAATTAAAAGGTATTGATTTTAAAGAAGGTGGTAACATTTTAGGTCAC AAATT

GGAATACAACTATAACTCTCACAATGTTTACATCACTGCTGACAAACAAAAGAATGG TATCA

AAGCTAACTTCAAAATTAGACACAACATTGAAGATGGTGGTGTTCAATTAGCTGACC ATTAT

CAACAAAATACTCCAATTGGTGATGGTCCAGTCTTGTTACCAGACAACCATTACTTA TCCTA

TCAATCTGCCTTATCCAAAGATCCAAACGAAAAGAGAGATCACATGGTCTTGTTAGA ATTTG

TTACTGCTGCTGGTATTACCCATGGTATGGATGAATTGTACAAAGGATCCTGAGACC AGAC

CAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAGATGGAGTTCTG AGG

TCATTACTGGATCTATCAACAGGAGTCCAAGCGAGCTCGATATCAAATTACGCCCCG CCCT

GCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGACATGGAAGCCAT CACA

AACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCGCCTTGCGTATA ATAT

TTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCCACGTTTAAATCA AAAC

TGGTGAAACTCACCCAGGGATTGGCTGAAACGAAAAACATATTCTCAATAAACCCTT TAGG

GAAATAGGCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATATATGTGTAGAAA CTGC

CGGAAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAGTTTGCTCATGG AAAA

CGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTCTTTCATTGCCA TACG

AAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAGGCCGGATAAAA CTTG

TGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTGAACGGTCTGG TTATA

GGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACGATGCCATTGGGA TATAT

CAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCTTAGCTCCTGAAA ATCTC

GATAACTCAAAAAATACGCCCGGTAGTGATCTTATTTCATTATGGTGAAAGTTGGAA CCTCT

TACGTGCCCGATCAATCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAG CGTC

AGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAAT CTGCT

GCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGC TACC

AACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTTCT TCTAG

TGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCG CTCT

GCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTT GGA

CTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTG CA

CACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TAT

GAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCA GG

GTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTAT AGT

CCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGG GGG

CGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGC TGG

CCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTAG

SEQ ID NO: 86 pYTK053_tADH1

TCGGTCTCAATCCTAACTCGAGGCGAATTTCTTATGATTTATGATTTTTATTATTAA ATAAGT

TATAAAAAAAATAAGTGTATACAAATTTTAAAGTGACTCTTAGGTTTTAAAACGAAA ATTCTTA

TTCTTGAGTAACTCTTTCCTGTAGGTCAGGTTGCTTTCTCAGGTATAGCATGAGGTC GCTCT

TATTGACCACACCTCTACCGGCATGCCGAGCAAATGCCTGCAAATCGCTCCCCATTT CGCT

GTGAGACCAGACCAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCA GAT

GGAGTTCTGAGGTCATTACTGGATCTATCAACAGGAGTCCAAGCGAGCTCGATATCA AATT

ACGCCCCGCCCTGCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGA CATG

GAAGCCATCACAAACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTC GCC

TTGCGTATAATATTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGC CACG

TTTAAATCAAAACTGGTGAAACTCACCCAGGGATTGGCTGAAACGAAAAACATATTC TCAAT

AAACCCTTTAGGGAAATAGGCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATA TATG

TGTAGAAACTGCCGGAAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCA GTTT

GCTCATGGAAAACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGT CTTT

CATTGCCATACGAAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAA GGCC

GGATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGC TGAAC

GGTCTGGTTATAGGTACATTGAGCAACTGACTGAAATGCCTCAAAATGTTCTTTACG ATGCC

ATTGGGATATATCAACGGTGGTATATCCAGTGATTTTTTTCTCCATTTTAGCTTCCT TAGCTC

CTGAAAATCTCGATAACTCAAAAAATACGCCCGGTAGTGATCTTATTTCATTATGGT GAAAG

TTGGAACCTCTTACGTGCCCGATCAATCATGACCAAAATCCCTTAACGTGAGTTTTC GTTCC

ACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTC TGCG

CGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCC GGAT

CAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCA AATA

CTGTTCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGC CTAC

ATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTG TCTT

ACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACG GG

GGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCT ACA

GCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCC GG

TAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCT G

GTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTG ATGCT

CGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCC TG

GCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTG GATAA

CCGTAG

SEQ ID NO: 87 pYTK055_tENO2

TCGGTCTCAATCCTAACTCGAGAGTGCTTTTAACTAAGAATTATTAGTCTTTTCTGC TTATTT

TTTCATCATAGTTTAGAACACTTTATATTAACGAATAGTTTATGAATCTATTTAGGT TTAAAAA

TTGATACAGTTTTATAAGTTACTTTTTCAAAGACTCGTGCTGTCTATTGCATAATGC ACTGGA

AGGGGAAAAAAAAGGTGCACACGCGTGGCTTTTTCTTGAATTTGCAGTTTGAAAAAT GCTG

TGAGACCAGACCAATAAAAAACGCCCGGCGGCAACCGAGCGTTCTGAACAAATCCAG ATG

GAGTTCTGAGGTCATTACTGGATCTATCAACAGGAGTCCAAGCGAGCTCGATATCAA ATTA

CGCCCCGCCCTGCCACTCATCGCAGTACTGTTGTAATTCATTAAGCATTCTGCCGAC ATGG

AAGCCATCACAAACGGCATGATGAACCTGAATCGCCAGCGGCATCAGCACCTTGTCG CCT

TGCGTATAATATTTGCCCATGGTGAAAACGGGGGCGAAGAAGTTGTCCATATTGGCC ACGT

TTAAATCAAAACTGGTGAAACTCACCCAGGGATTGGCTGAAACGAAAAACATATTCT CAATA

AACCCTTTAGGGAAATAGGCCAGGTTTTCACCGTAACACGCCACATCTTGCGAATAT ATGT

GTAGAAACTGCCGGAAATCGTCGTGGTATTCACTCCAGAGCGATGAAAACGTTTCAG TTTG

CTCATGGAAAACGGTGTAACAAGGGTGAACACTATCCCATATCACCAGCTCACCGTC TTTC

ATTGCCATACGAAATTCCGGATGAGCATTCATCAGGCGGGCAAGAATGTGAATAAAG GCC

GGATAAAACTTGTGCTTATTTTTCTTTACGGTCTTTAAAAAGGCCGTAATATCCAGCTGA AC