Title:

SAPONIN PRODUCTION IN YEAST

Document Type and Number:

WIPO Patent Application WO/2023/122801

Abstract:

The present invention relates inter alia to methods of biosynthetic production of QS-21, precursors and variants thereof, and to related aspects.

Inventors:

LIU YUZHONG (US)
CROWE SAMANTHA AIKO (US)
KEASLING JAY D (US)
CHEN XIAOYUE (US)
HUDSON GRAHAM ARTHUR (US)
GAN FEI (US)
SCHELLER HENRIK V (US)
REED JAMES (GB)
MARTIN LAETITIA (GB)

Application Number:

PCT/US2022/082381

Publication Date:

June 29, 2023

Filing Date:

December 23, 2022

Assignee:

UNIV CALIFORNIA (US)
PLANT BIOSCIENCE LTD (GB)

International Classes:

C12P33/00

Attorney, Agent or Firm:

OSMAN, Richard (US)

Claims:

1. A method of producing quillaic acid (QA) in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce β-amyrin, heterologous genes encoding the following enzymes:

(i) a cytochrome P450 C16 oxidase, wherein the C16 oxidase oxidizes the C16 carbon of β-amyrin to a hydroxyl group,

(ii) a cytochrome P450 C23 oxidase, wherein the C23 oxidase oxidizes the C23 carbon of β-amyrin to an aldehyde group,

(iii) a cytochrome P450 C28 oxidase, wherein the C28 oxidase oxidizes the C28 carbon of β-amyrin to a carboxyl group, and

(iv) a cytochrome P450 reductase (CPR), acting as a redox partner, wherein the C16 oxidase, the C23 oxidase, the C28 oxidase and the CPR are from a plant origin.

2. The method of claim 1, wherein the C16 oxidase is selected from QsC16 according to SEQ ID NO: 20, QsC28C16 according to SEQ ID NO: 23, and SvC16 according to SEQ ID NO: 26, the C23 oxidase is selected from MtC23 oxidase according to SEQ ID NO: 38, QsC23 according to SEQ ID NO: 29, SvC23-1 according to SEQ ID NO: 32, and SvC23-2 according to SEQ ID NO: 35, and the C28 oxidase is selected from MtC28 according to SEQ ID NO: 46, QsC28 according to SEQ ID NO: 41 and SvC28 according to SEQ ID NO: 44.

3. The method of claim 1, wherein the yeast further overexpresses a heterologous gene encoding (v) a cytochrome b5.

4. The method of claim 1, wherein the yeast further overexpresses a heterologous gene encoding (vi) a scaffold protein, wherein the scaffold protein physically interacts with one or more of the C16 oxidase, the C23 oxidase, the C28 oxidase and the CPR, wherein the scaffold protein is a membrane steroid-binding protein (MSBP) selected from AtMSBP1 according to SEQ ID NO: 63, AtMSBP2 according to SEQ ID NO: 65, QsMSBP1 according to SEQ ID NO: 73, SvMSBP1 according to SEQ ID NO: 67 and SvMSBP2 according to SEQ ID NO: 70.

5. The method of any of claims 1 to 4, wherein the yeast is engineered to produce β-amyrin and overexpresses a β-amyrin synthase (BAS) selected from AaBAS according to SEQ ID NO: 1, AtBAS according to SEQ ID NO: 4, GgBAS according to SEQ ID NO: 7, GvBAS according to SEQ ID NO: 10, QsBAS according to SEQ ID NO: 15, and SvBAS according to SEQ ID NO: 13.

6. The method of claim 4, wherein the C16 oxidase is QsC28C16, the C23 oxidase is QsC23, the C28 oxidase is QsC28, the CPR is AtATR1, the MSBP is SvMSBP1, the cytochrome b5 is Qsb5, and the BAS is GvBAS.

7. A method of producing a C3-glycosylated QA derivative in yeast, wherein the derivative is QA-C3-GlcA, and the method comprises the step of overexpressing, in a yeast engineered to produce QA and UDP-GlcA, a heterologous gene encoding the following enzyme:

(i) a UDP-GlcA transferase (GlcAT) transferring UDP-GlcA and attaching a GlcA residue at the C3 position of QA to form QA-C3-GlcA.

8. The method of claim 7, wherein the GlcAT is selected from QsCslG1 according to SEQ ID NO: 78, QsCslG2 according to SEQ ID NO: 81, and SvCslG according to SEQ ID NO: 76.

9. The method of claim 7, wherein the derivative is QA-C3-GlcA-Gal, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(ii) a UDP-Galactose transferase (GalT) transferring UDP-Gal and attaching a Gal residue to QA-C3-GlcA to form QA-C3-GlcA-Gal.

10. The method of claim 9, wherein the GalT is QsGalT according to SEQ ID NO: 116 or GalT is SvGalT according to SEQ ID NO: 98.

11. The method of claim 7, wherein the derivative is QA-C3-GlcA-Gal-Rha, the yeast is further engineered to produce UDP-Rha, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(iii) a UDP-Rhamnose transferase (RhaT) transferring UDP-Rha and attaching a Rha residue to QA-C3-GlcA-Gal to form QA-C3-GlcA-Gal-Rha.

12. The method of any one of claims 7-11, wherein the derivative is QA-C3-GlcA-Gal-Xyl, the yeast is further engineered to produce UDP-Xyl, and the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(iii) a UDP-Xylose transferase (XylT) transferring UDP-Xylose and attaching a Xyl residue to QA-C3-GlcA-Gal to form QA-C3-GlcA-Gal-Xyl.

13. The method of claim 12, wherein the XylT is selected from QsC3XylT according to SEQ ID NO: 122 and SvC3XylT according to SEQ ID NO: 100.

14. A method of producing UDP-Fucose (UDP-Fuc) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) a UDP-glucose-4,6-dehydratase (UG46DH) converting UDP-Glc into UDP-4-keto-6-deoxy-glucose and

(ii) a 4-keto-reductase converting UDP-4-keto-6-deoxy-glucose into UDP-D-Fuc.

15. The method of claim 14, wherein the UG46DH is SvUG46DH according to SEQ ID NO: 87 and the 4-keto-reductase is selected from SvNMD according to SEQ ID NO: 90 and QsFucSyn according to SEQ ID NO: 175.

16. A method of producing a C-28-glycosylated QA derivative in yeast, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc, or QA-C3-GlcA-Gal-Xyl-C28-Fuc, the method comprises the step of overexpressing, in a yeast engineered to produce QA-C3-GlcA-Gal-Rha, or QA-C3-GlcA-Gal-Xyl, and UDP-Fucose, a heterologous gene encoding the following enzyme:

(i) a UDP-Fucose transferase (FucT) transferring UDP-Fuc and attaching a Fuc residue at the C28 position of QA to form QA-C3-GlcA-Gal-Rha-C28-Fuc, or QA-C3-GlcA-Gal-Xyl-C28-Fuc.

17. The method of claim 16, wherein the FucT is selected from QsFucT according to SEQ ID NO: 93 and SvFucT according to SEQ ID NO: 96.

18. The method of claim 16, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha, or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha, the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(ii) a UDP-Rhamnose transferase (RhaT) transferring UDP-Rha and attaching a Rha residue to QA-C3-GlcA-Gal-Rha-C28-Fuc, or QA-C3-GlcA-Gal-Xyl-C28-Fuc, to form QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha.

19. The method of claim 18, wherein the RhaT is QsRhaT according to SEQ ID NO: 119.

20. The method of claim 18, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl, or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl, the overexpressing further comprises overexpressing heterologous genes encoding the following enzyme:

(iii) a UDP-Xylose transferase (XylT) transferring UDP-Xyl and attaching a Xyl residue to QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha to form GlcA-Gal-Rha-C28-Fuc-Rha-Xyl and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl, respectively.

21. The method of claim 20, wherein the XylT is QsC28XylT3 according to SEQ ID NO: 125.

22. The method of claim 20, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl, or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl, the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(iv) a UDP-Xylose transferase (XylT) transferring UDP-Xyl and attaching a Xyl residue to QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl to form QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl, respectively.

23. The method of claim 22, wherein the XylT is QsC28XylT4 according to SEQ ID NO: 128.

24. The method of claim 22, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api, the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(iv) a UDP-Apiose synthase (AXS) converting UDP-GlcA into UDP-Api and

(v) a UDP-Apiose transferase (ApiT) transferring UDP-Apiose and attaching an Apiose residue to QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl to form QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api, respectively.

25. The method of claim 24, wherein the AXS is QsAXS according to SEQ ID NO: 113 and the ApiT is QsC28ApiT4 according to SEQ ID NO: 151.

26. A method of producing (S)-2-methylbutyryl CoA (2MB-CoA) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding a carboxyl coenzyme A (CoA) ligase (CCL) converting 2MB acid into 2MB-CoA, and 2MB acid is supplemented exogenously.

27. The method of claim 26, wherein the CCL is QsCCL from Q. saponaria according to SEQ ID NO: 178.

28. The method of claim 26, wherein the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(i) a phosphopantetheinyl (Ppant) transferase,

(ii) a megasynthase LovF-TE including an ACP domain, condensing two units of malonyl-CoA to 2MB-ACP, cleaving 2MB acid from the ACP domain which is converted into 2MB-CoA by the CCL, and no 2MB acid is supplemented exogenously.

29. The method of claim 28, wherein the Ppant is AnNpgA according to SEQ ID NO: 237 and the megasynthase LovF-TE is AstLovF-TE according to SEQ ID NO: 235.

30. A method of producing UDP-Arabinofuranose (UDP-Araf) in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce UDP-Xyl, heterologous genes encoding the following enzymes:

(i) a UDP-Xyl epimerase (UXE) converting UDP-Xyl into UDP-Arabinopyranose (UDP-Arap), and

(ii) a UDP-Arabinose mutase (UAM) converting UDP-Arap into UDP-Arabinofuranose (UDP-Araf).

31. The method of claim 30, wherein the UXE is selected from AtUXE according to SEQ ID NO: 199, AtUXE2 according to SEQ ID NO: 202, HvUXE-1 according to SEQ ID NO: 240, HvUXE-2 according to SEQ ID NO: 242 and AtUGE3 according to SEQ ID NO: 205 and the UAM is selected from AtUAM1 according to SEQ ID NO: 208 and HvUAM according to SEQ ID NO: 211.

32. A method of producing UDP-Araf in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) an arabinokinase (AraK) and

(ii) a UDP-sugar pyrophosphorylase (USP), and arabinose is supplemented exogenously.

33. The method of claim 32, wherein the AraK is selected from AtAraK according to SEQ ID NO: 214 and LeiAraK according to SEQ ID NO: 217 and the USP is selected from AtUSP according to SEQ ID NO: 223 and LeiUSP according to SEQ ID NO: 226.

34. The method of claim 33, wherein the overexpressing further comprises overexpressing a heterologous gene encoding an arabinose transporter (AraT).

35. The method of claim 35, wherein AraT is encoded by the nucleotide sequence SEQ ID NO: 221.

36. A method of producing an acylated and glycosylated QA derivative in yeast, wherein the derivative is QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9 or QA-C3-GGX-C28-FRXA-C9, and the method comprises the step of overexpressing, in a yeast engineered to produce QA-C3-GGR-C28-FRX, QA-C3-GGX-C28-FRX, QA-C3-GGR-C28-FRXX, QA-C3-GGX-C28-FRXX, QA-C3-GGR-C28-FRXA, or QA-C3-GGX-C28-FRXA, heterologous genes encoding the following enzymes:

(i) a carboxyl coenzyme A ligase (CCL) converting 2MB acid into 2MB-CoA,

(ii) a chalcone-synthase-like type III PKS (Polyketide synthase) condensing malonyl-CoA with 2MB-CoA to form C9-Keto-CoA,

(iii) a keto-reductase (KR) converting C9-Keto-CoA into C9-CoA, and

(iv) an acyltransferase transferring and attaching a first C9-CoA unit to QA-C3-GGR-C28-FRX, QA-C3-GGX-C28-FRX, QA-C3-GGR-C28-FRXX, QA-C3-GGX-C28-FRXX, QA-C3-GGR-C28-FRXA, or QA-C3-GGX-C28-FRXA to form QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9 or QA-C3-GGX-C28-FRXA-C9, wherein 2MB acid is supplemented exogenously.
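The shorthand used from claim 36 onward (GGR, GGX, FRX, FRXX, FRXA) appears to abbreviate the sugar chains spelled out in claims 7 to 25. The following is a minimal sketch that expands a shorthand name into the long form. The expansion tables are an inference from comparing those claims, not definitions given in this section, and the helper name `expand` is hypothetical.

```python
# Assumed expansions, inferred by comparing claims 7-25 with claims 36-48.
C3_CHAINS = {
    "GGX": ["GlcA", "Gal", "Xyl"],
    "GGR": ["GlcA", "Gal", "Rha"],
}
C28_CHAINS = {
    "FRX": ["Fuc", "Rha", "Xyl"],
    "FRXX": ["Fuc", "Rha", "Xyl", "Xyl"],
    "FRXA": ["Fuc", "Rha", "Xyl", "Api"],
}


def expand(name):
    """Expand e.g. 'QA-C3-GGX-C28-FRXA-C9' into the long residue-by-residue name."""
    parts = name.split("-")
    # Expected layout: QA, C3, <C3 code>, C28, <C28 code>, then optional suffixes.
    if len(parts) < 5 or parts[0] != "QA" or parts[1] != "C3" or parts[3] != "C28":
        raise ValueError(f"unrecognised shorthand: {name}")
    if parts[2] not in C3_CHAINS or parts[4] not in C28_CHAINS:
        raise ValueError(f"unknown chain code in: {name}")
    long_parts = ["QA", "C3", *C3_CHAINS[parts[2]], "C28", *C28_CHAINS[parts[4]], *parts[5:]]
    return "-".join(long_parts)


print(expand("QA-C3-GGR-C28-FRX-C9"))
print(expand("QA-C3-GGX-C28-FRXA-C18-Araf"))
```

The linear string only lists residues in order; it does not encode where the acyl chain is attached, so the suffixes (C9, C18, Araf) are carried over unchanged.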

37. The method of claim 36, wherein the CCL is QsCCL according to SEQ ID NO: 178, the chalcone-synthase-like type III PKS is QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, or both QsChSD according to SEQ ID NO: 181 and QsChSE according to SEQ ID NO: 184, the keto-reductase is QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, or both QsKR11 according to SEQ ID NO: 187 and QsKR23 according to SEQ ID NO: 190, and the acyltransferase is QsDMOT9 according to SEQ ID NO: 193.

38. The method of claim 36 wherein the derivative is QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXA-C18 or QA-C3-GGX-C28-FRXA-C18, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(v) an acyltransferase QsDMOT4 according to SEQ ID NO: 196 attaching a second C9-CoA unit to C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9, or QA-C3-GGX-C28-FRXA-C9 to form C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXA-C18 or QA-C3-GGX-C28-FRXA-C18.

39. The method of claim 38, wherein QsDMOT4 is encoded by the nucleotide sequence SEQ ID NO: 197.

40. The method of claim 38, wherein the derivative is QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf, or QA-C3-GGX-C28-FRXA-C18-Araf, the yeast is further engineered to produce UDP-Araf, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(vi) an arabinotransferase (ArafT) transferring UDP-Araf and attaching an Araf residue to QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXA-C18, or QA-C3-GGX-C28-FRXA-C18 to form QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf or QA-C3-GGX-C28-FRXA-C18-Araf.

41. The method of claim 40, wherein the ArafT is selected from QsArafT according to SEQ ID NO: 229 and QsArafT2 according to SEQ ID NO: 232.

42. The method of claim 41, wherein the ArafT is QsArafT2 according to SEQ ID NO: 232.

43. A method of producing QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRX-C18- Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3-GGR-C28-FRXX-C18-Xyl, QA-C3-GGX-C28- FRX-C18-Xyl, QA-C3-GGX-C28-FRXA-C18-Xyl or QA-C3-GGR-C28-FRX-C18-Xyl in a yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRXX-C18, QA- 122

C3-GRX-C28-FRXX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGX-C28-FRXA-C18 or QA- C3-GGR-C28-FRX-C18, a heterologous gene encoding an arabinotransferase (ArafT) transferring UDP-Xyl and attaching a Xyl residue to QA-C3-GGX-C28-FRX-C18, QA-C3-GGR- C28-FRX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GRX-C28-FRXX-C18, QA-C3-GGX-C28- FRX-C18, QA-C3-GGX-C28-FRXA-C18 and QA-C3-GGR-C28-FRX-C18 to form QA-C3-GGX- C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3- GRX-C28-FRXX-C18-Xyl, QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXA-C18-Xyl or QA-C3-GGR-C28-FRX-C18-Xyl.

44. The method of claim 43, wherein the ArafT is QsArafT is according to SEQ ID NO: 229.

45. The method of any one of claims 37 to 45, wherein the overexpressing further comprises the overexpressing of heterologous genes encoding the following enzymes:

(i) a phosphopantetheinyl (Ppant) transferase,

(ii) a megasynthase LovF-TE including an ACP domain, condensing two units of malonyl-CoA to 2MB-ACP, cleaving 2MB acid from the ACP domain which is converted into 2MB-CoA by the CoA ligase (CCL), and no 2MB acid is supplemented exogenously.

46. The method of claim 45, wherein the Ppant is AnNpgA according to SEQ ID NO: 237 and the megasynthase LovF-TE is AstLovF-TE according to SEQ ID NO: 235.

47. A method of producing QA-C3-GGX-C28-FRXX-C18-Araf (QS-21-Xyl) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding GvBAS according to SEQ ID NO: 10, QsC28C16 according to SEQ ID NO: 23, QsC23 according to SEQ ID NO: 29, QsC28 according to SEQ ID NO: 41, AtATR1 according to SEQ ID NO: 49, Qsb5 according to SEQ ID NO: 55, SvMSBP1 according to SEQ ID NO: 67, AtUGD_A101L according to SEQ ID NO: 108, QsCslG2 according to SEQ ID NO: 78, QsGalT according to SEQ ID NO: 116, AtUXS according to SEQ ID NO: 105, QsC3XylT according to SEQ ID NO: 122, SvNMD according to SEQ ID NO: 90, SvUG46DH according to SEQ ID NO: 87, QsFucT according to SEQ ID NO: 93, AtRHM2 according to SEQ ID NO: 102, QsRhaT according to SEQ ID NO: 119, QsC28XylT3 according to SEQ ID NO: 125, QsC28XylT4 according to SEQ ID NO: 128, QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, QsDMOT9 according to SEQ ID NO: 193, QsDMOT4 according to SEQ ID NO: 196, AtUXE according to SEQ ID NO: 199, AtUAM1 according to SEQ ID NO: 208, QsArafT2 according to SEQ ID NO: 232, AnNpgA according to SEQ ID NO: 237, QsCCL according to SEQ ID NO: 178 and AstLovF-TE according to SEQ ID NO: 235.

48. A method of producing QA-C3-GGX-C28-FRXA-C18-Araf (QS-21-Api) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding GvBAS according to SEQ ID NO: 10, QsC28C16 according to SEQ ID NO: 23, QsC23 according to SEQ ID NO: 29, QsC28 according to SEQ ID NO: 41, AtATR1 according to SEQ ID NO: 49, Qsb5 according to SEQ ID NO: 55, SvMSBP1 according to SEQ ID NO: 67, AtUGD_A101L according to SEQ ID NO: 108, QsCslG2 according to SEQ ID NO: 81, QsGalT according to SEQ ID NO: 116, AtUXS according to SEQ ID NO: 105, QsC3XylT according to SEQ ID NO: 122, SvNMD according to SEQ ID NO: 90, SvUG46DH according to SEQ ID NO: 87, QsFucT according to SEQ ID NO: 93, AtRHM2 according to SEQ ID NO: 102, QsRhaT according to SEQ ID NO: 119, QsC28XylT3 according to SEQ ID NO: 125, QsC28ApiT4 according to SEQ ID NO: 151, QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, QsDMOT9 according to SEQ ID NO: 193, QsDMOT4 according to SEQ ID NO: 196, AtUXE according to SEQ ID NO: 199, AtUAM1 according to SEQ ID NO: 208, QsArafT2 according to SEQ ID NO: 232, AnNpgA according to SEQ ID NO: 237, QsCCL according to SEQ ID NO: 178 and AstLovF-TE according to SEQ ID NO: 235.
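Claims 47 and 48 recite long, nearly identical gene lists. As a reading aid, the sketch below compares the two lists by gene name only (SEQ ID NOs are not compared). The names are transcribed from the two claims with OCR-style spellings normalized, which is an assumption of this sketch, as are the variable names.

```python
# Gene names transcribed from claim 47 (QS-21-Xyl); spellings normalized (assumption).
CLAIM_47 = """
GvBAS QsC28C16 QsC23 QsC28 AtATR1 Qsb5 SvMSBP1 AtUGD_A101L QsCslG2 QsGalT
AtUXS QsC3XylT SvNMD SvUG46DH QsFucT AtRHM2 QsRhaT QsC28XylT3 QsC28XylT4
QsChSD QsChSE QsKR11 QsKR23 QsDMOT9 QsDMOT4 AtUXE AtUAM1 QsArafT2
AnNpgA QsCCL AstLovF-TE
""".split()

# Gene names transcribed from claim 48 (QS-21-Api); spellings normalized (assumption).
CLAIM_48 = """
GvBAS QsC28C16 QsC23 QsC28 AtATR1 Qsb5 SvMSBP1 AtUGD_A101L QsCslG2 QsGalT
AtUXS QsC3XylT SvNMD SvUG46DH QsFucT AtRHM2 QsRhaT QsC28XylT3 QsC28ApiT4
QsChSD QsChSE QsKR11 QsKR23 QsDMOT9 QsDMOT4 AtUXE AtUAM1 QsArafT2
AnNpgA QsCCL AstLovF-TE
""".split()

only_in_47 = set(CLAIM_47) - set(CLAIM_48)  # genes specific to the Xyl-terminated product
only_in_48 = set(CLAIM_48) - set(CLAIM_47)  # genes specific to the Api-terminated product
shared = set(CLAIM_47) & set(CLAIM_48)

print("only in claim 47:", sorted(only_in_47))
print("only in claim 48:", sorted(only_in_48))
print("shared genes:", len(shared))
```

On this reading, the two lists differ only in the enzyme that adds the terminal C28 sugar, which matches the FRXX versus FRXA ending of the two product names.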

49. C3-glycosylated QA derivatives obtained according to the method of any one of claims 7 to 13.

50. C28-glycosylated QA derivatives obtained according to the method of any one of claims 16 to 25.

51. Acylated and glycosylated QA derivatives obtained according to the method of any one of claims 36 to 46.

52. The use of C3-glycosylated QA derivatives of claim 49, C28-glycosylated QA derivatives of claim 50, and acylated and glycosylated QA derivatives of claim 51 as an adjuvant.

53. An isolated polypeptide selected from a β-amyrin synthase (SvBAS) according to SEQ ID NO: 13, a β-amyrin synthase (QsBAS) according to SEQ ID NO: 15, a CYP C16 oxidase (QsC28C16) according to SEQ ID NO: 23, a CYP C16 oxidase (SvC16) according to SEQ ID NO: 26, a CYP C23 oxidase (SvC23-1) according to SEQ ID NO: 32, a CYP C23 oxidase (SvC23-2) according to SEQ ID NO: 35, a CYP C28 oxidase (SvC28) according to SEQ ID NO: 44, a cytochrome b5 protein (Qsb5) according to SEQ ID NO: 55, a cytochrome b5 protein (Svb5) according to SEQ ID NO: 61, a UDP-GlcA transferase (SvCslG) according to SEQ ID NO: 76, a MSBP protein (SvMSBP1) according to SEQ ID NO: 67, a MSBP protein (SvMSBP2) according to SEQ ID NO: 70, a MSBP protein (QsMSBP1) according to SEQ ID NO: 73, a UDP-glucose-4,6-dehydratase (SvUG46DH) according to SEQ ID NO: 87, a UDP-4-keto-6-deoxy-glucose reductase (SvNMD) according to SEQ ID NO: 90, a UDP-Galactose transferase (SvGalT) according to SEQ ID NO: 98, a UDP-Fucose transferase (SvFucT) according to SEQ ID NO: 96, a UDP-Xylose transferase (SvC3XylT) according to SEQ ID NO: 100, a UDP-Arabinofuranose transferase (QsArafT2) according to SEQ ID NO: 229, a UDP-glucose dehydrogenase (AtUGD_A101L) according to SEQ ID NO: 108, a UDP-Xylose transferase (QsC28XylT4-3aa) according to SEQ ID NO: 131, a UDP-Xylose transferase (QsC28XylT4-6aa) according to SEQ ID NO: 134, a UDP-Xylose transferase (QsC28XylT4-9aa) according to SEQ ID NO: 137, a UDP-Xylose transferase (QsC28XylT4-12aa) according to SEQ ID NO: 140, a UDP-Xylose transferase (SUMO-QsC28XylT4) according to SEQ ID NO: 143, a UDP-Xylose transferase (TrXA-QsC28XylT4) according to SEQ ID NO: 145, a UDP-Xylose transferase (MBP-QsC28XylT4) according to SEQ ID NO: 147, a UDP-Xylose transferase (QsC28XylT3-3xGGGS-QsC28XylT4) according to SEQ ID NO: 149 and a type I polyketide synthase (AstLovF-TE) according to SEQ ID NO: 235.

Description:

SAPONIN PRODUCTION IN YEAST

TECHNICAL FIELD

The present invention relates to the biosynthetic production of QS-21, precursors and variants thereof, and non-native sugar in yeast, as well as to related aspects.

BACKGROUND ART

QS-21 is a natural saponin extract from the bark of the Chilean ‘soapbark’ tree, Quillaja saponaria. QS-21 extract was originally identified as a fraction purified from a crude bark extract of Quillaja saponaria Molina obtained by RP-HPLC purification (peak 21) (Kensil et al. 1991). Crude bark extracts have been reported to comprise a wide range of saponins. The QS-21 extract, or fraction, comprises several distinct saponin molecules. Two principal isomeric molecular constituents of the fraction were reported (Ragupathi et al. 2011) and are depicted in Fig. 1. Both incorporate a central triterpene core, or aglycon (quillaic acid), to which a branched trisaccharide is attached at the triterpene C3 oxygen functionality, and a linear tetrasaccharide is attached at the triterpene C28 carboxylate group. A fourth component within the saponin structure is a glycosylated C18 pseudo-dimeric acyl chain attached to the fucose residue of the linear tetrasaccharide terminated with an arabinofuranose residue via a hydrolytically labile ester linkage. The isomeric components differ in the constitution of the terminal sugar residue of the tetrasaccharide, in which the major and minor compounds incorporate either an apiose (65%) (‘QS-21-Api’) or a xylose (35%) (‘QS-21-Xyl’) carbohydrate, respectively (see R2 in Fig. 1).

Saponins from Q. saponaria, including QS-21, have been known for many years to have potent immunostimulatory properties, capable of enhancing antibody production and specific T-cell responses. These properties have resulted in the development of Q. saponaria saponin-based adjuvants for vaccines. Of particular note, the AS01 adjuvant features a liposomal formulation including QS-21 and 3-O-desacyl-4'-monophosphoryl lipid A (3D-MPL) (Garcon, 2011; Didierlaurent, 2017) and is currently licenced in vaccines for diseases including shingles (Shingrix™) and malaria (Mosquirix™).

With more vaccines including QS-21 becoming available, the demand for its supply is expected to increase substantially over the years. Therefore, there remains a need for providing methods of production of QS-21 which do not rely upon natural resources, such as biosynthetic methods of production in yeast. Examples of advantages of such methods are as follows: (i) complex purification schemes designed to separate saponins from complex mixtures including multiple saponins (such as from a crude bark extract) are avoided; (ii) ability to produce individual saponins otherwise hard to separate when present in a crude bark extract (e.g. QS-21-Api and QS-21-Xyl); and (iii) ability to produce any saponin of interest (including precursors otherwise not purifiable from crude bark extracts).

The biosynthetic production of QS-21 precursors has been reported in Nicotiana benthamiana (e.g. WO 19/122259, WO 20/260475 and WO 22/136563). Quillaic acid production at trace levels has been reported in yeast (WO 20/263524). The present invention reports for the first time the successful production, in yeast, of QS-21 and glycosylated precursors and variants thereof.

SUMMARY OF THE INVENTION

In a first aspect of the invention, there is provided a method of producing quillaic acid (QA) in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce β-amyrin, heterologous genes encoding the following enzymes:

(i) a cytochrome P450 C16 oxidase, wherein the C16 oxidase oxidizes the C16 carbon of β-amyrin to a hydroxyl group,

(ii) a cytochrome P450 C23 oxidase, wherein the C23 oxidase oxidizes the C23 carbon of β-amyrin to an aldehyde group,

(iii) a cytochrome P450 C28 oxidase, wherein the C28 oxidase oxidizes the C28 carbon of β-amyrin to a carboxyl group, and

(iv) a cytochrome P450 reductase (CPR), acting as a redox partner, wherein the C16 oxidase, the C23 oxidase, the C28 oxidase and the CPR are from a plant origin; and a yeast which is engineered to produce QA accordingly.
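The first aspect can be pictured as three position-specific oxidations applied to one scaffold. The sketch below is a toy model only: it records the functional group named in items (i) to (iii) at each carbon and treats the CPR of item (iv) as a simple on/off requirement. The dictionary layout, the "unoxidized" placeholder and the function name are assumptions made for illustration, not part of the disclosure.

```python
# Toy model: state of the three carbons named in the first aspect.
BETA_AMYRIN = {"C16": "unoxidized", "C23": "unoxidized", "C28": "unoxidized"}

# Target carbon and resulting group for each P450, as stated in items (i)-(iii).
OXIDASES = {
    "C16 oxidase": ("C16", "hydroxyl"),
    "C23 oxidase": ("C23", "aldehyde"),
    "C28 oxidase": ("C28", "carboxyl"),
}


def apply_oxidases(scaffold, enzymes, cpr_present=True):
    """Return a new scaffold after the listed oxidases have acted.

    In this simplified model the oxidases do nothing without their
    redox partner (the CPR of item (iv)).
    """
    product = dict(scaffold)
    if not cpr_present:
        return product
    for name in enzymes:
        carbon, group = OXIDASES[name]
        product[carbon] = group
    return product


quillaic_acid = apply_oxidases(BETA_AMYRIN, OXIDASES)
print(quillaic_acid)

# Expressing only a subset of the oxidases leaves the other positions untouched.
print(apply_oxidases(BETA_AMYRIN, ["C28 oxidase"]))
```

The model ignores reaction order, intermediate oxidation states and enzyme promiscuity; it only makes explicit which enzyme is responsible for which position.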

In a second aspect, there is provided a method of producing UDP-Glucuronic acid (UDP-GlcA) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding a UDP-glucose dehydrogenase (UGD) converting UDP-Glucose (UDP-Glc) into UDP-GlcA; and a yeast which is engineered to produce UDP-GlcA accordingly.

In a third aspect, there is provided a method of producing UDP-Rhamnose (UDP-Rha) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding a UDP-rhamnose synthase (RHM) converting UDP-Glc into UDP-Rha; and a yeast which is engineered to produce UDP-Rha accordingly.

In a fourth aspect, there is provided a method of producing UDP-Xylose (UDP-Xyl) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) a UDP-glucose dehydrogenase (UGD) converting UDP-Glc into UDP-GlcA, and

(ii) a UDP-xylose synthase (UXS) converting UDP-GlcA into UDP-Xylose; and a yeast which is engineered to produce UDP-Xyl accordingly.

In a fifth aspect, there is provided a method of producing a C3-glycosylated QA derivative in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce QA and UDP-GlcA, a heterologous gene encoding the following enzyme:

(i) a UDP-GlcA transferase (GlcAT) transferring UDP-GlcA and attaching a GlcA residue at the C3 position of QA to form the C3-glycosylated QA derivative; and a yeast which is engineered to produce the C3-glycosylated QA derivative accordingly.

In a sixth aspect, there is provided a method of producing UDP-Fucose (UDP-Fuc) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) a UDP-glucose-4,6-dehydratase (UG46DH) converting UDP-Glc into UDP-4-keto-6-deoxy-glucose and

(ii) a 4-keto-reductase converting UDP-4-keto-6-deoxy-glucose into UDP-Fuc; and a yeast which is engineered to produce UDP-Fuc accordingly.
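The second, third, fourth and sixth aspects each describe one or two enzymatic conversions starting from UDP-Glc. A minimal sketch of these conversions as a small directed graph is given below, with a breadth-first search returning the enzymes to overexpress for a chosen nucleotide sugar. The edge list is taken from the aspects above; treating UDP-Glc as the available starting pool, and the function name `enzymes_needed`, are assumptions of this sketch.

```python
from collections import deque

# (substrate, enzyme, product), as stated in the second, third, fourth and sixth aspects.
CONVERSIONS = [
    ("UDP-Glc", "UGD", "UDP-GlcA"),
    ("UDP-Glc", "UDP-rhamnose synthase", "UDP-Rha"),
    ("UDP-GlcA", "UXS", "UDP-Xyl"),
    ("UDP-Glc", "UG46DH", "UDP-4-keto-6-deoxy-glucose"),
    ("UDP-4-keto-6-deoxy-glucose", "4-keto-reductase", "UDP-Fuc"),
]


def enzymes_needed(target, start="UDP-Glc"):
    """Breadth-first search for the enzyme route from `start` to `target`.

    Returns the list of enzymes in order, or None if no route is listed.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        sugar, route = queue.popleft()
        if sugar == target:
            return route
        for substrate, enzyme, product in CONVERSIONS:
            if substrate == sugar and product not in seen:
                seen.add(product)
                queue.append((product, route + [enzyme]))
    return None


for sugar in ("UDP-GlcA", "UDP-Rha", "UDP-Xyl", "UDP-Fuc"):
    print(sugar, "<-", enzymes_needed(sugar))
```

Further conversions described elsewhere in the document (for example towards UDP-Araf) could be added as extra tuples without changing the search.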

In a seventh aspect, there is provided a method of producing a C28-glycosylated QA derivative in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce a C3-glycosylated QA derivative, a heterologous gene encoding the following enzyme:

(i) a UDP-Fucose transferase (FucT) transferring UDP-Fuc and attaching a Fuc residue at the C28 position of the C3-glycosylated QA derivative to form the C28-glycosylated QA derivative; and a yeast which is engineered to produce the C28-glycosylated QA derivative accordingly.

In an eighth aspect, there is provided a method of producing (S)-2-methylbutyryl CoA (2MB-CoA) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding a carboxyl coenzyme A (CoA) ligase (CCL) converting 2-methylbutyric acid (2MB acid) into 2MB-CoA, and 2MB acid is supplemented exogenously; and a yeast which is engineered to produce 2MB-CoA accordingly.

In a ninth aspect, there is provided a method of producing UDP-Arabinofuranose (UDP-Araf) in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce UDP-Xyl, heterologous genes encoding the following enzymes:

(i) a UDP-Xyl epimerase (UXE) converting UDP-Xyl into UDP-Arabinopyranose (UDP-Arap), and

(ii) a UDP-arabinose mutase (UAM) converting UDP-Arap into UDP-Arabinofuranose (UDP-Araf); and a yeast which is engineered to produce UDP-Araf accordingly.

In a tenth aspect, there is provided a method of producing an acylated and glycosylated QA derivative in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce a glycosylated QA derivative, heterologous genes encoding the following enzymes:

(i) a carboxyl coenzyme A (CoA) ligase (CCL) converting 2MB acid into 2MB-CoA,

(ii) a chalcone-synthase-like type III polyketide synthase (PKS) condensing malonyl-CoA with 2MB-CoA to form C9-Keto-CoA,

(iii) a keto-reductase (KR) converting C9-Keto-CoA into C9-CoA, and

(iv) an acyltransferase transferring and attaching a first C9-CoA unit to the glycosylated QA derivative to form an acylated and glycosylated QA derivative, and 2MB acid is optionally supplemented exogenously; and a yeast which is engineered to produce the acylated and glycosylated QA derivative accordingly.

In an eleventh aspect, there are provided QA derivatives obtained according to the method of the first to tenth aspects of the invention.

In a twelfth aspect, there is provided the use of QA derivatives according to the eleventh aspect of the invention as an adjuvant.

In a thirteenth aspect, there are provided isolated enzymes or proteins used in the method of the first to tenth aspects of the invention.

BRIEF DESCRIPTION OF THE FIGURES

Fig. 1 Shows the structure of the two principal isomeric constituents present within the QS-21 fraction traditionally purified from a crude bark extract originating from Q. saponaria Molina tree. The core backbone is formed from the triterpene quillaic acid (QA). The C3 position of QA features a branched trisaccharide consisting of a β-D-glucuronic acid (β-D-GlcA) residue, a β-D-galactose (β-D-Gal) residue and a β-D-xylose (β-D-Xyl) residue at R1. The C28 position of QA features a linear tetrasaccharide consisting of a β-D-fucose (β-D-Fuc) residue, an α-L-rhamnose (α-L-Rha) residue, a β-D-xylose residue and either a terminal β-D-apiose (β-D-Api) residue or a β-D-xylose residue at R2. The β-D-fucose residue also features an 18-carbon pseudo-dimeric acyl chain which terminates with an α-L-arabinofuranose (α-L-Araf) residue. Carbon numbering in QA (C3, C16, C23 and C28) is indicated. Substitution of R1 with an α-L-rhamnose (α-L-Rha) residue represents the rhamnose-chemotype variant of QS-21, present at trace level within the QS-21 fraction traditionally purified from a crude bark extract originating from Q. saponaria Molina tree.

Fig. 2 Shows the biosynthetic pathway for de novo production of QS-21 in yeast. Fig. 2A depicts the biosynthesis of nucleotide sugars required for the C3 branched trisaccharide and the C28 linear tetrasaccharide and the biosynthesis of the unit C9-CoA constitutive of the 18-carbon pseudo-dimeric acyl chain from the mevalonate pathway. ‘UGE’ is for UDP-glucose 4-epimerase, ‘UGD’ is for UDP-glucose dehydrogenase, ‘RHM’ is for rhamnose synthase, ‘UXS’ is for UDP-xylose synthase, ‘AXS’ is for UDP-apiose/UDP-xylose synthase, ‘UXE’ is for UDP-xyl epimerase, ‘UAM’ is for UDP-arabinose mutase. ‘LovF-TE’ is for a polyketide synthase (PKS) (or ‘megasynthase’). ‘ACP’ is for Acyl Carrier Protein. ‘TE’ is for thioesterase. ‘CCL’ is for carboxyl coenzyme A ligase, ‘PKS’ is for polyketide synthase, ‘KR’ is for keto-reductase. Fig. 2B depicts the biosynthesis of quillaic acid by successive oxidation of β-amyrin. ‘BAS’ is for β-amyrin synthase. Fig. 2C depicts the biosynthesis of the branched trisaccharide at the C3 position of QA. ‘GlcAT’ is for UDP-glucuronic acid transferase. ‘GalT’ is for UDP-galactose transferase. ‘XylT’ is for UDP-xylose transferase. ‘RhaT’ is for UDP-rhamnose transferase. Fig. 2D depicts the biosynthesis of the linear tetrasaccharide at the C28 position of QA. ‘FucT’ is for UDP-fucose transferase. Fig. 2E depicts the addition of the 18-carbon pseudo-dimeric acyl chain to the fucose residue of the linear tetrasaccharide at the C28 position of QA and the addition of arabinofuranose to the end of the acyl chain.

Fig. 3 Shows a screening of β-amyrin synthases (BAS) from different plants. β-amyrin abundance was measured by GC-MS in yeasts engineered with genes encoding Artemisia annua (Aa) BAS (‘AaBAS’), Arabidopsis thaliana (At) BAS (‘AtBAS’), Glycyrrhiza glabra (Gg) BAS (‘GgBAS’), and Gypsophila vaccaria (Gv) BAS (‘GvBAS’), 1 day, 2 days and 3 days after induction of gene expression.

Fig. 4 Shows the production of QA precursors (gypsogenin, oleanolic acid and hederagenin) and QA in different yeast strains (as indicated) engineered with different combinations of enzymes and proteins, as described in Table 3.

Fig. 5 Shows a comparison of the subcellular localization of the Cytochrome P450 C28 oxidase from Quillaja saponaria (QsC28) (Fig. 5A), the Cytochrome P450 C16 oxidase from Quillaja saponaria (QsC16) (Fig. 5B), and the oxidase resulting from the fusion of QsC28 at the N-terminus of QsC16 (QsC28C16) (Fig. 5C), each tagged with a fluorescent protein at its C-terminus (GFP or mCherry, as indicated).

Fig. 6 Panel A shows the relative expression level of β-amyrin synthase (SvBAS) mRNA in leaves treated with MeJa at 0, 50 and 100 µM over 72h. Panel B shows the fold-change of β-amyrin synthase expression in flowers treated with MeJa at 50 and 100 µM (compared to 0 µM) at 24h and 72h. Panel C shows a neighbor-joining tree of cytochromes P450 (CYPs) acting on triterpenoids from other plants and CYP candidates identified from the S. vaccaria transcriptome. Gene names labelled with an asterisk represent S. vaccaria genes. Gene names included in a box represent CYPs that are co-expressed with β-amyrin synthase.

Fig. 7 Shows LC-MS extracted ion chromatograms (EIC) for QA precursors (‘oleanolic acid’, and ‘echinocystic acid’) detected in Nicotiana benthamiana plants transiently co-expressing a β-amyrin synthase from S. vaccaria (‘SvBAS’), a CYP C28 oxidase from S. vaccaria (‘SvC28’), a CYP C16 oxidase from S. vaccaria (‘SvC16’) in different combinations (as indicated) (Panel A); LC-MS extracted ion chromatograms (EIC) for the QA precursor (Gypsogenin) detected in N. benthamiana plants transiently co-expressing a β-amyrin synthase from Q. saponaria (‘QsBAS’), a CYP C28 oxidase from Q. saponaria (‘QsC28’), a CYP C23 oxidase from S. vaccaria (‘SvC23-1’), a CYP C23 oxidase from S. vaccaria (‘SvC23-2’) in different combinations (as indicated) (Panel B); and LC-MS extracted ion chromatograms for the QA precursor (Gypsogenic acid) detected in N. benthamiana plants transiently co-expressing the same combinations of enzymes as in Panel B (Panel C).

Fig. 8 Shows LC-MS extracted ion chromatograms (EIC) for QA precursors (oleanolic acid - ‘OA’, hederagenin/echinocystic acid - ‘Hed/EA’, gypsogenin - ‘Gyp’ and echinocystic acid - ‘EA’) and QA detected in a yeast co-expressing a β-amyrin synthase (‘GvBAS’), a CYP C16 oxidase (‘SvC16’), a CYP C28 oxidase (‘QsC28’), a CYP reductase (‘AtATR1’) and a CYP C23 oxidase from S. vaccaria (‘Sv-C23-1’). The dashed line indicates that the peak obtained in the EIC for QA (in the ‘Yeast sample’) matches the peak obtained in the EIC for the QA standard (‘Commercial QA standard’). Numbers in brackets indicate m/z (mass-to-charge ratio) values.

Fig. 9 Shows LC-MS extracted ion chromatograms (EIC) for QA precursors (oleanolic acid - ‘OA’, hederagenin/echinocystic acid - ‘Hed/EA’, gypsogenin - ‘Gyp’ and echinocystic acid - ‘EA’) and QA detected in a yeast co-expressing a β-amyrin synthase (‘GvBAS’), a CYP C16 oxidase (‘SvC16’), a CYP C28 oxidase (‘QsC28’), a CYP reductase (‘AtATR1’) and a CYP C23 oxidase from S. vaccaria (‘Sv-C23-2’). The dotted line indicates that the peak obtained in the EIC for QA (in the ‘Yeast sample’) matches the peak obtained in the EIC for the QA standard (‘Commercial QA standard’). Numbers in brackets indicate m/z (mass-to-charge ratio) values.

Fig. 10 Shows the transcript expression profile of AtMSBP homologs in leaves and flowers of S. vaccaria (as indicated). Average expression levels of different homologs in leaves and flowers are represented by TMM (trimmed mean of M-values).

Fig. 11 Shows a comparison of the production of QA precursors (gypsogenin, oleanolic acid, hederagenin and erythrodiol) and QA in the absence (YL-4) or presence of MSBP proteins of different plant origins (as indicated), as measured by LC-MS.

Fig. 12 Shows the biosynthetic pathway for de novo production of nucleotide sugars in yeast via nucleotide sugar interconversion enzymes. Non-native sugars in yeast are circled. Heterologous enzymes required for synthesizing such non-native sugars are underlined. ‘Rham synthase’ is for rhamnose synthase, ‘UGE’ is for UDP-glucose 4-epimerase, ‘UGD’ is for UDP-glucose dehydrogenase, ‘UXS’ is for UDP-xylose synthase, ‘AXS’ is for UDP-apiose/UDP-xylose synthase, ‘UXE’ is for UDP-xyl epimerase, ‘UAM’ is for UDP-arabinose mutase, ‘UG46DH’ is for UDP-glucose-4,6-dehydratase, ‘UG46DGR’ is for UDP-4-keto-6-deoxy-glucose reductase.

Fig. 13 Panel A shows a comparison of UDP-Glucose (UDP-Glc), UDP-Glucuronic acid (UDP-GlcA) and UDP-Xylose (UDP-Xyl) production between 2 yeast strains overexpressing AtUGD (a UDP-glucose dehydrogenase) (SC-1 and SC-4). Panel B shows a comparison of UDP-Xyl production between 2 yeast strains overexpressing AtUGD (SC-4 and SC-16), together with a UDP-xylose synthase from A. thaliana (AtUXS) and a UDP-apiose/UDP-xylose synthase from Q. saponaria (QsAXS), respectively, at 24h and 48h after gene expression was induced.

Fig. 14 Shows a comparison of UDP-Rhamnose (UDP-Rha), UDP-Xylose (UDP-Xyl) and UDP-Fucose (UDP-Fuc) production between 5 yeast strains (SC-17, SC-19, SC-20, SC-22 and SC-23) overexpressing different combinations of enzymes of different plant origins (as indicated). Panel A provides results in a graph plotted against the yeast strains, while Panel B provides results in a graph plotted against the UDP sugars.

Fig. 15 Shows the production of glucuronylated QA precursors (‘oleanolic acid-GlcA’, ‘gypsogenin-GlcA’ and ‘hederagenin-GlcA’) and glucuronylated QA (‘QA-C3-GlcA’) in a yeast engineered to produce QA, further overexpressing a UDP-glucuronic acid transferase from Q. saponaria (‘QsCslG1’), together with a UDP-glucose dehydrogenase from A. thaliana (‘AtUGD’) (YL-11), as measured by LC-MS. Left panel shows QA precursors and QA (unglycosylated). Right panel shows glucuronylated QA precursors and glucuronylated QA (‘QA-C3-GlcA’).

Fig. 16 Shows the production of glucuronylated QA precursors (‘oleanolic acid-GlcA’, ‘gypsogenin-GlcA’ and ‘hederagenin-GlcA’) and glucuronylated QA (‘QA-GlcA’) in a yeast engineered to produce QA, further overexpressing a UDP-glucuronic acid transferase (‘QsCslG2’), together with a UDP-glucose dehydrogenase (‘AtUGD’) (YL-12), as measured by LC-MS. Left panel shows QA precursors and QA (unglycosylated). Right panel shows the glucuronylated QA precursors and glucuronylated QA (‘QA-C3-GlcA’).

Fig. 17 Shows a comparison of the substrate specificity between QsCslG1 and QsCslG2. The data shown in Fig. 16 have been quantified and are presented as a graph, showing the production of QA precursors (‘OA’, ‘Her’ and ‘Gyp’), QA, glucuronylated QA precursors (‘GlcA-OA’, ‘GlcA-Her’, ‘GlcA-Gyp’) and glucuronylated QA (‘QA-C3-GlcA’) (as indicated) obtained from YL-11 (overexpressing QsCslG1) and YL-12 (overexpressing QsCslG2), respectively.

Fig. 18 Shows an LC-MS extracted ion chromatogram (EIC) for QA and QA-C3-GlcA detected in an in vitro enzymatic assay. QA and UDP-GlcA (both from a commercial source) have been directly added into a reaction buffer together with a microsome preparation of a yeast overexpressing a UDP-glucuronic acid transferase from S. vaccaria (‘SvCslG’) via plasmid expression.

Fig. 19 Shows LC-MS extracted ion chromatograms (EIC) for QA and C3-glycosylated QA derivatives (‘QA-C3-GlcA’ and ‘QA-C3-GlcA-Gal’) detected in a yeast engineered to produce QA-C3-GlcA, and further overexpressing a galactose transferase from Q. saponaria (‘QsGalT’) (YL-13). Peaks corresponding to QA and QA-C3-GlcA-Gal are labelled as such.

Fig. 20 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA’) detected in N. benthamiana plants transiently co-expressing a UDP-glucuronic acid transferase from S. vaccaria (‘SvCslG’), together (or not) with a UDP-galactose transferase from S. vaccaria (‘SvGal’) (as indicated) and infiltrated with QA (from a commercial source). Peaks corresponding to QA-C3-Gal and QA-C3-GlcA-Gal are labelled as such.

Fig. 21 Shows LC-MS extracted ion chromatograms (EIC) for QA and C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Rha’) detected in a yeast engineered to produce QA-C3-GlcA-Gal and further overexpressing a UDP-rhamnose synthase from A. thaliana (‘AtRHM2’) and a UDP-rhamnose transferase from Q. saponaria (‘QsC3RhaT’) (YL-14). Peaks corresponding to QA and QA-C3-GlcA-Gal-Rha are labelled as such.

Fig. 22 Shows LC-MS extracted ion chromatograms (EIC) for QA and C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Xyl’) detected in a yeast engineered to produce QA-C3-GlcA-Gal and further overexpressing a UDP-xylose synthase from A. thaliana (‘AtUXS’) and a UDP-xylose transferase from Q. saponaria (‘QsC3XylT’) (YL-15). Peaks corresponding to QA and QA-C3-GlcA-Gal-Xyl are labelled as such.

Fig. 23 Shows a comparison of QA-C3-GlcA-Gal-Xyl production between 7 yeast strains (YL-15, YL-16, YL-17, YL-18, YL-19, YL-20 and YL-21) engineered to produce QA-GlcA-Gal, and further overexpressing, each, a different UDP-glucose dehydrogenase (‘UGD variants’ - as indicated). ‘Syn’ is for Synechococcus sp, ‘Hs’ is for Homo sapiens, ‘Patl’ is for Paramoeba atlantica, ‘Bcyt’ is for Bacillus cytotoxicus, ‘Myxfulv’ is for Corallococcus macrosporus, ‘Pfu’ is for Pyrococcus furiosus.

Fig. 24 Shows a comparison of QA and QA-C3-GlcA-Gal-Xyl (‘QA-C3-GGX’) production between 2 yeast strains (YL-15 and YL-22) (as indicated). As compared with YL-15, YL-22 further overexpresses a glucuronokinase from A. thaliana and a UDP-glucuronic acid pyrophosphorylase from A. thaliana (‘AtUSP’).

Fig. 25 Shows a comparison of QA and QA-C3-GlcA-Gal-Xyl (‘QA-C3-GGX’) production in different yeast strains (YL-15 and YL-23) and different conditions (as indicated). As compared with YL-15, YL-23 further overexpresses a glucuronokinase from A. thaliana, a UDP-glucuronic acid pyrophosphorylase from A. thaliana (‘AtUSP’) and a myo-inositol oxygenase from Thermothelomyces thermophilus (Tt) (‘TtMIOX’). YL-23 was either left untreated (‘YL-23’) or supplemented externally with myo-inositol (‘MI’) and glucuronic acid (‘GlcA’) (as indicated).

Fig. 26 Shows a comparison of QA, QA-C3-GlcA-Gal (‘QA-C3-GG’) and QA-C3-GlcA-Gal-Xyl (‘QA-C3-GGX’) production analyzed by LC-MS. A UDP-xylose synthase from A. thaliana (‘AtUXS’) has been overexpressed in a yeast engineered to produce QA-C3-GGX under an inducible pTetOn promoter. The yeast culture was either left untreated (‘No inducer’) or treated with different concentrations of doxycycline (as indicated).

Fig. 27 Panel A shows a comparison of UDP-Xylose (‘UDP-Xyl’) and UDP-Fucose (‘UDP-Fuc’) production between different yeast strains (as indicated) overexpressing a UDP-glucose dehydrogenase from A. thaliana (‘AtUGD’), a UDP-xylose synthase from A. thaliana (‘AtUXS’), a UDP-glucose-4,6-dehydratase from S. vaccaria (‘SvUG46DH’), a UDP-4-keto-6-deoxy-glucose reductase from S. vaccaria (‘SvNMD’) and a UDP-4-keto-6-deoxy-glucose reductase from Q. saponaria in different combinations (as indicated). Panel B shows a comparison of UDP-Xylose (‘UDP-Xyl’), UDP-Fucose (‘UDP-Fuc’) and UDP-Rhamnose (‘UDP-Rha’) production in different yeast strains (as indicated) overexpressing a UDP-rhamnose synthase from A. thaliana (‘AtRHM2’), a UDP-glucose dehydrogenase from A. thaliana (‘AtUGD’), a UDP-xylose synthase from A. thaliana (‘AtUXS’), a UDP-glucose-4,6-dehydratase from S. vaccaria (‘SvUG46DH’), a UDP-4-keto-6-deoxy-glucose reductase from S. vaccaria (‘SvNMD’) and a UDP-4-keto-6-deoxy-glucose reductase from Q. saponaria in different combinations (as indicated). ‘CP’ is for Cell Pellet.

Fig. 28 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Rha’) and a C28-glycosylated QA derivative (‘QA-C3-GlcA-Gal-Rha-C28-Fuc’) detected in a yeast engineered to produce QA-C3-GlcA-Gal-Rha and further overexpressing a UDP-glucose-4,6-dehydratase from S. vaccaria (‘SvUG46DH’), a UDP-4-keto-6-deoxy-glucose reductase from S. vaccaria (‘SvNMD’) and a UDP-fucose transferase from Q. saponaria (‘QsFucT’) (YL-25). Peaks corresponding to QA-C3-GlcA-Gal-Rha and QA-C3-GlcA-Gal-Rha-C28-Fuc are labelled as such.

Fig. 29 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Xyl’) and C28-glycosylated QA derivatives (‘QA-C3-GlcA-Gal-Xyl-C28-Fuc’ and ‘QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha’) detected in a yeast engineered to produce QA-C3-GlcA-Gal-Xyl-C28-Fuc and further overexpressing a UDP-rhamnose synthase from A. thaliana (‘AtRHM2’) and a UDP-rhamnose transferase from Q. saponaria (‘QsRhaT’) (YL-28). Peaks corresponding to QA-C3-GlcA-Gal, QA-C3-GlcA-Gal-Xyl and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha are labelled as such.

Fig. 30 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Rha’) and C28-glycosylated QA derivatives (‘QA-C3-GlcA-Gal-Rha-C28-Fuc’ and ‘QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha’) detected in a yeast engineered to produce QA-C3-GlcA-Gal-Rha-C28-Fuc and further overexpressing a UDP-rhamnose synthase from A. thaliana (‘AtRHM2’) and a UDP-rhamnose transferase from Q. saponaria (‘QsRhaT’) (YL-27). Peaks corresponding to QA-C3-GlcA-Gal-Rha and QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha are labelled as such.

Fig. 31 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Xyl’) and C28-glycosylated QA derivatives (‘QA-C3-GlcA-Gal-Xyl-C28-Fuc’, ‘QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha’ and ‘QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl’) detected in a yeast engineered to produce QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha and further overexpressing a UDP-xylose synthase from A. thaliana (‘AtUXS’) and a UDP-xylose transferase from Q. saponaria (‘QsC28XylT3’) (YL-30). Peaks corresponding to QA-C3-GlcA, QA-C3-GlcA-Gal, QA-C3-GlcA-Gal-Xyl, QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl are labelled as such.

Fig. 32 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-GlcA’, ‘QA-C3-GlcA-Gal’ and ‘QA-C3-GlcA-Gal-Rha’) and C28-glycosylated QA derivatives (‘QA-C3-GlcA-Gal-Rha-C28-Fuc’, ‘QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha’ and ‘QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl’) detected in a yeast engineered to produce QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha and further overexpressing a UDP-rhamnose synthase from A. thaliana (‘AtRHM2’) and a UDP-rhamnose transferase from Q. saponaria (‘QsRhaT’) (YL-29). Peaks corresponding to QA-C3-GlcA-Gal, QA-C3-GlcA-Gal-Rha, QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha and QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl are labelled as such.

Fig. 33 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-G’, ‘QA-C3-GG’ and ‘QA-C3-GGX’) and C28-glycosylated QA derivatives (‘QA-C3-GGX-C28-FR’, ‘QA-C3-GGX-C28-FRX’ and ‘QA-C3-GGX-C28-FRXX’) detected in a yeast engineered to produce QA-C3-GGX-C28-FRX and further overexpressing a UDP-xylose synthase from A. thaliana (‘AtUXS’) and a UDP-xylose transferase from Q. saponaria (‘QsC28XylT4’) (YL-33). Peaks corresponding to QA-C3-GG, QA-C3-GGX, QA-C3-GGX-C28-FR, QA-C3-GGX-C28-FRX and QA-C3-GGX-C28-FRXX are labelled as such.

Fig. 34 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-G’, ‘QA-C3-GG’ and ‘QA-C3-GGR’) and C28-glycosylated QA derivatives (‘QA-C3-GGR-C28-FR’, ‘QA-C3-GGR-C28-FRX’ and ‘QA-C3-GGR-C28-FRXX’) detected in a yeast engineered to produce QA-C3-GGR-C28-FRX and further overexpressing a UDP-rhamnose synthase from A. thaliana (‘AtRHM2’) and a UDP-rhamnose transferase from Q. saponaria (‘QsRhaT’) (YL-31). Peaks corresponding to QA-C3-GG, QA-C3-GGR, QA-C3-GGR-C28-FR, QA-C3-GGR-C28-FRX and QA-C3-GGR-C28-FRXX are labelled as such.

Fig. 35 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-G’, ‘QA-C3-GG’ and ‘QA-C3-GGX’) and C28-glycosylated QA derivatives (‘QA-C3-GGX-C28-FR’, ‘QA-C3-GGX-C28-FRX’ and ‘QA-C3-GGX-C28-FRXA’) detected in a yeast engineered to produce QA-C3-GGX-C28-FRX and further overexpressing a UDP-apiose synthase from Q. saponaria (‘QsAXS’) and a UDP-apiose transferase from Q. saponaria (‘QsC28ApiT4’) (YL-34). Peaks corresponding to QA-C3-GG, QA-C3-GGX, QA-C3-GGX-C28-FR, QA-C3-GGX-C28-FRX and QA-C3-GGX-C28-FRXA are labelled as such.

Fig. 36 Shows LC-MS extracted ion chromatograms (EIC) for C3-glycosylated QA derivatives (‘QA-C3-G’, ‘QA-C3-GG’ and ‘QA-C3-GGR’) and C28-glycosylated QA derivatives (‘QA-C3-GGR-C28-FR’, ‘QA-C3-GGR-C28-FRX’ and ‘QA-C3-GGR-C28-FRXA’) detected in a yeast engineered to produce QA-C3-GGR-C28-FRX and further overexpressing a UDP-apiose synthase from Q. saponaria (‘QsAXS’) and a UDP-apiose transferase from Q. saponaria (‘QsC28ApiT4’) (YL-32). Peaks corresponding to QA-C3-GG, QA-C3-GGR, QA-C3-GGR-C28-FR, QA-C3-GGR-C28-FRX and QA-C3-GGR-C28-FRXA are labelled as such.

Fig. 37 Shows a comparison of the subcellular localization of two xylose transferases from Quillaja saponaria (‘QsC28XylT3-GFP’ and ‘QsC28XylT4’) and an apiose transferase from Quillaja saponaria (‘QsC28ApiT-GFP’), each tagged with GFP at its C-terminus.

Fig. 38 Shows the level of protein expression (measured by fluorescence intensity after flow cytometry) of different variants of QsC28XylT4 (as indicated) which have been overexpressed in a yeast engineered to produce QA-C3-GGX-C28-FRX. ‘QsC28XylT4-3aa’, ‘QsC28XylT4-6aa’, ‘QsC28XylT4-9aa’ and ‘QsC28XylT4-12aa’ designate variants of QsC28XylT4 having a deletion of 3, 6, 9 and 12 amino acids at the N-terminus, respectively. ‘QsC28XylT4-MBP’, ‘QsC28XylT4-SUMO’ and ‘QsC28XylT4-TrxA’ designate variants of QsC28XylT4 tagged at the N-terminus with the respective MBP, SUMO and TrxA solubility tag.

Fig. 39 Shows a comparison of QA-C3-GGX-C28-FRXX production between the yeasts overexpressing QsC28XylT4-3aa, QsC28XylT4-6aa, QsC28XylT4-9aa, QsC28XylT4-12aa, QsC28XylT4-SUMO, QsC28XylT4-TrxA and QsC28XylT4-MBP (as indicated).

Fig. 40 Shows a comparison of the subcellular localization of unmodified QsC28XylT4 and a fusion variant (‘QsC28XylT3-3xGGGS-QsC28XylT4’) in which QsC28XylT3 is fused at the N-terminus, the two enzymes being separated by a linker (‘3xGGGS’).

Fig. 41 Shows LC-MS extracted ion chromatograms (EIC) for C28-glycosylated QA derivatives (‘QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha’ and ‘QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl’) detected in a yeast engineered to produce QA-C3-GGR-C28-FRX and further overexpressing either QsC28XylT4 (Panel A) or the fusion QsC28XylT3-3xGGGS-QsC28XylT4 (YL-41) (Panel B). Peaks corresponding to QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl are labelled as such.

Fig. 42 Shows the level of protein expression of QsC28XylT4 overexpressed in yeast (measured by fluorescence intensity after flow cytometry) obtained in different culture conditions over a time period of 60h. The yeast culture was either left untreated (‘Control’), or supplemented with galactose or glucose, in the same culture medium (‘old media’) or in fresh medium (‘fresh media’) (as indicated).

Fig. 43 Shows LC-MS extracted ion chromatograms (EIC) for S-2-methylbutyryl-CoA (‘2MB-CoA’). Upper chromatogram was obtained from a ‘2MB-CoA standard’. Middle chromatogram was obtained from a yeast (YL-QSCCL) overexpressing a CoA ligase from Q. saponaria (‘QsCCL’) and exogenously supplemented with 50 mg/L of 2MB acid. Lower chromatogram was obtained from a yeast overexpressing a phosphopantetheinyl transferase from Aspergillus nidulans (‘AnNpgA’), a type I polyketide synthase (PKS) LovF from Aspergillus terreus (‘AstLovF-TE’) and QsCCL, in the absence of any 2MB acid supplemented exogenously. Peaks corresponding to 2MB-CoA are labelled as such.

Fig. 44 Shows LC-MS extracted ion chromatograms (EIC) for acylated and/or glycosylated QA derivatives (‘QA-C3-GGX-C28-FRX-C9’, ‘QA-C3-GGX-C28-FRXX’, ‘QA-C3-GGX-C28-FRXX-C9’ and ‘QA-C3-GGX-C28-FRX’) detected in a yeast (YL-42) engineered to produce QA-C3-GGX-C28-FRX, and further overexpressing chalcone-synthase-like type III polyketide synthases from Q. saponaria (‘ChsD’ and ‘ChsE’), keto-reductases from Q. saponaria (‘KR11’ and ‘KR23’), QsCCL and an acyl transferase from Q. saponaria (‘QsDMOT9’) and exogenously supplemented with 50 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-C28-FRX-C9, QA-C3-GGX-C28-FRXX, QA-C3-GGX-C28-FRXX-C9 and QA-C3-GGX-C28-FRX are labelled as such.

Fig. 45 Shows a comparison of QA-C3-GGX-C28-FRX and QA-C3-GGX-C28-FRX-C9 production obtained from YL-42 in the presence of an increased concentration of 2MB acid supplemented exogenously (as indicated).

Fig. 46 Shows LC-MS extracted ion chromatograms (EIC) for acylated and/or glycosylated QA derivatives (‘QA-C3-GGX-C28-FRX’, ‘QA-C3-GGX-C28-FR-C9’ and ‘QA-C3-GGX-C28-FRXX-C18’) detected in a yeast (YL-43) engineered to produce QA-C3-GGX-C28-FRX-C9, and further overexpressing an acyl transferase from Q. saponaria (‘QsDMOT4’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-C28-FRX, QA-C3-GGX-C28-FR-C9 and QA-C3-GGX-C28-FRXX-C18 are labelled as such.

Fig. 47 Shows LC-MS extracted ion chromatograms (EIC) for acylated and/or glycosylated QA derivatives (‘QA-C3-GGX-C28-FRX’, ‘QA-C3-GGX-C28-FRX-C9’, ‘QA-C3-GGX-C28-FRXX’, ‘QA-C3-GGX-C28-FRXX-C9’, ‘QA-C3-GGX-C28-FRX-C18’ and ‘QA-C3-GGX-C28-FRXX-C18’) detected in a yeast (YL-44) engineered to produce QA-C3-GGX-C28-FRXX-C9, and further overexpressing an acyl transferase from Q. saponaria (‘QsDMOT4’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-C28-FRX, QA-C3-GGX-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9 and QA-C3-GGX-C28-FRXX-C18 are labelled as such.

Fig. 48 Panel A shows a comparison of UDP-Arabinopyranose (‘UDP-Arap’) and UDP-Arabinofuranose (‘UDP-Araf’) production between different yeast strains overexpressing an arabinokinase from A. thaliana (‘AtAraK’) and from Leptospira interrogans (Lei) (‘LeiAraK’), a UDP-sugar pyrophosphorylase from A. thaliana (‘AtUSP’) and from Leptospira interrogans (‘LeiUSP’), an arabinose transporter from Penicillium rubens Wisconsin (‘PrAraT’), and a UDP-arabinose mutase (‘AtUAM1’) in different combinations (as indicated). Panel B shows a comparison of UDP-Arabinopyranose (‘UDP-Arap’) and UDP-Xylose (‘UDP-Xyl’) production between different yeast strains overexpressing a UDP-arabinose mutase from A. thaliana (‘AtUAM1’) and from H. vulgare (‘HvUAM’), two UDP-xylose epimerases from A. thaliana (‘AtUXE’ or ‘AtUXE2’) and a UDP-glucose 4-epimerase from A. thaliana (‘AtUGE3’) in different combinations (as indicated). ‘CP’ is for Cell Pellet.

Fig. 49 Shows LC-MS extracted ion chromatograms (EIC) for acylated and/or glycosylated QA derivatives (‘QA-C3-GGX-C28-FRX’, ‘QA-C3-GGX-C28-FRXX-C9’, ‘QA-C3-GGX-C28-FRX-C18’, ‘QA-C3-GGX-C28-FRX-C18-Araf’, and ‘QS-21-Xyl’ corresponding to QA-C3-GGX-C28-FRXX-C18-Araf) detected in a yeast (YL-45) engineered to produce QA-C3-GGX-C28-FRXX-C18, and further overexpressing a UDP-xylose epimerase from A. thaliana (‘AtUXE’) and a UDP-arabinose mutase from A. thaliana (‘AtUAM1’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-C28-FRX, QA-C3-GGX-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9 and QA-C3-GGX-C28-FRXX-C18 are labelled as such.

Fig. 50 Shows LC-MS extracted ion chromatograms (EIC) for ‘QS-21-Xyl’ (corresponding to QA-C3-GGX-C28-FRXX-C18-Araf). Comparison is made with a QS-21 standard (QS-21 fraction purified from the bark of the Q. saponaria Molina tree), with the two observed peaks matching. The inset

Fig. 51 Shows LC-MS extracted ion chromatograms (EIC) for acylated and/or glycosylated QA derivatives (‘QA-C3-GGX-C28-FRX’, ‘QA-C3-GGX-C28-FRXA’, ‘QA-C3-GGX-C28-FRX-C9’, ‘QA-C3-GGX-C28-FRXX-C9’, ‘QA-C3-GGX-C28-FRX-C18’, ‘QA-C3-GGX-C28-FRXX-C18-Araf’ and ‘QS-21-Api’ corresponding to QA-C3-GGX-C28-FRXA-C18-Araf) detected in a yeast (YL-46) engineered to produce QA-C3-GGX-C28-FRXA-C18, and further overexpressing a UDP-xylose epimerase from A. thaliana (‘AtUXE’), a UDP-arabinose mutase from A. thaliana (‘AtUAM1’) and an arabinofuranose transferase from Q. saponaria (‘QsArafT’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-C28-FRX, QA-C3-GGX-C28-FRX-C9, QA-C3-GGX-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18-Araf, and QS-21-Api are labelled as such.

Fig. 52 Shows LC-MS extracted ion chromatograms (EIC) for acylated and/or glycosylated QA derivatives (‘QA-C3-GGX-FRX’, ‘QA-C3-GGX-C28-FRX-C9’, ‘QA-C3-GGX-C28-FRX-C18’, and ‘QA-C3-GGX-C28-FRX-C18-Xyl’) detected in a yeast (YL-47) engineered to produce QA-C3-GGX-C28-FRX-C18, and further overexpressing an arabinofuranose transferase from Q. saponaria (‘QsArafT’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-C28-FRX-C9, QA-C3-GGX-C28-FRX-C18, and QA-C3-GGX-C28-FRX-C18-Xyl are labelled as such.

Fig. 53 Shows LC-MS extracted ion chromatograms (EIC) for acylated and glycosylated QA derivatives (‘QA-C3-GGX-FRX-C9’, ‘QA-C3-GGX-C28-FRX-C18’ and ‘QA-C3-GGX-C28-FRX-C18-Xyl’) detected in a yeast (YL-48) engineered to produce QA-C3-GGX-C28-FRX-C18, and further overexpressing an arabinofuranose transferase from Q. saponaria (‘QsArafT2’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-FRX-C9 and QA-C3-GGX-C28-FRX-C18 are labelled as such.

Fig. 54 Shows LC-MS extracted ion chromatograms (EIC) for acylated and glycosylated QA derivatives (‘QA-C3-GGX-FRX-C9’, ‘QA-C3-GGX-C28-FRX-C18’, ‘QA-C3-GGX-C28-FRX-C18-Araf’ and ‘QA-C3-GGX-C28-FRXX-C18-Araf’) detected in a yeast (YL-49) engineered to produce QA-C3-GGX-C28-FRXX-C18, and further overexpressing an arabinofuranose transferase from Q. saponaria (‘QsArafT2’) and exogenously supplemented with 500 mg/L of 2MB acid. Peaks corresponding to QA-C3-GGX-FRX-C9, QA-C3-GGX-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18-Araf and QA-C3-GGX-C28-FRXX-C18-Araf are labelled as such.

Fig. 55 Shows LC-MS extracted ion chromatograms (EIC) for acylated and glycosylated QA derivatives (‘QS-21-Xyl’, ‘QA-C3-GGX-FRX-C9’, ‘QA-C3-GGX-C28-FRXX-C9’, ‘QA-C3-GGX-C28-FRX-C18’ and ‘QA-C3-GGX-C28-FRX-C18-Araf’) detected in a yeast (YL-50) engineered to produce QA-C3-GGX-C28-FRXX-C18-Araf, and further overexpressing a phosphopantetheinyl transferase from Aspergillus nidulans (‘AnNpgA’) and a type I polyketide synthase (PKS) LovF from Aspergillus terreus (‘AstLovF-TE’), in the absence of any 2MB acid supplemented exogenously. Peaks corresponding to QS-21-Xyl, QA-C3-GGX-FRX-C9, QA-C3-GGX-C28-FRX-C18 and QA-C3-GGX-C28-FRX-C18-Araf are labelled as such.

Fig. 56 Shows LC-MS extracted ion chromatograms (EIC) for acylated and glycosylated QA derivatives (‘QS-21-Api’ corresponding to QA-C3-GGX-C28-FRXA-C18-Araf, ‘QA-C3-GGX-FRXA’, ‘QA-C3-GGX-C28-FRX-C9’, ‘QA-C3-GGX-C28-FRXA-C9’ and ‘QA-C3-GGX-C28-FRX-C18’) detected in a yeast (YL-51) engineered to produce QA-C3-GGX-C28-FRXX-C18-Araf, and further overexpressing a phosphopantetheinyl transferase from Aspergillus nidulans (‘AnNpgA’) and a type I polyketide synthase (PKS) LovF from Aspergillus terreus (‘AstLovF-TE’), in the absence of any 2MB acid supplemented exogenously. Peaks corresponding to QS-21-Api, QA-C3-GGX-C28-FRX-C9 and QA-C3-GGX-C28-FRX-C18 are labelled as such.

Fig. 57 Shows LC-MS extracted ion chromatograms (EIC) for ‘QA-C3-GlcA-C28-Fuc’ detected in N. benthamiana plants transiently co-expressing a UDP-glucuronic acid transferase from S. vaccaria (‘SvCslG’), a UDP-glucose-4,6-dehydratase from S. vaccaria (‘SvUG46DH’), a UDP-4-keto-6-deoxy-glucose reductase from S. vaccaria (‘SvNMD’), a fucose transferase from Q. saponaria (‘QsFucT’), a fucose transferase from S. vaccaria (‘SvFucT’) and ‘GFP’ (used as negative control) in different combinations (as indicated) and infiltrated with QA (from a commercial source). The peak corresponding to QA-C3-GlcA-C28-Fuc is labelled as such.

Fig. 58 Shows an LC-MS extracted ion chromatogram (EIC) for ‘QA-C3-GlcA-Gal-Xyl’ detected in N. benthamiana plants transiently co-expressing a UDP-glucuronic acid transferase from S. vaccaria (‘SvCslG’), a galactose transferase from S. vaccaria (‘SvGalT’), and a xylose transferase from S. vaccaria (‘SvC3XylT’), and infiltrated with QA (from a commercial source). The peak corresponding to QA-C3-GlcA-Gal-Xyl is labelled as such.

DETAILED DESCRIPTION OF THE INVENTION

Using more than 30 heterologous proteins from different plant and microbial origins spanning across six distinctively different protein types, including in particular a terpene synthase, cytochrome P450 monooxygenases (or ‘CYP oxidases’), nucleotide sugar synthases, sugar transferases, acyltransferases, and polyketide synthases (PKSs), the inventors have been able, for the first time, to reconstitute the metabolic pathway leading to the successful biosynthesis of QS-21 in Saccharomyces cerevisiae, starting from a simple sugar, galactose.

Quillaic acid (QA), the triterpene core of QS-21, derives from the simple triterpene β-amyrin, which is synthesised through cyclisation of the universal linear precursor 2,3-oxidosqualene (OS) (according to the mevalonate pathway which is native to yeast - Wong et al. 2018), by an oxidosqualene cyclase (OSC), also referred to as a β-amyrin synthase (‘BAS’) (see Fig. 2B). This β-amyrin scaffold is further oxidised with a carboxylic acid, alcohol and aldehyde at the C28, C16 and C23 positions, respectively, by a series of three CYP oxidases, resulting in the formation of quillaic acid (QA) (see Fig. 2B).
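The three oxidations described above can be followed as simple atom bookkeeping. The sketch below is illustrative only: the starting formula of β-amyrin (C30H50O), the per-step atom deltas and the function names are assumptions introduced here, not values taken from this specification. It shows why hederagenin (C23 alcohol) and echinocystic acid (C16 alcohol), both built on oleanolic acid, share one molecular formula, which is consistent with the combined ‘Hed/EA’ chromatogram label used in Figs. 8 and 9.

```python
# Illustrative atom bookkeeping; names and tables are hypothetical helpers.
# Net change in atom counts for each oxidation of the triterpene scaffold.
OXIDATION_DELTAS = {
    "hydroxyl": {"O": 1},           # C-H -> C-OH
    "aldehyde": {"O": 1, "H": -2},  # CH3 -> CHO
    "carboxyl": {"O": 2, "H": -2},  # CH3 -> COOH
}

# Assumed molecular formula of beta-amyrin (not stated in the text).
BETA_AMYRIN = {"C": 30, "H": 50, "O": 1}

# Monoisotopic atomic masses in Da.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def oxidise(formula, *groups):
    """Return a new formula after applying the named oxidations."""
    result = dict(formula)
    for group in groups:
        for atom, delta in OXIDATION_DELTAS[group].items():
            result[atom] = result.get(atom, 0) + delta
    return result


def formula_string(formula):
    """Render a formula dict as a string such as 'C30H46O5'."""
    return "".join(f"{atom}{formula[atom]}" for atom in ("C", "H", "O"))


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    return sum(MONOISOTOPIC_MASS[atom] * n for atom, n in formula.items())


oleanolic_acid = oxidise(BETA_AMYRIN, "carboxyl")              # C28 acid
echinocystic_acid = oxidise(oleanolic_acid, "hydroxyl")        # + C16 alcohol
hederagenin = oxidise(oleanolic_acid, "hydroxyl")              # + C23 alcohol
gypsogenin = oxidise(oleanolic_acid, "aldehyde")               # + C23 aldehyde
quillaic_acid = oxidise(oleanolic_acid, "hydroxyl", "aldehyde")

for name, f in [("oleanolic acid", oleanolic_acid),
                ("hederagenin", hederagenin),
                ("echinocystic acid", echinocystic_acid),
                ("gypsogenin", gypsogenin),
                ("quillaic acid", quillaic_acid)]:
    print(f"{name}: {formula_string(f)}  {monoisotopic_mass(f):.4f} Da")
```

The bookkeeping ignores regiochemistry, so isomers such as hederagenin and echinocystic acid are indistinguishable here, just as they are by mass alone.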

Next, UDP-Glucuronic acid (‘UDP-GlcA’), UDP-galactose (‘UDP-Gal’), and UDP-Xylose (‘UDP-Xyl’) or UDP-Rhamnose (‘UDP-Rha’), are incorporated at the C3 position of QA by respective glycosyltransferases resulting in the formation of C3-glycosylated QA derivatives (see Fig. 2C). Such C3-glycosylated QA derivatives are individually referred to herein as follows:

- QA-C3-GlcA or QA-C3-G

- QA-C3-GlcA-Gal or QA-C3-GG

- QA-C3-GlcA-Gal-Rha or QA-C3-GGR

- QA-C3-GlcA-Gal-Xyl or QA-C3-GGX

The formulae of these derivatives are provided in Table 1.

Next, UDP-fucose (‘UDP-Fuc’), UDP-Rha, UDP-Xyl, and a second UDP-Xyl or a UDP-Api, are incorporated at the C28 position of QA by respective glycosyltransferases resulting in the formation of C28-glycosylated QA derivatives (see Fig. 2D). Such C28-glycosylated QA derivatives are individually referred to herein as follows:

- QA-C3-GlcA-Gal-Xyl-C28-Fuc or QA-C3-GGX-C28-F

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha or QA-C3-GGX-C28-FR

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl or QA-C3-GGX-C28-FRX

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl or QA-C3-GGX-C28-FRXX

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api or QA-C3-GGX-C28-FRXA

- QA-C3-GlcA-Gal-Rha-C28-Fuc or QA-C3-GGR-C28-F

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha or QA-C3-GGR-C28-FR

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl or QA-C3-GGR-C28-FRX

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl or QA-C3-GGR-C28-FRXX

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api or QA-C3-GGR-C28-FRXA

The formulae of these derivatives are provided in Table 1.
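By way of illustration only, the abbreviated names listed above appear to follow a simple rule: each sugar of a chain is replaced by a one-letter code and the codes of one chain are written together. The sketch below reproduces that rule for the C3 and C28 names listed so far. The code table and the function name `short_name` are assumptions inferred from the paired names, not definitions taken from the specification, and later names in which a sugar is written out in full after the acyl chain (e.g. ‘-C18-Xyl’) are outside its scope.

```python
# One-letter codes inferred from the paired long/short names listed above
# (an assumption for illustration; the specification gives no explicit table).
SUGAR_CODES = {"GlcA": "G", "Gal": "G", "Xyl": "X", "Rha": "R", "Fuc": "F", "Api": "A"}


def short_name(long_name: str) -> str:
    """Collapse e.g. 'QA-C3-GlcA-Gal-Xyl-C28-Fuc' into 'QA-C3-GGX-C28-F'."""
    parts = []   # finished name parts, joined with '-' at the end
    run = ""     # one-letter codes collected for the sugar chain being read
    for token in long_name.split("-"):
        if token in SUGAR_CODES:
            run += SUGAR_CODES[token]
        else:
            if run:              # a non-sugar token closes the current chain
                parts.append(run)
                run = ""
            parts.append(token)
    if run:
        parts.append(run)
    return "-".join(parts)


if __name__ == "__main__":
    print(short_name("QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api"))
```

Tokens that are not sugars (‘QA’, ‘C3’, ‘C28’) are passed through unchanged, which is why the position labels survive in the abbreviated name.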

Biosynthesis of the 18-carbon pseudo-dimeric acyl chain is achieved by condensing malonyl-CoA (which is native to yeast) with S-2-methylbutyryl-CoA (‘2MB-CoA’) to make C9-CoA using a type I polyketide synthase (‘PKS’), a carboxyl coenzyme A ligase (‘CCL’), type III PKSs and keto-reductases (‘KRs’) (see Fig. 2A).

Next, two repeating C9-CoA acyl units are successively transferred by two acyltransferases, leading to the addition of the 18-carbon pseudo-dimeric acyl chain to the fucose residue of the linear tetrasaccharide at the C28 position and resulting in the formation of acylated and glycosylated QA derivatives. Such acylated and glycosylated QA derivatives are individually referred to herein as follows:

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-C9 or QA-C3-GGX-C28-FRX-C9

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl-C9 or QA-C3-GGX-C28-FRXX-C9

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api-C9 or QA-C3-GGX-C28-FRXA-C9

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-C9 or QA-C3-GGR-C28-FRX-C9

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl-C9 or QA-C3-GGR-C28-FRXX-C9

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api-C9 or QA-C3-GGR-C28-FRXA-C9

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-C18 or QA-C3-GGX-C28-FRX-C18

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl-C18 or QA-C3-GGX-C28-FRXX-C18

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api-C18 or QA-C3-GGX-C28-FRXA-C18

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-C18 or QA-C3-GGR-C28-FRX-C18

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl-C18 or QA-C3-GGR-C28-FRXX-C18

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api-C18 or QA-C3-GGR-C28-FRXA-C18

The formulae of these derivatives are provided in Table 1.

Next, UDP-arabinofuranose (‘UDP-Araf’), or UDP-Xyl, is incorporated at the end of the 18-carbon pseudo-dimeric acyl chain (on the 5-hydroxy functional group of the second C9-CoA acyl unit), resulting in the formation of further acylated and glycosylated QA derivatives (see Fig. 2E), including the two principal isomers of QS-21 found in the QS-21 fraction traditionally purified from the bark of the Q. saponaria Molina tree, and their rhamnose chemotype variants (see Fig. 1). Such acylated and glycosylated QA derivatives are individually referred to herein as follows:

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-C18-Araf or QA-C3-GGX-C28-FRX-C18-Araf

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl-C18-Araf or QA-C3-GGX-C28-FRXX-C18-Araf or QS-21-Xyl

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api-C18-Araf or QA-C3-GGX-C28-FRXA-C18-Araf or QS-21-Api

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-C18-Araf or QA-C3-GGR-C28-FRX-C18-Araf

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl-C18-Araf or QA-C3-GGR-C28-FRXX-C18-Araf

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api-C18-Araf or QA-C3-GGR-C28-FRXA-C18-Araf

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-C18-Xyl or QA-C3-GGX-C28-FRX-C18-Xyl

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl-C18-Xyl or QA-C3-GGX-C28-FRXX-C18-Xyl

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api-C18-Xyl or QA-C3-GGX-C28-FRXA-C18-Xyl

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-C18-Xyl or QA-C3-GGR-C28-FRX-C18-Xyl

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl-C18-Xyl or QA-C3-GGR-C28-FRXX-C18-Xyl

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api-C18-Xyl or QA-C3-GGR-C28-FRXA-C18-Xyl

The formulae of these derivatives are provided in Table 1.

‘C3-glycosylated QA derivative’ designates, in the sense of the invention, a QA derivative including at least a glucuronic acid residue at position C3 (as listed above).

‘C28-glycosylated QA derivative’ designates, in the sense of the invention, a QA derivative including all three sugars of the branched trisaccharide at position C3 and at least the fucose residue of the linear tetrasaccharide at position C28 (as listed above).

‘Acylated and glycosylated QA derivative’ designates, in the sense of the invention, a QA derivative including all three sugars of the branched trisaccharide at position C3, at least the first three sugars of the linear tetrasaccharide at position C28, at least one C9-CoA acyl unit (‘C9’) attached to the fucose residue and, optionally, when two C9-CoA acyl units (‘C18’) are attached, an arabinofuranose residue (as listed above).
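For illustration, the three definitions above can be read as nested conditions on what is attached at C3, at C28 and on the fucose residue. The following sketch encodes that reading. The class `QADerivative`, its field names and the function `classify` are hypothetical names introduced for this example only, and the choice to return the most specific matching term is an assumption (an acylated derivative as defined above also contains the sugars required of a C28-glycosylated derivative).

```python
from dataclasses import dataclass

# The two branched C3 trisaccharides named in the description (Xyl or Rha variant).
C3_TRISACCHARIDES = (("GlcA", "Gal", "Xyl"), ("GlcA", "Gal", "Rha"))


@dataclass(frozen=True)
class QADerivative:
    """Hypothetical record of what is attached to the quillaic acid core."""
    c3: tuple = ()      # sugars at C3, in order of attachment
    c28: tuple = ()     # sugars at C28, in order of attachment
    c9_units: int = 0   # 0, 1 ('C9') or 2 ('C18') acyl units on the fucose


def classify(d: QADerivative) -> str:
    """Return the most specific of the three defined terms that applies."""
    full_c3 = d.c3 in C3_TRISACCHARIDES
    if full_c3 and d.c28[:3] == ("Fuc", "Rha", "Xyl") and d.c9_units >= 1:
        return "acylated and glycosylated QA derivative"
    if full_c3 and d.c28[:1] == ("Fuc",):
        return "C28-glycosylated QA derivative"
    if d.c3[:1] == ("GlcA",):
        return "C3-glycosylated QA derivative"
    return "unclassified"


if __name__ == "__main__":
    qs21_api_core = QADerivative(("GlcA", "Gal", "Xyl"), ("Fuc", "Rha", "Xyl", "Api"), 2)
    print(classify(qs21_api_core))
```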

In the sense of the present invention, ‘heterologous genes’ is to be understood as genes not naturally expressed in yeast.

In the sense of the present invention, ‘a yeast engineered to produce’ e.g. a sugar or a QA derivative is to be understood as a yeast overexpressing the heterologous genes encoding the enzymes or proteins necessary for the biosynthesis or production of the respective sugar or QA derivative, e.g. as described in the respective methods of the first to tenth aspects of the invention.

QA production and production optimization

WO 19/122259 reports the identification of enzymes in the Q. saponaria genome involved in the biosynthesis of QA and the production of QA in Nicotiana benthamiana engineered with such enzymes. WO 20/263524 reports the production of traces of QA in yeast engineered with enzymes originating from different plant origins. The content of both WO 19/122259 and WO 20/263524 is incorporated herein by reference.

β-amyrin

The first step of the method of the first aspect of the invention is the cyclisation of 2,3-oxidosqualene to form β-amyrin. This step is carried out by an oxidosqualene cyclase or β-amyrin synthase (BAS). Any heterologous β-amyrin synthase from any plant origin capable of producing β-amyrin may suitably be used in the method of the invention. For example, β-amyrin synthases (BAS) from Artemisia annua (A. annua or ‘Aa’), Arabidopsis thaliana (A. thaliana or ‘At’), Glycyrrhiza glabra (G. glabra or ‘Gg’), Gypsophila vaccaria (G. vaccaria or ‘Gv’), Medicago truncatula (M. truncatula or ‘Mt’), Quillaja saponaria (Q. saponaria or ‘Qs’), or Saponaria vaccaria (S. vaccaria or ‘Sv’) may be used. In some embodiments, the method of the first aspect of the invention uses a β-amyrin synthase selected from the foregoing plants. In particular, the β-amyrin synthase may be selected from AaBAS according to SEQ ID NO: 1, AtBAS according to SEQ ID NO: 4, GgBAS according to SEQ ID NO: 7, GvBAS according to SEQ ID NO: 10, QsBAS according to SEQ ID NO: 15 and SvBAS according to SEQ ID NO: 13. Advantageously, the β-amyrin synthase is GvBAS according to SEQ ID NO: 10.

AaBAS, AtBAS, GgBAS, GvBAS, QsBAS or SvBAS may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 1, SEQ ID NO: 4, SEQ ID NO: 7, SEQ ID NO: 10, SEQ ID NO: 15 or SEQ ID NO: 13, respectively.
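As an illustrative aid, the sketch below shows one common way of expressing sequence identity as a percentage and of checking it against the thresholds recited above. This passage does not state how identity is to be calculated, so the convention used here (identical residues divided by the number of aligned columns, gaps counting as mismatches) is an assumption, as is the premise that the two sequences have already been aligned by some external tool. The function names are hypothetical.

```python
# Identity thresholds recited in the description.
THRESHOLDS = (70, 80, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical residues / aligned columns, where columns
    that are gaps in both sequences are ignored and a gap in only one
    sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """Return the recited thresholds that a given identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy peptide fragments, not sequences from the sequence listing.
    identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
    print(identity, thresholds_met(identity))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so any real comparison should state which one is used.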

Quillaic acid (QA)

As described earlier, β-amyrin is successively further oxidized to carry a carboxylic acid group, a hydroxyl group and an aldehyde group at the C28, C16 and C23 positions, respectively, by corresponding cytochrome P450 (CYP) oxidases, resulting in the formation of QA.

Any heterologous CYP oxidase from any plant origin previously identified and reported to be effectively capable of functionalizing the respective C28, C16 and C23 positions of β-amyrin may be used in the methods and engineered yeasts of the invention (e.g. as described and reported in WO 19/122259 or WO 2020/263524, or Gosh, 2017 for a review, the content of which is incorporated by reference). In some embodiments, the method of the first aspect of the invention uses a CYP C16 oxidase, a CYP C23 oxidase and a CYP C28 oxidase independently selected from A. annua, A. thaliana, G. glabra, M. truncatula, Q. saponaria, S. vaccaria, Centella asiatica, Bupleurum falcatum and Maesa lanceolata.

In further embodiments, the CYP C16 oxidase is selected from CYP87D16 and CYP716Y1; the CYP C23 oxidase is selected from CYP72A68 and CYP714E19; the CYP C28 oxidase is selected from CYP716A1, CYP716A12, CYP716A15, CYP716A17, CYP716A44, CYP716A46, CYP716A52v2, CYP716A75, CYP716A78, CYP716A79, CYP716A80, CYP716A81, CYP716A83, CYP716A86, CYP716A110, CYP716A140, CYP716A179, CYP716A252, CYP716A253 and CYP716AL1.

In further embodiments, the CYP C16 oxidase is selected from BfC16 according to SEQ ID NO: 17, QsC16 according to SEQ ID NO: 20, QsC28C16 according to SEQ ID NO: 23, and SvC16 according to SEQ ID NO: 26. BfC16, QsC16, QsC28C16 and SvC16 may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 17, SEQ ID NO: 20, SEQ ID NO: 23 and SEQ ID NO: 26, respectively.

In further embodiments, the CYP C23 oxidase is selected from MtC23 oxidase according to SEQ ID NO: 38, QsC23 according to SEQ ID NO: 29, SvC23-1 according to SEQ ID NO: 32, and SvC23-2 according to SEQ ID NO: 35. MtC23, QsC23, SvC23-1 and SvC23-2 may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 38, SEQ ID NO: 29, SEQ ID NO: 32 and SEQ ID NO: 35, respectively.

In further embodiments, the CYP C28 oxidase is selected from MtC28 according to SEQ ID NO: 46, QsC28 according to SEQ ID NO: 41, and SvC28 according to SEQ ID NO: 44. MtC28, QsC28 and SvC28 may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 46, SEQ ID NO: 41 and SEQ ID NO: 44, respectively.

Heterologous redox partners, such as cytochrome P450 reductase (CPR) and/or cytochrome b5, may be further co-expressed in the method of the first aspect of the invention. For example, the CPR may be selected from A. thaliana and Lotus japonicus. In some embodiments, the CPR is selected from AtATR1 according to SEQ ID NO: 49 and LjCPR according to SEQ ID NO: 52.

Heterologous cytochrome b5 may be selected from A. thaliana, Q. saponaria and S. vaccaria. In some embodiments, cytochrome b5 is selected from Atb5 according to SEQ ID NO: 58, Qsb5 according to SEQ ID NO: 55 and Svb5 according to SEQ ID NO: 61. Atb5, Qsb5 and Svb5 may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 58, SEQ ID NO: 55 and SEQ ID NO: 61, respectively.

Heterologous scaffold proteins (which allow the P450 enzymes to be physically organized) may be further co-expressed in the method of the first aspect of the invention. The scaffold protein may be a membrane steroid-binding protein (MSBP). For example, the MSBP may be selected from A. thaliana, Q. saponaria, and S. vaccaria. In some embodiments, the MSBP is selected from AtMSBP1 according to SEQ ID NO: 63, AtMSBP2 according to SEQ ID NO: 65, QsMSBP1 according to SEQ ID NO: 73, SvMSBP1 according to SEQ ID NO: 67 and SvMSBP2 according to SEQ ID NO: 70. AtMSBP1, AtMSBP2, QsMSBP1, SvMSBP1 and SvMSBP2 may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 63, SEQ ID NO: 65, SEQ ID NO: 73, SEQ ID NO: 67 and SEQ ID NO: 70, respectively.

The first aspect of the invention also provides a yeast which is engineered to produce QA.

C3 glycosylated QA derivatives production

As described earlier, a branched trisaccharide consisting of GlcA, Gal and Xyl (or Rha) is attached at the C3 position of QA.

Non-native sugar production

The method according to the second aspect of the invention comprises the step of overexpressing a heterologous gene encoding a UDP-glucose dehydrogenase (UGD) converting UDP-glucose (UDP-Glc) into UDP-GlcA. UGD from different origins may be used. In some embodiments, the UGD is selected from A. thaliana, Synechococcus sp. (Syn), Homo sapiens (Hs), Paramoeba atlantica (Patl), Bacillus cytotoxicus (Bcyt), Corallococcus macrosporus (Myxfulv), and Pyrococcus furiosus (Pfu). In further embodiments, the UGD is selected from AtUGD according to SEQ ID NO: 84, AtUGD_A101L according to SEQ ID NO: 108, SynUGD according to SEQ ID NO: 154, HsUGD_A104L according to SEQ ID NO: 157, PatlUGD according to SEQ ID NO: 110, BcytUGD according to SEQ ID NO: 160, MyxfulvUGD according to SEQ ID NO: 163 and PfuUGD according to SEQ ID NO: 166. AtUGD, AtUGD_A101L, SynUGD, HsUGD_A104L, PatlUGD, BcytUGD, MyxfulvUGD and PfuUGD may alternatively be according to sequences at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 84, SEQ ID NO: 108, SEQ ID NO: 154, SEQ ID NO: 157, SEQ ID NO: 110, SEQ ID NO: 160, SEQ ID NO: 163 and SEQ ID NO: 166, respectively.

The second aspect of the invention also provides a yeast which is engineered to produce UDP-GlcA.

The first step of the method of the third aspect of the invention is the overexpression of a heterologous gene encoding a UDP-rhamnose synthase. A UDP-rhamnose synthase from different plant origins may be used. In some embodiments, the UDP-rhamnose synthase is AtRHM2 from A. thaliana according to SEQ ID NO: 102, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 102.

The third aspect of the invention also provides a yeast which is engineered to produce UDP-Rha.

The first step of the method of the fourth aspect of the invention is the overexpression of a heterologous gene encoding a UDP-glucose dehydrogenase (UGD) converting UDP-glucose (UDP-Glc) into UDP-GlcA. The UGD may be any of the UGD described earlier in the method of the second aspect of the invention. The second step of the method of the fourth aspect of the invention is the overexpression of a heterologous gene encoding a UDP-xylose synthase (UXS). UDP-Xyl may be produced by decarboxylation of UDP-GlcA by a UDP-Xyl synthase (UXS) and/or by a dual UDP-Api/Xyl synthase (AXS). The UDP-xylose synthase and dual UDP-Api/Xyl synthase may be from different plant origins, e.g. from A. thaliana and Q. saponaria. In some embodiments, the UXS is selected from AtUXS encoded by SEQ ID NO: 105 and QsAXS encoded by SEQ ID NO: 113, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 105 and SEQ ID NO: 113, respectively.

The fourth aspect of the invention also provides a yeast which is engineered to produce UDP-Xyl.

QA-C3-GlcA production

As shown in Fig. 2C, UDP-GlcA is transferred and a GlcA residue is attached at the C3 position of QA by a glucuronosyl transferase (GlcAT). The first step of the method of the fifth aspect of the invention is the overexpression of a heterologous gene encoding a glucuronosyl transferase (GlcAT), in a yeast engineered to produce QA and UDP-GlcA. The yeast engineered to produce QA may be a yeast according to the first aspect of the invention. The yeast engineered to produce UDP-GlcA may be a yeast according to the second aspect of the invention. The GlcAT may be from any plant origin, for example, may be selected from Q. saponaria and S. vaccaria. In some embodiments, the GlcAT is selected from QsCslG1 according to SEQ ID NO: 78, QsCslG2 according to SEQ ID NO: 81, and SvCslG according to SEQ ID NO: 76, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 78, SEQ ID NO: 81 and SEQ ID NO: 76, respectively.

The fifth aspect of the invention also provides a yeast which is engineered to produce QA-C3-GlcA (aspect 5a).

QA-C3-GlcA-Gal production

As shown in Fig. 2C, UDP-Gal is transferred and a Gal residue is attached at the C3 position of QA by a galactose transferase (GalT). The second step of the method of the fifth aspect of the invention is the overexpression of a heterologous gene encoding a galactose transferase (GalT), in a yeast engineered to produce QA-C3-GlcA. The yeast engineered to produce QA-C3-GlcA may be a yeast according to the fifth aspect of the invention. The GalT may be from any plant origin, for example, may be selected from Q. saponaria and S. vaccaria. In some embodiments, the GalT is selected from QsGalT according to SEQ ID NO: 116 and SvGalT according to SEQ ID NO: 98, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 116 and SEQ ID NO: 98, respectively.

The fifth aspect of the invention also provides a yeast which is engineered to produce QA-C3-GlcA-Gal (aspect 5b).

QA-C3-GlcA-Gal-Rha production

As shown in Fig. 2C, UDP-Rha is transferred and a Rha residue is attached at the C3 position of QA by a rhamnose transferase (RhaT). The third step of the method of the fifth aspect of the invention is the overexpression of a heterologous gene encoding a rhamnose transferase (RhaT), in a yeast engineered to produce QA-C3-GlcA-Gal and UDP-Rha. The yeast engineered to produce QA-C3-GlcA-Gal may be a yeast according to the fifth aspect of the invention. The yeast engineered to produce UDP-Rha may be a yeast according to the third aspect of the invention.

The RhaT may be from any plant origin, for example, may be from Q. saponaria. In some embodiments, the RhaT is QsRhaT according to SEQ ID NO: 119, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 119.

The fifth aspect of the invention also provides a yeast which is engineered to produce QA-C3-GlcA-Gal-Rha (aspect 5c).

QA-C3-GlcA-Gal-Xyl production

As shown in Fig. 2C, UDP-Xyl is transferred and a Xyl residue is attached at the C3 position of QA by a xylose transferase (XylT). An alternative third step of the method of the fifth aspect of the invention is the overexpression of a heterologous gene encoding a xylose transferase (XylT), in a yeast engineered to produce QA-C3-GlcA-Gal and UDP-Xyl. The yeast engineered to produce QA-C3-GlcA-Gal may be a yeast according to the aspect 5b of the invention. The yeast engineered to produce UDP-Xyl may be a yeast according to the fourth aspect of the invention. The XylT may be from any plant origin, for example, may be from Q. saponaria or S. vaccaria. In some embodiments, the XylT is selected from QsC3XylT according to SEQ ID NO: 122 and SvC3XylT according to SEQ ID NO: 100, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 122 and SEQ ID NO: 100, respectively.

The fifth aspect of the invention also provides a yeast which is engineered to produce QA-C3-GlcA-Gal-Xyl (aspect 5d).

C28-glycosylated QA derivatives production

As described earlier, a linear tetrasaccharide consisting of Fuc, Rha, Xyl and a second Xyl or an Api (FRXX/A) is attached at the C28 position of QA.

UDP-Fuc production

The first step of the method of the sixth aspect of the invention is the overexpression of heterologous genes encoding a UDP-glucose-4,6-dehydratase (UG46DH) converting UDP-Glc into UDP-4-keto-6-deoxy-glucose and a 4-keto-reductase converting UDP-4-keto-6-deoxy-glucose into UDP-D-Fuc. The UG46DH and 4-keto-reductase may be from any plant origin, for example, may be selected independently from Q. saponaria and S. vaccaria. In some embodiments, the UG46DH is SvUG46DH according to SEQ ID NO: 87, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 87. In some embodiments, the 4-keto-reductase is selected from SvNMD according to SEQ ID NO: 90 and QsFucSyn according to SEQ ID NO: 175, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 90 and SEQ ID NO: 175, respectively.

The sixth aspect of the invention also provides a yeast which is engineered to produce UDP-Fuc.

QA-C3-GGX-C28-F and QA-C3-GGR-C28-F production

As shown in Fig. 2D, UDP-Fuc is transferred and a Fuc residue is attached at the C28 position of QA by a fucose transferase (FucT). The first step of the method of the seventh aspect of the invention is the overexpression of a heterologous gene encoding a fucose transferase (FucT), in a yeast engineered to produce UDP-Fuc and either QA-C3-GlcA-Gal-Rha or QA-C3-GlcA-Gal-Xyl. The yeast engineered to produce QA-C3-GlcA-Gal-Rha may be a yeast according to the aspect 5c of the invention. The yeast engineered to produce QA-C3-GlcA-Gal-Xyl may be a yeast according to the aspect 5d of the invention. The yeast engineered to produce UDP-Fuc may be a yeast according to the sixth aspect of the invention. The FucT may be selected from Q. saponaria and S. vaccaria. In some embodiments, the FucT is selected from QsFucT according to SEQ ID NO: 93 and SvFucT according to SEQ ID NO: 96, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 93 and SEQ ID NO: 96, respectively.

The seventh aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGR-C28-F, or QA-C3-GGX-C28-F (aspect 7a).

QA-C3-GGX-C28-FR and QA-C3-GGR-C28-FR production

As shown in Fig. 2D, UDP-Rha is transferred and a Rha residue is attached at the C28 position of QA by a rhamnose transferase (RhaT). The second step of the method of the seventh aspect of the invention is the overexpression of a heterologous gene encoding a rhamnose transferase (RhaT), in a yeast engineered to produce QA-C3-GGR-C28-F, or QA-C3-GGX-C28-F. The yeast engineered to produce QA-C3-GGR-C28-F or QA-C3-GGX-C28-F may be a yeast according to the aspect 7a of the invention. The RhaT may be the same as described in the method of the fifth aspect of the invention.

The seventh aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGR-C28-FR, or QA-C3-GGX-C28-FR (aspect 7b).

QA-C3-GGX-C28-FRX and QA-C3-GGR-C28-FRX production

As shown in Fig. 2D, UDP-Xyl is transferred and a Xyl residue is attached at the C28 position of QA by a xylose transferase (XylT). The third step of the method of the seventh aspect of the invention is the overexpression of a heterologous gene encoding a xylose transferase (XylT), in a yeast engineered to produce QA-C3-GGR-C28-FR, or QA-C3-GGX-C28-FR. The yeast engineered to produce QA-C3-GGR-C28-FR or QA-C3-GGX-C28-FR may be a yeast according to the aspect 7b of the invention. The XylT may be selected from Q. saponaria and S. vaccaria. In some embodiments, the XylT is QsC28XylT3 according to SEQ ID NO: 125, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 125.

The seventh aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGR-C28-FRX, or QA-C3-GGX-C28-FRX (aspect 7c).

QA-C3-GGX-C28-FRXX and QA-C3-GGR-C28-FRXX production

As shown in Fig. 2D, UDP-Xyl is transferred and a Xyl residue is attached at the C28 position of QA by a xylose transferase (XylT). The fourth step of the method of the seventh aspect of the invention is the overexpression of a heterologous gene encoding a xylose transferase (XylT), in a yeast engineered to produce QA-C3-GGR-C28-FRX, or QA-C3-GGX-C28-FRX. The yeast engineered to produce QA-C3-GGR-C28-FRX or QA-C3-GGX-C28-FRX may be a yeast according to the aspect 7c of the invention. The XylT may be from Q. saponaria. In some embodiments, the XylT is QsC28XylT4 according to SEQ ID NO: 128, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 128. In further embodiments, the XylT is selected from QsC28XylT4-3aa according to SEQ ID NO: 131, QsC28XylT4-6aa according to SEQ ID NO: 134, QsC28XylT4-9aa according to SEQ ID NO: 137, and QsC28XylT4-12aa according to SEQ ID NO: 140, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 131, SEQ ID NO: 134, SEQ ID NO: 137 and SEQ ID NO: 140, respectively. In further embodiments, the XylT is selected from SUMO-QsC28XylT4 according to SEQ ID NO: 143, TrxA-QsC28XylT4 according to SEQ ID NO: 145, and MBP-QsC28XylT4 according to SEQ ID NO: 147. In further embodiments, the XylT is QsC28XylT3-3xGGGS-QsC28XylT4 according to SEQ ID NO: 149.

The seventh aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGR-C28-FRXX, or QA-C3-GGX-C28-FRXX (aspect 7d).

QA-C3-GGX-C28-FRXA and QA-C3-GGR-C28-FRXA production

As shown in Fig. 2D, UDP-Api is transferred and an Api residue is attached at the C28 position of QA by an apiose transferase (ApiT). An alternative fourth step of the method of the seventh aspect of the invention is the overexpression of heterologous genes encoding a UDP-apiose synthase (AXS) converting UDP-GlcA into UDP-Api and an apiose transferase (ApiT), in a yeast engineered to produce QA-C3-GGR-C28-FRX, or QA-C3-GGX-C28-FRX. The yeast engineered to produce QA-C3-GGR-C28-FRX or QA-C3-GGX-C28-FRX may be a yeast according to the aspect 7c of the invention. The ApiT and AXS may be independently selected from Q. saponaria and S. vaccaria. In some embodiments, the AXS is QsAXS according to SEQ ID NO: 113, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 113. In some embodiments, the ApiT is QsC28ApiT4 according to SEQ ID NO: 151, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 151.

The seventh aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGR-C28-FRXA, or QA-C3-GGX-C28-FRXA (aspect 7e).
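By way of illustration, the steps of the seventh aspect can be viewed as a chain in which each sub-aspect builds on an earlier one, with aspects 7d and 7e as alternatives that both start from aspect 7c. The sketch below records that dependency and derives the C28 sugar chain reached at each sub-aspect. The table layout and the function names `route_to` and `c28_chain` are hypothetical, and the table only restates the transferase classes named above rather than any particular sequence.

```python
# aspect -> (aspect it builds on, residue added at C28, transferase class)
C28_STEPS = {
    "7a": (None, "Fuc", "FucT"),
    "7b": ("7a", "Rha", "RhaT"),
    "7c": ("7b", "Xyl", "XylT"),
    "7d": ("7c", "Xyl", "XylT"),
    "7e": ("7c", "Api", "ApiT"),   # alternative fourth step (also needs an AXS)
}

ONE_LETTER = {"Fuc": "F", "Rha": "R", "Xyl": "X", "Api": "A"}


def route_to(aspect: str) -> list:
    """List (aspect, residue, transferase) tuples from 7a up to `aspect`."""
    route = []
    while aspect is not None:
        previous, residue, enzyme = C28_STEPS[aspect]
        route.append((aspect, residue, enzyme))
        aspect = previous
    return route[::-1]   # collected backwards, so reverse into pathway order


def c28_chain(aspect: str) -> str:
    """One-letter C28 chain obtained at a given sub-aspect, e.g. 'FRXA'."""
    return "".join(ONE_LETTER[residue] for _, residue, _ in route_to(aspect))


if __name__ == "__main__":
    for aspect in C28_STEPS:
        print(aspect, c28_chain(aspect), [e for _, _, e in route_to(aspect)])
```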

Production and attachment of the 18-carbon pseudo-dimeric acyl chain terminated with an arabinofuranose (C18-Araf)

As shown in Fig. 2A, the biosynthesis of the 18-carbon pseudo-dimeric acyl chain is achieved by condensing malonyl-CoA with 2MB-CoA to make C9-CoA.

2MB-CoA production

The method of the eighth aspect of the invention comprises the step of overexpressing a heterologous gene encoding a carboxyl coenzyme A (CoA) ligase (CCL) converting 2-methylbutyric acid (2MB acid) into 2MB-CoA, wherein 2MB acid is supplemented exogenously. The CCL may be from any plant origin. In some embodiments, the CCL is QsCCL from Q. saponaria according to SEQ ID NO: 178, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 178. In an alternative embodiment (which does not require any exogenous supply of 2MB acid), the method further comprises overexpressing heterologous genes encoding the following enzymes:

(i) a phosphopantetheinyl (Ppant) transferase,

(ii) a megasynthase LovF-TE including an ACP domain, condensing two units of malonyl-CoA to 2MB-ACP and cleaving 2MB acid from the ACP domain, the 2MB acid then being converted into 2MB-CoA by the CCL.

The Ppant transferase may be from Aspergillus nidulans and the megasynthase LovF-TE may be from Aspergillus terreus. In some embodiments, the Ppant transferase is AnNpgA according to SEQ ID NO: 237 and the megasynthase LovF-TE is AstLovF-TE according to SEQ ID NO: 235, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 237 and SEQ ID NO: 235, respectively.

The eighth aspect of the invention also provides a yeast which is engineered to produce 2MB-CoA.

UDP-Arabinofuranose production

The method according to the ninth aspect of the invention comprises the step of overexpressing, in a yeast engineered to produce UDP-Xyl, heterologous genes encoding the following enzymes:

(i) a UDP-Xyl epimerase (UXE) converting UDP-Xyl into UDP-Arabinopyranose (UDP-Arap), and

(ii) a UDP-arabinose mutase (UAM) converting UDP-Arap into UDP-Arabinofuranose (UDP-Araf).

The yeast engineered to produce UDP-Xyl may be according to the fourth aspect of the invention.

The UXE and the UAM may be independently selected from A. thaliana and H. vulgare. In some embodiments, the UXE is selected from AtUXE according to SEQ ID NO: 199, AtUXE2 according to SEQ ID NO: 202, HvUXE-1 according to SEQ ID NO: 240, HvUXE-2 according to SEQ ID NO: 242 and AtUGE3 according to SEQ ID NO: 205, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 199, SEQ ID NO: 202, SEQ ID NO: 240, SEQ ID NO: 242 and SEQ ID NO: 205, respectively. In some embodiments, the UAM is selected from AtUAM1 according to SEQ ID NO: 208 and HvUAM according to SEQ ID NO: 211, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 208 and SEQ ID NO: 211, respectively.

The ninth aspect of the invention also provides a yeast which is engineered to produce UDP-Arabinofuranose.
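For illustration, the nucleotide sugar conversions described in the second, fourth, sixth, seventh and ninth aspects can be collected into a small lookup from which the enzyme classes needed to reach a given UDP-sugar from UDP-glucose are read off. The sketch below is such a lookup. The names `CONVERSIONS` and `enzymes_for` are hypothetical, only one route per product is recorded (the text also allows UDP-Xyl to be made by the dual AXS), and UDP-Rha and UDP-Gal are left out because this passage does not state their substrates.

```python
# product -> (substrate, enzyme class), restating the conversions described above
CONVERSIONS = {
    "UDP-GlcA": ("UDP-Glc", "UGD"),
    "UDP-Xyl": ("UDP-GlcA", "UXS"),
    "UDP-Api": ("UDP-GlcA", "AXS"),
    "UDP-4-keto-6-deoxy-glucose": ("UDP-Glc", "UG46DH"),
    "UDP-Fuc": ("UDP-4-keto-6-deoxy-glucose", "4-keto-reductase"),
    "UDP-Arap": ("UDP-Xyl", "UXE"),
    "UDP-Araf": ("UDP-Arap", "UAM"),
}


def enzymes_for(sugar: str, start: str = "UDP-Glc") -> list:
    """Enzyme classes, in pathway order, converting `start` into `sugar`."""
    chain = []
    while sugar != start:
        if sugar not in CONVERSIONS:
            raise KeyError(f"no route recorded for {sugar}")
        sugar, enzyme = CONVERSIONS[sugar]   # step back to the substrate
        chain.append(enzyme)
    return chain[::-1]                       # collected backwards, so reverse


if __name__ == "__main__":
    for target in ("UDP-GlcA", "UDP-Xyl", "UDP-Api", "UDP-Fuc", "UDP-Araf"):
        print(target, "<-", " -> ".join(enzymes_for(target)))
```

Read this way, a yeast producing UDP-Araf needs the UGD and UXS of the fourth aspect in addition to the UXE and UAM of the ninth aspect, which matches the requirement that the ninth aspect starts from a yeast engineered to produce UDP-Xyl.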

Acylated and glycosylated QA derivatives production

As shown in Fig. 2A, two repeating C9-CoA acyl units are successively transferred by two acyltransferases, leading to the addition of the 18-carbon pseudo-dimeric acyl chain to the fucose residue of the linear tetrasaccharide at the C28 position and resulting in the formation of acylated and glycosylated QA derivatives.

The first step of the method of the tenth aspect of the invention is the overexpression of heterologous genes, in a yeast engineered to produce a glycosylated QA derivative, encoding the following enzymes:

(i) a carboxyl coenzyme A ligase (CCL) converting 2MB acid into 2MB-CoA,

(ii) a chalcone-synthase-like type III PKS (polyketide synthase) condensing malonyl-CoA with 2MB-CoA to form C9-Keto-CoA,

(iii) a keto-reductase (KR) converting C9-Keto-CoA into C9-CoA, and

(iv) an acyltransferase transferring and attaching a first C9-CoA unit to the glycosylated QA derivative to form an acylated and glycosylated QA derivative,

wherein 2MB acid is supplemented exogenously.

For example, 2MB acid may be added directly into the yeast culture medium, at any appropriate time.

In the method according to the tenth aspect of the invention, the glycosylated QA derivative may be QA-C3-GGX-C28-FRX, QA-C3-GGR-C28-FRX, QA-C3-GGX-C28-FRXX, QA-C3-GGR-C28-FRXX, QA-C3-GGX-C28-FRXA, or QA-C3-GGR-C28-FRXA. The yeast engineered to produce QA-C3-GGX-C28-FRX and QA-C3-GGR-C28-FRX may be according to the aspect 7c. The yeast engineered to produce QA-C3-GGX-C28-FRXX and QA-C3-GGR-C28-FRXX may be according to the aspect 7d. The yeast engineered to produce QA-C3-GGX-C28-FRXA and QA-C3-GGR-C28-FRXA may be according to the aspect 7e of the invention.

In the first step of the method according to the tenth aspect of the invention, the CCL may be as described in the method of the eighth aspect of the invention.

The chalcone-synthase-like type III PKS may be from any plant origin. In some embodiments, the chalcone-synthase-like type III PKS is QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, or both QsChSD according to SEQ ID NO: 181 and QsChSE according to SEQ ID NO: 184, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 181 and SEQ ID NO: 184, respectively.

The KR may be from any plant origin. In some embodiments, the KR is QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, or both QsKR11 according to SEQ ID NO: 187 and QsKR23 according to SEQ ID NO: 190, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 187 and SEQ ID NO: 190, respectively.

The acyltransferase may be from any plant origin. In some embodiments, the acyltransferase is QsDMOT9 according to SEQ ID NO: 193, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 193.

The tenth aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXA-C9, or QA-C3-GGR-C28-FRXA-C9 (aspect 10a).

In the second step of the method according to the tenth aspect of the invention, the acylated and glycosylated QA derivative may be QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXA-C9, or QA-C3-GGR-C28-FRXA-C9. The yeast engineered to produce QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXA-C9, or QA-C3-GGR-C28-FRXA-C9 may be according to the aspect 10a of the invention.

The third step of the method according to the tenth aspect of the invention further comprises overexpressing a gene encoding (v) a second acyltransferase attaching a second C9-CoA unit to an acylated and glycosylated QA derivative to form a further acylated and glycosylated QA derivative.

In the third step of the method according to the tenth aspect of the invention, the acylated and glycosylated QA derivative may be QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXA-C9, or QA-C3-GGR-C28-FRXA-C9. The yeast engineered to produce QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXA-C9 and QA-C3-GGR-C28-FRXA-C9 may be according to aspect 10a of the invention.

The acyltransferase may be from any plant origin. In some embodiments, the acyltransferase is QsDMOT4 according to SEQ ID NO: 196, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 196.

The tenth aspect of the invention also provides a yeast which is engineered to produce C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXX-C18, C3-GGX-C28-FRXA-C18, or QA-C3-GGR-C28-FRXA-C18 (aspect 10b).

The fourth step of the method according to the tenth aspect of the invention further comprises overexpressing, in a yeast engineered to produce UDP-Araf, a heterologous gene encoding (vi) an arabinotransferase (ArafT) transferring UDP-Araf and attaching an Araf residue to an acylated and glycosylated QA derivative to form an acylated and further glycosylated QA derivative.

In the fourth step of the method according to the tenth aspect of the invention, the acylated and glycosylated QA derivative may be C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXX-C18, C3-GGX-C28-FRXA-C18, or QA-C3-GGR-C28-FRXA-C18. The yeast engineered to produce C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXX-C18, C3-GGX-C28-FRXA-C18 and QA-C3-GGR-C28-FRXA-C18 may be according to aspect 10b of the invention.

The ArafT may be from any plant origin, for example from Q. saponaria. In some embodiments, the ArafT is selected from QsArafT according to SEQ ID NO: 229 and QsArafT2 according to SEQ ID NO: 232, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 229 and SEQ ID NO: 232, respectively.

The tenth aspect of the invention also provides a yeast which is engineered to produce QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf or QA-C3-GGX-C28-FRXA-C18-Araf (aspect 10c).

In embodiments where QsArafT according to SEQ ID NO: 229 is used in the fourth step of the method according to the tenth aspect of the invention, QA-C3-GGR-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRXX-C18-Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3-GGR-C28-FRXA-C18-Xyl or QA-C3-GGX-C28-FRXA-C18-Xyl are also formed. The tenth aspect of the invention further provides a yeast which is engineered to produce QA-C3-GGR-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRXX-C18-Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3-GGR-C28-FRXA-C18-Xyl or QA-C3-GGX-C28-FRXA-C18-Xyl (aspect 10d of the invention).

In the fifth step of the method according to the tenth aspect of the invention, the method further comprises overexpressing heterologous genes encoding the following enzymes:

(vii) a phosphopantetheinyl (Ppant) transferase,

(viii) a megasynthase LovF-TE including an ACP domain, condensing two units of malonyl-CoA to 2MB-ACP, cleaving 2MB acid from the ACP domain, which is converted into 2MB-CoA by the CoA ligase (CCL), and no 2MB acid is supplemented exogenously.

The Ppant may be from Aspergillus nidulans and the megasynthase LovF-TE may be from Aspergillus terreus. In some embodiments, the Ppant is AnNpgA according to SEQ ID NO: 237 and the megasynthase LovF-TE is AstLovF-TE according to SEQ ID NO: 235, or a sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 237 and SEQ ID NO: 235, respectively.

“Percent identity” or “% identity” between a query nucleotide sequence and a subject nucleotide sequence is the “Identities” value, expressed as a percentage, that is calculated using a suitable algorithm (e.g. BLASTN, FASTA, Needleman-Wunsch, Smith-Waterman, LALIGN, or GenePAST/KERR) or software (e.g. DNASTAR Lasergene, GenomeQuest, EMBOSS needle or EMBOSS infoalign), over the entire length of the query sequence after a pair-wise global sequence alignment has been performed using a suitable algorithm (e.g. Needleman-Wunsch or GenePAST/KERR) or software (e.g. DNASTAR Lasergene or GenePAST/KERR). Importantly, a query nucleotide sequence may be described by a nucleotide sequence disclosed herein, in particular in one or more of the claims.

“Percent identity” or “% identity” between a query amino acid sequence and a subject amino acid sequence is the “Identities” value, expressed as a percentage, that is calculated using a suitable algorithm (e.g. BLASTP, FASTA, Needleman-Wunsch, Smith-Waterman, LALIGN, or GenePAST/KERR) or software (e.g. DNASTAR Lasergene, GenomeQuest, EMBOSS needle or EMBOSS infoalign), over the entire length of the query sequence after a pair-wise global sequence alignment has been performed using a suitable algorithm (e.g. Needleman-Wunsch or GenePAST/KERR) or software (e.g. DNASTAR Lasergene or GenePAST/KERR). Importantly, a query amino acid sequence may be described by an amino acid sequence disclosed herein, in particular in one or more of the claims.

The query sequence may be 100% identical to the subject sequence, or it may include up to a certain integer number of amino acid or nucleotide alterations as compared to the subject sequence such that the % identity is less than 100%. For example, the query sequence is at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to the subject sequence. In the case of nucleotide sequences, such alterations include at least one nucleotide residue deletion, substitution or insertion, wherein said alterations may occur at the 5’- or 3’-terminal positions of the query sequence or anywhere between those terminal positions, interspersed either individually among the nucleotide residues in the query sequence or in one or more contiguous groups within the query sequence. In the case of amino acid sequences, such alterations include at least one amino acid residue deletion, substitution (including conservative and non-conservative substitutions), or insertion, wherein said alterations may occur at the amino- or carboxy-terminal positions of the query sequence or anywhere between those terminal positions, interspersed either individually among the amino acid residues in the query sequence or in one or more contiguous groups within the query sequence.
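By way of illustration only, the percent identity calculation described above can be sketched in Python. The sketch assumes a plain Needleman-Wunsch global alignment with a simple scoring scheme (match +1, mismatch -1, gap -1); these scores and the function names `global_align` and `percent_identity` are hypothetical choices made for the example and are not prescribed by the specification. Dedicated tools such as EMBOSS needle use substitution matrices and affine gap penalties and may give different values.

```python
def global_align(query, subject, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with linear gap penalties.

    The scoring values are illustrative assumptions, not values taken
    from the specification.
    """
    n, m = len(query), len(subject)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning query[i-1] with subject[j-1].
        return match if query[i - 1] == subject[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_q, aligned_s = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            aligned_q.append(query[i - 1])
            aligned_s.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_s.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_s.append(subject[j - 1])
            j -= 1
    return "".join(reversed(aligned_q)), "".join(reversed(aligned_s))


def percent_identity(query, subject):
    """Identical aligned positions as a percentage of the query length."""
    if not query:
        raise ValueError("query sequence must not be empty")
    aligned_q, aligned_s = global_align(query, subject)
    identities = sum(
        1 for a, b in zip(aligned_q, aligned_s) if a == b and a != "-"
    )
    return 100.0 * identities / len(query)


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
```

Under these assumptions, a candidate sequence could be compared against a threshold such as "at least 70% identical" by testing `percent_identity(query, subject) >= 70`. The denominator is the full query length, following the definition given above.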

With respect to the enzymes and/or proteins used in the methods of the invention and defined in terms of sequence identity, such enzymes and/or proteins typically retain their same respective function and activity, which function and activity may be assessed as described in the Example section.

Yeast engineering

Conventional methods used to engineer yeast may be used in the methods of the invention (see e.g. US 8,828,684 B2, the content of which is incorporated by reference). Heterologous genes may be expressed under constitutive promoters or under inducible promoters, for example galactose-inducible promoters. Gene expression may be achieved either via integration into the genome of a given yeast strain (within the same locus or within different loci) or via plasmid expression. When using genome integration, one or more copies of the genes to be overexpressed may be integrated, for example, 1 to 10, 2 to 8, 3 to 7. In some embodiments, one or more of the genes involved in the biosynthesis of QS-21 are integrated into the genome of the yeast. General yeast culture conditions are known to the skilled person. Once engineered, yeast may be cultured for a few days, for example 1 to 7 days, 2 to 6 days, 4 to 5 days, or 3 days. It is within the ambit of the skilled person to determine the optimal time, depending on the metabolite to be produced. When using inducible promoters such as the gal promoters, determining the optimal induction time is also within the ambit of the skilled person. At any appropriate time after culture and/or induction, the desired metabolites, e.g. sugars or the QA derivatives of the invention, may be recovered from the yeast culture by any methods known in the art, such as extraction using a non-aqueous polar solvent, extraction using an acid medium or a basic medium, recovery by resin absorption, or extraction by mechanically disrupting the cells, such as by ball milling or sonication. In some embodiments, the yeast is Saccharomyces cerevisiae.

Adjuvants

The QA derivatives of the invention may be used as an adjuvant, individually, or in any combination. They may also be combined with further immuno-stimulants, in particular with a TLR4 agonist. In some embodiments, the QA derivatives are formulated within a liposome, in combination with a TLR4 agonist.

The TLR4 agonist may be 3D-MPL, in particular lipopolysaccharide TLR4 agonists, such as lipid A derivatives, especially a monophosphoryl lipid A, e.g. 3-de-O-acylated monophosphoryl lipid A (3D-MPL). 3D-MPL is sold under the name 'MPL' by GlaxoSmithKline Biologicals N.A. See, for example, US Patent Nos. 4,436,727; 4,877,611; 4,866,034 and 4,912,094. 3D-MPL can be produced according to the methods described in GB 2 220211 A. Chemically, it is a mixture of 3-deacylated monophosphoryl lipid A with 4, 5 or 6 acylated chains.

Adjuvants of the invention may also be formulated into a suitable carrier, such as an emulsion (e.g. an oil-in-water emulsion) or liposomes, as described below.

Liposomes

The term liposome is well known in the art and defines a general category of vesicles which comprise one or more lipid bilayers surrounding an aqueous space. Liposomes thus consist of one or more lipid and/or phospholipid bilayers and can contain other molecules, such as proteins or carbohydrates, in their structure. Because both lipid and aqueous phases are present, liposomes can encapsulate or entrap water-soluble material, lipid-soluble material, and/or amphiphilic compounds. A method for making such liposomes is described in WO 13/041572.

Liposome size may vary from 30 nm to several µm depending on the phospholipid composition and the method used for their preparation.

The liposome size will be in the range of 50 nm to 200 nm, especially 60 nm to 180 nm, such as 70-165 nm. Optimally, the liposomes should be stable and have a diameter of 100 nm to allow convenient sterilization by filtration.

Structural integrity of the liposomes may be assessed by methods such as dynamic light scattering (DLS), measuring the size (Z-average diameter, Zav) and polydispersity of the liposomes, or by electron microscopy for analysis of the structure of the liposomes. The average particle size may be between 95 and 120 nm, and/or the polydispersity index (PdI) may not be more than 0.3 (such as not more than 0.2).
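As a purely illustrative sketch, the size and polydispersity ranges mentioned above could be applied to DLS readings as a simple acceptance check. The function name `liposome_within_spec` and its parameters are hypothetical; the default limits (95 to 120 nm, polydispersity index not more than 0.3) are taken from the preceding paragraph, and treating them as joint pass/fail criteria is an assumption of the example.

```python
def liposome_within_spec(z_average_nm, pdi,
                         size_range_nm=(95.0, 120.0), max_pdi=0.3):
    """Return True if a DLS reading falls inside the stated ranges.

    z_average_nm: Z-average diameter in nanometres.
    pdi: polydispersity index (dimensionless).
    The defaults mirror the ranges given in the text; using both
    criteria together is an assumption made for this sketch.
    """
    low, high = size_range_nm
    return low <= z_average_nm <= high and pdi <= max_pdi


if __name__ == "__main__":
    print(liposome_within_spec(105.0, 0.15))
    # The stricter limit of 0.2 can be passed explicitly.
    print(liposome_within_spec(105.0, 0.25, max_pdi=0.2))
```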

Table 1 - QA and acylated and/or glycosylated QA derivatives

Throughout the specification, including the claims, where the context permits, the term “comprising” and variants thereof such as “comprises” are to be interpreted as including the stated element (e.g., integer) or elements (e.g., integers) without necessarily excluding any other elements (e.g., integers). Thus a composition “comprising” X may consist exclusively of X or may include something additional e.g. X + Y.

The word “substantially” does not exclude “completely” e.g. a composition which is “substantially free” from Y may be completely free from Y. Where necessary, the word “substantially” may be omitted from the definition of the invention. The term “about” or “approximately” in relation to a numerical value x is optional and means, for example, x±10% of the given figure, such as x±5% of the given figure, in particular the given figure.
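For illustration, the tolerance attached to “about” can be expressed as a short check. This sketch assumes the definition is read as a symmetric window around x (10% by default, optionally 5%); the function names `about_range` and `is_about` are hypothetical.

```python
def about_range(x, tolerance=0.10):
    """Return the (low, high) window for 'about x' at a fractional tolerance."""
    delta = abs(x) * tolerance
    return (x - delta, x + delta)


def is_about(value, x, tolerance=0.10):
    """True if value lies within the symmetric tolerance window around x."""
    low, high = about_range(x, tolerance)
    return low <= value <= high


if __name__ == "__main__":
    # 108 is within 10% of 100 but outside the narrower 5% window.
    print(is_about(108, 100))
    print(is_about(108, 100, tolerance=0.05))
```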

As used herein, the singular forms “a,” “an” and “the” include plural references unless the content clearly dictates otherwise. As used herein, ng refers to nanograms, ug or µg refers to micrograms, mg refers to milligrams, mL or ml refers to milliliter, and mM refers to millimolar. Similar terms, such as µm, are to be construed accordingly. Unless specifically stated, a process comprising a step of mixing two or more components does not require any specific order of mixing. Thus components can be mixed in any order. Where there are three components then two components can be combined with each other, and then the combination may be combined with the third component, etc.

The invention is illustrated further by reference to the following clauses.

Clause 1. A method of producing quillaic acid (QA) in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce β-amyrin, heterologous genes encoding the following enzymes:

(i) a cytochrome P450 C16 oxidase, wherein the C16 oxidase oxidizes the C16 carbon of β-amyrin to a hydroxyl group,

(ii) a cytochrome P450 C23 oxidase, wherein the C23 oxidase oxidizes the C23 carbon of β-amyrin to an aldehyde group,

(iii) a cytochrome P450 C28 oxidase, wherein the C28 oxidase oxidizes the C28 carbon of β-amyrin to a carboxyl group, and

(iv) a cytochrome P450 reductase (CPR), acting as a redox partner, wherein the C16 oxidase, the C23 oxidase, the C28 oxidase and the CPR are from a plant origin.

Clause 2. The method of clause 1, wherein the C16 oxidase, the C23 oxidase and the C28 oxidase are independently selected from Artemisia annua (Aa), Arabidopsis thaliana (At), Glycyrrhiza glabra (Gg), Medicago truncatula (Mt), Quillaja saponaria (Qs), Saponaria vaccaria (Sv), Centella asiatica (Ca), Bupleurum falcatum (Bf) and Maesa lanceolata (Ml).

Clause 3. The method of clause 1 or clause 2, wherein the C16 oxidase, the C23 oxidase and the C28 oxidase are independently selected from Medicago truncatula (Mt), Bupleurum falcatum (Bf), Quillaja saponaria (Qs), and Saponaria vaccaria (Sv).

Clause 4. The method of clause 3, wherein the C16 oxidase is selected from QsC16 according to SEQ ID NO: 20, QsC28C16 according to SEQ ID NO: 23, and SvC16 according to SEQ ID NO: 26.

Clause 5. The method of clause 4, wherein QsC16 is encoded by the nucleotide sequence SEQ ID NO: 21, QsC28C16 is encoded by the nucleotide sequence SEQ ID NO: 24 and SvC16 is encoded by the nucleotide sequence SEQ ID NO: 27.

Clause 6. The method of any one of clauses 1 to 5, wherein the C23 oxidase is selected from MtC23 oxidase according to SEQ ID NO: 38, QsC23 according to SEQ ID NO: 29, SvC23-1 according to SEQ ID NO: 32, and SvC23-2 according to SEQ ID NO: 35.

Clause 7. The method of clause 6, wherein MtC23 is encoded by the nucleotide sequence SEQ ID NO: 39, QsC23 is encoded by the nucleotide sequence SEQ ID NO: 30, SvC23-1 is encoded by the nucleotide sequence SEQ ID NO: 33, and SvC23-2 is encoded by the nucleotide sequence SEQ ID NO: 36.

Clause 8. The method of any one of clauses 1 to 7, wherein the C28 oxidase is selected from MtC28 according to SEQ ID NO: 46, QsC28 according to SEQ ID NO: 41 and SvC28 according to SEQ ID NO: 44.

Clause 9. The method of clause 8, wherein MtC28 is encoded by the nucleotide sequence SEQ ID NO: 47, QsC28 is encoded by the nucleotide sequence SEQ ID NO: 42, and SvC28 is encoded by the nucleotide sequence SEQ ID NO: 45.

Clause 10. The method of any one of clauses 4 to 9, wherein the C16 oxidase is SvC16 according to SEQ ID NO: 26, the C23 oxidase is SvC23-1 according to SEQ ID NO: 32 or SvC23-2 oxidase according to SEQ ID NO: 35, and the C28 oxidase is SvC28 according to SEQ ID NO: 44.

Clause 11. The method of clause 10, wherein SvC16 is encoded by the nucleotide sequence SEQ ID NO: 27, SvC23-1 is encoded by the nucleotide sequence SEQ ID NO: 33, SvC23-2 is encoded by the nucleotide sequence SEQ ID NO: 36, and SvC28 is encoded by the nucleotide sequence SEQ ID NO: 45.

Clause 12. The method of any one of clauses 4 to 9, wherein the C16 oxidase is selected from QsC16 according to SEQ ID NO: 20 and QsC28C16 according to SEQ ID NO: 23, the C23 oxidase is QsC23 according to SEQ ID NO: 29 and the C28 is QsC28 according to SEQ ID NO: 41.

Clause 13. The method of clause 12, wherein QsC16 is encoded by the nucleotide sequence SEQ ID NO: 21, QsC28C16 is encoded by the nucleotide sequence SEQ ID NO: 24, QsC23 is encoded by the nucleotide sequence SEQ ID NO: 30 and QsC28 is encoded by the nucleotide sequence SEQ ID NO: 42.

Clause 14. The method of any one of clauses 4 to 9, wherein the C16 oxidase is QsC28C16 according to SEQ ID NO: 23, the C23 oxidase is QsC23 according to SEQ ID NO: 29, and the C28 oxidase is QsC28 according to SEQ ID NO: 41.

Clause 15. The method of clause 14, wherein the QsC28C16 is encoded by the nucleotide sequence SEQ ID NO: 24, QsC23 is encoded by the nucleotide sequence SEQ ID NO: 30, and QsC28 is encoded by the nucleotide sequence SEQ ID NO: 42.

Clause 16. The method of any one of clauses 1 to 15, wherein the CPR is selected from A. thaliana (At) and Lotus japonicus (Lj).

Clause 17. The method of clause 16, wherein the CPR is selected from AtATR1 according to SEQ ID NO: 49 and LjCPR according to SEQ ID NO: 52.

Clause 18. The method of clause 17, wherein the CPR is AtATR1 according to SEQ ID NO: 49.

Clause 19. The method of clause 17 or clause 18, wherein AtATR1 is encoded by the nucleotide sequence SEQ ID NO: 50 and LjCPR is encoded by the nucleotide sequence SEQ ID NO: 53.

Clause 20. The method of any one of clauses 1 to 19, wherein the yeast further overexpresses a heterologous gene encoding (v) a cytochrome b5.

Clause 21. The method of clause 20, wherein the cytochrome b5 is selected from A. thaliana (At), Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 22. The method of clause 21, wherein the cytochrome b5 is selected from Atb5 according to SEQ ID NO: 58, Qsb5 according to SEQ ID NO: 55 and Svb5 according to SEQ ID NO: 61.

Clause 23. The method of clause 21 or clause 22, wherein the cytochrome b5 is Qsb5 according to SEQ ID NO: 55.

Clause 24. The method of clause 21 or clause 22, wherein the cytochrome b5 is Svb5 according to SEQ ID NO: 61.

Clause 25. The method of any one of clauses 22 to 24, wherein Atb5 is encoded by the nucleotide sequence SEQ ID NO: 59, Qsb5 is encoded by the nucleotide sequence SEQ ID NO: 56, and Svb5 is encoded by the nucleotide sequence SEQ ID NO: 62.

Clause 26. The method of any one of clauses 1 to 25, wherein the yeast further overexpresses a heterologous gene encoding (vi) a scaffold protein, wherein the scaffold protein physically interacts with one or more of the C16 oxidase, the C23 oxidase, the C28 oxidase and the CPR.

Clause 27. The method of clause 26, wherein the scaffold protein is a membrane steroid-binding protein (MSBP).

Clause 28. The method of clause 27, wherein the MSBP is selected from A. thaliana (At), Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 29. The method of clause 27 or clause 28, wherein the MSBP is selected from AtMSBP1 according to SEQ ID NO: 63 and AtMSBP2 according to SEQ ID NO: 65.

Clause 30. The method of clause 27 or clause 28, wherein the MSBP is selected from QsMSBP1 according to SEQ ID NO: 73, SvMSBP1 according to SEQ ID NO: 67 and SvMSBP2 according to SEQ ID NO: 70.

Clause 31. The method of clause 27, clause 28 or clause 30, wherein the MSBP is SvMSBP1 according to SEQ ID NO: 67.

Clause 32. The method of any one of clauses 29 to 31, wherein AtMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 64, AtMSBP2 is encoded by the nucleotide sequence SEQ ID NO: 66, QsMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 74, SvMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 68 and SvMSBP2 is encoded by the nucleotide sequence SEQ ID NO: 71.

Clause 33. The method of any one of clauses 1 to 32, wherein the yeast engineered to produce β-amyrin overexpresses a β-amyrin synthase (BAS) selected from A. annua (Aa), A. thaliana (At), G. glabra (Gg), G. vaccaria (Gv), S. vaccaria (Sv), and Q. saponaria (Qs).

Clause 34. The method of clause 33, wherein the BAS is selected from AaBAS according to SEQ ID NO: 1, AtBAS according to SEQ ID NO: 4, GgBAS according to SEQ ID NO: 7, GvBAS according to SEQ ID NO: 10, QsBAS according to SEQ ID NO: 15, and SvBAS according to SEQ ID NO: 13.

Clause 35. The method of clause 33 or clause 34, wherein the BAS is GvBAS according to SEQ ID NO: 10.

Clause 36. The method of any one of clauses 34 to 35, wherein AaBAS is encoded by the nucleotide sequence SEQ ID NO: 2, AtBAS is encoded by the nucleotide sequence SEQ ID NO: 5, GgBAS is encoded by the nucleotide sequence SEQ ID NO: 8, GvBAS is encoded by the nucleotide sequence SEQ ID NO: 11, QsBAS is encoded by the nucleotide sequence SEQ ID NO: 16, and SvBAS is encoded by the nucleotide sequence SEQ ID NO: 14.

Clause 37. The method of any one of clauses 1 to 9, 12 to 23, 25 to 28, and 30 to 36, wherein the C16 oxidase is QsC28C16, the C23 oxidase is QsC23, the C28 oxidase is QsC28, the CPR is AtATR1, the MSBP is SvMSBP1, the cytochrome b5 is Qsb5, and the BAS is GvBAS.

Clause 38. The method of clause 37, wherein QsC28C16 is encoded by the nucleotide sequence SEQ ID NO: 24, QsC23 is encoded by the nucleotide sequence SEQ ID NO: 30, QsC28 is encoded by the nucleotide sequence SEQ ID NO: 42, AtATR1 is encoded by the nucleotide sequence SEQ ID NO: 50, SvMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 68, Qsb5 is encoded by the nucleotide sequence SEQ ID NO: 56, and GvBAS is encoded by the nucleotide sequence SEQ ID NO: 11.

Clause 39. A yeast which is engineered to produce QA according to the method of any one of clauses 1 to 38.

Clause 40. The yeast of clause 39 producing at least 60 mg/L of QA.

Clause 41. A method of producing UDP-Glucuronic acid (UDP-GlcA) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding a UDP-glucose dehydrogenase (UGD) converting UDP-Glucose (UDP-Glc) into UDP-GlcA.

Clause 42. The method of clause 41, wherein the UGD is from A. thaliana (At).

Clause 43. The method of clause 39, wherein the UGD is selected from AtUGD according to SEQ ID NO: 84 and AtUGD_A101L according to SEQ ID NO: 108.

Clause 44. The method of any one of clauses 41 to 43, wherein the UGD is AtUGD_A101L according to SEQ ID NO: 108.

Clause 45. The method of clause 43 or clause 44, wherein AtUGD is encoded by the nucleotide sequence SEQ ID NO: 85, and AtUGD_A101L is encoded by the nucleotide sequence SEQ ID NO: 109.

Clause 46. A yeast which is engineered to produce UDP-GlcA according to the method of any one of clauses 41 to 45.

Clause 47. A method of producing UDP-Rhamnose (UDP-Rha) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding a UDP-rhamnose synthase converting UDP-Glucose (UDP-Glc) into UDP-Rha.

Clause 48. The method of clause 47, wherein the UDP-rhamnose synthase is from A. thaliana (At).

Clause 49. The method of clause 48, wherein the UDP-rhamnose synthase is AtRHM2 according to SEQ ID NO: 102.

Clause 50. The method of clause 49, wherein AtRHM2 is encoded by the nucleotide sequence SEQ ID NO: 103.

Clause 51. A yeast which is engineered to produce UDP-Rha according to the method of any one of clauses 47 to 50.

Clause 52. A method of producing UDP-Xylose (UDP-Xyl) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) a UDP-glucose dehydrogenase (UGD) converting UDP-Glc into UDP-GlcA, and

(ii) a UDP-xylose synthase (UXS) converting UDP-GlcA into UDP-Xylose.

Clause 53. The method of clause 52, wherein the UGD and the UXS are independently selected from A. thaliana (At) and Q. saponaria (Qs).

Clause 54. The method of clause 53, wherein the UGD is selected from AtUGD according to SEQ ID NO: 84 and AtUGD_A101L according to SEQ ID NO: 108.

Clause 55. The method of clause 52, wherein the UGD is selected from Synechococcus sp. (Syn), Homo sapiens (Hs), Paramoeba atlantica (Patl), Bacillus cytotoxicus (Bcyt), Corallococcus macrosporus (Myxfulv), and Pyrococcus furiosus (Pfu).

Clause 56. The method of clause 55, wherein the UGD is selected from SynUGD according to SEQ ID NO: 154, HsUGD_A104L according to SEQ ID NO: 157, PatlUGD according to SEQ ID NO: 110, BcytUGD according to SEQ ID NO: 160, MyxfulvUGD according to SEQ ID NO: 163, and PfuUGD according to SEQ ID NO: 166.

Clause 57. The method of any one of clauses 52 to 54, wherein the UGD is AtUGD_A101L according to SEQ ID NO: 108.

Clause 58. The method of any of clauses 54 to 57, wherein AtUGD is encoded by the nucleotide sequence SEQ ID NO: 85, AtUGD_A101L is encoded by the nucleotide sequence SEQ ID NO: 109, SynUGD is encoded by the nucleotide sequence SEQ ID NO: 155, HsUGD_A104L is encoded by the nucleotide sequence SEQ ID NO: 158, PatlUGD is encoded by the nucleotide sequence SEQ ID NO: 111, BcytUGD is encoded by the nucleotide sequence SEQ ID NO: 161, MyxfulvUGD is encoded by the nucleotide sequence SEQ ID NO: 164, and PfuUGD is encoded by the nucleotide sequence SEQ ID NO: 167.

Clause 59. The method of any one of clauses 52 to 58, wherein the UXS is selected from AtUXS according to SEQ ID NO: 105 and QsAXS according to SEQ ID NO: 113.

Clause 60. The method of clause 60 wherein the UGD is AtUGD_A101L according to SEQ ID NO: 108 and the UXS is AtUXS according to SEQ ID NO: 105.

Clause 61. The method of clause 59 or clause 60, wherein AtUXS is encoded by the nucleotide sequence SEQ ID NO: 106, QsAXS is encoded by the nucleotide sequence SEQ ID NO: 114 and AtUGD_A101L is encoded by the nucleotide sequence SEQ ID NO: 109.

Clause 62. A yeast which is engineered to produce UDP-Xyl according to the method of any one of clauses 52 to 61.

Clause 63. A method of producing a C3-glycosylated QA derivative in yeast, wherein the derivative is QA-C3-GlcA, and the method comprises the step of overexpressing, in a yeast engineered to produce QA and UDP-GlcA, a heterologous gene encoding the following enzyme:

(i) a UDP-GlcA transferase (GlcAT) transferring UDP-GlcA and attaching a GlcA residue at the C3 position of QA to form QA-C3-GlcA.

Clause 64. The method of clause 63, wherein the GlcAT is selected from Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 65. The method of clause 63 or clause 64, wherein the GlcAT is from Q. saponaria.

Clause 66. The method of clause 65, wherein the GlcAT is selected from QsCslG1 according to SEQ ID NO: 78 and QsCslG2 according to SEQ ID NO: 81.

Clause 67. The method of clause 66, wherein the GlcAT is QsCslG2 according to SEQ ID NO: 81.

Clause 68. The method of clause 64, wherein the GlcAT is from S. vaccaria.

Clause 69. The method of clause 68, wherein the GlcAT is SvCslG according to SEQ ID NO: 76.

Clause 70. The method of any one of clauses 66, 67, 68 or 69, wherein QsCslG1 is encoded by the nucleotide sequence SEQ ID NO: 79, QsCslG2 is encoded by the nucleotide sequence SEQ ID NO: 82 and SvCslG is encoded by the nucleotide sequence SEQ ID NO: 77.

Clause 71. The method of any one of clauses 63 to 70, wherein the yeast engineered to produce QA is according to clause 39.

Clause 72. The method of clause 71, wherein the yeast engineered to produce UDP-GlcA is according to clause 46.

Clause 73. A yeast which is engineered to produce QA-C3-GlcA according to the method of any one of clauses 63 to 72.

Clause 74. The method of any one of clauses 63 to 72, wherein the derivative is QA-C3-GlcA-Gal, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(ii) a UDP-Galactose transferase (GalT) transferring UDP-Gal and attaching a Gal residue to QA-C3-GlcA to form QA-C3-GlcA-Gal.

Clause 75. The method of clause 74, wherein the GalT is selected from Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 76. The method of clause 75, wherein the GalT is from Q. saponaria (Qs).

Clause 77. The method of any one of clauses 70 to 76, wherein the GalT is QsGalT according to SEQ ID NO: 116.

Clause 78. The method of clause 74, wherein the GalT is from S. vaccaria.

Clause 79. The method of clause 78, wherein GalT is SvGalT according to SEQ ID NO: 98.

Clause 80. The method of clause 77 or clause 79, wherein QsGalT is encoded by the nucleotide sequence SEQ ID NO: 117 and SvGalT is encoded by the nucleotide sequence SEQ ID NO: 99.

Clause 81. A yeast which is engineered to produce QA-C3-GlcA-Gal according to the method of any one of clauses 74 to 80.

Clause 82. The method of any one of clauses 74 to 80, wherein the derivative is QA-C3-GlcA-Gal-Rha, the yeast is further engineered to produce UDP-Rha, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(iii) a UDP-Rhamnose transferase (RhaT) transferring UDP-Rha and attaching a Rha residue to QA-C3-GlcA-Gal to form QA-C3-GlcA-Gal-Rha.

Clause 83. The method of clause 82, wherein the RhaT is from Q. saponaria (Qs).

Clause 84. The method of clause 83, wherein the RhaT is QsRhaT according to SEQ ID NO: 119.

Clause 85. The method of clause 84, wherein QsRhaT is encoded by the nucleotide sequence SEQ ID NO: 120.

Clause 86. The method of any one of clauses 82 to 86, wherein the yeast engineered to produce UDP-Rha is according to clause 51.

Clause 87. A yeast which is engineered to produce QA-C3-GlcA-Gal-Rha according to the method of any one of clauses 82 to 86.

Clause 88. The method of any one of clauses 74 to 80, wherein the derivative is QA-C3-GlcA-Gal-Xyl, the yeast is further engineered to produce UDP-Xyl, and the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(iv) a UDP-Xylose transferase (XylT) transferring UDP-Xylose and attaching a Xyl residue to QA-C3-GlcA-Gal to form QA-C3-GlcA-Gal-Xyl.

Clause 89. The method of clause 88, wherein the XylT is selected from Q. saponaria (Qs) or S. vaccaria (Sv).

Clause 90. The method of clause 89, wherein the XylT is selected from QsC3XylT according to SEQ ID NO: 122 and SvC3XylT according to SEQ ID NO: 100.

Clause 91. The method of clause 90, wherein QsC3XylT is encoded by the nucleotide sequence SEQ ID NO: 123 and SvC3XylT is encoded by the nucleotide sequence SEQ ID NO: 101, and wherein the yeast engineered to produce UDP-Xyl is according to clause 62.

Clause 92. A yeast which is engineered to produce QA-C3-GlcA-Gal-Xyl according to the method of any one of clauses 88 to 91.

Clause 93. The method of any one of clauses 88 to 91, wherein the overexpressing further comprises overexpressing of heterologous genes encoding the following enzymes:

(v) a glucuronokinase (GlcAK) converting free glucuronic acid into GlcA-1-phosphate, and

(vi) a UDP-sugar pyrophosphorylase (USP) converting GlcA-1-phosphate into UDP-GlcA, and glucuronic acid is supplemented exogenously.

Clause 94. The method of clause 93, wherein the GlcAK and the USP are from A. thaliana (At).

Clause 95. The method of clause 94, wherein GlcAK is AtGlcAK according to SEQ ID NO: 169 and the USP is AtUSP according to SEQ ID NO: 223.

Clause 96. The method of clause 95, wherein AtGlcAK is encoded by the nucleotide sequence SEQ ID NO: 170 and AtUSP is encoded by the nucleotide sequence SEQ ID NO: 224.

Clause 97. The method of any one of clauses 93 to 96, wherein the overexpressing further comprises overexpressing of (vi) a heterologous gene encoding a Myo-Inositol Oxygenase (MIOX), and myo-inositol is additionally supplemented exogenously.

Clause 98. The method of clause 97, wherein MIOX is from Thermothelomyces thermophilus (Tt).

Clause 99. The method of clause 98, wherein MIOX is TtMIOX according to SEQ ID NO: 173.

Clause 100. The method of clause 99, wherein TtMIOX is encoded by the nucleotide sequence SEQ ID NO: 174.

Clause 101. A yeast which is engineered to produce QA-C3-GlcA-Gal-Xyl according to the method of any one of clauses 93 to 100.

Clause 102. A method of producing UDP-Fucose (UDP-Fuc) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) a UDP-glucose-4,6-dehydratase (UG46DH) converting UDP-Glc into UDP-4-keto-6-deoxy-glucose and

(ii) a 4-keto-reductase converting UDP-4-keto-6-deoxy-glucose into UDP-D-Fuc.

Clause 103. The method of clause 102, wherein the UG46DH is from S. vaccaria (Sv).

Clause 104. The method of clause 103, wherein the UG46DH is SvUG46DH according to SEQ ID NO: 87.

Clause 105. The method of clause 104, wherein SvUG46DH is encoded by the nucleotide sequence SEQ ID NO: 88.

Clause 106. The method of any one of clauses 102 to 105, wherein the 4-keto-reductase is selected from Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 107. The method of clause 106, wherein the 4-keto-reductase is selected from SvNMD according to SEQ ID NO: 90 and QsFucSyn according to SEQ ID NO: 175.

Clause 108. The method of clause 107, wherein SvNMD is encoded by the nucleotide sequence SEQ ID NO: 91 and QsFucSyn is encoded by the nucleotide sequence SEQ ID NO: 176.

Clause 109. A yeast which is engineered to produce UDP-Fucose according to the method of any one of clauses 102 to 108.

Clause 110. A method of producing a C28-glycosylated QA derivative in yeast, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc, or QA-C3-GlcA-Gal-Xyl-C28-Fuc, the method comprises the step of overexpressing, in a yeast engineered to produce QA-C3-GlcA-Gal-Rha, or QA-C3-GlcA-Gal-Xyl, and UDP-Fucose, a heterologous gene encoding the following enzyme:

(i) a UDP-Fucose transferase (FucT) transferring UDP-Fuc and attaching a Fuc residue at the C28 position of QA to form QA-C3-GlcA-Gal-Rha-C28-Fuc, or QA-C3-GlcA-Gal-Xyl-C28-Fuc.

Clause 111. The method of clause 110, wherein the FucT is selected from Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 112. The method of clause 111, wherein the FucT is selected from QsFucT according to SEQ ID NO: 93 and SvFucT according to SEQ ID NO: 96.

Clause 113. The method of clause 112, wherein QsFucT is encoded by the nucleotide sequence SEQ ID NO: 94 and SvFucT is encoded by the nucleotide sequence SEQ ID NO: 97.

Clause 114. The method of any one of clauses 110 to 113, wherein the yeast engineered to produce QA-C3-GlcA-Gal-Rha is according to clause 87 and the yeast engineered to produce UDP-Fuc is according to clause 109.

Clause 115. The method of any one of clauses 110 to 113 wherein the yeast engineered to produce QA-C3-GlcA-Gal-Xyl is according to clause 101 and the yeast engineered to produce UDP-Fuc is according to clause 109.

Clause 116. A yeast which is engineered to produce QA-C3-GlcA-Gal-Rha-C28-Fuc according to the method of any one of clauses 110 to 114.

Clause 117. A yeast which is engineered to produce QA-C3-GlcA-Gal-Xyl-C28-Fuc according to the method of any one of clauses 110 to 113 and clause 115.

Clause 118. The method of any one of clauses 110 to 115, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha, or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha, the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(ii) a UDP-Rhamnose transferase (RhaT) transferring UDP-Rha and attaching a Rha residue to QA-C3-GlcA-Gal-Rha-C28-Fuc, or QA-C3-GlcA-Gal-Xyl-C28-Fuc, to form QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha.

Clause 119. The method of clause 118, wherein the RhaT is from Q. saponaria.

Clause 120. The method of clause 119, wherein the RhaT is QsRhaT according to SEQ ID NO: 119.

Clause 121. The method of clause 120, wherein QsRhaT is encoded by the nucleotide sequence SEQ ID NO: 120.

Clause 122. A yeast which is engineered to produce QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha according to the method of any one of clauses 118 to 121.

Clause 123. The method of any one of clauses 118 to 121, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl, or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl, the overexpressing further comprises overexpressing heterologous genes encoding the following enzyme:

Clause 124. The method of clause 123, wherein the XylT is selected from Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 125. The method of clause 124, wherein the XylT is QsC28XylT3 according to SEQ ID NO: 125.

Clause 126. The method of clause 125, wherein QsC28XylT3 is encoded by the nucleotide sequence SEQ ID NO: 126.

Clause 127. A yeast which is engineered to produce QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl according to the method of any of clauses 123 to 126.

Clause 128. The method of any one of clauses 123 to 126, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl, or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl, the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(iv) a UDP-Xylose transferase (XylT) transferring UDP-Xyl and attaching a Xyl residue to QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl to form QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl, respectively.

Clause 129. The method of clause 128, wherein the XylT is selected from Q. saponaria (Qs) and S. vaccaria (Sv).

Clause 130. The method of clause 129, wherein the XylT is QsC28XylT4 according to SEQ ID NO: 128.

Clause 131. The method of clause 130, wherein QsC28XylT4 is encoded by the nucleotide sequence SEQ ID NO: 129.

Clause 132. The method of clause 128 or clause 129, wherein QsC28XylT4 comprises an amino acid deletion at the N-terminus, ranging from 3 amino acids to 20 amino acids.

Clause 133. The method of clause 132, wherein the XylT is selected from QsC28XylT4-3aa according to SEQ ID NO: 131, QsC28XylT4-6aa according to SEQ ID NO: 134, QsC28XylT4-9aa according to SEQ ID NO: 137, and QsC28XylT4-12aa according to SEQ ID NO: 140.

Clause 134. The method of clause 133, wherein QsC28XylT4-3aa is encoded by the nucleotide sequence SEQ ID NO: 132, QsC28XylT4-6aa is encoded by the nucleotide sequence SEQ ID NO: 135, QsC28XylT4-9aa is encoded by the nucleotide sequence SEQ ID NO: 138, and QsC28XylT4-12aa is encoded by the nucleotide sequence SEQ ID NO: 141.

Clause 135. The method of clause 128 or clause 129, wherein a solubility tag is added at the N-terminus of XylT.

Clause 136. The method of clause 135, wherein the XylT is selected from SUMO-QsC28XylT4 according to SEQ ID NO: 143, TrxA-QsC28-XylT4 according to SEQ ID NO: 145, and MBP-QsC28XylT4 according to SEQ ID NO: 147.

Clause 137. The method of clause 136, wherein SUMO-QsC28XylT4 is encoded by the nucleotide sequence SEQ ID NO: 144, TrxA-QsC28-XylT4 is encoded by the nucleotide sequence SEQ ID NO: 146 and MBP-QsC28XylT4 is encoded by the nucleotide sequence SEQ ID NO: 148.

Clause 138. The method of clause 128 or clause 129, wherein the XylT is QsC28XylT3-3xGGGS-QsC28XylT4 according to SEQ ID NO: 149.

Clause 139. The method of clause 138, wherein QsC28XylT3-3xGGGS-QsC28XylT4 is encoded by the nucleotide sequence SEQ ID NO: 150.

Clause 140. A yeast which is engineered to produce QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl according to the method of any of clauses 128 to 139.

Clause 141. The method of any one of clauses 123 to 126, wherein the derivative is QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api, the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(iv) a UDP-Apiose synthase (AXS) converting UDP-GlcA into UDP-Api and

Clause 142. The method of clause 141, wherein the AXS is QsAXS according to SEQ ID NO: 113.

Clause 143. The method of clause 142, wherein QsAXS is encoded by the nucleotide sequence SEQ ID NO: 114.

Clause 144. The method of any one of clauses 141 to 143, wherein the ApiT is selected from Q. saponaria (Qs) or S. vaccaria (Sv).

Clause 145. The method of clause 144, wherein the ApiT is QsC28ApiT4 according to SEQ ID NO: 151.

Clause 146. The method of clause 145, wherein QsC28ApiT4 is encoded by the nucleotide sequence SEQ ID NO: 152.

Clause 147. A yeast which is engineered to produce QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api or QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api according to the method of any of clauses 141 to 146.
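The stepwise extension of the C28 sugar chain described in clauses 110 to 147 can be read as an ordered series of transferase steps. The following is a minimal illustrative sketch only and forms no part of the clauses. The function name `extend_c28_chain` and the data layout are assumptions made for illustration; only the enzyme classes, residue names and product names are taken from the clauses.

```python
# Illustrative sketch: ordered glycosyltransferase steps at C28 of QA,
# using the enzyme classes named in clauses 110 to 146.
# The data structures and function name are hypothetical.

# Common steps: each tuple is (enzyme class, residue attached).
C28_STEPS = [
    ("FucT", "Fuc"),  # clause 110
    ("RhaT", "Rha"),  # clause 118
    ("XylT", "Xyl"),  # clauses 123 to 125 (QsC28XylT3)
]

# Alternative terminal steps, keyed by the terminal residue.
TERMINAL_STEPS = {
    "Xyl": ("XylT", "Xyl"),  # clause 128 (QsC28XylT4)
    "Api": ("ApiT", "Api"),  # clauses 141 to 145 (QsC28ApiT4)
}


def extend_c28_chain(start, terminal=None):
    """Return the product names formed after each step, in order.

    start    -- name of the C3-glycosylated QA derivative
    terminal -- None, "Xyl" or "Api" for the optional fourth residue
    """
    steps = list(C28_STEPS)
    if terminal is not None:
        steps.append(TERMINAL_STEPS[terminal])

    name = start + "-C28"
    products = []
    for _enzyme, residue in steps:
        name = name + "-" + residue
        products.append(name)
    return products


if __name__ == "__main__":
    for product in extend_c28_chain("QA-C3-GlcA-Gal-Rha", terminal="Api"):
        print(product)
```

Run on its own, the sketch prints the four successive products ending in QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api, matching the derivative named in clause 147.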

Clause 148. A method of producing (S)-2-methylbutyryl CoA (2MB-CoA) in yeast, wherein the method comprises the step of overexpressing a heterologous gene encoding (i) a carboxyl coenzyme A (CoA) ligase (CCL) converting 2MB acid into 2MB-CoA, and 2MB acid is supplemented exogenously.

Clause 149. The method of clause 148, wherein the CCL is QsCCL from Q. saponaria according to SEQ ID NO: 178.

Clause 150. The method of clause 149, wherein QsCCL is encoded by the nucleotide sequence SEQ ID NO: 179.

Clause 151. The method of any one of clauses 148 to 150 in yeast, wherein the overexpressing further comprises overexpressing heterologous genes encoding the following enzymes:

(ii) a phosphopantetheinyl (Ppant) transferase,

(iii) a megasynthase LovF-TE including an ACP domain, condensing two units of malonyl-CoA to 2MB-ACP, cleaving 2MB acid from the ACP domain which is converted into 2MB-CoA by the CCL, and no 2MB acid is supplemented exogenously.

Clause 152. The method of clause 151, wherein the Ppant is from Aspergillus nidulans (An) and the megasynthase LovF-TE is from Aspergillus terreus (Ast).

Clause 153. The method of clause 152, wherein the Ppant is AnNpgA according to SEQ ID NO: 237 and the megasynthase LovF-TE is AstLovF-TE according to SEQ ID NO: 235.

Clause 154. The method of clause 153, wherein AnNpgA is encoded by the nucleotide sequence SEQ ID NO: 238 and AstLovF-TE is encoded by the nucleotide sequence SEQ ID NO: 236.

Clause 155. A yeast engineered to produce 2MB-CoA according to the method of any one of clauses 148 to 154.

Clause 156. A method of producing UDP-Arabinofuranose (UDP-Araf) in yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce UDP-Xyl, heterologous genes encoding the following enzymes:

(i) a UDP-Xyl epimerase (UXE) converting UDP-Xyl into UDP-Arabinopyranose (UDP-Arap), and

(ii) a UDP-Arabinose mutase (UAM) converting UDP-Arap into UDP-Arabinofuranose (UDP-Araf).

Clause 157. The method of clause 156, wherein the UXE and the UAM are independently selected from A. thaliana (At) and H. vulgare (Hv).

Clause 158. The method of clause 157, wherein the UXE is selected from AtUXE according to SEQ ID NO: 199, AtUXE2 according to SEQ ID NO: 202, HvUXE-1 according to SEQ ID NO: 240, HvUXE-2 according to SEQ ID NO: 242 and AtUGE3 according to SEQ ID NO: 205 and the UAM is selected from AtUAM1 according to SEQ ID NO: 208 and HvUAM according to SEQ ID NO: 211.

Clause 159. The method of clause 158, wherein AtUXE is encoded by the nucleotide sequence SEQ ID NO: 200, AtUXE2 is encoded by the nucleotide sequence SEQ ID NO: 203, HvUXE-1 is encoded by the nucleotide sequence SEQ ID NO: 241, HvUXE-2 is encoded by the nucleotide sequence SEQ ID NO: 243, AtUAM1 is encoded by the nucleotide sequence SEQ ID NO: 209, HvUAM is encoded by the nucleotide sequence SEQ ID NO: 212, and AtUGE3 is encoded by the nucleotide sequence SEQ ID NO: 206.

Clause 160. The method of any one of clauses 156 to 159, wherein the yeast engineered to produce UDP-Xyl is according to clause 62.

Clause 161. A yeast which is engineered to produce UDP-Araf according to the method of any of clauses 156 to 160.

Clause 162. A method of producing UDP-Araf in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding the following enzymes:

(i) an arabinokinase (AraK) and

(ii) a UDP-sugar pyrophosphorylase (USP), and arabinose is supplemented exogenously.

Clause 163. The method of clause 162, wherein the AraK and the USP are independently selected from A. thaliana (At) and Leptospira interrogans (Lei).

Clause 164. The method of clause 163, wherein the AraK is selected from AtAraK according to SEQ ID NO: 214 and LeiAraK according to SEQ ID NO: 217 and the USP is selected from AtUSP according to SEQ ID NO: 223 and LeiUSP according to SEQ ID NO: 226.

Clause 165. The method of clause 164, wherein the AtAraK is encoded by the nucleotide sequence SEQ ID NO: 215, LeiAraK is encoded by the nucleotide sequence SEQ ID NO: 218, AtUSP is encoded by the nucleotide sequence SEQ ID NO: 224 and LeiUSP is encoded by the nucleotide sequence SEQ ID NO: 227.

Clause 166. The method of any one of clauses 162 to 165, wherein the overexpressing further comprises overexpressing a heterologous gene encoding (iii) an arabinose transporter (AraT).

Clause 167. The method of clause 166, wherein the AraT is PrAraT from Penicillium rubens Wisconsin according to SEQ ID NO: 220.

Clause 168. The method of clause 167, wherein PrAraT is encoded by the nucleotide sequence SEQ ID NO: 221.

Clause 169. A yeast which is engineered to produce UDP-Araf according to the method of any one of clauses 162 to 168.

Clause 170. A method of producing an acylated and glycosylated QA derivative in yeast, wherein the derivative is QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9 or QA-C3-GGX-C28-FRXA-C9, and the method comprises the step of overexpressing, in a yeast engineered to produce QA-C3-GGR-C28-FRX, QA-C3-GGX-C28-FRX, QA-C3-GGR-C28-FRXX, QA-C3-GGX-C28-FRXX, QA-C3-GGR-C28-FRXA, or QA-C3-GGX-C28-FRXA, heterologous genes encoding the following enzymes:

(i) a carboxyl coenzyme A ligase (CCL) converting 2MB acid into 2MB-CoA,

(ii) a chalcone-synthase-like type III PKS (Polyketide synthase) condensing malonyl-CoA with 2MB-CoA to form C9-Keto-CoA,

(iii) a keto-reductase (KR) converting C9-Keto-CoA into C9-CoA, and

(iv) an acyltransferase transferring and attaching a first C9-CoA unit to QA-C3-GGR-C28-FRX, QA-C3-GGX-C28-FRX, QA-C3-GGR-C28-FRXX, QA-C3-GGX-C28-FRXX, QA-C3-GGR-C28-FRXA, or QA-C3-GGX-C28-FRXA to form QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9 or QA-C3-GGX-C28-FRXA-C9, wherein 2MB acid is supplemented exogenously.

Clause 171. The method of clause 170, wherein the CCL, the chalcone-synthase-like type III PKS, the KR and the acyltransferase are from Q. saponaria.

Clause 172. The method of clause 171, wherein the CCL is QsCCL according to SEQ ID NO: 178, the chalcone-synthase-like type III PKS is QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, or both QsChSD according to SEQ ID NO: 181 and QsChSE according to SEQ ID NO: 184, the keto-reductase is QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, or both QsKR11 according to SEQ ID NO: 187 and QsKR23 according to SEQ ID NO: 190, and the acyltransferase is QsDMOT9 according to SEQ ID NO: 193.

Clause 173. The method of clause 171, wherein the chalcone-synthase-like type III PKS are both QsChSD according to SEQ ID NO: 181 and QsChSE according to SEQ ID NO: 184.

Clause 174. The method of clause 170 wherein the KR are both QsKR11 according to SEQ ID NO: 187 and QsKR23 according to SEQ ID NO: 190.

Clause 175. The method of any one of clauses 172 to 174, wherein the CCL is QsCCL according to SEQ ID NO: 178, the chalcone-synthase-like type III PKS are QsChSD according to SEQ ID NO: 181 and QsChSE according to SEQ ID NO: 184, the KR are QsKR11 according to SEQ ID NO: 187 and QsKR23 according to SEQ ID NO: 190 and the acyltransferase is QsDMOT9 according to SEQ ID NO: 193.

Clause 176. The method of clause 175, wherein QsCCL is encoded by SEQ ID NO: 179, QsChSD is encoded by the nucleotide sequence SEQ ID NO: 182, QsChSE is encoded by the nucleotide sequence SEQ ID NO: 185, QsKR11 is encoded by the nucleotide sequence SEQ ID NO: 188, QsKR23 is encoded by the nucleotide sequence SEQ ID NO: 191 and QsDMOT9 is encoded by the nucleotide sequence SEQ ID NO: 194.

Clause 177. The method of any one of clauses 170 to 176, wherein the yeast engineered to produce QA-C3-GGR-C28-FRX and QA-C3-GGX-C28-FRX is according to clause 127, the yeast engineered to produce QA-C3-GGR-C28-FRXX and QA-C3-GGX-C28-FRXX is according to clause 140 and the yeast engineered to produce QA-C3-GGR-C28-FRXA and QA-C3-GGX-C28-FRXA is according to clause 147.
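The abbreviated derivative names used from clause 170 onward can be related to the full names of the earlier clauses through the cross-references in clause 177 (for example, QA-C3-GGR-C28-FRX corresponds to the product of clause 127). The following is a minimal illustrative sketch of that correspondence and forms no part of the clauses. The function name `expand_shorthand` and the dictionary layout are assumptions made for illustration; acyl and terminal suffixes such as C9, C18 or Araf are passed through unchanged.

```python
# Illustrative sketch: expand the abbreviated sugar-chain names used in
# clauses 170 onward into the full names used in clauses 110 to 147.
# The mapping is inferred from the cross-references in clause 177;
# the function and variable names are hypothetical.

C3_CHAINS = {
    "GGR": "GlcA-Gal-Rha",
    "GGX": "GlcA-Gal-Xyl",
}

C28_CHAINS = {
    "FRX": "Fuc-Rha-Xyl",
    "FRXX": "Fuc-Rha-Xyl-Xyl",
    "FRXA": "Fuc-Rha-Xyl-Api",
}


def expand_shorthand(name):
    """Replace each abbreviated chain token with its full residue list."""
    expanded = []
    for token in name.split("-"):
        if token in C3_CHAINS:
            expanded.append(C3_CHAINS[token])
        elif token in C28_CHAINS:
            expanded.append(C28_CHAINS[token])
        else:
            # QA, C3, C28, C9, C18, Araf, Xyl and similar tokens are kept as-is.
            expanded.append(token)
    return "-".join(expanded)


if __name__ == "__main__":
    print(expand_shorthand("QA-C3-GGR-C28-FRX"))
    print(expand_shorthand("QA-C3-GGX-C28-FRXA-C9"))
```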

Clause 178. A yeast which is engineered to produce QA-C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9, or QA-C3-GGX-C28-FRXA-C9 according to the method of any one of clauses 170 to 177.

Clause 179. The method of any one of clauses 170 to 178, wherein the derivative is QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXA-C18 or QA-C3-GGX-C28-FRXA-C18, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(v) an acyltransferase QsDMOT4 according to SEQ ID NO: 196 attaching a second C9-CoA unit to C3-GGR-C28-FRX-C9, QA-C3-GGX-C28-FRX-C9, QA-C3-GGR-C28-FRXX-C9, QA-C3-GGX-C28-FRXX-C9, QA-C3-GGR-C28-FRXA-C9, or QA-C3-GGX-C28-FRXA-C9 to form C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXA-C18 or QA-C3-GGX-C28-FRXA-C18.

Clause 180. The method of clause 179, wherein QsDMOT4 is encoded by the nucleotide sequence SEQ ID NO: 197.

Clause 181. A yeast which is engineered to produce QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXA-C18, or QA-C3-GGR-C28-FRXA-C18 according to the method of clause 179 or clause 180.

Clause 182. The method of any one of clauses 179 or 180, wherein the derivative is QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf, or QA-C3-GGX-C28-FRXA-C18-Araf, the yeast is further engineered to produce UDP-Araf, and the overexpressing further comprises overexpressing a heterologous gene encoding the following enzyme:

(vi) an arabinotransferase (ArafT) transferring UDP-Araf and attaching an Araf residue to QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRXX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GGR-C28-FRXA-C18, or QA-C3-GGX-C28-FRXA-C18 to form QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf or QA-C3-GGX-C28-FRXA-C18-Araf.

Clause 183. The method of clause 182, wherein the ArafT is from Q. saponaria (Qs).

Clause 184. The method of clause 182 or clause 183, wherein the ArafT is selected from QsArafT according to SEQ ID NO: 229 and QsArafT2 according to SEQ ID NO: 232.

Clause 185. The method of clause 184, wherein QsArafT is encoded by the nucleotide sequence SEQ ID NO: 230, and QsArafT2 is encoded by the nucleotide sequence SEQ ID NO: 233.

Clause 186. The method of clause 184, wherein the ArafT is QsArafT2 according to SEQ ID NO: 232.

Clause 187. The method of clause 186, wherein QsArafT2 is encoded by the nucleotide sequence SEQ ID NO: 233.

Clause 188. The method of any one of clauses 182 to 187, wherein the yeast engineered to produce UDP-Araf is according to clause 161 or clause 169.

Clause 189. A yeast which is engineered to produce QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf, or QA-C3-GGX-C28-FRXA-C18-Araf according to the method of any one of clauses 182 to 188.

Clause 190. A method of producing QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3-GGR-C28-FRXX-C18-Xyl, QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXA-C18-Xyl or QA-C3-GGR-C28-FRX-C18-Xyl in a yeast, wherein the method comprises the step of overexpressing, in a yeast engineered to produce QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GRX-C28-FRXX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGX-C28-FRXA-C18 or QA-C3-GGR-C28-FRX-C18, a heterologous gene encoding an arabinotransferase (ArafT) transferring UDP-Xyl and attaching a Xyl residue to QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GRX-C28-FRXX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGX-C28-FRXA-C18 and QA-C3-GGR-C28-FRX-C18 to form QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3-GRX-C28-FRXX-C18-Xyl, QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXA-C18-Xyl or QA-C3-GGR-C28-FRX-C18-Xyl.

Clause 191. The method of clause 190, wherein the ArafT is QsArafT according to SEQ ID NO: 229.

Clause 192. The method of clause 191, wherein QsArafT is encoded by the nucleotide sequence SEQ ID NO: 230.

Clause 193. The method of any one of clauses 190 to 192, wherein the yeast engineered to produce QA-C3-GGX-C28-FRX-C18, QA-C3-GGR-C28-FRX-C18, QA-C3-GGX-C28-FRXX-C18, QA-C3-GRX-C28-FRXX-C18, QA-C3-GGX-C28-FRX-C18, QA-C3-GGX-C28-FRXA-C18 and QA-C3-GGR-C28-FRX-C18 is according to clause 181.

Clause 194. A yeast engineered to produce QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGR-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXX-C18-Xyl, QA-C3-GRX-C28-FRXX-C18-Xyl, QA-C3-GGX-C28-FRX-C18-Xyl, QA-C3-GGX-C28-FRXA-C18-Xyl or QA-C3-GGR-C28-FRX-C18-Xyl according to the method of any one of clauses 190 to 193.

Clause 195. The method of any one of clauses 170 to 177, 179 to 180, 182 to 188 and 190 to 193, wherein the overexpressing further comprises the overexpressing of heterologous genes encoding the following enzymes:

(vii) a phosphopantetheinyl (Ppant) transferase,

(viii) a megasynthase LovF-TE including an ACP domain, condensing two units of malonyl-CoA to 2MB-ACP, cleaving 2MB acid from the ACP domain which is converted into 2MB-CoA by the CoA ligase (CCL), and no 2MB acid is supplemented exogenously.

Clause 196. The method of clause 195, wherein the Ppant is from Aspergillus nidulans (An) and the megasynthase LovF-TE is from Aspergillus terreus (Ast).

Clause 197. The method of clause 196, wherein the Ppant is AnNpgA according to SEQ ID NO: 237 and the megasynthase LovF-TE is AstLovF-TE according to SEQ ID NO: 235.

Clause 198. The method of clause 197, wherein AnNpgA is encoded by the nucleotide sequence SEQ ID NO: 238 and AstLovF-TE is encoded by the nucleotide sequence SEQ ID NO: 236.

Clause 199. A yeast which is engineered to produce QA-C3-GGR-C28-FRX-C18-Araf, QA-C3-GGX-C28-FRX-C18-Araf, QA-C3-GGR-C28-FRXX-C18-Araf, QA-C3-GGX-C28-FRXX-C18-Araf, QA-C3-GGR-C28-FRXA-C18-Araf, or QA-C3-GGX-C28-FRXA-C18-Araf according to the method of any one of clauses 195 to 198.

Clause 200. A method of producing QA-C3-GGX-C28-FRXX-C18-Araf (QS-21-Xyl) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding GvBAS according to SEQ ID NO: 10, QsC28C16 according to SEQ ID NO: 23, QsC23 according to SEQ ID NO: 29, QsC28 according to SEQ ID NO: 41, AtATR1 according to SEQ ID NO: 49, Qsb5 according to SEQ ID NO: 55, SvMSBP1 according to SEQ ID NO: 67, AtUGD_A101L according to SEQ ID NO: 108, QsCslG2 according to SEQ ID NO: 78, QsGalT according to SEQ ID NO: 116, AtUXS according to SEQ ID NO: 105, QsC3XylT according to SEQ ID NO: 122, SvNMD according to SEQ ID NO: 90, SvUG46DH according to SEQ ID NO: 87, QsFucT according to SEQ ID NO: 93, AtRHM2 according to SEQ ID NO: 102, QsRhaT according to SEQ ID NO: 119, QsC28XylT3 according to SEQ ID NO: 125, QsC28XylT4 according to SEQ ID NO: 128, QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, QsDMOT9 according to SEQ ID NO: 193, QsDMOT4 according to SEQ ID NO: 196, AtUXE according to SEQ ID NO: 199, AtUAM1 according to SEQ ID NO: 208, QsArafT2 according to SEQ ID NO: 232, AnNpgA according to SEQ ID NO: 237, QsCCL according to SEQ ID NO: 178 and AstLovF-TE according to SEQ ID NO: 235.

Clause 201. The method of clause 200, wherein GvBAS is encoded by the nucleotide sequence SEQ ID NO: 11, QsC28C16 is encoded by the nucleotide sequence SEQ ID NO: 24, QsC23 is encoded by the nucleotide sequence SEQ ID NO: 30, QsC28 is encoded by the nucleotide sequence SEQ ID NO: 42, AtATR1 is encoded by the nucleotide sequence SEQ ID NO: 50, Qsb5 is encoded by the nucleotide sequence SEQ ID NO: 56, SvMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 68, AtUGD_A101L is encoded by the nucleotide sequence SEQ ID NO: 109, QsCslG2 is encoded by the nucleotide sequence SEQ ID NO: 82, QsGalT is encoded by the nucleotide sequence SEQ ID NO: 117, AtUXS is encoded by the nucleotide sequence SEQ ID NO: 106, QsC3XylT is encoded by the nucleotide sequence SEQ ID NO: 123, SvNMD is encoded by the nucleotide sequence SEQ ID NO: 91, SvUG46DH is encoded by the nucleotide sequence SEQ ID NO: 88, QsFucT is encoded by the nucleotide sequence SEQ ID NO: 94, AtRHM2 is encoded by the nucleotide sequence SEQ ID NO: 103, QsRhaT is encoded by the nucleotide sequence SEQ ID NO: 220, QsC28XylT3 is encoded by the nucleotide sequence SEQ ID NO: 126, QsC28XylT4 is encoded by the nucleotide sequence SEQ ID NO: 129, QsChSD is encoded by the nucleotide sequence SEQ ID NO: 182, QsChSE is encoded by the nucleotide sequence SEQ ID NO: 185, QsKR11 is encoded by the nucleotide sequence SEQ ID NO: 188, QsKR23 is encoded by the nucleotide sequence SEQ ID NO: 191, QsDMOT9 is encoded by the nucleotide sequence SEQ ID NO: 194, QsDMOT4 is encoded by the nucleotide sequence SEQ ID NO: 197, AtUXE is encoded by the nucleotide sequence SEQ ID NO: 200, AtUAM1 is encoded by the nucleotide sequence SEQ ID NO: 209, QsArafT2 is encoded by the nucleotide sequence SEQ ID NO: 233, AnNpgA is encoded by the nucleotide sequence SEQ ID NO: 238, QsCCL is encoded by the nucleotide sequence SEQ ID NO: 179 and AstLovF-TE is encoded by the nucleotide sequence SEQ ID NO: 236.

Clause 202. A yeast which is engineered to produce QS-21-Xyl according to the method of clause 201 or clause 202.

Clause 203. A method of producing QA-C3-GGX-C28-FRXA-C18-Araf (QS-21-Api) in yeast, wherein the method comprises the step of overexpressing heterologous genes encoding GvBAS according to SEQ ID NO: 10, QsC28C16 according to SEQ ID NO: 23, QsC23 according to SEQ ID NO: 29, QsC28 according to SEQ ID NO: 41, AtATR1 according to SEQ ID NO: 49, Qsb5 according to SEQ ID NO: 55, SvMSBP1 according to SEQ ID NO: 67, AtUGD_A101L according to SEQ ID NO: 108, QsCslG2 according to SEQ ID NO: 81, QsGalT according to SEQ ID NO: 116, AtUXS according to SEQ ID NO: 105, QsC3XylT according to SEQ ID NO: 122, SvNMD according to SEQ ID NO: 90, SvUG46DH according to SEQ ID NO: 87, QsFucT according to SEQ ID NO: 93, AtRHM2 according to SEQ ID NO: 102, QsRhaT according to SEQ ID NO: 119, QsC28XylT3 according to SEQ ID NO: 125, QsC28ApiT4 according to SEQ ID NO: 151, QsChSD according to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, QsKR11 according to SEQ ID NO: 187, QsKR23 according to SEQ ID NO: 190, QsDMOT9 according to SEQ ID NO: 193, QsDMOT4 according to SEQ ID NO: 196, AtUXE according to SEQ ID NO: 199, AtUAM1 according to SEQ ID NO: 208, QsArafT2 according to SEQ ID NO: 232, AnNpgA according to SEQ ID NO: 237, QsCCL according to SEQ ID NO: 178 and AstLovF-TE according to SEQ ID NO: 235.

Clause 204. The method of clause 203, wherein GvBAS is encoded by the nucleotide sequence SEQ ID NO: 11, QsC28C16 is encoded by the nucleotide sequence SEQ ID NO: 24, QsC23 is encoded by the nucleotide sequence SEQ ID NO: 30, QsC28 is encoded by the nucleotide sequence SEQ ID NO: 42, AtATR1 is encoded by the nucleotide sequence SEQ ID NO: 50, Qsb5 is encoded by the nucleotide sequence SEQ ID NO: 56, SvMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 68, AtUGD_A101L is encoded by the nucleotide sequence SEQ ID NO: 109, QsCslG2 is encoded by the nucleotide sequence SEQ ID NO: 82, QsGalT is encoded by the nucleotide sequence SEQ ID NO: 117, AtUXS is encoded by the nucleotide sequence SEQ ID NO: 106, QsC3XylT is encoded by the nucleotide sequence SEQ ID NO: 123, SvNMD is encoded by the nucleotide sequence SEQ ID NO: 91, SvUG46DH is encoded by the nucleotide sequence SEQ ID NO: 88, QsFucT is encoded by the nucleotide sequence SEQ ID NO: 94, AtRHM2 is encoded by the nucleotide sequence SEQ ID NO: 103, QsRhaT is encoded by the nucleotide sequence SEQ ID NO: 120, QsC28XylT3 is encoded by the nucleotide sequence SEQ ID NO: 126, QsC28ApiT4 is encoded by the nucleotide sequence SEQ ID NO: 152, QsChSD is encoded by the nucleotide sequence SEQ ID NO: 182, QsChSE is encoded by the nucleotide sequence SEQ ID NO: 185, QsKR11 is encoded by the nucleotide sequence SEQ ID NO: 188, QsKR23 is encoded by the nucleotide sequence SEQ ID NO: 191, QsDMOT9 is encoded by the nucleotide sequence SEQ ID NO: 194, QsDMOT4 is encoded by the nucleotide sequence SEQ ID NO: 197, AtUXE is encoded by the nucleotide sequence SEQ ID NO: 200, AtUAM1 is encoded by the nucleotide sequence SEQ ID NO: 209, QsArafT2 is encoded by the nucleotide sequence SEQ ID NO: 233, AnNpgA is encoded by the nucleotide sequence SEQ ID NO: 238, QsCCL is encoded by the nucleotide sequence SEQ ID NO: 179 and AstLovF-TE is encoded by the nucleotide sequence SEQ ID NO: 236.

Clause 205. A yeast which is engineered to produce QS-21-Api according to the method of clause 204 or clause 205.

Clause 206. The method of any one of clauses 1 to 38, 42 to 45, 47 to 50, 52 to 61, 63 to 72, 74 to 80, 82 to 86, 88 to 91, 93 to 100, 102 to 108, 110 to 115, 118 to 121, 123 to 126, 128 to 139, 141 to 146, 148 to 154, 156 to 160, 162 to 168, 170 to 177, 179, 180, 182 to 188, 190 to 193, 195 to 198, 200, 201, 203 and 204, or the yeast of any one of clauses 39, 40, 46, 51, 62, 73, 81, 87, 92, 101, 109, 116, 117, 122, 127, 140, 147, 155, 161, 169, 178, 181, 189, 194, 199, 202 and 205, wherein GvBAS (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 10, QsC28C16 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 23, QsC23 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 29, QsC28 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 41, AtATR1 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 49, Qsb5 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 55, SvMSBP1 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 67, AtUGD_A101L (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 108, QsCslG2 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 81, QsGalT (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 116, AtUXS (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 105, QsC3XylT (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 122, SvNMD (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 90, SvUG46DH (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 87, QsFucT (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 93, AtRHM2 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 102, QsC28XylT3 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 125, QsC28XylT4 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 128, QsC28ApiT4 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 151, QsChSD (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 181, QsChSE according to SEQ ID NO: 184, QsKR11 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 187, QsKR23 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 190, QsDMOT9 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 193, QsDMOT4 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 196, AtUXE (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 199, AtUAM1 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 208, QsArafT2 (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 232, AnNpgA (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 237, QsCCL (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 178 and AstLovF-TE (when present) is according to an amino acid sequence at least 70%, 80%, 90%, 95%, 98%, or 99% identical to SEQ ID NO: 235.

Clause 207. The method of any one of clauses 1 to 38, 42 to 45, 47 to 50, 52 to 61, 63 to 72, 74 to 80, 82 to 86, 88 to 91, 93 to 100, 102 to 108, 110 to 115, 118 to 121, 123 to 126, 128 to 139, 141 to 146, 148 to 154, 156 to 160, 162 to 168, 170 to 177, 179, 180, 182 to 188, 190 to 193, 195 to 198, 200, 201, 203, 204 and 206, or the yeast of any one of clauses 39, 40, 46, 51, 62, 73, 81, 87, 92, 101, 109, 116, 117, 122, 127, 140, 147, 155, 161, 169, 178, 181, 189, 194, 199, 202, 205 and 206, wherein the heterologous genes are integrated into the genome of the yeast.

Clause 208. The method, or yeast, of clause 207, wherein one or more copies of one or more of the heterologous genes are integrated.

Clause 209. The method, or yeast, of clause 208, wherein the one or more copies ranges from 2 to 5.

Clause 210. The method, or yeast, of clause 208 or clause 209, wherein at least 2 copies of the genes encoding the C16 oxidase, the C23 oxidase and the C28 oxidase are integrated.

Clause 211. The method, or yeast, of any one of clauses 208 to 210, wherein at least 3 copies of the gene encoding the UXS (when present) are integrated.

Clause 212. The method, or yeast, of any one of clauses 208 to 211, wherein the nucleotide sequence of the heterologous genes is codon-optimized.

Clause 213. The method, or yeast, of clause 212, wherein GvBAS (when present) is encoded by the nucleotide sequence SEQ ID NO: 12, QsC28C16 (when present) is encoded by the nucleotide sequence SEQ ID NO: 25, QsC23 (when present) is encoded by the nucleotide sequence SEQ ID NO: 31, QsC28 (when present) is encoded by the nucleotide sequence SEQ ID NO: 43, AtATR1 (when present) is encoded by the nucleotide sequence SEQ ID NO: 51, Qsb5 (when present) is encoded by the nucleotide sequence SEQ ID NO: 57, SvMSBP1 is encoded by the nucleotide sequence SEQ ID NO: 69, AtUGD_A101L (when present) is encoded by the nucleotide sequence SEQ ID NO: 109, QsCslG2 (when present) is encoded by the nucleotide sequence SEQ ID NO: 83, QsGalT (when present) is encoded by the nucleotide sequence SEQ ID NO: 118, AtUXS (when present) is encoded by the nucleotide sequence SEQ ID NO: 107, QsC3XylT (when present) is encoded by the nucleotide sequence SEQ ID NO: 124, SvNMD (when present) is encoded by the nucleotide sequence SEQ ID NO: 92, SvUG46DH (when present) is encoded by the nucleotide sequence SEQ ID NO: 89, QsFucT (when present) is encoded by the nucleotide sequence SEQ ID NO: 95, AtRHM2 (when present) is encoded by the nucleotide sequence SEQ ID NO: 104, QsRhaT (when present) is encoded by the nucleotide sequence SEQ ID NO: 121, QsC28XylT3 (when present) is encoded by the nucleotide sequence SEQ ID NO: 127, QsC28XylT4 (when present) is encoded by the nucleotide sequence SEQ ID NO: 130, QsC28ApiT4 (when present) is encoded by the nucleotide sequence SEQ ID NO: 153, QsChSD (when present) is encoded by the nucleotide sequence SEQ ID NO: 183, QsChSE (when present) is encoded by the nucleotide sequence SEQ ID NO: 186, QsKR11 (when present) is encoded by the nucleotide sequence SEQ ID NO: 189, QsKR23 (when present) is encoded by the nucleotide sequence SEQ ID NO: 192, QsDMOT9 (when present) is encoded by the nucleotide sequence SEQ ID NO: 195, QsDMOT4 is encoded by the nucleotide sequence SEQ ID NO: 198, AtUXE (when present) is encoded by the nucleotide sequence SEQ ID NO: 201, AtUAM1 (when present) is encoded by the nucleotide sequence SEQ ID NO: 210, QsArafT2 (when present) is encoded by the nucleotide sequence SEQ ID NO: 234, AnNpgA (when present) is encoded by the nucleotide sequence SEQ ID NO: 239, QsCCL (when present) is encoded by the nucleotide sequence SEQ ID NO: 180 and AstLovF-TE (when present) is encoded by the nucleotide sequence SEQ ID NO: 236.

Clause 214. The method of any one of clauses 1 to 38, 42 to 45, 47 to 50, 52 to 61, 63 to 72, 74 to 80, 82 to 86, 88 to 91, 93 to 100, 102 to 108, 110 to 115, 118 to 121, 123 to 126, 128 to 139, 141 to 146, 148 to 154, 156 to 160, 162 to 168, 170 to 177, 179, 180, 182 to 188, 190 to 193, 195 to 198, 200, 201, 203, 204, and 206 to 213, wherein the method comprises the further step of culturing the yeast to allow production of QA, respective UDP-sugars, and/or respective QA derivatives.

Clause 215. The method of clause 214, wherein the culturing step ranges from 2 to 6 days.

Clause 216. The method of clause 215, wherein the culturing step is about 3 days.

Clause 217. The method of any one of clauses 1 to 38, 42 to 45, 47 to 50, 52 to 61, 63 to 72, 74 to 80, 82 to 86, 88 to 91, 93 to 100, 102 to 108, 110 to 115, 118 to 121, 123 to 126, 128 to 139, 141 to 146, 148 to 154, 156 to 160, 162 to 168, 170 to 177, 179, 180, 182 to 188, 190 to 193, 195 to 198, 200, 201, 203, 204, and 206 to 216, or the yeast of any one of clauses 39, 40, 46, 51, 62, 73, 81, 87, 92, 101, 109, 116, 117, 122, 127, 140, 147, 155, 161, 169, 178, 181, 189, 194, 199, 202, and 205 to 213, wherein the heterologous genes are overexpressed under inducible promoters.

Clause 218. The method, or the yeast, of clause 217, wherein induction is for 2 to 5 days, and yeasts are cultured for 2 to 5 more days.

Clause 219. The method of any one of clauses 1 to 38, 42 to 45, 47 to 50, 52 to 61, 63 to 72, 74 to 80, 82 to 86, 88 to 91, 93 to 100, 102 to 108, 110 to 115, 118 to 121, 123 to 126, 128 to 139, 141 to 146, 148 to 154, 156 to 160, 162 to 168, 170 to 177, 179, 180, 182 to 188, 190 to 193, 195 to 198, 200, 201, 203, 204, and 206 to 218, wherein the method further comprises the step of isolating UDP-sugars, C3-glycosylated QA derivatives, C28-glycosylated QA derivatives or acylated and glycosylated QA derivatives.

Clause 220. QA obtained according to the method of any one of clauses 1 to 38.

Clause 221. C3-glycosylated QA derivatives, C28-glycosylated QA derivatives or acylated and glycosylated QA derivatives obtained according to the method of clause 219.

Clause 222. The use of the QA derivatives of clause 221 as an adjuvant.

Clause 223. The use of clause 222, wherein the adjuvant is a liposomal formulation.

Clause 224. The use of clause 222 or clause 223, wherein the adjuvant comprises a TLR4 agonist.

Clause 225. The use of clause 224, wherein the TLR4 agonist is 3D-MPL.

Clause 226. An adjuvant composition comprising QS-21-Xyl according to clause 201 and/or QS-21-Api according to clause 203.

Clause 227. An isolated β-amyrin synthase (SvBAS) according to SEQ ID NO: 13.

Clause 228. An isolated β-amyrin synthase (QsBAS) according to SEQ ID NO: 15.

Clause 229. An isolated CYP C16 oxidase (QsC28C16) according to SEQ ID NO: 23.

Clause 230. An isolated CYP C16 oxidase (SvC16) according to SEQ ID NO: 26.

Clause 231. An isolated CYP C23 oxidase (SvC23-1) according to SEQ ID NO: 32.

Clause 232. An isolated CYP C23 oxidase (SvC23-2) according to SEQ ID NO: 35.

Clause 233. An isolated CYP C28 oxidase (SvC28) according to SEQ ID NO: 44.

Clause 234. An isolated Cytochrome b5 protein (Qsb5) according to SEQ ID NO: 55.

Clause 235. An isolated Cytochrome b5 protein (Svb5) according to SEQ ID NO: 61.

Clause 236. An isolated UDP-GlcA transferase (SvCslG) according to SEQ ID NO: 76.

Clause 237. An isolated MSBP protein (SvMSBP1) according to SEQ ID NO: 67.

Clause 238. An isolated MSBP protein (SvMSBP2) according to SEQ ID NO: 70.

Clause 239. An isolated MSBP protein (QsMSBP1) according to SEQ ID NO: 73.

Clause 240. An isolated UDP-glucose-4,6-dehydratase (SvUG46DH) according to SEQ ID NO: 87.

Clause 241. An isolated UDP-4-keto-6-deoxy-glucose reductase (SvNMD) according to SEQ ID NO: 90.

Clause 242. An isolated UDP-Galactose transferase (SvGalT) according to SEQ ID NO: 98.

Clause 243. An isolated UDP-Fucose transferase (SvFucT) according to SEQ ID NO: 96.

Clause 244. An isolated UDP-Xylose transferase (SvC3XylT) according to SEQ ID NO: 100.

Clause 245. An isolated UDP-Arabinofuranose transferase (QsArafT2) according to SEQ ID NO: 229.

Clause 246. An isolated UDP-glucose dehydrogenase (AtUGD_A101L) according to SEQ ID NO: 108.

Clause 247. An isolated UDP-Xylose transferase (QsC28XylT4-3aa) according to SEQ ID NO: 131.

Clause 248. An isolated UDP-Xylose transferase (QsC28XylT4-6aa) according to SEQ ID NO: 134.

Clause 249. An isolated UDP-Xylose transferase (QsC28XylT4-9aa) according to SEQ ID NO: 137.

Clause 250. An isolated UDP-Xylose transferase (QsC28XylT4-12aa) according to SEQ ID NO: 140.

Clause 251. An isolated UDP-Xylose transferase (SUMO-QsC28XylT4) according to SEQ ID NO: 143.

Clause 252. An isolated UDP-Xylose transferase (TrXA-QsC28XylT4) according to SEQ ID NO: 145.

Clause 253. An isolated UDP-Xylose transferase (MBP-QsC28XylT4) according to SEQ ID NO: 147.

Clause 254. An isolated UDP-Xylose transferase (QsC28XylT3-3xGGGS-QsC28XylT4) according to SEQ ID NO: 149.

Clause 255. An isolated type I polyketide synthase (AstLovF-TE) according to SEQ ID NO: 235.

EXAMPLES

The genotypes of YL and SC yeast strains used in the Examples described below are provided in Table 3 and Table 4, respectively. Yeast engineering was carried out as described in Example 5 below (unless stated otherwise). Heterologous gene expression in yeast was carried out using nucleotide sequences that have been codon-optimized in order to increase the production of the corresponding protein. It is to be understood that codon optimization does not affect the amino acid sequence of the protein which is overexpressed. Heterologous genes have been integrated into the genome of the different yeast strains (as indicated), unless stated otherwise, under galactose-inducible promoters. After 2 days of culturing, expression of the heterologous genes has been induced with galactose added to the culture medium. 3 days post-induction, the production of sugars, QA precursors, QA, and acylated and/or glycosylated QA derivatives (as indicated) has been assessed by analysing their presence, by liquid chromatography-mass spectrometry (LC-MS) detection (as described in Example 6 below), after extraction of the yeast culture medium (as described in Example 5 below), unless stated otherwise.
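The statement that codon optimization leaves the encoded amino acid sequence unchanged can be illustrated with a minimal sketch. The two DNA strings below are hypothetical toy sequences (they are not taken from any SEQ ID NO), and the codon table is a deliberately partial excerpt of the standard genetic code covering only the codons used here.

```python
# Partial standard genetic code: only the codons used in this toy example.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",              # synonymous alanine codons
    "AAA": "K", "AAG": "K",              # synonymous lysine codons
    "TTA": "L", "TTG": "L", "CTG": "L",  # synonymous leucine codons
    "TAA": "*", "TGA": "*",              # stop codons
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequences: same protein, different codon usage.
native_cds = "ATGGCTAAATTATAA"
optimized_cds = "ATGGCCAAGTTGTGA"

print(translate(native_cds), translate(optimized_cds))
```

Both toy sequences differ at the nucleotide level but translate to the same peptide, which is the property relied upon when a codon-optimized nucleotide sequence is said to encode a given amino acid sequence.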

1.1 Production of the β-amyrin precursor

A previously developed mevalonate-overproducing strain, Jwy601, a CEN.PK2-based Saccharomyces cerevisiae strain, was chosen as the parent strain (Wong et al. 2018). Jwy601 has been engineered to overexpress genes encoding β-amyrin synthases (BAS) of different plant origins by genome integration, and the respective engineered yeast strains have been tested for their ability to convert 2,3-oxidosqualene into β-amyrin by analysing the presence of β-amyrin by gas chromatography-mass spectrometry (GC-MS) (using a commercially available standard).

Results

BAS from Artemisia annua (Aa) (named ‘AaBAS’ enzyme - SEQ ID NO: 1 encoded by SEQ ID NO: 3), Arabidopsis thaliana (At) (named ‘AtBAS’ enzyme - SEQ ID NO: 4 encoded by SEQ ID NO: 6), Glycyrrhiza glabra (Gg) (named ‘GgBAS’ enzyme - SEQ ID NO: 7 encoded by SEQ ID NO: 9) and Gypsophila vaccaria (Gv) (named ‘GvBAS’ enzyme - SEQ ID NO: 10 encoded by SEQ ID NO: 12) have been tested. The BAS homolog from G. vaccaria yielded the highest production of β-amyrin (see Fig. 3). The yeast strain engineered with GvBAS (MLY-01) was therefore selected for further engineering as described below.

1.2 Production of QA and production optimization

1.2.1 Production of QA precursors in yeast

MLY-01 has been further engineered to co-express different cytochrome P450 (CYP) oxidases (C16, C23 and C28 oxidases) with a cytochrome P450 reductase (CPR) of different plant origins via sequential integration into the yeast genome. The production of QA and QA precursors has been analysed (using respective standards commercially available, e.g. from Merck, as a reference) by LC-MS in the yeast strains engineered with the following combination of enzymes:

- A CPR from A. thaliana (named ‘AtATR1’ - SEQ ID NO: 49 encoded by SEQ ID NO: 51)

- A C16 oxidase from Bupleurum falcatum [CYP716Y1] (named ‘BfC16’ - SEQ ID NO: 17 encoded by SEQ ID NO: 19)

- A C23 oxidase from Medicago truncatula [CYP72A68] (named ‘MtC23’ - SEQ ID NO: 38 encoded by SEQ ID NO: 40)

- A C28 oxidase from Medicago truncatula [CYP716A12] (named ‘MtC28’ - SEQ ID NO: 46 encoded by SEQ ID NO: 48)

Results

Hederagenin and gypsogenin (QA precursors) were detectable. In addition, the peak obtained at about 10 min demonstrated the presence of QA in trace amounts (< 1 mg/L) (data not shown here, but data disclosed in Fig. 3 of WO 20/26354). These results confirm the functional relevance and activity of the CPR and CYP oxidases expressed in yeast and their ability to produce QA, when co-expressed in yeast.

CYP oxidases of alternative plant origins have been additionally tested. MLY-01 has been further engineered to co-express homologous CYP oxidases from Q. saponaria, together with the above AtATR1, via sequential integration into the yeast genome, as follows:

- A C16 oxidase [CYP716A297] (named ‘QsC16’ - SEQ ID NO: 20 encoded by SEQ ID NO: 22)

- A C23 oxidase [CYP714E52] (named ‘QsC23’ - SEQ ID NO: 29 encoded by SEQ ID NO: 31),

- A C28 oxidase [CYP716A24] (named ‘QsC28’ - SEQ ID NO: 41 encoded by SEQ ID NO: 43)

In some experiments, the cytochrome b5 protein from Q. saponaria (named ‘Qsb5’ - SEQ ID NO: 55 encoded by SEQ ID NO: 57) and/or the membrane steroid-binding protein from S. vaccaria (named ‘SvMSBP1’ - SEQ ID NO: 67 encoded by SEQ ID NO: 69) have been further co-expressed (see also below Sections 1.2.4 and 1.2.5).

The production of QA and QA precursors has been analysed (using respective standards commercially available, e.g. from Merck, as a reference) by LC-MS in the yeast strains engineered with the following combinations of enzymes:

- AtATR1-QsC28 (YL-1)

- AtATR1-QsC28-QsC23-Qsb5 (YL-3)

- AtATR1-QsC28-QsC23-Qsb5-QsC28C16 (YL-4)

- AtATR1-QsC28-QsC23-Qsb5-QsC28C16-SvMSBP1 (YL-6)

- AtATR1 (2 copies)-QsC28 (2 copies)-QsC23-Qsb5-QsC28C16 (YL-8)

- AtATR1 (2 copies)-QsC28 (2 copies)-QsC23 (2 copies)-Qsb5 (2 copies)-QsC28C16 (2 copies)-SvMSBP1 (2 copies) (YL-10)

The data are provided in Table 2 below and the results are presented in the form of a graph in Fig. 4.

Table 2 - Calculated titers of QA and QA precursors (in mg/L) in engineered YL strains

Results

- As shown in Fig. 4 and in Table 2, while AtATR1 (the CPR) alone was sufficient to facilitate C28 oxidation to carboxylic acid, leading to the production of 263.4 mg/L oleanolic acid (YL-1), C23 oxidation required Q. saponaria cytochrome b5 (Qsb5) for oxidation of the hydroxyl to the aldehyde functional group of gypsogenin (YL-3).

- The additional co-expression of QsC16 (together with AtATR1, QsC23 and QsC28) did not result in QA production (data not shown), indicating that no oxidation occurred at the C16 position and suggesting that QsC16 was non-functional.

- Subcellular localization studies revealed that, unlike other CYP oxidases, the C-terminally mCherry-tagged QsC16 oxidase is cytosolic, despite the presence of a predicted transmembrane domain at the N-terminus of the C16 oxidase. The confocal microscopy images obtained show that QsC28-GFP is localized in the endoplasmic reticulum (ER) membrane (Fig. 5, left image), while QsC16-mCherry is localized in the cytosol (Fig. 5, middle image).

- In order to test the hypothesis that the lack of activity of QsC16 was due to inappropriate localization in yeast, the 22-amino acid predicted transmembrane domain of QsC28 was fused to the N-terminus of QsC16 (named ‘QsC28C16’ - SEQ ID NO: 23 encoded by SEQ ID NO: 25), anchoring it to the ER membrane (Fig. 5, right image) where the CPR, the other CYP oxidases, as well as the terpene substrate, β-amyrin, are co-localized (data not shown).

- When co-expressing QsC28C16 (instead of QsC16) in YL-4, QA was detected and produced at 1.1 mg/L (see Table 2 and Fig. 4).

- The further co-expression of SvMSBP1, in YL-6, resulted in an increased global oxidation efficiency leading to an improved QA production (see Table 2 and Fig. 4). While the total titer of QA precursors remained consistent, the production of the final oxidation product (QA) was increased by 4-fold (4 mg/L) upon the co-expression of SvMSBP1, which colocalized with both QsC28 and QsC23 oxidases in the ER membrane (data not shown).

- The simultaneous overexpression of 2 copies of QsC28 and 2 copies of AtATR1, in YL-8, led to an 8-fold increase in QA (18.9 mg/L) (see Table 2 and Fig. 4).

- An additional second copy of all enzymes, in YL-10, led to a further optimized production of QA (65.2 mg/L) (see Table 2 and Fig. 4).
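As a quick arithmetic check on the stepwise optimization, the QA titers quoted in the bullets above can be compared strain by strain. The sketch below uses only the rounded values stated in the text (Table 2 itself is not reproduced here), so the computed ratios are approximate and need not coincide with the fold figures quoted above, whose baselines are not restated.

```python
# QA titers (mg/L) as quoted in the text for successive engineered strains.
qa_titers_mg_per_l = [
    ("YL-4", 1.1),
    ("YL-6", 4.0),
    ("YL-8", 18.9),
    ("YL-10", 65.2),
]


def stepwise_fold_changes(titers):
    """Return (previous strain, strain, ratio of titers) for each consecutive pair."""
    changes = []
    for (prev_name, prev_titer), (name, titer) in zip(titers, titers[1:]):
        changes.append((prev_name, name, titer / prev_titer))
    return changes


for prev_name, name, ratio in stepwise_fold_changes(qa_titers_mg_per_l):
    print(f"{prev_name} -> {name}: {ratio:.1f}-fold")

# Overall gain from the first QA-producing strain to the final strain.
overall = qa_titers_mg_per_l[-1][1] / qa_titers_mg_per_l[0][1]
print(f"YL-4 -> YL-10: {overall:.1f}-fold")
```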

1.2.2 Gene discovery in S. vaccaria - CYP oxidases

Leaves and flowers of S. vaccaria (Sv) have been treated with 0, 50 µM or 100 µM methyl jasmonate (MeJa) for 72h. The expression level of β-amyrin synthase mRNA has been analyzed (in leaves) (see Fig. 6A) and the fold-change of β-amyrin synthase mRNA expression induced by MeJa at 50 µM or 100 µM has been compared to 0 µM at 24h and 72h in flowers (see Fig. 6B). A neighbor-joining tree (1,000 bootstrap replicates) of cytochrome P450 (CYP) oxidases acting on triterpenoids from other plants and CYP candidates identified from S. vaccaria transcriptome (see also Section 1.2.4 below) is shown in Fig. 6C. Gene names of CYPs from S. vaccaria newly identified are labelled with an asterisk (*). Gene names of CYPs from S. vaccaria newly identified that are co-expressed with β-amyrin synthase (BAS) are highlighted in boxes.

The functional relevance and activity of ‘SvC16’, ‘SvC23-1’, ‘SvC23-2’ and ‘SvC28’ (as named in Fig. 6C) has been tested in N. benthamiana, in combination with β-amyrin synthases (BAS) of different plant origins. The following enzymes have been transiently expressed in Nicotiana benthamiana, in different combinations (as indicated in Fig. 7):

- A β-amyrin synthase (BAS) from S. vaccaria (named ‘SvBAS’ - SEQ ID NO: 13 encoded by SEQ ID NO: 14)

- A β-amyrin synthase (BAS) from Q. saponaria (named ‘QsBAS’ - SEQ ID NO: 15 encoded by SEQ ID NO: 16)

- A C16 oxidase from S. vaccaria (named ‘SvC16’ - SEQ ID NO: 26 encoded by SEQ ID NO: 27)

- A C28 oxidase from S. vaccaria (named ‘SvC28’ - SEQ ID NO: 44 encoded by SEQ ID NO: 45)

- A C23 oxidase from S. vaccaria (named ‘SvC23-1’ - SEQ ID NO: 32 encoded by SEQ ID NO: 33)

- A C23 oxidase from S. vaccaria (named ‘SvC23-2’ - SEQ ID NO: 35 encoded by SEQ ID NO: 36)

The production of QA precursors has been analyzed (using respective standards commercially available, e.g. from MCE, Chemcruz and TCI, as a reference) by LC-MS.

Results

Results are shown in Fig. 7.

- Echinocystic acid and oleanolic acid were detected when co-expressing SvBAS, SvC28 and SvC16 (Fig. 7A).

- Gypsogenin was detected when co-expressing QsBAS, QsC28 and each of SvC23-1 or SvC23-2 (Fig. 7B).

- Gypsogenic acid was detected when co-expressing QsBAS, QsC28 and each of SvC23-1 or SvC23-2 (Fig. 7C).

These results confirm the functional relevance and activity of QsBAS, and the newly identified SvC16, SvC23-1, SvC23-2 and SvC28 oxidases, as well as their ability to produce QA precursors, when co-expressed in N. benthamiana.

1.2.3 QA production in yeast using S. vaccaria genes SvC16, SvC23-1 and SvC23-2

MLY-01 has been transformed with the following plasmids: pESC-TRP-SepGAL2-SvC16, pGAL10-AtAtr1, pGAL1-QsC28, pGAL7-SvC23-1 or pESC-TRP-SepGAL2-SvC16, pGAL10-AtAtr1, pGAL1-QsC28, pGAL7-SvC23-2. The production of QA and QA precursors has been analyzed (using respective standards commercially available, as a reference) by HPLC/LC-MS.

Results

- Both chromatograms in Fig. 8 and Fig. 9 show a peak matching the exact m/z value and the retention time of the commercial QA standard (dashed line).

- Confocal microscopy images revealed that SvC16 is well-expressed and localizes properly in the endoplasmic reticulum (ER) of the yeast (data not shown), in contrast to QsC16 (see above Section 1.2.1).

These results confirm the functional relevance of SvC16, SvC23-1 and SvC23-2 oxidases, as well as their ability to produce QA, when co-expressed with AtATR1 and QsC28 in yeast.

1.2.4 Gene discovery in S. vaccaria - MSBP proteins

Genes encoding homologs of A. thaliana (At) MSBPs have been identified in the S. vaccaria (Sv) transcriptome by sequence similarity search using the tblastn algorithm. Amino acid sequences of MSBPs from At (named ‘AtMSBP1’ - SEQ ID NO: 63 and ‘AtMSBP2’ - SEQ ID NO: 65) were queried against a database of the Sv transcriptome (prepared in-house) for a comparison with translated DNA sequences of all genes in the transcriptome. Similar sequences were selected based on sequence identity (last column of Table 3) and the significance of sequence match (third column of Table 3). The results are summarized in Table 3 below.

Table 3 - Arabidopsis thaliana (At) MSBP homologs in S. vaccaria (Sv)

*Transcript names

**The longest 2 sequences (also showing the highest expression level in leaves and flowers, as shown in Fig. 10) were selected for functional test in yeast (as described in the below Section 1.2.5).

The average expression levels of the different homologs identified in Table 3 have been analysed in leaves and flowers of S. vaccaria (see Fig. 10).

1.2.5 QA production in yeast using MSBP homologs from S. vaccaria

The transcripts PB.393.1 and PB.16084.2 have been tested for their ability to increase the oxidation efficiency and improve QA production in yeast. The corresponding proteins have been named ‘SvMSBP1’ (SEQ ID NO: 67 encoded by SEQ ID NO: 69) and ‘SvMSBP2’ (SEQ ID NO: 70 encoded by SEQ ID NO: 72) and have been integrated into the genome of yeasts engineered to produce QA, as follows:
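The selection of candidate homologs by sequence identity and match significance can be sketched as a filter over tabular BLAST output. This is a minimal illustration, assuming the 12-column tabular layout produced by BLAST+ with `-outfmt 6`; the hit rows, transcript names (`tx_A`, `tx_B`, `tx_C`) and cut-off values are hypothetical and are not the values used for Table 3.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tsv_text, min_identity=50.0, max_evalue=1e-10):
    """Keep hits passing both cut-offs, sorted by identity (descending), then e-value."""
    reader = csv.DictReader(io.StringIO(tsv_text), fieldnames=FIELDS, delimiter="\t")
    hits = []
    for row in reader:
        pident = float(row["pident"])
        evalue = float(row["evalue"])
        if pident >= min_identity and evalue <= max_evalue:
            hits.append((row["sseqid"], pident, evalue))
    return sorted(hits, key=lambda hit: (-hit[1], hit[2]))


# Hypothetical hits for a single protein query (illustrative values only).
example_rows = [
    ("queryMSBP", "tx_A", "78.5", "200", "43", "0", "1", "200", "10", "609", "1e-90", "300"),
    ("queryMSBP", "tx_B", "35.0", "80", "52", "2", "5", "84", "1", "240", "0.5", "30"),
    ("queryMSBP", "tx_C", "62.1", "190", "72", "1", "3", "192", "40", "609", "2e-60", "210"),
]
example_tsv = "\n".join("\t".join(row) for row in example_rows)

for subject, identity, evalue in filter_hits(example_tsv):
    print(subject, identity, evalue)
```

In this sketch the weak, short hit (`tx_B`) is discarded, while the two stronger hits are retained and ranked by identity.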

- AtATR1 / QsC28 / QsC23 / Qsb5 / QsC28C16 / SvMSBP1 (YL-6)

- AtATR1 / QsC28 / QsC23 / Qsb5 / QsC28C16 / SvMSBP2

- AtATR1 / QsC28 / QsC23 / Qsb5 / QsC28C16 / QsMSBP1

- AtATR1 / QsC28 / QsC23 / Qsb5 / QsC28C16 (YL-4) has been used as a control

An MSBP homolog from Q. saponaria (named ‘QsMSBP1’ - SEQ ID NO: 73 encoded by SEQ ID NO: 75) has been tested as well. The production of QA and QA precursors has been analyzed (using respective commercial standards as a reference) by LC-MS. Results are presented in the form of a graph in Fig. 11.

Results

As compared with YL-4 (which does not overexpress any MSBP protein), in yeasts overexpressing MSBP proteins (whether from S. vaccaria or from Q. saponaria), a significant increase in QA production was observed, with SvMSBP1 and SvMSBP2 performing better (see Fig. 11). SvMSBP1 was selected for further yeast engineering to produce C3-glycosylated QA derivatives (see Example 2 below), C28-glycosylated QA derivatives (see Example 3 below) and QS-21-Xyl and QS-21-Api (see Example 4 below).

Conclusion

Using different heterologous enzymes (β-amyrin synthase, CYP oxidases, CYP reductase) and heterologous proteins (cytochrome b5 and MSBP proteins) from different plant origins (e.g. G. vaccaria, A. thaliana, Q. saponaria and S. vaccaria), in different combinations, the inventors have been able to reconstruct in yeast the metabolic pathway leading to the biosynthesis of QA, achieving, for the first time, the successful production of QA in yeast at about 65 mg/L.

2.1 Production of UDP-sugars non-native to yeast (Glucuronic acid, Xylose and Rhamnose)

■ Glucuronic acid (GlcA)

As shown in Fig. 12, UDP-GlcA is produced by a UDP-glucose dehydrogenase (UGD) from UDP-Glucose. A gene encoding a UDP-glucose dehydrogenase from A. thaliana (named ‘AtUGD’ - SEQ ID NO: 84 encoded by SEQ ID NO: 86) has been integrated into the genome of the parent yeast strain CEN.PK2-1c to generate SC-1. The production of UDP-GlcA has been analyzed by LC-MS.

Results

- Fig. 13A shows that UDP-GlcA was detected in SC-1, confirming the functional activity of AtUGD when overexpressed in yeast.

■ Xylose (Xyl)

As shown in Fig. 12, UDP-GlcA can be decarboxylated by a UDP-xylose synthase (UXS) to form UDP-Xyl. A UDP-xylose synthase from A. thaliana (named ‘AtUXS’ - SEQ ID NO: 105 encoded by SEQ ID NO: 107) has been integrated into the genome of SC-1 (overexpressing AtUGD) to generate SC-4. As shown in Fig. 12, UDP-GlcA can also be decarboxylated by a UDP-apiose synthase (AXS) to form UDP-Xyl. A UDP-apiose synthase from Q. saponaria (named ‘QsAXS’ - SEQ ID NO: 113 encoded by SEQ ID NO: 115) has been integrated, together with AtUGD, into the genome of the parent yeast strain CEN.PK2-1c to generate SC-16. The production of UDP-Xyl has been analyzed by LC-MS.

Results

- The production of UDP-Xyl was detected in both SC-4 and SC-16 (see Fig. 13A and Fig. 13B), confirming the functional activity of AtUXS and QsAXS when overexpressed in yeast.

■ Rhamnose (Rha)

The expression of the trifunctional AtRHM2 synthase enzyme from A. thaliana (named ‘AtRHM2’ - SEQ ID NO: 102 encoded by SEQ ID NO: 104) has been investigated as a potential rhamnose synthase. AtRHM2 catalyzes the conversion from UDP-Glc directly to UDP-Rha via (i) the dehydration of UDP-Glc followed by (ii) the epimerization of the C3' and C5' positions to form UDP-4-keto-β-L-rhamnose and (iii) the reduction of UDP-4-keto-β-L-rhamnose to produce UDP-β-L-rhamnose. AtRHM2 has been integrated into the genome of the parent yeast strain CEN.PK2-1c to generate SC-17 and into the genome of SC-4 to generate SC-18. The production of UDP-Rha has been analyzed by LC-MS.

Results

- The production of UDP-Rha was detected in both SC-17 and SC-18 (see Fig. 14A and Fig. 14B), confirming the functional activity of AtRHM2 when overexpressed in yeast.

2.2 Production of QA-C3-GlcA

The same AtUGD as above has been integrated into the genome of YL-10 (producing QA), together with a glucuronic acid transferase (GlcAT) from Q. saponaria (named ‘QsCslG1’ - SEQ ID NO: 78 encoded by SEQ ID NO: 80) or a second glucuronic acid transferase from Q. saponaria (named ‘QsCslG2’ - SEQ ID NO: 81 encoded by SEQ ID NO: 83) to generate YL-11 and YL-12, respectively. Production of QA precursors as well as QA-C3-GlcA has been analyzed by LC-MS, using respective standards as a reference (QA-C3-GlcA standard corresponds to QAGlcpA, generated as described in WO 20/260475).

Results

- QA-C3-GlcA was detected in both YL-11 (overexpressing QsCslG1) and YL-12 (overexpressing QsCslG2) (see Fig. 15 and Fig. 16, respectively), confirming the functional activity of the two enzymes when overexpressed in yeast.

- QsCslG1 is specific towards QA and does not glycosylate other precursors, while QsCslG2 is promiscuous and 3 times more reactive than QsCslG1 in producing GlcA-QA (10.2 mg/L and 3.9 mg/L, respectively) (see Fig. 17).

The inventors also identified in the transcriptome of S. vaccaria a novel gene encoding a CslG homolog enzyme (named ‘SvCslG’ - SEQ ID NO: 76 encoded by SEQ ID NO: 77). The function of SvCslG as a GlcA transferase candidate has been confirmed using an in vitro enzymatic assay. QA (commercially available, e.g. from MedChem Express) and UDP-GlcA have been directly added into the reaction buffer together with a microsome preparation of a yeast strain overexpressing SvCslG via plasmid expression. The production of QA and QA-C3-GlcA was analyzed by LC-MS.

Results

- Fig. 18 shows that, in the presence of UDP-GlcA, a peak corresponding to QA-C3-GlcA was observed, indicating the ability of SvCslG to transfer GlcA from UDP-GlcA to the C3 position of QA, confirming its functional relevance and activity when expressed in yeast.

2.3 Production of QA-C3-GlcA-Gal

UDP-galactose is natively produced in yeast and therefore, no addition of a sugar synthase is necessary for this glycosylation step. A galactose transferase from Q. saponaria (named ‘QsGalT’ - SEQ ID NO: 116 encoded by SEQ ID NO: 118) has been integrated into the genome of YL-12 to generate YL-13. The production of QA-C3-GlcA and QA-C3-GlcA-Gal has been analyzed (using respective standards) by LC-MS. QA-C3-GlcA standard and QA-C3-GlcA-Gal standard correspond to ‘QAGlcpA’ and ‘QA-GlcpA-Galp’, respectively, generated as described in WO 20/260475.

Results

- Fig. 19 shows that the overexpression of QsGalT facilitates the glucuronidation step, possibly by complete conversion of QA-C3-GlcA to QA-C3-GlcA-Gal, and thus, pushing the reaction equilibrium towards further glycosylation. The production of QA-C3-GlcA-Gal achieved in YL-13 was 24.3 mg/L.

The inventors also identified in the transcriptome of S. vaccaria a novel gene encoding a galactose transferase candidate (named ‘SvGalT’ - SEQ ID NO: 98 encoded by SEQ ID NO: 99). The function of SvGalT as a galactose transferase has been confirmed by transiently expressing SvCslG and SvGalT in N. benthamiana plants. Plants have been infiltrated with 40 µM of QA (commercially available, e.g. from MedChem Express) 2 days after Agrobacterium tumefaciens infiltration. The production of QA-C3-GlcA and QA-C3-GlcA-Gal has been analyzed (using respective standards) by LC-MS. QA-C3-GlcA standard and QA-C3-GlcA-Gal standard correspond to ‘QAGlcpA’ and ‘QA-GlcpA-Galp’, respectively, generated as described in WO 20/260475.

Results

- Fig. 20 shows that a peak corresponding to QA-GlcA-Gal was observed when co-expressing SvCslG and SvGalT, indicating the ability of SvGalT to transfer galactose from UDP-Gal to QA-GlcA, confirming its functional relevance and activity.

2.4 Production of QA-C3-GlcA-Gal-Rha

The above AtRHM2 and a rhamnose transferase from Q. saponaria (named ‘QsRhaT’ - SEQ ID NO: 119 encoded by SEQ ID NO: 121) have been integrated into the genome of YL-13 to generate YL-14. The production of QA-C3-GlcA, QA-C3-GlcA-Gal and QA-C3-GlcA-Gal-Rha has been analyzed (using respective standards) by LC-MS. QA-C3-GlcA standard, QA-C3-GlcA-Gal standard and QA-C3-GlcA-Gal-Rha standard correspond to ‘QAGlcpA’, ‘QA-GlcpA-Galp’, and ‘QA-GlcpA-Galp-Rhap’, respectively, generated as described in WO 20/260475.

Results

- Fig. 21 shows that the co-expression of AtRHM2 and QsRhaT, together with AtUGD and QsGalT, resulted in the production of QA-C3-GlcA-Gal-Rha. The level achieved was 9.5 mg/L. No residual QA-GlcA-Gal was observed, indicating that QsRhaT is highly efficient and catalyzes the complete conversion of QA-GlcA-Gal to QA-C3-GlcA-Gal-Rha.

2.5 Production of QA-C3-GlcA-Gal-Xyl

The above AtUXS has been integrated into the genome of YL-12 (a yeast strain engineered to produce UDP-GlcA). Direct expression of AtUXS in the UDP-GlcA-producing strain led to the absence of any glycosylated molecule (data not shown), possibly due to insufficient UDP-GlcA production. This suggested that the downstream metabolite UDP-Xylose may act as an allosteric feedback inhibitor controlling the activity of UGD. This is confirmed in Fig. 13A, which shows that there was no detectable UDP-GlcA when AtUXS was co-expressed with AtUGD.

■ AtUGD mutation

It has been reported that a point mutation A104L engineered in the human UGD homolog has led to a lower UDP-Xyl binding affinity. Therefore, as an attempt to alleviate the observed UGD inhibition induced by UDP-Xyl, mutation(s) were introduced into AtUGD in order to lower UDP-Xyl binding affinity. The protein sequence of AtUGD was aligned against that of the human UGD to identify the corresponding amino acid (data not shown), and a point mutation A101 L was introduced into AtUGD (AtUGD _A10iL - SEQ ID NO: 108 encoded by SEQ ID NO: 109). AtUGD _A10iL, has been integrated into the genome of YL-10 (yeast engineered to produce QA), together with the above QsCslG2, and QsGalT, as well as with a UDP-xylose transferase from Q. saponaria (QsC3XylT - SEQ ID NO: 122 encoded by SEQ ID NO: 124), to generate YL-15. The production of QA-C3-GlcA, QA-C3-GlcA-Gal and QA-C3-GlcA-Gal-Xyl has been analyzed (using respective standards) by LC-MS.
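The step of identifying the corresponding amino acid by alignment can be sketched as a position mapping across a gapped pairwise alignment. The aligned strings below are hypothetical toy sequences, not the actual human UGD or AtUGD sequences, and the helper name `map_position` is introduced only for illustration; the toy offset of 3 merely mirrors the A104 to A101 shift described above.

```python
def map_position(aligned_ref, aligned_target, ref_pos):
    """Map a 1-based residue position in the reference to the aligned target.

    Returns (target position, target residue), or None if the reference
    residue is aligned to a gap in the target.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    ref_index = 0
    target_index = 0
    for ref_char, target_char in zip(aligned_ref, aligned_target):
        if ref_char != "-":
            ref_index += 1
        if target_char != "-":
            target_index += 1
        if ref_char != "-" and ref_index == ref_pos:
            if target_char == "-":
                return None
            return target_index, target_char
    raise IndexError("reference position lies beyond the end of the alignment")


# Hypothetical gapped alignment: the target lacks three residues near its start.
aligned_reference = "MSEKAGLA"
aligned_target = "M---AGLA"

# Reference residue 5 (A) corresponds to target residue 2 (A).
print(map_position(aligned_reference, aligned_target, 5))
```

Working on the aligned strings rather than raw coordinates keeps the numbering correct when the two homologs differ in length upstream of the residue of interest.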

Results

- Fig. 22 shows that QA-C3-GlcA-Gal-Xyl was detected in YL-15, at a level of 1 mg/L.

In order to investigate the varying degrees of UDP-Xyl inhibition on different UGDs, six homologs were selected across kingdoms, namely those from Synechococcus sp. (Syn) (named ‘SynUGD’ - SEQ ID NO: 154 encoded by SEQ ID NO: 156), Homo sapiens (Hs) (named ‘HsUGD_A104L’ - SEQ ID NO: 157 encoded by SEQ ID NO: 159), Paramoeba atlantica (Patl) (named ‘PatlUGD’ - SEQ ID NO: 110 encoded by SEQ ID NO: 112), Bacillus cytotoxicus (Bcyt) (named ‘BcytUGD’ - SEQ ID NO: 160 encoded by SEQ ID NO: 162), Corallococcus macrosporus (Myxfulv) (named ‘MyxfulvUGD’ - SEQ ID NO: 163 encoded by SEQ ID NO: 165), and Pyrococcus furiosus (Pfu) (named ‘PfuUGD’ - SEQ ID NO: 166 encoded by SEQ ID NO: 168). The sequences of these homologs have been integrated into the genome of YL-10 (a yeast strain engineered to produce QA), together with the above QsCslG2, QsGalT, AtUXS, and QsC3XylT, generating YL-16 to YL-21, respectively. The production of QA-C3-GlcA-Gal-Xyl has been analyzed by LC-MS (using respective standards). The results are presented in Fig. 23 in the form of a graph.

Results

- Fig. 23 shows that, while the production of QA-C3-GlcA-Gal-Xyl with the other UGD enzymes was comparable to that with AtUGD_A101L (YL-15 being used as a control), PatlUGD (YL-18) yielded a 3-fold higher production. Upon sequence alignment of PatlUGD with AtUGD, it was noticed that the A101L mutation of AtUGD is natively present in PatlUGD, which may increase its tolerance of UDP-Xyl (data not shown).

■ Alternative UDP-GlcA biosynthesis pathway

UDP-GlcA can also be generated via the de novo salvage pathway or the myo-inositol oxidation pathway. Glucuronokinase (GlcAK) and UDP-sugar pyrophosphorylase (USP) convert free glucuronic acid to GlcA-1-phosphate and eventually the active UDP form of GlcA (UDP-GlcA). These enzymes are also responsible for the myo-inositol pathway starting with myo-inositol oxygenase (named ‘MIOX’). A GlcAK enzyme from A. thaliana (named ‘AtGlcAK’ - SEQ ID NO: 169 encoded by SEQ ID NO: 171) and a USP from A. thaliana (named ‘AtUSP’ - SEQ ID NO: 223 encoded by SEQ ID NO: 225) have been integrated into the genome of YL-15 to generate YL-22. The same AtGlcAK and AtUSP have been separately integrated, together with a MIOX from Thermothelomyces thermophilus (named ‘TtMIOX’ - SEQ ID NO: 172 encoded by SEQ ID NO: 174), into the genome of YL-15 to generate YL-23. The culture medium of YL-23 was either left untreated or exogenously supplemented with 0.5% glucuronic acid and 2% myo-inositol (MI). The production of QA-C3-GGX has been analyzed by LC-MS (using respective standards).

Results

- QA-C3-GGX production was improved by 3-fold in YL-22, as the residual QA decreased significantly (see Fig. 24).

- QA-C3-GGX production in YL-23 (further overexpressing TtMIOX) was increased by 1.7-fold and 2.3-fold in the presence of exogenously supplemented 2% MI and 0.5% GlcA, respectively. Production was further improved by 5.9-fold when both MI and GlcA were supplemented (see Fig. 25).

■ Inducible TetOn promoter to delay the expression of UXS to accumulate UDP-GlcA

Inducible promoters such as pDDI2 (induced by methyl methane sulfonate), pCup1 (induced by copper ions), as well as pTetOn (induced by tetracycline or doxycycline) have been investigated and used as a way to delay the expression of AtUXS. AtUXS has been overexpressed under a pTetOn promoter in a yeast engineered to produce QA-C3-GGX. Production of QA, QA-C3-GG and QA-C3-GGX has been analyzed by LC-MS.

Results

- pTetOn was compatible with the galactose promoters used in the parent yeast strain and the protein expression of AtUXS was linearly dependent on the concentration of the inducer (data not shown).

- In the absence of any inducer, a 5.5-fold increase of QA-C3-GGX production was observed, possibly because of the basal level expression of AtUXS due to the leakiness of the promoter. The minimal amount of UDP-Xyl produced may not be sufficient to inhibit AtUGD.

- In order to induce pTetOn, 20 or 100 µg/mL of doxycycline has been exogenously supplemented in the yeast culture medium 24 h after galactose induction. This led to an increase in the production of QA-C3-GGX of 5.9- and 8.5-fold, respectively, as compared to YL-15. When induced with 100 µg/mL of doxycycline 40 h after galactose induction, an 11-fold increase was observed (see Fig. 26).

■ Identification of a C3XylT enzyme in S. vaccaria

The inventors also identified in the transcriptome of S. vaccaria a novel gene encoding a xylosyl transferase candidate (named ‘SvC3XylT’ - SEQ ID NO: 100 encoded by SEQ ID NO: 101). The function of SvC3XylT as a xylose transferase has been tested by transiently co-expressing it with the same SvCslG enzyme and SvGalT enzyme as described earlier in N. benthamiana plants. Plants have been infiltrated with 40 µM of QA (commercially available, e.g. from MedChem Express) 2 days after Agrobacterium tumefaciens infiltration. QA-C3-GlcA-Gal-Xyl production has been analyzed by LC-MS. A standard corresponding to ‘QA-GlcpA-Galp-Xylp’ generated as described in WO 20/260475 has been used as a reference.

Results

Fig. 58 shows that a peak corresponding to QA-C3-GlcA-Gal-Xyl was observed when co-expressing SvCslG, SvGalT and SvC3XylT, demonstrating the ability of SvC3XylT to transfer xylose from UDP-Xyl to QA-GlcA-Gal, thus confirming its functional relevance and activity.

Conclusion

Using different heterologous enzymes (glycosyl synthases, glycosyl transferases) from different plant origins (e.g. A. thaliana, Q. saponaria and S. vaccaria), in different combinations, the inventors have been able to reconstruct in yeast the metabolic pathway leading to the biosynthesis of C3-glycosylated QA derivatives, achieving, for the first time, the successful production of such C3-glycosylated QA derivatives in yeast.

Example 3 - C28-glycosylated QA derivatives biosynthesis

3.1 Production of Fucose non-native to yeast

The transcriptome of S. vaccaria was further explored to identify genes and enzymes involved in saponin biosynthesis, as S. vaccaria contains a number of different saponins that have similarity to saponins in Q. saponaria. S. vaccaria plants were treated with methyl-jasmonate (MeJA), which was shown to induce biosynthesis of saponins in plants. An extensive RNASeq analysis was then performed to identify the full-length transcripts in the plants, and to identify the induced genes. Among them, several genes were known to be involved in biosynthesis of the triterpene backbone (e.g. β-amyrin synthase), as well as several Cytochrome P450 enzymes (CYP) and glycosyltransferase genes (see e.g. WO 20/263524). Some of the genes are homologs of genes known to be involved in saponin biosynthesis in Q. saponaria (see e.g. WO 19/122259, WO 20/260475, WO 22/136563; Decker and Kleczkowski 2017). Based on knowledge from dTDP-D-Fucose biosynthesis in bacteria and UDP-L-Rhamnose biosynthesis in plants, the pathway was predicted to include a dehydratase step and a reductase step (as shown in Fig. 12). No homologs of the enzymes involved in the biosynthesis of dTDP-D-Fucose in bacteria were found. A homolog of a Q. saponaria UDP-4-keto-6-deoxy-glucose reductase gene was discovered in the S. vaccaria transcriptome, which was named ‘svNMD’. A candidate UDP-glucose-4,6-dehydratase that was induced by methyl-jasmonate and belongs to the family of nucleotide sugar epimerases was also discovered. The predicted enzyme, named ‘svUG46DH’, has similarity to a domain of UDP-L-Rhamnose synthase in plants. It was hypothesized that the two enzymes, svUG46DH and svNMD, would catalyze the conversion of UDP-D-glucose to UDP-D-fucose (see Fig. 12). The functional relevance and activity of these newly identified genes have been tested in yeast, assessing their ability to produce UDP-Fucose, in combination with the following enzymes:

- svUG46DH (SEQ ID NO: 87 encoded by SEQ ID NO: 89) and svNMD (SEQ ID NO: 90 encoded by SEQ ID NO: 92) have both been integrated into the genome of the parent yeast strain CEN.PK2-1c to generate SC-19.

- SvUG46DH and SvNMD have both been integrated into the genome of SC-4 (overexpressing AtUGD-AtUXS) to generate SC-20.

- SvUG46DH and SvNMD have both been integrated into the genome of SC-17 (overexpressing AtRHM2) to generate SC-22.

- SvUG46DH and SvNMD have both been integrated into the genome of SC-18 (overexpressing AtUGD-AtUXS-AtRHM2) to generate SC-23.

A homolog reductase from Q. saponaria (WO 22/136563) (named ‘QsFucSyn’ - SEQ ID NO: 175 encoded by SEQ ID NO: 177) has been alternatively tested, in combination with the following enzymes:

- QsFucSyn and SvUG46DH have been integrated into the genome of SC-4 (overexpressing AtUGD-AtUXS) to generate SC-21.

The production of UDP-Fucose has been analyzed by LC-MS.

Results

- UDP-Fucose was produced when svUG46DH and svNMD were overexpressed on their own (SC-19) (see Fig. 14A and Fig. 14B or Fig. 27A and Fig. 27B).

- UDP-Fucose was also produced when svUG46DH and svNMD were overexpressed together with AtUGD-AtUXS (SC-20) (see Fig. 14A and Fig. 14B or Fig. 27A and Fig. 27B).

- UDP-Fucose was also produced when svUG46DH and svNMD were overexpressed together with AtRHM2 (SC-22) (see Fig. 14A and Fig. 14B).

- UDP-Fucose was also produced when svUG46DH and svNMD were overexpressed together with AtUGD-AtUXS-AtRHM2 (SC-23) (see Fig. 14A and Fig. 14B).

- UDP-Fucose was also produced when QsFucSyn and svUG46DH were overexpressed together with AtUGD-AtUXS (SC-21) (see Fig. 14A and Fig. 14B or Fig. 27A).

These results confirm the functional relevance and activity of the newly identified SvUG46DH and SvNMD, and QsFucSyn, when expressed in yeast.

3.2 Production of QA-C3-GlcA-Gal-Rha/Xyl-C28-Fuc

A fucose transferase from Q. saponaria (WO 22/136563) (named ‘QsFucT’ - SEQ ID NO: 93 encoded by SEQ ID NO: 95) has been integrated into the genome of YL-14 to generate YL-25.

Results

- QA-C3-GlcA-Gal-Rha and QA-C3-GlcA-Gal-Rha-C28-Fuc have been detected in YL-25 (see Fig. 28), as confirmed by co-elution with the respective standards (the QA-C3-GlcA-Gal-Rha standard corresponds to ‘QA-TriR’, generated as described in WO 22/136563, and the QA-C3-GlcA-Gal-Rha-C28-Fuc standard corresponds to ‘QA-TriR-F’, also generated as described in WO 22/136563).

The same QsFucT enzyme has also been integrated into the genome of YL-15 to generate YL-26. QA-C3-GlcA-Gal-Xyl-C28-Fuc production has been analyzed by LC-MS.

Results

- Production of QA-C3-GlcA-Gal-Xyl-C28-Fuc has been similarly observed in YL-26 (data not shown).

Identification of a FucT enzyme in S. vaccaria

The inventors also identified in the transcriptome of S. vaccaria a novel gene encoding a FucT candidate (named ‘SvFucT’ - SEQ ID NO: 96 encoded by SEQ ID NO: 97). The function of SvFucT as a fucose transferase has been tested by transiently co-expressing it with the same SvCslG, SvUG46DH and SvNMD as described earlier in N. benthamiana plants. Plants have been infiltrated with 40 µM of QA (commercially available, e.g. from MedChem Express) 2 days after Agrobacterium tumefaciens infiltration. QsFucT (see above) was used as a positive control, and GFP was used as a negative control. The production of QA-C3-GlcA-C28-Fuc has been analyzed by LC-MS.

Results

- Fig. 57 shows that, when overexpressing SvFucT, a peak was observed at the same retention time (see the dashed line) as when overexpressing QsFucT (positive control), demonstrating the ability of SvFucT to transfer fucose from UDP-Fuc to QA-GlcA, thus confirming the functional relevance and activity of the newly identified SvFucT.

3.3 Production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha

The same trifunctional AtRHM2 enzyme as described earlier, together with a rhamnose transferase from Q. saponaria (named ‘QsRhaT’ - SEQ ID NO: 119 encoded by SEQ ID NO: 121), has been integrated into the genome of YL-15 to generate YL-28. QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha production has been analyzed by LC-MS (using a chemically synthesized standard as a reference).

Results

- QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha production was detected in YL-28 at a titer of about 1 mg/L (see Fig. 29). No residual substrate was observed, indicating that QsRhaT is highly efficient and catalyzes the complete conversion of QA-C3-GlcA-Gal-Xyl-C28-Fuc to QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha.

3.4 Production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha

An additional copy of the same trifunctional AtRHM2 enzyme (as described earlier), together with the same QsRhaT (as described earlier) has been integrated into the genome of YL-14 to generate YL-27. QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha production has been analyzed by LC-MS. A standard corresponding to ‘QA-TriR-FR’ as described in WO 22/136563 has been used as a reference.

Results

- QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha was detected in YL-27 at a titer of about 3 mg/L (see Fig. 30). No residual substrate was observed, indicating that QsRhaT is highly efficient and catalyzes the complete conversion of QA-C3-GlcA-Gal-Rha-C28-Fuc to QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha.

3.5 Production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl

An additional copy of the same AtUXS enzyme (as described earlier), together with a xylose transferase from Q. saponaria (named ‘QsC28XylT3’ - SEQ ID NO: 125 encoded by SEQ ID NO: 127), has been integrated into the genome of YL-28 to generate YL-30. QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl production has been analyzed by LC-MS. QA-C3-GGR-C28-FRX previously obtained was used as a reference.

Results

QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl was detected in YL-30 (see Fig. 31).

3.6 Production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl

The same AtUXS enzyme (as described earlier), together with the same QsC28XylT3 as above, was integrated into the genome of YL-27 to generate YL-29. The production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl has been analyzed by LC-MS. A standard corresponding to ‘QA-TriR-FRX’ as described in WO 22/136563 has been used as a reference.

Results

QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl was detected in YL-29 (see Fig. 32).

3.7 Production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl

An additional copy of the same AtUXS enzyme (as described earlier), together with a xylose transferase from Q. saponaria (named ‘QsC28XylT4’ - SEQ ID NO: 128 encoded by SEQ ID NO: 130), has been integrated into the genome of YL-30 to generate YL-33. The production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl has been analyzed by LC-MS. QA-C3-GGR-C28-FRXX previously obtained was used as a reference.

Results

- Conversion of QA-C3-GGX-C28-FRX into QA-C3-GGX-C28-FRXX was observed in YL-33 (Fig. 33).

3.8 Production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl

An additional copy of the same AtUXS enzyme (as described earlier), together with the same QsC28XylT4 as above, has been integrated into the genome of YL-29 to generate YL-31. The production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl has been analyzed by LC-MS. A standard corresponding to ‘QA-TriR-FRXX’ as described in WO 22/136563 has been used as a reference.

Results

- Conversion of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl into QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Xyl was observed in YL-31 (Fig. 34).

3.9 Production of apiose non-native to yeast

UDP-Apiose can be produced using apiose synthase (‘AXS’) enzymes, which produce both UDP-Xyl and UDP-Api (as shown in Fig. 12). The same AtUGD enzyme as described earlier and an apiose synthase from Q. saponaria (named ‘QsAXS’ - SEQ ID NO: 113 encoded by SEQ ID NO: 115) have been integrated into the genome of the parent yeast strain CEN.PK2-1c to generate SC-16. UDP-Apiose is a very unstable compound with a half-life of 100 min at room temperature. While UDP-Apiose was not detectable (data not shown), it is likely that it was produced but degraded during the extraction process, and was thus undetectable via LC-MS.
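To give a feel for what a 100 min half-life implies, the following sketch estimates the fraction of UDP-Apiose that would remain after a given handling time. It assumes simple first-order decay at room temperature, which the source does not state; the function name and the example handling times are hypothetical.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float = 100.0) -> float:
    """Fraction of a compound left after elapsed_min, assuming first-order decay.

    half_life_min defaults to the 100 min room-temperature half-life quoted
    in the text; first-order kinetics is an assumption of this sketch.
    """
    if elapsed_min < 0:
        raise ValueError("elapsed_min must be non-negative")
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


# Hypothetical handling times in minutes (not taken from the source).
for minutes in (0, 100, 200, 400):
    print(minutes, round(fraction_remaining(minutes), 4))
```

Under this assumption, each additional 100 min of handling at room temperature halves the amount that could still be detected.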

3.10 Production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api

An additional copy of the same QsAXS enzyme as above, together with an apiose transferase from Q. saponaria (named ‘QsC28ApiT4’ - SEQ ID NO: 151 encoded by SEQ ID NO: 153), has been integrated into the genome of YL-30 to generate YL-34. The production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api has been analyzed by LC-MS.

Results

- Conversion of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl into QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Api was observed in YL-34 (Fig. 35).

3.11 Production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api

The same QsAXS enzyme as above, together with the same QsC28ApiT4 enzyme as above, has been integrated into the genome of YL-29 to generate YL-32. The production of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api has been analyzed by LC-MS.

Results

- Conversion of QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl to QA-C3-GlcA-Gal-Rha-C28-Fuc-Rha-Xyl-Api was observed in YL-32 (Fig. 36).

Conclusion

Using different heterologous enzymes (glycosyl synthases and glycosyl transferases) from different plant origins (e.g. A. thaliana, Q. saponaria and S. vaccaria), the inventors have been able to reconstruct in yeast the metabolic pathway leading to the synthesis of C28-glycosylated QA derivatives, achieving, for the first time, the successful production of such C28-glycosylated QA derivatives in yeast.
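Because the strains in this Example are built stepwise on one another, it can help to trace a strain back to its origin. The sketch below records only the parent-to-child relationships stated in the text of sections 2.4 to 3.11; the dictionary name `PARENT` and the helper `lineage` are hypothetical.

```python
# Child strain -> parent strain, as stated in the text of this section.
PARENT = {
    "YL-14": "YL-13",
    "YL-15": "YL-10",
    "YL-25": "YL-14",
    "YL-26": "YL-15",
    "YL-27": "YL-14",
    "YL-28": "YL-15",
    "YL-29": "YL-27",
    "YL-30": "YL-28",
    "YL-31": "YL-29",
    "YL-32": "YL-29",
    "YL-33": "YL-30",
    "YL-34": "YL-30",
}


def lineage(strain: str) -> list:
    """Return the chain from a strain back to its earliest recorded ancestor."""
    chain = [strain]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


print(" <- ".join(lineage("YL-33")))
print(" <- ".join(lineage("YL-32")))
```

Strains whose parent is not named in this section (for example YL-10 and YL-13) simply terminate the chain.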

Different approaches have been investigated to assess whether the conversion of QA-C3-GGR/X-C28-FRX into QA-C3-GGR/X-C28-FRXX/A could be improved.

3.12 Subcellular localization of QsC28XylT4 and QsC28ApiT4

The subcellular localization of QsC28XylT4 and QsC28ApiT4 heterologously expressed in yeast was examined. C-terminal green fluorescent protein (GFP) fusions were built to provide QsC28XylT4-GFP and QsC28ApiT4-GFP in order to visualize their subcellular localization in yeast (using QsC28XylT3-GFP as a reference). Each of QsC28XylT4-GFP, QsC28ApiT4-GFP, and QsC28XylT3-GFP has been integrated into the genome of the parent yeast strain CEN.PK2-1c.

Results

- While flow cytometry data (aimed at measuring the absolute protein expression level) showed similar fluorescence intensity, indicating a similar level of protein expression (data not shown), confocal microscopy images revealed that, unlike QsC28XylT3-GFP (Fig. 37A) which shows a cytosolic localization, QsC28XylT4-GFP (Fig. 37B) and QsC28ApiT4-GFP (Figure 37C) formed aggregates, generally known as inclusion bodies in yeast.

3.13 Identification of the localization of QsXylT4 and QsApiT4 aggregation

Three signature localization protein markers have been selected to identify the subcellular localization where QsC28XylT4 and QsC28ApiT4 aggregates are formed, with the aim of functionally expressing the two enzymes in the cytosol. Rnq1, which is a yeast native prion protein, has been shown to co-localize with the ‘insoluble protein deposit’ (‘IPOD’), a reservoir and degradation location for amyloid-like proteins. C-terminal mCherry-tagged Rnq1 was expressed in yeast independently to visualize IPOD, shown to be a perivacuolar compartment (data not shown). The co-expression of Rnq1-mCherry with QsC28XylT4-GFP revealed a different localization pattern, suggesting that QsC28XylT4 is likely not an amyloid-like misfolded protein (data not shown). The second protein marker, heat shock protein-42 (Hsp42), has been selected due to its suggested physiological role in the initiation of stress granules in yeast upon starvation of carbon or nitrogen sources. The Hsp42-mCherry fusion protein was localized in the cytosol and nucleus of yeast (data not shown) and was shown to be co-localized with QsXylT4-GFP (data not shown), suggesting the possible sequestration of QsC28XylT4 into stress granules. The last protein marker selected was Rpn1, a functional component of the proteasome actively involved in the protein degradation machinery. When expressed alone, Rpn1, together with the proteasome machinery, was localized in the nucleus. Upon co-expression with QsC28XylT4-GFP, while the majority of Rpn1-mCherry still remained in the nucleus at 24 h (data not shown), it formed aggregates around QsC28XylT4-GFP aggregates at 48 h and degraded the aggregates towards protein recycling (data not shown). These results suggest that QsC28XylT4 may be sequestered into Hsp42-related stress granules and be prone to degradation.

3.14 N-terminal truncation or solubility tagging of QsC28XylT4

Truncation of the N-terminus of QsC28XylT4, in increments of three amino acids up to 12, as well as addition of solubility tags, such as SUMO, TrXA, and MBP, have been carried out in an attempt to re-direct the protein to the cytosol. ‘QsC28XylT4-3aa’ (QsC28XylT4 lacking the first 3 amino acids - SEQ ID NO: 131 encoded by SEQ ID NO: 133), ‘QsC28XylT4-6aa’ (QsC28XylT4 lacking the first 6 amino acids - SEQ ID NO: 134 encoded by SEQ ID NO: 136), ‘QsC28XylT4-9aa’ (QsC28XylT4 lacking the first 9 amino acids - SEQ ID NO: 137 encoded by SEQ ID NO: 139), ‘QsC28XylT4-12aa’ (QsC28XylT4 lacking the first 12 amino acids - SEQ ID NO: 140 encoded by SEQ ID NO: 142), ‘SUMO-QsC28XylT4’ - SEQ ID NO: 143 encoded by SEQ ID NO: 144, ‘TrXA-QsC28XylT4’ - SEQ ID NO: 145 encoded by SEQ ID NO: 146 and ‘MBP-QsC28XylT4’ - SEQ ID NO: 147 encoded by SEQ ID NO: 148 have each been integrated into the genome of YL-30 to generate YL-35, YL-36, YL-37, YL-38, YL-39 and YL-40, respectively. The level of QsC28XylT4 protein expression in each yeast strain and the ability of each yeast strain to produce QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl (as compared with YL-33, harboring a wild-type, full-length, non-tagged QsXylT4) have been assessed.

Results

- The fluorescence intensity measured by flow cytometry shows the highest level of protein expression for QsC28XylT4-MBP (see Figure 38).

- In terms of production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl, all N-terminal truncations of QsC28XylT4 and all N-terminal-tagged QsC28XylT4 showed a better yield, with the N-terminus MBP tag addition providing a 7-fold increase, as compared with the wild-type and full-length of the enzyme (see Figure 39).
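The truncation series described in this section (removing 3, 6, 9 and 12 N-terminal residues) can be generated programmatically, as in the sketch below. The placeholder sequence is hypothetical and is not QsC28XylT4; whether a start methionine was re-added to each truncated construct is not stated in the source, so this sketch simply removes the residues.

```python
def n_terminal_truncations(protein: str, step: int = 3, max_removed: int = 12) -> dict:
    """Return {residues_removed: truncated_sequence} for an N-terminal series.

    Residues are removed from the N-terminus in increments of `step`, up to
    and including `max_removed`. No start codon handling is attempted.
    """
    if step <= 0 or max_removed < step:
        raise ValueError("step must be positive and no larger than max_removed")
    if max_removed >= len(protein):
        raise ValueError("cannot remove the whole protein")
    return {n: protein[n:] for n in range(step, max_removed + 1, step)}


# Hypothetical 18-residue placeholder, used only to show the slicing.
placeholder = "MSTNPKRLVAGHEDQWYF"
for removed, sequence in n_terminal_truncations(placeholder).items():
    print(f"-{removed}aa", sequence)
```

Each key corresponds to one construct name in the style used above (for example 3 for a ‘-3aa’ variant).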

3.15 N-terminal fusion of QsXylT3 to QsXylT4

As an alternative way to render QsC28XylT4 cytosolic, QsC28XylT3 (shown to be cytosolic when expressed in yeast, as described earlier) was fused to the N-terminus of QsC28XylT4. A 3xGGGS linker was genetically inserted between the amino acid sequences of the two enzymes to provide flexibility and allow independent folding of the two enzymes, without affecting the functional properties of the fusion protein. The fusion QsC28XylT3-3xGGGS-QsC28XylT4 (SEQ ID NO: 149 encoded by SEQ ID NO: 150) has been integrated into the genome of YL-30 to generate YL-41. The localization of the fusion protein and the production of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl have been examined.

Results

- Confocal microscopy images showed improved cytosolic expression, with less aggregation observed for the QsXylT3-3xGGGS-QsXylT4-GFP fusion protein than for QsXylT4-GFP expressed alone (see Fig. 40).

- The improved reactivity of the fusion protein was also confirmed by the complete conversion of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl, which leads to a distinctive peak corresponding to the mass of QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl (see Fig. 41B), compared to the absence of the peak in the yeast strain where QsXylT3 and QsXylT4 were expressed separately (see Fig. 41A).
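The layout of the fusion construct in this section (N-terminal partner, three GGGS repeats, C-terminal partner) can be sketched at the amino acid level as follows. The two partner sequences are short hypothetical placeholders, and whether the C-terminal partner keeps its own start methionine in the actual construct is not stated in the source.

```python
def fuse_with_linker(n_partner: str, c_partner: str,
                     linker_unit: str = "GGGS", repeats: int = 3) -> str:
    """Concatenate two protein sequences with a repeated flexible linker."""
    if repeats < 0:
        raise ValueError("repeats must be non-negative")
    return n_partner + linker_unit * repeats + c_partner


# Hypothetical placeholders standing in for QsC28XylT3 and QsC28XylT4.
xylt3_placeholder = "MAAAK"
xylt4_placeholder = "MCCCR"
fusion = fuse_with_linker(xylt3_placeholder, xylt4_placeholder)
print(fusion)
print(len(fusion))
```

The 3xGGGS linker contributes 12 residues, so the fusion length is the sum of the two partner lengths plus 12.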

3.16 Continuous feeding scheme to decrease QsXylT4 sequestration and degradation

A continuous feeding scheme has been devised in which fresh nitrogen and/or carbon sources are added every 24 h. Protein expression and protein localization have been examined.

Results

- The fluorescence intensity, reflecting QsC28XylT4 absolute expression, has been measured by flow cytometry (Fig. 42). The protein expression decreased over the course of 60h of the experiment, while spiking additional carbon source only (glucose or galactose) had no impact on the expression level.

- In contrast, the addition of fresh media with additional nitrogen source as well as 4% galactose consistently increased QsC28XylT4 expression level up to 5-fold by 60h.

- In the presence of new carbon and nitrogen sources, QsXylT4 did not colocalize with Hsp42. Rpn1, which represents the localization of the proteasome machinery, remained in the nucleus and did not degrade QsXylT4 aggregates. While some aggregation of QsXylT4 persisted in the presence of new media, consistent cytosolic expression was also observed, in stark contrast to yeast strains cultured with only old media, where QsXylT4 expression in the cytosol was depleted after 24 h (data not shown).

-dimeric acyl chain terminated with

As shown in Fig. 2A, acylation requires the biosynthesis and two consecutive additions of C9-CoA: two chalcone-synthase-like type III polyketide synthases (PKSs) stitch together two units of malonyl-CoA and one unit of (S)-2-methylbutyryl-CoA (2MB-CoA) to form the C9-keto-CoA, which is subsequently reduced by two standalone keto-reductases (KRs) to yield the C9-CoA.
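The carbon bookkeeping behind the C9 and C18 labels can be checked with a short calculation. The sketch assumes the usual polyketide chemistry in which each malonyl-CoA extender loses CO2 on condensation and so contributes two carbons, and counts five carbons for the 2-methylbutyryl starter; the function and constant names are hypothetical.

```python
# Carbons in the acyl part of the starter unit (2-methylbutyryl: C5).
STARTER_CARBONS = 5
# Each malonyl-CoA extender adds two carbons after decarboxylation (assumption
# based on standard polyketide synthase chemistry, not stated in the source).
CARBONS_PER_MALONYL = 2


def acyl_chain_carbons(starter_carbons: int, n_malonyl: int) -> int:
    """Carbon count of an acyl unit built from one starter and n extenders."""
    if n_malonyl < 0:
        raise ValueError("n_malonyl must be non-negative")
    return starter_carbons + CARBONS_PER_MALONYL * n_malonyl


c9_unit = acyl_chain_carbons(STARTER_CARBONS, n_malonyl=2)
print(c9_unit)       # carbons in one acyl unit
print(2 * c9_unit)   # carbons after two consecutive additions
```

One starter plus two extenders gives the nine-carbon unit, and two such units account for the 18-carbon chain.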

4.1 (S)-2-methylbutyryl-CoA (2MB-CoA) conversion from the 2MB acid

Conversion of exogenously supplemented 2MB acid to 2MB-CoA by a CoA ligase identified from the Q. saponaria transcriptome has been investigated. The functional expression of this CoA ligase from Q. saponaria (named ‘QsCCL’ - SEQ ID NO: 178 encoded by SEQ ID NO: 180) has been confirmed, using a high-copy plasmid transfected into the parent yeast strain CEN.PK2-1c, via confocal microscopy imaging of the C-terminal GFP fusion of the enzyme, which was observed in the cytoplasm and was stable for at least 24 h after galactose induction (data not shown). Additionally, the conversion of 2MB acid to 2MB-CoA by QsCCL has been demonstrated using a whole-cell feed-in experiment. 2MB acid has been added directly to the yeast cell culture and the yeast cells have been lysed to allow the measurement of the intracellular content of 2MB-CoA by a liquid chromatography method using a porous graphitic carbon column. Production of 2MB-CoA from 50 mg/L 2MB acid in YL-QsCCL has been confirmed by co-elution with a chemically synthesized 2MB-CoA standard (see Fig. 43).

4.2 Biosynthesis and reduction of keto-C9-CoA to make C9-CoA

As shown in Fig. 2A and Fig. 2E, the 18-carbon acyl chain consists of two repeating C9 units. They are synthesized from two units of malonyl-CoA and one 2MB-CoA using two chalcone-synthase-like type III PKSs (named ‘QsChSD’ - SEQ ID NO: 181 encoded by SEQ ID NO: 183 and ‘QsChSE’ - SEQ ID NO: 184 encoded by SEQ ID NO: 186); the product C9-keto-CoA is then reduced by two keto-reductases (named ‘QsKR11’ - SEQ ID NO: 187 encoded by SEQ ID NO: 189 and ‘QsKR23’ - SEQ ID NO: 190 encoded by SEQ ID NO: 192) to form C9-CoA. The functional expression of these four enzymes has been confirmed using high-copy plasmids in yeast via confocal microscopy imaging of the C-terminal GFP fusion of the corresponding enzymes. The expression of both QsChSD and QsChSE is shown to be cytosolic, with a low degree of aggregation in the case of QsChSE. While the expression of QsKR11 is shown to be in the cytoplasm, the expression of QsKR23 is shown to be localized to the endoplasmic reticulum (ER) membrane (data not shown).

4.3 Addition of C9-CoAs to C3- and C28-glycosylated QA derivatives

Attempts to directly detect the production of C9-CoA as such using LC-MS were unsuccessful, possibly due to its short-lived stability. Therefore, the synthesis of C9-CoA was demonstrated by its addition to glycosylated QA derivatives. It has been demonstrated that the acyl unit (C9-CoA) can be added to both QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl (QA-C3-GGX-C28-FRX) and QA-C3-GlcA-Gal-Xyl-C28-Fuc-Rha-Xyl-Xyl (QA-C3-GGX-C28-FRXX) (Fig. 44). Because of the higher LC-MS response of QA-C3-GGX-C28-FRX, the yeast strain producing this glycosylated QA derivative (YL-30) was used to harbor the genomic integration of a large cassette containing both QsChSD and QsChSE, both QsKR11 and QsKR23, QsCCL, and an acyl transferase (named ‘QsDMOT9’ - SEQ ID NO: 193 encoded by nucleotide sequence SEQ ID NO: 195) to generate YL-42. Production of QA-C3-GGX-C28-FRX-C9 has been analyzed by LC-MS.

Results

- YL-42 is shown to produce QA-C3-GGX-C28-FRX-C9 in the presence of 50 mg/L 2MB acid added exogenously, as confirmed by co-eluting with a standard (standard has been generated in N. benthamiana, as described in GB 2204252.7) (see Fig. 44).

- The conversion of QA-C3-GGX-C28-FRX to QA-C3-GGX-C28-FRX-C9 was improved in the presence of a higher concentration of 2MB acid supplemented to the growth media (Fig. 45).

The 18-carbon acyl chain consists of two repeating units of C9-CoA, and the second addition requires its corresponding acyltransferase (named ‘QsDMOT4’ - SEQ ID NO: 196 encoded by nucleotide sequence SEQ ID NO: 198), which has been integrated into the genome of YL-42 to generate YL-43.

Results

- With 500 mg/L 2MB acid supplemented to the culture media, the production of QA-C3-GGX-C28-FRX-C18 has been confirmed by the appearance of a new LC-MS peak with the same high-resolution mass, and its conversion from QA-C3-GGX-C28-FRX-C9 was shown to be highly efficient with little residual substrate (Fig. 46), suggesting that the 2MB acid supplement and the endogenous malonyl-CoA pool provide sufficient C9-CoA for two acyl additions.

In order to generate QA-C3-GGX-C28-FRXX-C18, QsDMOT4 and QsC28XylT4 have been integrated into the genome of YL-42 to generate YL-44. QA-C3-GGX-C28-FRXX-C18 production has been analyzed by LC-MS.

Results

- A new LC-MS peak corresponding to the mass of QA-C3-GGX-C28-FRXX-C18 was detected (Fig. 47).

- The absence of QA-C3-GGX-C28-FRXX and QA-C3-GGX-C28-FRXX-C9 suggests that they are better substrates for the QsDMOT9 and QsDMOT4 acyltransferases than QA-C3-GGX-C28-FRX and QA-C3-GGX-C28-FRX-C9.

4.4 Production of UDP-Arabinofuranose (Araf) non-native in yeast

The biosynthesis of UDP-Araf is not native to yeast and thus the necessary nucleotide sugar synthases, as well as an arabinosyl transferase, are required for the heterologous production and addition of this sugar. As shown in Fig. 12, UDP-Xyl can first be converted to UDP-Arabinopyranose via a UDP-Xyl epimerase (UXE), which then undergoes ring-chain tautomerization assisted by UDP-Arabinose mutases (UAMs). UAM from A. thaliana (‘AtUAM1’ - according to SEQ ID NO: 208 encoded by the nucleotide sequence SEQ ID NO: 210), UAM from H. vulgare (‘HvUAM’ - according to SEQ ID NO: 211 encoded by the nucleotide sequence SEQ ID NO: 213), UXE from A. thaliana (‘AtUXE’ - according to SEQ ID NO: 199 encoded by the nucleotide sequence SEQ ID NO: 201), ‘AtUXE2’ (according to SEQ ID NO: 202 encoded by the nucleotide sequence SEQ ID NO: 204) and/or ‘AtUGE3’ (according to SEQ ID NO: 205 encoded by the nucleotide sequence SEQ ID NO: 207) have been integrated into SC-4, according to the following combinations:

- AtUAM1-AtUXE2 (SC-9)

- HvUAM-AtUXE (SC-10)

- AtUAM1-AtUGE3 (SC-11)

Sugar production has been analyzed by LC-MS.

Results

- UDP-Xyl was produced by all combinations of enzymes, with AtUAM1-AtUGE3 (SC-11) producing less UDP-Xyl.

- While UDP-Arap production was similar, UDP-Araf was not detected, likely because it co-elutes with UDP-Xyl and because the reactions catalyzed by both UXE and UAM enzymes are equilibrium-limited, so that UDP-Araf is likely about 100-fold less abundant than UDP-Xyl (see Fig. 48B).

As an alternative, the salvage pathway has been tested with arabinokinase (AraK) and UDP-sugar pyrophosphorylase (USP) candidates from A. thaliana (named ‘AtAraK’ - SEQ ID NO: 214 encoded by nucleotide sequence SEQ ID NO: 216 and ‘AtUSP’ - SEQ ID NO: 223 encoded by nucleotide sequence SEQ ID NO: 225, respectively) and Leptospira interrogans (Lei) (named ‘LeiAraK’ - SEQ ID NO: 217 encoded by nucleotide sequence SEQ ID NO: 219 and ‘LeiUSP’ - SEQ ID NO: 226 encoded by nucleotide sequence SEQ ID NO: 228, respectively). An arabinose transporter from Penicillium rubens Wisconsin (named ‘PrAraT’ - SEQ ID NO: 220 encoded by nucleotide sequence SEQ ID NO: 222) has also been tested to determine whether it was necessary for arabinose to enter the yeast, and AtUAM1 has been tested for the conversion of UDP-Arap to UDP-Araf. The following combinations have been integrated into the genome of the parent yeast strain CEN.PK2-1c, wherein the corresponding yeasts were fed with 1% arabinose added exogenously:

- AtAraK-AtUSP (SC-12)

- LeiAraK-LeiUSP (SC-13)

- AtAraK-AtUSP-PrAraT (SC-14)

- AtAraK-AtUSP-PrAraT-AtUAM1 (SC-15)

Results

- Both AraT and the salvage pathway from L. interrogans produced less UDP-Arap (0.910 pmol/g Cell Pellet and 0.665 pmol/g Cell Pellet, respectively), as compared to the salvage pathway from A. thaliana (1.73 pmol/g Cell Pellet).

- UDP-Araf was produced with the salvage pathway, AraT and AtUAM1 at 0.185 pmol/g Cell Pellet (see Fig. 48A).
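For orientation, the UDP-Arap figures reported above can be put on a common scale. The following is a minimal sketch, not part of the described method: it divides each reported value by the A. thaliana salvage pathway value. The dictionary keys are descriptive labels chosen here for illustration, and the numbers are used exactly as reported, per gram of cell pellet.

```python
# UDP-Arap amounts per gram of cell pellet, as reported in the results above.
# The keys are descriptive labels chosen for this sketch (an assumption),
# not strain identifiers taken from the text.
UDP_ARAP = {
    "A. thaliana salvage pathway": 1.73,
    "L. interrogans salvage pathway": 0.665,
    "salvage pathway with AraT": 0.910,
}

REFERENCE = "A. thaliana salvage pathway"


def relative_to_reference(values, reference):
    """Return each value divided by the reference value."""
    ref = values[reference]
    return {name: amount / ref for name, amount in values.items()}


relative = relative_to_reference(UDP_ARAP, REFERENCE)
for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f} of reference")
```

On these figures, the L. interrogans pathway and the AraT-containing strain reach roughly 0.38 and 0.53 of the A. thaliana value, respectively.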

4.5 UDP-Arabinofuranose (Araf) addition

Plant UDP-L-arabinofuranose (UDP-Araf) biosynthesis is closely associated with the Golgi apparatus because L-Araf is a key component of the plant cell wall. UDP-Arap is mainly produced by epimerization of UDP-Xyl in the Golgi lumen. It is then interconverted into UDP-Araf by a UDP-Ara mutase located on the cytosolic surface of the Golgi, and the UDP-Araf is transported back into the Golgi lumen for subsequent glycosylation reactions. Because yeast lacks native sugar transporters on the Golgi membrane, cytosolic homologs of these enzymes from A. thaliana, namely UDP-xylose epimerase (AtUXE) and AtUAM1, were selected to produce UDP-Araf in yeast.

Starting from YL-42 (the yeast strain capable of producing QA-C3-GGX-C28-FRX-C9), genes encoding (i) AtUXS and QsC28XylT4 (to produce QA-C3-GGX-C28-FRXX-C9), or AtAXS and QsC28ApiT4 (to produce QA-C3-GGX-C28-FRXA-C9), (ii) QsDMOT4 (to produce QA-C3-GGX-C28-FRXX-C18 or QA-C3-GGX-C28-FRXA-C18), (iii) AtUXE and AtUAM1 (to produce arabinofuranose from UDP-Xyl), and (iv) an arabinofuranose transferase (named ‘QsArafT’ - SEQ ID NO: 229 encoded by nucleotide sequence SEQ ID NO: 231) (to produce QA-C3-GGX-C28-FRXX-C18-A or QA-C3-GGX-C28-FRXA-C18-A) have been further integrated into its genome, generating two new yeast strains, as summarized below, and 2MB acid was supplemented in the culture media:

- AtUXS-QsC28XylT4-AtUXE-AtUAM1-QsDMOT4-QsArafT (YL-45)

- AtUXS-QsApiT4-AtUXE-AtUAM1-QsDMOT4-QsArafT (YL-46)

Results

In the extracted single-ion LC-MS chromatogram of YL-45, more than one peak was observed when the exact mass of QS-21-Xyl (i.e. QA-C3-GGX-C28-FRXX-C18-A) was extracted (Fig. 49). Likewise, more than one peak was observed with the exact extracted mass of QA-C3-GGX-C28-FRXX-C18, which also corresponds to the mass of QA-C3-GGX-C28-FRX-C18-A.

The peak with a retention time of 11.1-11.2 min co-elutes with a QS-21 standard (the standard corresponds to the QS-21 fraction purified from a crude bark extract of Quillaja saponaria Molina, generated as described in WO 19/106192) (Fig. 50). The extracted sample was also spiked with the standard. The isotopic distribution of the extracted peak mass remained unchanged before and after spiking with the standard (Fig. 50 inset), thereby confirming the production of QS-21-Xyl in YL-45 at a titer of 94.6 µg/L.

Results

- In the extracted single-ion LC-MS chromatogram of YL-46, multiple peaks are similarly observed when the exact mass of QS-21-Api (i.e. QA-C3-GGX-C28-FRXA-C18-A) was extracted (Fig. 51).

- Additionally, similar peak patterns with the exact extracted mass of QA-C3-GGX-C28-FRXA-C18 are also observed, which also corresponds to the mass of QA-C3-GGX-C28-FRX-C18-A.

Since xylose, arabinofuranose (Araf), and arabinopyranose (Arap) are structural isomers, they also have the same exact mass. It is likely that other pentose sugars can be added instead of Araf, leading to the other peaks with the same exact mass as QS-21-Api. Therefore, the substrate scope of the Araf transferase (QsArafT) has been investigated. The strain YL-47 has been constructed by integrating QsArafT without the genes required to convert UDP-Xyl to UDP-Araf (i.e. without AtUXE and AtUAM1).
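The point that isomeric pentoses cannot be told apart by exact mass alone can be illustrated with a short calculation. The following is a minimal sketch, assuming the free-sugar molecular formula C5H10O5 for each pentose and standard monoisotopic masses for C, H and O. It is an illustration only and is not part of the analytical method described here.

```python
# Monoisotopic masses (u) of the most abundant isotopes of C, H and O.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}

# Free pentoses share the molecular formula C5H10O5 (assumption of this
# sketch: the free sugar is considered, not the UDP-sugar or the glycoside).
PENTOSE_FORMULA = {"C": 5, "H": 10, "O": 5}

PENTOSES = {
    "xylose": PENTOSE_FORMULA,
    "arabinofuranose": PENTOSE_FORMULA,
    "arabinopyranose": PENTOSE_FORMULA,
}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in an element-count mapping."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in formula.items())


masses = {name: monoisotopic_mass(formula) for name, formula in PENTOSES.items()}
for name, mass in masses.items():
    print(f"{name}: {mass:.4f} u")
```

Because the three masses are identical, products carrying any of these pentoses at the same position give the same extracted-ion mass and can only be distinguished by retention time or by comparison with a standard.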

Results

- As a result, a new peak is observed that corresponds to QA-C3-GGX-C28-FRX-C18-Xyl, suggesting that QsArafT can also use UDP-Xyl as a substrate instead of UDP-Araf for addition at the end of the acyl chain (Fig. 52).

In search of ArafT homologs that are more specific towards UDP-Araf, BLAST searches were performed on the Q. saponaria transcriptome in the 1KP database (https://db.cngb.org/onekp/) using the QsArafT protein sequence (SEQ ID NO: 229). A candidate with 64% protein sequence homology has been identified, QQHZ_scaffold_2012646, named ‘QsArafT2’ (SEQ ID NO: 232 encoded by the nucleotide sequence SEQ ID NO: 234). First, this candidate has been tested for its activity towards UDP-Xyl. YL-48 has been similarly constructed by integrating QsArafT2 without the genes required to convert UDP-Xyl to UDP-Araf (i.e. without AtUXE and AtUAM1), and 2MB acid was supplemented in the culture media.

Results

- While the production of QA-C3-GGX-C28-FRX-C18 (in YL-48) has been detected, no LC-MS peak corresponding to the addition of Xyl was observed (Fig. 53), suggesting that QsArafT2 does not accept UDP-Xyl as a substrate.

Therefore, a new yeast strain was generated (YL-49), similar to YL-45, except that the gene encoding QsArafT was replaced with a gene encoding QsArafT2.

- The extracted single-ion chromatograms confirmed the production of QS-21-Xyl (i.e. QA-C3-GGX-C28-FRXX-C18-Araf) with a higher ratio of the desired peak relative to the other LC-MS peaks with the same exact mass (Fig. 54).

4.6 Integration of a type I polyketide synthase to produce (S)-2-methylbutyryl CoA in vivo

In order to circumvent the need for exogenous addition of 2MB acid, the biosynthesis of (S)-2-methylbutyryl CoA (2MB-CoA) in vivo in yeast has been investigated. The branched-chain α-keto acid dehydrogenase (BCKD) complex has first been investigated with a transaminase from Bacillus subtilis (Bs), which, in bacteria, would readily convert isoleucine to 2MB-CoA during amino acid metabolism. However, no 2MB-CoA was detected in yeast engineered to express BsBCKD (data not shown). Without wishing to be bound by theory, it is believed that this may be due to yeast lacking the post-translational modification mechanism required for subunit E2 of the BCKD complex.

Alternatively, a 7.6 kb type I polyketide synthase (PKS), LovF from Aspergillus terreus (Ast) (also referred to as ‘Megasynthase LovF’), has been engineered to produce 2MB-CoA in vivo. Native LovF condenses two units of malonyl-CoA to 2MB-ACP, i.e. 2MB covalently attached to the ACP (Acyl Carrier Protein) domain. In order to obtain free 2MB, a promiscuous DEBS (6-deoxyerythronolide B synthase) thioesterase (TE) domain from Saccharopolyspora erythraea (Se) has been fused at the C-terminus of LovF (the fusion being referred to as ‘LovF-TE’), to cleave 2MB acid from the ACP domain. The resulting 2MB acid can then be converted into 2MB-CoA by QsCCL, as in the case of exogenous 2MB supplementation. An additional phosphopantetheinyl (Ppant) transferase is required for LovF to be functional in a heterologous host. Accordingly, a chromosomal copy of a Ppant transferase candidate from Aspergillus nidulans (named ‘AnNpgA’ according to SEQ ID NO: 237, encoded by the nucleotide sequence SEQ ID NO: 239) has been integrated into the genome of the parent yeast strain CEN.PK2-1c to generate YL-AnNpgA. A plasmid expressing AstLovF-TE according to SEQ ID NO: 235 (encoded by SEQ ID NO: 236) and QsCCL according to SEQ ID NO: 178 (encoded by SEQ ID NO: 180) has been transformed into YL-AnNpgA to generate YL-PKS. Additionally, AnNpgA and AstLovF-TE have been integrated into the genome of YL-42 (a yeast strain producing QA-C3-GGX-C28-FRX) to generate YL-42-AstLovF-TE, as well as into the genome of YL-45 (a yeast strain producing QA-C3-GGX-C28-FRXX-C18-Araf or QS-21-Xyl, in the presence of 2MB acid supplemented exogenously) and YL-46 (a yeast strain producing QA-C3-GGX-C28-FRXA-C18-Araf or QS-21-Api, in the presence of 2MB acid supplemented exogenously) to generate YL-50 and YL-51, respectively.

Results

- The production of 2MB-CoA by YL-PKS (engineered with LovF-TE) has been confirmed by LC-MS (Fig. 43), demonstrating that the engineered type I PKS LovF-TE successfully releases free 2MB acid from the ACP domain, which is subsequently ligated to CoA by QsCCL.

- While the peak integration of 2MB-CoA is lower than that of the 2MB acid feed-in experiment, the production of QA-C3-GGX-C28-FRX-C9 using NpgA and LovF-TE in YL-42-AstLovF-TE was more comparable to that of the feed-in experiment with YL-42, at approximately 50% (data not shown).

- The complete biosynthesis of QS-21-Xyl and QS-21-Api in YL-50 and YL-51, respectively, was observed (Fig. 55 and Fig. 56, respectively), in the absence of any 2MB acid added exogenously.

Conclusion

Using more than 30 heterologous enzymes and proteins from different plant and microbial origins (e.g. G. vaccaria, Q. saponaria, A. thaliana, S. vaccaria, Thermothelomyces thermophilus, Aspergillus nidulans, and Aspergillus terreus), the inventors have been able to reconstruct in yeast the metabolic pathway leading to the synthesis of QS-21-Xyl and QS-21-Api (the two main isomeric constituents present in the QS-21 fraction traditionally purified from the bark of the Q. saponaria Molina tree), achieving, for the first time, the successful production of QS-21-Xyl and QS-21-Api in yeast.

Example 5 - Methods

5.1 Expression in N. benthamiana

N. benthamiana transient expression experiments were carried out as described in WO 2020/260475.

5.2 Yeast engineering

Genes were assembled into pESC plasmids, which contain two multiple cloning sites driven individually by the galactose-inducible promoters Gal1p and Gal10p, or placed under the Tet promoter with the tet repressor gene. Nucleotide sequences were codon-optimized for S. cerevisiae using the IDT online tool. Integration was performed with an in-house-developed CRISPR/Cas9 toolkit. Integration was confirmed by colony PCR, and confirmed strains were stored as glycerol stocks at -80 °C.

5.3 Production and metabolite extraction

Production of sugars and QA derivatives was carried out by first streaking the glycerol stock of the desired yeast strain onto a YPD (yeast extract peptone 2% dextrose) plate, which was incubated for about 20h at 30 °C to obtain single colonies. Colonies were picked from the plate and cultured for 48h in 5 mL YPD with shaking at 200 rpm at 30 °C. The cultures were then spun down, resuspended in an equal volume of YPGal (yeast extract peptone 2% galactose) media and cultured further at 200 rpm and 30 °C, inducing expression from the Gal1 and Gal10 promoters. Samples were collected between 36h and 48h post-induction for metabolite extraction. Yeast cell cultures (or cell pellets for the production of sugars) were extracted with methanol/chloroform/water (2:2:1 v/v/v). Aqueous and organic layers were separated by centrifugation and the aqueous layer was collected. The collected layer was then evaporated in a speed vac at room temperature and resuspended in 0.3% formic acid at pH 9 (adjusted with ammonium acetate).

5.4 LC-MS detection

LC-MS analysis was carried out using an Agilent 1260 Infinity HPLC system attached to an iQ MSD. Detection: MS (ESI ionization, spray voltage Positive 4.5 kV, Negative -3.5 kV, mass range 400 - 1000, negative ion mode). LC Method: Solvent A: [H2O + 0.3 % formic acid at pH 9 (pH adjusted with ammonium hydroxide)]; Solvent B: [acetonitrile (CH3CN) + 0.1% formic acid]. Injection volume: 5 µL. Gradient: 2% to 15% [B] from 0 to 20 min, 15% to 50% [B] from 20 to 26 min, 50% to 90% [B] from 26 to 27 min, 90% [B] from 27 to 30 min, 90% to 2% [B] from 30 to 31 min, 2% [B] from 31 to 50 min. The method was run at a flow rate of 0.1 mL min-1 on a Porous Graphitic Carbon column (Hypercarb, 5 µm, 1 x 150 mm Analytical Column) (or as described in WO 22/136563).
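The gradient program above can be expressed as a table of time points with linear ramps between them. The following is a minimal sketch for checking the solvent composition at any time in the run. The function name `percent_b` and the breakpoint table layout are illustrative choices made here, and linear interpolation between the stated points is an assumption.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints taken from the gradient description.
GRADIENT = [
    (0.0, 2.0),
    (20.0, 15.0),
    (26.0, 50.0),
    (27.0, 90.0),
    (30.0, 90.0),
    (31.0, 2.0),
    (50.0, 2.0),
]


def percent_b(t_min):
    """Return % solvent B at time t_min, assuming linear ramps between breakpoints."""
    first, last = GRADIENT[0][0], GRADIENT[-1][0]
    if not first <= t_min <= last:
        raise ValueError(f"time must be within {first}-{last} min")
    times = [t for t, _ in GRADIENT]
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 10, 23, 28, 40):
    print(f"t = {t:>2} min: {percent_b(t):.1f}% B")
```

Under these assumptions the composition is 8.5% B at 10 min and 32.5% B at 23 min, and the column re-equilibrates at 2% B from 31 min to the end of the run.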

Table 3 - Genotype of the engineered YL yeast strains

Table 4 - Genotypes of the engineered SC yeast strains

Lei - Leptospira interrogans

Hv - Hordeum vulgare

Pr - Penicillium rubens Wisconsin

Table 5 - Enzymes

REFERENCES

Kensil et al. (1991) “Separation and characterization of saponins with adjuvant activity from Quillaja saponaria Molina cortex”; Journal of Immunology, Vol. 146: p431-437

Ragupathi et al. (2011) “Natural and synthetic saponin adjuvant QS-21 for vaccines against cancer”; Expert Review of Vaccines, Vol. 10: p463-470

Garcon et al. (2011) “Recent clinical experience with vaccines using MPL and QS-21-containing adjuvant systems”; Expert Review of Vaccines, Vol. 10(4): p471-486

Decker and Kleczkowski (2017) “Substrate Specificity and Inhibitor Sensitivity of Plant UDP-Sugar Producing Pyrophosphorylases”; Frontiers in Plant Science, Vol. 8(1610): p1-16

Kirby et al. (2008)

Wong et al. (2018)

Gosh, 2017

WO 19/106192

WO 19/122259

WO 20/260475

WO 22/136563

WO 20/263524
