

Title:
ENGINEERING OF RHIZOMUCOR MIEHEI LIPASE TOWARDS AMIDE BOND FORMATION FOR THE SYNTHESIS OF MEDIUM TO LONG CHAIN N-ACYL GLYCINES IN AQUEOUS MEDIA
Document Type and Number:
WIPO Patent Application WO/2024/058717
Kind Code:
A1
Abstract:
Described herein are lipase variants of a Rhizomucor miehei lipase. The lipase variants may be used to prepare medium to long chain N-acyl glycines. The variants may have a parent sequence of SEQ ID NO: 3; and a D156 substitution and/or a L258 substitution, wherein the D156 substitution is selected from the group consisting of serine (S), glycine (G), glutamic acid (E), and threonine (T), wherein if the D156 substitution is glycine, the lipase variant comprises at least one other substitution, and the L258 substitution is selected from the group consisting of lysine (K), arginine (R), glutamine (Q), asparagine (N), histidine (H), and serine (S). The variants may have a parent sequence of SEQ ID NO: 3; and at least one substitution of the following amino acids: D156, L258, L267, V254, L255, N264, T265, S83, S84, W88, R30, R86, S56, L58, and I59.

Inventors:
LI ZHI (SG)
KUA KAI BIN (SG)
NGUYEN KIEN TRUC GIANG (SG)
Application Number:
PCT/SG2023/050619
Publication Date:
March 21, 2024
Filing Date:
September 13, 2023
Assignee:
WILMAR INTERNATIONAL LTD (SG)
NAT UNIV SINGAPORE (SG)
International Classes:
C12N15/55; C12N9/20; C12P7/6418; C12R1/645
Domestic Patent References:
WO1990014429A1 1990-11-29
WO2022211736A1 2022-10-06
Foreign References:
CN111363734A 2020-07-03
US20060229223A1 2006-10-12
Other References:
LI GUANLIN, FANG XINGRONG, SU FENG, CHEN YUAN, XU LI, YAN YUNJUN: "Enhancing the Thermostability of Rhizomucor miehei Lipase with a Limited Screening Library by Rational-Design Point Mutations and Disulfide Bonds", APPLIED AND ENVIRONMENTAL MICROBIOLOGY, AMERICAN SOCIETY FOR MICROBIOLOGY, UNITED STATES, 15 January 2018 (2018-01-15), United States, XP093150241, Retrieved from the Internet [retrieved on 20240410], DOI: 10.1128/AEM.02129-17
KUA GLEN KAI BIN, NGUYEN GIANG KIEN TRUC, LI ZHI: "Enzyme Engineering for High‐Yielding Amide Formation: Lipase‐Catalyzed Synthesis of N ‐Acyl Glycines in Aqueous Media", ANGEWANDTE CHEMIE INTERNATIONAL EDITION, VERLAG CHEMIE, HOBOKEN, USA, vol. 62, no. 14, 27 March 2023 (2023-03-27), Hoboken, USA, XP093150244, ISSN: 1433-7851, DOI: 10.1002/anie.202217878
Attorney, Agent or Firm:
AMICA LAW LLC (SG)
Claims:

1. A lipase variant comprising a parent sequence of SEQ ID NO: 3; and a D156 substitution of SEQ ID NO: 3 and/or a L258 substitution of SEQ ID NO: 3, wherein the D156 substitution is selected from the group consisting of serine (S), glycine (G), glutamic acid (E), and threonine (T), wherein if the D156 substitution is glycine, the lipase variant comprises at least one other substitution, and the L258 substitution is selected from the group consisting of lysine (K), arginine (R), glutamine (Q), asparagine (N), histidine (H), and serine (S).

2. A lipase variant comprising a parent sequence of SEQ ID NO: 3; and at least one substitution selected from the group consisting of: a D156 substitution of SEQ ID NO: 3, the D156 substitution is selected from the group consisting of serine (S), glycine (G), glutamic acid (E), and threonine (T), wherein if the D156 substitution is glycine, the lipase variant comprises at least one other substitution; a L258 substitution of SEQ ID NO: 3, the L258 substitution is selected from the group consisting of lysine (K), arginine (R), glutamine (Q), asparagine (N), histidine (H), and serine (S); a L267 substitution of SEQ ID NO: 3, wherein the L267 substitution is selected from the group consisting of asparagine (N), arginine (R), lysine (K), glutamine (Q), histidine (H), and serine (S); a V254 substitution of SEQ ID NO: 3, wherein the V254 substitution is selected from the group consisting of alanine (A), aspartic acid (D), threonine (T), arginine (R), and glycine (G), preferably alanine; a L255 substitution of SEQ ID NO: 3, wherein the L255 substitution is selected from the group consisting of lysine (K), threonine (T), arginine (R), alanine (A), and glycine (G); a N264 substitution of SEQ ID NO: 3, wherein the N264 substitution is selected from the group consisting of threonine (T), arginine (R), serine (S), glutamic acid (E), alanine (A), aspartic acid (D), and glycine (G); a T265 substitution of SEQ ID NO: 3, wherein the T265 substitution is selected from the group consisting of serine (S), glutamic acid (E), alanine (A), and asparagine (N); a S83 substitution of SEQ ID NO: 3, wherein the S83 substitution is selected from the group consisting of aspartic acid (D), asparagine (N), alanine (A), glycine (G), and glutamic acid (E); a S84 substitution of SEQ ID NO: 3, wherein the S84 substitution is selected from the group consisting of glycine (G), and threonine (T); a W88 substitution of SEQ ID NO: 3, wherein the W88 substitution is selected from the group consisting of alanine (A), isoleucine (I), valine (V), aspartic acid (D), glycine (G), threonine (T), and serine (S), preferably alanine, isoleucine, and valine; a R30 substitution of SEQ ID NO: 3, wherein the R30 substitution is selected from the group consisting of glutamic acid (E), and lysine (K); a R86 substitution of SEQ ID NO: 3, wherein the R86 substitution is selected from the group consisting of lysine (K), alanine (A), asparagine (N), and threonine (T); a S56 substitution of SEQ ID NO: 3, wherein the S56 substitution is selected from the group consisting of arginine (R), lysine (K), alanine (A), asparagine (N), and threonine (T); a L58 substitution, wherein the L58 substitution is selected from the group consisting of lysine (K), aspartic acid (D), serine (S), arginine (R), glutamic acid (E), alanine (A), asparagine (N), and threonine (T); and an I59 substitution, wherein the I59 substitution is selected from the group consisting of threonine (T), alanine (A), arginine (R), lysine (K), and leucine (L).

3. The lipase variant according to claims 1 or 2, wherein both the D156 substitution and the L258 substitution are present.

4. The lipase variant according to any one of claims 1 to 3 further comprising a L267 substitution of SEQ ID NO: 3, wherein the L267 substitution is selected from the group consisting of asparagine (N), arginine (R), lysine (K), glutamine (Q), histidine (H), and serine (S).

5. The lipase variant according to any one of claims 1 to 4, wherein if the L258 substitution and the L267 substitution are present in the lipase variant, each of the L258 substitution and the L267 substitution is independently selected from the group consisting of arginine (R), lysine (K), asparagine (N), and serine (S).

6. The lipase variant according to claim 5, wherein the D156 substitution is selected from the group consisting of serine and glycine, the L258 substitution is lysine, and the L267 substitution is asparagine.

7. The lipase variant according to any one of claims 1 to 6 further comprising a V254 substitution of SEQ ID NO: 3, wherein the V254 substitution is selected from the group consisting of alanine (A), aspartic acid (D), threonine (T), arginine (R), and glycine (G), preferably alanine.

8. The lipase variant according to any one of claims 1 to 6 further comprising a L255 substitution of SEQ ID NO: 3, wherein the L255 substitution is selected from the group consisting of lysine (K), threonine (T), arginine (R), alanine (A), and glycine (G).

9. The lipase variant according to any one of claims 1 to 6 further comprising a N264 substitution of SEQ ID NO: 3, wherein the N264 substitution is selected from the group consisting of threonine (T), arginine (R), serine (S), glutamic acid (E), alanine (A), aspartic acid (D), and glycine (G).

10. The lipase variant according to any one of claims 1 to 6 further comprising a T265 substitution of SEQ ID NO: 3, wherein the T265 substitution is selected from the group consisting of serine (S), glutamic acid (E), alanine (A), and asparagine (N).

11. The lipase variant according to any one of claims 1 to 6 further comprising a S83 substitution at position 83 of SEQ ID NO: 3, wherein the S83 substitution is selected from the group consisting of aspartic acid (D), asparagine (N), alanine (A), glycine (G), and glutamic acid (E).

12. The lipase variant according to claim 11, wherein the S83 substitution is aspartic acid.

13. The lipase variant according to any one of claims 1 to 12, further comprising at least one of the following substitutions: a S84 substitution of SEQ ID NO: 3, wherein the S84 substitution is selected from the group consisting of glycine (G), and threonine (T); a W88 substitution of SEQ ID NO: 3, wherein the W88 substitution is selected from the group consisting of alanine (A), isoleucine (I), valine (V), aspartic acid (D), glycine (G), threonine (T), and serine (S), preferably alanine, isoleucine, and valine; a R30 substitution of SEQ ID NO: 3, wherein the R30 substitution is selected from the group consisting of glutamic acid (E), and lysine (K); a R86 substitution of SEQ ID NO: 3, wherein the R86 substitution is selected from the group consisting of lysine (K), alanine (A), asparagine (N), and threonine (T); a S56 substitution of SEQ ID NO: 3, wherein the S56 substitution is selected from the group consisting of arginine (R), lysine (K), alanine (A), asparagine (N), and threonine (T); a L58 substitution of SEQ ID NO: 3, wherein the L58 substitution is selected from the group consisting of lysine (K), aspartic acid (D), serine (S), arginine (R), glutamic acid (E), alanine (A), asparagine (N), and threonine (T); and an I59 substitution of SEQ ID NO: 3, wherein the I59 substitution is selected from the group consisting of threonine (T), alanine (A), arginine (R), lysine (K), and leucine (L).

14. The lipase variant according to claim 13, wherein the L58 substitution is lysine.

15. The lipase variant according to any one of claims 13 or 14, wherein the R86 substitution is lysine.

16. The lipase variant according to any one of claims 13 to 15, wherein the W88 substitution is valine.

17. The lipase variant according to any one of claims 1 to 16 comprising a propeptide, preferably the propeptide is selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 199, and SEQ ID NO: 200.

18. The lipase variant according to claim 17 comprising a cleavage site to allow the propeptide to be cleaved off.

19. The lipase variant according to any one of claims 1 to 18, wherein a sequence of the lipase variant has at least 90% sequence identity to SEQ ID NO: 3, preferably at least 95% sequence identity to SEQ ID NO: 3, more preferably at least 96% sequence identity to SEQ ID NO: 3, even more preferably at least 97% sequence identity to SEQ ID NO: 3.

20. The lipase variant according to claim 19, wherein the sequence is selected from the group consisting of SEQ ID NO: 110, SEQ ID NO: 108, SEQ ID NO: 74, SEQ ID NO: 51, SEQ ID NO: 24, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 67, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 89, SEQ ID NO: 90, SEQ ID NO: 91, SEQ ID NO: 92, SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID NO: 95, SEQ ID NO: 96, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 99, SEQ ID NO: 100, SEQ ID NO: 101, SEQ ID NO: 102, SEQ ID NO: 103, SEQ ID NO: 104, SEQ ID NO: 105, SEQ ID NO: 106, SEQ ID NO: 107, and SEQ ID NO: 109.

21. The lipase variant according to claim 19, wherein the sequence is selected from the group consisting of SEQ ID NO: 51, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 67, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 81, SEQ ID NO: 91, SEQ ID NO: 102, SEQ ID NO: 103, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110.

22. The lipase variant according to claim 19, wherein the sequence is selected from the group consisting of SEQ ID NO: 51, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 77, SEQ ID NO: 103, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110.

23. The lipase variant according to any one of claims 20 to 22 consisting essentially of the sequence.

24. The lipase variant according to any one of claims 20 to 22 consisting of the sequence.

25. A method to prepare an N-acyl amino acid, the method comprising: providing a mixture comprising an amine, a carboxylic acid and/or an ester of the carboxylic acid, water, and glycerol; contacting the mixture with the lipase variant according to any one of claims 1 to 24 under suitable conditions to form the N-acyl amino acid.

26. The method according to claim 25, wherein the amine is an amino acid, preferably glycine or beta-alanine.

27. The method according to claims 25 or 26, wherein the carboxylic acid comprises a linear saturated or unsaturated aliphatic chain having 8 to 18 carbon atoms.

28. The method according to claims 25 or 26, wherein the ester is a glyceride ester, and may be a monoglyceride ester, diglyceride ester, or a triglyceride ester.

29. The method according to any one of claims 25 to 28, wherein water is present in at least 10% v/v of the mixture, and preferably at most 80% v/v of the mixture, more preferably 30% to 50% v/v of the mixture.

30. The method according to any one of claims 25 to 29, wherein glycerol is present in up to 90% v/v of the mixture, preferably at least 20% v/v of the mixture, more preferably 50% to 70% v/v of the mixture.

31. The method according to any one of claims 25 to 30, wherein the lipase variant is provided by expression in a microorganism, preferably the microorganism is Pichia pastoris or Escherichia coli.

32. The method according to any one of claims 25 to 31, wherein the lipase variant is free in the mixture and not bound, for example to a solid support or microorganism.

33. The method according to any one of claims 25 to 32, wherein the suitable conditions comprise at least one of the following: a temperature of 20 °C to 60 °C; a pH of 6 to 9; and at least 1 mM of the carboxylic acid and/or the ester per 1 mg of the lipase variant, preferably at least 5 mM of the carboxylic acid and/or the ester, and more preferably at most 50 mM of the carboxylic acid and/or the ester.

34. The method according to claim 33, wherein the temperature is selected such that the carboxylic acid or ester is a liquid at the temperature and below an aggregation temperature of the lipase variant.

35. An N-acyl amino acid obtained by or obtainable by the method according to any one of claims 25 to 34.

Description:
Engineering of Rhizomucor Miehei Lipase Towards Amide Bond Formation for the Synthesis of Medium to Long Chain N-Acyl Glycines in Aqueous Media

[0001] The present application claims priority to Singapore patent application number 10202250997A filed on 14 September 2022 which is incorporated by reference herein in its entirety.

Technical Field

[0002] The present invention is related to synthetic enzymes, and their use in synthesis.

Background

[0003] N-acyl amino acids constitute an important class of lipoamino acids that are widely used in the cosmetic and pharmaceutical industries in view of their excellent emulsifying and biological activities 1,2. These qualities are attributed to their amphipathic properties, which arise from the presence of an amino acid as the hydrophilic moiety and a fatty acid chain as the hydrophobic moiety. N-acyl glycines are one of the major classes of lipoamino acids with diverse applications across these industries, and their utility depends on the chain length of the fatty acid moiety. For example, the medium chain (C12-C14) N-lauroylglycine and N-myristoylglycine are commonly used as biosurfactants in cosmetic and personal care formulations due to their good surface activities and foaming properties 3,4. On the other hand, long-chain (C16-C22) N-acyl glycines are endogenous signaling molecules found throughout the mammalian central nervous system and other tissues, where they exert analgesic, anti-inflammatory and other pharmacological effects 5-9. For example, the well-studied N-arachidonoylglycine is an agonist of the GPR18 receptor on microglial cells which produces anti-nociceptive and anti-inflammatory effects 6, while N-palmitoylglycine acts as a modulator of calcium influx and inhibits the heat-evoked firing of nociceptive sensory neurons 7. Other N-acyl glycines such as N-oleoylglycine, N-stearoylglycine and N-linoleoylglycine are also present in the mammalian central nervous system and other tissues as regulators of body temperature, locomotion, and other physiological functions 9,10. These properties make long chain N-acyl glycines a potentially useful class of candidates for the development of novel therapeutics.

[0004] At present, the commercial preparation of N-acyl amino acid surfactants adopts a chemical synthesis pathway that proceeds via the Schotten-Baumann reaction 11,12. This pathway involves the use of phosgene, an extremely poisonous chemical used during World War I as a chemical weapon, to convert free fatty acids into the corresponding acid chlorides 13, followed by the condensation of the fatty acyl chloride with an amino acid under alkaline conditions and high temperatures. Although this method produces the amides in good yields, stoichiometric amounts of the activating reagent are required, and the generation of an equivalent amount of waste makes it a low-atom-economy process 14,15. This was also echoed by the American Chemical Society Green Chemistry Institute, which voted ‘amide bond formation avoiding poor atom economy reagents’ as the top challenge in green chemistry 15. Additionally, the use of toxic irritants such as fatty acyl chlorides and the generation of byproducts create quality and safety concerns for end-users in personal care and pharmaceuticals and necessitate further purification steps for their complete elimination 11.

[0005] In view of worldwide environmental concerns and the demand for biodegradable and greener surfactants, greater emphasis has shifted toward environmentally friendly processes and the sustainable production of biosurfactants. Enzymatic reactions, being mild and highly selective, offer the direct synthesis of amides with fewer side-products, hence circumventing the drawbacks associated with chemical synthesis and facilitating product purification 16. In the biosynthesis of amides, adenosine triphosphate (ATP)-dependent enzymes such as acyl-CoA synthetases (ACS) are well-established enzymes that activate free fatty acids into acyl-CoA via an acyl-adenylate (acyl-AMP) intermediate, followed by the coupling of acyl-CoA with an amine using an N-acyltransferase (NAT) enzyme 17. However, they have low applicability to larger scale amide synthesis due to the use of prohibitively expensive cofactors and the difficulty in obtaining pairs of ACS and NAT with matching substrate scope and selectivity 17. Other reported enzymes for amide bond formation are ATP-independent enzymes, such as acylases and lipases. Many researchers have reported on the utility of mammalian acylase I from porcine or hog kidney for this application 18-22, although these enzymes are not suitable for commercial scale-up due to their low yields, which stem from a reaction equilibrium that favours hydrolysis over synthesis 21, and their vulnerability to oxidative and thermal inactivation 22. In addition to mammalian acylases, recently published reports have also highlighted the hydrolysis of N-acyl amino acids by microbial aminoacylases from Streptomyces or Burkholderia species 23-30; however, their synthetic activities have not been thoroughly investigated. Moreover, most of the aminoacylases from these species have high substrate specificity towards acyl acceptors such as lysine, arginine, phenylalanine or methionine, with much less specificity for glycine 23,25,26,28-30.

[0006] A preferred alternative for the synthesis of N-acyl glycines is the use of lipases. Lipases (E.C. 3.1.1.3), which belong to the hydrolase family, are ATP-independent and cofactor-independent enzymes that are of great interest as catalysts in synthetic reactions. This class of enzymes has been shown to perform N-acylation reactions to generate amide-bond-containing biomolecules with limited hydrolysis of the amide 31-39. More importantly, highly thermostable lipases allow synthetic reactions to be performed at a temperature higher than the melting point of the fatty acid or ester substrates. However, one of the main challenges associated with the use of lipases for synthesizing amides using free fatty acids as the acyl donor is the formation of unreactive salts between carboxylic acids and amines 40. Only a few examples have been reported using a free carboxylic acid as the acyl donor for amide synthesis. All these examples suffer from long reaction times, with poor to moderate amide yields 33-39. Moreover, most examples relied heavily on strictly non-aqueous systems to exclude water when forming the amide bond 33-39, which inadvertently limits the solubility of the amino acid in the reaction system. Thus, naturally occurring lipases are generally limited to the hydrolysis of fats and may not be able to perform N-acylation reactions with high yields and high atom economy.

Summary

[0007] In a first aspect, there is provided a lipase variant comprising a parent sequence of SEQ ID NO: 3; and a D156 substitution of SEQ ID NO: 3 and/or a L258 substitution of SEQ ID NO: 3, wherein the D156 substitution is selected from the group consisting of serine (S), glycine (G), glutamic acid (E), and threonine (T), wherein if the D156 substitution is glycine, the lipase variant comprises at least one other substitution, and the L258 substitution is selected from the group consisting of lysine (K), arginine (R), glutamine (Q), asparagine (N), histidine (H), and serine (S).

[0008] In a second aspect, there is provided a lipase variant comprising a parent sequence of SEQ ID NO: 3; and at least one substitution selected from the group consisting of:

(i) a D156 substitution of SEQ ID NO: 3, the D156 substitution is selected from the group consisting of serine (S), glycine (G), glutamic acid (E), and threonine (T), wherein if the D156 substitution is glycine, the lipase variant comprises at least one other substitution; a L258 substitution of SEQ ID NO: 3, the L258 substitution is selected from the group consisting of lysine (K), arginine (R), glutamine (Q), asparagine (N), histidine (H), and serine (S);

(ii) a L267 substitution of SEQ ID NO: 3, wherein the L267 substitution is selected from the group consisting of asparagine (N), arginine (R), lysine (K), glutamine (Q), histidine (H), and serine (S);

(iii) a V254 substitution of SEQ ID NO: 3, wherein the V254 substitution is selected from the group consisting of alanine (A), aspartic acid (D), threonine (T), arginine (R), and glycine (G), preferably alanine;

(iv) a L255 substitution of SEQ ID NO: 3, wherein the L255 substitution is selected from the group consisting of lysine (K), threonine (T), arginine (R), alanine (A), and glycine (G);

(v) a N264 substitution of SEQ ID NO: 3, wherein the N264 substitution is selected from the group consisting of threonine (T), arginine (R), serine (S), glutamic acid (E), alanine (A), aspartic acid (D), and glycine (G);

(vi) a T265 substitution of SEQ ID NO: 3, wherein the T265 substitution is selected from the group consisting of serine (S), glutamic acid (E), alanine (A), and asparagine (N);

(vii) a S83 substitution of SEQ ID NO: 3, wherein the S83 substitution is selected from the group consisting of aspartic acid (D), asparagine (N), alanine (A), glycine (G), and glutamic acid (E);

(viii) a S84 substitution of SEQ ID NO: 3, wherein the S84 substitution is selected from the group consisting of glycine (G), and threonine (T);

(ix) a W88 substitution of SEQ ID NO: 3, wherein the W88 substitution is selected from the group consisting of alanine (A), isoleucine (I), valine (V), aspartic acid (D), glycine (G), threonine (T), and serine (S), preferably alanine, isoleucine, and valine;

(x) a R30 substitution of SEQ ID NO: 3, wherein the R30 substitution is selected from the group consisting of glutamic acid (E), and lysine (K);

(xi) a R86 substitution of SEQ ID NO: 3, wherein the R86 substitution is selected from the group consisting of lysine (K), alanine (A), asparagine (N), and threonine (T);

(xii) a S56 substitution of SEQ ID NO: 3, wherein the S56 substitution is selected from the group consisting of arginine (R), lysine (K), alanine (A), asparagine (N), and threonine (T);

(xiii) a L58 substitution, wherein the L58 substitution is selected from the group consisting of lysine (K), aspartic acid (D), serine (S), arginine (R), glutamic acid (E), alanine (A), asparagine (N), and threonine (T); and

(xiv) an I59 substitution, wherein the I59 substitution is selected from the group consisting of threonine (T), alanine (A), arginine (R), lysine (K), and leucine (L).
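
The substitution lists of the first and second aspects can be captured as a lookup table. The following Python sketch is illustrative only and is not part of the application: the table is transcribed from the lists above, while the code format (`D156S`), the function names, and the literal reading of "and/or" and of the glycine proviso are assumptions made for the example. It is not a claim construction.

```python
import re

# Replacement residues listed for each parent position of SEQ ID NO: 3.
# Keys are (wild-type residue, position); values are one-letter codes.
ALLOWED = {
    ("D", 156): set("SGET"),
    ("L", 258): set("KRQNHS"),
    ("L", 267): set("NRKQHS"),
    ("V", 254): set("ADTRG"),
    ("L", 255): set("KTRAG"),
    ("N", 264): set("TRSEADG"),
    ("T", 265): set("SEAN"),
    ("S", 83): set("DNAGE"),
    ("S", 84): set("GT"),
    ("W", 88): set("AIVDGTS"),
    ("R", 30): set("EK"),
    ("R", 86): set("KANT"),
    ("S", 56): set("RKANT"),
    ("L", 58): set("KDSREANT"),
    ("I", 59): set("TARKL"),
}

# Positions named in the first aspect.
FIRST_ASPECT_SITES = {("D", 156), ("L", 258)}


def parse(code):
    """Split a hypothetical code such as 'D156S' into ('D', 156, 'S')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def is_listed(code):
    """True if the replacement residue is listed for that parent position."""
    wild_type, position, replacement = parse(code)
    return replacement in ALLOWED.get((wild_type, position), set())


def glycine_proviso_ok(codes):
    """D156G must be accompanied by at least one other substitution."""
    return "D156G" not in codes or len(set(codes)) > 1


def satisfies_first_aspect(codes):
    """At least one listed D156 or L258 substitution, plus the proviso."""
    core = [c for c in codes if parse(c)[:2] in FIRST_ASPECT_SITES and is_listed(c)]
    return bool(core) and glycine_proviso_ok(codes)


def satisfies_second_aspect(codes):
    """At least one listed substitution at any position, plus the proviso."""
    return any(is_listed(c) for c in codes) and glycine_proviso_ok(codes)
```

Under these assumptions, `satisfies_first_aspect(["D156G"])` is false because of the glycine proviso, whereas `satisfies_first_aspect(["D156G", "L258K"])` is true. A variant carrying only `L267N` fails the first-aspect check but passes the second-aspect check.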

[0009] The term “parent sequence” as used herein refers to the original sequence of the protein (specifically the lipase herein) that is modified with one or more amino acid substitutions to produce a mutant or lipase variant. For example, a lipase variant comprising a parent sequence of SEQ ID NO: 3 and a D156 substitution of SEQ ID NO: 3 would have the original SEQ ID NO: 3 with at least D156 (aspartic acid at the 156th amino acid position of SEQ ID NO: 3) substituted; the term is used similarly to describe the other lipase variants herein.
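
The relationship between a parent sequence and a variant can be illustrated with a short Python sketch. This is a sketch under stated assumptions, not part of the application: the function name `apply_substitution` and the five-residue toy parent are hypothetical, and the toy string is not SEQ ID NO: 3. Positions are 1-based, as in "D156".

```python
def apply_substitution(parent, code):
    """Return the parent sequence with one substitution applied.

    `code` is a hypothetical string such as 'D156S': wild-type residue,
    1-based position in the parent sequence, replacement residue.
    """
    wild_type, position, replacement = code[0], int(code[1:-1]), code[-1]
    index = position - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(parent):
        raise ValueError(f"position {position} is outside the parent sequence")
    if parent[index] != wild_type:
        raise ValueError(
            f"parent has {parent[index]} at position {position}, not {wild_type}"
        )
    return parent[:index] + replacement + parent[index + 1:]


# Toy parent for illustration only; it is NOT SEQ ID NO: 3.
toy_parent = "MKDLA"
print(apply_substitution(toy_parent, "D3S"))  # MKSLA
```

Checking the wild-type residue before substituting guards against position numbering that is offset from the parent sequence, for example when a propeptide is counted.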

[0010] Preferably, both the D156 substitution and the L258 substitution are present.

[0011] Preferably, the lipase variant further comprises a L267 substitution of SEQ ID NO: 3, wherein the L267 substitution is selected from the group consisting of asparagine (N), arginine (R), lysine (K), glutamine (Q), histidine (H), and serine (S).

[0012] Preferably, if the L258 substitution and the L267 substitution are present in the lipase variant, each of the L258 substitution and the L267 substitution is independently selected from the group consisting of arginine (R), lysine (K), asparagine (N), and serine (S).

[0013] Preferably, the D156 substitution is selected from the group consisting of serine and glycine, the L258 substitution is lysine, and the L267 substitution is asparagine.

[0014] Preferably, the lipase variant further comprises a V254 substitution of SEQ ID NO: 3, wherein the V254 substitution is selected from the group consisting of alanine (A), aspartic acid (D), threonine (T), arginine (R), and glycine (G), preferably alanine.

[0015] Preferably, the lipase variant further comprises a L255 substitution of SEQ ID NO: 3, wherein the L255 substitution is selected from the group consisting of lysine (K), threonine (T), arginine (R), alanine (A), and glycine (G).

[0016] Preferably, the lipase variant further comprises a N264 substitution of SEQ ID NO: 3, wherein the N264 substitution is selected from the group consisting of threonine (T), arginine (R), serine (S), glutamic acid (E), alanine (A), aspartic acid (D), and glycine (G).

[0017] Preferably, the lipase variant further comprises a T265 substitution of SEQ ID NO: 3, wherein the T265 substitution is selected from the group consisting of serine (S), glutamic acid (E), alanine (A), and asparagine (N).

[0018] Preferably, the lipase variant further comprises a S83 substitution at position 83 of SEQ ID NO: 3, wherein the S83 substitution is selected from the group consisting of aspartic acid (D), asparagine (N), alanine (A), glycine (G), and glutamic acid (E). In an embodiment, the S83 substitution is aspartic acid.

[0019] Preferably, the lipase variant further comprises at least one of the following substitutions:

(i) a S84 substitution of SEQ ID NO: 3, wherein the S84 substitution is selected from the group consisting of glycine (G), and threonine (T);

(ii) a W88 substitution of SEQ ID NO: 3, wherein the W88 substitution is selected from the group consisting of alanine (A), isoleucine (I), valine (V), aspartic acid (D), glycine (G), threonine (T), and serine (S), preferably alanine, isoleucine, and valine;

(iii) a R30 substitution of SEQ ID NO: 3, wherein the R30 substitution is selected from the group consisting of glutamic acid (E), and lysine (K);

(iv) a R86 substitution of SEQ ID NO: 3, wherein the R86 substitution is selected from the group consisting of lysine (K), alanine (A), asparagine (N), and threonine (T);

(v) a S56 substitution of SEQ ID NO: 3, wherein the S56 substitution is selected from the group consisting of arginine (R), lysine (K), alanine (A), asparagine (N), and threonine (T);

(vi) a L58 substitution of SEQ ID NO: 3, wherein the L58 substitution is selected from the group consisting of lysine (K), aspartic acid (D), serine (S), arginine (R), glutamic acid (E), alanine (A), asparagine (N), and threonine (T); and

(vii) an I59 substitution of SEQ ID NO: 3, wherein the I59 substitution is selected from the group consisting of threonine (T), alanine (A), arginine (R), lysine (K), and leucine (L).

[0020] Preferably, the L58 substitution is lysine.

[0021] Preferably, the R86 substitution is lysine.

[0022] Preferably, the W88 substitution is valine.

[0023] Preferably, the lipase variant further comprises a propeptide. In an embodiment, the propeptide is selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 199, and SEQ ID NO: 200. More preferably, the lipase variant further comprises a cleavage site to allow the propeptide to be cleaved off. The cleavage site may allow a protease to be used to cleave off the propeptide to release the mature peptide and provides flexibility to obtain the lipase variant with and without the propeptide. An example of a cleavage site is SEQ ID NO: 201 which is recognised by the TEV protease for cleavage. Other cleavage sites and corresponding proteases may be used. In an embodiment, one or more linker moieties may be used to attach the mature peptide to propeptide and/or cleavage site. The linker moieties may be amino acid sequences or other functional groups like aliphatic groups, aryl groups, ethers and amines.

[0024] Preferably, a sequence of the lipase variant has at least 90% sequence identity to SEQ ID NO: 3, preferably at least 95% sequence identity to SEQ ID NO: 3, more preferably at least 96% sequence identity to SEQ ID NO: 3, even more preferably at least 97% sequence identity to SEQ ID NO: 3.
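
For a variant that differs from the parent only by substitutions, the identity thresholds above translate into a maximum number of substituted positions. The Python sketch below shows the arithmetic. It assumes a simple position-by-position comparison of two sequences of equal length, which is not necessarily how identity is determined in the application; the function names and the length of 200 residues are hypothetical.

```python
def percent_identity(parent, variant):
    """Percentage of identical positions between two equal-length sequences."""
    if len(parent) != len(variant):
        raise ValueError("this simple comparison needs equal-length sequences")
    matches = sum(a == b for a, b in zip(parent, variant))
    return 100.0 * matches / len(parent)


def max_substitutions(length, min_identity_percent):
    """Largest number of substituted positions that keeps identity at or
    above the threshold, for a substitution-only variant."""
    return length * (100 - min_identity_percent) // 100


# Hypothetical parent length of 200 residues, chosen only for illustration.
for threshold in (90, 95, 96, 97):
    print(threshold, max_substitutions(200, threshold))
```

For the hypothetical 200-residue parent, the loop gives 20, 10, 8 and 6 permitted substitutions at the 90%, 95%, 96% and 97% thresholds respectively.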

[0025] In an embodiment, the sequence is selected from the group consisting of SEQ ID NO: 110, SEQ ID NO: 108, SEQ ID NO: 74, SEQ ID NO: 51, SEQ ID NO: 24, SEQ ID NO:5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 67, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 89, SEQ ID NO: 90, SEQ ID NO: 91, SEQ ID NO: 92, SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID NO: 95, SEQ ID NO: 96, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 99, SEQ ID NO: 100, SEQ ID NO: 101, SEQ ID NO: 102, SEQ ID NO: 103, SEQ ID NO: 104, SEQ ID NO: 105, SEQ ID NO: 106, SEQ ID NO: 107, and SEQ ID NO: 109.

[0026] In an embodiment, the sequence is selected from the group consisting of SEQ ID NO: 51, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 67, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 75, SEQ ID NO: 77, SEQ ID NO: 78, SEQ ID NO: 81, SEQ ID NO: 91, SEQ ID NO: 102, SEQ ID NO: 103, SEQ ID NO: 104, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110.

[0027] In an embodiment, the sequence is selected from the group consisting of SEQ ID NO: 51, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 77, SEQ ID NO: 103, SEQ ID NO: 106, SEQ ID NO: 108, and SEQ ID NO: 110.

[0028] In an embodiment, the lipase variant consists essentially of the sequence (i.e. of the SEQ ID NO: above).

[0029] In an embodiment, the lipase variant consists of the sequence (i.e. of the SEQ ID NO: above).

[0030] In a third aspect, there is provided a method to prepare an N-acyl amino acid. The method comprises: providing a mixture comprising an amine, a carboxylic acid and/or an ester of the carboxylic acid, water, and glycerol; contacting the mixture with the lipase variant according to the first and second aspects under suitable conditions to form the N-acyl amino acid.

[0031] Preferably, the amine is an amino acid, preferably glycine or beta-alanine.

[0032] Preferably, the carboxylic acid comprises a linear saturated or unsaturated aliphatic chain having 8 to 18 carbon atoms. More preferably, the carboxylic acid comprises a linear saturated or unsaturated aliphatic chain having 12 to 18 carbon atoms, even more preferably the carboxylic acid comprises a linear saturated aliphatic chain having 12 to 15 carbon atoms or an unsaturated aliphatic chain having 12 to 18 carbon atoms.

[0033] Preferably, the ester is a glyceride ester, and may be a monoglyceride ester, diglyceride ester, or a triglyceride ester. More preferably, the carboxylic acid comprises a linear saturated or unsaturated aliphatic chain having 12 to 18 carbon atoms, even more preferably the carboxylic acid comprises a linear saturated aliphatic chain having 12 to 15 carbon atoms or an unsaturated aliphatic chain having 12 to 18 carbon atoms.

[0034] Preferably, water is present in at least 10% v/v of the mixture, and preferably at most 80% v/v of the mixture, more preferably 30% to 50% v/v of the mixture.

[0035] Preferably, glycerol is present in up to 90% v/v of the mixture, preferably at least 20% v/v of the mixture, more preferably 50% to 70% v/v of the mixture.

[0036] Preferably, the lipase variant is provided by expression in a microorganism, preferably the microorganism is Pichia pastoris or Escherichia coli.

[0037] Preferably, the lipase variant is free in the mixture and not bound, for example to a solid support or microorganism.

[0038] Preferably, the suitable conditions comprise at least one of the following: a temperature of 20 °C to 60 °C; a pH of 6 to 9; and at least 1 mM of the carboxylic acid and/or the ester per 1 mg of the lipase variant, preferably at least 5 mM of the carboxylic acid and/or the ester, and more preferably at most 50 mM of the carboxylic acid and/or the ester.

[0039] Preferably, the temperature is selected such that the carboxylic acid or ester is a liquid at the temperature and below an aggregation temperature of the lipase variant.

[0040] In a fourth aspect, there is provided an N-acyl amino acid obtained by or obtainable by the method according to the third aspect.

Detailed Description

[0041] In the figures:

[0042] Figure (FIG.) 1 shows the time course of biotransformation of lauric acid to N-lauroylglycine. The biotransformation was performed with 20 mM lauric acid in a 10 mL scale aqueous reaction containing 7.0 mL glycerol and 3.0 mL of 100 mM sodium phosphate buffer with a final concentration of 0.9 M glycine. The reaction contained 1.6 mg of RMLD156G biocatalyst and was maintained at 50 °C and pH 7.5. Diamond filled: N-lauroylglycine; Square filled: 1-monolaurin; Triangle filled: 1,3-dilaurin; Circle filled: trilaurin. The data shown represent the averages of independent experiments performed in triplicate, with error bars indicating standard deviation.

[0043] FIG. 2 shows a schematic of reaction pathways of RMLD156G-catalyzed acylation of lauric acid with glycine for the synthesis of N-lauroylglycine. The wild-type (WT) and D156G mutant refer to the mature RML (P. pastoris), while the Gen 5 mutant refers to the proRML Gen 5 (E. coli) as given in FIG. 28. Specific activities were measured in U/g protein.

[0044] FIG. 3 shows the computational modelling of the selected enzymes. Panel (a) shows the docking of N-acetyl glycine (yellow) and FAD (grey) at the active site of glycine oxidase from Bacillus subtilis (PDB: 1NG3). Panel (b) shows the docking of glycine (yellow) at the active site of conjugated bile acid hydrolase from Clostridium perfringens (PDB: 2RLC). Panel (c) shows the docking of glycine (yellow) at the active site of glycyl tRNA-synthetase from Escherichia coli (PDB: 7EIV). Putative H-bonds and salt bridges between glycine and active site residues are represented by yellow and pink dashed lines, respectively. Panel (d) shows the molecular docking of the N-lauroylglycine ligand onto an open conformation of RML (PDB: 4TGL) using Autodock Vina. Yellow sticks represent the N-lauroylglycine ligand; residues in dark grey are conserved residues; residues in cyan within the nucleophilic binding site and residues in pink within the nucleophile entrance site are suitable hotspots for mutagenesis.

[0045] FIG. 4 shows a schematic of a workflow for mutagenesis, small-scale expression, and acylation assay for screening and positive identification of desirable clones. Workflow involves generation of BL21 mutant libraries, seeding of individual colonies into 96-deepwell plates for small-scale overnight culture and overexpression of proRML, harvesting of BL21 wet cells for acylation assay, and identification of desired N-oleoylglycine band via thin-layer chromatography (TLC).

[0046] FIG. 5 shows the molecular docking of the glycine ligand onto an open conformation of the RML lipases from the wild-type (PDB: 4TGL) to Gen 5 RML using Autodock Vina in panels (a) to (f), respectively. Yellow sticks represent the glycine ligand; residues in dark grey represent conserved residues; residues in pink represent point mutations with respect to the wild-type; and the catalytic serine residue is represented in orange.

[0047] FIG. 6 shows the time courses of biotransformation of lauric acid to N-lauroylglycine amide with proRMLWT and Gen 1-5 mutants. Biotransformation was performed with 10 mM lauric acid in a 10 mL scale aqueous reaction medium containing 7.0 mL glycerol and 3.0 mL of 100 mM sodium phosphate buffer with a final concentration of 0.9 M glycine at 50 °C and pH 7.5 with 1.6 mg proRMLWT and its mutants. Square filled: proRML wild-type; Circle empty: proRML Gen 1; Cross: proRML Gen 2; Triangle filled: proRML Gen 3; Circle filled: proRML Gen 4; Diamond filled: proRML Gen 5. The data shown represent the averages of independent experiments performed in triplicate, with error bars indicating standard deviation.

[0048] FIG. 7 shows the specific activities for the biotransformation of C8-C18 free fatty acids and C12 glyceryl esters with glycine to N-acyl glycine amides by using proRMLWT and proRML Gen 5 mutant. The data shown represent the averages of independent experiments performed in triplicate, with error bars indicating standard deviation.

[0049] FIG. 8 shows the hydrogen bond network of D156, which is positioned at the centre of the alpha helix connected to the nucleophilic serine elbow, in the crystal structure of the RML wild-type (PDB: 4TGL) as illustrated by PyMOL.

[0050] FIG. 9 shows the Static Light Scattering (SLS) measurements of RMLWT and RMLD156G at a wavelength of 266 nm (SLS266) with thermal ramping from 20-95 °C at a rate of 0.6 °C/min with an incubation of 180 s. The RMLD156G plotted line is the higher line in FIG. 9, where a significant increase of the measured SLS values was observed at a temperature of 58.6 °C compared to 62.8 °C for the WT.

[0051] FIG. 10 shows the residual activity for para-nitrophenyl dodecanoate (pNPD) hydrolysis after incubation of RMLWT (circle marker) and RMLD156G (square marker) at temperatures from 30-65 °C for 1 h.

[0052] FIG. 11 shows the expression construct of mature and proRML in Pichia pastoris GS115 and E. coli BL21 hosts, respectively.

[0053] FIG. 12 shows the SDS-PAGE of the expressed enzyme. Lane 1 (L1) is the marker; Lane 2 (L2) is the purified proRML following expression with BL21 host, cell lysis and his-tag purification (41.2 kDa); and Lane 3 (L3) is the mature RML following expression with GS115 and concentration of culture supernatant (29.5 kDa).

[0054] FIG. 13 shows the comparison of specific activities towards pNP-dodecanoate (pNPD) hydrolysis and aminolysis (glycine and 1-monolaurin) between mature RMLD156G and proRMLD156G.

[0055] FIG. 14 shows the plots of initial velocities vs 1-monolaurin concentration for proRML-catalyzed aminolysis of 1-monolaurin and glycine. Panels (a) to (f) are for the following enzymes: (a) proRML wild-type (WT); (b) proRML D156G (Gen 1); (c) proRML D156G/L258K (Gen 2); (d) proRML D156G/L258K/L267N (Gen 3); (e) proRML D156G/L258K/L267N/S83D (Gen 4); (f) proRML D156S/L258K/L267N/S83D/L58K/R86K/W88V (Gen 5).

[0056] FIG. 15A and FIG. 15B show the molecular docking of the 1-monolaurin ligand onto an open conformation of the various RML lipases from the wild-type (PDB: 4TGL) to Gen 5 RML using Autodock Vina. Yellow sticks represent the 1-monolaurin ligand; residues in dark grey represent conserved residues; residues in pink represent point mutations with respect to the wild-type; and the catalytic serine residue is represented in orange. Panels (a) to (f) are for the following enzymes: (a) proRML wild-type (WT); (b) proRML D156G (Gen 1); (c) proRML D156G/L258K (Gen 2); (d) proRML D156G/L258K/L267N (Gen 3); (e) proRML D156G/L258K/L267N/S83D (Gen 4); (f) proRML D156S/L258K/L267N/S83D/L58K/R86K/W88V (Gen 5).

[0057] FIG. 16 shows the SDS-PAGE analysis of the purified proRML wild-type and Gen 1 to Gen 5 mutants as expressed by E. coli, followed by a one-step Ni-NTA his-tag affinity purification. Lane M is for the protein standard markers; Lane 1 (1) is for the proRML wild-type; Lane 2 (2) is for the proRML Gen 1; Lane 3 (3) is for the proRML Gen 2; Lane 4 (4) is for the proRML Gen 3; Lane 5 (5) is for the proRML Gen 4; Lane 6 (6) is for the proRML Gen 5.

[0058] FIG. 17 shows a plate of thin-layer chromatography (TLC) performed to detect the N-acylation of glycine with oleic acid to form N-oleoylglycine. The reaction contained E. coli BL21 wet cells overexpressing proRML mutants, 180 µL acylation mix, and 5 µL of oleic acid, and was carried out at 50 °C and 1500 rpm for 6 h. WT refers to the RML wild-type enzyme; G1 to G5 refer to the RML Gen 1 to 5, respectively.

[0059] FIG. 18 to FIG. 27 show the RP-HPLC analysis of biotransformation of free fatty acids and lauroyl esters as substrates to produce N-acyl glycines. FIG. 18 to FIG. 24 show the formation of the respective N-acyl glycines with the following fatty acids: octanoic acid (C8:0), decanoic acid (C10:0), lauric acid (C12:0), myristic acid (C14:0), pentadecanoic acid (C15:0), palmitic acid (C16:0), and oleic acid (C18:1), respectively. FIG. 25 to FIG. 27 show the use of glyceryl esters as the substrate with monolaurin glyceride (FIG. 25), 1,3-dilaurin glyceride (FIG. 26), and trilaurin glyceride (FIG. 27).

[0060] FIG. 28 shows the aminolysis activities of successive generations of proRML mutants with 1-monolaurin and glycine.

[0061] FIG. 29 shows the kinetic parameters of successive generations of proRML mutants for aminolysis of 1-monolaurin with glycine.

[0062] FIG. 30 shows the proRML Gen 5-catalyzed biotransformation of fatty acids (C8-C18) and lauroyl esters with glycine for the synthesis of C8-C18 N-acyl glycine amides.

[0063] FIG. 31A to FIG. 31E show the comparison of aminolysis activities of successive generations of proRML mutants. The positions of the mutations (or substitutions) are given with respect to the wild-type mature peptide (SEQ ID NO: 3).

[0064] The green and efficient syntheses of medium- to long-chain N-acyl glycines are highly desirable for the cosmetics and pharmaceutical industries for their possible use as biosurfactants and therapeutics. The enzymatic amidation in an aqueous system via glycerol activation of fatty acids and their subsequent aminolysis with glycine to synthesize N-acyl glycines is demonstrated herein. A synthetic lipase (proRML) is engineered by reshaping its catalytic pocket to enhance its aminolysis activity and catalytic efficiency by up to 103-fold and 465-fold, respectively. In an example, the evolved proRML (D156S/L258K/L267N/S83D/L58K/R86K/W88V) catalyzed the amidation of a fatty acid with glycine to give N-lauroylglycine with high yield (80%). It accepts a broad range of fatty acid acyl donors (C8-C18), giving high yields of medium- to long-chain N-acyl glycines. The developed amidation concept may be generally applied, and the engineered enzyme is useful for the green synthesis of valuable N-acyl glycines.

[0065] The lipase-catalyzed amide bond formation proceeds in a glycerol-containing aqueous system to both activate the free fatty acids for amide synthesis and enhance the solubility of the amino acid substrates. In this reaction system, the lipase catalyzes the activation of the free fatty acid with glycerol to form a glyceryl ester intermediate, followed by the aminolysis of the glyceryl ester with an amine donor to generate the desired amide. On this basis, a lipase from Rhizomucor miehei was engineered to enhance its activity for ester-amide interconversion through a rational reshaping of the enzyme catalytic pocket, leading to a high-yielding synthesis of N-acyl glycines. The developed lipase can accept a broad range of substrates, producing a series of medium- to long-chain N-acyl glycines with high yield, thus providing useful potential applications in relevant industries.

[0066] The lipase variants may have a parent sequence of SEQ ID NO: 3; and a D156 substitution and/or a L258 substitution, wherein the D156 substitution is selected from the group consisting of serine (S), glycine (G), glutamic acid (E), and threonine (T), wherein if the D156 substitution is glycine, the lipase variant comprises at least one other substitution, and the L258 substitution is selected from the group consisting of lysine (K), arginine (R), glutamine (Q), asparagine (N), histidine (H), and serine (S). The variants may have a parent sequence of SEQ ID NO: 3; and at least one substitution of the following amino acids: D156, L258, L267, V254, L255, N264, T265, S83, S84, W88, R30, R86, S56, L58, and I59.

[0067] Results and Discussion

Scheme 1. N-acylation of glycine with C8-C18 free fatty acids as the acyl donor for the synthesis of N-acyl glycines.

[Scheme: free fatty acids (1a-g) and glycine are converted by the RML mutant to N-acyl glycines (2a-g); one entry is labelled cis-Δ9.]

[0068] Screening of Suitable Enzyme Candidates for Amide Synthesis

[0069] A screening was done for a suitable enzyme candidate that can perform the N-acylation of lauric acid with glycine to produce N-lauroylglycine. Since enzymatic reactions involving hydrolases are typically reversible, a panel of in-house and commercial lipases was screened based on their hydrolytic ability towards N-lauroylglycine (Table 3). Most lipases exhibited low hydrolytic activity towards N-lauroylglycine apart from an in-house mutant from Rhizomucor miehei obtained from the culture supernatant of a recombinant methylotrophic P. pastoris GS115 strain. The mutant contained a D156G point mutation (RMLD156G) and possessed hydrolytic activity that is 11.2 times higher than that of its wild-type (WT) counterpart. Consequently, RMLD156G was selected as the enzyme candidate for the synthesis of N-acyl glycines.

[0070] Lipase-Catalyzed Amide Bond Formation in Glycerol-Containing Aqueous System

[0071] To develop a suitable reaction medium for amide synthesis, a series of reaction media containing saturated glycine solution with a range of organic cosolvents at 10-90% (v/v) were examined for the RMLD156G-catalyzed biotransformation of lauric acid (20 mM) with glycine to produce N-lauroylglycine. While little to no yield of N-lauroylglycine was observed for most organic cosolvents, the use of glycerol at 70% (v/v) as a cosolvent resulted in a 33% yield of N-lauroylglycine after 72 h. Apart from shifting the thermodynamic equilibrium in favor of the synthesis reaction, it was hypothesized that the use of glycerol also promotes the activation of the free fatty acid via the formation of a more reactive glyceryl ester as the intermediate through esterification.

[0072] To investigate whether the synthesis of N-lauroylglycine occurs via direct amidation or through an ester-amide interconversion, the concentrations of N-lauroylglycine, mono-, di- and trilaurin were tracked at regular intervals over a period of 120 h (FIG. 1). FIG. 1 shows the time course of biotransformation of lauric acid to N-lauroylglycine. The biotransformation was performed with 20 mM lauric acid in a 10 mL scale aqueous reaction containing 7.0 mL glycerol and 3.0 mL of 100 mM sodium phosphate buffer with a final concentration of 0.9 M glycine. The reaction contained 1.6 mg of RMLD156G biocatalyst and was maintained at 50 °C and pH 7.5. The data shown represent the averages of independent experiments performed in triplicate, with error bars indicating standard deviation. It was observed that 1-monolaurin accumulated to 7.0 mM in the first 3 h, indicating the occurrence of esterification of lauric acid with the glycerol solvent, while the concentrations of di- and trilaurin remained low (below 1.0 mM) throughout the reaction. At 20 h, the concentration of 1-monolaurin had dropped sharply to 2.2 mM, and was maintained at the same concentration until 40 h.
This was accompanied by an increase of N-lauroylglycine amide to 5.0 mM at 40 h, suggesting the possible formation of N-lauroylglycine amide from 1-monolaurin via aminolysis. After 40 h, there was still a progressive increase in N-lauroylglycine to 7.9 mM at 90 h, along with a steady increase in 1-monolaurin to 6.0 mM at 90 h. Beyond 90 h, minimal changes in the concentrations of 1-monolaurin ester and N-lauroylglycine amide were observed.

[0073] To confirm the glycerol-based activation mechanism for N-lauroylglycine amide synthesis, the RMLD156G-catalyzed esterification of lauric acid with glycerol was first compared with the aminolysis of 1-monolaurin with glycine in the same reaction medium, with the omission of the competing substrates glycine and lauric acid, respectively. A high esterification activity of 693.8 U/g (U = µmol min⁻¹) was observed as opposed to 11.3 U/g for aminolysis for RMLD156G. The RMLD156G-catalyzed direct N-acylation between lauric acid and glycine was also carried out by replacing glycerol with tris buffer, giving an inferior activity of 0.53 U/g. Comparing the activities of esterification, aminolysis, and direct N-acylation, the esterification of lauric acid with glycerol to give 1-monolaurin was predominant, whereas amide formation from the aminolysis of 1-monolaurin with glycine was 21-fold more active than the direct N-acylation of glycine with lauric acid to give N-lauroylglycine. Overall, the amide bond formation is accelerated through the activation of lauric acid with glycerol, and the pathway from lauric acid to 1-monolaurin glyceryl ester to N-lauroylglycine amide is predominant for the synthesis of N-lauroylglycine (FIG. 2). In nature, the biogenesis of amide bonds is almost invariably dependent on the activation of the carboxylic acid, and typically occurs via the formation of acyl phosphate or acyl-adenylate intermediates with the investment of high energy bonds in ATP 32.
The data presented confirm an alternative strategy for amide bond formation in N-acyl glycines in which the carboxylic acid is first activated to a glyceryl ester intermediate, followed by an ester-amide interconversion to afford the desired amide product 41.

[0074] In this new pathway for amide bond formation, the esterification activity is 61-fold higher than that of aminolysis. This makes aminolysis the rate-limiting step for amide synthesis and the bottleneck in the RMLD156G-catalyzed biotransformation of lauric acid to N-lauroylglycine. Directed evolution of RMLD156G to improve the aminolysis activity is therefore paramount to further enhance the ester to amide interconversion and consequently increase the yield of N-lauroylglycine.
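The fold differences quoted for the D156G mutant follow directly from the three specific activities reported above. A minimal Python sketch of that arithmetic is given below; the dictionary keys and the helper names `fold_difference` and `slowest_step` are illustrative assumptions and do not come from the source.

```python
# Specific activities of mature RML D156G in U/g protein, as reported in the text.
ACTIVITY_U_PER_G = {
    "esterification": 693.8,     # lauric acid + glycerol -> 1-monolaurin
    "aminolysis": 11.3,          # 1-monolaurin + glycine -> N-lauroylglycine
    "direct_n_acylation": 0.53,  # lauric acid + glycine, glycerol replaced by tris buffer
}


def fold_difference(numerator: str, denominator: str) -> float:
    """Ratio of two of the reported specific activities."""
    return ACTIVITY_U_PER_G[numerator] / ACTIVITY_U_PER_G[denominator]


def slowest_step(steps: list[str]) -> str:
    """Step with the lowest specific activity among the given steps."""
    return min(steps, key=lambda step: ACTIVITY_U_PER_G[step])


if __name__ == "__main__":
    # Esterification versus aminolysis (the two steps of the glycerol pathway).
    print(f"{fold_difference('esterification', 'aminolysis'):.1f}")
    # Aminolysis versus direct N-acylation without glycerol.
    print(f"{fold_difference('aminolysis', 'direct_n_acylation'):.1f}")
    print(slowest_step(["esterification", "aminolysis"]))
```

Rounded to whole numbers, the two ratios reproduce the 61-fold and 21-fold figures given in the description, and the comparison identifies aminolysis as the slower of the two pathway steps.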

[0075] The same reactions were examined with RMLWT (RML wild-type). In comparison to the wild-type, RMLD156G showed similar activities for esterification and direct N-acylation, with slightly higher aminolysis activity. FIG. 2 shows a schematic of reaction pathways of RMLD156G-catalyzed acylation of lauric acid with glycine for the synthesis of N-lauroylglycine. The WT and D156G mutant refer to the mature RML (P. pastoris), while the Gen 5 mutant refers to the proRML Gen 5 (E. coli) as given in FIG. 28. The specific activities of the different enzymes were measured in U/g protein.

[0076] Based on the crystal structure of RMLWT, D156 is located on the central helix connected to the catalytic Ser residue and is involved in a polar interaction network with four neighboring residues: Y115, Q119, Q159, and R160 (FIG. 8). A D156G mutation causes a loss of this polar interaction network and possibly results in a tradeoff for higher conformational flexibility (FIG. 9), facilitating the binding of glycine, which is bigger than the native water nucleophile. In FIG. 9, the Static Light Scattering (SLS) measurements of RMLWT and RMLD156G at the wavelength of 266 nm (SLS266), with thermal ramping from 20-95 °C at a rate of 0.6 °C/min and an incubation of 180 seconds, show that the SLS values of RMLD156G increase significantly at a temperature of 58.6 °C, while RMLWT shows a similar increase at a higher temperature of 62.8 °C. It was believed that further engineering of RMLD156G might create enzymes with even higher aminolysis activities, thus enhancing amide bond formation from free fatty acids to amide through the new pathway involving glycerol activation and aminolysis. This was borne out by the RML Gen 5 mutant, which had a 38-fold increase in aminolysis activity compared to the RMLD156G mutant, as shown in FIG. 2 and described further herein.

[0077] Selection of Suitable Amino Acid Hotspots for Mutagenesis

[0078] The engineering of RMLD156G to enhance the aminolysis activity started with the selection of amino acid residues for evolution based on rational design, to allow better access for glycine to form the amide while excluding water from the active site to minimize hydrolysis. To achieve this, reported enzymes having high selectivity for the glycine substrate were first studied to understand how nature anchors glycine to the catalytic binding pockets. Structures of three enzymes, namely glycine oxidase from Bacillus subtilis, conjugated bile acid hydrolase from Clostridium perfringens, and glycyl-tRNA-synthetase from Escherichia coli, were investigated (FIG. 3 panels (a) to (c)). The α-carboxylic acid terminus of glycine was typically found to form ionic salt bridges or hydrogen bond interactions with neighboring residues Arg, Lys, Asn, Tyr, and His 42,43. It was thus hypothesized that the introduction of the positively charged amino acids Arg and Lys and of hydrophilic residues in RMLD156G may promote the formation of salt bridges or hydrogen bond interactions with the α-carboxylic group of glycine to achieve a favorable orientation within the binding pocket of RML for nucleophilic attack.

[0079] The molecular docking of the N-lauroylglycine ligand onto the open conformation of RML was then performed to identify amino acid residues for mutagenesis 44,45 (panel (d) of FIG. 3). A total of 12 key residues within 7 Å of the nucleophilic binding site were first positively identified, with the exclusion of structurally or catalytically important amino acid residues. Out of the 12 residues, 3 residues, L258, G266 and L267, were found to be immediately adjacent to and within 5 Å of the α-carboxylic terminus of glycine. Mutations at these sites could orient the glycine molecule in a more favorable position for nucleophilic attack. As such, a site-directed mutagenesis (SDM) of each of these residues to Arg, Lys, Gln, Asn, His, and Ser was first individually designed to promote the binding of glycine molecules through salt bridges or hydrogen bond interactions.

[0080] Next, the nine remaining residues within the nucleophilic binding pocket, A25, S83, S84, N87, W88, V254, L255, N264, and T265, were saturated individually with two degenerate codons: the RVK degenerate codon (12 codons: 9 amino acids) coding for small or hydrophilic amino acids and the NTT degenerate codon (4 codons: 4 amino acids) coding for hydrophobic residues. While small or hydrophilic residues could enlarge the binding pocket or enhance the binding interactions with glycine molecules within the reaction mixture, the inclusion of hydrophobic residues may also reduce water activity and limit the hydrolysis of the 1-monolaurin intermediate in the biotransformation. To ensure more than 99% coverage 46, 60 and 16 colonies were screened for an RVK and an NTT mutant library per residue, respectively. Further evolution was then targeted at eight amino acid residues within the nucleophilic entrance site, R30, S56, L58, I59, Y60, D61, N63, and R86, in addition to Gen 1 residue G156, in a bid to enhance the local concentration of glycine in the vicinity of the active pocket. Similarly, degenerate codons RVK and NTT were used on the targeted residues within the nucleophilic entrance site. Throughout the evolution process, the best mutant for each generation was chosen as the template for subsequent rounds of mutagenesis in an iterative manner.
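The codon and amino acid counts given for the RVK and NTT libraries can be checked by expanding the degenerate codons with the IUPAC nucleotide codes and the standard genetic code. The Python sketch below does this and also estimates library coverage. The `coverage` function is an assumption: it uses a simple model in which every codon is equally represented and colonies are picked independently, which may differ from the criterion in the reference cited by the source. All function names are illustrative.

```python
from itertools import product

# IUPAC nucleotide codes needed for the RVK and NTT degenerate codons.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "V": "ACG", "K": "GT", "N": "ACGT"}

# Standard genetic code with bases ordered T, C, A, G ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(degenerate: str) -> list[str]:
    """All concrete codons encoded by a degenerate codon such as 'RVK'."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in degenerate))]


def amino_acids(degenerate: str) -> list[str]:
    """Sorted one-letter amino acids encoded by a degenerate codon."""
    return sorted({CODON_TABLE[c] for c in expand(degenerate)})


def coverage(n_codons: int, n_colonies: int) -> float:
    """Probability that a given codon is sampled at least once (assumed model:
    equal codon frequencies, independent colony picks)."""
    return 1 - (1 - 1 / n_codons) ** n_colonies


if __name__ == "__main__":
    for codon, colonies in (("RVK", 60), ("NTT", 16)):
        codons = expand(codon)
        print(codon, len(codons), "".join(amino_acids(codon)),
              f"{coverage(len(codons), colonies):.3f}")
```

Under this model, RVK expands to 12 codons encoding A, D, E, G, K, N, R, S and T, and NTT expands to 4 codons encoding F, I, L and V, matching the counts in the description. Picking 60 colonies from 12 codons gives a per-codon coverage of about 99.5%, and 16 colonies from 4 codons gives about 99.0%.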

[0081] Enhancing Throughput for Expression and Screening of Positive Clones

[0082] To ease genetic manipulation, the codon-optimized gene sequence of RMLD156G, along with its N-terminal propeptide domain, was first cloned into a suitable pET28a vector under the control of a T7 promoter for its overexpression in BL21 E. coli (FIG. 11). FIG. 11 shows the expression construct of mature and proRML in Pichia pastoris GS115 and E. coli BL21 hosts, respectively. Analogous to previous studies, the observed gel band corresponding to 42 kDa after cell lysis and his-tag protein purification verifies the soluble expression of proRMLD156G (FIG. 12), while no expression was observed if the genetic construct lacks the N-terminal propeptide region 47. FIG. 12 shows the SDS-PAGE of the RMLD156G enzyme. The first lane is the ladder or marker, the second lane is the proRMLD156G enzyme and the third lane is the mature RMLD156G enzyme.

[0083] Comparing the proRMLD156G mutant against the mature RMLD156G counterpart, the proRMLD156G showed a 4.3-fold reduction in hydrolytic activity towards the native p-nitrophenyl dodecanoate (pNPD) substrate (FIG. 13). Interestingly, the similar aminolysis rates of the two forms towards 1-monolaurin and glycine meant that the additional propeptide region did not affect the synthesis reaction of N-lauroylglycine. This makes E. coli a suitable host for protein engineering of RML towards high aminolysis activities.

[0084] A workflow to clone, express, and screen the proRML mutants for their ability to perform N-acylation reactions was established as shown in FIG. 4. The generated mutants were first subcloned into the expression vector (pET28a) and transformed into BL21 (DE3) E. coli cells for the generation of mutant libraries. The individual colonies were then picked and seeded into a 96-deep well plate for small-scale culture and expression. To detect the N-acylation activity of the proRML mutants, the BL21 E. coli wet cells were harvested via centrifugation with the removal of the culture supernatant to facilitate their use as a direct catalyst for the subsequent N-acylation assay. Following this, an acylation mix containing glycerol and saturated glycine was added directly to the individual wells containing BL21 E. coli wet cells to initiate the N-acylation reaction, followed by the staining of the individual reaction products via thin-layer chromatography (TLC). Oleic acid was used as the primary substrate for the screen since an unsaturated acyl donor is required for the identification of the amide product. Mutant colonies possessing desirable N-acylation activities were then selected for medium-scale shake-flask expression and his-tag purification for a more accurate comparison of their aminolysis activities against the 1-monolaurin substrate.

[0085] Engineering of proRML Mutants with Enhanced Aminolysis Activity

[0086] In round 1 of evolution, the introduction of hydrophilic residues Arg, Lys, Asn and Ser to residues L258 and L267 by SDM led to a substantial improvement in the aminolysis activity between glycine and 1-monolaurin, while SDM on residue G266 showed unsatisfactory results (FIG. 28 and Table S4). Of these, the best mutant D156G/L258K (Gen 2) showed a 4.5-fold and 10.2-fold increase in its aminolysis activity as compared to the Gen 1 mutant (RMLD156G) and wild-type, respectively. The Gen 2 mutant was subsequently chosen as the template for the second round of SDM at position L267 by introducing hydrophilic residues Arg, Lys, Gln, Asn, His, and Ser. Amongst the generated mutants, the triple mutant D156G/L258K/L267N (Gen 3) showed the highest aminolysis activity of 86.9 U/g, with a 2-fold enhancement as compared to the Gen 2 mutant.

[0087] In round 3, following a site saturation mutagenesis (SSM) on the 9 remaining residues within the nucleophilic binding site (A25, S83, S84, N87, W88, V254, L255, N264 and T265), a total of 12 mutants were found to provide further improvement in aminolysis activity, with 11 of them originating from the RVK mutant library (FIG. 28). Most significantly, the D156G/L258K/L267N/S83D (Gen 4) quadruple mutant exhibited an aminolysis activity of 380.1 U/g, which is 90-fold higher than the wild-type enzyme. As the other 11 mutants also showed some activity improvement, the mutations derived from round 3 were added to the Gen 4 mutant by SDM in round 4a. Among the 11 generated mutants, only D156G/L258K/L267N/S83D/W88V (Gen 4a) showed a slight improvement in its aminolysis activity, while the other mutants did not produce satisfactory results. Further evolution was then performed on 8 residues within the nucleophilic entrance site (R30, S56, L58, I59, Y60, D61, N63, R86), in addition to residue G156 (or D156 based on the wild-type), by SSM in round 4b using Gen 4 as the template. While most of the mutants had dramatically lower aminolysis activities, the introduction of mutations L58K, R86K, and D156S could retain 74-86% of the aminolysis activity. These three mutations were then introduced to the Gen 4a mutant by SDM in round 4c to give the mutant D156S/L258K/L267N/S83D/W88V/L58K/R86K (Gen 5). This mutant demonstrated an aminolysis activity of 434.0 U/g, which is 103-fold higher than the wild-type counterpart.

[0088] The mutations of the amino acid residues in FIG. 28, Table S4, and the description herein are described with respect to the sequence amino acid residues of the mature RML peptide (SEQ ID NO: 3). However, as explained above, the results (e.g. aminolysis activity, fold improvement to wild-type, etc.) in FIG. 28, Table S4 and description for the mutants are results of the proRML mutants (i.e. with the propeptide in the mutants), unless otherwise indicated.

[0089] Various amino acids are described herein by their full names and by the conventional 1-letter and 3-letter abbreviations known in the art. The substitution of an amino acid residue in a peptide is described by the conventional notation; for example, D156G indicates that the aspartic acid (D) at the 156th amino acid residue (or position) is substituted by glycine (G).
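The substitution notation described above can be read mechanically. The Python sketch below parses single labels such as D156G and slash-separated combinations such as the Gen 5 variant, and applies them to a sequence while checking the expected wild-type residue. The function names are hypothetical, and the short sequence in the usage example is a toy string, not SEQ ID NO: 3.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

GEN5 = "D156S/L258K/L267N/S83D/L58K/R86K/W88V"


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'D156G' into ('D', 156, 'G')."""
    match = SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def parse_variant(label: str) -> list[tuple[str, int, str]]:
    """Parse a slash-separated combination of substitutions."""
    return [parse_substitution(part) for part in label.split("/")]


def apply_variant(sequence: str, label: str) -> str:
    """Apply substitutions to a sequence, verifying each wild-type residue."""
    residues = list(sequence)
    for wild_type, position, replacement in parse_variant(label):
        found = residues[position - 1]  # positions are 1-based
        if found != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_substitution("D156G"))
    print(len(parse_variant(GEN5)))
    # Toy four-residue sequence used only to illustrate the position check.
    print(apply_variant("MDLK", "D2G/L3K"))
```

The wild-type check is a design choice: it catches numbering mismatches, for example when positions counted on the mature peptide are applied to a sequence that still carries the propeptide.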

[0090] Kinetic Studies of the Aminolysis of 1-Monolaurin with Glycine Catalyzed by proRML Mutants and Molecular Docking

[0091] Kinetic studies were performed with the Gen 1 to Gen 5 proRML mutants for the aminolysis of 1-monolaurin with glycine, and compared with the wild-type (FIG. 29). The kcat and KM of 1-monolaurin were determined, while the KM of glycine could not be obtained due to the linearly increasing aminolysis activities with glycine from 10-900 mM. As shown in FIG. 29, the best Gen 5 mutant showed kcat, KM and kcat/KM values of 21.0 min⁻¹, 6.26 mM and 3.35 min⁻¹ mM⁻¹, respectively. This corresponds to a 63.6-fold increment in its kcat and a 7.3-fold reduction in its KM in comparison to the wild-type, thus amounting to a kcat/KM value which is 465-fold higher than that of the wild-type counterpart. The introduction of a single L258K mutation in the Gen 2 mutant provided a 2.9-fold improvement in its catalytic efficiency and a 2.8-fold improvement in its kcat compared to the Gen 1 mutant, although the KM remains similar. The concomitant introduction of two hydrophilic residues (L258K/L267N) within the nucleophile binding pocket in the Gen 3 mutant resulted in a further 16.1-fold improvement in its catalytic efficiency, with a 2.2-fold improvement in its kcat and a 7.5-fold reduction in its KM. An additional S83D mutation in the Gen 4 mutant provided another 3-fold improvement in its catalytic efficiency, arising from a 3.3-fold increase in its kcat and a 1.1-fold increase in its KM. Finally, the introduction of L58K/R86K/W88V/D156S mutations in the Gen 5 mutant added another 1.5-fold improvement to its catalytic efficiency, with a 1.3-fold improvement in the kcat and a 1.2-fold reduction in its KM value.
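
As an arithmetic illustration of how the reported figures relate, the short Python sketch below recomputes the catalytic efficiency of the Gen 5 mutant from its reported kcat and KM, and combines the two reported fold changes. It uses only the rounded values stated above, so small deviations from the reported 465-fold figure are expected; the variable and function names are illustrative only.

```python
# Values reported above for the Gen 5 mutant (FIG. 29), rounded as stated.
GEN5_KCAT_PER_MIN = 21.0     # kcat, min^-1
GEN5_KM_MM = 6.26            # KM for 1-monolaurin, mM
KCAT_FOLD_INCREASE = 63.6    # Gen 5 kcat relative to wild-type
KM_FOLD_REDUCTION = 7.3      # wild-type KM relative to Gen 5 KM


def catalytic_efficiency(kcat_per_min, km_mm):
    """Return kcat/KM in min^-1 mM^-1."""
    return kcat_per_min / km_mm


gen5_efficiency = catalytic_efficiency(GEN5_KCAT_PER_MIN, GEN5_KM_MM)

# A higher kcat and a lower KM both raise kcat/KM, so the fold changes multiply.
efficiency_fold = KCAT_FOLD_INCREASE * KM_FOLD_REDUCTION

print(f"Gen 5 kcat/KM = {gen5_efficiency:.2f} min^-1 mM^-1")
print(f"fold change in kcat/KM from rounded inputs = {efficiency_fold:.0f}")
```

The product of the rounded fold changes lands within about 1% of the reported 465-fold improvement, the difference being attributable to rounding of the inputs.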

[0092] To obtain a structure-based understanding of the proRML mutants for enhanced aminolysis activity, molecular docking of the glycine substrate onto the X-ray crystal structure of the open conformation of RMLWT (PDB: 4TGL) and the structural models of its mutants was carried out using Autodock Vina, and the results are shown in FIG. 5.

[0093] In panel (f) of FIG. 5, the D156S/L258K/L267N/S83D/L58K/R86K/W88V Gen 5 mutant re-oriented the glycine substrate to a more favorable position as compared to the wild-type and Gen 1 mutant (FIG. 5 panels (a) and (b)): the distance between the nucleophilic serine residue (S144) and the α-amino group of glycine in the Gen 5 mutant is only 2.7 Å, while the corresponding distance in the wild-type and Gen 1 mutant is 5.1 Å. In addition, the predicted binding affinity of glycine was enhanced from -2.8 kcal mol⁻¹ in the wild-type to -3.3 kcal mol⁻¹ in the Gen 5 mutant, giving rise to a higher catalytic efficiency. Analyzing the enzyme binding poses of the Gen 1-5 mutants, the distance from the α-amino group of glycine to the nucleophilic serine residue was significantly reduced from 5.1 Å in Gen 1 to 4.8 Å in Gen 2, 2.8 Å in Gen 3, 2.8 Å in Gen 4, and 2.7 Å in Gen 5, respectively. From panels (c) and (d) of FIG. 5, the concerted introduction of hydrophilic residues L258K and L267N largely assisted in the reorientation of the glycine substrate. Based on the docking pose of the Gen 4 mutant in panel (e), the S83D mutation generated a salt bridge with an adjacent R80 residue situated on the hinge of the lid, which favors the open conformation of the mutant for better access to the glycine substrate. This mutation resulted in an increase in aminolysis activity by 4.4-fold from the Gen 3 to Gen 4 mutant. The addition of mutations D156S, W88V, L58K, and R86K to the Gen 4 mutant likely brought about an enhancement of the access and binding of glycine near the active pocket of the Gen 5 mutant, possibly due to the reduction in size of residues (W88V) and the introduction of positively charged residues (L58K and R86K). This was also evidenced by the increase in the predicted binding affinity from the Gen 4 to Gen 5 mutant towards the glycine ligand from -3.1 kcal mol⁻¹ to -3.3 kcal mol⁻¹. The D156S mutation might contribute to the reduction in the distance between the α-amino group of glycine and the nucleophilic serine residue from 2.8 Å to 2.7 Å. It will be appreciated that the D156S mutation may be a better substitution than the D156G mutation in the RML Gen 1 mutant. Whilst the mutations in the successive rounds were done in combination with the D156G mutation in the RML Gen 1 mutant, it is possible that each of the point mutations (or substitutions) described above and shown in FIGS. 31A to 31E can alone improve the activity of the RML wild-type lipase without the other mutations. The RML Gen 2, Gen 3, Gen 4 and Gen 5 mutants are merely the optimized mutants for each round in the aim to determine the optimal RML lipase variant.

[0094] Molecular docking of 1-monolaurin onto the X-ray structure of RMLWT and the structural models of its mutants was also performed using Autodock Vina (FIG. 15). The distances between the nucleophilic serine residue (S144) and the carbonyl group of 1-monolaurin in the wild-type and Gen 1-5 mutants are 4.5 Å, 4.7 Å, 4.6 Å, 3.4 Å, 3.8 Å, and 3.1 Å, respectively, which explains the KM values of 1-monolaurin of 45.8 mM, 51.7 mM, 50.6 mM, 6.76 mM, 7.49 mM, and 6.26 mM, respectively, for the wild-type and Gen 1-5 mutants (FIG. 29).

[0095] Biotransformation of Lauric Acid (C12) with Glycine to N-lauroylglycine Amide Using proRML Gen 1-5 Mutants

[0096] The course of the biotransformation of lauric acid (10 mM) with glycine to N-lauroylglycine amide was examined with the proRML Gen 1-5 mutants in the glycerol (70% v/v)-containing aqueous system and compared with proRMLWT; the plotted results are shown in FIG. 6. The biotransformation was performed with 10 mM lauric acid in a 10 mL scale aqueous reaction medium containing 7.0 mL glycerol and 3.0 mL of 100 mM sodium phosphate buffer with a final concentration of 0.9 M glycine at 50 °C and pH 7.5 with 1.6 mg proRMLWT or its mutants. At 6 h, the biotransformation yield of N-lauroylglycine amide rose to 5.9%, 14.3%, 30.3%, 53.7%, and 61.7%, respectively, with the Gen 1-5 mutants, while it reached only 2.7% with the wild-type enzyme. This clearly demonstrated the enhanced activities of the Gen 1-5 mutants. With the Gen 4-5 mutants, the yield of N-lauroylglycine increased steadily over 6-12 h and showed only a minimal increase from 12-72 h. With the Gen 3 mutant, the yield of N-lauroylglycine increased gradually from 6-16 h and stabilized afterwards. On the other hand, the yield of N-lauroylglycine rose slowly but continually from 6-72 h with the wild-type and Gen 1-2 mutants. At 72 h, the wild-type and Gen 1-4 mutants gave yields of 13.8%, 35.3%, 46.6%, 50.7% and 71.6%, respectively, while a high yield of 80.3% of N-lauroylglycine amide was obtained with the Gen 5 mutant. Evidently, the superior aminolysis activity and kinetics of the Gen 5 mutant also translate to a higher biotransformation yield of N-lauroylglycine (80.3% vs 13.8% at 72 h) and a faster reaction (70% yield at 16 h).

[0097] Amidation of C8-C18 Fatty Acids and C12 Glyceryl Esters with Glycine to N-Acyl Glycine Amides with proRML Gen 5 Mutant

[0098] The biotransformation of medium to long-chain C8-C18 fatty acids and lauroyl glyceryl esters (C12) with glycine to produce N-acyl glycines was explored with the Gen 5 mutant in glycerol-containing aqueous medium, and the specific activities at 1 hour were compared with those of the wild-type enzyme. As shown in FIG. 7, the reaction with saturated C10:0, C12:0, C14:0, and C15:0 fatty acids gave high specific activities of 152 U/g, 349 U/g, 431 U/g, and 391 U/g, respectively, while lower specific activities of 57 U/g and 52 U/g were obtained with the shorter chain saturated C8:0 fatty acid and the longer chain saturated C16:0 fatty acid, respectively. On the other hand, the Gen 5 mutant catalyzed the amidation of unsaturated C18:1 fatty acid and glycine with a high specific activity of 316 U/g. The specific activity difference between C16:0 and C18:1 fatty acids might be due to the lower solubility of C16:0 fatty acid with a melting point of 62.9 °C, and a higher solubility of unsaturated C18:1 fatty acid with a melting point of 13.4 °C. In the amidation of the C12 glyceryl esters, the Gen 5 mutant also displayed high specific activities of 434 U/g, 454 U/g, and 474 U/g with 1-monolaurin, 1,3-dilaurin, and trilaurin, respectively. Accordingly, the Gen 5 mutant accepts a broad range of fatty acids from C8-C18, with higher preference towards saturated C12-C15 and unsaturated C18 fatty acids, in addition to C12 glyceryl esters. It is possible that other mono-, di-, and trisubstituted glyceryl esters will show similar activity with mutants such as the Gen 5 mutant, especially glyceryl esters with similar melting points. A similar trend in the specific activities towards those fatty acids and C12 glyceryl esters was observed with proRMLWT (FIG. 7), but with low activities (1.3-8.0 U/g). Compared with the wild-type, the Gen 5 mutant demonstrated 40-, 53-, 71-, 54-, 57-, 39- and 70-fold improvements in specific activities towards C8:0, C10:0, C12:0, C14:0, C15:0, C16:0, and C18:1 fatty acids, respectively, while 94-, 117-, and 79-fold improvements in specific activities were also displayed with the C12 glyceryl esters 1-monolaurin, 1,3-dilaurin, and trilaurin, respectively.
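
The Gen 5 specific activities and fold improvements quoted above can be cross-checked against the stated wild-type range of 1.3-8.0 U/g. The Python sketch below divides each reported Gen 5 activity by its reported fold improvement; the resulting wild-type values are only implied estimates (both inputs are rounded in the text) and are not measured values taken from FIG. 7. The dictionary names are illustrative.

```python
# Gen 5 specific activities (U/g) at 1 h, as quoted in the paragraph above.
GEN5_ACTIVITY_U_PER_G = {
    "C8:0": 57, "C10:0": 152, "C12:0": 349, "C14:0": 431,
    "C15:0": 391, "C16:0": 52, "C18:1": 316,
    "1-monolaurin": 434, "1,3-dilaurin": 454, "trilaurin": 474,
}

# Fold improvements of Gen 5 over the wild-type, as quoted above.
FOLD_IMPROVEMENT = {
    "C8:0": 40, "C10:0": 53, "C12:0": 71, "C14:0": 54,
    "C15:0": 57, "C16:0": 39, "C18:1": 70,
    "1-monolaurin": 94, "1,3-dilaurin": 117, "trilaurin": 79,
}

# Implied wild-type activity = Gen 5 activity / fold improvement.
implied_wild_type = {
    donor: GEN5_ACTIVITY_U_PER_G[donor] / FOLD_IMPROVEMENT[donor]
    for donor in GEN5_ACTIVITY_U_PER_G
}

for donor, activity in implied_wild_type.items():
    print(f"{donor:>13}: ~{activity:.1f} U/g (implied wild-type)")
```

All ten implied values fall between roughly 1.3 and 8.0 U/g, which is consistent with the wild-type range stated in the paragraph.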

[0099] The biotransformation yields of C8-C18 fatty acids and C12 glyceryl esters with glycine were also evaluated in glycerol-containing aqueous medium catalyzed by the proRML Gen 5 mutant, and the results are shown in FIG. 30. High yields of 71.8%, 80.3%, 71.3%, 65.0%, and 41.6% were obtained with C10:0, C12:0, C14:0, C15:0, and C18:1 fatty acids, respectively, while yields of 82.0%, 40.5%, and 28.1% were also obtained with the glyceryl esters 1-monolaurin, 1,3-dilaurin, and trilaurin, respectively, at an acyl donor concentration of 10 mM. Similar to the activity profiles observed, lower yields of 18.2% and 14.7% were obtained with C8:0 and C16:0 fatty acids, respectively, at the same acyl donor concentration. At a higher acyl donor concentration of 100 mM, high N-acyl glycine product concentrations of 38.4 mM, 45.4 mM, 40.5 mM, 32.7 mM, 24.9 mM, 49.4 mM, 49.8 mM, and 50.1 mM were also achieved with C10:0, C12:0, C14:0, C15:0, and C18:1 fatty acids and 1-monolaurin, 1,3-dilaurin, and trilaurin glyceryl esters, respectively. The engineered Gen 5 mutant produced N-acyl glycines with much higher synthesis yields and product concentrations, outperforming all other known enzyme-catalyzed syntheses of N-acyl glycines.

[0100] Preparative scale biotransformations were also performed at acyl donor concentrations of 100 mM with C10:0, C12:0, C14:0, C15:0 and C18:1 fatty acids using a purified proRML Gen 5 biocatalyst (FIG. 30), producing N-acyl glycine amides with isolated yields of 33.6 mM, 36.1 mM, 36.5 mM, 27.9 mM, and 20.3 mM, respectively. The products were purified and identified by 1H and 13C NMR analysis.

[0101] Conclusion

[0102] A novel concept of glycerol-activated amide synthesis, using glycerol to both activate the free fatty acid and solubilize the substrates for the amidation in aqueous systems, was demonstrated. The concept was proven by the proRML mutant-catalyzed amidation of lauric acid and glycine in glycerol (70% v/v)-containing aqueous systems to form N-lauroylglycine via a new biosynthesis pathway through the formation of an acylglyceride intermediate (1-monolaurin) by esterification with glycerol and subsequent aminolysis of 1-monolaurin to N-lauroylglycine amide. The concept could be applicable for the enzymatic amidation to synthesize other N-acyl amino acids.

[0103] The engineering of a hydrolase with enhanced activity for the aminolysis of glyceryl ester 1-monolaurin with glycine as non-natural substrates was demonstrated, which is based on the selection of key amino acid residues for evolution to reshape the nucleophile binding and entrance sites for increasing the binding affinity towards glycine. The evolved proRML Gen 5 mutant showed a 103-fold improvement in aminolysis activity and a 465-fold increase in catalytic efficiency compared to the wild-type counterpart. Molecular docking of glycine provided some insights into the structure-based understanding of the enhanced aminolysis activity, including a successful reorientation of the glycine molecule with a shorter distance between the catalytic serine and the α-amino group of glycine, and a more favorable predicted binding energy of glycine in the mutant as compared to the wild-type enzyme. The same strategy might be useful for the engineering of other enzymes for amide synthesis.

[0104] The proRML Gen 5 mutant catalyzed the amidation of lauric acid with glycine in glycerol-containing aqueous medium to form N-lauroylglycine in 80% yield, providing a useful synthesis of this high-value cosmetic ingredient with much higher yields and product concentrations than other reported enzymatic syntheses. The engineered enzyme can also accept a broad range of medium- and long-chain fatty acids (C8-C18) and C12 glyceryl esters as substrates for amidation with glycine to prepare the corresponding N-acyl glycines. Overall, the novel concept of glycerol-activated amide synthesis and the engineered enzyme might be generally useful for amide synthesis and could provide a green and high-yielding synthesis of N-acyl amino acids that are useful in cosmetic and pharmaceutical industries.

[0105] Experimental Section

[0106] Determining Hydrolytic Activity of Lipases Towards N-lauroylglycine

[0107] A total of 10 µg enzyme was incubated with 2 mM N-lauroylglycine in 50 mM MOPS buffer (pH 7.5) in a final volume of 250 µL for 5 h at 37 °C and 2000 rpm in a thermomixer. After 5 h, 125 µL of ninhydrin solution was added to the reaction mixture, which was then immersed in a 90 °C water bath for 10 min. After 10 min, 625 µL of cold ethanol was added to the mixture to quench the reaction. The amount of glycine liberated was then quantified colorimetrically by measuring the absorbance at 570 nm (OD570) with a UV-VIS spectrophotometer and correlating with a standard calibration curve.

[0108] Growth of Recombinant P. pastoris Strain Expressing Mature RML Mutants

[0109] Recombinant P. pastoris cells (RML) were inoculated into 25 mL buffered glycerol-complex medium (BMGY). Cultivation was conducted at 220 rpm and 30 °C in an orbital shaker to reach an OD600 of 4-6 within 24 h. The cells were harvested by centrifugation at 2500 g for 10 min at room temperature and the supernatant was decanted. The cells were then inoculated into 200 mL buffered methanol-complex medium (BMMY) without glycerol for protein induction and overexpression. 1.0 mL methanol was added every 24 h to maintain protein induction for a period of 96 h. The culture medium was then harvested and centrifuged (5000 g, 10 min) to remove the cell pellet, while the culture supernatant was buffer exchanged and concentrated in 25 mM sodium phosphate, 500 mM NaCl, pH 7.5 using an Amicon Ultra-15 Centrifugal Unit (10 kDa) before the samples were analyzed by SDS-PAGE.

[0110] Biotransformation of Free Fatty Acids or Lauroyl Esters and Glycine to N-Acyl Glycines using Mature RML or proRML Mutants

[0111] Free fatty acids of varying chain lengths (C8-C18) or lauroyl esters (C12) as acyl donors and glycine were used as substrates. The 10 mL scale biotransformation reaction contained 10 mM or 100 mM of acyl donor (unless stated otherwise), 7.0 mL glycerol and 3.0 mL of saturated (3 M) glycine solution dissolved in 100 mM sodium phosphate buffer (pH 7.5) for a final concentration of 0.9 M glycine. The reaction mixture was preincubated at 50 °C and 450 rpm for 15 min, followed by the addition of 1.6 mg of mature or proRML mutant enzyme to initiate the reaction. The yield of N-acyl glycines was determined by withdrawing 50 µL of the reaction mixture and diluting with an appropriate amount of stop solution (methanol:4 M HCl; 9:1 v/v) before being analyzed via RP-HPLC.
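
The final concentrations stated above follow from simple dilution. The Python sketch below is an illustration only: it assumes the volumes are additive and neglects the volume contributed by the acyl donor and the enzyme, and the function name `final_concentration` is hypothetical.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock into total_volume."""
    return stock_conc * stock_volume / total_volume


# 10 mL scale reaction: 7.0 mL glycerol + 3.0 mL of 3 M glycine in buffer.
GLYCEROL_ML = 7.0
GLYCINE_SOLUTION_ML = 3.0
TOTAL_ML = GLYCEROL_ML + GLYCINE_SOLUTION_ML  # assumes additive volumes

glycine_molar = final_concentration(3.0, GLYCINE_SOLUTION_ML, TOTAL_ML)
glycerol_percent_v_v = 100.0 * GLYCEROL_ML / TOTAL_ML
phosphate_mM = final_concentration(100.0, GLYCINE_SOLUTION_ML, TOTAL_ML)

# The 200 uL assay format (140 uL glycerol + 60 uL glycine solution)
# has the same 7:3 ratio and therefore the same final concentrations.
glycine_molar_microscale = final_concentration(3.0, 60.0, 140.0 + 60.0)

print(f"glycine: {glycine_molar:.2f} M")
print(f"glycerol: {glycerol_percent_v_v:.0f}% v/v")
print(f"sodium phosphate: {phosphate_mM:.0f} mM")
```

Under these assumptions the sketch reproduces the stated 0.9 M glycine and 70% v/v glycerol, and indicates a final phosphate concentration of about 30 mM.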

[0112] Glycine dissolves in water and some amount of water (at least 10% v/v) is required to solubilize glycine into our reaction. However, water and glycine are competing substrates. Increasing water will promote hydrolysis of the 1-monolaurin intermediate and limit the amount of desired aminolysis reaction. The water content may be optimized for the respective mutants to achieve a balance between the amount of glycine and limiting the hydrolysis of the 1-monolaurin intermediate. A 30% v/v water content was used in the examples herein unless indicated otherwise and may be optimized as required.

[0113] Determining Aminolysis Activity of proRML Mutants with 1-Monolaurin

[0114] A standard reaction mixture containing 140 µL glycerol, 60 µL of saturated (3 M) glycine solution dissolved in 100 mM sodium phosphate buffer (pH 7.5), 100 mM of 1-monolaurin and 32 µg of purified proRML enzyme was used for the determination of aminolysis activity. The reaction was preincubated at 50 °C and 2000 rpm in a thermomixer for 15 min, followed by the addition of the enzyme to start the reaction. After 1 h, the amount of N-lauroylglycine was quantified by diluting the reaction mixture with an appropriate amount of stop solution (methanol:4 M HCl; 9:1 v/v) before being analyzed via RP-HPLC.

[0115] Determining Direct N-Acylation Activity of RML with Lauric Acid

[0116] A standard reaction mixture containing 140 µL saturated tris HCl (3 M), a structural mimetic of glycerol, 60 µL of saturated (3 M) glycine solution dissolved in 100 mM sodium phosphate buffer (pH 7.5), 100 mM of lauric acid and 32 µg of RML enzyme was used for the determination of direct acylation activity. The reaction was preincubated at 50 °C and 2000 rpm in a thermomixer for 15 min, followed by the addition of the enzyme to start the reaction. After 1 h, the amount of N-lauroylglycine was quantified by diluting the reaction mixture with an appropriate amount of stop solution (methanol:4 M HCl; 9:1 v/v) before being analyzed via RP-HPLC.

[0117] Determining Esterification Activity of RML with Lauric Acid and Glycerol

[0118] A standard reaction mixture containing 140 µL glycerol, 60 µL of 100 mM sodium phosphate buffer (pH 7.5), 100 mM of lauric acid and 32 µg of RML enzyme was used for the determination of esterification activity. The reaction was preincubated at 50 °C and 2000 rpm in a thermomixer for 15 min, followed by the addition of the enzyme to start the reaction. After 1 h, the amount of glyceryl esters was quantified by diluting the reaction mixture with an appropriate amount of stop solution (methanol:4 M HCl; 9:1 v/v) before being analyzed via RP-HPLC.

[0119] Determination of Tagg by means of Uncle system

[0120] Enzymes RMLD156G and RMLWT were prepared in PBS buffer at a concentration of 2 mg/mL. 9 µL of each sample was then loaded in triplicate into a multi-well quartz cuvette chamber and run with a thermal ramp from 20-95 °C with a ramp rate of 0.6 °C/min and a holding time of 180 s. The static light scattering at a wavelength of 266 nm (SLS266), which is proportional to the mean solute particle mass, was measured and plotted against temperature. The onset of aggregation (Tagg) was obtained using the Unit Analysis software v. 2.1.

[0121] Comparing Thermostability Between RMLD156G and RMLWT

[0122] For thermostability characterization, 10 µL of 0.01 mg/mL RMLD156G and RMLWT were incubated at temperatures ranging from 30-65 °C for a period of 1 h, before being added to 980 µL of 50 mM Tris-HCl buffer maintained at pH 7.5. 10 µL of 50 mM para-nitrophenyl dodecanoate (pNPD) in acetonitrile solution was added to the mixture to initiate the reaction. After incubation of the reaction mixture in a thermomixer at 1000 rpm and 25 °C for 10 min, the amount of para-nitrophenol liberated was then quantified colorimetrically by measuring the absorbance at 405 nm (OD405) with a UV-VIS spectrophotometer and correlating with a standard calibration curve.

[0123] Engineering of proRML Mutants and Recombinant E. coli Strains

[0124] Site-directed and site-saturation mutagenesis of proRML were performed using a polymerase chain reaction (PCR) based method with Q5 Master Mix containing Q5 High Fidelity DNA polymerase. The oligonucleotides 1-46 (SEQ ID NO: 111 to 156) were designed for site directed mutagenesis while the oligonucleotides 47-86 (SEQ ID NO: 157 to 196) were designed for site saturation mutagenesis as shown in Table 1. For site directed mutagenesis, a one-step PCR approach was employed to incorporate the desired point mutation(s). The PCR tube contained 0.5 µL of forward and reverse primers of the target site, 1 µL of template pET-28a (+)-proRML DNA, 23 µL deionized water and 25 µL of Q5 Master Mix. PCR amplification was carried out on a Bio-Rad Thermal Cycler T100 using the following program: Denaturation at 98 °C for 30 s, 30 cycles of Denaturation at 98 °C for 10 s, Annealing at 57 °C for 25 s, Extension at 72 °C for 5 min 45 s, and Final Extension at 72 °C for 2 min. The PCR product was then digested with 1 µL of restriction enzyme DpnI for 6 h at 37 °C, purified using the Qiagen PCR purification kit, and then transformed into E. coli XL1 Blue competent cells. All mutants were sequenced and confirmed by the DNA sequencing facility at Axil Scientific. For site saturation mutagenesis, a PCR amplification of 3 individual fragments of the pET-28a (+) vector was first conducted using oligonucleotides carrying non-degenerate codons before being assembled via Golden Gate Cloning. Oligonucleotides 47 and 48 were used to amplify the backbone destination vector, forward primer 49 and a reverse primer specific to the targeted site were used to amplify fragment 1, while a forward primer carrying the non-degenerate codon and reverse primer 50 were used to amplify fragment 2 (Table 1). The program for PCR amplification is similar to that above with the exception of the amplification step, which is set at 3 min 45 s for the backbone destination vector and 50 s for fragments 1 and 2. The purified fragments were then assembled via Golden Gate Cloning in a PCR tube containing 1 µL restriction enzyme BsaI-HF v2, 1 µL T4 DNA ligase, 1 µL T4 DNA ligase buffer (10X), 100 ng of backbone destination vector, 200 ng of fragments 1 and 2 (7 times molar ratio of backbone destination vector), and deionized water up to 10 µL. The program for Golden Gate Cloning is as follows: 60 cycles of [digestion by BsaI-HF v2 at 37 °C for 5 min, ligation by T4 DNA ligase at 16 °C for 5 min] and heat inactivation at 80 °C for 20 min. Following which, 5 µL of Golden Gate assembled pET-28a (+) proRML mutants were then transformed into E. coli BL21 (DE3) competent cells and selected on an LB agar plate supplemented with 50 µg/mL kanamycin.
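
For planning purposes, the two thermal programs described above can be written down as plain data and their nominal hold times summed. The Python sketch below is illustrative only: the dictionary layout and function names are hypothetical, and the totals exclude temperature ramping between steps, which depends on the instrument.

```python
# Site-directed mutagenesis PCR program as described above (times in seconds).
SDM_PCR_PROGRAM = {
    "initial_denaturation_s": 30,          # 98 C
    "cycles": 30,
    "cycle_steps_s": {
        "denaturation": 10,                # 98 C
        "annealing": 25,                   # 57 C
        "extension": 5 * 60 + 45,          # 72 C, 5 min 45 s
    },
    "final_extension_s": 2 * 60,           # 72 C
}

# Golden Gate program as described above (times in minutes).
GOLDEN_GATE_PROGRAM = {
    "cycles": 60,
    "cycle_steps_min": {"digestion_37C": 5, "ligation_16C": 5},
    "heat_inactivation_min": 20,           # 80 C
}


def pcr_hold_time_s(program):
    """Sum of programmed hold times for the PCR, ignoring ramp times."""
    per_cycle = sum(program["cycle_steps_s"].values())
    return (program["initial_denaturation_s"]
            + program["cycles"] * per_cycle
            + program["final_extension_s"])


def golden_gate_hold_time_min(program):
    """Sum of programmed hold times for Golden Gate, ignoring ramp times."""
    per_cycle = sum(program["cycle_steps_min"].values())
    return program["cycles"] * per_cycle + program["heat_inactivation_min"]


sdm_total_s = pcr_hold_time_s(SDM_PCR_PROGRAM)
gg_total_min = golden_gate_hold_time_min(GOLDEN_GATE_PROGRAM)
print(f"SDM PCR: {sdm_total_s / 3600:.2f} h of programmed hold time")
print(f"Golden Gate: {gg_total_min / 60:.2f} h of programmed hold time")
```

On these figures the site-directed mutagenesis PCR occupies a little over 3 h and the Golden Gate assembly a little over 10 h of programmed time.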

[0125] High-throughput Expression and Screening of proRML mutants in E. coli BL21 (DE3) Strain in 96-Deepwell Plates

[0126] The individual mutant colonies were seeded into 96-deepwell plates containing 1 mL LB medium supplemented with 50 µg/mL kanamycin and grown at 37 °C in a microplate shaker at 800 rpm for 18 h. Following which, 50 µL of the saturated seed culture was inoculated into a fresh 96-deepwell plate containing 950 µL LB supplemented with 50 µg/mL kanamycin and grown for 1 h. IPTG was then added to each well to a final concentration of 0.1 mM to initiate overexpression of the proRML mutants. The 96-deepwell plates were incubated at 16 °C and 800 rpm for a further 24 h. After 24 h of protein induction, the cells were harvested by centrifuging the 96-deepwell plates at 5000 g for 10 min to remove the supernatant. The BL21 (DE3) wet cells containing overexpressed proRML mutants were then used directly for the subsequent N-acylation activity assay with oleic acid.

[0127] To analyze the N-acylation activities of the proRML mutants, 180 µL of acylation mix (glycerol/3 M glycine in 100 mM NaP buffer pH 7.5; 6/3.5 v/v) was first added to BL21 wet cells following the overexpression of proRML mutants. 5 µL of oleic acid was then added to the reaction mixture to initiate the reaction. After incubating the reaction mixture at 50 °C and 1500 rpm for 6 h, 400 µL of stop solution (methanol:4 M HCl; 9:1 v/v) was added to terminate the reaction. The products were analyzed by spotting 2.5 µL of the reaction mixture on a silica gel 60 F254 thin-layer chromatography (TLC) plate (Merck), which was developed using two mobile phases of different polarities to facilitate separation. The TLC plate was first run with a polar solvent chloroform/methanol/water/acetic acid (65/25/4/1; v/v/v/v) for 2 cm and dried for 2 min. The TLC plate was then run with a non-polar solvent hexane/ethyl acetate (1/1; v/v) for 5 cm and dried for 2 min before being stained with iodine. A representation of the developed TLC is also illustrated in Figure S7.

[0128] Growth of Recombinant E. coli Strain Expressing proRML Mutants in Shake Flask

[0129] E. coli (proRML) was inoculated into 2 mL LB medium supplemented with 50 mg/L kanamycin and incubated at 37 °C and 220 rpm in an orbital shaker overnight. 1 mL of this seed culture was inoculated into 50 mL fresh LB medium supplemented with 50 mg/L kanamycin and incubated at 37 °C and 220 rpm for 2-3 h until the OD600 reached 0.4-0.6, followed by the addition of IPTG to the culture to a final concentration of 0.1 mM to initiate protein induction and overexpression. Induction was carried out at 16 °C and 220 rpm for a further 24 h. The E. coli cells were then harvested by centrifugation at 6000 g for 10 min and stored at -20 °C before protein purification.

[0130] His-Tag Affinity Purification of proRML mutants

[0131] The E. coli cells (proRML) were lysed by suspending in 1 mL Bacterial Protein Extraction Reagent containing 0.5 µL lysonase for 30 min at room temperature, followed by centrifugation at 6000 g for 10 min to obtain the cell free extract (CFE). The harvested CFE was applied to a 1 mL gravity flow Ni-NTA column which was previously equilibrated with 5 mL of equilibration buffer containing 25 mM sodium phosphate buffer (pH 7.5) and 150 mM sodium chloride. The column was then washed with 10 mL of binding buffer (20 mM imidazole in equilibration buffer), followed by elution with 5 mL of elution buffer (250 mM imidazole in equilibration buffer). The eluate was collected in a single fraction, buffer exchanged and concentrated in 25 mM sodium phosphate, 500 mM NaCl, pH 7.5 using an Amicon Ultra-15 Centrifugal Unit (10 kDa). Protein purity was then analyzed by SDS-PAGE, and protein concentration was determined by Bradford assay.

[0132] Measurement of Kinetic Parameters for proRML wild-type and Gen 1 to Gen 5 Mutants for the Aminolysis of 1-Monolaurin to N-lauroylglycine

[0133] A standard reaction mixture containing 140 µL glycerol, 60 µL of saturated (3 M) glycine solution dissolved in 100 mM sodium phosphate buffer (pH 7.5), and 32 µg of purified proRML enzyme was used for the determination of its kinetic parameters. For proRML WT and Gen 1, the initial concentrations of 1-monolaurin were 16 mM, 44 mM, 68 mM, 100 mM, and 160 mM. For proRML Gen 2, the initial concentrations of 1-monolaurin were 16 mM, 32 mM, 56 mM, 80 mM, and 120 mM. For proRML Gen 3-5, the initial concentrations of 1-monolaurin were 4 mM, 8 mM, 12 mM, 20 mM, and 28 mM. The reaction was preincubated at 50 °C and 2000 rpm in a thermomixer for 15 min, followed by the addition of the enzyme to start the reaction. The amount of N-lauroylglycine was quantified at timepoints of 20 min, 40 min, 60 min, 80 min and 100 min, respectively, by diluting an aliquot of the reaction mixture with an appropriate amount of stop solution (methanol:4 M HCl; 9:1 v/v) before being analyzed via RP-HPLC. The initial reaction velocities were calculated from the obtained slopes and plotted against the initial substrate concentrations. The Michaelis-Menten equation was fit to the plotted data by linear regression to obtain the kinetic parameters of kcat and KM.
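
One common way to fit the Michaelis-Menten equation by linear regression is the double-reciprocal (Lineweaver-Burk) form, 1/v = (KM/kcat)(1/[S]) + 1/kcat, with v expressed per unit of enzyme. The Python sketch below illustrates that approach only; the description does not state which linearization was used, so this choice is an assumption. The rates in the example are synthetic, generated noise-free from the reported Gen 5 parameters at the Gen 3-5 substrate concentrations, and are not measured data.

```python
def michaelis_menten_rate(substrate_mM, kcat_per_min, km_mM):
    """Turnover rate v/[E] (min^-1) predicted by the Michaelis-Menten equation."""
    return kcat_per_min * substrate_mM / (km_mM + substrate_mM)


def fit_lineweaver_burk(substrate_mM, rate_per_min):
    """Least-squares line through (1/[S], 1/v); returns (kcat, KM)."""
    xs = [1.0 / s for s in substrate_mM]
    ys = [1.0 / v for v in rate_per_min]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals KM / kcat
    intercept = mean_y - slope * mean_x    # equals 1 / kcat
    kcat = 1.0 / intercept
    km = slope * kcat
    return kcat, km


# Initial 1-monolaurin concentrations used for Gen 3-5 (mM), from the text.
concentrations_mM = [4, 8, 12, 20, 28]

# SYNTHETIC rates for illustration, generated from the reported Gen 5 values
# (kcat = 21.0 min^-1, KM = 6.26 mM); real data would carry experimental noise.
synthetic_rates = [michaelis_menten_rate(s, 21.0, 6.26) for s in concentrations_mM]

kcat_fit, km_fit = fit_lineweaver_burk(concentrations_mM, synthetic_rates)
print(f"kcat = {kcat_fit:.2f} min^-1, KM = {km_fit:.2f} mM")
```

With real, noisy data the double-reciprocal form weights low-concentration points heavily, so a direct nonlinear fit is often used as a cross-check.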

[0134] Molecular Docking and Computational Analysis

[0135] The structural models of RML mutants were established based on the crystal structure of the open conformation of Rhizomucor miehei lipase (PDB No.: 4TGL) by homology modeling. The ligands N-lauroylglycine, glycine, and 1-dodecanoyl-sn-glycerol (1-monolaurin) were docked onto the structure model by using AutoDock Vina 4.2.6. The mutated sites were illustrated by the visual software Pymol 2.5.

[0136] Preparative Scale Biotransformation of Fatty Acids and Glycine to Isolate N-Acyl Glycine Amides using proRML Gen 5

[0137] The 10 mL scale biotransformation reaction was performed in a 100 mL round-bottom flask and contained 100 mM of fatty acid as the acyl donor, 7.0 mL glycerol and 3.0 mL of saturated (3 M) glycine solution dissolved in 100 mM sodium phosphate buffer (pH 7.5) for a final concentration of 0.9 M glycine. The reaction mixture was preincubated at 50 °C and 450 rpm for 15 min, followed by the addition of 1.6 mg proRML Gen 5 mutant enzyme to initiate the reaction. After 3 d, 10 mL of 100 mM sodium phosphate buffer (pH 7.5) containing 3.0 mg of Eversa Transform 2.0 (Novozymes) was added to the reaction mixture to promote the hydrolysis of the remaining acylglyceride intermediates for a period of 6 h. Since the amide products are surfactants, they cannot be isolated directly from the reaction medium. To isolate the amides, 2 mL of 4 M HCl was first added after the 6 h to lower the pH to 2.0 and convert the amides into the neutral form, and 10 mL (×2) hexane was used to extract the residual fatty acids. Following the removal of the hexane organic phase, 10 mL (×2) of ethyl acetate was then added to the aqueous phase to extract the desired N-acyl glycine amides, along with a small amount of the remaining glyceryl esters. The ethyl acetate organic phases were then combined, and the organic solvent was removed by evaporation at reduced pressure to afford the desired product along with a small amount of glyceryl ester as impurities. To promote the complete hydrolysis of the remaining glyceryl esters, 10 mL of 100 mM sodium phosphate buffer (pH 7.5) containing 1.5 mg of Eversa Transform 2.0 (Novozymes) was added to the extracted product and incubated for another 12 h at 50 °C and 450 rpm. Following the hydrolysis of the remaining esters, the extraction process was then repeated as above. 1 mL of 4 M HCl was added to lower the pH to 2.0, and 10 mL (×2) hexane was used to extract the hydrolyzed fatty acids. After the removal of the hexane organic phase, 10 mL (×2) of ethyl acetate was then added to the aqueous phase to extract the desired N-acyl glycine amides. The ethyl acetate organic phases were then combined, and the organic solvent was removed by evaporation at reduced pressure to afford the desired product.

[0138] Chemicals, biochemicals, strains and primers

[0139] Chemicals: octanoic acid (99%), decanoic acid (98%), lauric acid (98%), palmitic acid (99%), oleic acid (99%), 1,3-glyceryl didodecanoate (99%), glyceryl tridodecanoate (99%), p-nitrophenyl dodecanoate (98%), ninhydrin solution (2% solution), glycerol (99%), ethylene glycol (99%), Na2HPO4 (>99%), NaH2PO4 (>99%), K2HPO4 (>99%) and KH2PO4 (>99%) were purchased from Sigma Aldrich (Singapore). Myristic acid (99%), pentadecanoic acid (98%), stearic acid (98%), and glyceryl monolaurate (99%) were purchased from TCI Chemicals. Ethyl acetate, n-hexane, acetonitrile and methanol (all in HPLC grade) were purchased from Fisher Scientific.

[0140] Biochemicals and Strains: The gene of Rhizomucor miehei lipase, also known as RML (GenBank accession no. A34959), was synthesized and cloned into vectors pAO815 and pET-28a (+) by Twist Bioscience. The P. pastoris GS115 yeast strain was purchased from Thermo Fisher Scientific, while the E. coli XL1 Blue and BL21 (DE3) competent cells, T4 DNA ligase, Q5 high fidelity DNA polymerase (2X master mix), and restriction enzymes NcoI, HindIII, AvrII, EcoRI, BsaI-HF, SacI and DpnI were purchased from NEB (New England Biolabs). Ni-NTA beads were purchased from BioBasic, while the QIAprep spin plasmid miniprep kit and QIAquick PCR purification kit were purchased from Qiagen. Medium components Luria-Bertani (LB) broth powder, Agar A, tryptone, yeast extract, Yeast Nitrogen Base (YNB) and biotin were purchased from Bio Basic, Singapore. Kanamycin disulfate salt (>99%) and ampicillin (>99%) antibiotics were obtained from Sigma Aldrich (Singapore), while isopropyl β-D-1-thiogalactopyranoside (IPTG, >99%) was obtained from Calbiochem, USA. The 4-20% Mini-Protean TGX precast gels for SDS-PAGE analysis were obtained from Bio-Rad.

[0141] Primers: The primers used in this work were obtained from Integrated DNA Technologies (IDT), and their details are listed in Table 1.

[0142] Analytical Methods

[0143] RP-HPLC analysis: Analysis of all N-acyl glycines was carried out using HPLC (LC-20AD/T Shimadzu, Singapore) equipped with a photodiode array detector, with detection of all compounds at 220 nm. A reverse phase Agilent InfinityLab Poroshell 120 SB-C18 column (4.6 × 5 mm; 2.7 µm) was employed with the following mobile phases: solvent A (water/TFA; 100/0.05 v/v) and solvent B (acetonitrile/TFA; 100/0.05 v/v). For the analysis of C10-C18 N-acyl glycines, the mobile phase program was as follows: 40% solvent B from 0-5 min, 40-100% solvent B from 5-10 min, 100% solvent B from 10-35 min, 100-40% solvent B from 35-40 min, and finally 40% solvent B from 40-45 min. For the analysis of N-octanoyl glycine, the mobile phase program was as follows: 10% solvent B from 0-5 min, 10-70% solvent B from 5-10 min, 70% solvent B from 10-35 min, 70-10% solvent B from 35-40 min, and finally 10% solvent B from 40-45 min. For all N-acyl glycines, the flow rate was set at 0.5 mL/min and the column was kept at room temperature throughout the run. The retention times for the various compounds are shown in Table 2 below.
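The two gradient programs above can be written down as breakpoint tables. The following is a minimal sketch, assuming that each ramp (for example 40-100% solvent B from 5-10 min) is linear, which the text does not state explicitly; the names `GRADIENT_C10_C18`, `GRADIENT_C8` and `percent_b` are illustrative only.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints read from the two programs described above.
GRADIENT_C10_C18 = [(0, 40), (5, 40), (10, 100), (35, 100), (40, 40), (45, 40)]
GRADIENT_C8 = [(0, 10), (5, 10), (10, 70), (35, 70), (40, 10), (45, 10)]


def percent_b(gradient, t_min):
    """Return % solvent B at t_min, interpolating linearly between breakpoints.

    Linear ramps are an assumption made for this sketch.
    """
    times = [t for t, _ in gradient]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time lies outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(gradient) - 1:
        # Exactly at the final breakpoint.
        return float(gradient[-1][1])
    (t0, b0), (t1, b1) = gradient[i], gradient[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 7.5, 20, 37.5, 45):
        print(t, percent_b(GRADIENT_C10_C18, t), percent_b(GRADIENT_C8, t))
```

Solvent A makes up the remainder at each time point, so it can be obtained as 100 minus the returned value.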

[0144] SDS-PAGE analysis: Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was used to analyze the purified mature RML and proRML wild-type and mutants. More details are shown in Figure S6.

[0145] TLC analysis: To analyze the acylation activities of the proRML mutants, 180 µL of acylation mix (glycerol/3 M glycine in 100 mM NaP buffer pH 7.5; 6/3.5 v/v) was first added to BL21 wet cells following the overexpression of proRML mutants. 5 µL of oleic acid was then added to initiate the reaction. After incubating the reaction mixture at 50 °C and 1500 rpm for 6 h, 400 µL of stop solution (methanol:4 M HCl; 9:1 v/v) was added to terminate the reaction. The products were then analyzed by spotting 2.5 µL of the reaction mixture on a silica gel 60 F254 thin-layer chromatography (TLC) plate (Merck), which was developed using two mobile phases of different polarities to facilitate separation. The TLC plate was first run with a polar solvent, chloroform/methanol/water/acetic acid (65/25/4/1; v/v/v/v), for 2 cm and dried for 2 min. The plate was then run with a non-polar solvent, hexane/ethyl acetate (1/1; v/v), for 5 cm and dried for 2 min before being stained with iodine. More details are shown in Supplementary Figure S7.

[0146] Gene sequences of proRML and mature RML in pET-28a (+) and pAO815 vectors

[0147] Sequence of proRML wild-type in pET28a (+) vector (SEQ ID NO: 197)

[0148] CCATGGGCAGCAGCCATCATCATCATCATCACAGCAGCGGCCTGGTGCCG CGCGGCAGCCATATGGCTAGCATGACTGGTGGACAGCAAATGGGTCGCGGATCC GTTCCTATCAAACGCCAGTCGAATAGTACGGTGGACTCCCTCCCGCCTCTCATTC CGAGCCGGACGAGTGCCCCTTCGTCGTCACCGTCTACTACAGACCCAGAGGCTCC GGCTATGTCACGGAATGGGCCGTTGCCAAGTGACGTGGAAACTAAATACGGCAT GGCCCTCAATGCCACTAGCTATCCAGACTCAGTGGTACAGGCTATGGAGAACCTT TATTTTCAGGGGAGTATCGATGGGGGTATCCGGGCTGCGACATCACAGGAAATT AACGAGCTCACATACTACACAACCTTGTCCGCTAACTCATATTGTCGTACAGTAA TCCCTGGCGCAACCTGGGACTGCATTCATTGTGATGCTACCGAAGACTTGAAAAT TATTAAAACGTGGTCGACTTTGATTTACGACACAAATGCAATGGTGGCACGTGGG GATAGTGAAAAAACGATCTACATTGTGTTCCGGGGTTCAAGTAGTATTCGTAACT GGATCGCGGACCTCACTTTCGTTCCAGTATCGTATCCTCCAGTATCAGGCACAAA AGTGCACAAGGGCTTCCTCGATTCGTATGGGGAGGTCCAGAACGAACTGGTAGC TACGGTATTAGATCAGTTCAAGCAGTACCCGTCGTACAAAGTCGCAGTTACGGGT CATTCCTTGGGCGGCGCCACAGCCCTCCTGTGCGCATTAGATCTTTATCAACGCG AGGAGGGGCTTAGCTCTTCCAATCTTTTTCTTTATACCCAAGGCCAACCTCGCGT CGGCGACCCGGCCTTCGCCAACTATGTTGTCTCAACTGGCATCCCATACCGTCGG ACTGTCAACGAACGGGATATTGTTCCGCACTTGCCACCAGCAGCTTTTGGGTTCT TACATGCCGGTGAAGAATATTGGATTACAGACAACTCCCCTGAAACCGTGCAAG TATGTACTTCTGATCTGGAAACATCCGATTGCTCGAACTCTATCGTACCATTTACG AGCGTGTTGGACCATCTCAGTTACTTCGGCATTAACACGGGGCTCTGTACCTAAA AGCTT

[0149] (The underlined sequences represent the NcoI and HindIII restriction sites, respectively.)

[0150] Sequence of pro-pro-linker-RML wild-type in pAO815 vector (SEQ ID NO: 198)

[0151] CCTAGGCGAAACGATGAGATTTCCTTCCATCTTCACGGCTGTGCTATTTGC AGCATCCTCCGCACTTGCAGTGCCCATAAAGAGACAATCCAACTCCACAGTCGAT TCCCTTCCACCATTAATTCCTTCCAGGACATCAGCACCTTCTTCTTCTCCTTCTAC CACCGACCCTGAAGCACCTGCTATGTCAAGAAACGGACCTTTGCCATCAGATGTT GAAACGAAGTACGGTATGGCTTTAAACGCTACCTCTTACCCAGACAGTGTCGTTC AGGCTATGAAACGAGAGGCTGAGGCTGAAGCTGTTCCAATCAAACGTCAATCTA ATTCTACTGTTGACTCACTGCCACCCCTGATTCCCTCTCGTACAAGTGCTCCATCT AGTAGTCCTTCTACTACTGATCCAGAGGCCCCTGCCATGTCAAGAAATGGGCCAT TGCCAAGTGATGTTGAAACTAAATATGGCATGGCCTTGAATGCCACTTCATATCC CGATTCAGTAGTACAGGCCATGGGTGGTGGAGGTTCTGGAGGTGGTGGATCTAA ACGTAAGAGGGAGGCTGAAGCCGAAGCTTCCATCGACGGAGGTATTAGAGCCGC TACTTCTCAGGAAATCAACGAACTTACTTACTATACAACTTTGTCAGCTAATTCTT ACTGTAGAACTGTTATTCCTGGTGCTACTTGGGATTGCATACATTGTGACGCCAC TGAAGATTTAAAGATAATTAAAACCTGGTCTACTTTGATTTACGACACTAACGCT ATGGTTGCTAGAGGAGATTCCGAGAAGACTATTTATATCGTGTTTAGAGGTTCTT CTTCTATTCGTAATTGGATCGCTGATTTGACATTCGTTCCAGTCTCTTACCCTCCA GTTTCTGGTACTAAGGTTCACAAAGGATTTCTTGATTCTTATGGTGAAGTTCAAA ACGAGTTGGTTGCTACTGTCTTGGATCAGTTTAAACAATACCCATCTTATAAGGT TGCTGTCACTGGTCACTCTTTGGGAGGTGCTACTGCCTTGCTGTGTGCTTTAGATT TATACCAGAGAGAGGAAGGATTGTCTTCAAGTAACCTATTCTTGTACACTCAAGG TCAGCCTAGAGTTGGAGATCCAGCATTTGCTAATTATGTGGTTTCTACTGGTATTC CATATAGACGTACTGTTAACGAAAGAGACATAGTACCACACTTGCCTCCAGCTGC CTTCGGATTTCTGCATGCCGGTGAAGAGTACTGGATCACAGATAATTCTCCTGAA ACCGTTCAAGTGTGTACATCTGATTTAGAGACTTCCGACTGCTCTAACAGTATTG TTCCATTTACTTCAGTTCTTGATCATTTGTCTTATTTTGGAATTAACACCGGTTTGT GTACTTAAGAATTC

[0152] (The underlined sequences represent the AvrII and EcoRI restriction sites, respectively.)

[0153] RML sequences

[0154] RML wild type sequence expressed in Pichia pastoris (SEQ ID NO: 1).

[0155] VPIKRQSNSTVDSLPPLIPSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGM ALNATSYPDSVVQAMKREAEAEAVPIKRQSNSTVDSLPPLIPSRTSAPSSSPSTTDPEA PAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMGGGGSGGGGSKRKREAEAEAS IDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKTWSTLIY DTNAMVARGDSEKTIYIVFRGSSSIRNWIADLTFVPVSYPPVSGTKVHKGFLDSYGEV QNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALDLYQREEGLSSSNLFLYTQ GQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWITDNSPET VQVCTSDLETSDCSNSIVPFTSVLDHLSYFGINTGLCT

[0156] Amino acid residues 169 to 437 are the RML wild type mature peptide (SEQ ID NO: 3). The mutations of the RML variants described herein are with respect to the RML wild type mature peptide (SEQ ID NO: 3). There are two identical propeptide domains in SEQ ID NO: 1: amino acid residues 1 to 70 and amino acid residues 79 to 148. Amino acid residues 149 to 158 are a flexible linker. Cleavage occurs at amino acid residues 71 to 78 and 161 to 168 (KREAEAEA) to release the mature RML peptide.
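The residue ranges quoted above follow from the lengths of the consecutive segments in SEQ ID NO: 1. The sketch below recomputes them; the segment lengths are counted from the listed sequences, and the two-residue KR between the linker and the second KREAEAEA site is read off the listing of SEQ ID NO: 1 rather than named in the text. The names `SEGMENTS` and `residue_ranges` are illustrative only.

```python
# Segment lengths of SEQ ID NO: 1 in N- to C-terminal order, counted from the listing.
SEGMENTS = [
    ("propeptide, copy 1 (SEQ ID NO: 2)", 70),
    ("cleavage site KREAEAEA", 8),
    ("propeptide, copy 2 (SEQ ID NO: 2)", 70),
    ("flexible linker GGGGSGGGGS", 10),
    ("KR dipeptide (not assigned in the text)", 2),
    ("cleavage site KREAEAEA", 8),
    ("mature RML (SEQ ID NO: 3)", 269),
]


def residue_ranges(segments):
    """Turn consecutive segment lengths into 1-based, inclusive residue ranges."""
    ranges = []
    start = 1
    for name, length in segments:
        ranges.append((name, start, start + length - 1))
        start += length
    return ranges


if __name__ == "__main__":
    for name, first, last in residue_ranges(SEGMENTS):
        print(f"{first:>3}-{last:<3} {name}")
```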

[0157] RML wild type propeptide domain (SEQ ID NO: 2)

[0158] VPIKRQSNSTVDSLPPLIPSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGM ALNATSYPDSVVQAM

[0159] RML wild type mature peptide (SEQ ID NO: 3)

[0160] SIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKT WSTLIYDTNAMVARGDSEKTIYIVFRGSSSIRNWIADLTFVPVSYPPVSGTKVHKGFL DSYGEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALDLYQREEGLSSSN LFLYTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWIT DNSPETVQVCTSDLETSDCSNSIVPFTSVLDHLSYFGINTGLCT

[0161] Propeptide sequence used with expression in Pichia pastoris (SEQ ID NO: 199)

[0162] VPIKRQSNSTVDSLPPLIPSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGM ALNATSYPDSVVQAMKREAEAEAVPIKRQSNSTVDSLPPLIPSRTSAPSSSPSTTDPEA PAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMGGGGSGGGGSKRKREAEAEA

[0163] Propeptide sequence used with expression in Escherichia coli (SEQ ID NO: 200)

[0164] MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSVPIKRQSNSTVDSLPPLI PSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMENLYF QG

[0165] Sequence of proRML D156G peptide (Gen 1) (SEQ ID NO: 4)

[0166] MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSVPIKRQSNSTVDSLPPLI PSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMENLYF QGSIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKTWST LIYDTNAMVARGDSEKTIYIVFRGSSSIRNWIADLTFVPVSYPPVSGTKVHKGFLDSY GEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALGLYQREEGLSSSNLFL YTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWITDNS PETVQVCTSDLETSDCSNSIVPFTSVLDHLSYFGINTGLCT

[0167] In the RML mutants described herein (SEQ ID NO: 4 to 110), amino acid residues 112 to 380 are the RML mature peptide with the indicated mutation with respect to the wild type mature peptide (SEQ ID NO: 3). Amino acid residues 1 to 34 are provided by the commercial pET28a plasmid for expression in E. coli and facilitate purification of the mutant. Amino acid residues 35 to 104 are the RML propeptide domain (SEQ ID NO: 2). Amino acid residues 105 to 111 (ENLYFQG, SEQ ID NO: 201) are the cleavage site of the peptide produced in E. coli, which is a recognition site for the TEV protease. TEV protease is not produced naturally in E. coli and has to be added in order to cleave the propeptide. Hence, by choosing whether to add the TEV protease, it is possible to control whether the propeptide is cleaved from or remains bound to the mature peptide.
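The optional TEV cleavage step can be illustrated on the E. coli construct. This is a minimal sketch, assuming the canonical TEV cut between Q and G of ENLYFQG, which the text does not state. `PRO_REGION` is SEQ ID NO: 200 as listed, `MATURE_START` is only the first ten residues of SEQ ID NO: 3 (truncated to keep the example short), and `tev_cleave` is a hypothetical helper name.

```python
TEV_SITE = "ENLYFQG"

# SEQ ID NO: 200: pET28a-derived residues, RML propeptide, then the TEV site.
PRO_REGION = (
    "MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGS"  # residues 1-34 (from pET28a)
    "VPIKRQSNSTVDSLPPLIPSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAM"  # 35-104
    "ENLYFQG"  # 105-111 (TEV recognition site)
)

# First ten residues of the mature peptide only (truncated for illustration).
MATURE_START = "SIDGGIRAAT"


def tev_cleave(protein):
    """Split a protein at the first TEV site, cutting between Q and G.

    Returns a 1-tuple with the unchanged protein when no site is present,
    which models the case where no TEV protease is added.
    """
    idx = protein.find(TEV_SITE)
    if idx == -1:
        return (protein,)
    cut = idx + len(TEV_SITE) - 1  # index of the G that follows ENLYFQ
    return protein[:cut], protein[cut:]


if __name__ == "__main__":
    n_term, c_term = tev_cleave(PRO_REGION + MATURE_START)
    print(len(n_term), c_term)
```

Under this assumption the glycine of the recognition site stays on the N-terminus of the released fragment, so the fragment begins one residue before the first residue of the mature peptide.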

[0168] Sequence of proRML D156G/L258K peptide (Gen 2) (SEQ ID NO: 6)

[0169] MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSVPIKRQSNSTVDSLPPLI PSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMENLYF QGSIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKTWST LIYDTNAMVARGDSEKTIYIVFRGSSSIRNWIADLTFVPVSYPPVSGTKVHKGFLDSY GEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALGLYQREEGLSSSNLFL YTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWITDNS PETVQVCTSDLETSDCSNSIVPFTSVLDHKSYFGINTGLCT

[0170] Sequence of proRML D156G/L258K/L267N peptide (Gen 3) (SEQ ID NO: 24)

[0171] MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSVPIKRQSNSTVDSLPPLI PSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMENLYF QGSIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKTWST LIYDTNAMVARGDSEKTIYIVFRGSSSIRNWIADLTFVPVSYPPVSGTKVHKGFLDSY GEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALGLYQREEGLSSSNLFL YTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWITDNS PETVQVCTSDLETSDCSNSIVPFTSVLDHKSYFGINTGNCT

[0172] Sequence of proRML D156G/L258K/L267N/S83D peptide (Gen 4) (SEQ ID NO: 51)

[0173] MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSVPIKRQSNSTVDSLPPLI PSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMENLYF QGSIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKTWST LIYDTNAMVARGDSEKTIYIVFRGSDSIRNWIADLTFVPVSYPPVSGTKVHKGFLDSY GEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALGLYQREEGLSSSNLFL YTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWITDNS PETVQVCTSDLETSDCSNSIVPFTSVLDHKSYFGINTGNCT

[0174] Sequence of proRML D156S/L258K/L267N/S83D/L58K/R86K/W88V (Gen 5) (SEQ ID NO: 110)

[0175] MGSSHHHHHHSSGLVPRGSHMASMTGGQQMGRGSVPIKRQSNSTVDSLPPLI PSRTSAPSSSPSTTDPEAPAMSRNGPLPSDVETKYGMALNATSYPDSVVQAMENLYF QGSIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKTWST KIYDTNAMVARGDSEKTIYIVFRGSDSIKNVIADLTFVPVSYPPVSGTKVHKGFLDSY GEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALSLYQREEGLSSSNLFL YTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWITDNS PETVQVCTSDLETSDCSNSIVPFTSVLDHKSYFGINTGNCT

[0176] Table 1. Information of primers used in this work.

[0177] a The underlined bases indicate the mutated sites. The NTT degenerate codon is also used in place of the RVK degenerate codon. Symbols follow the IUB code: N = G/T/A/C, K = G/T, R = A/G, V = G/A/C. Each mutation has a pair of primers, forward (F) and reverse (R), as shown in Table 1.
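To make the degenerate codons concrete, the sketch below expands NTT and RVK with the IUB symbols defined above and lists the amino acids each can encode. It assumes the standard genetic code, which the text does not specify; the names `IUB`, `CODON_TABLE` and `encoded_amino_acids` are illustrative only.

```python
from itertools import product

# IUB symbols as defined in the paragraph above, plus the four plain bases.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "GTAC", "K": "GT", "R": "AG", "V": "GAC",
}

# Standard genetic code (an assumption), codons ordered TCAG at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """List every concrete codon covered by a degenerate codon such as 'RVK'."""
    return ["".join(p) for p in product(*(IUB[s] for s in degenerate_codon))]


def encoded_amino_acids(degenerate_codon):
    """Return the distinct amino acids (1-letter codes, sorted) a codon can encode."""
    return "".join(sorted({CODON_TABLE[c] for c in expand(degenerate_codon)}))


if __name__ == "__main__":
    for codon in ("NTT", "RVK"):
        print(codon, len(expand(codon)), encoded_amino_acids(codon))
```

Under the standard code, NTT covers four codons for hydrophobic residues (F, I, L, V), while RVK covers twelve codons for nine mostly polar or charged residues.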

[0178] Table 2. Retention times of various compounds in Reverse Phase HPLC C18 column

Compound             Retention time [min]    Compound                  Retention time [min]
lauric acid          21.9                    N-myristoylglycine        21.3
1-monolaurin         20.7                    N-pentadecanoylglycine    22.5
N-lauroylglycine     18.9                    N-palmitoylglycine        23.8
N-octanoylglycine    22.3                    N-oleoylglycine           24.2
N-decanoylglycine    15.6

[0179] Table 3. Screening of suitable lipases for amidation via the hydrolysis of N-lauroylglycine

Amano PS from Burkholderia cepacia Sigma Aldrich N.D*

Amano A from Aspergillus niger Sigma Aldrich 0.034

[0180] N.D*: Not Detected

[0181] Features that are described in the context of an embodiment may correspondingly be applicable to the same or similar features in the other embodiments. Features that are described in the context of an embodiment may correspondingly be applicable to the other embodiments, even if not explicitly described in these other embodiments. Furthermore, additions and/or combinations and/or alternatives as described for a feature in the context of an embodiment may correspondingly be applicable to the same or similar feature in the other embodiments.

[0182] Various amino acids are described herein by their full names and by the conventional 1-letter and 3-letter abbreviations known in the art. The substitution of an amino acid residue in a peptide is described by the conventional notation; for example, a D156 substitution refers to a substitution of the 156th amino acid residue (aspartic acid in this example) of the relevant sequence. More specifically, D156G indicates that the aspartic acid (D) at the 156th amino acid residue (or position) is substituted by glycine (G).
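The substitution notation can be applied mechanically to the wild type mature peptide. The following is a minimal sketch: `RML_MATURE` is SEQ ID NO: 3 as listed earlier in this document, while `apply_substitutions` is a hypothetical helper written for illustration that accepts slash-separated substitutions such as D156G/L258K and checks the stated wild type residue before replacing it.

```python
import re

# SEQ ID NO: 3, RML wild type mature peptide (269 residues), as listed in this document.
RML_MATURE = (
    "SIDGGIRAATSQEINELTYYTTLSANSYCRTVIPGATWDCIHCDATEDLKIIKT"
    "WSTLIYDTNAMVARGDSEKTIYIVFRGSSSIRNWIADLTFVPVSYPPVSGTKVHKGFL"
    "DSYGEVQNELVATVLDQFKQYPSYKVAVTGHSLGGATALLCALDLYQREEGLSSSN"
    "LFLYTQGQPRVGDPAFANYVVSTGIPYRRTVNERDIVPHLPPAAFGFLHAGEEYWIT"
    "DNSPETVQVCTSDLETSDCSNSIVPFTSVLDHLSYFGINTGLCT"
)


def apply_substitutions(sequence, spec):
    """Apply substitutions such as 'D156G/L258K' using 1-based positions.

    Raises ValueError if a token is malformed or if the residue named in the
    token does not match the residue found at that position.
    """
    residues = list(sequence)
    for token in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues) or sequence[pos - 1] != old:
            raise ValueError(f"{token}: residue {pos} is not {old}")
        residues[pos - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    gen2 = apply_substitutions(RML_MATURE, "D156G/L258K")
    print(gen2[150:160], gen2[-16:])
```

The wild type check guards against numbering a substitution against the wrong sequence, for example against a full construct that still carries the tag and propeptide instead of the mature peptide.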

[0183] As used herein, the articles “a”, “an” and “the”, with regard to a feature or element, include a reference to one or more of the features or elements. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. As used herein, the terms “first,” “second,” and “third,” etc. are used merely as labels, and are not intended to impose numerical requirements on their objects. For example, the lipase variant may comprise the first substituent and the third substituent without having the second substituent. In another example, the lipase variant may comprise the first substituent and the sixth substituent without having the second substituent, third substituent, fourth substituent, and fifth substituent. Other examples of different substituents are also possible. Thus, the terms “first,” “second,” and “third,” etc. do not impose any numerical requirement.

[0184] Where a range of values is recited, it is to be understood that each intervening integer value, and each fraction thereof, between the recited upper and lower limits of that range is also specifically disclosed, along with each subrange between such values. The upper and lower limits of any range can independently be included in or excluded from the range, and each range where either, neither or both limits are included is also encompassed within the invention. Where a value being discussed has inherent limits, for example where a component can be present at a concentration of from 0 to 100%, or where the pH of an aqueous solution can range from 1 to 14, those inherent limits are specifically disclosed. Where a value is explicitly recited, it is to be understood that values which are about the same quantity or amount as the recited value are also within the scope of the invention, as are ranges based thereon.

[0185] Ranges may be expressed herein as from "about" one particular value, and/or to "about" another particular value. When such a range is expressed, a further aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent "about," it will be understood that the particular value forms a further aspect. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as "about" that particular value in addition to the value itself. For example, if the value "10" is disclosed, then "about 10" is also disclosed. It is also understood that each unit between two particular units is also disclosed. For example, if 10 and 15 are disclosed, then 11, 12, 13, and 14 are also disclosed.

[0186] Unless defined otherwise or the context clearly dictates otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the invention, the preferred methods and materials are now described.

[0187] Although each of these terms has a distinct meaning, the terms “comprising”, “consisting of” and “consisting essentially of” may be interchanged for one another throughout the instant application. The term “having” has the same meaning as “comprising” and may be replaced with either the term “consisting of” or “consisting essentially of”.

[0188] The terms “polynucleotide”, “oligonucleotide”, “nucleic acid” and “nucleic acid molecule” are used interchangeably herein to refer to a polymeric form of nucleotides of any length, and may comprise ribonucleotides, deoxyribonucleotides, analogues thereof, or mixtures thereof. Whether modified or unmodified, in some embodiments the target nucleotide must have a polyanionic backbone, preferably a sugar-phosphate backbone. More particularly, the terms “polynucleotide”, “oligonucleotide”, “nucleic acid” and “nucleic acid molecule” include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), including tRNA, rRNA, hRNA, and mRNA, whether spliced or unspliced, any other type of polynucleotide which is an N- or C-glycoside of a purine or pyrimidine base, and other polymers containing a phosphate or other polyanionic backbone, and other synthetic sequence-specific nucleic acid polymers, provided that the polymers contain nucleobases in a configuration which allows for base pairing and base stacking, such as is found in DNA and RNA. These terms include, for example, 3'-deoxy-2',5'-DNA, oligodeoxyribonucleotide N3'->P5' phosphoramidates, 2'-O-alkyl-substituted RNA, double- and single-stranded DNA, as well as double- and single-stranded RNA, and hybrids thereof including for example hybrids between DNA and RNA, and also include known types of modifications, for example, labels, alkylation, caps, substitution of one or more of the nucleotides with an analog, internucleotide modifications such as, for example, those with negatively charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), those containing pendant moieties, such as, for example, proteins (including enzymes (e.g., nucleases), toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelates (of, e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the polynucleotide or oligonucleotide.

[0189] The terms “polypeptide” and “protein”, used interchangeably herein, refer to a polymer of amino acids without regard to the length of the polymer; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not specify or exclude chemical or post-expression modifications of the polypeptides of the invention, although chemical or post-expression modifications of these polypeptides may be included or excluded as specific embodiments. Therefore, for example, modifications to polypeptides that include the covalent attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like are expressly encompassed by the term polypeptide. Further, polypeptides with these modifications may be specified as individual species to be included or excluded from the present invention. The natural or other chemical modifications, such as those listed in examples above, can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the same type of modification may be present in the same or varying degrees at several sites in a given polypeptide. Also, a given polypeptide may contain many types of modifications. Polypeptides may be branched, for example, as a result of ubiquitination, and they may be cyclic, with or without branching. Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination. Also included within the definition are polypeptides which contain one or more analogs of an amino acid (including, for example, non-naturally occurring amino acids, amino acids which only occur naturally in an unrelated biological system, modified amino acids from mammalian systems, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.

[0190] The terms “percentage of sequence identity” and “percentage homology” are used interchangeably herein to refer to comparisons among polynucleotides and polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Identity is evaluated using any of the variety of sequence comparison algorithms and programs known in the art. Such algorithms and programs include, but are by no means limited to, TBLASTN, BLASTP, FASTA, TFASTA, CLUSTAL W, FASTDB.
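The calculation described above can be sketched for two sequences that have already been aligned. This minimal example does not perform the alignment itself, which would be done by one of the programs listed; the gap character `-` and the function name `percent_identity` are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage identity over the comparison window of two pre-aligned sequences.

    Matched positions are those where both sequences carry the same residue;
    a gap never counts as a match. The divisor is the total number of
    positions in the window, as in the definition above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty comparison window")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical six-residue windows, the second with one gap.
    print(percent_identity("LCALDL", "LCALGL"))
    print(percent_identity("LCALDL", "LCA-DL"))
```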

[0191] References

[0192] 1. Clapes, P. & Infante, M. R. Amino Acid-based Surfactants: Enzymatic Synthesis, Properties and Potential Applications. Biocatal. Biotransformation 20, 215-233 (2002).

[0193] 2. Moran, M. C. et al. ‘Green’ amino acid-based surfactants. Green Chem. 6, 233-240 (2004).

[0194] 3. Ananthapadmanabhan, K. P. Amino-acid surfactants in personal cleansing (review). Tenside, Surfactants, Deterg. 56, 378-386 (2019).

[0195] 4. Zhang, G. et al. Green Synthesis, Composition Analysis and Surface Active Properties of Sodium Cocoyl Glycinate. Am. J. Anal. Chem. 4, 445-450 (2013).

[0196] 5. Tan, B. et al. Identification of endogenous acyl amino acids based on a targeted lipidomics approach. J. Lipid Res. 51, 112-119 (2010).

[0197] 6. Burstein, S. H. N-Acyl Amino Acids (Elmiric Acids): Endogenous Signaling Molecules with Therapeutic Potential. Mol. Pharmacol. 93, 228-238 (2018).

[0198] 7. Lumir Hanus, Esther Shohami, Itai Bab, R. M. N-Acyl amino acids and their impact on biological processes. BioFactors 40, 381-388 (2014).

[0199] 8. Battista, N., Bari, M. & Bisogno, T. N-Acyl Amino Acids: Metabolism, Molecular Targets, and Role in Biological Processes. Biomolecules 9, 822 (2019).

[0200] 9. Bradshaw, H. B., Rimmerman, N., Hu, S. S. J., Burstein, S. & Walker, J. M. Novel endogenous N-acyl glycines identification and characterization. Vitam. Horm. 81, 191-205 (2009).

[0201] 10. Piscitelli, F. et al. Protective Effects of N-Oleoylglycine in a Mouse Model of Mild Traumatic Brain Injury. ACS Chem. Neurosci. 11, 1117-1128 (2020).

[0202] 11. Valeur, E. & Bradley, M. Amide bond formation: Beyond the myth of coupling reagents. Chem. Soc. Rev. 38, 606-631 (2009).

[0203] 12. El-Faham, A. & Albericio, F. Peptide coupling reagents, more than a letter soup. Chem. Rev. 111, 6557-6602 (2011).

[0204] 13. Delong, R. C. Preparation of acid chlorides with phosgene in the presence of a catalyst. (1975).

[0205] 14. Chhatwal, A. R., Lomax, H. V., Blacker, A. J., Williams, J. M. J. & Marce, P. Direct synthesis of amides from nonactivated carboxylic acids using urea as nitrogen source and Mg(NO3)2 or imidazole as catalysts. Chem. Sci. 11, 5808-5818 (2020).

[0206] 15. Constable, D. J. C. et al. Key green chemistry research areas — a perspective from pharmaceutical manufacturers. Green Chem. 9, 411-42 (2007).

[0207] 16. Van Rantwijk, F., Hacking, M. A. P. J. & Sheldon, R. A. Lipase-catalyzed synthesis of carboxylic amides: Nitrogen nucleophiles as acyl acceptor. Monatshefte für Chemie 131, 549-569 (2000).

[0208] 17. Philpott, H. K., Thomas, P. J., Tew, D., Fuerst, D. E. & Lovelock, S. L. A versatile biosynthetic approach to amide bond formation. Green Chem. 20, 3426-3431 (2018).

[0209] 18. Ferjancic-Biagini, A., Giardina, T., Reynier, M. & Puigserver, A. Hog kidney and intestine aminoacylase-catalyzed acylation of L-methionine in aqueous media. Biocatal. Biotransformation 15, 313-323 (1997).

[0210] 19. Yokoigawa, K., Sato, E., Esaki, N. & Soda, K. Enantioselective synthesis of N-acetyl-l-methionine with aminoacylase in organic solvent. Appl. Microbiol. Biotechnol. 42, 287-289 (1994).

[0211] 20. Wada, E. et al. Enzymatic synthesis of N-acyl-L-amino acids in a glycerol-water system using acylase I from pig kidney. J. Am. Oil Chem. Soc. 79, 41-46 (2002).

[0212] 21. Wardenga, R., Lindner, H. A., Hollmann, F., Thum, O. & Bornscheuer, U. Increasing the synthesis/hydrolysis ratio of aminoacylase 1 by site-directed mutagenesis. Biochimie 92, 102-109 (2010).

[0213] 22. Wardenga, R., Hollmann, F., Thum, O. & Bornscheuer, U. Functional expression of porcine aminoacylase 1 in E. coli using a codon optimized synthetic gene and molecular chaperones. Appl. Microbiol. Biotechnol. 81, 721-729 (2008).

[0214] 23. Koreishi, M. et al. A novel acylase from Streptomyces mobaraensis that efficiently catalyzes hydrolysis/synthesis of capsaicins as well as N-acyl-L-amino acids and N-acyl-peptides. J. Agric. Food Chem. 54, 72-78 (2006).

[0215] 24. Koreishi, M. et al. Purification, characterization, molecular cloning, and expression of a new aminoacylase from Streptomyces mobaraensis that can hydrolyze N-(Middle/Long)-chain-fatty-acyl-L-amino acids as well as N-Short-chain-acyl-L-amino acids. Biosci. Biotechnol. Biochem. 73, 1940-1947 (2009).

[0216] 25. Koreishi, M. et al. Purification and characterization of a novel aminoacylase from Streptomyces mobaraensis. Biosci. Biotechnol. Biochem. 69, 1914-1922 (2005).

[0217] 26. Koreishi, M., Kawasaki, R., Imanaka, H., Imamura, K. & Nakanishi, K. A novel ε-lysine acylase from Streptomyces mobaraensis for synthesis of Nε-acyl-L-lysines. J. Am. Oil Chem. Soc. 82, 631-637 (2005).

[0218] 27. Dettori, L. et al. An aminoacylase activity from Streptomyces ambofaciens catalyzes the acylation of lysine on α-position and peptides on N-terminal position. Eng. Life Sci. 18, 589-599 (2018).

[0219] 28. Dettori, L. et al. N-α-acylation of lysine catalyzed by immobilized aminoacylases from Streptomyces ambofaciens in aqueous medium. Microporous Mesoporous Mater. 267, 24-34 (2018).

[0220] 29. Bourkaib, M. C. et al. N-acylation of L-amino acids in aqueous media: Evaluation of the catalytic performances of Streptomyces ambofaciens aminoacylases. Enzyme Microb. Technol. 137, 109536 (2020).

[0221] 30. Takakura, Y. & Asano, Y. Purification, characterization, and gene cloning of a novel aminoacylase from Burkholderia sp. strain LP5_18B that efficiently catalyzes the synthesis of N-lauroyl-L-amino acids. Biosci. Biotechnol. Biochem. 83, 1964-1973 (2019).

[0222] 31. Zeng, S. et al. Amide Synthesis via Aminolysis of Ester or Acid with an Intracellular Lipase. ACS Catal. 8, 8856-8865 (2018).

[0223] 32. Goswami, A. & Van Lanen, S. G. Enzymatic strategies and biocatalysts for amide bond formation: Tricks of the trade outside of the ribosome. Mol. Biosyst. 11, 338-353 (2015).

[0224] 33. Tuccio, B., Ferre, E. & Comeau, L. Lipase-Catalyzed Syntheses of N-Octyl-Alkylamides in Organic Media. Tetrahedron Lett. 32, 2763-2764 (1991).

[0225] 34. Maugard, T., Remaud-Simeon, M., Petre, D. & Monsan, P. Lipase-catalysed production of N-oleoyl-taurine sodium salt in non-aqueous medium. Biotechnol. Lett. 19, 751-753 (1997).

[0226] 35. Maugard, T., Remaud-Simeon, M., Petre, D. & Monsan, P. Enzymatic amidification for the synthesis of biodegradable surfactants: Synthesis of N-acylated hydroxylated amines. J. Mol. Catal. - B Enzym. 5, 13-17 (1998).

[0227] 36. Maugard, T., Remaud-Simeon, M., Petre, D. & Monsan, P. Enzymatic synthesis of glycamide surfactants by amidification reaction. Tetrahedron 53, 5185-5194 (1997).

[0228] 37. Fernandez-Perez, M. & Otero, C. Enzymatic synthesis of amide surfactants from ethanolamine. Enzyme Microb. Technol. 28, 527-536 (2001).

[0229] 38. Kidwai, M., Poddar, R. & Mothsra, P. N-acylation of ethanolamine using lipase: a chemoselective catalyst. Beilstein J. Org. Chem. 5, 10 (2009).

[0230] 39. Litjens, M. J. J., Straathof, A. J. J., Jongejan, J. A. & Heijnen, J. J. Exploration of lipase-catalyzed direct amidation of free carboxylic acids with ammonia in organic solvents. Tetrahedron 55, 12411-12418 (1999).

[0231] 40. Arce, G., Carrau, G., Bellomo, A. & Gonzalez, D. Greener Synthesis of an Amide by Direct Reaction of an Acid and Amine under Catalytic Conditions. World J. Chem. Educ. 3, 27-29 (2015).

[0232] 41. Funabashi, M. et al. An ATP-independent strategy for amide bond formation in antibiotic biosynthesis. Nat. Chem. Biol. 6, 581-586 (2010).

[0233] 42. Settembre, E. C. et al. Structural and Mechanistic Studies on ThiO, a Glycine Oxidase Essential for Thiamin Biosynthesis in Bacillus subtilis. Biochemistry 42, 2971-2981 (2003).

[0234] 43. Ju, Y. et al. X-shaped structure of bacterial heterotetrameric tRNA synthetase suggests cryptic prokaryote functions and a rationale for synthetase classifications. Nucleic Acids Res. 49, 10106-10119 (2021).

[0235] 44. Eberhardt, J., Santos-Martins, D., Tillack, A. F. & Forli, S. AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. ChemRxiv (2021).

[0236] 45. Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading. J. Comput. Chem. 31, 455 (2010).

[0237] 46. Reetz, M. T. & Carballeira, J. D. Iterative saturation mutagenesis (ISM) for rapid directed evolution of functional enzymes. Nat. Protoc. 2, 891-903 (2007).

[0238] 47. Wang, J. et al. Enhanced activity of Rhizomucor miehei lipase by directed evolution with simultaneous evolution of the propeptide. Appl. Microbiol. Biotechnol. 96, 443-450 (2012).

[0239] 48. Ng, A. M. J. et al. A Novel Lipase from Lasiodiplodia theobromae Efficiently Hydrolyses C8-C10 Methyl Esters for the Preparation of Medium-Chain Triglycerides’ Precursors. Int. J. Mol. Sci. 22, 10339 (2021).