

Title:
METHODS OF ASSESSING RISK OF DEVELOPING MELANOMA
Document Type and Number:
WIPO Patent Application WO/2024/077357
Kind Code:
A1
Abstract:
The present disclosure relates to methods for assessing the risk of a human subject for developing melanoma. Also provided are polymorphisms associated with the risk of a human subject for developing melanoma.

Inventors:
DITE GILLIAN (AU)
ALLMAN RICHARD (AU)
WONG CHI KUEN (AU)
Application Number:
PCT/AU2023/051013
Publication Date:
April 18, 2024
Filing Date:
October 13, 2023
Assignee:
GENETIC TECH LIMITED (AU)
International Classes:
C12Q1/6886; G16H50/20; G16H50/30
Domestic Patent References:
WO2013034645A1 2013-03-14
WO2016061246A1 2016-04-21
Other References:
FANGYI GU, TING-HUEI CHEN, RUTH M. PFEIFFER, MARIA CONCETTA FARGNOLI, DONATO CALISTA, PAOLA GHIORZO, KETTY PERIS, SUSANA PUIG, CHI: "Combining common genetic variants and non-genetic risk factors to predict risk of cutaneous melanoma", HUMAN MOLECULAR GENETICS, OXFORD UNIVERSITY PRESS, GB, GB , XP093160810, ISSN: 0964-6906, DOI: 10.1093/hmg/ddy282
M.R. ROBERTS; M.M. ASGARI; A.E. TOLAND: "Genome‐wide association studies and polygenic risk scores for skin cancer: clinically useful yet?", BRITISH JOURNAL OF DERMATOLOGY, JOHN WILEY, HOBOKEN, USA, vol. 181, no. 6, 7 July 2019 (2019-07-07), Hoboken, USA, pages 1146 - 1155, XP071119223, ISSN: 0007-0963, DOI: 10.1111/bjd.17917
JULIA STEINBERG, MARK M ILES, JIN YEE LEE, XIAOCHUAN WANG, MATTHEW H LAW, AMELIA K SMIT, TU NGUYEN-DUMONT, GRAHAM G GILES, MELISSA: "Independent evaluation of melanoma polygenic risk scores in UK and Australian prospective cohorts.", BRITISH JOURNAL OF DERMATOLOGY, JOHN WILEY, HOBOKEN, USA, vol. 186, no. 5, 1 May 2022 (2022-05-01), Hoboken, USA, pages 823 - 834, XP093160813, ISSN: 0007-0963, DOI: 10.1111/bjd.20956
ANNE E. CUST, MARTIN DRUMMOND, PETER A. KANETSKY, ALISA M. GOLDSTEIN, JENNIFER H. BARRETT, STUART MACGREGOR, MATTHEW H. LAW, MARK: "Assessing the Incremental Contribution of Common Genomic Variants to Melanoma Risk Prediction in Two Population-Based Studies", JOURNAL OF INVESTIGATIVE DERMATOLOGY, ELSEVIER, NL, vol. 138, no. 12, 1 December 2018 (2018-12-01), NL , pages 2617 - 2624, XP093160816, ISSN: 0022-202X, DOI: 10.1016/j.jid.2018.05.023
STEFANAKI, I. ET AL.: "Replication and predictive value of SNPs associated with melanoma and pigmentation traits in a Southern European case-control study.", PLOS ONE, vol. 8, no. 2, 2013, XP055398236, Retrieved from the Internet DOI: 10.1371/journal.pone.0055712
KYLIE VUONG, BRUCE K ARMSTRONG, ELISABETE WEIDERPASS, EILIV LUND, HANS-OLOV ADAMI, MARIT B VEIEROD, JENNIFER H BARRETT, JOHN R DAV: "Development and External Validation of a Melanoma Risk Prediction Model Based on Self-assessed Risk Factors", JAMA DERMATOLOGY, AMERICAN MEDICAL ASSOCIATION, US, vol. 152, no. 8, 1 August 2016 (2016-08-01), US , pages 889, XP093160820, ISSN: 2168-6068, DOI: 10.1001/jamadermatol.2016.0939
Attorney, Agent or Firm:
FB RICE PTY LTD (AU)
Claims:
CLAIMS

1. A method for assessing the risk of a human subject for developing melanoma comprising: i) performing a genetic risk assessment of the subject, wherein the genetic risk assessment involves detecting, in a biological sample derived from the subject, the presence of at least two polymorphisms associated with a risk of a human subject for developing melanoma, ii) performing a clinical risk assessment of the subject for developing melanoma, and iii) combining the genetic risk assessment and the clinical risk assessment to obtain the risk of a human subject for developing melanoma.

2. The method of claim 1, wherein the genetic risk assessment comprises detecting the presence of at least two, at least five, at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, or each of the polymorphisms selected from rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

3. The method of claim 1 or claim 2, wherein the genetic risk assessment comprises detecting the presence of each of the following polymorphisms rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

4. The method of any one of claims 1 to 3, wherein performing the clinical risk assessment involves obtaining information from the subject on one or more or all of the following: hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density and history of sunbed use.

5. The method of any one of claims 1 to 3, wherein performing the clinical risk assessment involves obtaining information from the subject on each of hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density and history of sunbed use.

6. The method of any one of claims 2 to 5, wherein the polymorphism in linkage disequilibrium has linkage disequilibrium above 0.9.

7. The method of any one of claims 2 to 5, wherein the polymorphism in linkage disequilibrium has linkage disequilibrium of 1.

8. The method of any one of claims 1 to 7, which further comprises comparing the risk to a pre-determined threshold.

9. The method of any one of claims 1 to 8, wherein the genetic risk assessment produces a polygenic risk score (PRS).

10. The method of claim 9, wherein the PRS is determined using:

$$\mathrm{PRS}_i = \sum_{j} \beta_j \, x_{ij}$$

where: $\beta_j$ is the effect size (log of the odds ratio) of polymorphism j, and

$x_{ij}$ is the count (0, 1, 2) of the effect alleles of polymorphism j for individual i.

11. The method of any one of claims 1 to 10, wherein the clinical risk assessment is determined using:

clin_xb = Z + (PDCE1 × hair_2) + (PDCE2 × hair_3) + (PDCE3 × hair_4) + (-PDCE4 × nevus_1) + (-PDCE5 × nevus_2) + (PDCE6 × nevus_3) + (PDCE7 × nevus_4) + (PDCE8 × fam_hist) + (-PDCE9 × (1 - fam_hist)) + (PDCE10 × non_mn) + (-PDCE11 × sun_2) + (PDCE12 × sun_3)

where:

Z is the baseline risk value, from which the individual’s risk changes according to the β coefficients and the values for the risk factors an individual has (the equation adds and subtracts risk from the baseline value according to the risk factors an individual has),

PDCE1 is a predetermined β coefficient for light brown hair colour at age 18 years,

PDCE2 is a predetermined β coefficient for blonde hair colour at age 18 years,

PDCE3 is a predetermined β coefficient for red hair colour at age 18 years,

PDCE4 is a predetermined β coefficient for a nevus density of none,

PDCE5 is a predetermined β coefficient for a nevus density of fewer than 20,

PDCE6 is a predetermined β coefficient for a nevus density of 20 to 50,

PDCE7 is a predetermined β coefficient for a nevus density of more than 50,

PDCE8 is a predetermined β coefficient if the subject has a first-degree family history of melanoma,

PDCE9 is a predetermined β coefficient if the subject does not have a first-degree family history of melanoma,

PDCE10 is a predetermined β coefficient for a subject having a personal history of non-melanoma skin cancer,

PDCE11 is a predetermined β coefficient for a subject having used a sunbed 1 to 10 times in their lifetime,

PDCE12 is a predetermined β coefficient for a subject having used a sunbed more than 10 times in their lifetime,

hair_2 indicates whether the subject had light brown hair colour at age 18 years,

hair_3 indicates whether the subject had blonde hair colour at age 18 years,

hair_4 indicates whether the subject had red hair colour at age 18 years,

nevus_1 indicates whether the subject has a nevus density of none,

nevus_2 indicates whether the subject has a nevus density of fewer than 20,

nevus_3 indicates whether the subject has a nevus density of 20 to 50,

nevus_4 indicates whether the subject has a nevus density of more than 50,

fam_hist indicates whether the subject has a first-degree family history of melanoma,

non_mn indicates whether the subject has a personal history of non-melanoma skin cancer,

sun_2 indicates whether the subject has used a sunbed 1 to 10 times, and

sun_3 indicates whether the subject has used a sunbed more than 10 times.

12. The method of claim 11, wherein i) for hair_2 the subject is assigned a value of 1 if they had light brown hair colour at age 18 years, and assigned a value of 0 if not, ii) for hair_3 the subject is assigned a value of 1 if they had blonde hair colour at age 18 years, and assigned a value of 0 if not, iii) for hair_4 the subject is assigned a value of 1 if they had red hair colour at age 18 years, and assigned a value of 0 if not, and iv) if the subject had black or dark brown hair colour at age 18 years, hair colour is not considered further in the clinical risk assessment.

13. The method of claim 11 or claim 12, wherein i) for nevus_1 the subject is assigned a value of 1 if they have a nevus density of none, and assigned a value of 0 if not, ii) for nevus_2 the subject is assigned a value of 1 if they have a nevus density of fewer than 20, and assigned a value of 0 if not, iii) for nevus_3 the subject is assigned a value of 1 if they have a nevus density of 20 to 50, and assigned a value of 0 if not, and iv) for nevus_4 the subject is assigned a value of 1 if they have a nevus density of more than 50, and assigned a value of 0 if not.

14. The method according to any one of claims 11 to 13, wherein for fam_hist the subject is assigned a value of 1 if they have a first-degree family history of melanoma, and assigned a value of 0 if not.

15. The method according to any one of claims 11 to 14, wherein for non_mn the subject is assigned a value of 1 if they have a personal history of non-melanoma skin cancer, and assigned a value of 0 if not.

16. The method according to any one of claims 11 to 15, wherein i) for sun_2 the subject is assigned a value of 1 if they have used a sunbed 1 to 10 times, and assigned a value of 0 if not, ii) for sun_3 the subject is assigned a value of 1 if they have used a sunbed more than 10 times, and assigned a value of 0 if not, and iii) if the subject has not used a sunbed, sunbed use is not considered further in the clinical risk assessment.

17. The method according to any one of claims 11 to 16, wherein Z is between -0.01 and -0.21.

18. The method of claim 17, wherein Z is about -0.1 such as -0.11.

19. The method according to any one of claims 11 to 18, wherein one or more or all of the following apply; a) PDCE1 is between 0.12 and 0.32, b) PDCE2 is between 0.71 and 1.11, c) PDCE3 is between 1.26 and 1.66, d) PDCE4 is between -0.69 and -1.09, e) PDCE5 is between -0.47 and -0.67, f) PDCE6 is between 0.24 and 0.44, g) PDCE7 is between 0.57 and 0.97, h) PDCE8 is between 0.51 and 0.71, i) PDCE9 is between 0 and -0.14, j) PDCE10 is between 0.96 and 1.36, k) PDCE11 is between 0 and -0.15, and l) PDCE12 is between 0.36 and 0.56.

20. The method according to any one of claims 11 to 19, wherein one or more or all of the following apply; a) PDCE1 is about 0.2 such as 0.22, b) PDCE2 is about 0.9 such as 0.91, c) PDCE3 is about 1.45 such as 1.46, d) PDCE4 is about -0.9 such as -0.89, e) PDCE5 is about -0.55 such as -0.57, f) PDCE6 is about 0.35 such as 0.34, g) PDCE7 is about 0.75 such as 0.77, h) PDCE8 is about 0.6 such as 0.61, i) PDCE9 is about -0.05 such as -0.04, j) PDCE10 is about 1.15 such as 1.16, k) PDCE11 is about -0.05 such as -0.05, and l) PDCE12 is about 0.45 such as 0.46.

21. The method according to any one of claims 1 to 20, wherein the genetic risk assessment and the clinical risk assessment are combined using: where:

X is a predetermined β coefficient for the genetic risk assessment, and

Y is a predetermined β coefficient for the clinical risk assessment.

22. The method of claim 21, wherein one or both of the following apply; a) X is between 0.541 and 0.941, and b) Y is between 0.119 and 0.519.

23. The method of claim 21 or claim 22, wherein one or both of the following apply; a) X is about 0.74 such as 0.741, and b) Y is about 0.32 such as 0.319.

24. The method of any one of claims 1 to 23, wherein the results of the risk assessment indicate that the subject should be enrolled in a melanoma screening program or subjected to more frequent melanoma screening.

25. The method according to any one of claims 1 to 24 which comprises determining one or more or all of the absolute 5-year risk, the absolute 10-year risk, or the absolute remaining lifetime risk (to age 90).

26. A method for assessing the risk of a human subject for developing melanoma comprising performing a genetic risk assessment of the subject, wherein the genetic risk assessment involves detecting, in a biological sample derived from the subject, the presence of at least one polymorphism associated with a risk of a human subject for developing melanoma, wherein the polymorphism is selected from rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

27. The method of claim 26, wherein the genetic risk assessment comprises detecting the presence of at least two, at least five, at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, or each of the polymorphisms selected from rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

28. The method of claim 26 or claim 27, wherein the genetic risk assessment comprises detecting the presence of each of the following polymorphisms rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

29. A computer-implemented method for assessing the absolute risk of a human subject for developing melanoma, the method operable in a computing system comprising a processor and a memory, the method comprising: receiving clinical risk data and genetic risk data for the subject, wherein the clinical and genetic risk data was obtained by a method according to any one of claims 1 to 28; processing the data to combine the clinical risk data with the genetic risk data to obtain the relative risk of a human subject for developing melanoma; outputting the absolute risk of a human subject for developing melanoma.

30. The computer-implemented method of claim 29, wherein the clinical risk data and genetic risk data for the subject is received from a user interface coupled to the computing system.

31. The computer-implemented method of claim 29 or claim 30, wherein the clinical risk data and genetic risk data for the subject is received from a remote device across a wireless communications network.

32. The computer-implemented method of any one of claims 29 to 31, wherein outputting comprises outputting information to a user interface coupled to the computing system.

33. The computer-implemented method of any one of claims 29 to 32 which comprises determining a genetic risk score based on genetic data derived from a biological sample taken from the subject.

34. A computer-readable storage medium storing executable code, wherein when a processor executes the code, the processor is caused to perform the method of any one of claims 29 to 33.

35. A device for assessing the risk of a human subject developing melanoma, the device comprising: a processor; and a memory device storing executable code, the memory being accessible to the processor; wherein, when caused to execute the executable code stored in the memory device, the processor is caused to perform a method according to any one of claims 1 to 33.

36. The device of claim 35 further comprising a display component, wherein the processor is further caused to display the melanoma risk score of the subject for developing melanoma on the display component.

37. The device of claim 35 or claim 36 further comprising a communications module, wherein the processor is further caused to communicate the melanoma risk score of the subject for developing melanoma to an external device via the communications module.

38. A method for determining the need for routine diagnostic testing of a human subject for melanoma comprising assessing the risk of the subject for developing melanoma using the method according to any one of claims 1 to 33.

39. A method of screening for melanoma in a human subject, the method comprising assessing the risk of the subject for developing melanoma using the method according to any one of claims 1 to 33, and routinely screening for melanoma in the subject if they are assessed as having a risk for developing melanoma.

40. A method for stratifying a group of human subjects for a clinical trial of a candidate therapy, the method comprising assessing the individual risk of the subjects for developing melanoma using the method according to any one of claims 1 to 34, and using the results of the assessment to select subjects more likely to be responsive to the therapy.

Description:
METHODS OF ASSESSING RISK OF DEVELOPING MELANOMA

FIELD OF THE INVENTION

The present disclosure relates to methods for assessing the risk of a human subject for developing melanoma. Also provided are polymorphisms associated with the risk of a human subject for developing melanoma.

BACKGROUND OF THE INVENTION

Melanoma incidence rates have been increasing over the past 30 years in Western countries. In Australia, 6.7% of men and 4.6% of women will be diagnosed with melanoma in their lifetime, giving Australia the highest rate of melanoma in the world (Bray et al., 2018). In the United States, the lifetime risk of melanoma for white, non-Hispanic adults is lower, but still relevant at 2.6% (American Cancer Society, 2022). Despite being curable if caught early, melanoma is responsible for the majority of skin cancer-related deaths (Corrie et al., 2014; Davis et al., 2019).

Population-based screening programs are not recommended in Western countries, with the exception of Germany (Datzmann et al., 2022), due to inconclusive evidence that screening reduces melanoma-associated mortality (Bibbins-Domingo et al., 2016). One significant barrier to population-level screening is training general practitioners to carry out the screening (Najmi et al., 2022; Brown et al., 2022; Harkemanne et al., 2021; Robinson et al., 2018; Swetter et al., 2017; Weinstock et al., 2016). While clinician-focused educational efforts show improved population-based changes in skin cancer screening, time allocation remains a practical barrier to implementation during patient visits (Tai-Seale et al., 2007; Oliveria et al., 2011). Simplifying screening methods for clinicians could not only enable early detection of melanoma but may also reinvigorate prevention efforts. Implementation of risk stratification at the primary care level may be an efficient option to identify at-risk patients while minimizing the impact on limited face-to-face time.

Melanoma risk prediction can be an important tool in public health prevention strategies. Currently, adults are identified as being at high risk of melanoma based on a few clinical risk factors including age, ultraviolet light exposure (Gandini et al., 2005a), melanocytic nevus count (Grob et al., 1990), history of non-melanoma skin cancer (Gandini et al., 2005b), skin and hair colour (Olsen et al., 2019), and family history of melanoma (Ford et al., 1995). Clinicians often assess these risk factors on an individual basis, without any way to consider the multiplicative effects of risk factors that increase risk of melanoma. Adults identified as high-risk can be offered appropriate screening and risk-reduction options. Recent developments in risk prediction models (Usher-Smith et al., 2014; Vuong et al., 2016 and 2020) have focused on improving screening access for at-risk adults.

Although ultraviolet light exposure is a major risk factor for melanoma, there is also a substantial heritable component to melanoma (58%; 95% CI 43%, 73%) (Mucci et al., 2016). A family history of the disease is a well-established risk factor for melanoma (Vuong et al., 2016; Wei et al., 2019; Frank et al., 2015), but there is an excess of familial risk that is due to genetics. A portion of this genetic risk is due to high penetrance genes such as CDKN2A and CDK4 (Truderung et al., 2021).

Despite available tools, there is a need for further melanoma risk assessment methods.

SUMMARY OF THE INVENTION

The present inventors have identified improved methods of assessing the risk of a human subject for developing melanoma.

In a first aspect, the present invention provides a method for assessing the risk of a human subject for developing melanoma comprising: i) performing a genetic risk assessment of the subject, wherein the genetic risk assessment involves detecting, in a biological sample derived from the subject, the presence of at least two polymorphisms associated with a risk of a human subject for developing melanoma, ii) performing a clinical risk assessment of the subject for developing melanoma, and iii) combining the genetic risk assessment and the clinical risk assessment to obtain the risk of a human subject for developing melanoma.

In an embodiment, the genetic risk assessment comprises detecting the presence of at least two, at least five, at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, or each of the polymorphisms selected from rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

In an embodiment, the genetic risk assessment comprises detecting the presence of each of the following polymorphisms rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

In an embodiment, the genetic risk assessment comprises detecting the presence of each of the following polymorphisms rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207.

In an embodiment, performing the clinical risk assessment involves obtaining information from the subject on one or more or all of the following: hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density and history of sunbed use.

In an embodiment, performing the clinical risk assessment involves obtaining information from the subject on each of hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density and history of sunbed use.

In an embodiment, the polymorphism in linkage disequilibrium has linkage disequilibrium above 0.9.

In an embodiment, the polymorphism in linkage disequilibrium has linkage disequilibrium of 1.
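One way to apply such a linkage disequilibrium cut-off, when a listed polymorphism cannot be genotyped directly, is sketched below in Python. The sketch assumes that linkage disequilibrium values between each candidate proxy and the listed polymorphism are already available. This passage does not say which measure (for example r² or D') is intended, so the values are treated as plain numbers. The function name, proxy names and values are placeholders and are not taken from the disclosure.

```python
from typing import List, Mapping


def usable_proxies(ld_by_candidate: Mapping[str, float], min_ld: float = 0.9) -> List[str]:
    """Return candidate proxies whose linkage disequilibrium is strictly above min_ld."""
    for name, ld in ld_by_candidate.items():
        if not 0.0 <= ld <= 1.0:
            raise ValueError(f"{name}: linkage disequilibrium must lie in [0, 1], got {ld}")
    # "above 0.9" is read as a strict inequality; perfect proxies (1.0) always pass.
    return sorted(name for name, ld in ld_by_candidate.items() if ld > min_ld)


# Placeholder candidates for a single listed polymorphism (hypothetical values).
example_candidates = {"proxy_a": 0.95, "proxy_b": 0.9, "proxy_c": 1.0}
example_selected = usable_proxies(example_candidates)
```

Requiring a value of exactly 1 instead corresponds to keeping only the candidates for which `ld == 1.0`.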

In an embodiment, the method further comprises comparing the risk to a predetermined threshold.

In an embodiment, the genetic risk assessment produces a polygenic risk score (PRS).

In an embodiment, the PRS is determined using:

$$\mathrm{PRS}_i = \sum_{j} \beta_j \, x_{ij}$$

where: $\beta_j$ is the effect size (log of the odds ratio) of polymorphism j, and $x_{ij}$ is the count (0, 1, 2) of the effect alleles of polymorphism j for individual i.
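As a rough illustration of the score described above, the following Python sketch assumes the usual additive form of a polygenic risk score: for one individual, each polymorphism's effect size (log odds ratio) is multiplied by that individual's effect-allele count and the products are summed. The function name, the dictionary layout and the numeric effect sizes are hypothetical and are not taken from the disclosure.

```python
from typing import Mapping


def polygenic_risk_score(effect_sizes: Mapping[str, float],
                         allele_counts: Mapping[str, int]) -> float:
    """Sum effect size x effect-allele count over the scored polymorphisms."""
    score = 0.0
    for rsid, beta in effect_sizes.items():
        count = allele_counts[rsid]  # KeyError if the polymorphism was not genotyped
        if count not in (0, 1, 2):
            raise ValueError(f"{rsid}: allele count must be 0, 1 or 2, got {count}")
        score += beta * count
    return score


# Hypothetical effect sizes (log odds ratios); NOT values from the disclosure.
example_betas = {"rs1805007": 0.5, "rs910873": 0.25, "rs401681": -0.125}
# Effect-allele counts for one individual (illustrative genotype).
example_counts = {"rs1805007": 1, "rs910873": 2, "rs401681": 0}
example_prs = polygenic_risk_score(example_betas, example_counts)
```

A polymorphism with a count of 0 contributes nothing to the score, and a negative effect size lowers it.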

In an embodiment, the clinical risk assessment is determined using:

clin_xb = Z + (PDCE1 × hair_2) + (PDCE2 × hair_3) + (PDCE3 × hair_4) + (-PDCE4 × nevus_1) + (-PDCE5 × nevus_2) + (PDCE6 × nevus_3) + (PDCE7 × nevus_4) + (PDCE8 × fam_hist) + (-PDCE9 × (1 - fam_hist)) + (PDCE10 × non_mn) + (-PDCE11 × sun_2) + (PDCE12 × sun_3)

where:

Z is the baseline risk value, from which the individual’s risk changes according to the β coefficients and the values for the risk factors an individual has (the equation adds and subtracts risk from the baseline value according to the risk factors an individual has),

PDCE1 is a predetermined β coefficient for light brown hair colour at age 18 years,

PDCE2 is a predetermined β coefficient for blonde hair colour at age 18 years,

PDCE3 is a predetermined β coefficient for red hair colour at age 18 years,

PDCE4 is a predetermined β coefficient for a nevus density of none,

PDCE5 is a predetermined β coefficient for a nevus density of fewer than 20,

PDCE6 is a predetermined β coefficient for a nevus density of 20 to 50,

PDCE7 is a predetermined β coefficient for a nevus density of more than 50,

PDCE8 is a predetermined β coefficient if the subject has a first-degree family history of melanoma,

PDCE9 is a predetermined β coefficient if the subject does not have a first-degree family history of melanoma,

PDCE10 is a predetermined β coefficient for a subject having a personal history of non-melanoma skin cancer,

PDCE11 is a predetermined β coefficient for a subject having used a sunbed 1 to 10 times in their lifetime,

PDCE12 is a predetermined β coefficient for a subject having used a sunbed more than 10 times in their lifetime,

hair_2 indicates whether the subject had light brown hair colour at age 18 years,

hair_3 indicates whether the subject had blonde hair colour at age 18 years,

hair_4 indicates whether the subject had red hair colour at age 18 years,

nevus_1 indicates whether the subject has a nevus density of none,

nevus_2 indicates whether the subject has a nevus density of fewer than 20,

nevus_3 indicates whether the subject has a nevus density of 20 to 50,

nevus_4 indicates whether the subject has a nevus density of more than 50,

fam_hist indicates whether the subject has a first-degree family history of melanoma,

non_mn indicates whether the subject has a personal history of non-melanoma skin cancer,

sun_2 indicates whether the subject has used a sunbed 1 to 10 times, and

sun_3 indicates whether the subject has used a sunbed more than 10 times.
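The clinical formula above can be evaluated term by term, as in the following Python sketch. It keeps the plus and minus signs exactly as they appear in the written formula and takes Z and PDCE1 to PDCE12 as inputs. The function name, the keying of coefficients by number and the coefficient values in the example are placeholders chosen to make the arithmetic easy to follow; they are not values from the disclosure.

```python
from typing import Mapping


def clinical_linear_predictor(z: float,
                              pdce: Mapping[int, float],
                              ind: Mapping[str, int]) -> float:
    """Evaluate clin_xb, preserving the sign of each term as written in the formula."""
    return (
        z
        + pdce[1] * ind["hair_2"]
        + pdce[2] * ind["hair_3"]
        + pdce[3] * ind["hair_4"]
        - pdce[4] * ind["nevus_1"]
        - pdce[5] * ind["nevus_2"]
        + pdce[6] * ind["nevus_3"]
        + pdce[7] * ind["nevus_4"]
        + pdce[8] * ind["fam_hist"]
        - pdce[9] * (1 - ind["fam_hist"])
        + pdce[10] * ind["non_mn"]
        - pdce[11] * ind["sun_2"]
        + pdce[12] * ind["sun_3"]
    )


# Placeholder coefficients: every PDCE set to 0.5 and Z to 0.0 (hypothetical).
placeholder_pdce = {k: 0.5 for k in range(1, 13)}
# A subject with red hair, more than 50 nevi and a first-degree family history.
example_indicators = {
    "hair_2": 0, "hair_3": 0, "hair_4": 1,
    "nevus_1": 0, "nevus_2": 0, "nevus_3": 0, "nevus_4": 1,
    "fam_hist": 1, "non_mn": 0, "sun_2": 0, "sun_3": 0,
}
example_clin_xb = clinical_linear_predictor(0.0, placeholder_pdce, example_indicators)  # 1.5
```

Passing the coefficients in as data keeps the sign convention of the formula in one place, so a different set of predetermined coefficients can be swapped in without touching the arithmetic.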

In an embodiment, i) for hair_2 the subject is assigned a value of 1 if they had light brown hair colour at age 18 years, and assigned a value of 0 if not, ii) for hair_3 the subject is assigned a value of 1 if they had blonde hair colour at age 18 years, and assigned a value of 0 if not, iii) for hair_4 the subject is assigned a value of 1 if they had red hair colour at age 18 years, and assigned a value of 0 if not, and iv) if the subject had black or dark brown hair colour at age 18 years, hair colour is not considered further in the clinical risk assessment.

In an embodiment, i) for nevus_1 the subject is assigned a value of 1 if they have a nevus density of none, and assigned a value of 0 if not, ii) for nevus_2 the subject is assigned a value of 1 if they have a nevus density of fewer than 20, and assigned a value of 0 if not, iii) for nevus_3 the subject is assigned a value of 1 if they have a nevus density of 20 to 50, and assigned a value of 0 if not, and iv) for nevus_4 the subject is assigned a value of 1 if they have a nevus density of more than 50, and assigned a value of 0 if not.

In an embodiment, for fam_hist the subject is assigned a value of 1 if they have a first-degree family history of melanoma, and assigned a value of 0 if not.

In an embodiment, for non_mn the subject is assigned a value of 1 if they have a personal history of non-melanoma skin cancer, and assigned a value of 0 if not.

In an embodiment, i) for sun_2 the subject is assigned a value of 1 if they have used a sunbed 1 to 10 times, and assigned a value of 0 if not, ii) for sun_3 the subject is assigned a value of 1 if they have used a sunbed more than 10 times, and assigned a value of 0 if not, and iii) if the subject has not used a sunbed, sunbed use is not considered further in the clinical risk assessment.
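The indicator coding described in the embodiments above can be sketched in a few lines of Python. This is an illustration only, not the applicant's implementation: the function names and category labels are hypothetical, and the coefficient values are placeholders, because this passage does not state the predetermined β coefficients or the baseline value Z. The signs follow the clinical risk formula as printed in this specification.

```python
from typing import Dict

# Indicator names follow the specification; the category labels used as
# dictionary keys are hypothetical and chosen only for this sketch.
HAIR = {"light_brown": "hair_2", "blonde": "hair_3", "red": "hair_4"}
NEVUS = {"none": "nevus_1", "fewer_than_20": "nevus_2",
         "20_to_50": "nevus_3", "more_than_50": "nevus_4"}
NAMES = ["hair_2", "hair_3", "hair_4", "nevus_1", "nevus_2", "nevus_3",
         "nevus_4", "fam_hist", "non_mn", "sun_2", "sun_3"]


def encode_clinical(hair: str, nevus: str, fam_hist: bool,
                    non_mn: bool, sunbed_uses: int) -> Dict[str, int]:
    """Turn questionnaire answers into 0/1 indicator variables."""
    ind = {name: 0 for name in NAMES}
    # Black/dark brown hair sets no hair indicator (not considered further).
    if hair in HAIR:
        ind[HAIR[hair]] = 1
    ind[NEVUS[nevus]] = 1
    ind["fam_hist"] = int(fam_hist)
    ind["non_mn"] = int(non_mn)
    # No sunbed use sets no sunbed indicator (not considered further).
    if 1 <= sunbed_uses <= 10:
        ind["sun_2"] = 1
    elif sunbed_uses > 10:
        ind["sun_3"] = 1
    return ind


def clin_xb(ind: Dict[str, int], z: float, b: Dict[int, float]) -> float:
    """Linear predictor; b[k] stands in for PDCEk, z for the baseline Z."""
    return (z
            + b[1] * ind["hair_2"] + b[2] * ind["hair_3"] + b[3] * ind["hair_4"]
            - b[4] * ind["nevus_1"] - b[5] * ind["nevus_2"]
            + b[6] * ind["nevus_3"] + b[7] * ind["nevus_4"]
            + b[8] * ind["fam_hist"] - b[9] * (1 - ind["fam_hist"])
            + b[10] * ind["non_mn"]
            - b[11] * ind["sun_2"] + b[12] * ind["sun_3"])


# Placeholder coefficients (hypothetical; NOT the values of the disclosure).
PLACEHOLDER_B = {k: float(k) for k in range(1, 13)}
subject = encode_clinical("red", "20_to_50", True, False, 12)
score = clin_xb(subject, z=0.0, b=PLACEHOLDER_B)
```

Keeping the encoding separate from the weighted sum means the same indicator dictionary can be reused if the coefficients are re-estimated.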

In an embodiment, the genetic risk assessment and the clinical risk assessment are combined using: where:

X is a predetermined β coefficient for the genetic risk assessment, and

Y is a predetermined coefficient for the clinical risk assessment.

In an embodiment, the subject is 18 years of age or older. In an embodiment, the subject is 40 to 69 years of age.

In an embodiment, the subject is Caucasian.

In an embodiment, the results of the risk assessment indicate that the subject should be enrolled in a melanoma screening program or subjected to more frequent melanoma screening.

In an embodiment, the method comprises determining one or more or all of the absolute 5-year risk, the absolute 10-year risk, or the absolute remaining lifetime risk (to age 90).

In another aspect, the present invention provides a method for assessing the risk of a human subject for developing melanoma comprising performing a genetic risk assessment of the subject, wherein the genetic risk assessment involves detecting, in a biological sample derived from the subject, the presence of at least one polymorphism associated with a risk of a human subject for developing melanoma, wherein the polymorphism is selected from rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

In an embodiment, the genetic risk assessment comprises detecting the presence of at least two, at least five, at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, or each of the polymorphisms selected from rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

In an embodiment, the genetic risk assessment comprises detecting the presence of each of the following polymorphisms: rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207, or a polymorphism in linkage disequilibrium with one or more thereof.

In a further aspect, the present invention provides a computer-implemented method for assessing the absolute risk of a human subject for developing melanoma, the method operable in a computing system comprising a processor and a memory, the method comprising: receiving clinical risk data and genetic risk data for the subject, wherein the clinical and genetic risk data was obtained by a method of the invention; processing the data to combine the clinical risk data with the genetic risk data to obtain the relative risk of a human subject for developing melanoma; outputting the absolute risk of a human subject for developing melanoma.

In an embodiment, the clinical risk data and genetic risk data for the subject is received from a user interface coupled to the computing system.

In an embodiment, the clinical risk data and genetic risk data for the subject is received from a remote device across a wireless communications network.

In an embodiment, outputting comprises outputting information to a user interface coupled to the computing system.

In an embodiment, the computer-implemented method comprises determining a genetic risk score based on genetic data derived from a biological sample taken from the subject.

Also provided is a computer-readable storage medium storing executable code, wherein when a processor executes the code, the processor is caused to perform a method of the invention.

In another aspect, the present invention provides a device for assessing the risk of a human subject developing melanoma, the device comprising: a processor; and a memory device storing executable code, the memory being accessible to the processor; wherein, when caused to execute the executable code stored in the memory device, the processor is caused to perform a method of the invention.

In an embodiment, the device further comprises a display component, wherein the processor is further caused to display the melanoma risk score of the subject for developing melanoma on the display component.

In an embodiment, the device further comprises a communications module, wherein the processor is further caused to communicate the melanoma risk score of the subject for developing melanoma to an external device via the communications module.

In another aspect, the present invention provides a method for determining the need for routine diagnostic testing of a human subject for melanoma comprising assessing the risk of the subject for developing melanoma using a method of the invention.

In a further aspect, the present invention provides a method of screening for melanoma in a human subject, the method comprising assessing the risk of the subject for developing melanoma using a method of the invention, and routinely screening for melanoma in the subject if they are assessed as having a risk for developing melanoma.

In another aspect, the present invention provides a method for stratifying a group of human subjects for a clinical trial of a candidate therapy, the method comprising assessing the individual risk of the subjects for developing melanoma using a method of the invention, and using the results of the assessment to select subjects more likely to be responsive to the therapy.

Any embodiment herein shall be taken to apply mutatis mutandis to any other embodiment unless specifically stated otherwise.

The present invention is not to be limited in scope by the specific embodiments described herein, which are intended for the purpose of exemplification only. Functionally equivalent products, compositions and methods are clearly within the scope of the invention, as described herein.

Throughout this specification, unless specifically stated otherwise or the context requires otherwise, reference to a single step, composition of matter, group of steps or group of compositions of matter shall be taken to encompass one and a plurality (i.e. one or more) of those steps, compositions of matter, groups of steps or group of compositions of matter.

The invention is hereinafter described by way of the following non-limiting Examples and with reference to the accompanying figures.

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS

Figure 1. Standardized incidence ratios of the number of observed melanoma cases in the first 10 years of follow-up in the testing data compared with the expected number using population incidence rates by quintile of 10-year risk.

Figure 2. Nevus density pictograms used in Australian Melanoma Family Study (eFigure 1 from Vuong et al., 2016).

Figure 3. Standardized incidence ratios of the observed number of melanoma cases in the first 10 years of follow-up compared with the number expected by the clinical risk model.

Figure 4. Distribution of 10-year melanoma risk categorized by different percentage groups.

Figure 5. Adults (n=54,798) in the test set categorized into three bins by 10-year risk scores, low (<0.5%), average (0.5-1.0%), and increased (>1.0%), show the ability of the new model (A) to better stratify the population by identifying 10-fold more adults in the increased risk category compared to the (B) model using clinical factors alone.

DETAILED DESCRIPTION OF THE INVENTION

General Techniques and Definitions

Unless specifically defined otherwise, all technical and scientific terms used herein shall be taken to have the same meaning as commonly understood by one of ordinary skill in the art (e.g., oncology, melanoma analysis, molecular genetics, biostatistics, risk assessment and clinical studies).

Unless otherwise indicated, the molecular techniques utilized in the present invention are standard procedures, well known to those skilled in the art. Such techniques are described and explained throughout the literature in sources such as, J. Perbal, A Practical Guide to Molecular Cloning, John Wiley and Sons (1984), J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbour Laboratory Press (1989), T.A. Brown (editor), Essential Molecular Biology: A Practical Approach, Volumes 1 and 2, IRL Press (1991), D.M. Glover and B.D. Hames (editors), DNA Cloning: A Practical Approach, Volumes 1-4, IRL Press (1995 and 1996), and F.M. Ausubel et al. (editors), Current Protocols in Molecular Biology, Greene Pub. Associates and Wiley-Interscience (1988, including all updates until present), Ed Harlow and David Lane (editors) Antibodies: A Laboratory Manual, Cold Spring Harbour Laboratory, (1988), and J.E. Coligan et al. (editors) Current Protocols in Immunology, John Wiley & Sons (including all updates until present).

The term “and/or”, e.g., “X and/or Y” shall be understood to mean either “X and Y” or “X or Y” and shall be taken to provide explicit support for both meanings or for either meaning. As used herein, the term “about”, unless stated to the contrary, refers to ±10%, more preferably ±5%, more preferably ±1%, of the designated value. In an embodiment, for each value discussed herein, the last or second last decimal point can be removed and the relevant number rounded up (from 5) or down (from 4).

Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

The methods of the present disclosure can be used to assess risk of a human subject developing melanoma. As used herein, the term “melanoma” refers to a type of skin cancer that develops when melanocytes start to grow out of control. Other names for this cancer include malignant melanoma and cutaneous melanoma. Most melanoma cells still make melanin, so melanoma tumors are usually brown or black. But some melanomas do not make melanin and can appear pink, tan, or even white. Melanomas can develop anywhere on the skin, but they are more likely to start on the trunk (chest and back) in men and on the legs in women. The neck and face are other common sites.

As used herein, “biological sample” refers to any sample comprising nucleic acids, especially DNA, from or derived from a human patient, e.g., bodily fluids (blood, saliva, urine etc.), biopsy, tissue, and/or waste from the patient. Thus, tissue biopsies, stool, sputum, saliva, blood, lymph, or the like can easily be screened for polymorphisms, as can essentially any tissue of interest that contains the appropriate nucleic acids. In one embodiment, the biological sample is a cheek cell sample. These samples are typically taken, following informed consent, from a patient by standard medical laboratory methods. The sample may be in a form taken directly from the patient, or may be at least partially processed (purified) to remove at least some non-nucleic acid material.

A “polymorphism” is a locus that is variable; that is, within a population, the nucleotide sequence at a polymorphism has more than one version or allele. One example of a polymorphism is a “single-nucleotide polymorphism”, which is a polymorphism at a single-nucleotide position in a genome (the nucleotide at the specified position varies between individuals or populations). Other examples include a deletion or insertion of one or more base pairs at the polymorphism locus.

As used herein, the term “SNP” or “single-nucleotide polymorphism” refers to a genetic variation between individuals; e.g., a single nitrogenous base position in the DNA of organisms that is variable. As used herein, “SNPs” is the plural of SNP. Of course, when one refers to DNA herein, such reference may include derivatives of the DNA such as amplicons, RNA transcripts thereof, etc.

The term “allele” refers to one of two or more different nucleotide sequences that occur or are encoded at a specific locus, or two or more different polypeptide sequences encoded by such a locus. For example, a first allele can occur on one chromosome, while a second allele occurs on a second homologous chromosome, e.g., as occurs for different chromosomes of a heterozygous individual, or between different homozygous or heterozygous individuals in a population. An allele “positively” correlates with a trait when it is linked to it and when presence of the allele is an indicator that the trait or trait form will occur in an individual comprising the allele. An allele “negatively” correlates with a trait when it is linked to it and when presence of the allele is an indicator that a trait or trait form will not occur in an individual comprising the allele.

A marker polymorphism or allele is “correlated”, or “associated”, with a specified phenotype (melanoma susceptibility, etc.) when it can be statistically linked (positively or negatively) to the phenotype; such an allele is also referred to herein as an “effect allele”. The non-correlated or non-associated allele can also be referred to as the “reference allele”. That is, the specified polymorphism occurs more commonly in a case population (e.g., melanoma patients) than in a control population (e.g., individuals that do not have melanoma). Methods for determining whether a polymorphism or allele is statistically linked are known to those in the art. This correlation is often inferred as being causal in nature, but it need not be; simple genetic linkage to (association with) a locus for a trait that underlies the phenotype is sufficient for correlation/association to occur.

As used herein, the phrase “predetermined coefficient” means a weighting that describes the relationship between a risk factor or a composite risk score with the outcome of interest.

The phrase “linkage disequilibrium” (LD) is used to describe the statistical correlation between two neighbouring polymorphic genotypes. Typically, LD refers to the correlation between the alleles of a random gamete at the two loci, assuming Hardy-Weinberg equilibrium (statistical independence) between gametes. LD is quantified with either Lewontin's parameter of association (D′) or with the Pearson correlation coefficient (r) (Devlin and Risch, 1995). Two loci with an LD value of 1 are said to be in complete LD. At the other extreme, two loci with an LD value of 0 are termed to be in linkage equilibrium. Linkage disequilibrium is calculated following the application of the expectation maximization algorithm for the estimation of haplotype frequencies (Slatkin and Excoffier, 1996). LD values according to the present disclosure for neighbouring genotypes/loci are selected above 0.1, preferably above 0.2, more preferably above 0.5, more preferably above 0.6, still more preferably above 0.7, preferably above 0.8, more preferably above 0.9, ideally about 1.0.
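As a rough illustration of the two LD measures named above, the sketch below computes D′ and r for a pair of biallelic loci using the standard textbook definitions. It assumes the haplotype frequency is already known (for example, from an expectation maximization estimate); the function name and argument names are hypothetical and are not taken from the specification.

```python
import math
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (|D'|, r) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    # Raw disequilibrium coefficient.
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # Pearson correlation between the alleles at the two loci.
    r = d / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r


# Two loci whose alleles always travel together: complete LD.
complete = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
# Alleles combine independently: linkage equilibrium.
independent = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5)
```

The absolute value of D′ is returned because the sign depends only on which allele at each locus is labelled A or B.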

Another way one of skill in the art can readily identify polymorphisms in linkage disequilibrium with the polymorphisms of the present disclosure is determining the LOD score for two loci. LOD stands for “logarithm of the odds”, a statistical estimate of whether two genes, or a gene and a disease gene, are likely to be located near each other on a chromosome and are therefore likely to be inherited together. A LOD score of between about 2-3 or higher is generally understood to mean that two genes are located close to each other on the chromosome. The present inventors have found that many of the polymorphisms in linkage disequilibrium with the polymorphisms of the present disclosure have a LOD score of between about 2-50. Accordingly, in an embodiment, LOD values according to the present disclosure for neighbouring genotypes/loci are selected at least above 2, at least above 3, at least above 4, at least above 5, at least above 6, at least above 7, at least above 8, at least above 9, at least above 10, at least above 20, at least above 30, at least above 40, or at least above 50.
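For orientation only, the sketch below shows the classical two-point LOD score for phase-known meioses, i.e. the base-10 logarithm of the likelihood of the data at a recombination fraction θ relative to the likelihood under free recombination (θ = 0.5). The specification does not state how its LOD values were computed, so this textbook form and the names used are assumptions made for illustration.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    recombinants: number of meioses in which the loci recombined
    meioses:      total number of informative meioses
    theta:        recombination fraction under test, 0 < theta <= 0.5
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    # log10 of theta^R * (1 - theta)^(N - R), divided by 0.5^N.
    return (recombinants * math.log10(theta)
            + non_recombinants * math.log10(1.0 - theta)
            - meioses * math.log10(0.5))


# One recombinant in ten meioses, evaluated at theta = 0.1.
example = lod_score(recombinants=1, meioses=10, theta=0.1)
```

A positive value favours linkage at the tested θ and a value of zero means the data do not distinguish it from free recombination.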

In another embodiment, polymorphisms in linkage disequilibrium with the polymorphisms of the present disclosure can have a specified genetic recombination distance of about 20 centimorgan (cM) or less. For example, 15 cM or less, 10 cM or less, 9 cM or less, 8 cM or less, 7 cM or less, 6 cM or less, 5 cM or less, 4 cM or less, 3 cM or less, 2 cM or less, 1 cM or less, 0.75 cM or less, 0.5 cM or less, 0.25 cM or less, or 0.1 cM or less. For example, two linked loci within a single chromosome segment can undergo recombination during meiosis with each other at a frequency of less than or equal to about 20%, about 19%, about 18%, about 17%, about 16%, about 15%, about 14%, about 13%, about 12%, about 11%, about 10%, about 9%, about 8%, about 7%, about 6%, about 5%, about 4%, about 3%, about 2%, about 1%, about 0.75%, about 0.5%, about 0.25%, or about 0.1% or less.

In another embodiment, polymorphisms in linkage disequilibrium with the polymorphisms of the present disclosure are within at least 100 kb (which correlates in humans to about 0.1 cM, depending on local recombination rate), at least 50 kb, at least 20 kb or less of each other.

For example, one approach for the identification of surrogate markers for a particular polymorphism involves a simple strategy that presumes that polymorphisms surrounding the target polymorphism are in linkage disequilibrium and can therefore provide information about disease susceptibility. Thus, as described herein, surrogate markers can be identified from publicly available databases, such as HAPMAP, by searching for polymorphisms fulfilling certain criteria that have been found in the scientific community to be suitable for the selection of surrogate marker candidates.

“Allele frequency”, or number of a particular allele, refers to the frequency (proportion or percentage) at which an allele is present at a locus within an individual, within a line or within a population of lines. For example, for an allele “A”, diploid individuals of genotype “AA”, “Aa” or “aa” (alternatively “AA”, “AB” or “BB”) have allele frequencies of 1.0, 0.5, or 0.0, respectively. One can estimate the allele frequency within a line or population (e.g., cases or controls) by averaging the allele frequencies of a sample of individuals from that line or population. Similarly, one can calculate the allele frequency within a population of lines by averaging the allele frequencies of lines that make up the population.
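The averaging described above can be shown with a small worked sketch. The genotype strings and function names are hypothetical and chosen only for illustration; each genotype is written as two characters, one per chromosome copy.

```python
from typing import Sequence


def individual_allele_frequency(genotype: str, allele: str = "A") -> float:
    """Fraction of an individual's allele copies that are `allele`."""
    return genotype.count(allele) / len(genotype)


def population_allele_frequency(genotypes: Sequence[str],
                                allele: str = "A") -> float:
    """Average the per-individual frequencies across a sample."""
    per_individual = [individual_allele_frequency(g, allele) for g in genotypes]
    return sum(per_individual) / len(per_individual)


# A sample of four diploid individuals.
sample = ["AA", "Aa", "aa", "Aa"]
freq_a = population_allele_frequency(sample)
```

For diploid genotypes this average equals the count of the allele divided by twice the number of individuals sampled.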

In an embodiment, the term “allele frequency” is used to define the population frequency of the allele of interest, which is known as the effect allele. The effect allele is that linked to melanoma risk, either positively or negatively.

An individual is “homozygous” if the individual has only one type of allele at a given locus (e.g., a diploid individual has a copy of the same allele at a locus for each of two homologous chromosomes). An individual is “heterozygous” if more than one allele type is present at a given locus (e.g., a diploid individual with one copy each of two different alleles). The term “homogeneity” indicates that members of a group have the same genotype at one or more specific loci. In contrast, the term “heterogeneity” is used to indicate that individuals within the group differ in genotype at one or more specific loci.

A “locus” is a chromosomal position or region. For example, a polymorphic locus is a position or region where a polymorphic nucleic acid, trait determinant, gene or marker is located. In a further example, a “gene locus” is a specific chromosome location (region) in the genome of a species where a specific gene can be found.

A “marker”, “molecular marker” or “marker nucleic acid” refers to a nucleotide sequence or encoded product thereof (e.g., a protein) used as a point of reference when identifying a locus or a linked locus. A marker can be derived from genomic nucleotide sequence or from expressed nucleotide sequences (e.g., from an RNA, nRNA, mRNA, a cDNA, etc.), or from an encoded polypeptide. The term also refers to nucleic acid sequences complementary to or flanking the marker sequences, such as nucleic acids used as probes or primer pairs capable of amplifying the marker sequence. A “marker probe” is a nucleic acid sequence or molecule that can be used to identify the presence of a marker locus, e.g., a nucleic acid probe that is complementary to a marker locus sequence. Nucleic acids are “complementary” when they specifically hybridize in solution, e.g., according to Watson-Crick base pairing rules. A “marker locus” is a locus that can be used to track the presence of a second linked locus, e.g., a linked or correlated locus that encodes or contributes to the population variation of a phenotypic trait. For example, a marker locus can be used to monitor segregation of alleles at a locus, such as a QTL, that are genetically or physically linked to the marker locus. Thus, a “marker allele,” alternatively an “allele of a marker locus” is one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population that is polymorphic for the marker locus. Each of the identified markers is expected to be in close physical and genetic proximity (resulting in physical and/or genetic linkage) to a genetic element, e.g., a QTL that contributes to the relevant phenotype. Markers corresponding to genetic polymorphisms between members of a population can be detected by methods well established in the art. 
These include, e.g., DNA sequencing, PCR-based sequence specific amplification methods, detection of restriction fragment length polymorphisms (RFLP), detection of isozyme markers, detection of allele specific hybridization (ASH), detection of single-nucleotide extension, detection of amplified variable sequences of the genome, detection of self-sustained sequence replication, detection of simple sequence repeats (SSRs), detection of single-nucleotide polymorphisms (SNPs), or detection of amplified fragment length polymorphisms (AFLPs).

The term “amplifying” in the context of nucleic acid amplification is any process whereby additional copies of a selected nucleic acid (or a transcribed form thereof) are produced. Typical amplification methods include various polymerase based replication methods, including the polymerase chain reaction (PCR), ligase mediated methods such as the ligase chain reaction (LCR) and RNA polymerase based amplification (e.g., by transcription) methods.

An “amplicon” is an amplified nucleic acid, e.g., a nucleic acid that is produced by amplifying a template nucleic acid by any available amplification method (e.g., PCR, LCR, transcription, or the like).

A “gene” is one or more sequence(s) of nucleotides in a genome that together encode one or more expressed molecules, e.g., an RNA, or polypeptide. The gene can include coding sequences that are transcribed into RNA, which may then be translated into a polypeptide sequence, and can include associated structural or regulatory sequences that aid in replication or expression of the gene. A “genotype” is the genetic constitution of an individual (or group of individuals) at one or more genetic loci. Genotype is defined by the allele(s) of one or more known loci of the individual, typically, the compilation of alleles inherited from its parents.

A “haplotype” is the genotype of an individual at a plurality of genetic loci on a single DNA strand. Typically, the genetic loci described by a haplotype are physically and genetically linked, i.e., on the same chromosome strand.

A “set” of markers, probes or primers refers to a collection or group of markers probes, primers, or the data derived therefrom, used for a common purpose, e.g., identifying an individual with a specified genotype (e.g., risk of developing melanoma). Frequently, data corresponding to the markers, probes or primers, or derived from their use, is stored in an electronic medium. While each of the members of a set possess utility with respect to the specified purpose, individual markers selected from the set as well as subsets including some, but not all of the markers, are also effective in achieving the specified purpose.

The polymorphisms and genes, and corresponding marker probes, amplicons or primers described above can be embodied in any system herein, either in the form of physical nucleic acids, or in the form of system instructions that include sequence information for the nucleic acids. For example, the system can include primers or amplicons corresponding to (or that amplify a portion of) a gene or polymorphism described herein. As in the methods above, the set of marker probes or primers optionally detects a plurality of polymorphisms in a plurality of said genes or genetic loci. Thus, for example, the set of marker probes or primers detects at least one polymorphism in each of these polymorphisms or genes, or any other polymorphism, gene or locus defined herein. Any such probe or primer can include a nucleotide sequence of any such polymorphism or gene, or a complementary nucleic acid thereof, or a transcribed product thereof (e.g., a nRNA or mRNA form produced from a genomic sequence, e.g., by transcription or splicing).

As used herein, “risk assessment” refers to a process by which a subject’s risk of developing melanoma can be assessed. A risk assessment will typically involve obtaining information relevant to the subject’s risk of developing melanoma, assessing that information, and quantifying the subject’s risk of developing melanoma, for example, by producing a risk score.

As used herein, the terms “routinely screening for melanoma” and “more frequent screening” are relative terms, and are based on a comparison to the level of screening recommended to a subject who has no identified risk of developing melanoma. Skilled clinicians can readily determine suitable frequencies based on their knowledge of the field. Examples of melanoma screening methods include, but are not limited to, visual assessment by eye by a skilled practitioner such as a skin specialist, epiluminescence microscopy, or dermoscopy.

Clinical Risk Factors

Clinical information can be self-reported by the subject. For example, the subject may complete a questionnaire designed to obtain clinical information regarding the clinical factors. In another example, subject to obtaining informed consent from the subject, clinical information can be obtained from medical records by interrogating a relevant database or hard copy documents comprising the clinical information.

In an embodiment, the clinical risk assessment involves obtaining information from the subject on one or more of the following: hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density, age, sex, sunlight exposure, history of sunburn, skin colour, immune system status (for example, immunocompromised) and history of sunbed use.

In an embodiment, the clinical risk assessment involves obtaining information from the subject on one or more of the following: hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density and history of sunbed use.

In an embodiment, the clinical risk assessment involves obtaining information from the subject on each of hair colour, family history of melanoma, personal history of non-melanoma skin cancer, nevus density and history of sunbed use.

In an embodiment, the clinical risk assessment involves determining the hair colour of the subject. This relates to the natural hair colour of the subject. In an embodiment, the hair colour is as at 18 years of age. In an embodiment, the hair colour is selected from black/dark brown, light brown, blonde and red. In an embodiment, the hair colour is self-reported.

In an embodiment, the clinical risk assessment involves determining the sunbed use of the subject. This may involve providing different bands of use such as never, 1 to 10 times or more than 10 times.

In an embodiment, the clinical risk assessment involves determining the family history of melanoma. “Family history of melanoma” or variations thereof is used in the context of the present disclosure to refer to the history of melanoma among the subject’s first- and/or second-degree relatives. For example, “family history of melanoma” can be used to refer to the history of melanoma amongst only first-degree relatives. Put another way, the clinical risk assessment procedure can take into consideration the subject’s family history of melanoma amongst first-degree relatives. In the context of the present disclosure, a “first-degree relative” is a family member who shares about 50 percent of their genes with the subject. Examples of first-degree relatives include parents, offspring, and full siblings. A “second-degree relative” is a family member who shares about 25 percent of their genes with the subject. Examples of second-degree relatives include aunts, nieces, grandparents, grandchildren, and half-siblings.

In another embodiment, the subject’s family history of melanoma is based on the subject’s first-degree relatives and second-degree relatives.

In another embodiment, the subject’s family history of melanoma is based on the subject’s first-degree relatives.

In an embodiment, the clinical risk assessment involves determining the subject’s personal history of non-melanoma skin cancer. The skilled person is well aware of non-melanoma skin cancers, which include basal cell carcinoma (BCC), squamous cell carcinoma (SCC), angiosarcoma, cutaneous B-cell lymphoma, cutaneous T-cell lymphoma, dermatofibrosarcoma protuberans, sebaceous carcinoma and Merkel cell carcinoma (MCC).

In an embodiment, the clinical risk assessment involves determining the subject’s nevus density. As the skilled person would appreciate, a nevus is a benign growth on the skin that is formed by a cluster of melanocytes. A nevus, often referred to as a mole, is usually dark and may be raised from the skin. In an embodiment, nevus density refers to the number of nevi the subject has. In an embodiment, nevus density refers to at least one body part such as one or more or all of upper body, lower body, legs, arms, trunk, face and neck. This may involve providing different bands of nevus density such as ‘none’, ‘few’ (such as fewer than 20), ‘some’ (such as 20 to 50) and ‘many’ (such as more than 50), generally as described by Vuong et al. (2016) (see also Figure 2).

In a preferred embodiment, the clinical risk assessment is determined using the formula:

clin_xb = Z + (PDCE1 × hair_2) + (PDCE2 × hair_3) + (PDCE3 × hair_4) + (-PDCE4 × nevus_1) + (-PDCE5 × nevus_2) + (PDCE6 × nevus_3) + (PDCE7 × nevus_4) + (PDCE8 × fam_hist) + (-PDCE9 × (1 - fam_hist)) + (PDCE10 × non_mn) + (-PDCE11 × sun_2) + (PDCE12 × sun_3)

where:

Z is the baseline risk value, from which the individual’s risk changes according to the β coefficients and the values of the individual’s risk factors (the equation adds and subtracts risk from the baseline value according to the risk factors an individual has),

PDCE1 is a predetermined β coefficient for light brown hair colour at age 18 years,

PDCE2 is a predetermined β coefficient for blonde hair colour at age 18 years,

PDCE3 is a predetermined β coefficient for red hair colour at age 18 years,

PDCE4 is a predetermined β coefficient for a nevus density of none,

PDCE5 is a predetermined β coefficient for a nevus density of fewer than 20,

PDCE6 is a predetermined β coefficient for a nevus density of 20 to 50,

PDCE7 is a predetermined β coefficient for a nevus density of more than 50,

PDCE8 is a predetermined β coefficient if the subject has a first-degree family history of melanoma,

PDCE9 is a predetermined β coefficient if the subject does not have a first-degree family history of melanoma,

PDCE10 is a predetermined β coefficient for a subject having a personal history of non-melanoma skin cancer,

PDCE11 is a predetermined β coefficient for a subject having used a sunbed 1 to 10 times in their lifetime,

PDCE12 is a predetermined β coefficient for a subject having used a sunbed more than 10 times in their lifetime,

hair_2 indicates whether the subject had light brown hair colour at age 18 years,

hair_3 indicates whether the subject had blonde hair colour at age 18 years,

hair_4 indicates whether the subject had red hair colour at age 18 years,

nevus_1 indicates whether the subject has a nevus density of none,

nevus_2 indicates whether the subject has a nevus density of fewer than 20,

nevus_3 indicates whether the subject has a nevus density of 20 to 50,

nevus_4 indicates whether the subject has a nevus density of more than 50,

fam_hist indicates whether the subject has a first-degree family history of melanoma,

non_mn indicates whether the subject has a personal history of non-melanoma skin cancer,

sun_2 indicates whether the subject has used a sunbed 1 to 10 times, and

sun_3 indicates whether the subject has used a sunbed more than 10 times.

In an embodiment, for hair_2 the subject is assigned a value of 1 if they had light brown hair colour at age 18 years, and assigned a value of 0 if not. In an embodiment, for hair_3 the subject is assigned a value of 1 if they had blonde hair colour at age 18 years, and assigned a value of 0 if not.

In an embodiment, for hair_4 the subject is assigned a value of 1 if they had red hair colour at age 18 years, and assigned a value of 0 if not.

In an embodiment, if the subject had black or dark brown hair colour at age 18 years, hair colour is not considered further in the clinical risk assessment.

In an embodiment, for nevus_1 the subject is assigned a value of 1 if they have a nevus density of none, and assigned a value of 0 if not.

In an embodiment, for nevus_2 the subject is assigned a value of 1 if they have a nevus density of fewer than 20, and assigned a value of 0 if not.

In an embodiment, for nevus_3 the subject is assigned a value of 1 if they have a nevus density of 20 to 50, and assigned a value of 0 if not.

In an embodiment, for nevus_4 the subject is assigned a value of 1 if they have a nevus density of more than 50, and assigned a value of 0 if not.

In an embodiment, for fam_hist the subject is assigned a value of 1 if they have a first-degree family history of melanoma, and assigned a value of 0 if not.

In an embodiment, for non_mn the subject is assigned a value of 1 if they have a personal history of non-melanoma skin cancer, and assigned a value of 0 if not.

In an embodiment, for sun_2 the subject is assigned a value of 1 if they have used a sunbed 1 to 10 times, and assigned a value of 0 if not.

In an embodiment, for sun_3 the subject is assigned a value of 1 if they have used a sunbed more than 10 times, and assigned a value of 0 if not.

In an embodiment, if the subject has not used a sunbed, sunbed use is not considered further in the clinical risk assessment.

In an embodiment, Z is between -0.01 and -0.21.

In an embodiment, Z is about -0.1.

In an embodiment, Z is -0.11.

In an embodiment, one or more or all of the following apply: a) PDCE1 is between 0.12 and 0.32, b) PDCE2 is between 0.71 and 1.11, c) PDCE3 is between 1.26 and 1.66, d) PDCE4 is between -0.69 and -1.09, e) PDCE5 is between -0.47 and -0.67, f) PDCE6 is between 0.24 and 0.44, g) PDCE7 is between 0.57 and 0.97, h) PDCE8 is between 0.51 and 0.71, i) PDCE9 is between 0 and -0.14, j) PDCE10 is between 0.96 and 1.36, k) PDCE11 is between 0 and -0.15, and l) PDCE12 is between 0.36 and 0.56.

In an embodiment, one or more or all of the following apply; a) PDCE1 is about 0.2, b) PDCE2 is about 0.9, c) PDCE3 is about 1.45, d) PDCE4 is about -0.9, e) PDCE5 is about -0.55, f) PDCE6 is about 0.35, g) PDCE7 is about 0.75, h) PDCE8 is about 0.6, i) PDCE9 is about -0.05, j) PDCE10 is about 1.15, k) PDCE11 is about -0.05, and l) PDCE12 is about 0.45.

In an embodiment, one or more or all of the following apply; a) PDCE1 is 0.22, b) PDCE2 is 0.91, c) PDCE3 is 1.46, d) PDCE4 is -0.89, e) PDCE5 is -0.57, f) PDCE6 is 0.34, g) PDCE7 is 0.77, h) PDCE8 is 0.61, i) PDCE9 is -0.04, j) PDCE10 is 1.16, k) PDCE11 is -0.05, and l) PDCE12 is 0.46.
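By way of illustration only, the clinical score can be sketched in Python using the point values given above (Z of -0.11 and the listed values of PDCE1 to PDCE12). The sketch assumes that each coefficient is added with the sign of its listed value, that is, that the minus signs shown in the formula are already reflected in the negative coefficient values. That reading, together with the function name and the category labels, is an assumption made for this example and is not taken from the disclosure.

```python
# Illustrative sketch only. The sign convention (listed signed values are
# added directly) and all names below are assumptions, not part of the disclosure.
Z = -0.11  # baseline value

# Reference categories (black/dark brown hair, no sunbed use) contribute 0.
HAIR = {"black_dark_brown": 0.0, "light_brown": 0.22, "blonde": 0.91, "red": 1.46}
NEVUS = {"none": -0.89, "few": -0.57, "some": 0.34, "many": 0.77}
# PDCE8 applies when fam_hist is 1, PDCE9 when fam_hist is 0.
FAMILY_HISTORY = {True: 0.61, False: -0.04}
NON_MELANOMA_SKIN_CANCER = {True: 1.16, False: 0.0}
SUNBED = {"never": 0.0, "1_to_10": -0.05, "more_than_10": 0.46}


def clinical_score(hair, nevus, family_history, non_melanoma, sunbed):
    """Return clin_xb as the baseline plus one coefficient per risk factor."""
    return (
        Z
        + HAIR[hair]
        + NEVUS[nevus]
        + FAMILY_HISTORY[family_history]
        + NON_MELANOMA_SKIN_CANCER[non_melanoma]
        + SUNBED[sunbed]
    )


# Red hair, more than 50 nevi, first-degree family history, no prior
# non-melanoma skin cancer, never used a sunbed.
example = clinical_score("red", "many", True, False, "never")
print(round(example, 2))  # 2.73
```

Using one lookup table per risk factor mirrors the indicator variables of the formula, since exactly one category of each factor takes the value 1 for a given subject.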

In an embodiment, for each value discussed herein, the last or second-last decimal place can be removed and the relevant number rounded up (from 5) or down (from 4).

In an embodiment, the subject does not have, or has not had, melanoma.

In another embodiment, performing the clinical risk assessment uses a model that calculates the absolute risk of developing melanoma. For example, the absolute risk of developing melanoma can be calculated using age-, sex- and ethnicity-specific cancer incidence rates.

Genetic Risk Factors

Various exemplary polymorphisms associated with melanoma are discussed in the present disclosure. These polymorphisms vary in terms of penetrance and many would be understood by those of skill in the art to be low penetrance polymorphisms.

The term “penetrance” is used in the context of the present disclosure to refer to the frequency at which a particular polymorphism manifests itself within subjects with melanoma. “High penetrance” polymorphisms will often be apparent in a subject with melanoma (such as those with an odds ratio greater than 1.5 or greater than 2) while “low penetrance” polymorphisms will only sometimes be apparent in a subject with melanoma (such as those with an odds ratio less than 1.5). In an embodiment, polymorphisms assessed as part of a genetic risk assessment according to the present disclosure are low penetrance polymorphisms.

In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 2 or more loci for polymorphisms associated with melanoma. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 5 or more loci for polymorphisms associated with melanoma. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 10 or more loci for polymorphisms associated with melanoma. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 20 or more loci for polymorphisms associated with melanoma. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 30 or more loci for polymorphisms associated with melanoma. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 50 or more loci for polymorphisms associated with melanoma.

In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 2 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 5 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 10 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 20 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 30 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 35 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 40 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 50 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof. 
In an embodiment, the genetic risk assessment is performed by analysing the genotype of the subject at 60 or more loci for polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof.

In an embodiment, the genetic risk assessment comprises detecting the presence of each of the polymorphisms provided in Table 1, or a polymorphism in linkage disequilibrium with one or more thereof.

In an embodiment, the genetic risk assessment comprises detecting the presence of each of the polymorphisms provided in Table 1, namely rs77637424, rs11204750, rs905938, rs3219090, rs61712781, rs1800440, rs35667974, rs10931936, rs974032, rs149617956, rs10936599, rs13098877, rs3135872, rs7726159, rs401681, rs117518215, rs4256371, rs6881722, rs32579, rs57354745, rs7739578, rs13216160, rs7382061, rs847394, rs117132860, rs2286173, rs9298191, rs1384810, rs494668, rs10809802, rs4636294, rs55797833, rs1011970, rs10739221, rs12380282, rs2487999, rs10832555, rs2046494, rs1126809, rs1801516, rs12290699, rs1684387, rs4257028, rs61937385, rs1800407, rs150962800, rs62034139, rs7196626, rs2967350, rs17176204, rs9935216, rs34509231, rs117586265, rs62054258, rs11547464, rs1805007, rs117204628, rs74548542, rs117417690, rs62052682, rs62074125, rs910873, rs71325459, rs79981941, rs6122147, rs45430, rs4608623 and rs79966207.

In an embodiment, the genetic risk assessment comprises detecting the presence of 65 of the polymorphisms provided in Table 1.

In an embodiment, the genetic risk assessment comprises detecting the presence of 64 of the polymorphisms provided in Table 1.

Table 1. Polymorphisms associated with risk of developing melanoma.

In an embodiment, the genetic risk assessment comprises detecting the presence of 63 of the polymorphisms provided in Table 1.

In an embodiment, the genetic risk assessment comprises detecting the presence of 62 of the polymorphisms provided in Table 1.

In an embodiment, the genetic risk assessment comprises detecting the presence of 61 of the polymorphisms provided in Table 1. In an embodiment, the genetic risk assessment comprises detecting the presence of 60 of the polymorphisms provided in Table 1.

Polymorphisms in linkage disequilibrium with those specifically mentioned herein are easily identified by those of skill in the art, such as using the HAPMAP database as described in WO 2016/049694.

Ethnic Genotype Variation

In an embodiment, the methods of the present disclosure can be used for assessing the risk for developing melanoma in human subjects from various ethnic backgrounds. For example, the subject can be classified as Caucasoid, Australoid, Mongoloid and Negroid based on physical anthropology. In particular, the inventors have found that the model can be used for Caucasians, African ancestry (including African Americans), East Asian ancestry and Hispanic ancestry.

In an embodiment, the subject is Caucasian.

In an embodiment, the subject is African.

In an embodiment, the subject is East Asian. In an embodiment, the East Asian subject is from China, Japan, South Korea, North Korea, Taiwan, Hong Kong, Mongolia or Macao.

In an embodiment, the subject is Hispanic.

It is well known that over time there has been blending of different ethnic origins. However, in practice this does not influence the ability of a skilled person to practice the invention.

A subject of predominantly European origin, either direct or indirect through ancestry, with white skin is considered Caucasian in the context of the present disclosure. A Caucasian may have, for example, at least 75% Caucasian ancestry (for example, but not limited to, the subject having at least three Caucasian grandparents).

A subject of predominantly central or southern African origin, either direct or indirect through ancestry, is considered Negroid in the context of the present disclosure. A Negroid may have, for example, at least 75% Negroid ancestry. An American subject with predominantly Negroid ancestry and black skin is considered African American in the context of the present disclosure. An African American may have, for example, at least 75% Negroid ancestry. A similar principle applies to, for example, subjects of Negroid ancestry living in other countries (for example Great Britain, Canada and The Netherlands).

A subject predominantly originating from Spain or a Spanish-speaking country, such as a country of northern, central or southern America, either direct or indirect through ancestry, is considered Hispanic in the context of the present disclosure. A Hispanic may have, for example, at least 75% Hispanic ancestry.

The terms “ethnicity” and “race” can be used interchangeably in the context of the present disclosure. In an embodiment, the genetic risk assessment can readily be practiced based on what ethnicity the subject considers themselves to be. Thus, in an embodiment, the ethnicity of the human subject is self-reported by the subject. As an example, the subject can be asked to identify their ethnicity in response to this question: “To what ethnic group do you belong?” In another example, the ethnicity of the subject is derived from medical records after obtaining the appropriate consent from the subject or from the opinion or observations of a clinician.

Genetic Risk Score

In an embodiment, the genetic risk assessment involves determining a polygenic risk score for the subject (also referred to herein as PRS or “genetic risk score”). An individual’s PRS can be defined as the weighted sum of the individual’s genotypes at multiple genetic loci. In other words, it is a linear combination of the risk alleles across a set of candidate polymorphisms.

In one embodiment, the key steps to construct a polygenic risk score (PRS) are to determine which polymorphisms to include and how to weight their effects. In one embodiment, the maxCT and SCT methods (Privé et al., 2019) are used. These methods are based on clumping and thresholding. The aim of clumping and thresholding is to remove correlated polymorphisms while keeping the most important polymorphisms in the PRS. To do this, the values of a range of hyperparameters including the correlation threshold (r²), the clumping window size (kb) and the p-value significance threshold (p) are decided. A different selection of values for these hyperparameters would in general give a different selection of polymorphisms to include. For the weights, reported GWAS coefficients (i.e., regression coefficients or log odds ratios) from external published GWAS can be used. The core idea of the maxCT and SCT procedures is to select a set of different values for each of the hyperparameters and compute a PRS for each combination of these values.

This will usually create a large number of PRSs, for example, around 100,000 vectors of PRSs for a typical GWAS. After constructing these PRSs, there are two approaches to create the final PRS. In maxCT, the PRS that has the strongest predictive performance (e.g., largest AUC) is selected as the final PRS. In SCT, the PRSs are combined using a penalized logistic regression model, for example, the popular lasso procedure. Since the outcome of SCT will be a linear combination of PRSs, where each PRS is again a linear combination of variants, the final PRS still has the form of the equation below, which means the effect sizes of polymorphisms can be obtained and used for prediction.
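The clumping-and-thresholding search with maxCT selection described above can be sketched in Python as follows. This is a simplified illustration and does not reproduce the implementation of Privé et al.: the function names, the input layout (lists indexed by variant, and a pairwise r² matrix) and the hyperparameter grid are assumptions made for this example, and the SCT stacking step is omitted.

```python
from itertools import product


def clump_and_threshold(p_values, betas, positions_bp, r2, r2_max, window_kb, p_max):
    """Return {variant index: weight} for one hyperparameter combination."""
    kept = []
    # Visit variants from most to least significant.
    for j in sorted(range(len(p_values)), key=lambda idx: p_values[idx]):
        if p_values[j] > p_max:
            break  # thresholding: all remaining variants are less significant
        correlated = any(
            abs(positions_bp[j] - positions_bp[k]) <= window_kb * 1000
            and r2[j][k] > r2_max
            for k in kept
        )
        if not correlated:
            kept.append(j)  # clumping: keep only the lead variant of each clump
    return {j: betas[j] for j in kept}


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


def max_ct(genotypes, labels, p_values, betas, positions_bp, r2, grid):
    """Score every grid combination and return (AUC, hyperparameters, weights) of the best."""
    best = None
    for r2_max, window_kb, p_max in product(*grid):
        weights = clump_and_threshold(
            p_values, betas, positions_bp, r2, r2_max, window_kb, p_max
        )
        # One PRS per individual: weighted sum of effect-allele counts.
        scores = [sum(w * g[j] for j, w in weights.items()) for g in genotypes]
        candidate = (auc(scores, labels), (r2_max, window_kb, p_max), weights)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best
```

In practice the AUC used for selection would be estimated on data held out from the GWAS that supplied the weights, so that the chosen PRS is not overfitted.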

In a preferred embodiment, the PRS is determined using the formula:

$\mathrm{PRS}_i = \sum_{j} \beta_j x_{ij}$

where: $\beta_j$ is the effect size (log of the odds ratio) of polymorphism j, and $x_{ij}$ is the count (0, 1, 2) of the effect alleles of polymorphism j for individual i.
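As a minimal sketch, the weighted sum of effect sizes and effect-allele counts can be computed as shown below. The variant names and effect sizes are hypothetical placeholders rather than values from Table 1, and skipping a variant whose genotype is missing is an assumption made for this example.

```python
def polygenic_risk_score(effect_sizes, allele_counts):
    """Sum effect size times effect-allele count over all scored polymorphisms.

    effect_sizes:  {variant id: log odds ratio per effect allele}
    allele_counts: {variant id: 0, 1 or 2 effect alleles for this individual}
    """
    score = 0.0
    for variant, beta in effect_sizes.items():
        count = allele_counts.get(variant)
        if count is None:
            continue  # assumption: a missing genotype contributes nothing
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {variant}: {count}")
        score += beta * count
    return score


# Hypothetical effect sizes and genotypes, for illustration only.
effect_sizes = {"variant_1": 0.10, "variant_2": -0.05, "variant_3": 0.20}
allele_counts = {"variant_1": 2, "variant_2": 1, "variant_3": 0}
print(round(polygenic_risk_score(effect_sizes, allele_counts), 2))  # 0.15
```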

In an alternate embodiment, a log-additive risk model can be used to define three genotypes AA, AB, and BB for a single polymorphism having relative risk values of 1, OR, and OR², under a rare disease model, where OR is the previously reported disease odds ratio for the effect allele, B, vs the reference allele, A. If the B allele has frequency (p), then these genotypes have population frequencies of (1 - p)², 2p(1 - p), and p², assuming Hardy-Weinberg equilibrium. The relative risk values for each polymorphism can then be scaled so that based on these frequencies the average relative risk in the population is 1 (Mealiffe et al., 2010). Specifically, the unscaled population average relative risk for each SNP is

$\mu = (1 - p)^2 + 2p(1 - p)\,\mathrm{OR} + p^2\,\mathrm{OR}^2$

Adjusted risk values 1/μ, OR/μ, and OR²/μ are used for the AA, AB, and BB genotypes, respectively. Missing genotypes are assigned an adjusted risk of 1. The final PRS is obtained by multiplying the adjusted risk values for each SNP.
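The log-additive model above can be sketched in Python as follows. Each genotype’s relative risk is divided by the population average relative risk (called `mu` in the code), computed from the Hardy-Weinberg genotype frequencies, and the adjusted values are multiplied across SNPs. The function names, the input layout and the example odds ratio and allele frequency are hypothetical and chosen only for illustration.

```python
def adjusted_risks(odds_ratio, effect_allele_freq):
    """Return scaled relative risks keyed by the number of effect (B) alleles."""
    p = effect_allele_freq
    # Unscaled population average relative risk under Hardy-Weinberg equilibrium.
    mu = (1 - p) ** 2 + 2 * p * (1 - p) * odds_ratio + p ** 2 * odds_ratio ** 2
    return {0: 1 / mu, 1: odds_ratio / mu, 2: odds_ratio ** 2 / mu}


def multiplicative_prs(snps, genotypes):
    """Multiply adjusted risks over SNPs; a missing genotype contributes a factor of 1.

    snps:      {snp id: (odds ratio, effect allele frequency)}
    genotypes: {snp id: 0, 1 or 2 effect alleles}
    """
    score = 1.0
    for snp, (odds_ratio, freq) in snps.items():
        count = genotypes.get(snp)
        if count is None:
            continue  # missing genotypes are assigned an adjusted risk of 1
        score *= adjusted_risks(odds_ratio, freq)[count]
    return score


# Check that the scaling works for a hypothetical SNP (OR 1.2, frequency 0.3):
# the frequency-weighted average of the adjusted risks should equal 1.
risks = adjusted_risks(1.2, 0.3)
average = 0.49 * risks[0] + 0.42 * risks[1] + 0.09 * risks[2]
print(abs(average - 1.0) < 1e-12)  # True
```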

Similar calculations can be performed for non-SNP polymorphisms.

It is envisaged that the risk of a human subject for developing melanoma can be provided as a relative risk or an absolute risk as required. In an embodiment, as with the above example, the genetic risk assessment obtains the absolute risk of a human subject for developing melanoma. Absolute risk is the numerical probability of a human subject developing melanoma within a specified period (e.g. 5, 10, 15, 20 or more years or remaining lifetime).

In an alternate embodiment, the genetic risk assessment obtains the relative risk of a human subject for developing melanoma. Relative risk, measured as the incidence of a disease in individuals with a particular characteristic (or exposure) divided by the incidence of the disease in individuals without the characteristic, indicates whether that particular exposure increases or decreases risk. Relative risk is helpful to identify characteristics that are associated with a disease, but by itself is not particularly helpful in guiding screening decisions because the frequency of the risk (incidence) is cancelled out.

An alternate method for calculating the composite PRS is described in Mavaddat et al. (2015). In this example, the following formula is used:

$\mathrm{PRS} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_K x_K + \dots + \beta_n x_n$

where $\beta_K$ is the per-allele log odds ratio (OR) for melanoma associated with the minor allele for polymorphism K, $x_K$ is the number of alleles for the same polymorphism (0, 1 or 2), n is the total number of polymorphisms and PRS is the polygenic risk score.

Melanoma Cancer Risk Assessment

As the skilled person would be aware, in view of the teachings of the present disclosure, a variety of different formulae could be produced to provide a risk score.

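In an embodiment, the polygenic risk score is produced using a formula as described above, preferably using $\mathrm{PRS}_i = \sum_{j} \beta_j x_{ij}$, where $\beta_j$ is the effect size (log of the odds ratio) of polymorphism j and $x_{ij}$ is the count (0, 1, 2) of the effect alleles of polymorphism j for individual i.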

In an embodiment, the clinical risk assessment is determined as described above using the formula:

clin_xb = Z + (PDCE1 × hair_2) + (PDCE2 × hair_3) + (PDCE3 × hair_4) + (-PDCE4 × nevus_1) + (-PDCE5 × nevus_2) + (PDCE6 × nevus_3) + (PDCE7 × nevus_4) + (PDCE8 × fam_hist) + (-PDCE9 × (1 - fam_hist)) + (PDCE10 × non_mn) + (-PDCE11 × sun_2) + (PDCE12 × sun_3).

In an embodiment, the clinical and genetic risk assessments are combined by determining:

mel_risk_xb = (X × prs) + (Y × clin_xb)

mel_risk = e^(mel_risk_xb)

where:

X is a predetermined β coefficient for the genetic risk assessment, and

Y is a predetermined β coefficient for the clinical risk assessment.

In an embodiment, X is between 0.541 and 0.941.

In an embodiment, X is about 0.74.

In an embodiment, X is 0.741.

In an embodiment, Y is between 0.119 and 0.519.

In an embodiment, Y is about 0.32.

In an embodiment, Y is 0.319.
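A minimal Python sketch of combining the two assessments is given below, using the point values of X (0.741) and Y (0.319) stated above. It assumes that the combined linear predictor is exponentiated to give the risk score; that reading, and the function name, are assumptions made for this example.

```python
import math

X = 0.741  # weight of the genetic risk assessment (prs)
Y = 0.319  # weight of the clinical risk assessment (clin_xb)


def combined_melanoma_risk(prs, clin_xb):
    """Combine genetic and clinical scores on the log scale, then exponentiate."""
    mel_risk_xb = (X * prs) + (Y * clin_xb)
    # Assumption: the risk score is the exponential of the linear predictor.
    return math.exp(mel_risk_xb)


# A subject whose two scores are both 0 sits at a risk score of exactly 1.
print(combined_melanoma_risk(0.0, 0.0))  # 1.0
```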

The subject’s results can be one or more or all of their absolute 5-year risk, absolute 10-year risk and absolute remaining lifetime risk up to age 90 years, which can be calculated as defined below. In an embodiment, the subject’s results are their absolute 5-year risk.

In an embodiment, for each individual (aged b years), the most recent sex-specific and country-specific population incidence data is used to determine the population incidence from birth to age b years (incid_b), to age b + 10 years (incid_b_10) and full lifetime from birth to age 90 years (incid_full_life).

Cumulative risks

Absolute remaining lifetime risk (to age 90 years)

Absolute full-lifetime risk (to age 90 years)

In an embodiment, the top quintile of scores obtained using the method have about a 2.3 times greater chance of having melanoma than the population risk.

In an embodiment, the bottom quintile of scores obtained using the method have about a 0.67 times less chance of having melanoma than the population risk.

In an embodiment, the method has an area under the receiver operating characteristic curve (AUC) of at least about 0.6, such as about 0.634. In an embodiment, one or more threshold value(s) are set for determining a particular action such as the need for routine diagnostic testing/screening, preventative therapy or preventative surgery. For example, a score determined using a method of the invention is compared to a pre-determined threshold, and if the score is higher than the threshold a recommendation is made to take the pre-determined action. Methods of setting such thresholds have now become widely used in the art and are described in, for example, US 20140018258.

Marker Detection Strategies

Amplification primers for amplifying markers (e.g., marker loci) and suitable probes to detect such markers or to genotype a sample with respect to multiple marker alleles, can be used in the disclosure. For example, primer selection for long-range PCR is described in US 10/042,406 and US 10/236,480; for short-range PCR, US 10/341,832 provides guidance with respect to primer selection. Also, there are publicly available programs such as Oligo available for primer design. With such available primer selection and design software, the publicly available human genome sequence and the polymorphism locations, one of skill can construct primers to amplify the polymorphisms to practice the disclosure. Further, it will be appreciated that the precise probe to be used for detection of a nucleic acid comprising a polymorphism (e.g., an amplicon comprising the polymorphism) can vary, e.g., any probe that can identify the region of a marker amplicon to be detected can be used in conjunction with the present disclosure. Further, the configuration of the detection probes can, of course, vary. Thus, the disclosure is not limited to the sequences recited herein.

Indeed, it will be appreciated that amplification is not a requirement for marker detection; for example, one can directly detect unamplified genomic DNA simply by performing a Southern blot on a sample of genomic DNA.

Typically, molecular markers are detected by any established method available in the art, including, without limitation, ASH, detection of extension, array hybridization (optionally including ASH), or other methods for detecting polymorphisms, AFLP detection, amplified variable sequence detection, randomly amplified polymorphic DNA (RAPD) detection, RFLP detection, self-sustained sequence replication detection, SSR detection, and single-strand conformation polymorphisms (SSCP) detection.

As the skilled person will appreciate, the sequence of the genomic region to which these oligonucleotides hybridize can be used to design primers which are longer at the 5’ and/or 3’ end, possibly shorter at the 5’ and/or 3’ (as long as the truncated version can still be used for amplification), which have one or a few nucleotide differences (but nonetheless can still be used for amplification), or which share no sequence similarity with those provided but which are designed based on genomic sequences close to where the specifically provided oligonucleotides hybridize and which can still be used for amplification.

In some embodiments, the primers are radiolabelled, or labelled by any suitable means (e.g., using a non-radioactive fluorescent tag), to allow for rapid visualization of differently sized amplicons following an amplification reaction without any additional labelling step or visualization step. In some embodiments, the primers are not labelled, and the amplicons are visualized following their size resolution, e.g., following agarose or acrylamide gel electrophoresis. In some embodiments, ethidium bromide staining of the PCR amplicons following size resolution allows visualization of the different size amplicons.

It is not intended that the primers be limited to generating an amplicon of any particular size. For example, the primers used to amplify the marker loci and alleles herein are not limited to amplifying the entire region of the relevant locus, or any subregion thereof. The primers can generate an amplicon of any suitable length for detection. In some embodiments, marker amplification produces an amplicon at least 20 nucleotides in length, or alternatively, at least 50 nucleotides in length, or alternatively, at least 100 nucleotides in length, or alternatively, at least 200 nucleotides in length. Amplicons of any size can be detected using the various technologies described herein. Differences in base composition or size can be detected by conventional methods such as electrophoresis.

Some techniques for detecting genetic markers utilize hybridization of a probe nucleic acid to nucleic acids corresponding to the genetic marker (e.g., amplified nucleic acids produced using genomic DNA as a template). Hybridization formats, including, but not limited to: solution phase, solid phase, mixed phase, or in situ hybridization assays are useful for allele detection. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes Elsevier, New York, as well as in Sambrook et al. (supra).

PCR detection using dual-labelled fluorogenic oligonucleotide probes, commonly referred to as "TaqMan™" probes, can also be performed according to the present disclosure. These probes are composed of short (e.g., 20-25 base) oligodeoxynucleotides that are labelled with two different fluorescent dyes. On the 5' terminus of each probe is a reporter dye, and on the 3' terminus of each probe a quenching dye is found. The oligonucleotide probe sequence is complementary to an internal target sequence present in a PCR amplicon. When the probe is intact, energy transfer occurs between the two fluorophores and emission from the reporter is quenched by the quencher by FRET. During the extension phase of PCR, the probe is cleaved by 5' nuclease activity of the polymerase used in the reaction, thereby releasing the reporter from the oligonucleotide-quencher and producing an increase in reporter emission intensity. Accordingly, TaqMan™ probes are oligonucleotides that have a label and a quencher, where the label is released during amplification by the exonuclease action of the polymerase used in amplification. This provides a real time measure of amplification during synthesis. A variety of TaqMan™ reagents are commercially available, e.g., from Applied Biosystems (Division Headquarters in Foster City, Calif.) as well as from a variety of specialty vendors such as Biosearch Technologies (e.g., black hole quencher probes). Further details regarding dual-label probe strategies can be found, e.g., in WO 92/02638.

Other similar methods include e.g. fluorescence resonance energy transfer between two adjacently hybridized probes, e.g., using the “LightCycler®” format described in US 6,174,670.

Array-based detection can be performed using commercially available arrays, e.g., from Affymetrix (Santa Clara, Calif.) or other manufacturers. Array-based detection is one preferred method for identification of markers of the disclosure in samples, due to the inherently high-throughput nature of array-based detection.

The nucleic acid sample to be analysed is isolated, amplified and, typically, labelled with biotin and/or a fluorescent reporter group. The labelled nucleic acid sample is then incubated with the array using a fluidics station and hybridization oven. The array can be washed and/or stained or counter-stained, as appropriate to the detection method. After hybridization, washing and staining, the array is inserted into a scanner, where patterns of hybridization are detected. The hybridization data are collected as light emitted from the fluorescent reporter groups already incorporated into the labelled nucleic acid, which is now bound to the probe array. Probes that most clearly match the labelled nucleic acid produce stronger signals than those that have mismatches. Since the sequence and position of each probe on the array are known, by complementarity, the identity of the nucleic acid sample applied to the probe array can be identified.

Markers and polymorphisms can also be detected using DNA sequencing. DNA sequencing methods are well known in the art and can be found for example in Ausubel et al, eds., Short Protocols in Molecular Biology, 3rd ed., Wiley, (1995) and Sambrook et al, Molecular Cloning, 2nd ed., Chap. 13, Cold Spring Harbor Laboratory Press, (1989). Sequencing can be carried out by any suitable method, for example, dideoxy sequencing, chemical sequencing, or variations thereof.

Suitable sequencing methods also include Second Generation, Third Generation, or Fourth Generation sequencing technologies, all referred to herein as “next generation sequencing”, including, but not limited to, pyrosequencing, sequencing-by-ligation, single molecule sequencing, sequence-by-synthesis (SBS), massive parallel clonal, massive parallel single molecule SBS, massive parallel single molecule real-time, massive parallel single molecule real-time nanopore technology, etc. A review of some such technologies can be found in (Morozova and Marra, 2008), herein incorporated by reference. Accordingly, in some embodiments, performing a genetic risk assessment as described herein involves detecting the at least two polymorphisms by DNA sequencing. In an embodiment, the at least two polymorphisms are detected by next generation sequencing.

Next generation sequencing (NGS) methods share the common feature of massively parallel, high-throughput strategies, with the goal of lower costs in comparison to older sequencing methods (see, for example, Voelkerding et al., 2009; MacLean et al., 2009).

Computer-Implemented Method

It is envisaged that the methods of the present disclosure may be implemented by a system, for example as a computer-implemented method. For example, the system may be a computer system comprising one or a plurality of processors which may operate together (referred to for convenience as “processor”) connected to a memory. The memory may be a non-transitory computer-readable medium, such as a hard drive, a solid-state disk, CD-ROM or the cloud. Software, that is executable instructions or program code, such as program code grouped into code modules, may be stored on the memory, and may, when executed by the processor, cause the computer system to perform functions such as: determining that a task is to be performed to assist a user to determine the risk of a human subject for developing melanoma; receiving data relating to one or more clinical factors as discussed herein; receiving data relating to the genetic risk assessment, wherein the genetic risk was derived by detecting at least two polymorphisms known to be associated with melanoma; processing the data to obtain the risk of a human subject for developing melanoma; and outputting the risk of a human subject for developing melanoma. For example, the memory may comprise program code which, when executed by the processor, causes the system to: determine at least two polymorphisms known to be associated with melanoma; process the data to combine clinical and genetic risk assessments to obtain the risk of a human subject for developing melanoma; and report the risk of a human subject for developing melanoma.

In another embodiment, the system may be coupled to a user interface to enable the system to receive information from a user and/or to output or display information. For example, the user interface may comprise a graphical user interface, a voice user interface or a touchscreen.

In an embodiment, the system may be configured to communicate with at least one remote device or server across a communications network such as a wireless communications network. For example, the system may be configured to receive information from the device or server across the communications network and to transmit information to the same or a different device or server across the communications network. In other embodiments, the system may be isolated from direct user interaction.

In another embodiment, the diagnostic or prognostic rule is based on the application of a statistical and machine learning algorithm. Such an algorithm uses relationships between a population of polymorphisms and disease status observed in training data (with known disease status) to infer relationships which are then used to determine the risk of a human subject for developing melanoma in subjects with an unknown risk. An algorithm is employed which provides a risk of a human subject developing melanoma. The algorithm performs a multivariate or univariate analysis function.

EXAMPLES

Example 1 - Materials and Methods

Ethics Approval

The UK Biobank has Research Tissue Bank approval (REC #11/NW/0382) that covers analysis of data by approved researchers. All participants provided written informed consent to the UK Biobank before data collection began. This research has been conducted using the UK Biobank resource under Application Number 47401.

Participants

The inventors used the UK Biobank data (Sudlow et al., 2015; Bycroft et al., 2018) to develop the risk prediction model for melanoma. The UK Biobank is an epidemiological cohort of over 500,000 participants recruited from across the United Kingdom between 2006 and 2010. Melanoma status was defined by self-reported skin cancer (UK Biobank data field 20001 with code 1003) or from linked cancer registry data. Participants with no melanoma identified in either the self-reported or the cancer registry data were considered to be unaffected. The inventors excluded UK Biobank participants aged less than 40 or greater than 69 years, non-Caucasian participants, prevalent melanoma cases, and participants with less than 6 weeks of follow-up time. The inventors also excluded related individuals with greater than 3rd-degree relatedness and individuals with missing clinical risk factors (hair colour, personal history of non-melanoma skin cancer and number of lifetime sunbed sessions). Table 2 shows the number of individuals remaining after each eligibility step. There were 365,326 individuals after filtering for eligibility, with 2,134 incident melanoma cases and 363,192 controls. There were 196,961 (54%) female and 168,365 (46%) male participants.

Table 2. Number of individuals after each step of eligibility criteria.

Action                                                      Number of individuals remaining
Remove individuals with age < 40 or age > 69                [illegible]
Remove non-Caucasian                                        407,184
Remove prevalent cases                                      403,723
Remove people with less than 6 weeks follow-up time         [illegible]
Remove related individuals with > 3rd-degree relatedness    [illegible]
Remove if clinical risk factors are not available           365,326

PRS Training Data

After filtering for eligibility, 365,326 individuals remained for the study. The inventors reserved 70% of the data (size = 255,729) for building the PRS for melanoma. Not all of this 70% was used to build the PRS: the inventors deliberately created a subset for developing the PRS in which age and sex were controlled by design. The procedure to create the PRS training dataset was as follows. First, the inventors computed the quintiles of age using the melanoma cases and divided the individuals into five age groups: {[40, 52], (52, 59], (59, 62], (62, 66], (66, 69]}. Next, the inventors further divided the individuals by sex, giving 10 groups in total. Then, for each of these 10 groups, the inventors sampled 10 controls per case. This sampling procedure ensured that the case-control ratio was the same across all 10 groups defined by age and sex. The PRS training data comprised 16,434 individuals, with 1,494 cases and 14,940 controls. The numbers of cases and controls in each age and sex group of the PRS training data are summarized in Table 3.
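The stratified sampling described above can be sketched as follows. This is a minimal illustration, not the inventors' code: the record layout (dictionaries with the keys `age`, `sex` and `case`), the function names and the random seed are all assumptions made for the example; only the age-group boundaries and the 10:1 control-to-case ratio come from the text.

```python
import random
from collections import defaultdict

# Upper bounds of the age groups given in the text:
# [40, 52], (52, 59], (59, 62], (62, 66], (66, 69].
AGE_UPPER_BOUNDS = (52, 59, 62, 66, 69)


def age_group(age):
    """Return the index (0-4) of the age group containing `age`."""
    if age < 40:
        raise ValueError(f"age {age} is below the eligibility range")
    for index, upper in enumerate(AGE_UPPER_BOUNDS):
        if age <= upper:
            return index
    raise ValueError(f"age {age} is above the eligibility range")


def sample_training_set(people, controls_per_case=10, seed=0):
    """Keep every case and draw a fixed number of controls per case in each stratum.

    `people` is a list of dicts with the hypothetical keys 'age', 'sex', 'case'.
    A stratum is one (age group, sex) combination, giving 10 strata in total.
    """
    rng = random.Random(seed)
    cases = defaultdict(list)
    controls = defaultdict(list)
    for person in people:
        stratum = (age_group(person["age"]), person["sex"])
        (cases if person["case"] else controls)[stratum].append(person)

    selected = []
    for stratum, stratum_cases in cases.items():
        n_controls = controls_per_case * len(stratum_cases)
        selected.extend(stratum_cases)
        # random.sample raises ValueError if the stratum has too few controls.
        selected.extend(rng.sample(controls[stratum], n_controls))
    return selected
```

Because every stratum contributes exactly ten controls per case, the case-control ratio is identical across strata, which is the property the text relies on to control for age and sex by design.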

Table 3. Sample sizes for each age and sex group in the PRS training data.

             Female (N = 8,140)                     Male (N = 8,294)
Age group    Case, N = 740    Control, N = 7,400    Case, N = 754    Control, N = 7,540
[40, 52]     194 (26%)        1,940 (26%)           107 (14%)        1,070 (14%)
(52, 59]     180 (24%)        1,800 (24%)           170 (23%)        1,700 (23%)
(59, 62]     124 (17%)        1,240 (17%)           162 (21%)        1,620 (21%)
(62, 66]     145 (20%)        1,450 (20%)           187 (25%)        1,870 (25%)
(66, 69]     97 (13%)         970 (13%)             128 (17%)        1,280 (17%)

Genetic and Clinical Model Development and Testing Data

The inventors used the remaining 30% of the eligible participants (size = 109,597) to perform a cohort analysis. The inventors limited follow-up to 10 years. To develop the final combined genetic and clinical risk model, the cohort data were divided into two halves: development and testing. The inventors used the first half of the cohort data to estimate coefficients for the PRS and the clinical risk score using Cox regression with age as the time axis. In the second half of the cohort data, the inventors assessed the performance of the risk score, using Cox regression to estimate the hazard ratio (HR) per standard deviation (SD). The inventors computed Harrell’s C-index to assess model discrimination. To examine calibration, the inventors computed the standardized incidence ratio (SIR) of the number of melanoma cases observed in the first 10 years of follow-up compared with the number of cases predicted by the 10-year risk score, overall and by quintile of 10-year risk. Lastly, the inventors refitted the model to refine the estimates using the whole cohort data and computed the SIRs of the number of observed melanoma cases compared with the number predicted by population incidence rates, overall and by quintile of 10-year risk.
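For readers unfamiliar with Harrell’s C-index, the following is a minimal pairwise sketch of the statistic for right-censored data. It is an illustration under simplifying assumptions rather than the analysis performed here: it ignores the left truncation implied by using age as the time axis, it skips pairs with tied times, and the function and argument names are hypothetical.

```python
def harrell_c_index(times, events, scores):
    """Harrell's C for right-censored data, computed over all usable pairs.

    times  : follow-up time for each subject
    events : 1 if the event was observed at times[i], 0 if censored
    scores : risk score, where a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each usable pair is anchored on a subject with an event
        for j in range(n):
            if times[i] < times[j]:  # subject i had the event while j was still at risk
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0  # earlier event has the higher score
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of earlier events above later ones. The double loop is quadratic in the number of subjects, so a cohort of the size used here would in practice call for an established survival analysis package.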

Polygenic Risk Score

A polygenic risk score (PRS) is defined as the weighted sum of risk-allele counts

$$\mathrm{PRS}_i = \sum_{j=1}^{p} \beta_j G_{ij}$$

where $\beta_j$ is the weight for SNP $j$, $G_{ij}$ is the count (0, 1, 2) of the effect alleles of SNP $j$ for individual $i$, and $p$ is the number of SNPs in the PRS.
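The weighted sum defined above can be computed for one individual as in the sketch below. The dictionary-based data layout, the function name and the SNP identifiers used in the usage note are placeholders chosen for illustration; they are not the SNPs or weights of the disclosed PRS.

```python
def polygenic_risk_score(weights, allele_counts):
    """Return the weighted sum of effect-allele counts for one individual.

    weights       : dict mapping a SNP identifier to its weight (beta_j)
    allele_counts : dict mapping a SNP identifier to the effect-allele count (0, 1 or 2)
    """
    score = 0.0
    for snp, beta in weights.items():
        count = allele_counts[snp]  # KeyError if a required genotype is missing
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count {count!r} for {snp}")
        score += beta * count
    return score
```

For example, with the made-up weights `{"snp_a": 0.5, "snp_b": -0.25}` and counts `{"snp_a": 2, "snp_b": 1}`, the score is 0.5 × 2 + (−0.25) × 1 = 0.75.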

The core of developing a PRS is to decide which SNPs should be used and what effect sizes are assigned to them. In this study, the inventors only considered SNPs from the UK Biobank Axiom Array data (Bycroft et al., 2018). The inventors obtained the GWAS effect sizes of SNPs from the summary statistics provided by the GenoMEL consortium (the Melanoma Genetics Consortium; http://www.genomel.org), with UK Biobank samples removed. For quality control, the inventors removed SNPs with a minor allele frequency less than 10^-3, a genotyping rate less than 95% or a Hardy-Weinberg equilibrium p-value less than 10^-50. The inventors also removed ambiguous SNPs and duplicate variants with the same physical position or refSNP cluster ID number.

To create a PRS for melanoma risk prediction, the inventors used the maximum clumping and thresholding method (Prive et al., 2019). The inventors selected seven correlation thresholds (0.01, 0.05, 0.10, 0.20, 0.50, 0.80, 0.95), four base clumping window sizes (50, 100, 200, 500; in kb) and 50 p-value significance thresholds evenly spaced on a log-log scale. The actual clumping window size used was computed as the base clumping window size divided by the correlation threshold. The standard clumping and thresholding method was then applied to each combination of hyperparameter values to generate 1,400 (7 x 4 x 50) risk scores. The best risk score, which maximized the area under the receiver operating characteristic curve (AUC) on the PRS training data, was chosen as the PRS. The R package bigsnpr (Prive et al., 2018), version 1.8.1, was used to run the maximum clumping and thresholding procedure.
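The size and shape of the hyperparameter grid described above can be illustrated with the sketch below. It only enumerates the grid; it does not perform clumping or thresholding and does not call bigsnpr. The correlation thresholds and base window sizes are those listed in the text. The bounds of the p-value thresholds are not given in the text, so `min_neglog10_p` and `max_neglog10_p` are placeholder values, and spacing the thresholds evenly in the logarithm of −log10(p) is one reading of "log-log scale".

```python
from itertools import product

R2_THRESHOLDS = (0.01, 0.05, 0.10, 0.20, 0.50, 0.80, 0.95)
BASE_WINDOWS_KB = (50, 100, 200, 500)
N_P_THRESHOLDS = 50


def log_spaced(lower, upper, n):
    """Return n values from lower to upper that are evenly spaced on a log scale."""
    ratio = (upper / lower) ** (1.0 / (n - 1))
    return [lower * ratio**k for k in range(n)]


def hyperparameter_grid(min_neglog10_p=0.1, max_neglog10_p=50.0):
    """Enumerate every (r2, window, p-value threshold) combination.

    The p-value thresholds are expressed as -log10(p); the default bounds
    are placeholders, not values taken from the disclosure.
    """
    thresholds = log_spaced(min_neglog10_p, max_neglog10_p, N_P_THRESHOLDS)
    grid = []
    for r2, base_kb, neglog10_p in product(R2_THRESHOLDS, BASE_WINDOWS_KB, thresholds):
        grid.append({
            "r2": r2,
            "window_kb": base_kb / r2,  # actual window = base window / correlation threshold
            "neglog10_p": neglog10_p,
        })
    return grid
```

Each grid entry corresponds to one candidate risk score; the entry whose score gives the highest AUC on the training data would then be retained. Dividing the base window by the correlation threshold means that stricter thresholds (smaller r2) are paired with wider clumping windows.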

Clinical Risk Score

The clinical risk score for melanoma was obtained from an Australian-based study (Vuong et al., 2016). The clinical risk score originally included hair colour, nevus density, first-degree family history of melanoma, history of non-melanoma skin cancer and number of lifetime sunbed sessions. Because nevus density and first-degree family history of melanoma are not available in UK Biobank, the inventors only used the other three risk factors. Hair colour was classified as black/brown, light brown, blonde and red. Lifetime sunbed use was classified into three groups: none, 1-10 and >10. Because UK Biobank only provides the frequency of solarium or sunlamp use per year instead of lifetime sunbed use, the inventors used a simple conversion to estimate lifetime sunbed use: a frequency greater than 6 uses per year was assigned to the >10 lifetime group, and a frequency of 1 to 5 uses per year was assigned to the 1-10 lifetime group. The clinical risk score is a linear combination of these three risk factors.

Table 4 shows the corresponding log odds ratios (beta coefficients) and the distribution of clinical risk factors in the testing data.

Table 4. Distribution of clinical risk factors in the testing data and the beta coefficients.

Risk factor                  β coefficient    Case, N = 613    Control, N = 108,984
Hair colour
  Black/Brown                0                205 (33.4%)      45,954 (42.2%)
  Light brown                0.22             251 (40.9%)      45,436 (41.7%)
  Blonde                     0.91             111 (18.1%)      12,672 (11.6%)
  Red                        1.46             46 (7.5%)        4,922 (4.5%)
Lifetime sunbed use
  None                       0                558 (91.0%)      98,681 (90.5%)
  1-10                       -0.05            40 (6.5%)        7,397 (6.8%)
  >10                        0.46             15 (2.4%)        2,906 (2.7%)
Non-melanoma skin cancer
  No                         0                582 (94.9%)      106,772 (98.0%)
  Yes                        1.16             31 (5.1%)        2,212 (2.0%)

Example 2 - Results

Genetic Risk Factors

A total of 68 SNPs were identified by the maximum clumping and thresholding procedure. The top two SNPs, ranked by odds ratio, were rs149617956 and rs1805007. Both of these SNPs are found in moderate risk genes for melanoma (Yu et al., 2018). rs149617956 is found in the MITF gene, which was previously found to be associated with nevus counts and melanoma development (Yokoyama et al., 2011; Potrony et al., 2016). rs1805007 is found in the MC1R gene that codes for red hair colour, which is a strong risk factor for melanoma (Raimondi et al., 2008; Nan et al., 2011).

The PRS for the 68 SNPs by itself had an AUC of 0.634 (95% CI = 0.618, 0.661) on the cohort data. The standardized PRS (with mean 0 and standard deviation 1) had a median of 0.508 for the cases and -0.036 for the controls. The clinical risk score by itself had an AUC of 0.572 (95% CI = 0.549, 0.595). The 10-year risk score, which combined the PRS and the clinical risk score, had an AUC of 0.663 (95% CI = 0.641, 0.684). However, the standardized incidence ratio analysis showed that the combined risk score was not well calibrated, so the inventors divided the testing data into two halves and used the first half to recalibrate the model. The second half of the data was used to assess the model performance after recalibration.

Performance of Combined 10-Year Risk Model

The inventors used the second half of the cohort data (306 cases and 54,492 controls) to test the performance of the combined model. The HR per SD was 1.332 (95% CI = 1.263, 1.406; P <0.001). The Harrell’s C-index for the 10-year risk score was 0.685 (95% CI = 0.654, 0.715). As a comparison, the Harrell’s C-index was 0.629 (95% CI = 0.596, 0.661) for the clinical risk score only and 0.676 (95% CI = 0.645, 0.706) for the PRS only. In terms of overall calibration of the combined model, the SIR was 1.193 (95% CI = 1.067, 1.335): the model predicted about 50 fewer cases than were observed (306 observed cases vs 256.47 expected cases). When stratified by quintile of risk, the model was well calibrated except in the highest quintile, where it underestimated risk, with 140 observed and 94.71 expected cases. The SIR values are presented in Table 5.
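The SIR figures quoted above can be reproduced from the observed and expected counts with a short calculation. The sketch below assumes a Wald-type interval on the log scale, with the standard error of log(SIR) approximated by 1/sqrt(observed); the text does not state which interval method was used, so this choice and the function name are assumptions.

```python
import math


def standardized_incidence_ratio(observed, expected, z=1.96):
    """Return (SIR, lower, upper) with a Wald interval on the log scale.

    The observed count is treated as Poisson and the expected count as
    fixed, so the standard error of log(SIR) is about 1 / sqrt(observed).
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    sir = observed / expected
    half_width = z / math.sqrt(observed)
    return sir, sir * math.exp(-half_width), sir * math.exp(half_width)
```

With 306 observed and 256.47 expected cases, this gives an SIR of about 1.193 with an interval of roughly 1.067 to 1.335, which agrees with the overall figures quoted in the text. An interval that excludes 1 on the upper side, as here, indicates that the model predicted fewer cases than were observed.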

Final Model

Because the association and discrimination performance of the model was similar in both halves of the testing data, the inventors refined the model estimates by refitting the model using the full testing data and computed the 10-year melanoma risk for all individuals. The inventors estimated the coefficients of the PRS and the clinical risk score using a Cox regression. The beta coefficients were found to be 0.319 for the clinical risk score and 0.741 for the PRS. To demonstrate the performance of the model, the inventors computed the SIRs of the number of observed cases compared with the number predicted by age-, sex- and calendar year-specific population incidence rates, overall and by quintile of 10-year risk. They are presented in Figure 1 and Table 6.

Table 5. Standardized incidence ratios of the number of melanoma cases observed in the first 10 years of follow-up in the second half of the testing data compared with the expected number using 10-year risk.

                    Observed    Expected    Standardized incidence ratio    95% confidence interval
Overall             306         256.47      1.193                           [1.067, 1.335]
Quintile of risk
  1                 22          23.23       0.947                           [0.624, 1.439]
  2                 32          34.69       0.922                           [0.652, 1.304]
  3                 54          44.95       1.201                           [0.920, 1.568]
  4                 58          58.88       0.985                           [0.762, 1.274]
  5                 140         94.71       1.478                           [1.253, 1.745]

Table 6. Standardized incidence ratios of the number of melanoma cases observed in the first 10 years of follow-up in the whole testing dataset compared with the expected number using population incidence rates, overall and by quintile of 10-year risk of melanoma.

                    Observed    Expected    Standardized incidence ratio    95% confidence interval
Overall             613         459.95      1.333                           [1.231, 1.443]
Quintile of risk
  1                 48          72.04       0.666                           [0.502, 0.884]
  2                 78          85.97       0.907                           [0.727, 1.133]
  3                 95          94.32       1.007                           [0.824, 1.232]
  4                 144         100.60      1.431                           [1.216, 1.685]
  5                 248         107.02      2.317                           [2.046, 2.625]

Clinical Model

clin_xb = -0.11 + (0.22 * hair_2) + (0.91 * hair_3) + (1.46 * hair_4) + (-0.89 * nevus_1) + (-0.57 * nevus_2) + (0.34 * nevus_3) + (0.77 * nevus_4) + (0.61 * fam_hist) + (-0.04 * (1 - fam_hist)) + (1.16 * non_mn) + (-0.05 * sun_2) + (0.46 * sun_3)

where:

hair_2 = 1 if hair colour is light brown; 0 otherwise
hair_3 = 1 if hair colour is blonde; 0 otherwise
hair_4 = 1 if hair colour is red; 0 otherwise
nevus_1 = 1 if nevus density is ‘none’; 0 otherwise
nevus_2 = 1 if nevus density is ‘few’ (less than 20); 0 otherwise
nevus_3 = 1 if nevus density is ‘some’ (20 to 50); 0 otherwise
nevus_4 = 1 if nevus density is ‘many’ (more than 50); 0 otherwise
fam_hist = 1 if having first-degree family history of melanoma; 0 otherwise
non_mn = 1 if having a personal history of non-melanoma skin cancer; 0 otherwise
sun_2 = 1 if the number of lifetime sunbed use is 1-10; 0 otherwise
sun_3 = 1 if the number of lifetime sunbed use is more than 10; 0 otherwise

Hair colour refers to the self-assessed natural hair colour at age 18 years.

Nevus density is based on 4-level pictograms used in the Australian Melanoma Family Study (Figure 2) (Vuong et al., 2016).
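The clinical model equation above can be evaluated as in the following sketch. The coefficients are those stated in the equation; the function name, the argument names and the category labels used as dictionary keys are hypothetical and chosen only for readability.

```python
# Coefficients from the clinical model equation; reference levels are 0.
HAIR = {"black/brown": 0.0, "light brown": 0.22, "blonde": 0.91, "red": 1.46}
NEVUS = {"none": -0.89, "few": -0.57, "some": 0.34, "many": 0.77}
SUNBED = {"none": 0.0, "1-10": -0.05, ">10": 0.46}
INTERCEPT = -0.11


def clinical_linear_predictor(hair, nevus, family_history, nmsc, sunbed):
    """Return the clinical linear predictor for one individual.

    hair, nevus, sunbed : category labels (keys of the dictionaries above)
    family_history      : True if first-degree family history of melanoma
    nmsc                : True if personal history of non-melanoma skin cancer
    """
    xb = INTERCEPT
    xb += HAIR[hair]
    xb += NEVUS[nevus]
    # 0.61 * fam_hist + (-0.04) * (1 - fam_hist) reduces to one of two constants.
    xb += 0.61 if family_history else -0.04
    xb += 1.16 if nmsc else 0.0
    xb += SUNBED[sunbed]
    return xb
```

Looking up each category in a dictionary is equivalent to the indicator-variable form of the equation, because exactly one indicator per risk factor equals 1 for any individual. An unrecognised category label raises a `KeyError` rather than silently contributing zero.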

Polygenic Risk Score

PRS is defined as the weighted sum of risk-allele counts of SNPs:

$$\mathrm{PRS}_i = \sum_{j=1}^{p} \beta_j G_{ij}$$

where $\beta_j$ is the effect size (weight; log of the odds ratio) of SNP $j$, as given in Table 1, $G_{ij}$ is the count (0, 1, 2) of the effect alleles of SNP $j$ for individual $i$, and $p$ is the number of SNPs in the PRS.

The PRS was developed according to the maximum clumping and thresholding (maxCT) procedure of Prive et al. (2019).

Combined Model

Overall, the number of observed melanoma cases was higher than the number expected from population incidence rates (613 vs 459.95). When stratified by quintile of risk, the risk prediction score identifies individuals at high risk. Individuals in the highest quintile (top 20%) were at 2.3 times population risk, and those in the second highest quintile were at 1.4 times population risk.

On the other hand, the risk prediction score can also identify individuals who are at low risk of melanoma. Individuals in the lowest quintile were at about 0.67 times population average risk. This is an important distinction from, and improvement over, SIRs based on clinical risk only, where there is a much smaller difference between the five quintiles (Figure 3 and Table 7).

Figure 4 shows the distribution of the 10-year risk categorized by different percentage groups to emphasize the fold-difference between affected and unaffected adults within each risk category.

In these data, the average adult aged 40-69 years has a 10-year risk score of 0.486%. With age- and sex-dependent incidence rates taken into consideration, the model can identify 17.78% of adults who have at least a two-fold increase in risk, and 1.29% who have a four-fold increase in risk, compared to population average risk. Furthermore, by categorizing adults using a basic two-fold risk threshold, the inventors show the significant clinical value the model has over the standard clinical risk factors alone. The inventors classified adults into 3 risk categories, low, average and high, based on 10-year risk scores of <0.5%, 0.5-1.0% and >1.0%, respectively. The model stratified the population better (Figure 5a) than the clinical model alone (Figure 5b). Importantly, the model identifies 10 times as many adults in the high category (>1% 10-year risk) compared to the clinical model alone; this high risk category represents adults at twice the population average risk. Conversely, among the 60% of the general population that the model identifies as being at lower-than-average risk (<0.5% 10-year risk), the standardized incidence ratio of melanoma was well below that obtained with the clinical model alone (SIR = 0.86 (95% CI = 0.7158, 1.0333) vs SIR = 1.1705 (95% CI = 1.0306, 1.3295), respectively).
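The three-way classification described above can be written as a small helper. This is a sketch: the function names are hypothetical, the risk is assumed to be expressed in percent, and assigning a value of exactly 1.0% to the average category follows from the stated ranges (0.5-1.0% versus >1.0%).

```python
from collections import Counter


def risk_category(ten_year_risk_percent):
    """Classify a 10-year risk (in percent) as 'low', 'average' or 'high'."""
    if ten_year_risk_percent < 0:
        raise ValueError("risk cannot be negative")
    if ten_year_risk_percent < 0.5:
        return "low"
    if ten_year_risk_percent <= 1.0:
        return "average"
    return "high"


def category_proportions(risks_percent):
    """Return the share of individuals falling in each category."""
    counts = Counter(risk_category(risk) for risk in risks_percent)
    total = len(risks_percent)
    return {name: counts.get(name, 0) / total for name in ("low", "average", "high")}
```

Applying `category_proportions` to the risks from two different models gives a direct way to compare how many individuals each model places in the high category.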

Table 7. Standardized incidence ratios of the observed number of melanoma in the first 10-year follow-up compared with the expected number estimated by the clinical risk model.

                    Observed    Expected    Standardized incidence ratio    95% confidence interval
Overall             306         190.28      1.6081                          [1.438, 1.799]
Quintile of risk
  1                 28          23.62       1.1856                          [0.819, 1.717]
  2                 49          32.57       1.5044                          [1.137, 1.991]
  3                 54          36.45       1.4814                          [1.135, 1.934]
  4                 67          41.64       1.6089                          [1.266, 2.044]
  5                 108         56.00       1.9286                          [1.597, 2.329]

The present application claims priority from AU 2022903017 filed 14 October 2022, the entire contents of which are incorporated herein by reference.

It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

All publications discussed and/or referenced herein are incorporated herein in their entirety. Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.

REFERENCES

American Cancer Society. Accessed October 2022. www.cancer.org/cancer/melanoma-skin-cancer/about/key-statistics.html.

Bibbins-Domingo et al. (2016) Journal of the American Medical Association 316: 429-435.

Bray et al. (2018) CA Cancer J Clin. 68: 394-424.

Brown et al. (2022) J Gen Intern Med. 37: 2267-2279.

Bycroft et al. (2018) Nature 562: 203-209.

Corrie et al. (2014) Br Med Bull. 111: 149-162.

Datzmann et al. (2022) Br J Dermatol. 186: 69-77.

Davis et al. (2019) Cancer Biology and Therapy. Taylor and Francis Inc.; pp. 1366-1379.

Devlin and Risch (1995) Genomics. 29: 311-322.

Ford et al. (1995) Int J Cancer 62: 377-381.

Frank et al. (2015) Sci Rep. 5: 12891.

Gandini et al. (2005a) Eur J Cancer 41: 45-60.

Gandini et al. (2005b) Eur J Cancer 41: 2040-2059.

Grob et al. (1990) Cancer 66: 387-395.

Harkemanne et al. (2011) BMJ Open. 11: e043926.

MacLean et al. (2009) Nature Rev. Microbiol. 7: 287-296.

Mavaddat et al. (2015) J Natl Cancer Inst 107: doi: 10.1093/jnci/djv036.

Mealiffe et al. (2010) Journal of the National Cancer Institute 102: 1618-1627.

Morozova and Marra (2008) Genomics 92: 255.

Mucci et al. (2016) Journal of the American Medical Association 315: 68-76.

Najmi et al. (2022) Arch Dermatol Res. 314: 329-340.

Nan et al. (2011) Hum Mol Genet. 20: 3718-3724.

Oliveria et al. (2011) Arch Dermatol. 147: 39-44.

Olsen et al. (2019) J Invest Dermatol. 139: 665-672.

Potrony et al. (2016) JAMA Dermatol. 152: 405-412.

Prive et al. (2018) Bioinformatics. 34: 2781-2787.

Prive et al. (2019) Am J Hum Genet. 105: 1213-1221.

Raimondi et al. (2008) Int J Cancer 122: 2753-2760.

Robinson et al. (2018) J Gen Intern Med. 33: 855-862.

Slatkin and Excoffier (1996) Heredity 76: 377-383.

Sudlow et al. (2015) PLoS Med. 12: 1001779.

Swetter et al. (2017) JAMA Dermatol. 153: 797-801.

Tai-Seale (2007) Health Serv Res. 42: 1871-1894.

Truderung et al. (2021) Int J Mol Epidemiol Genet. 12: 71-89.

Usher-Smith et al. (2014) Risk prediction models for melanoma: A systematic review. Cancer Epidemiology Biomarkers and Prevention. American Association for Cancer Research Inc., pp. 1450-1463.

Voelkerding et al. (2009) Clinical Chem. 55: 641-658.

Vuong et al. (2016) JAMA Dermatol. 152: 889-896.

Wei (2019) J Am Acad Dermatol. 81: 489-499.

Weinstock et al. (2016) Cancer 122: 3152-3156.

Yokoyama et al. (2011) Nature 480: 99-103.

Yu et al. (2018) Biochim Biophys Acta Mol Basis Dis. 1864: 2247-2254.