


Title:
A CAUSAL REASONING SYSTEM FOR OPERATIONAL TWIN (CAROT) FOR DEVELOPMENT AND OPERATION OF 5G CNFS
Document Type and Number:
WIPO Patent Application WO/2024/085872
Kind Code:
A1
Abstract:
An apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more cloud native network functions; generate observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

Inventors:
LEE DENNY LUNG SUN (CA)
TYRODE DANIEL (GB)
JARVA MIKKO KAUKO JOHANNES (US)
KARABUDAK UMUT (FI)
SAKKO ARTO (FI)
Application Number:
PCT/US2022/047205
Publication Date:
April 25, 2024
Filing Date:
October 20, 2022
Assignee:
NOKIA SOLUTIONS & NETWORKS OY (FI)
NOKIA AMERICA CORP (US)
International Classes:
G06F11/07; G06F8/30; G06F8/60; G06F11/30; G06F11/34; G06F30/20; H04L41/0631; H04L41/14; H04L41/16; H04L43/00
Foreign References:
US20200371857A12020-11-26
US11468348B12022-10-11
Claims:
CLAIMS

What is claimed is:

1. An apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more cloud native network functions; generate observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.
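
The core loop of claim 1 — operate a replicate, generate observational data across a plurality of operating conditions, then apply causal reasoning over that data — can be pictured with a minimal Python sketch. Everything here is hypothetical illustration: `replicate_cnf` stands in for a deployed CNF replicate (in practice a digital twin, not a function), and the final step is only a naive mean-difference estimate, not a full causal reasoning function.

```python
import random

def replicate_cnf(network_load: float, cpu_stress: float) -> float:
    """Hypothetical stand-in for the CNF replicate: returns an observed
    latency (an effect) as a function of two operating conditions."""
    return 10.0 + 4.0 * network_load + 2.0 * cpu_stress + random.gauss(0, 0.1)

random.seed(0)  # deterministic, for the sake of the example

# Generate observational data under a plurality of operating conditions.
observations = []
for _ in range(200):
    load, stress = random.random(), random.random()
    observations.append({"network_load": load,
                         "cpu_stress": stress,
                         "latency": replicate_cnf(load, stress)})

# Naive causal-reasoning step: compare the mean effect under high versus
# low values of one observed cause (here, network load).
high = [o["latency"] for o in observations if o["network_load"] > 0.5]
low = [o["latency"] for o in observations if o["network_load"] <= 0.5]
effect_of_load = sum(high) / len(high) - sum(low) / len(low)
print(f"estimated effect of network load on latency: {effect_of_load:.2f}")
```

The dependent claims refine each of these three steps in turn: the replicate (claim 6), the observational data layout (claims 15 to 17), and the causal reasoning function itself (claims 19 to 22).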

2. The apparatus of claim 1, wherein the apparatus is caused to: design the one or more cloud native network functions based on the analyzed causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

3. The apparatus of claim 2, wherein the one or more cloud native network functions is designed during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more cloud native network functions in a production environment, the release being of the continuous integration and continuous deployment pipeline.

4. The apparatus of any of claims 1 to 3, wherein the apparatus is caused to: operate or configure the one or more cloud native network functions during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more cloud native network functions in a production environment, the release being of the continuous integration and continuous deployment pipeline.

5. The apparatus of any of claims 1 to 4, wherein the one or more cloud native network functions comprises a software application for a 5G network.

6. The apparatus of any of claims 1 to 5, wherein the replicate comprises a digital twin of the one or more cloud native network functions.

7. The apparatus of any of claims 1 to 6, wherein the at least one observed cause comprises an operating attribute of the one or more cloud native network functions, wherein configuration of the one or more cloud native network functions comprises a type of operating attribute.

8. The apparatus of any of claims 1 to 7, wherein the at least one observed effect comprises at least one performance indicator of the one or more cloud native network functions.

9. The apparatus of any of claims 1 to 8, wherein the at least one observed effect comprises at least one feature, a load, or at least one environmental condition of the one or more cloud native network functions.

10. The apparatus of any of claims 1 to 9, wherein analyzing the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions comprises at least one of: determining an existence of a relationship between the at least one observed cause and the at least one observed effect; determining a magnitude of the relationship between the at least one observed cause and the at least one observed effect; or determining a way in which the at least one observed cause and the at least one observed effect are related.

11. The apparatus of any of claims 1 to 10, wherein the plurality of operating conditions comprises at least one operating condition of the one or more cloud native network functions, the at least one operating condition comprising at least one of: network load; application programming interface load; domain name system impairment; kernel breakdown; total breakdown; network stress; computational resource stress; time impairment; web service breakdown; or input output impairment.

12. The apparatus of any of claims 1 to 11, wherein the apparatus is further caused to: form at least one graph that represents the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

13. The apparatus of claim 12, wherein the at least one graph comprises a directed acyclic graph.
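
Claims 12 and 13 represent the analyzed causality as a directed acyclic graph. Python's standard-library `graphlib` can both order such a graph (so that causes precede effects) and detect cycles; the nodes and edges below are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical causal DAG for a CNF: each node maps to the set of its
# direct causes (predecessors).
causal_graph = {
    "latency": {"network_load", "cpu_stress"},
    "error_rate": {"latency", "dns_impairment"},
    "network_load": set(),
    "cpu_stress": set(),
    "dns_impairment": set(),
}

try:
    order = list(TopologicalSorter(causal_graph).static_order())
    print("valid causal DAG; causes precede effects:", order)
except CycleError as exc:
    print("not acyclic; offending cycle:", exc.args[1])
```

Claim 14's use of domain expert knowledge corresponds to pruning or orienting edges in such a graph before inference.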

14. The apparatus of any of claims 12 to 13, wherein the apparatus is further caused to: form the at least one graph using domain expert knowledge to remove at least one ambiguity related to causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

15. The apparatus of any of claims 1 to 14, wherein the apparatus is further caused to: generate the observational data as a table, wherein a row of the table comprises an observation, and a column of the table comprises a cause attribute, an effect attribute, or an attribute comprising both a cause and effect.

16. The apparatus of any of claims 1 to 15, wherein the apparatus is further caused to: generate the observational data as a table, wherein a first column of the table comprises at least a portion of operating configurations of the one or more cloud native network functions, and a second column of the table comprises the at least one observed effect of the one or more cloud native network functions.

17. The apparatus of any of claims 1 to 16, wherein the apparatus is further caused to: generate the observational data as a table, wherein the table comprises an experimental group and a control group, wherein the experimental group comprises a collection of experiments in which at least one operating attribute is set to a specific value that is to be studied, and wherein the control group comprises a collection of experiments in which operating attributes are randomized.

18. The apparatus of claim 17, wherein the apparatus is further caused to: determine an average treatment effect of the one or more cloud native network functions when the at least one operating attribute is set to the specific value.
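
The average treatment effect of claim 18 is, in its simplest form, the difference in mean outcomes between claim 17's experimental group (the operating attribute pinned to the specific value under study) and control group (operating attributes randomized). A sketch with entirely made-up throughput figures:

```python
# Hypothetical observed throughput (the effect) per experiment.
# Experimental group: the operating attribute under study (say, a cache
# size) is set to the specific value; control group: attributes randomized.
experimental = [102.0, 98.5, 101.2, 99.8, 100.5]
control      = [91.0, 94.2, 89.5, 92.8, 90.5]

ate = sum(experimental) / len(experimental) - sum(control) / len(control)
print(f"average treatment effect: {ate:.2f} throughput units")  # 8.80
```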

19. The apparatus of any of claims 1 to 18, wherein the apparatus is further caused to: determine a magnitude of a relationship between the at least one observed cause and the at least one observed effect; wherein the magnitude of a relationship comprises at least one of: an average treatment effect, a conditional average treatment effect, an individual treatment effect, a natural direct effect, or a natural indirect effect.

20. The apparatus of any of claims 1 to 19, wherein applying the causal reasoning function comprises performing a do-calculus to replace a do operator with at least one conditional probability of the at least one observed effect of the one or more cloud native network functions given the at least one observed cause, the at least one conditional probability used to infer the causality between the at least one observed cause and the at least one observed effect.
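
Claim 20's replacement of the do operator with conditional probabilities is, in the simplest confounded case, Pearl's back-door adjustment: P(Y | do(X=x)) = Σ_z P(Y | X=x, Z=z) P(Z=z). A sketch over a small, entirely hypothetical set of binary observations:

```python
# Hypothetical binary observations (X, Z, Y):
# X = high network load (the cause), Z = peak hour (a confounder),
# Y = SLA violation (the effect).
data = [
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0), (1, 0, 0),
    (0, 1, 1), (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]

def p_y_given_do_x(x: int) -> float:
    """Back-door adjustment: P(Y=1 | do(X=x)) = sum_z P(Y=1 | X=x, Z=z) P(Z=z)."""
    n = len(data)
    total = 0.0
    for z in (0, 1):
        p_z = sum(1 for _, zz, _ in data if zz == z) / n
        ys = [yy for xx, zz, yy in data if xx == x and zz == z]
        if ys:
            total += (sum(ys) / len(ys)) * p_z
    return total

print(p_y_given_do_x(1) - p_y_given_do_x(0))  # inferred causal effect of X on Y
```

The two conditional probabilities so computed are exactly what claim 20 uses to infer causality between the observed cause and the observed effect.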

21. The apparatus of any of claims 1 to 20, wherein applying the causal reasoning function comprises performing a statistical analysis of the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

22. The apparatus of any of claims 1 to 21, wherein applying the causal reasoning function comprises applying machine learning to analyze the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

23. The apparatus of any of claims 1 to 22, wherein the apparatus is further caused to: validate or dismiss at least one design or configuration assumption related to the at least one observed cause or the at least one observed effect of the one or more cloud native network functions, based on the application of the causal reasoning function.

24. The apparatus of any of claims 1 to 23, wherein the apparatus is further caused to: determine at least one operating condition of the one or more cloud native network functions to be suboptimal, based on the application of the causal reasoning function.

25. The apparatus of any of claims 1 to 24, wherein the apparatus is further caused to: perform a root cause analysis to determine at least one cause of a fault of operation of the one or more cloud native network functions.

26. The apparatus of claim 25, wherein the apparatus is further caused to: operate or configure the one or more cloud native network functions using at least one result of the root cause analysis to complement the application of the causal reasoning function.

27. The apparatus of any of claims 1 to 26, wherein the apparatus is further caused to: determine whether the one or more cloud native network functions is an external application or an internal application.

28. The apparatus of any of claims 1 to 27, wherein the one or more cloud native network functions is in containerized software form.

29. An apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more target applications; generate observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

30. The apparatus of claim 29, wherein the apparatus is caused to: design the one or more target applications based on the analyzed causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

31. The apparatus of claim 30, wherein the one or more target applications is designed during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more target applications in a production environment, the release being of the continuous integration and continuous deployment pipeline.

32. The apparatus of any of claims 29 to 31, wherein the apparatus is caused to: operate or configure the one or more target applications during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more target applications in a production environment, the release being of the continuous integration and continuous deployment pipeline.

33. The apparatus of any of claims 29 to 32, wherein the one or more target applications comprises a software application for a 5G network.

34. The apparatus of any of claims 29 to 33, wherein the replicate comprises a digital twin of the one or more target applications.

35. The apparatus of any of claims 29 to 34, wherein the at least one observed cause comprises an operating attribute of the one or more target applications, wherein configuration of the one or more target applications comprises a type of operating attribute.

36. The apparatus of any of claims 29 to 35, wherein the at least one observed effect comprises at least one performance indicator of the one or more target applications.

37. The apparatus of any of claims 29 to 36, wherein the at least one observed effect comprises at least one feature, a load, or at least one environmental condition of the one or more target applications.

38. The apparatus of any of claims 29 to 37, wherein analyzing the causality between the at least one observed cause and the at least one observed effect of the one or more target applications comprises at least one of: determining an existence of a relationship between the at least one observed cause and the at least one observed effect; determining a magnitude of the relationship between the at least one observed cause and the at least one observed effect; or determining a way in which the at least one observed cause and the at least one observed effect are related.

39. The apparatus of any of claims 29 to 38, wherein the plurality of operating conditions comprises at least one operating condition of the one or more target applications, the at least one operating condition comprising at least one of: network load; application programming interface load; domain name system impairment; kernel breakdown; total breakdown; network stress; computational resource stress; time impairment; web service breakdown; or input output impairment.

40. The apparatus of any of claims 29 to 39, wherein the apparatus is further caused to: form at least one graph that represents the causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

41. The apparatus of claim 40, wherein the at least one graph comprises a directed acyclic graph.

42. The apparatus of any of claims 40 to 41, wherein the apparatus is further caused to: form the at least one graph using domain expert knowledge to remove at least one ambiguity related to causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

43. The apparatus of any of claims 29 to 42, wherein the apparatus is further caused to: generate the observational data as a table, wherein a row of the table comprises an observation, and a column of the table comprises a cause attribute, an effect attribute, or an attribute comprising both a cause and effect.

44. The apparatus of any of claims 29 to 43, wherein the apparatus is further caused to: generate the observational data as a table, wherein a first column of the table comprises at least a portion of operating configurations of the one or more target applications, and a second column of the table comprises the at least one observed effect of the one or more target applications.

45. The apparatus of any of claims 29 to 44, wherein the apparatus is further caused to: generate the observational data as a table, wherein the table comprises an experimental group and a control group, wherein the experimental group comprises a collection of experiments in which at least one operating attribute is set to a specific value that is to be studied, and wherein the control group comprises a collection of experiments in which operating attributes are randomized.

46. The apparatus of claim 45, wherein the apparatus is further caused to: determine an average treatment effect of the one or more target applications when the at least one operating attribute is set to the specific value.

47. The apparatus of any of claims 29 to 46, wherein the apparatus is further caused to: determine a magnitude of a relationship between the at least one observed cause and the at least one observed effect; and wherein the magnitude of a relationship comprises at least one of: an average treatment effect, a conditional average treatment effect, an individual treatment effect, a natural direct effect, or a natural indirect effect.

48. The apparatus of any of claims 29 to 47, wherein applying the causal reasoning function comprises performing a do-calculus to replace a do operator with at least one conditional probability of the at least one observed effect of the one or more target applications given the at least one observed cause, the at least one conditional probability used to infer the causality between the at least one observed cause and the at least one observed effect.

49. The apparatus of any of claims 29 to 48, wherein applying the causal reasoning function comprises performing a statistical analysis of the at least one observed cause and the at least one observed effect of the one or more target applications.

50. The apparatus of any of claims 29 to 49, wherein applying the causal reasoning function comprises applying machine learning to analyze the causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

51. The apparatus of any of claims 29 to 50, wherein the apparatus is further caused to: validate or dismiss at least one design or configuration assumption related to the at least one observed cause or the at least one observed effect of the one or more target applications, based on the application of the causal reasoning function.

52. The apparatus of any of claims 29 to 51, wherein the apparatus is further caused to: determine at least one operating condition of the one or more target applications to be suboptimal, based on the application of the causal reasoning function.

53. The apparatus of any of claims 29 to 52, wherein the apparatus is further caused to: perform a root cause analysis to determine at least one cause of a fault of operation of the one or more target applications.

54. The apparatus of claim 53, wherein the apparatus is further caused to: operate or configure the one or more target applications using at least one result of the root cause analysis to complement the application of the causal reasoning function.

55. The apparatus of any of claims 29 to 54, wherein the apparatus is further caused to: determine whether the one or more target applications is an external application or an internal application.

56. The apparatus of any of claims 29 to 55, wherein the one or more target applications is in containerized software form.

57. The apparatus of any of claims 29 to 56, wherein the one or more target applications comprises a cloud native network function.

58. The apparatus of claim 57, wherein the cloud native network function comprises a 5G cloud native network function.

59. The apparatus of any of claims 57 to 58, wherein the cloud native network function is in containerized software form.

60. An apparatus comprising: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: select one or more cloud native network functions; select at least one feature of the one or more cloud native network functions; select at least one environmental condition of the one or more cloud native network functions; select a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; perform at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collect observational data from the at least one experiment; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.
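
The selection steps of claim 60 — a CNF, a feature, an environmental condition, and a load comprising an intensity and a duration — amount to building an experiment specification before each run. A hypothetical sketch (the CNF names, features, and condition catalogue are illustrative, not taken from the application):

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Experiment:
    cnf: str            # selected cloud native network function
    feature: str        # selected feature
    condition: str      # selected environmental condition
    intensity: float    # load: intensity of processing
    duration_s: int     # load: duration of processing, in seconds

# Hypothetical selections; a real system would draw these from the CNF's
# configuration and from a catalogue of chaos conditions (see claim 70).
cnfs = ["amf", "smf"]
features = ["session_setup"]
conditions = ["network_stress", "dns_impairment"]
loads = [(0.5, 60), (1.0, 120)]

experiments = [
    Experiment(c, f, e, i, d)
    for c, f, e, (i, d) in product(cnfs, features, conditions, loads)
]
print(f"{len(experiments)} experiments to perform")  # 2 * 1 * 2 * 2 = 8
```

Each `Experiment` is then performed against the CNF, its observational data collected, and the causal reasoning function applied as in the remainder of the claim.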

61. The apparatus of claim 60, wherein the at least one observed cause comprises an operating attribute of the one or more cloud native network functions, wherein configuration of the one or more cloud native network functions comprises a type of operating attribute.

62. The apparatus of any of claims 60 to 61, wherein the at least one observed effect comprises at least one performance indicator of the one or more cloud native network functions.

63. The apparatus of any of claims 60 to 62, wherein the at least one observed effect comprises the at least one feature, the load, or the at least one environmental condition of the one or more cloud native network functions.

64. The apparatus of any of claims 60 to 63, wherein the apparatus is further caused to: design the one or more cloud native network functions based on the analyzed causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

65. The apparatus of any of claims 60 to 64, wherein the apparatus is further caused to: determine whether the one or more cloud native network functions is an external application or an internal application.

66. The apparatus of claim 65, wherein the apparatus is further caused to: provide an endpoint of the one or more cloud native network functions, in response to determining that the one or more cloud native network functions is an external application.

67. The apparatus of claim 66, wherein the endpoint comprises a uniform resource locator.

68. The apparatus of any of claims 65 to 67, wherein the apparatus is further caused to: deploy the one or more cloud native network functions, in response to determining that the one or more cloud native network functions is an internal application; wherein the at least one experiment is performed with the one or more cloud native network functions.

69. The apparatus of claim 68, wherein the apparatus is further caused to: wrap the one or more cloud native network functions within a Helm template during the application of the causal reasoning function.

70. The apparatus of any of claims 60 to 69, wherein the environmental condition of the one or more cloud native network functions comprises at least one of: network load; application programming interface load; domain name system impairment; kernel breakdown; total breakdown; network stress; computational resource stress; time impairment; web service breakdown; or input output impairment.

71. The apparatus of any of claims 60 to 70, wherein the apparatus is further caused to: form at least one graph that represents the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

72. The apparatus of any of claims 60 to 71, wherein the apparatus is further caused to: determine a magnitude of a relationship between the at least one observed cause and the at least one observed effect; and wherein the magnitude of a relationship comprises at least one of: an average treatment effect, a conditional average treatment effect, an individual treatment effect, a natural direct effect, or a natural indirect effect.

73. The apparatus of any of claims 60 to 72, wherein the one or more cloud native network functions is in containerized software form.

74. The apparatus of any of claims 60 to 73, wherein the one or more cloud native network functions is a fifth generation (5G) cloud native network function.

75. A method comprising: operating a replicate of one or more cloud native network functions; generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

76. A method comprising: operating a replicate of one or more target applications; generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

77. A method comprising: selecting one or more cloud native network functions; selecting at least one feature of the one or more cloud native network functions; selecting at least one environmental condition of the one or more cloud native network functions; selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collecting observational data from the at least one experiment; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

78. An apparatus comprising: means for operating a replicate of one or more cloud native network functions; means for generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

79. An apparatus comprising: means for operating a replicate of one or more target applications; means for generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

80. An apparatus comprising: means for selecting one or more cloud native network functions; means for selecting at least one feature of the one or more cloud native network functions; means for selecting at least one environmental condition of the one or more cloud native network functions; means for selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; means for performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; means for collecting observational data from the at least one experiment; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

81. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations comprising: operating a replicate of one or more cloud native network functions; generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

82. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations comprising: operating a replicate of one or more target applications; generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

83. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations comprising: selecting one or more cloud native network functions; selecting at least one feature of the one or more cloud native network functions; selecting at least one environmental condition of the one or more cloud native network functions; selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collecting observational data from the at least one experiment; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

Description:
A Causal Reasoning System For Operational Twin (CAROT) For Development And Operation Of 5G CNFs

TECHNICAL FIELD

[0001] The examples and non-limiting example embodiments relate generally to chaos engineering and, more particularly, to a causal reasoning system for operational twin (CAROT) for development and operation of 5G CNFs.

BACKGROUND

[0002] It is known to develop fault tolerant systems within a software development lifecycle.

SUMMARY

[0003] In accordance with an aspect, an apparatus includes at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more cloud native network functions; generate observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0004] In accordance with an aspect, an apparatus includes at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more target applications; generate observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0005] In accordance with an aspect, an apparatus includes at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: select one or more cloud native network functions; select at least one feature of the one or more cloud native network functions; select at least one environmental condition of the one or more cloud native network functions; select a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; perform at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collect observational data from the at least one experiment; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0006] In accordance with an aspect, a method includes operating a replicate of one or more cloud native network functions; generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0007] In accordance with an aspect, a method includes operating a replicate of one or more target applications; generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0008] In accordance with an aspect, a method includes selecting one or more cloud native network functions; selecting at least one feature of the one or more cloud native network functions; selecting at least one environmental condition of the one or more cloud native network functions; selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collecting observational data from the at least one experiment; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0009] In accordance with an aspect, an apparatus includes means for operating a replicate of one or more cloud native network functions; means for generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0010] In accordance with an aspect, an apparatus includes means for operating a replicate of one or more target applications; means for generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0011] In accordance with an aspect, an apparatus includes means for selecting one or more cloud native network functions; means for selecting at least one feature of the one or more cloud native network functions; means for selecting at least one environmental condition of the one or more cloud native network functions; means for selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; means for performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; means for collecting observational data from the at least one experiment; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0012] In accordance with an aspect, a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations is provided, the operations including: operating a replicate of one or more cloud native network functions; generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0013] In accordance with an aspect, a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations comprising: operating a replicate of one or more target applications; generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0014] In accordance with an aspect, a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations is provided, the operations including: selecting one or more cloud native network functions; selecting at least one feature of the one or more cloud native network functions; selecting at least one environmental condition of the one or more cloud native network functions; selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collecting observational data from the at least one experiment; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] The foregoing aspects and other features are explained in the following description, taken in connection with the accompanying drawings.

[0016] FIG. 1 illustrates a CI/CD pipeline.

[0017] FIG. 2 depicts a left-shift paradigm framework in the CI/CD pipeline.

[0018] FIG. 3 illustrates support of the CI/CD left-shift architecture pipeline by the examples described herein.

[0019] FIG. 4 depicts layers of a causal reasoning system for an operational twin high-level architecture.

[0020] FIG. 5 depicts lower building blocks (a chaos framework) for the causal reasoning system for operational twin described herein.

[0021] FIG. 6 depicts upper building blocks (causal inference) for the causal reasoning system for operational twin described herein.

[0022] FIG. 7 depicts a general problem statement for the causal reasoning system for operational twin described herein.

[0023] FIG. 8 depicts a high-level task description of the causal reasoning system for operational twin described herein.

[0024] FIG. 9 depicts the digital twin component of the causal reasoning system for operational twin described herein.

[0025] FIG. 10 depicts a digital-twin component workflow of the causal reasoning system for operational twin described herein.

[0026] FIG. 11 depicts a cause effect inference component.

[0027] FIG. 12 depicts implementation of a randomized controlled trial (RCT).

[0028] FIG. 13 depicts a cause inference effect workflow.

[0029] FIG. 14 is a block diagram of one possible and non-limiting system in which the example embodiments may be practiced.

[0030] FIG. 15 is an example apparatus configured to implement the examples described herein.

[0031] FIG. 16 shows a representation of an example of non-volatile memory media.

[0032] FIG. 17 is an example method to implement the examples described herein.

[0033] FIG. 18 is an example method to implement the examples described herein.

[0034] FIG. 19 is an example method to implement the examples described herein.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

[0035] Modern software, including 5G cloud-native network function (CNF) appliances, is typically cloud-based, highly dynamic, and service-oriented. These software appliances are often non-trivial to configure and difficult to optimize, especially in environments where the infrastructure displays non-ideal conditions, such as production setups.

[0036] The industry convention on software development methodologies dictates that a continuous integration / continuous delivery (CI/CD) pipeline should generally resemble the pipeline 101 shown in FIG. 1.

[0037] On these terms, the 5G CNF development process flows from left to right: for instance, the source code is checked into the repository (version control 202), then built 204, tested 206, released 208, deployed 210, operated 212, and monitored 214. With reference to FIG. 2, certain tasks that typically occur later during development can be brought forward in the pipeline - the left-shift principle, including left-shift 216 - to help increase productivity by identifying and rectifying issues much earlier in the development cycle, reducing cost and increasing compliance with requirements. This principle lies at the center of this disclosure of CAROT. The CAROT (CAusal Reasoning for Operation Twin) framework 201 is depicted in FIG. 2, where operational environment conditions (including during operate 212 and monitor 214 in the pipeline) are brought forward into the test 206 phase, effectively recreating a digital twin.

[0038] The causal reasoning for operation twin system replicates operational environment conditions in a safe, controllable, and repeatable manner. Its novelty comprises analyzing the observations collected from the digital twin setup to infer probable cause / effect relationships to provide insights into the 5G CNF’s configuration setups, design assertions, and what-if scenarios, and to support a root-cause-analysis (RCA) basis for zero-touch management automation.

[0039] The ideas described herein are generally applicable to any containerized software development production cycle; however, a focus of the examples described herein is applying the system to CNFs (cloud-native network functions), specifically those related to 5G radio and core infrastructure.

[0040] Throughout this submission, all references to software, applications, appliances, and software appliances / applications may be understood as 5G-related CNFs, which are micro-service-based software applications that support a functional communication network (e.g. 5G SA or NSA). One or more of the following technical effects can be selected: validating 5G CNFs in operational environment replicas and digital twins, robust operation of a 5G CNF-based communication network, optimizing configuration of a 5G CNF-based communication network, discovering unforeseen / unplanned 5G CNF behavior insights, and/or continuous improvement of the design of a 5G CNF-based communication network.

[0041] The examples described herein include the following features:

[0042] 1. In general terms, as shown in FIG. 3 the overall positioning of the CAROT framework supports the “left-shift” CI/CD paradigm.

[0043] The causal reasoning system for operational twin 201 spans engineering layers including outcome 302, ML 304, data 306, and system 308. Outcome 302 includes optimal code 310, RCA and fault 324, and RCA and fix 326. ML 304 includes code review 312, causality analytics 328, and anomaly detection 330. Data 306 includes code configurations 314 and production metrics 332. System 308 includes IDE 316 and production 334. Development 318 includes optimal code 310, code review 312, code configurations 314, and IDE 316. CAROT 201 is a component of the development operations pipeline 320 and implements simulation and a digital twin 322. Operations 336 includes RCA and fault 324, RCA and fix 326, causality analytics 328, anomaly detection 330, production metrics 332, and production 334.

[0044] As shown in FIG. 3, CAROT 201 receives input from IDE 316 and production metrics 332, and provides output to optimal code 310, causality analytics 328, and production 334.

[0045] 2. CAROT 201 is built over multiple specialized components as depicted in the high-level architecture shown in FIG. 4. The specialized components include chaos framework 402, chaos metrics 404, causal inference 406, causal discovery 408, and design optimization 410. Causal discovery 408 provides results to causality analytics 328 over interface 412, and through interface 412 causal discovery 408 receives information from production metrics 332. Results of design optimization 410 are provided through interface 410 to develop optimal code 310. CAROT 201, including chaos framework 402, receives input from IDE 316 over interface 414. The term “chaos” alludes to the study of fault generation and fault observation, together with altering a configuration, parameter, or component of a system.

[0046] 2a. At its foundation CAROT employs a digital twin engine, including an automated planned failure injection (including chaos failure injection) and load stressing (e.g. REST and network traffic benchmarking) framework. This subjects target applications (5G CNFs) to configurable workloads during which planned infrastructure non-optimal conditions can be injected and observed through captured metrics. This capability is provided by the lower two blocks (chaos framework 402 and chaos metrics 404) in the CAROT framework high-level architecture as highlighted in system component 502 shown in FIG. 5. The resulting observations can help strengthen or dismiss assumptions (e.g. one or more hypotheses) about the application’s performance and/or robustness in measurable, quantitative terms.

[0047] 2b. The automation supported by the lower two blocks, chaos framework 402 and chaos metrics 404, as described in (2a), offers repeatable stimulation of load stresses and fault inductions in order to generate extensive, applicable observation datasets (their extent depending on the type of application and the observation requirements, which are also configurable). These are processed by a causal inference (machine learning) algorithm that provides inference or insights into the relationship between a cause and its plausible effects on the application itself, represented by the causal inference 406 and causal discovery 408 upper blocks in the high-level architecture, including system component 602 shown in FIG. 6. These insights range from measuring and understanding treatment effects (e.g. the magnitude of change in an effect due to a change in its cause), to discerning which application configuration options are optimal for a specific network setup, to identifying which cloud infrastructure choices could have the most impact on overall performance and robustness.
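
As a non-limiting illustration, the treatment-effect measurement described above can be sketched as a simple difference in mean outcomes between experiment runs with and without a fault induction; the field names below ("cpu_throttled", "latency_ms") and the data values are hypothetical and not part of the framework:

```python
# Hypothetical sketch: estimating an average treatment effect (ATE) from
# digital-twin observation records. Field names and values are illustrative
# assumptions, not the framework's actual schema.

def average_treatment_effect(observations, treatment_key, outcome_key):
    """Mean outcome under treatment minus mean outcome under control."""
    treated = [o[outcome_key] for o in observations if o[treatment_key]]
    control = [o[outcome_key] for o in observations if not o[treatment_key]]
    if not treated or not control:
        raise ValueError("need both treated and control observations")
    return sum(treated) / len(treated) - sum(control) / len(control)

# Did injecting CPU throttling (the cause) raise latency (the effect)?
obs = [
    {"cpu_throttled": True, "latency_ms": 42.0},
    {"cpu_throttled": True, "latency_ms": 38.0},
    {"cpu_throttled": False, "latency_ms": 20.0},
    {"cpu_throttled": False, "latency_ms": 24.0},
]
print(average_treatment_effect(obs, "cpu_throttled", "latency_ms"))  # 18.0
```

In practice, the causal inference block would additionally adjust for confounders identified through causal discovery rather than comparing raw means.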

[0048] 2c. The inference from (2b) is meant to provide feedback into the application design stage, proving or disproving earlier theoretical assumptions and potentially identifying unplanned behavior. This is represented by the design optimization block (410) in FIG. 6. These insights can also serve to complement root cause analysis (RCA) assumptions and models, including RCA and fault 324 and RCA and fix 326, as depicted by the dataflow shown in FIG. 4 that includes the causality analytics block (328).

[0049] Before being released into production environments, applications are compiled and tested to validate compliance with functional and non-functional requirements. In modern software settings, these tests are typically integrated as part of the continuous integration / continuous deployment (CI/CD) pipeline right after the application is built (compiled), so that any potentially unplanned issues can be detected and fixed early in the development cycle, reducing costs and increasing productivity.

[0050] In many cases, however, these test tasks are typically carried out in a cloud infrastructure displaying ideal conditions, which makes it difficult to identify configuration and design issues that are likely to emerge once the application is deployed into a production environment, where resources can be constrained and the application generally displays signs of duress.

[0051] Additionally, this traditional approach fails to explore what-if scenarios that can help identify issues with an application’s configuration setup or design principles that would otherwise go undetected during the normal execution of CI/CD tasks and could negatively impact its expected performance and behavior (SLO) once deployed into a production operating environment.

[0052] Examples include determining or refining infrastructure thresholds e.g. increasing or decreasing specific resource availability and how these influence the application to help save costs (e.g. suppressing resources that have negligible impact), refine environment requirements, validate scaling and performance capabilities, identify root causes more efficiently, etc.

[0053] Motivation for developing the system described herein includes that 99% of misconfiguration incidents in the cloud go unnoticed (Help Net Security, September 25, 2019), that cloud misconfigurations cost companies nearly $5 trillion (TechRepublic, February 20, 2020), and that detecting non-functional faults is notoriously difficult: hardware performance faults in large production systems fail slowly at scale.

[0054] Accordingly, it is important to highlight the relevance of the left-shift software development paradigm approach described herein, as it establishes an opportunity to identify and address issues that typically emerge much later in the process, helping to reduce costs, increase productivity, and improve compliance with requirements or expectations. FIG. 7 provides a general summary of the problem statement.

[0055] As shown by item 702 of FIG. 7, during the testing phase 206 of the CI/CD pipeline, the configuration, performance, or design issue is not identified because environmental conditions did not replicate production or generate a digital-twin. As further shown by item 702 of FIG. 7, during the operate phase 212 of the pipeline, the design, configuration, or performance issue is identified, but at this stage of the pipeline, the design, configuration, or performance issue is costly and prone to requiring more time to fix, leading to longer deployment timelines, a decrease in quality, and a general confidence drop from stakeholders. As shown by item 704 of FIG. 7, CAROT 201 identifies the design, configuration, or performance issue early in the CI/CD pipeline, and earlier than the process shown by item 702, soon after the test phase 206 and prior to the release phase 208 (and substantially earlier than the operate phase 212), when it is faster and less costly to rectify.

[0056] Chaos engineering methodologies and tools are generally available. The same applies to causal inference machine learning techniques. However, the examples described herein combine these tools and techniques into a unique and novel solution for, e.g., a 5G CNF cloud infrastructure (Kubernetes) operational environment digital twin, capturing observations that are aggregated and processed for cause / effect insights (causal inference) early in the software production cycle, implementing the left-shift principle. These components specifically include the left-shift paradigm, digital twins, and causal inference.

[0057] The left-shift paradigm comprises operational environment-like conditions inserted early in the CNF CI/CD test task.

[0058] Digital twins include deploying and configuring the CNF, exposing it to controlled, variable rates (benchmarking of REST and network traffic), replicating optimal and sub-optimal operational environment cloud infrastructure conditions (chaos engineering), and observing infrastructure and interaction across the CNF microservices and components, including monitoring and tracing.

[0059] Causal inference includes determining probable cause / effect relationships, analyzing the magnitude of influence between a cause and an effect, utilizing or constructing a directed acyclic graph (DAG) by performing causal discovery, determining root cause analysis (RCA), and developing and determining what-if scenarios to generate inference insights.
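
As a non-limiting illustration, a causal DAG such as the one referenced above can be represented as an adjacency mapping from each cause to its direct effects; the variable names below are hypothetical, and the topological sort merely verifies the acyclicity that causal discovery requires:

```python
# Illustrative sketch of a causal DAG over hypothetical digital-twin
# variables. Edges point from cause to effect; a topological sort confirms
# the graph is acyclic, a precondition for causal inference over it.

def topological_order(dag):
    """Kahn's algorithm; raises if the graph has a cycle (not a valid DAG)."""
    indegree = {node: 0 for node in dag}
    for targets in dag.values():
        for t in targets:
            indegree[t] += 1
    ready = [n for n, d in indegree.items() if d == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for t in dag[node]:
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    if len(order) != len(dag):
        raise ValueError("graph contains a cycle; not a valid causal DAG")
    return order

# Hypothetical structure: resource stress raises CPU usage, which raises
# latency; load raises both CPU usage and latency directly.
causal_dag = {
    "resource_stress": ["cpu_usage"],
    "load": ["cpu_usage", "latency"],
    "cpu_usage": ["latency"],
    "latency": [],
}
print(topological_order(causal_dag))
```

Here "latency" has no outgoing edges, so it always sorts last: it is a pure effect in this hypothetical graph.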

[0060] The automation enabled by the examples described herein (CAROT) allows execution of experiments, when possible at scale based on the type of target application, developing permutations of, and iterating over, different cloud infrastructure conditions (e.g. digital twins). The observations and metrics collected become datasets that are analyzed by the causal inference algorithm, for example using machine learning, to either validate existing design assumptions and/or discover new ones to analyze cause and effect impact. The system described herein may also provide complementary information for operational root cause analysis (RCA) in an operational environment.

[0061] The system described herein including CAROT provides a framework that is designed to be integrated into a CI/CD 5G CNF pipeline to explore behavior in a typical operational environment, creating a digital twin, inferring probable causes and effects helping to explore what-if scenarios, assert or disregard CNF design assumptions, identify optimal CNF configuration setups, validate SLO/SLA under non-ideal cloud infrastructure conditions, and support root cause analysis (e.g. zero-touch management automation).

[0062] CAROT’s components are referenced in FIG. 8. The system is intended for 5G CNF CI/CD pipeline integration as shown in (802) via open APIs. As shown in FIG. 8, the system may be integrated into the test portion 206 of the CI/CD pipeline. The system subjects a target application 816 to a configurable infrastructure emulating a production environment (e.g. digital twin): this includes load 814, either REST or network traffic (804), and cloud infrastructure impairments (806). The chaos infrastructure impairment injection into the software application 816 includes total breakdown 818, resource stress 820, network stress 822, and I/O impairment 824.

[0063] The system automatically captures infrastructure and application observations (e.g. tracing and monitoring) for posterior analysis (808). Causal inference 406 and discovery 408 algorithms (ML) are applied to observations 826 in tandem with one or more causal directed acyclic graphs (DAGs) (828) to provide insights (810). The framework’s (CAROT) output involves what-if scenarios 830, decision insights 836, significant features 832, and a model for RCA 834 leading to zero-touch management automation (812).

[0064] The system described herein (CAROT) is supported by a design principle and two components that are described in detail below. These individual methodologies, technologies, and tools supporting the framework have been assembled in this unique configuration for this specific purpose.

[0065] 1. Design principle: integration into CI/CD pipelines - the left-shift paradigm

[0066] Studies have shown that the cost of fixing issues near or very near the end of the process is much higher than at the start, when the code base is designed and created. This approach contributes to improved delivery timelines and end-customer satisfaction and confidence. By embracing the shift-left design principle, CAROT’s framework can replicate conditions that typically occur in a production environment (e.g. resource stress, hardware failures, oversubscription, etc.) under a planned, repeatable, controlled setting (digital twin). Because the collected observations are later processed by a causal reasoning system, SLO assumptions and optimal configuration insights can be asserted or dismissed before the CNF reaches the operational deployment stage. FIG. 7 depicts this feature.

[0067] 2. Digital-twin component - recreating operational environments

[0068] A functional representation of CAROT’s digital twin component 900 is shown in FIG. 9. The following descriptions provide further insights.

[0069] 2a. 5G CNF target - at the center of the image is the 5G CNF 902, which must be wrapped in a helm template if the deployment and management are to be handled by CAROT internally. Alternatively, the CNF 902 may reside externally, in which case the target system details must be provided, e.g., URL endpoints for load targeting and monitoring, and K8s labels for chaos feature targeting and monitoring.

[0070] 2b. 5G CNF load - sitting on top of the application 816 is the load module 814. The load module 814 can target the 5G CNF 902 with either REST or network traffic, such as IMIX traffic. The load level is configurable, and based on the user’s choice CAROT automatically spawns the necessary workers to support it appropriately. The higher the load, the more workers are spawned.
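
As a non-limiting illustration, the worker-spawning behavior described above could follow a capacity-based scaling rule; the text states only that higher loads spawn more workers, so the per-worker capacity constant below is an assumed example value, not a parameter of the framework:

```python
# Illustrative worker-scaling rule for the load module. The capacity
# constant (requests per second per worker) is an invented example.
import math

REQUESTS_PER_WORKER = 500  # assumed per-worker capacity

def workers_needed(target_rps):
    """Smallest worker count whose combined capacity covers the target load."""
    if target_rps <= 0:
        return 0
    return math.ceil(target_rps / REQUESTS_PER_WORKER)

print(workers_needed(200))   # 1
print(workers_needed(1200))  # 3
```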

[0071] 2c. The overall experiment length, e.g., the amount of time the 5G CNF 902 endures controlled environment features while being monitored, is set by the load duration, a configurable parameter that may range from seconds to days.

[0072] 2d. Environment conditions - the infrastructure impairment injection module 904 supports a configurable range of cloud infrastructure conditions that can be applied to the environment where the target application is deployed, provided by chaos engineering tools. Impairment features can be combined, or can be completely dismissed to represent ideal conditions.

[0073] 2e. Observations 826 are collected from the application 816, e.g. using tracing, and infrastructure packaged and labelled with unique ids that are maintained in the framework’ s API persistence layer for future reference. The infrastructure may comprise node-exporter and cAdvisor via Prometheus.

[0074] FIG. 10 shows the general workflow of the digital twin component. The workflow shown in FIG. 10 can be related to the descriptions provided above. Three modules manage and operate the workflow/component 1000, namely the API module 1001, the operator module 1003, and the engine module 1005. The API module 1001 is an open-API endpoint for all tasks related to requests, e.g. submission, updates, cancellations, etc. The API module 1001 holds the persistence layer of the component. The operator module 1003 manages the automatic scaling of the K8s worker clusters. The engine module 1005 handles the orders stored in the persistence layer and coordinates deployments and terminations for the application and the environment features. The engine module 1005 also handles observation collection and packaging.

[0075] This component 1000 is designed to be capable of scaling vertically and horizontally, allowing for the simultaneous execution of experiments in large numbers, provided the computational resources are available. As a reference, the component 1000 has handled tens of thousands of short-duration experiments.

[0076] At 1002, the system determines whether a target application 816 (e.g. a 5G CNF 902) is internally administered. If the system determines at 1002 that the 5G CNF is not internally administered and is externally administered, then at 1004 the system provides endpoint details such as URLs or labels. If the system determines at 1002 that the 5G CNF is internally administered, then at 1006 the system selects a target 5G CNF including replicas. At 1008, the system selects 5G CNF features including computing features and instances. From items 1004 and 1008, the method transitions to 1010. At 1010, the system selects 5G CNF load including intensity and duration. At 1012, the system selects environmental conditions such as type, intensity, and periodicity. At 1014, the system determines whether the 5G CNF is internally administered.

[0077] If at 1014 the system determines that the 5G CNF is internally administered, the method transitions to 1016. If at 1014 the system determines that the 5G CNF is not internally administered and is externally administered, the method transitions to 1018. At 1016, the 5G CNF is deployed, e.g. using a helm chart or template. At 1018, environmental conditions are deployed, e.g. using a helm chart or template. At 1020, the load is deployed, e.g. using a helm chart or template. At 1022, the system waits for the experiment to complete. At 1024, the system collects, packages, and labels observations 826 such as metrics. At 1026, one or more cause effect inference components (such as causal inference 406 or causal discovery 408) process(es) the observations 826. At 1028, the 5G CNF, load and chaos deployments are terminated, e.g. using a helm chart or template. At 1030, the method ends.

[0078] 3. Cause effect inference - intelligently reason about the performance of target applications (e.g. 5G CNFs)

[0079] This component comprises applying causal reasoning techniques to the observations captured by the digital-twin module to produce application configuration optimization and performance insight discoveries. [0080] Causal reasoning breaks from traditional machine learning approaches in that it attempts to answer the why behind the decision-making process, effectively computationally addressing the counterfactual what-if question: e.g., how much better would the latency KPI be if vCPU resource type X were used instead for this application? Causal reasoning is a step towards artificial general intelligence (AGI), and it is pragmatically applied by the examples described herein to the software configuration and performance optimization problem.

[0081] This component is built upon four modules, with reference to the cause effect inference component 1100 shown in FIG. 11: generating and collecting observational data (1104, 1116, 402), formulating the inputs (1103, 1122, 1128), performing causal inference (1102, 1110, 1112, 1114, 1118), and causal discovery (1124, 1126), as well as design optimization 410.

[0082] Generate and collect observational data (1104, 1116, 402) - the first step in the process is to collect observational data 1108, using at least partially module 1103. In the context of the herein described examples, this observational data 1108 is automatically collected from the digital-twin component. The data 1108 is generated and stored in tabular form (1122), such that each record corresponds to an experiment performed (1116) and has associated with it a set of attributes. These sets of attributes correspond to the nodes in the directed acyclic graph (DAG) in which causal effect inference calculations are carried out. Attributes can be related to configuration (e.g. cloud computing settings), control (e.g. chaos features), or observable metrics, and combinations thereof. As shown in FIG. 11, the chaos framework 402 includes domain name system (DNS) impairment 801, kernel breakdown 803, total breakdown 818, network stress 822, computational resource stress 820, HTTP stress 805 e.g. internet stress, time impairment 807, web service (WS) chaos or breakdown 809, and input/output (I/O) impairment 824. The chaos framework 402 may also include network load and application programming interface load. The run experiments block (1116) sets up the chaos conditions (1107), runs an experiment (402), observes by collecting data (1105), then repeats this process many times. The collective observational data is stored (1115) into the dataset (1122). Module 1104 includes chaos metrics 404 and chaos framework 402.
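As a rough illustration of the tabular layout, the sketch below emits one record per experiment, with a configuration attribute (node_type), a control attribute (chaos), and an observed metric (latency_ms) as columns. The attribute spaces and the synthetic latency model are entirely hypothetical, standing in for the real digital-twin runs:

```python
import random

# Hypothetical attribute spaces; names are illustrative, not from the patent.
NODE_TYPES = ["low-cpu", "medium-cpu", "high-cpu"]
CHAOS_KINDS = ["none", "network-stress", "io-impairment"]

def run_one_experiment(node_type: str, chaos: str, rng: random.Random) -> dict:
    """Stand-in for one digital-twin run: returns one row of observational data."""
    base = {"low-cpu": 90.0, "medium-cpu": 60.0, "high-cpu": 45.0}[node_type]
    penalty = {"none": 0.0, "network-stress": 25.0, "io-impairment": 15.0}[chaos]
    latency = base + penalty + rng.gauss(0, 2)   # synthetic observed KPI
    return {"node_type": node_type, "chaos": chaos, "latency_ms": latency}

def collect_dataset(n: int, seed: int = 0) -> list[dict]:
    """Repeat the set-up/run/observe loop n times and store the rows (1115/1122)."""
    rng = random.Random(seed)
    return [run_one_experiment(rng.choice(NODE_TYPES), rng.choice(CHAOS_KINDS), rng)
            for _ in range(n)]
```

Each dictionary key corresponds to a node in the DAG over which the causal effect calculations are later carried out.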

[0083] Formulating the inputs (1103, 1122, 1128) - to perform causal inference, the process requires a set of observational data 1122 that is generated 1115 at least partially with experiments 1116, and a causal graph (one of causal graphs 1128), e.g. the DAG, which can be formulated using domain expert knowledge. The causal graphs are generated (1127) using at least ambiguity removal 1126. Those who design DAGs carefully define the edges interconnecting the nodes. An edge implies there might be a causal relationship from the parent node to the child node, where for example the parent defines the cause 1130 and the child defines the effect 1132, and where the effect 1132 may be one or more KPIs. In a DAG, edges are unidirectional by nature, and the acyclic name implies that there cannot be any cycle in the graph structure. Cycles are considered unhelpful in the causal inference analysis, as a cycle makes it nearly impossible to distinguish between cause and effect. The strength of causal influence between two nodes need not be defined in the design phase. During the design phase what is simply defined is that the causation might exist. When in doubt, the recommendation is to add an edge, as omitting one is considered a very strong assumption. In essence, the causal inference function is to analyze and calculate this strength quantitatively between a change of a cause and its causal impact on an effect attribute. This is described in more detail in the following point.
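A DAG of this kind can be represented simply as a set of directed (parent, child) edges, and the acyclicity requirement can be checked mechanically. The sketch below uses illustrative node names, not ones from the patent; the check is Kahn's topological-sort algorithm:

```python
# Minimal DAG representation: each edge points from cause (parent) to effect (child).
EDGES = {
    ("node_type", "latency_ms"),   # cloud computing setting -> KPI
    ("chaos", "latency_ms"),       # chaos feature -> KPI
    ("node_type", "throughput"),   # same setting may affect several KPIs
}

def is_acyclic(edges: set[tuple[str, str]]) -> bool:
    """Kahn's algorithm: a valid causal graph must contain no cycles."""
    nodes = {n for e in edges for n in e}
    indeg = {n: 0 for n in nodes}
    for _, child in edges:
        indeg[child] += 1
    frontier = [n for n, d in indeg.items() if d == 0]
    seen = 0
    while frontier:
        n = frontier.pop()
        seen += 1
        for parent, child in edges:
            if parent == n:
                indeg[child] -= 1
                if indeg[child] == 0:
                    frontier.append(child)
    return seen == len(nodes)       # all nodes orderable <=> no cycle
```

A two-node cycle such as {("a", "b"), ("b", "a")} fails the check, matching the observation above that a cycle makes cause and effect indistinguishable.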

[0084] Performing causal inference (1102, 1110, 1112, 1114, 1118) - this module is responsible for processing the raw observational data 1122 with input from the DAG, e.g. with use of one or more of causal graphs 1128, to perform causal inference 406. Metrics are reflected by causal effects 1110, which may include ATE (average treatment effect), ATT (average treatment effect on the treated), conditional ATE, ITE (individual treatment effect), and mediation analysis, e.g. NDE (natural direct effect) and NIE (natural indirect effect). Any configuration or control node can be a treatment candidate, and any value within that node can be used as the control, or base, value while some other value is considered the treated value; for instance, examining the service's throughput (e.g. a DAG outcome node) average treatment effect (e.g. ATE) when modifying the K8s cluster worker node type (e.g. a treatment node) where the application 816 (5G CNF 902) is deployed from a low powered CPU to a medium powered one. In this case it may be stated that the worker node type is treated from a low powered CPU to a medium one, and the goal is to understand the effect on service throughput. This section of the design typically involves formulating the estimand (refer to identification 1114) using do-calculus (refer to 1106) or some other causal inference technique as the first step, with input 1111 from the causal graphs 1128 and input 1113 from the column row data 1122. This allows counterfactual scenarios to be formulated (the reasoning aspect), which then allows the next step to invoke a statistical analysis technique 1112 to calculate the causal inference result metrics that are described herein. An output 1119 of ML and statistical analysis 1112 is provided to robustness check 1118, and another output 1121 of ML and statistical analysis 1112 is provided to causal effects module 1110. The output 1123 of the identification 1114 is provided to ML and statistical analysis 1112.
Additionally, there are algorithm-based refutation techniques that can be used to corroborate the robustness and trustworthiness of the causal inference results (1118). The output 1117 of robustness check 1118 is provided to causal effects module 1110. For details refer to the DoWhy library. Item 1102 includes both causal inference 406 and causal discovery 408.
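The paragraph above defers to the DoWhy library for the actual identification, estimation, and refutation machinery. As a dependency-free illustration of the underlying idea only, the sketch below estimates an ATE with a simple stratified (backdoor-adjustment) estimator over one discrete confounder; the function and column names are hypothetical and are not the patent's or DoWhy's API:

```python
from collections import defaultdict

def ate_backdoor(rows, treatment, control_val, treated_val, outcome, confounder):
    """Estimate the ATE of treated_val vs control_val on `outcome`,
    adjusting for one discrete confounder (stratum-weighted mean difference)."""
    strata = defaultdict(lambda: {"c": [], "t": []})
    for r in rows:
        if r[treatment] == control_val:
            strata[r[confounder]]["c"].append(r[outcome])
        elif r[treatment] == treated_val:
            strata[r[confounder]]["t"].append(r[outcome])
    total, effect = 0, 0.0
    for s in strata.values():
        if s["c"] and s["t"]:                       # stratum must have both arms
            n = len(s["c"]) + len(s["t"])
            diff = sum(s["t"]) / len(s["t"]) - sum(s["c"]) / len(s["c"])
            effect += n * diff                      # weight by stratum size
            total += n
    return effect / total if total else float("nan")
```

For example, with rows in which switching the treatment from "low" to "med" lowers the outcome by 3 in every confounder stratum, the stratified ATE evaluates to -3.0.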

[0085] The DAG 1128 (the one or more DAGs 1128) can be 1) derived using the causal discovery process (1124, 1125, 1126, 1127), or 2) developed by a human expert.

[0086] Causal discovery (1124, 1126) - this module may be considered optional. The DAG can be generated semi-automatically using a causal discovery (CD) process 1124, with input 1125 from observational data 1122. Many such algorithms can be used to infer a graph by using observational data 1122 as input. However, given the state of the art (SOTA) of the techniques in this domain, the resulting graphs are not necessarily DAGs, as edges might lack direction or display other types of ambiguity. Expert knowledge is usually still applied to assist in removing ambiguities 1126 in the graph to formulate the DAG. The causal discovery process might generate many candidate graphs, where expert knowledge is applied to select the closest one. The resulting causal discovery-generated DAG can be used as part of the regular causal inference process. In addition, this DAG can be used for root cause analysis purposes. In certain use cases the causal discovery function can be used standalone, beyond being just an assistive function to the causal inference process.
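The ambiguity-removal step can be sketched as orienting an undirected skeleton using an expert-supplied causal ordering. This is a deliberately simplified stand-in (real CD output and expert input are richer), and the node names are illustrative:

```python
def orient_edges(skeleton, expert_order):
    """skeleton: set of frozenset node pairs (undirected/ambiguous CD output).
    expert_order: a causal ordering supplied by a domain expert.
    Returns directed (parent, child) edges consistent with that ordering,
    i.e. the human-assisted ambiguity removal (1126) on the CD graph (1124)."""
    rank = {n: i for i, n in enumerate(expert_order)}
    directed = set()
    for pair in skeleton:
        a, b = sorted(pair, key=lambda n: rank[n])
        directed.add((a, b))   # earlier in the ordering is taken as the cause
    return directed
```

Because every edge is oriented consistently with one total ordering, the resulting graph is guaranteed acyclic and can feed directly into the regular causal inference process.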

[0087] Design optimization - this final step includes assessing the causal inference results in search of validation or dismissal of assumptions established during the design phase. The design task may be fulfilled manually in CAROT. Alternatively, with adequate automation, the design optimization can be tightly integrated into CI/CD pipelines to streamline the entire process.

[0088] Another feature of the herein described system is the ability to perform an RCT (randomized controlled trial) 1202 in the digital twin environment, as shown and referenced in the high-level architecture of FIG. 12. RCT is often touted as the gold standard in causal analysis and treatment effect studies, and is at least a baseline standard. [0089] RCT is, however, not always feasible due to cost, ethical, and many other practical limitations. Another limitation of RCT is that it can generally study only one cause and effect pair at a time.

[0090] In CAROT's case, RCT is performed using the digital twin component, testing one treatment effect at a time. The causal inference process described above remains more advantageous when considering the possible permutations of treatment effects, and is the effective process if the objective is to understand many specific treatment combinations. In this case, the RCT setup is used as a verification function to spot-check the causal inference results: it is carried out to study one specific treatment-outcome pair and to compare results to those produced by the causal inference analysis. Also shown in FIG. 12 is interface 415, used to provide information from the chaos framework 402 to experiments 1116.
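The spot-check idea can be sketched as follows: run a small randomized trial in the twin for one treatment/outcome pair and compare its arm-mean difference against the observational causal-inference estimate. Everything here (function name, arm labels, noise model) is illustrative:

```python
import random

def rct_ate(run_fn, control, treated, n_per_arm, seed=0):
    """Randomized controlled trial in the digital twin: assign experiment runs
    to arms at random, study one treatment/outcome pair, and estimate the ATE
    as the difference of arm means."""
    rng = random.Random(seed)
    arms = [control] * n_per_arm + [treated] * n_per_arm
    rng.shuffle(arms)                                # random assignment
    c, t = [], []
    for arm in arms:
        (t if arm == treated else c).append(run_fn(arm, rng))
    return sum(t) / len(t) - sum(c) / len(c)
```

If this randomized estimate and the observational causal-inference estimate for the same pair diverge materially, that flags the DAG or the estimator for review, which is exactly the verification role described above.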

[0091] FIG. 13 shows the workflow diagram 1300 across the individual modules of the cause effect inference component 1100. Observational data is generated and collected (1108), which includes generating digital twin observations (1104) and formatting (1116) the observations as rows in a table, with columns representing experiment features and KPIs. At 1302, a determination is made as to whether causal discovery is applied. If at 1302 it is determined that causal discovery is applied, the method transitions to causal discovery 1304, and if at 1302 it is determined that causal discovery is not applied, the method transitions to formulating the input 1122. Optional causal discovery 1304 includes auto DAG generation 1124 and ambiguity removal 1126, which ambiguity removal 1126 may be a human-assisted process. Formulating the input 1122 includes generating a causal graph (1128) using domain knowledge.

[0092] Following generating and collecting the observational data (1108) and formulating the input (1122), the method transitions to causal inference (1102). Causal inference 1102 includes identification 1114, ML and statistical analysis 1112, robustness check 1118 which may include refutations, and generating causal effects 1110 such as ATE, CATE, and causal tree generation. Design optimization 410 is performed following causal inference 1102, followed by ending the process at 1306.

[0093] The causal reasoning system for operational twin described herein provides several advantages and technical effects. The system's unique design allows cause effect insights to be inferred from environments that closely resemble production conditions, by creating digital twins. This provides several advantages that help improve overall application development cycles and quality, specifically confirming or dismissing application (e.g. 5G CNF) design assumptions regarding ideal and impaired environments, including performance, robustness, and application configuration setups. The system also provides functionality for confirming or dismissing application (e.g. 5G CNF) design assumptions regarding platform requirements such as computational and networking requirements, etc. The system additionally includes functionality for validating root cause assumptions in zero-touch operation and troubleshooting, and for identifying potential application (e.g. 5G CNF) issues long before they are encountered in production environments.

[0094] Thus, the examples described herein relate to software development, agile methodologies, and application architecture for the deployment (e.g. CI/CD) and testing of applications (e.g. 5G CNFs) on customer production environments. The examples described herein center on the application of causal reasoning (ML) techniques to 5G CNF software testing under a CI/CD pipeline, including what-if scenario analysis capability, the ability to address counterfactual questions, and the ability to provide the equivalent of treatment effect estimations for performing causal inference.

[0095] Turning to FIG. 14, this figure shows a block diagram of one possible and nonlimiting example in which the examples may be practiced. A user equipment (UE) 110, radio access network (RAN) node 170, and network element(s) 190 are illustrated. In the example of FIG. 14, the user equipment (UE) 110 is in wireless communication with a wireless network 100. A UE is a wireless device that can access the wireless network 100. The UE 110 includes one or more processors 120, one or more memories 125, and one or more transceivers 130 interconnected through one or more buses 127. Each of the one or more transceivers 130 includes a receiver, Rx, 132 and a transmitter, Tx, 133. The one or more buses 127 may be address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, and the like. The one or more transceivers 130 are connected to one or more antennas 128. The one or more memories 125 include computer program code 123. The UE 110 includes a module 140, comprising one of or both parts 140-1 and/or 140-2, which may be implemented in a number of ways. The module 140 may be implemented in hardware as module 140-1, such as being implemented as part of the one or more processors 120. The module 140-1 may be implemented also as an integrated circuit or through other hardware such as a programmable gate array. In another example, the module 140 may be implemented as module 140-2, which is implemented as computer program code 123 and is executed by the one or more processors 120. For instance, the one or more memories 125 and the computer program code 123 may be configured to, with the one or more processors 120, cause the user equipment 110 to perform one or more of the operations as described herein. The UE 110 communicates with RAN node 170 via a wireless link 111.

[0096] The RAN node 170 in this example is a base station that provides access by wireless devices such as the UE 110 to the wireless network 100. The RAN node 170 may be, for example, a base station for 5G, also called New Radio (NR). In 5G, the RAN node 170 may be a NG-RAN node, which is defined as either a gNB or an ng-eNB. A gNB is a node providing NR user plane and control plane protocol terminations towards the UE, and connected via the NG interface (such as connection 131) to a 5GC (such as, for example, the network element(s) 190). The ng-eNB is a node providing E-UTRA user plane and control plane protocol terminations towards the UE, and connected via the NG interface (such as connection 131) to the 5GC. The NG-RAN node may include multiple gNBs, which may also include a central unit (CU) (gNB-CU) 196 and distributed unit(s) (DUs) (gNB-DUs), of which DU 195 is shown. Note that the DU 195 may include or be coupled to and control a radio unit (RU). The gNB-CU 196 is a logical node hosting radio resource control (RRC), SDAP and PDCP protocols of the gNB or RRC and PDCP protocols of the en-gNB that control the operation of one or more gNB-DUs. The gNB-CU 196 terminates the F1 interface connected with the gNB-DU 195. The F1 interface is illustrated as reference 198, although reference 198 also illustrates a link between remote elements of the RAN node 170 and centralized elements of the RAN node 170, such as between the gNB-CU 196 and the gNB-DU 195. The gNB-DU 195 is a logical node hosting RLC, MAC and PHY layers of the gNB or en-gNB, and its operation is partly controlled by gNB-CU 196. One gNB-CU 196 supports one or multiple cells. One cell may be supported with one gNB-DU 195, or one cell may be supported/shared with multiple DUs under RAN sharing. The gNB-DU 195 terminates the F1 interface 198 connected with the gNB-CU 196.
Note that the DU 195 is considered to include the transceiver 160, e.g., as part of a RU, but some examples of this may have the transceiver 160 as part of a separate RU, e.g., under control of and connected to the DU 195. The RAN node 170 may also be an eNB (evolved NodeB) base station, for LTE (long term evolution), or any other suitable base station or node. [0097] The RAN node 170 includes one or more processors 152, one or more memories 155, one or more network interfaces (N/W I/F(s)) 161, and one or more transceivers 160 interconnected through one or more buses 157. Each of the one or more transceivers 160 includes a receiver, Rx, 162 and a transmitter, Tx, 163. The one or more transceivers 160 are connected to one or more antennas 158. The one or more memories 155 include computer program code 153. The CU 196 may include the processor(s) 152, memory(ies) 155, and network interfaces 161. Note that the DU 195 may also contain its own memory/memories and processor(s), and/or other hardware, but these are not shown.

[0098] The RAN node 170 includes a module 150, comprising one of or both parts 150-1 and/or 150-2, which may be implemented in a number of ways. The module 150 may be implemented in hardware as module 150-1, such as being implemented as part of the one or more processors 152. The module 150-1 may be implemented also as an integrated circuit or through other hardware such as a programmable gate array. In another example, the module 150 may be implemented as module 150-2, which is implemented as computer program code 153 and is executed by the one or more processors 152. For instance, the one or more memories 155 and the computer program code 153 are configured to, with the one or more processors 152, cause the RAN node 170 to perform one or more of the operations as described herein. Note that the functionality of the module 150 may be distributed, such as being distributed between the DU 195 and the CU 196, or be implemented solely in the DU 195.

[0099] The one or more network interfaces 161 communicate over a network such as via the links 176 and 131. Two or more gNBs 170 may communicate using, e.g., link 176. The link 176 may be wired or wireless or both and may implement, for example, an Xn interface for 5G, an X2 interface for LTE, or other suitable interface for other standards.

[0100] The one or more buses 157 may be address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, wireless channels, and the like. For example, the one or more transceivers 160 may be implemented as a remote radio head (RRH) 195 for LTE or a distributed unit (DU) 195 for gNB implementation for 5G, with the other elements of the RAN node 170 possibly being physically in a different location from the RRH/DU 195, and the one or more buses 157 could be implemented in part as, for example, fiber optic cable or other suitable network connection to connect the other elements (e.g., a central unit (CU), gNB-CU 196) of the RAN node 170 to the RRH/DU 195. Reference 198 also indicates those suitable network link(s).

[0101] It is noted that the description herein indicates that "cells" perform functions, but it should be clear that equipment which forms the cell may perform the functions. The cell makes up part of a base station. That is, there can be multiple cells per base station. For example, there could be three cells for a single carrier frequency and associated bandwidth, each cell covering one-third of a 360 degree area so that the single base station's coverage area covers an approximate oval or circle. Furthermore, each cell can correspond to a single carrier and a base station may use multiple carriers. So if there are three 120 degree cells per carrier and two carriers, then the base station has a total of 6 cells.

[0102] The wireless network 100 may include a network function or functions 190 that may include core network functionality, and which provides connectivity via a link or links 181 with a further network, such as a telephone network and/or a data communications network (e.g., the Internet). Such core network functionality for 5G may include location management functions (LMF(s)) and/or access and mobility management function(s) (AMF(s)) and/or user plane functions (UPF(s)) and/or session management function(s) (SMF(s)). Such core network functionality for LTE may include MME (Mobility Management Entity)/SGW (Serving Gateway) functionality. Such core network functionality may include SON (self-organizing/optimizing network) functionality. These are merely example functions that may be supported by the network function(s) 190, and note that both 5G and LTE functions might be supported. The RAN node 170 is coupled via a link 131 to the network function 190. The link 131 may be implemented as, e.g., an NG interface for 5G, or an S1 interface for LTE, or other suitable interface for other standards. The network function 190 includes one or more processors 175, one or more memories 171, and one or more network interfaces (N/W I/F(s)) 180, interconnected through one or more buses 185. The one or more memories 171 include computer program code 173.

[0103] The wireless network 100 may implement network virtualization, which is the process of combining hardware and software network resources and network functionality into a single, software-based administrative entity, a virtual network. Network virtualization involves platform virtualization, often combined with resource virtualization. Network virtualization is categorized as either external, combining many networks, or parts of networks, into a virtual unit, or internal, providing network-like functionality to software containers on a single system. Note that the virtualized entities that result from the network virtualization are still implemented, at some level, using hardware such as processors 152 or 175 and memories 155 and 171, and also such virtualized entities create technical effects.

[0104] The computer readable memories 125, 155, and 171 may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, non-transitory memory, transitory memory, fixed memory and removable memory. The computer readable memories 125, 155, and 171 may be means for performing storage functions. The processors 120, 152, and 175 may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs) and processors based on a multi-core processor architecture, as nonlimiting examples. The processors 120, 152, and 175 may be means for performing functions, such as controlling the UE 110, RAN node 170, network function(s) 190, and other functions as described herein.

[0105] In general, the various embodiments of the user equipment 110 can include, but are not limited to, cellular telephones such as smart phones, tablets, personal digital assistants (PDAs) having wireless communication capabilities, portable computers having wireless communication capabilities, image capture devices such as digital cameras having wireless communication capabilities, gaming devices having wireless communication capabilities, music storage and playback appliances having wireless communication capabilities, Internet appliances permitting wireless Internet access and browsing, tablets with wireless communication capabilities, head mounted displays such as those that implement virtual/augmented/mixed reality, as well as portable units or terminals that incorporate combinations of such functions.

[0106] UE 110, RAN node 170, and/or network function(s) 190, (and associated memories, computer program code and modules) may be configured to implement (e.g. in part) the methods described herein, including a causal reasoning system for operational twin (CAROT) for development and operation of 5G CNFs. Thus, computer program code 123, module 140-1, module 140-2, and other elements/features shown in FIG. 14 of UE 110 may implement user equipment related aspects of the methods described herein. Similarly, computer program code 153, module 150-1, module 150-2, and other elements/features shown in FIG. 14 of RAN node 170 may implement gNB/TRP related aspects of the methods described herein, such as for a target gNB or a source gNB. Computer program code 173 and other elements/features shown in FIG. 14 of network function(s) 190 may be configured to implement network function/element related aspects of the methods described herein such as for an OAM node.

[0107] FIG. 15 is an example apparatus 1500, which may be implemented in hardware, configured to implement the examples described herein. The apparatus 1500 comprises at least one processor 1502 (e.g. an FPGA and/or CPU), at least one memory 1504 including computer program code 1505, wherein the at least one memory 1504 and the computer program code 1505 are configured to, with the at least one processor 1502, cause the apparatus 1500 to implement circuitry, a process, component, module, or function (collectively control 1506) to implement the examples described herein, including a causal reasoning system for operational twin (CAROT) for development and operation of 5G CNFs. The memory 1504 may be a non-transitory memory, a transitory memory, a volatile memory (e.g. RAM), or a non-volatile memory (e.g. ROM).

[0108] The apparatus 1500 optionally includes a display and/or I/O interface 1508 that may be used to display aspects or a status of the methods described herein (e.g., as one of the methods is being performed or at a subsequent time), or to receive input from a user, such as by using a keypad, camera, touchscreen, touch area, microphone, biometric recognition, one or more sensors, etc. The apparatus 1500 includes one or more communication interfaces (I/F(s)) 1510, e.g. one or more network (N/W) interface(s). The communication I/F(s) 1510 may be wired and/or wireless and communicate over the Internet/other network(s) via any communication technique. The communication I/F(s) 1510 may comprise one or more transmitters and one or more receivers. The communication I/F(s) 1510 may comprise standard well-known components such as an amplifier, filter, frequency-converter, (de)modulator, and encoder/decoder circuitries and one or more antennas.

[0109] The apparatus 1500 to implement the functionality of control 1506 may be UE 110, RAN node 170 (e.g. gNB), network element(s) 190 or any of the other examples described herein. Thus, processor 1502 may correspond to processor(s) 120, processor(s) 152 and/or processor(s) 175, memory 1504 may correspond to memory(ies) 125, memory(ies) 155 and/or memory(ies) 171, computer program code 1505 may correspond to computer program code 123, module 140-1, module 140-2, and/or computer program code 153, module 150-1, module 150-2, and/or computer program code 173, and communication I/F(s) 1510 may correspond to transceiver 130, antenna(s) 128, transceiver 160, antenna(s) 158, N/W I/F(s) 161, and/or N/W I/F(s) 180. Alternatively, apparatus 1500 may not correspond to either of UE 110, RAN node 170, or network function(s) 190, as apparatus 1500 may be part of a self-organizing/optimizing network (SON) node, such as in a cloud.

[0110] The apparatus 1500 may also be distributed throughout the network (e.g. 100), including within and between apparatus 1500 and any network element (such as a network control function (NCE) 190 and/or the RAN node 170 and/or the UE 110).

[0111] Interface 1512 enables data communication between the various items of apparatus 1500, as shown in FIG. 15. For example, the interface 1512 may be one or more buses such as address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, and the like. Computer program code 1505, including control 1506, may comprise object-oriented software configured to pass data and messages between objects within computer program code 1505. The apparatus 1500 need not comprise each of the features mentioned, or may comprise other features as well.

[0112] FIG. 16 shows a schematic representation of non-volatile memory media 1600a (e.g. computer disc (CD) or digital versatile disc (DVD)) and 1600b (e.g. universal serial bus (USB) memory stick) storing instructions and/or parameters 1602 which when executed by a processor allows the processor to perform one or more of the steps of the methods described herein.

[0113] FIG. 17 is an example method 1700 to implement the example embodiments described herein. At 1710, the method includes operating a replicate of one or more cloud native network functions. At 1720, the method includes generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions. At 1730, the method includes applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions. Method 1700 may be performed with CAROT 201, apparatus 1100, or apparatus 1500. [0114] FIG. 18 is an example method 1800 to implement the example embodiments described herein. At 1810, the method includes operating a replicate of one or more target applications. At 1820, the method includes generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications. At 1830, the method includes applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications. Method 1800 may be performed with CAROT 201, apparatus 1100, or apparatus 1500.

[0115] FIG. 19 is an example method 1900 to implement the example embodiments described herein. At 1910, the method includes selecting one or more cloud native network functions. At 1920, the method includes selecting at least one feature of the one or more cloud native network functions. At 1930, the method includes selecting at least one environmental condition of the one or more cloud native network functions. At 1940, the method includes selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition. At 1950, the method includes performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions. At 1960, the method includes collecting observational data from the at least one experiment. At 1970, the method includes applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions. Method 1900 may be performed with CAROT 201, apparatus 1100, or apparatus 1500.

[0116] The following examples are provided and described herein.

[0117] Example 1. An apparatus including: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more cloud native network functions; generate observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0118] Example 2. The apparatus of example 1, wherein the apparatus is caused to: design the one or more cloud native network functions based on the analyzed causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0119] Example 3. The apparatus of example 2, wherein the one or more cloud native network functions is designed during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more cloud native network functions in a production environment, the release being of the continuous integration and continuous deployment pipeline.

[0120] Example 4. The apparatus of any of examples 1 to 3, wherein the apparatus is caused to: operate or configure the one or more cloud native network functions during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more cloud native network functions in a production environment, the release being of the continuous integration and continuous deployment pipeline.

[0121] Example 5. The apparatus of any of examples 1 to 4, wherein the one or more cloud native network functions comprises a software application for a 5G network.

[0122] Example 6. The apparatus of any of examples 1 to 5, wherein the replicate comprises a digital twin of the one or more cloud native network functions.

[0123] Example 7. The apparatus of any of examples 1 to 6, wherein the at least one observed cause comprises an operating attribute of the one or more cloud native network functions, wherein configuration of the one or more cloud native network functions comprises a type of operating attribute.

[0124] Example 8. The apparatus of any of examples 1 to 7, wherein the at least one observed effect comprises at least one performance indicator of the one or more cloud native network functions.

[0125] Example 9. The apparatus of any of examples 1 to 8, wherein the at least one observed effect comprises at least one feature, a load, or at least one environmental condition of the one or more cloud native network functions.

[0126] Example 10. The apparatus of any of examples 1 to 9, wherein analyzing the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions comprises at least one of: determining an existence of a relationship between the at least one observed cause and the at least one observed effect; determining a magnitude of the relationship between the at least one observed cause and the at least one observed effect; or determining a way in which the at least one observed cause and the at least one observed effect are related.

[0127] Example 11. The apparatus of any of examples 1 to 10, wherein the plurality of operating conditions comprises at least one operating condition of the one or more cloud native network functions, the at least one operating condition comprising at least one of: network load; application programming interface load; domain name system impairment; kernel breakdown; total breakdown; network stress; computational resource stress; time impairment; web service breakdown; or input output impairment.

[0128] Example 12. The apparatus of any of examples 1 to 11, wherein the apparatus is further caused to: form at least one graph that represents the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0129] Example 13. The apparatus of example 12, wherein the at least one graph comprises a directed acyclic graph.
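A causal graph of the kind in examples 12 and 13 can be represented as a plain adjacency mapping and checked for acyclicity with Kahn's algorithm. This is an illustrative sketch; the node names are assumptions, not taken from the patent.

```python
# Causal graph: each edge maps a cause to its effects.
# Node names are hypothetical (illustrative assumptions).
causal_graph = {
    "cpu_stress":   ["pod_restarts", "latency"],
    "network_load": ["latency"],
    "pod_restarts": ["latency", "error_rate"],
    "latency":      ["error_rate"],
    "error_rate":   [],
}

def is_acyclic(graph: dict[str, list[str]]) -> bool:
    """Kahn's algorithm: the graph is a DAG iff every node can be
    removed in topological order (no node left with incoming edges)."""
    indeg = {n: 0 for n in graph}
    for targets in graph.values():
        for t in targets:
            indeg[t] += 1
    frontier = [n for n, d in indeg.items() if d == 0]
    seen = 0
    while frontier:
        node = frontier.pop()
        seen += 1
        for t in graph[node]:
            indeg[t] -= 1
            if indeg[t] == 0:
                frontier.append(t)
    return seen == len(graph)
```

Domain expert knowledge (example 14) would enter here as manual edits to the adjacency mapping, e.g. deleting an edge that the expert knows to be spurious.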

[0130] Example 14. The apparatus of any of examples 12 to 13, wherein the apparatus is further caused to: form the at least one graph using domain expert knowledge to remove at least one ambiguity related to causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0131] Example 15. The apparatus of any of examples 1 to 14, wherein the apparatus is further caused to: generate the observational data as a table, wherein a row of the table comprises an observation, and a column of the table comprises a cause attribute, an effect attribute, or an attribute comprising both a cause and effect.

[0132] Example 16. The apparatus of any of examples 1 to 15, wherein the apparatus is further caused to: generate the observational data as a table, wherein a first column of the table comprises at least a portion of operating configurations of the one or more cloud native network functions, and a second column of the table comprises the at least one observed effect of the one or more cloud native network functions.
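The tabular layout of examples 15 and 16 (one row per observation; columns split into cause and effect attributes) can be sketched as follows. The column names and values are illustrative assumptions only.

```python
# Cause attributes = operating configuration; effect attributes = observed
# KPIs. All names and values are hypothetical, for illustration.
CAUSE_COLUMNS = ["replica_count", "cpu_limit_millicores", "condition"]
EFFECT_COLUMNS = ["latency_ms", "error_rate"]

rows = [
    {"replica_count": 2, "cpu_limit_millicores": 500, "condition": "baseline",
     "latency_ms": 12.1, "error_rate": 0.001},
    {"replica_count": 2, "cpu_limit_millicores": 500, "condition": "cpu_stress",
     "latency_ms": 48.7, "error_rate": 0.031},
]

def validate(rows: list[dict], cause_cols: list[str], effect_cols: list[str]) -> bool:
    """Check every row carries exactly the cause and effect columns."""
    expected = set(cause_cols) | set(effect_cols)
    for r in rows:
        if set(r) != expected:
            raise ValueError(f"row schema mismatch: {sorted(r)}")
    return True
```

An attribute that is both a cause and an effect (example 15) would simply appear in both column lists.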

[0133] Example 17. The apparatus of any of examples 1 to 16, wherein the apparatus is further caused to: generate the observational data as a table, wherein the table comprises an experimental group and a control group, wherein the experimental group comprises a collection of experiments in which at least one operating attribute is set to a specific value that is to be studied, and wherein the control group comprises a collection of experiments in which operating attributes are randomized.

[0134] Example 18. The apparatus of example 17, wherein the apparatus is further caused to: determine an average treatment effect of the one or more cloud native network functions when the at least one operating attribute is set to the specific value.
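The average treatment effect of examples 17 and 18 is the difference between the mean observed effect in the experimental group (attribute pinned to the studied value) and in the control group (attributes randomized). A minimal sketch, with illustrative latency values:

```python
from statistics import mean

# Hypothetical observed effects (latency in ms) from the two groups
# of example 17: values are made up for illustration.
experimental = [41.2, 39.8, 44.1, 40.5]   # operating attribute fixed
control      = [22.3, 25.1, 60.2, 18.9]   # operating attributes randomized

def average_treatment_effect(treated: list[float], untreated: list[float]) -> float:
    """ATE = E[effect | treatment] - E[effect | control]."""
    return mean(treated) - mean(untreated)

ate = average_treatment_effect(experimental, control)
```

A positive ATE here would indicate that pinning the attribute to the studied value increases latency relative to the randomized baseline.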

[0135] Example 19. The apparatus of any of examples 1 to 18, wherein the apparatus is further caused to: determine a magnitude of a relationship between the at least one observed cause and the at least one observed effect; wherein the magnitude of a relationship comprises at least one of: an average treatment effect, a conditional average treatment effect, an individual treatment effect, a natural direct effect, or a natural indirect effect.

[0136] Example 20. The apparatus of any of examples 1 to 19, wherein applying the causal reasoning function comprises performing a do-calculus to replace a do operator with at least one conditional probability of the at least one observed effect of the one or more cloud native network functions given the at least one observed cause, the at least one conditional probability used to infer the causality between the at least one observed cause and the at least one observed effect.
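The do-calculus step of example 20 can, when a valid adjustment set Z is known, be carried out with the backdoor adjustment, which replaces the do-operator with observable conditional probabilities: P(Y | do(X=x)) = Σ_z P(Y | X=x, Z=z) · P(Z=z). A toy sketch with binary variables; the data are illustrative assumptions.

```python
# Toy discrete observations: (cause X, confounder Z, effect Y), each binary.
# Hypothetical data chosen for illustration only.
data = [
    # (x, z, y)
    (1, 1, 1), (1, 1, 1), (1, 0, 0), (1, 0, 1),
    (0, 1, 1), (0, 1, 0), (0, 0, 0), (0, 0, 0),
]

def p_y_given_do_x(data: list[tuple], x: int, y: int = 1) -> float:
    """Backdoor adjustment: P(Y=y | do(X=x)) = sum_z P(Y=y|X=x,Z=z) * P(Z=z)."""
    n = len(data)
    total = 0.0
    for z in {row[1] for row in data}:
        pz = sum(1 for (_, zz, _) in data if zz == z) / n
        stratum = [row for row in data if row[0] == x and row[1] == z]
        if not stratum:
            continue  # unsupported stratum; a real tool would flag this
        py = sum(1 for (_, _, yy) in stratum if yy == y) / len(stratum)
        total += py * pz
    return total
```

Comparing `p_y_given_do_x(data, 1)` against `p_y_given_do_x(data, 0)` gives the interventional contrast used to infer causality between the observed cause and effect.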

[0137] Example 21. The apparatus of any of examples 1 to 20, wherein applying the causal reasoning function comprises performing a statistical analysis of the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0138] Example 22. The apparatus of any of examples 1 to 21, wherein applying the causal reasoning function comprises applying machine learning to analyze the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0139] Example 23. The apparatus of any of examples 1 to 22, wherein the apparatus is further caused to: validate or dismiss at least one design or configuration assumption related to the at least one observed cause or the at least one observed effect of the one or more cloud native network functions, based on the application of the causal reasoning function.

[0140] Example 24. The apparatus of any of examples 1 to 23, wherein the apparatus is further caused to: determine at least one operating condition of the one or more cloud native network functions to be suboptimal, based on the application of the causal reasoning function.

[0141] Example 25. The apparatus of any of examples 1 to 24, wherein the apparatus is further caused to: perform a root cause analysis to determine at least one cause of a fault of operation of the one or more cloud native network functions.

[0142] Example 26. The apparatus of example 25, wherein the apparatus is further caused to: operate or configure the one or more cloud native network functions using at least one result of the root cause analysis to complement the application of the causal reasoning function.

[0143] Example 27. The apparatus of any of examples 1 to 26, wherein the apparatus is further caused to: determine whether the one or more cloud native network functions is an external application or an internal application.

[0144] Example 28. The apparatus of any of examples 1 to 27, wherein the one or more cloud native network functions is in containerized software form.

[0145] Example 29. An apparatus including: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: operate a replicate of one or more target applications; generate observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0146] Example 30. The apparatus of example 29, wherein the apparatus is caused to: design the one or more target applications based on the analyzed causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

[0147] Example 31. The apparatus of example 30, wherein the one or more target applications is designed during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more target applications in a production environment, the release being of the continuous integration and continuous deployment pipeline.

[0148] Example 32. The apparatus of any of examples 29 to 31, wherein the apparatus is caused to: operate or configure the one or more target applications during or after a testing phase of a continuous integration and continuous deployment pipeline, and prior to release of the one or more target applications in a production environment, the release being of the continuous integration and continuous deployment pipeline.

[0149] Example 33. The apparatus of any of examples 29 to 32, wherein the one or more target applications comprises a software application for a 5G network.

[0150] Example 34. The apparatus of any of examples 29 to 33, wherein the replicate comprises a digital twin of the one or more target applications.

[0151] Example 35. The apparatus of any of examples 29 to 34, wherein the at least one observed cause comprises an operating attribute of the one or more target applications, wherein configuration of the one or more target applications comprises a type of operating attribute.

[0152] Example 36. The apparatus of any of examples 29 to 35, wherein the at least one observed effect comprises at least one performance indicator of the one or more target applications.

[0153] Example 37. The apparatus of any of examples 29 to 36, wherein the at least one observed effect comprises at least one feature, a load, or at least one environmental condition of the one or more target applications.

[0154] Example 38. The apparatus of any of examples 29 to 37, wherein analyzing the causality between the at least one observed cause and the at least one observed effect of the one or more target applications comprises at least one of: determining an existence of a relationship between the at least one observed cause and the at least one observed effect; determining a magnitude of the relationship between the at least one observed cause and the at least one observed effect; or determining a way in which the at least one observed cause and the at least one observed effect are related.

[0155] Example 39. The apparatus of any of examples 29 to 38, wherein the plurality of operating conditions comprises at least one operating condition of the one or more target applications, the at least one operating condition comprising at least one of: network load; application programming interface load; domain name system impairment; kernel breakdown; total breakdown; network stress; computational resource stress; time impairment; web service breakdown; or input output impairment.

[0156] Example 40. The apparatus of any of examples 29 to 39, wherein the apparatus is further caused to: form at least one graph that represents the causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

[0157] Example 41. The apparatus of example 40, wherein the at least one graph comprises a directed acyclic graph.

[0158] Example 42. The apparatus of any of examples 40 to 41, wherein the apparatus is further caused to: form the at least one graph using domain expert knowledge to remove at least one ambiguity related to causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

[0159] Example 43. The apparatus of any of examples 29 to 42, wherein the apparatus is further caused to: generate the observational data as a table, wherein a row of the table comprises an observation, and a column of the table comprises a cause attribute, an effect attribute, or an attribute comprising both a cause and effect.

[0160] Example 44. The apparatus of any of examples 29 to 43, wherein the apparatus is further caused to: generate the observational data as a table, wherein a first column of the table comprises at least a portion of operating configurations of the one or more target applications, and a second column of the table comprises the at least one observed effect of the one or more target applications.

[0161] Example 45. The apparatus of any of examples 29 to 44, wherein the apparatus is further caused to: generate the observational data as a table, wherein the table comprises an experimental group and a control group, wherein the experimental group comprises a collection of experiments in which at least one operating attribute is set to a specific value that is to be studied, and wherein the control group comprises a collection of experiments in which operating attributes are randomized.

[0162] Example 46. The apparatus of example 45, wherein the apparatus is further caused to: determine an average treatment effect of the one or more target applications when the at least one operating attribute is set to the specific value.

[0163] Example 47. The apparatus of any of examples 29 to 46, wherein the apparatus is further caused to: determine a magnitude of a relationship between the at least one observed cause and the at least one observed effect; and wherein the magnitude of a relationship comprises at least one of: an average treatment effect, a conditional average treatment effect, an individual treatment effect, a natural direct effect, or a natural indirect effect.

[0164] Example 48. The apparatus of any of examples 29 to 47, wherein applying the causal reasoning function comprises performing a do-calculus to replace a do operator with at least one conditional probability of the at least one observed effect of the one or more target applications given the at least one observed cause, the at least one conditional probability used to infer the causality between the at least one observed cause and the at least one observed effect.

[0165] Example 49. The apparatus of any of examples 29 to 48, wherein applying the causal reasoning function comprises performing a statistical analysis of the at least one observed cause and the at least one observed effect of the one or more target applications.

[0166] Example 50. The apparatus of any of examples 29 to 49, wherein applying the causal reasoning function comprises applying machine learning to analyze the causality between the at least one observed cause and the at least one observed effect of the one or more target applications.

[0167] Example 51. The apparatus of any of examples 29 to 50, wherein the apparatus is further caused to: validate or dismiss at least one design or configuration assumption related to the at least one observed cause or the at least one observed effect of the one or more target applications, based on the application of the causal reasoning function.

[0168] Example 52. The apparatus of any of examples 29 to 51, wherein the apparatus is further caused to: determine at least one operating condition of the one or more target applications to be suboptimal, based on the application of the causal reasoning function.

[0169] Example 53. The apparatus of any of examples 29 to 52, wherein the apparatus is further caused to: perform a root cause analysis to determine at least one cause of a fault of operation of the one or more target applications.

[0170] Example 54. The apparatus of example 53, wherein the apparatus is further caused to: operate or configure the one or more target applications using at least one result of the root cause analysis to complement the application of the causal reasoning function.

[0171] Example 55. The apparatus of any of examples 29 to 54, wherein the apparatus is further caused to: determine whether the one or more target applications is an external application or an internal application.

[0172] Example 56. The apparatus of any of examples 29 to 55, wherein the one or more target applications is in containerized software form.

[0173] Example 57. The apparatus of any of examples 29 to 56, wherein the one or more target applications comprises a cloud native network function.

[0174] Example 58. The apparatus of example 57, wherein the cloud native network function comprises a 5G cloud native network function.

[0175] Example 59. The apparatus of any of examples 57 to 58, wherein the cloud native network function is in containerized software form.

[0176] Example 60. An apparatus including: at least one processor; and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: select one or more cloud native network functions; select at least one feature of the one or more cloud native network functions; select at least one environmental condition of the one or more cloud native network functions; select a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; perform at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collect observational data from the at least one experiment; and apply a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0177] Example 61. The apparatus of example 60, wherein the at least one observed cause comprises an operating attribute of the one or more cloud native network functions, wherein configuration of the one or more cloud native network functions comprises a type of operating attribute.

[0178] Example 62. The apparatus of any of examples 60 to 61, wherein the at least one observed effect comprises at least one performance indicator of the one or more cloud native network functions.

[0179] Example 63. The apparatus of any of examples 60 to 62, wherein the at least one observed effect comprises the at least one feature, the load, or the at least one environmental condition of the one or more cloud native network functions.

[0180] Example 64. The apparatus of any of examples 60 to 63, wherein the apparatus is further caused to: design the one or more cloud native network functions based on the analyzed causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0181] Example 65. The apparatus of any of examples 60 to 64, wherein the apparatus is further caused to: determine whether the one or more cloud native network functions is an external application or an internal application.

[0182] Example 66. The apparatus of example 65, wherein the apparatus is further caused to: provide an endpoint of the one or more cloud native network functions, in response to determining that the one or more cloud native network functions is an external application.

[0183] Example 67. The apparatus of example 66, wherein the endpoint comprises a uniform resource locator.

[0184] Example 68. The apparatus of any of examples 65 to 67, wherein the apparatus is further caused to: deploy the one or more cloud native network functions, in response to determining that the one or more cloud native network functions is an internal application; wherein the at least one experiment is performed with the one or more cloud native network functions.

[0185] Example 69. The apparatus of example 68, wherein the apparatus is further caused to: wrap the one or more cloud native network functions within a helm template during the application of the causal reasoning function.

[0186] Example 70. The apparatus of any of examples 60 to 69, wherein the environmental condition of the one or more cloud native network functions comprises at least one of: network load; application programming interface load; domain name system impairment; kernel breakdown; total breakdown; network stress; computational resource stress; time impairment; web service breakdown; or input output impairment.

[0187] Example 71. The apparatus of any of examples 60 to 70, wherein the apparatus is further caused to: form at least one graph that represents the causality between the at least one observed cause and the at least one observed effect of the one or more cloud native network functions.

[0188] Example 72. The apparatus of any of examples 60 to 71, wherein the apparatus is further caused to: determine a magnitude of a relationship between the at least one observed cause and the at least one observed effect; and wherein the magnitude of a relationship comprises at least one of: an average treatment effect, a conditional average treatment effect, an individual treatment effect, a natural direct effect, or a natural indirect effect.

[0189] Example 73. The apparatus of any of examples 60 to 72, wherein the one or more cloud native network functions is in containerized software form.

[0190] Example 74. The apparatus of any of examples 60 to 73, wherein the one or more cloud native network functions is a fifth generation (5G) cloud native network function.

[0191] Example 75. A method including: operating a replicate of one or more cloud native network functions; generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0192] Example 76. A method including: operating a replicate of one or more target applications; generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0193] Example 77. A method including: selecting one or more cloud native network functions; selecting at least one feature of the one or more cloud native network functions; selecting at least one environmental condition of the one or more cloud native network functions; selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collecting observational data from the at least one experiment; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0194] Example 78. An apparatus including: means for operating a replicate of one or more cloud native network functions; means for generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0195] Example 79. An apparatus including: means for operating a replicate of one or more target applications; means for generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0196] Example 80. An apparatus including: means for selecting one or more cloud native network functions; means for selecting at least one feature of the one or more cloud native network functions; means for selecting at least one environmental condition of the one or more cloud native network functions; means for selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; means for performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; means for collecting observational data from the at least one experiment; and means for applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0197] Example 81. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations including: operating a replicate of one or more cloud native network functions; generating observational data of the replicate of the one or more cloud native network functions, the observational data generated based on a plurality of operating conditions of the one or more cloud native network functions; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0198] Example 82. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations including: operating a replicate of one or more target applications; generating observational data of the replicate of the one or more target applications, the observational data generated based on a plurality of operating conditions of the one or more target applications; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more target applications and the at least one observed effect of the one or more target applications.

[0199] Example 83. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable with the machine for performing operations, the operations including: selecting one or more cloud native network functions; selecting at least one feature of the one or more cloud native network functions; selecting at least one environmental condition of the one or more cloud native network functions; selecting a load of the one or more cloud native network functions, the load comprising an intensity and duration of processing of the one or more cloud native network functions subject to the at least one environmental condition; performing at least one experiment with the one or more cloud native network functions, the experiment based on the at least one feature, the load, and the at least one environmental condition of the one or more cloud native network functions; collecting observational data from the at least one experiment; and applying a causal reasoning function using the observational data to analyze causality between at least one observed cause of at least one observed effect of the one or more cloud native network functions and the at least one observed effect of the one or more cloud native network functions.

[0200] References to a ‘computer’, ‘processor’, etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential or parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), signal processing devices and other processing circuitry. References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device whether instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device etc.

[0201] The memory(ies) as described herein may be implemented using any suitable data storage technology, such as semiconductor based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, non-transitory memory, transitory memory, fixed memory and removable memory. The memory(ies) may comprise a database for storing data.

[0202] As used herein, the term ‘circuitry’ may refer to the following: (a) hardware circuit implementations, such as implementations in analog and/or digital circuitry, and (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus to perform various functions, and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present. As a further example, as used herein, the term ‘circuitry’ would also cover an implementation of merely a processor (or multiple processors) or a portion of a processor and its (or their) accompanying software and/or firmware. The term ‘circuitry’ would also cover, for example and if applicable to the particular element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or another network device.

[0203] In the figures, arrows between individual blocks represent operational couplings there-between as well as the direction of data flows on those couplings.

[0204] It should be understood that the foregoing description is only illustrative. Various alternatives and modifications may be devised by those skilled in the art. For example, features recited in the various dependent claims could be combined with each other in any suitable combination(s). In addition, features from different example embodiments described above could be selectively combined into a new example embodiment. Accordingly, this description is intended to embrace all such alternatives, modifications and variances which fall within the scope of the appended claims.

[0205] The following acronyms and abbreviations that may be found in the specification and/or the drawing figures are defined as follows (the abbreviations and acronyms may be appended to each other or to other characters using e.g. a dash, hyphen, or number):

3GPP third generation partnership project

4G fourth generation

5G fifth generation

5GC 5G core network

AGI artificial general intelligence

AMF access and mobility management function

API application programming interface

ASIC application-specific integrated circuit

ATE average treatment effect

ATT average treatment effect on the treated

CATE conditional average treatment effect

CAROT causal reasoning for operational twin

CD causal discovery

CI causal inference

CI/CD continuous integration / continuous deployment (or delivery)

CNF cloud native network function

Col column

CPU central processing unit

CSV comma separated values

CU central unit or centralized unit

DAG directed acyclic graph

DNS domain name system/service

DSP digital signal processor

eNB evolved Node B (e.g., an LTE base station)

EN-DC E-UTRAN new radio - dual connectivity

en-gNB node providing NR user plane and control plane protocol terminations towards the UE, and acting as a secondary node in EN-DC

E-UTRA evolved universal terrestrial radio access, i.e., the LTE radio access technology

E-UTRAN E-UTRA network

env. environmental

F1 interface between the CU and the DU

FPGA field-programmable gate array

gNB base station for 5G/NR, i.e., a node providing NR user plane and control plane protocol terminations towards the UE, and connected via the NG interface to the 5GC

Helm or helm package manager for Kubernetes

HTTP hypertext transfer protocol

id identifier

IDE integrated development environment

I/F interface

IMIX internet mixture

incl. including

I/O input/output

ITE individual treatment effect

K8 or K8s Kubernetes

KPI key performance indicator

LMF location management function

LTE long term evolution (4G)

MAC medium access control

ML machine learning

MME mobility management entity

NCE network control element

NDE natural direct effect

NIE natural indirect effect

ng or NG new generation

ng-eNB new generation eNB

NG-RAN new generation radio access network

NR new radio (5G)

NSA non-standalone, typically in the context of 5G core

N/W network

Obs. observational

PDA personal digital assistant

PDCP packet data convergence protocol

PHY physical layer

RAM random access memory

RAN radio access network

RCA root cause analysis

RCT randomized controlled trial

REST representational state transfer

RLC radio link control

ROM read-only memory

RRC radio resource control (protocol)

RU radio unit

Rx receiver or reception

SA standalone, typically in the context of 5G core

SGW serving gateway

SLA service level agreement

SLO service level objective

SMF session management function

SON self-organizing/optimizing network

SOTA state of the art

TRP transmission reception point

Tx transmitter or transmission

UE user equipment (e.g., a wireless, typically mobile device)

UPF user plane function

URL uniform resource locator

vCPU virtual central processing unit

WS web service(s)

X2 network interface between RAN nodes and between RAN and the core network

Xn network interface between NG-RAN nodes