Title:
RELIABILITY ANALYSIS FRAMEWORK FOR NODE-LOCAL INTERMEDIARY STORAGE ARCHITECTURES
Document Type and Number:
WIPO Patent Application WO/2024/086054
Kind Code:
A1
Abstract:
A method, computing device, and a non-transitory computer-readable medium are provided. The computing device determines an initial node-local burst buffer content at a start of a time period T. A current node-local burst buffer content is received by the computing device during the time period. For each checkpoint/restart time interval, the computing device: estimates stochastic transition rates λ and μ, indicating when the node-local burst buffer is receiving and draining, respectively; estimates input flow data rates of data entering the node-local burst buffer from a compute node; and estimates drain data rates of the data leaving the node-local burst buffer to a parallel file system. The computing device models an average statistical reliability function of the node-local burst buffer within the time period T with respect to not exceeding a predetermined threshold value. When the average statistical reliability function has a value that is less than a predetermined threshold, the computing device performs an action.

Inventors:
CLARK ANTWAN (US)
FLEMING NICOLE (US)
BERRIOS GIOVANNI (US)
SHAO YU (US)
BAI JIAWEN (US)
Application Number:
PCT/US2023/035013
Publication Date:
April 25, 2024
Filing Date:
October 12, 2023
Assignee:
UNIV JOHNS HOPKINS (US)
International Classes:
G06F3/06; G06F16/18
Attorney, Agent or Firm:
IRVING, Richard C. (US)
Claims:
CLAIMS

We claim as our invention:

1. A method for performing real-time reliability analysis of node-local burst buffer architectures, the method comprising: determining, by a computing device, an initial node-local burst buffer content at a start of a time period T; receiving, by the computing device, a current node-local burst buffer content during the time period; for each checkpoint/restart time interval, performing by the computing device: estimating stochastic transition rates λ = λ12 and μ = λ21, indicating when the node-local burst buffer is receiving and draining, respectively, estimating input flow data rates of data entering the node-local burst buffer from a compute node, and estimating drain data rates of the data leaving the node-local burst buffer to be stored to a parallel file system; modeling, by the computing device, an average statistical reliability function of the node-local burst buffer within the time period T with respect to not exceeding a predetermined threshold value; and performing an action, by the computing device, when the average statistical reliability function has a value that is less than a predefined value.

2. The method of claim 1, wherein the estimating the stochastic transition rates comprises: estimating the λ by dividing a number of transitions to a node-local burst buffer receiving state by a cumulative amount of time that the node-local burst buffer is in the node-local burst buffer receiving state; and estimating the μ by dividing a number of transitions to a node-local buffer draining state by a cumulative amount of time that the node-local burst buffer is in the node-local burst buffer draining state.

3. The method of claim 2, wherein when the λ and the μ exceed a predefined threshold, the method further comprises: performing expectation maximization to estimate final values of the λ and the μ.

4. The method of claim 1, wherein:

W1(x, t) is equal to a probability that an amount of data in the node-local burst buffer is less than or equal to a predefined threshold given that the node-local burst buffer is in a node-local burst buffer draining state,

W2(x, t) is equal to a probability that an amount of data in the node-local burst buffer is less than or equal to the predefined threshold given that the node-local burst buffer is in a node-local burst buffer receiving state, and calculating a value of the statistical reliability function based on the sum of W1(x, t) and W2(x, t).

5. The method of claim 4, wherein: when the node-local burst buffer is initially empty at a start of each checkpoint/restart time interval,

I_n is a modified Bessel function of order n = 0, 1, 2.

6. The method of claim 4, wherein: an initial content of the node-local burst buffer is greater than the predefined threshold and when

and I_n are modified Bessel functions of an order n = 0, 1.

7. The method of claim 4, wherein: an initial content of the node-local burst buffer is less than or equal to the predefined threshold and when

8. The method of claim 4, further comprising: calculating λ12 from Δtλ12 and λ21 from Δtλ21, wherein

9. A computing device for performing real-time reliability analysis of node-local burst buffer architectures, the computing device comprising: at least one processor; and a memory connected with the at least one processor; and a node-local burst buffer, wherein: the at least one processor is configured to perform operations comprising: determining an initial node-local burst buffer content at a start of a time period T; receiving a current node-local burst buffer content during the time period; for each checkpoint/restart time interval, performing: estimating stochastic transition rates λ = λ12 and μ = λ21, indicating when the node-local burst buffer is receiving and draining, respectively, estimating input flow data rates of data entering the node-local burst buffer from a compute node, and estimating drain data rates of the data leaving the node-local burst buffer to a parallel file system; modeling an average statistical reliability function of the node-local burst buffer within the time period T with respect to not exceeding a predetermined threshold value; and performing an action when the average statistical reliability function has a value that is less than a predefined value.

10. The computing device of claim 9, wherein the estimating the stochastic transition rates comprises: estimating the λ = λ12 by dividing the number of transitions to a node-local buffer receiving state by a cumulative amount of time that the node-local burst buffer is in the node-local burst buffer receiving state; and estimating the μ = λ21 by dividing the number of transitions to the node-local buffer draining state by a cumulative amount of time that the node-local burst buffer is in the node-local burst buffer draining state.

11. The computing device of claim 10, wherein when the λ and the μ exceed a predefined threshold, the method further comprises: performing expectation maximization to estimate final values of λ and μ.

12. The computing device of claim 9, wherein:

W1(x, t) is equal to the probability that an amount of data in the node-local burst buffer is less than or equal to a predefined threshold given that the node-local burst buffer is in a node-local burst buffer draining state;

W2(x, t) is equal to the probability that an amount of data in the node-local burst buffer is less than or equal to the predefined threshold given that the node-local burst buffer is in a node-local burst buffer receiving state; and calculating a value of the statistical reliability function based on a sum of W1(x, t) and W2(x, t).

13. The computing device of claim 12, wherein: the node-local burst buffer is initially empty at a start of each checkpoint/restart time interval,

I_n is a modified Bessel function of order n = 0, 1, 2.

14. The computing device of claim 12, wherein: an initial content of the node-local burst buffer is greater than the predefined threshold and when and I_n are modified Bessel functions of an order n = 0, 1.

15. The computing device of claim 12, wherein: an initial content of the node-local burst buffer is less than or equal to the predefined threshold and when and where

16. The computing device of claim 12, wherein the operations further comprise: calculating λ12 from Δtλ12 and λ21 from Δtλ21, wherein

17. A non-transitory computer-readable medium having instructions recorded thereon for a processor of a computing device to perform operations comprising: determining an initial node-local burst buffer content at a start of a time period T; receiving a current node-local burst buffer content during the time period; for each checkpoint/restart time interval, performing: estimating stochastic transition rates λ and μ, indicating when the node-local burst buffer is receiving and draining, respectively, estimating input flow data rates of data entering the node-local burst buffer from a compute node, and estimating drain data rates of the data leaving the node-local burst buffer to be stored to a parallel file system; modeling an average statistical reliability function of the node-local burst buffer within the time period T with respect to not exceeding a predetermined threshold value; and performing an action when the average statistical reliability function has a value that is less than a predefined threshold value.

18. The non-transitory computer-readable medium of claim 17, wherein the estimating the stochastic transition rates comprises: estimating the λ by dividing a number of transitions to a node-local buffer receiving state by a cumulative amount of time that the node-local burst buffer is in the node-local burst buffer receiving state; and estimating the μ by dividing a number of transitions to the node-local buffer draining state by a cumulative amount of time that the node-local burst buffer is in the node-local burst buffer draining state.

19. The non-transitory computer-readable medium of claim 18, wherein when the λ and the μ exceed a predefined threshold, the method further comprises: performing expectation maximization to estimate final values of λ and μ.

20. The non-transitory computer-readable medium of claim 17, wherein:

W1(x, t) is equal to the probability that an amount of data in the node-local burst buffer is less than or equal to a predefined threshold given that the node-local burst buffer is in a node-local burst buffer draining state;

W2(x, t) is equal to the probability that an amount of data in the node-local burst buffer is less than or equal to the predefined threshold given that the node-local burst buffer is in a node-local burst buffer receiving state; and calculating a value of the statistical reliability function based on a sum of W1(x, t) and W2(x, t).

21. The non-transitory computer-readable medium of claim 20, wherein: the node-local burst buffer is initially empty at a start of each checkpoint/restart time interval,

I_n is a modified Bessel function of order n = 0, 1, 2.

22. The non-transitory computer-readable medium of claim 20, wherein: an initial content of the node-local burst buffer is greater than the predefined threshold and when and I_n are modified Bessel functions of an order n = 0, 1.

23. The non-transitory computer-readable medium of claim 20, wherein the operations further comprise: calculating λ12 from Δtλ12 and λ21 from Δtλ21, wherein

Description:
RELIABILITY ANALYSIS FRAMEWORK FOR NODE-LOCAL INTERMEDIARY STORAGE ARCHITECTURES

[0001] This application claims the benefit of U.S. Provisional Patent Application No. 63/380,484, filed October 21, 2022, and U.S. Provisional Patent Application No. 63/382,580, filed November 7, 2022.

[0002] This invention was made with United States government support under contract S900294BAH awarded by the Army Research Laboratory. The United States government has certain rights in the invention.

BACKGROUND OF THE INVENTION

[0003] High performance computing (HPC) systems have transformed the way that information is processed and stored because they can handle vast amounts of data. However, they also come with the challenge of handling input/output (I/O) bottlenecks, for the following reasons. First, big data applications running in these environments perform many read and write operations to handle workloads and thus consume much I/O bandwidth. Additionally, application-based checkpointing and restarting (C/R) is burdensome on I/O infrastructure because checkpointing operations issue a large number of write requests to a parallel file system (PFS), which also degrades storage server bandwidth. Job heterogeneity is also an issue because job requests of various sizes and priorities compete with each other for I/O bandwidth and other resources. This results in prolonged average I/O time because processing of smaller jobs would be delayed due to concurrent processing of larger jobs. As a result, an application C/R process is also affected because lower-priority jobs could frequently interrupt the checkpointing of higher-priority jobs. Scientists have addressed these concerns by proposing burst buffers (BBs) as brokers, developing infrastructures and algorithms to minimize effects of I/O contention in supercomputing infrastructures. One approach is to create node-local BB architectures in which each burst buffer is collocated with a corresponding compute node. This is advantageous for its scalability while also improving checkpoint bandwidth because aggregate bandwidth increases proportionally to the number of compute nodes. Since researchers at the San Diego Supercomputer Center (SDSC) illustrated this proof of concept via a DASH supercomputing cluster, several current HPCs have adopted these types of storage.

SUMMARY OF THE INVENTION

[0004] In a first embodiment, a method is provided for performing real-time reliability analysis of node-local burst buffer architectures. A computing device determines an initial node-local burst buffer content at a start of a time period T. The computing device receives a current node-local burst buffer content during the time period. For each checkpoint/restart time interval, the computing device: estimates stochastic transition rates λ and μ, indicating when the node-local burst buffer is receiving and draining, respectively; estimates input flow data rates of data entering the node-local burst buffer from a compute node; and estimates drain data rates of the data leaving the node-local burst buffer to a parallel file system. The computing device models an average statistical reliability function of the node-local burst buffer within the time period T with respect to not exceeding a predetermined threshold value. The computing device performs an action when the average statistical reliability function has a value that is less than a predefined value.

[0005] In a second embodiment, a computing device is provided for performing real-time reliability analysis of node-local burst buffer architectures. The computing device includes at least one processor, a memory connected with the at least one processor, and a node-local burst buffer. The at least one processor is configured to perform operations. According to the operations: an initial node-local burst buffer content is determined at a start of a time period T; a current node-local burst buffer content is received during the time period; for each checkpoint/restart time interval, stochastic transition rates λ and μ are estimated, indicating when the node-local burst buffer is receiving and draining, respectively, input flow data rates of data entering the node-local burst buffer from a compute node are estimated, and drain data rates of the data leaving the node-local burst buffer to be stored to a parallel file system are estimated. An average statistical reliability function of the node-local burst buffer within the time period T is modeled with respect to not exceeding a predetermined threshold value. An action is performed when the average statistical reliability function has a value that is less than a predefined value.

[0006] In a third embodiment, at least one non-transitory computer-readable storage medium has computer instructions stored thereon for a processor of a computing device to perform operations. According to the operations, an initial node-local burst buffer content at a start of a time period T is determined. A current node-local burst buffer content is received during the time period. For each checkpoint/restart time interval: stochastic transition rates λ and μ, indicating when the node-local burst buffer is receiving and draining, respectively, are estimated; input flow data rates of data entering the node-local burst buffer from a compute node are estimated; and drain data rates of the data leaving the node-local burst buffer to be stored to a parallel file system are estimated. An average statistical reliability function of the node-local burst buffer within the time period T is modeled with respect to not exceeding a predetermined threshold value. When the statistical reliability function has a value that is less than the predefined threshold value, an action is performed.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] Figure 1 is a graph that shows an example change in node-local burst buffer content over time.

[0008] Figure 2 is a flowchart of an example process for accessing data in a node-local burst buffer during a time interval, determining a change in node-local burst buffer content since a previous time interval, performing clustering, determining an input data flow rate of the node-local burst buffer, determining an output drain rate of the node-local burst buffer, and determining transition rates regarding the node-local burst buffer according to embodiments.

[0009] Figures 3 and 4 are flowcharts of an example process for counting a number of times data in the node-local burst buffer changes between entering and leaving the node-local burst buffer, and an amount of time the node-local burst buffer spends in each state according to embodiments.

[0010] Figure 5 is a graph showing power and asymptotic expansions of a modified Bessel function of a first kind I_0.

[0011] Figure 6 shows an example compute node according to embodiments.

DETAILED DESCRIPTION OF THE INVENTION

[0012] Analyzing reliability of node-local burst buffers is an open problem. Prior results have not focused on direct performance of node-local burst buffers. In addition, node-local burst buffers are prone to failure and current approaches do not address this. The present application addresses these problems.

[0013] Equations 1 and 2 are stochastic processes on which a node-local framework, according to embodiments, may be based, where φ1 and φ2 (to be estimated) are defined as follows: φ1 is an output data flow rate of data leaving a node-local BB to be stored in a parallel file system (PFS) and φ2 is an input data rate of data entering the node-local BB from a compute node, and where λ and μ are stochastic transition rates that indicate when the node-local BB is receiving data and when the node-local BB is draining data.

[0014] During acquisition, node-local BB activity is considered in which information enters and leaves a solid state device (SSD) during a certain checkpoint/restart interval I = [0, T], where T is an end time of the interval.

[0015] Next, preprocessing and feature extraction are performed considering: where Q represents data. The preprocessing may include 1) estimating a rate at which information enters and leaves a node local burst buffer and 2) estimating a length of time within each sub-interval in [0, T] that the node local burst buffer is receiving and draining data. Output of preprocessing includes < 1 and features are extracted related to and “t” is time.

[0016] Figure 1 shows an example change in node-local BB content over time. From a time of 0.0 to about a time of 0.625, no data is received into or drained from the node-local BB. At about the time of 0.625, four data items are received into the node-local buffer from a compute node. From about a time of 8.5 to about a time of 10.625, one data item is drained from the node-local BB to a parallel file system (PFS). From about a time of 10.625 to about a time of 13.8, an additional four data items are received into the node-local BB from the compute node. From about the time of 13.8 to about a time of 14.95, an additional four items are received by the node-local BB from the compute node. From about the time of 14.95 to about a time of 15.5, one data item is drained from the node-local BB to the PFS. From about the time of 15.5 to about a time of 16.0, an additional four data items are received by the node-local BB from the compute node. From about the time of 16.0 to about a time of 18.8, another four data items are received into the node-local BB from the compute node. From about the time of 18.8 to about the time of 20.0, one data item is drained from the node-local BB to the PFS.

[0017] During testing, a simulator generated data for and drained data from a simulated node-local BB. Figure 2 is a flowchart of a process for obtaining data from the simulated node-local BB. The process may begin by setting ELAPSEDTIME to zero (act 202). Next, OBSERVEDTIME is set to an amount of time in which the simulation runs (act 204) and data, Q, in the simulated node-local BB is accessed (act 206).

[0018] Next, dQ/dt, which is a change in an amount of data, Q, over a current time step, may be determined (act 208). dQ/dt then may be stored in a DQDT array for use during clustering (act 210) and ELAPSEDTIME may be updated by an amount of a time step (act 212).

[0019] Next, a determination is made regarding whether ELAPSEDTIME is less than OBSERVEDTIME (act 214). If ELAPSEDTIME is determined to be less than OBSERVEDTIME, then acts 206-214 again may be performed during a next time interval. Otherwise, if ELAPSEDTIME is determined not to be less than OBSERVEDTIME, then the data stored in the array, at act 210, may be used for clustering according to a K-Means++ method, which differs from a K-Means method by randomly selecting one of the data items from the DQDT array as a first centroid of a first cluster.

[0020] After performing the K-Means++ method, cluster centroids are determined and each of the DQDT array items is assigned to one of a number of clusters. φ1 and φ2 may be estimated based on the determined cluster centroids (act 218).
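The acquisition and clustering acts described above can be illustrated with a minimal sketch. It is not the implementation of the embodiments: the function names `finite_differences` and `kmeans_pp_1d`, the sample content values in `q`, the unit time step, and the choice of two clusters are all assumptions made for illustration. The sketch computes the change in content per time step, clusters those values with k-means++ seeding, and reads a drain rate and an input rate off the two centroids.

```python
import bisect
import itertools
import random


def finite_differences(q, dt):
    """Change in buffer content per unit time for each time step."""
    return [(q[i + 1] - q[i]) / dt for i in range(len(q) - 1)]


def kmeans_pp_1d(values, k=2, iters=100, seed=0):
    """Cluster 1-D values using k-means++ seeding; returns sorted centroids."""
    rng = random.Random(seed)
    centroids = [rng.choice(values)]  # first centroid: a random data item
    while len(centroids) < k:
        # Squared distance from each value to its nearest chosen centroid.
        d2 = [min((v - c) ** 2 for c in centroids) for v in values]
        cum = list(itertools.accumulate(d2))
        if cum[-1] == 0.0:
            break  # fewer than k distinct values are available
        # Pick the next centroid with probability proportional to d2.
        r = rng.random() * cum[-1]
        idx = min(bisect.bisect_right(cum, r), len(values) - 1)
        centroids.append(values[idx])
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: (v - centroids[i]) ** 2)
            clusters[nearest].append(v)
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


# Hypothetical content samples: receive 4 items per step, drain 1 per step.
q = [0, 4, 8, 12, 11, 10, 14, 18, 17]
dqdt = finite_differences(q, dt=1.0)
centroids = kmeans_pp_1d(dqdt, k=2)
drain_rate = abs(centroids[0])   # magnitude of the negative centroid
input_rate = centroids[-1]       # positive centroid
```

Treating the negative centroid as the drain rate and the positive centroid as the input rate is an assumption of this sketch; real traces with idle periods would likely need a third cluster or a prior filter for zero-valued steps.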

[0021] Next, a number of transitions to a node-local BB entering state, a node-local BB draining state, a cumulative time in node-local BB entering state, a cumulative time in node-local BB draining state, and a cumulative time in which an amount of content in the node-local BB remained unchanged may be determined (act 220).

[0022] Figures 3 and 4 are flowcharts of a procedure that may be performed during act 220 of Figure 2. The process may begin by setting state_12 to true, indicating that the BB is in the node-local BB receiving state in which data is received by the node-local BB from a compute node (act 302). Next, a number of variables may be initialized to zero (act 304). In some embodiments, the variables may include small, large, enter_12, leave_12, count, and index.

[0023] Next, a determination may be made regarding whether DQDT [index] is greater than zero (act 306). If DQDT [index] is determined to be greater than zero, then the node-local BB is determined to be in the node-local BB receiving state and the variable “small” may be incremented by a small time step in this embodiment, “enter_12” may be incremented to keep track of a number of intervals in which the node-local BB state is the node-local BB receiving state, and “index” may be incremented by one (act 308). State_12 then may be set to true to indicate that the node-local BB state is node-local BB receiving (act 310).

[0024] If, during act 306, DQDT [index] is determined not to be greater than zero, then a determination is made regarding whether DQDT [index] is less than zero (act 312). If DQDT [index] is determined to be less than zero, then the variable “large” may be incremented by a large time step in this embodiment, “enter_21” may be incremented by one to keep track of a number of intervals in which the node-local BB state is in the node-local BB draining state, which indicates that data has been drained from the node-local BB to the PFS, and the variable “index” may be incremented by one (act 314). The variable state_12 may be set to false to indicate that the node-local BB is in the node-local BB draining state (act 316).

[0025] If, during act 312, DQDT [index] is determined to be zero, indicating no change in contents of the node-local BB, then a variable “count”, which counts a margin of error, and the variable “index” may be incremented by one (act 318). The margin of error may be estimated by calculating statistical reliability functions and taking a norm against the theoretical statistical reliability functions.

[0026] After performing act 310, act 316, or act 318, a determination may be made regarding whether DQDT [index] is equal to zero (act 320). If DQDT [INDEX] is determined to be equal to zero, then act 318 may again be performed. Otherwise, a determination is made regarding whether index is less than a length of DQDT (i.e., whether additional items in DQDT exist for processing) (act 402; Figure 4).

[0027] If, during act 402, additional items in DQDT are determined to exist, then a determination is made regarding whether DQDT [index] < 0, indicating that the node-local BB is in the node-local BB draining state (act 404). If the DQDT [index] is determined to be less than zero, then the variable “small” for counting small time steps may be incremented by one in this embodiment (act 406).

[0028] If, during act 404, DQDT [index] is determined to not be less than zero, then a determination is made regarding whether DQDT[index] is greater than zero (act 410). If DQDT [INDEX] is determined to be greater than zero, then the variable “large” for counting large time steps may be incremented by one in this embodiment (act 412).

[0029] After performing act 406, 412, or 414, then a determination may be made regarding whether DQDT [index] is less than zero and the node-local BB state is not the node-local BB receiving state (act 416). If these conditions are true, then a variable “enter_21” may be incremented to keep track of a number of transitions to the node-local BB draining state, the variable “leave_12” may be incremented by one to keep track of a number of times a transition from the node-local BB draining state occurred, and “index” may be incremented by one so that a next item in the DQDT array may be examined (act 418). State_12 then may be set to true to indicate that the current node-local BB state is node-local BB receiving state (act 420).

[0030] If, during act 416, a determination is made that either DQDT [index] is not less than zero or state_12 is true, then a determination may be made regarding whether DQDT [index] is greater than zero and the node-local BB state is the node-local BB receiving state (state_12 is true) (act 422). If both of these conditions are determined to be true, then the variable “enter_12” may be incremented by one to keep track of a number of transitions to the node-local BB receiving state, “leave_21” may be incremented to keep track of a number of transitions from the node-local BB draining state, and “index” may be incremented by one (act 424). “State_21” then may be set to false to indicate that the node-local BB state is the node-local BB draining state (act 428).

[0031] If, during act 422, a determination is made that DQDT [index] is not greater than zero or “state_12” is false, indicating that the node-local BB state is the node-local draining state, then the variable “index” may be incremented by one.

[0032] After performing act 420, 428, or 430, act 402 again may be performed.

[0033] If, during act 402, the variable “index” is determined to be equal to the length of the DQDT array, then a time in seconds while state_12 is true, a time in seconds while in the node-local BB draining state (state_21, or state_12 = false), and a time in seconds when there is no change in contents of the node-local BB may be determined.
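The quantities produced by the procedure of Figures 3 and 4 can be illustrated with a simplified sketch. This is a single-pass approximation, not a transcription of the flowcharts: the function name `summarize_states`, the dictionary keys, and the rule that an unchanged step leaves the current state as it was are assumptions made for illustration.

```python
def summarize_states(dqdt, dt):
    """Count transitions into each state and the time spent in each state.

    dqdt > 0 is treated as the receiving state, dqdt < 0 as the draining
    state, and dqdt == 0 as a step in which the content is unchanged.
    """
    stats = {
        "to_receiving": 0,      # transitions into the receiving state
        "to_draining": 0,       # transitions into the draining state
        "time_receiving": 0.0,  # cumulative time receiving
        "time_draining": 0.0,   # cumulative time draining
        "time_unchanged": 0.0,  # cumulative time with no change in content
    }
    state = None  # no state is assumed before the first non-zero step
    for rate in dqdt:
        if rate > 0:
            if state != "receiving":
                stats["to_receiving"] += 1
            state = "receiving"
            stats["time_receiving"] += dt
        elif rate < 0:
            if state != "draining":
                stats["to_draining"] += 1
            state = "draining"
            stats["time_draining"] += dt
        else:
            # Assumption: an unchanged step does not alter the current state.
            stats["time_unchanged"] += dt
    return stats


# Hypothetical DQDT array sampled every 0.5 seconds.
stats = summarize_states(
    [0.0, 4.0, 4.0, 4.0, -1.0, -1.0, 0.0, 4.0, 4.0, -1.0], dt=0.5
)
```

For this hypothetical array, the buffer enters each state twice and spends 2.5 seconds receiving, 1.5 seconds draining, and 1.0 second unchanged.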

[0034] For a stochastic estimation module, equations 1 and 2 can be expressed in terms of G_m, a stochastic generator matrix, and φ1 and φ2 may be determined via k-means++ clustering.

[0035] The stochastic estimation (SE) module includes the following steps:

1. Use maximum likelihood estimation (MLE) as a rule of thumb to estimate λ and μ; and

2. If the estimates of λ and μ exceed a certain predefined threshold, which is chosen empirically by a user, then expectation-maximization (E-M) may be used to estimate final values of λ and μ, according to embodiments.
  a. In some embodiments, “simplified” E-M algorithms may be used (e.g., diagonal adjusted (DA) and diagonal weighted adjusted (DWA)).

[0036] According to the MLE,

$\lambda_{ij} = N_{ij}(T) / R_i(T)$ (5)

where $N_{ij}(T)$ is a number of transitions between each state representing node-local BB behavior and $R_i(T)$ is a cumulative amount of time that the node-local BB behavior is in a particular state. Thus, λ is estimated as a number of times that the node-local BB receiving state is exited divided by a cumulative amount of time (in seconds) in the node-local BB receiving state. Similarly, μ is estimated as a number of times that the node-local BB draining state is entered divided by a cumulative amount of time (in seconds) in which the state was the node-local buffer draining state.
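The ratio estimate described above can be sketched in a few lines. The function names `mle_rate` and `needs_refinement`, the sample counts and times, and the reading of "exceed a predefined threshold" as both estimates being above the threshold are assumptions made for illustration; the E-M refinement itself is not shown.

```python
def mle_rate(n_transitions, time_in_state):
    """Ratio estimate of a transition rate: count / cumulative time (1/s)."""
    if time_in_state <= 0:
        raise ValueError("cumulative time in state must be positive")
    return n_transitions / time_in_state


def needs_refinement(lam, mu, threshold):
    """Assumed reading: refine when both estimates exceed the threshold."""
    return lam > threshold and mu > threshold


# Hypothetical counts and cumulative times (in seconds) for one C/R interval.
lam = mle_rate(n_transitions=2, time_in_state=2.5)  # receiving state
mu = mle_rate(n_transitions=2, time_in_state=1.5)   # draining state
refine = needs_refinement(lam, mu, threshold=0.5)
```

The threshold of 0.5 is arbitrary here; the text states only that it is chosen empirically by a user.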

[0037] If E-M is used, then an amount of error ε is evaluated between a theoretical reliability analysis (RA) metric and an actual or estimated RA metric from data. ε is determined to ensure that the estimated parameters, including λ and μ, are properly chosen.

[0038] In some embodiments, when an average reliability value (i.e., the average statistical reliability function within the time period T) is less than a predefined value, an action may be performed. The action may include, but not be limited to, generating an alert to a user such as a system administrator. As a result of receiving the alert, the system administrator may perform a second action that may include, but not be limited to, shutting down access to the compute node where the node-local burst buffer resides, feeding data to a different compute node, and/or flushing data from the node-local burst buffer to a parallel file system (PFS).
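A minimal sketch of this thresholded action follows. The function names, the callback-based alert, and the sample reliability values are assumptions made for illustration; how the reliability values are sampled within the time period T is left open by the text.

```python
def average_reliability(reliability_values):
    """Arithmetic mean of reliability values sampled within the period T."""
    if not reliability_values:
        raise ValueError("at least one reliability value is required")
    return sum(reliability_values) / len(reliability_values)


def check_reliability(reliability_values, predefined_value, on_alert):
    """Call on_alert(average) when the average falls below predefined_value.

    Returns True if the action was triggered, False otherwise.
    """
    avg = average_reliability(reliability_values)
    if avg < predefined_value:
        on_alert(avg)
        return True
    return False


# Hypothetical usage: collect alerts in a list instead of notifying a user.
alerts = []
triggered = check_reliability([0.9, 0.7, 0.5], 0.8, alerts.append)
not_triggered = check_reliability([0.9, 0.95], 0.8, alerts.append)
```

Passing the action in as a callback keeps the check separate from the response, so the same test could drive an alert, a flush to the PFS, or a redirection of data to another compute node.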

[0039] Similarly, error checking may also be performed on the conditional statistical performance metrics W1(x, t) and W2(x, t) to ensure robustness.

[0040] Another method for estimating transition rates λ12 and λ21 may begin by first representing the model in discretized form, equation (7), which contains the estimated transition rates for m = 1, 2. Equation (7) can then be re-expressed, and employing linear least squares regression yields equation (9), from which λ12 and λ21 are computed directly.

[0041] These metrics consider the following main cases:

1. The node-local BB is initially empty at a start of each checkpoint/restart (C/R) interval.

2. The node-local BB is initially non-empty at the start of each C/R interval.
  a. For each case, only reactive BB strategies are considered.

Equations

[0042] In equations (16) to (18), R_B(x, t) is a theoretical reliability analysis, that is, a probability that the amount of data in the node-local BB is less than a threshold x, and F_B(x, t) is a failure analysis.

1. m = 1 : Considers the likelihood (probability) that the node-local BB is draining information to the PFS.

2. m = 2: Considers the likelihood (probability) that the node-local BB is receiving information from the compute node (CN).
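As a cross-check on closed-form reliability expressions, the probability that the buffer content stays within a threshold x at time t can also be estimated empirically. The sketch below is a Monte Carlo illustration under stated assumptions, not the analytical solution of the embodiments: it assumes a two-state model with exponentially distributed dwell times, that λ is the rate of leaving the receiving state and μ the rate of leaving the draining state, that the buffer starts in the receiving state, and that the failure analysis is the complement of the reliability. All function and parameter names are hypothetical.

```python
import random


def simulate_content(lam, mu, input_rate, drain_rate, t_end, u0, rng):
    """Buffer content at time t_end for one simulated sample path."""
    t, content, receiving = 0.0, u0, True  # assumed initial state: receiving
    while t < t_end:
        exit_rate = lam if receiving else mu
        remaining = t_end - t
        step = min(rng.expovariate(exit_rate), remaining)
        if receiving:
            content += input_rate * step
        else:
            content = max(0.0, content - drain_rate * step)  # cannot go negative
        t = t_end if step == remaining else t + step
        receiving = not receiving
    return content


def empirical_reliability(x, t_end, n_paths=2000, seed=0, lam=1.0, mu=1.0,
                          input_rate=4.0, drain_rate=1.0, u0=0.0):
    """Fraction of simulated paths whose content at t_end is <= x."""
    rng = random.Random(seed)
    within = sum(
        simulate_content(lam, mu, input_rate, drain_rate, t_end, u0, rng) <= x
        for _ in range(n_paths)
    )
    return within / n_paths


r_low = empirical_reliability(x=1.0, t_end=5.0)
r_high = empirical_reliability(x=5.0, t_end=5.0)
failure_high = 1.0 - r_high  # assumed complement relationship
```

Because both calls use the same seed, they evaluate the same sample paths, so the estimate for the larger threshold can never be smaller than the estimate for the smaller one.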

[0043] The equations are valid for the following cases:

• Case 1: Node-local BB is initially empty at a start of each C/R interval. This case considers both proactive and reactive cases.
• Case 2: Node-local BB is initially non-empty at the start of each C/R interval. This case considers only reactive draining schemes. Specifically, only the following cases were considered:
  o the initial content u is greater than a given threshold x at the start of the C/R interval
  o the initial content u is within the given threshold x at the start of the C/R interval

[0044] Analytical Solutions Case 1

and

I_n, in equations (13) and (14), are modified Bessel functions of order n = 0, 1, and 2.

Approximate Solutions Case 1

[0045] Approximate solutions consider the following integrals in equations (19) and (20), which include the following relationships:

Approximate Solutions Case 1 : Short Time Behavior for t < 1 second

[0046] For short time behavior, where t < 1 second, equations (26) and (27) can be expressed in terms of power series representations as follows:

and the constants a, a, b and a are defined as

and the constants a, a, b and a are defined by equations (31) and (32), respectively.

Approximate Solutions Case 1 : Long Time Behavior for t > 1 Second

[0047] For long time behavior, where t > 1 second, this results in asymptotic representations as follows:

where the constants a, a, b and a are defined by equations (31) and (32), respectively.

[0048] Figure 5 shows power series and asymptotic expansions of the modified Bessel function I_0. A critical point t_c may be estimated from

where the quantities are, respectively, a modified Bessel function of order n, a power series of the modified Bessel function of order n, and the asymptotic series of the modified Bessel function of order n, for n = 0, 1, and ε = 1 × 10^-6 is an error tolerance.

[0049] This critical point is a transition point between the power series and the asymptotic expansion. Next, the power series and asymptotic representations of equations (26) and (27) may be fused into equations (19) and (20) to consider the behavior of t. See Appendix A for a computation of the power series and asymptotic expansions.
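The estimating equation for the critical point is missing from the text above, so the sketch below uses an assumed criterion: the critical point is taken as the smallest grid value of the argument y at which the power series and the asymptotic series of I_0 agree to within a relative tolerance ε. The series themselves are the standard ones for the modified Bessel function of order zero. The function names, the grid bounds, and the stopping rule for the asymptotic series (truncate when terms stop shrinking) are choices made for this illustration.

```python
import math

def i0_power(y, terms=80):
    """Power series I_0(y) = sum_k (y^2 / 4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    q = y * y / 4.0
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)
    return total

def i0_asymptotic(y, max_terms=30):
    """Asymptotic series e^y / sqrt(2 pi y) * (1 + 1/(8y) + 9/(128 y^2) + ...)."""
    total, term = 0.0, 1.0
    for k in range(1, max_terms + 1):
        total += term
        nxt = term * (2 * k - 1) ** 2 / (k * 8.0 * y)
        if nxt >= term:  # terms have started to grow: stop before divergence
            break
        term = nxt
    return math.exp(y) / math.sqrt(2.0 * math.pi * y) * total

def find_critical_point(eps=1e-6, y_start=0.5, y_stop=30.0, step=0.05):
    """Smallest grid point where both expansions agree to relative eps."""
    steps = int(round((y_stop - y_start) / step))
    for i in range(steps + 1):
        y = y_start + i * step
        reference = i0_power(y)
        if abs(reference - i0_asymptotic(y)) <= eps * reference:
            return y
    return None  # no agreement found on the scanned grid
```

Below the returned value the power series would be used, and above it the asymptotic series.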

Analytical Solutions

[0050]

(ii) When

Approximate Solutions case second

[0051] (1) When

where

Note: and can be found by substituting v 0 for vo in equations (56) and (57), respectively.

Approximate Solutions Case

[0052]

equations (60) - (63), respectively.

Approximate Solutions Case : Comprehensive Expansion

[0053] The critical point t_c is estimated from

where the quantities are, respectively, modified Bessel functions, power series of the modified Bessel functions, asymptotic series of the modified Bessel functions, and an error tolerance.

[0054] This critical point is a transition point between power series and asymptotic expansion.

[0055] Next, the power series and the asymptotic representations of equations (26) and (27) may be fused into equations (41) - (45) to consider the behavior of t.
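One way to read the fusing step is as a piecewise evaluator that uses the power series below the critical point and the asymptotic series at or above it. The sketch below illustrates that idea for a modified Bessel function of integer order n, with the argument y standing in for whatever time-dependent argument the omitted equations use. The function names and the way the critical point is passed in are assumptions.

```python
import math

def bessel_i_power(n, y, terms=80):
    """Power series I_n(y) = sum_k (y/2)^(2k+n) / (k! (k+n)!)."""
    term = (y / 2.0) ** n / math.factorial(n)
    q = y * y / 4.0
    total = 0.0
    for k in range(terms):
        total += term
        term *= q / ((k + 1) * (k + 1 + n))
    return total

def bessel_i_asymptotic(n, y, max_terms=30):
    """Asymptotic series for large y, with mu = 4 n^2."""
    mu = 4.0 * n * n
    total, term = 0.0, 1.0
    for k in range(1, max_terms + 1):
        total += term
        nxt = -term * (mu - (2 * k - 1) ** 2) / (k * 8.0 * y)
        if abs(nxt) >= abs(term):  # stop once the terms stop shrinking
            break
        term = nxt
    return math.exp(y) / math.sqrt(2.0 * math.pi * y) * total

def bessel_i_fused(n, y, critical_point):
    """Use the power series below the critical point, the asymptotic series above."""
    if y < critical_point:
        return bessel_i_power(n, y)
    return bessel_i_asymptotic(n, y)
```

The critical point is supplied by the caller so that the same evaluator can be reused with whichever estimate of t_c is adopted.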

[0056] See Appendix A for a computation of the power series and asymptotic expansions.

Analytical Solution Case

[0057] where

[0058] Figure 6 illustrates an example compute node 600, which may be included in a high performance computing system. Components of compute node 600 may include, but are not limited to, one or more processing units 616, a system memory 628, and a bus 618 that couples various system components including system memory 628 to one or more processing units 616.

[0059] Bus 618 represents any one or more of several bus structure types, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. Such architectures may include, but not be limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.

[0060] Compute node 600 may include various non-transitory computer-readable media, which may be any available non-transitory media accessible by compute node 600. The computer-readable media may include volatile and non-volatile non-transitory media as well as removable and non-removable non-transitory media.

[0061] System memory 628 may include non-transitory volatile memory, such as random access memory (RAM) 630 and cache memory 634. System memory 628 also may include non-transitory non-volatile memory including, but not limited to, read-only memory (ROM) 632 and storage system 636. Storage system 636 may be provided for reading from and writing to a nonremovable, non-volatile magnetic medium, which may include a hard drive or a Secure Digital (SD) card. In addition, a magnetic disk drive, not shown, may be provided for reading from and writing to a removable, non-volatile magnetic disk such as, for example, a floppy disk, and an optical disk drive for reading from or writing to a removable non-volatile optical disk such as a CD-ROM, DVD-ROM or other optical media. Storage system 636 also may include a solid state device (SSD), which may function as a node-local burst buffer 638. Each memory device may be connected to bus 618 by at least one data media interface. System memory 628 further may include instructions for processing unit(s) 616 to configure compute node 600 to perform functions of embodiments. For example, system memory 628 also may include, but not be limited to, processor instructions for an operating system, at least one application program, other program modules, program data, and an implementation of a networking environment.

[0062] Compute node 600 may communicate with one or more external devices 614 including, but not limited to, one or more displays, a keyboard, a pointing device, a speaker, at least one device that enables a user to interact with compute node 600, and any devices including, but not limited to, a network card, a modem, etc. that enable compute node 600 to communicate with one or more other computing devices. The communication can occur via Input/Output (I/O) interfaces 622. Compute node 600 can communicate with one or more networks including, but not limited to, a local area network (LAN), a general wide area network (WAN), a packet-switched data network (PSDN) and/or a public network such as, for example, the Internet, via network adapter 620. As depicted, network adapter 620 communicates with the other components of compute node 600 via bus 618.

[0063] It should be understood that, although not shown, other hardware and/or software components could be used in conjunction with compute node 600. Examples include, but are not limited to, microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems.

[0064] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises”, “comprising”, “includes”, "including", "has", "have", "having", "with" and the like, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

[0065] The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

[0066] The descriptions of the various embodiments of the present invention have been presented for purposes of illustration but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or improvement over conventional technologies, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

[0067] Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer-readable storage devices having instructions stored therein for carrying out functions according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions. Each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

[0070] The various functions of the computer or other processing systems may be distributed in any manner among any number of software and/or hardware modules or units, processing or computer systems and/or circuitry, where the computer or processing systems may be disposed locally or remotely of each other and communicate via any suitable communications medium (e.g., LAN, WAN, Intranet, Internet, hardwire, modem connection, wireless, etc.). For example, the functions of the present invention embodiments may be distributed in any manner among the various end-user/client and server systems, and/or any other intermediary processing devices. The software and/or algorithms described above and illustrated in the flowcharts may be modified in any manner that accomplishes the functions described herein. In addition, the functions in the flowcharts or description may be performed in any order that accomplishes a desired operation.

[0071] In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Appendix A: Approximate Solution for Bessel Functions

Given the modified Bessel function the power series (i.e., as y -> 0) is given by

The asymptotic expansion for is given by

The asymptotic expansion for for is given by

(107)
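The equations of this appendix did not survive in the text above. For orientation only, the standard textbook forms of the two expansions of the modified Bessel function of the first kind I_n(y), for integer order n, are given below. These are well-known results and may differ in notation and numbering from the omitted equations.

```latex
% Power series (most useful as y -> 0)
I_n(y) = \sum_{k=0}^{\infty} \frac{1}{k!\,(k+n)!} \left( \frac{y}{2} \right)^{2k+n}

% Asymptotic expansion (as y -> \infty), with \mu = 4n^2
I_n(y) \sim \frac{e^{y}}{\sqrt{2 \pi y}}
  \left[ 1 - \frac{\mu - 1}{8y}
         + \frac{(\mu - 1)(\mu - 9)}{2!\,(8y)^2}
         - \frac{(\mu - 1)(\mu - 9)(\mu - 25)}{3!\,(8y)^3} + \cdots \right]

% Special case n = 0 (\mu = 0)
I_0(y) \sim \frac{e^{y}}{\sqrt{2 \pi y}}
  \left[ 1 + \frac{1}{8y} + \frac{9}{128\,y^2} + \frac{225}{3072\,y^3} + \cdots \right]
```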