Title:
GAZE TRACKING FOR AUGMENTED REALITY DEVICES
Document Type and Number:
WIPO Patent Application WO/2024/085905
Kind Code:
A1
Abstract:
A method including generating a batch of calibration images including a ground-truth image, and an image captured by a backward facing camera of a head mounted AR device, predicting a first gaze direction based on the ground-truth image, training a neural network based on the predicted first gaze direction and the ground-truth image, predicting a second gaze direction based on the image and an indication displayed on a display of the head mounted AR device when the image was captured, and training the neural network based on the predicted second gaze direction and a position on the display of the head mounted AR device associated with the indication.

Inventors:
FANELLO SEAN RYAN FRANCESCO (US)
TOSIC IVANA (US)
SPENCER JASON TODD (US)
PANDEY ROHIT KUMAR (US)
ABOUSSOUAN ERIC (US)
WU YITIAN (US)
JABERI MARYAM (US)
GHABUSSI AMIRPASHA (US)
KOWDLE ADARSH PRAKASH MURTHY (US)
Application Number:
PCT/US2022/078436
Publication Date:
April 25, 2024
Filing Date:
October 20, 2022
Assignee:
GOOGLE LLC (US)
International Classes:
G06V10/82; G06V40/18; G06V40/19
Foreign References:
US20210049410A12021-02-18
US20220083134A12022-03-17
EP3557377B12022-10-19
US11024002B22021-06-01
Attorney, Agent or Firm:
SMITH, Edward P. et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A method comprising: generating a batch of calibration images including: a ground-truth image, and an image captured by a backward facing camera of a head mounted AR device; predicting a first gaze direction based on the ground-truth image; training a neural network based on the predicted first gaze direction and the ground-truth image; predicting a second gaze direction based on the image and an indication displayed on a display of the head mounted AR device when the image was captured; and training the neural network based on the predicted second gaze direction and a position on the display of the head mounted AR device associated with the indication.

2. The method of claim 1, wherein the ground-truth image includes a cartesian coordinate corresponding to a location of a ground-truth indicator on the head mounted AR device display.

3. The method of claim 1 or claim 2, wherein the predicting of the first gaze direction is based on a location of a ground-truth indicator on the head mounted AR device display.

4. The method of any of claim 1 to claim 3, wherein the training of the neural network is based on an error associated with the first gaze direction and a location of a ground-truth indicator on the head mounted AR device display.

5. The method of any of claim 1 to claim 4, wherein the predicting of the first gaze direction is g = WN(I) where g is the predicted first gaze direction, I is the ground-truth image, N(I) is an output of the neural network, and W is a weighting matrix.

6. The method of claim 5, wherein N(I) = F, where the matrix W is a 2 x k matrix that maps a feature vector to final display coordinates, and F is the feature vector.

7. The method of any of claim 1 to claim 6, wherein the predicting of the first gaze direction includes solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

8. The method of any of claim 1 to claim 7, further comprising storing parameters associated with the training of the neural network for use when the neural network is used as a trained neural network.

9. The method of any of claim 1 to claim 8, wherein the predicting of the second gaze direction is based on a location of an indication displayed on the head mounted AR device display.

10. The method of any of claim 1 to claim 9, wherein the training of the neural network is based on an error associated with the second gaze direction and a location of an indication displayed on the head mounted AR device display.

11. The method of any of claim 1 to claim 10, wherein the predicting of the second gaze direction is g = WN(I) where g is the predicted second gaze direction, I is the image captured by a backward facing camera of the head mounted AR device, N(I) is an output of the neural network, and W is a weighting matrix.

12. The method of claim 11, wherein N(I) = F, where the matrix W is a 2 rows x k columns matrix that maps a feature vector to final display coordinates, and F is the feature vector.

13. The method of any of claim 1 to claim 12, wherein the predicting of the second gaze direction includes solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

14. The method of any of claim 1 to claim 13, wherein at least one of: the ground-truth image is one of a plurality of ground-truth images, and the image captured by the backward facing camera of the head mounted AR device is one of a plurality of images captured by the backward facing camera of the head mounted AR device, wherein each of the plurality of images is captured under different conditions.

15. A head mounted augmented reality (AR) device comprising: at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the head mounted AR device to: generate a batch of calibration images including: a ground-truth image, and an image captured by a backward facing camera of the head mounted AR device; predict a first gaze direction based on the ground-truth image; train a neural network based on the predicted first gaze direction and the ground-truth image; predict a second gaze direction based on the image and an indication displayed on a display of the head mounted AR device when the image was captured; and train the neural network based on the predicted second gaze direction and a position on the display of the head mounted AR device associated with the indication.

16. The head mounted AR device of claim 15, wherein the ground-truth image includes a cartesian coordinate corresponding to a location of a ground-truth indicator on the head mounted AR device display.

17. The head mounted AR device of claim 15 or claim 16, wherein the predicting of the first gaze direction is based on a location of a ground-truth indicator on the head mounted AR device display.

18. The head mounted AR device of any of claim 15 to claim 17, wherein the training of the neural network is based on an error associated with the first gaze direction and a location of a ground-truth indicator on the head mounted AR device display.

19. The head mounted AR device of any of claim 15 to claim 18, wherein the predicting of the first gaze direction is g = WN(I) where g is the predicted first gaze direction, I is the ground-truth image, N(I) is an output of the neural network, and W is a weighting matrix.

20. The head mounted AR device of claim 19, wherein N(I) = F, where the matrix W is a 2 rows x k columns matrix that maps a feature vector to final display coordinates, and F is the feature vector.

21. The head mounted AR device of any of claim 15 to claim 20, wherein the predicting of the first gaze direction includes solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

22. The head mounted AR device of any of claim 15 to claim 21, further comprising storing parameters associated with the training of the neural network for use when the neural network is used as a trained neural network.

23. The head mounted AR device of any of claim 15 to claim 22, wherein the predicting of the second gaze direction is based on a location of an indication displayed on the head mounted AR device display.

24. The head mounted AR device of any of claim 15 to claim 23, wherein the training of the neural network is based on an error associated with the second gaze direction and a location of an indication displayed on the head mounted AR device display.

25. The head mounted AR device of any of claim 15 to claim 24, wherein the predicting of the second gaze direction is g = WN(I) where g is the predicted second gaze direction, I is the image captured by a backward facing camera of the head mounted AR device, N(I) is an output of the neural network, and W is a weighting matrix.

26. The head mounted AR device of claim 25, wherein N(I) = F, where the matrix W is a 2 x k matrix that maps a feature vector to final display coordinates, and F is the feature vector.

27. The head mounted AR device of any of claim 15 to claim 26, wherein the predicting of the second gaze direction includes solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

28. The head mounted AR device of any of claim 15 to claim 27, wherein at least one of: the ground-truth image is one of a plurality of ground-truth images, and the image captured by the backward facing camera of the head mounted AR device is one of a plurality of images captured by the backward facing camera of the head mounted AR device, wherein each of the plurality of images is captured under different conditions.

29. A non-transitory computer-readable storage medium comprising instructions stored thereon that, when executed by at least one processor, are configured to cause a computing system to: generate a batch of calibration images including: a ground-truth image, and an image captured by a backward facing camera of a head mounted AR device; predict a first gaze direction based on the ground-truth image; train a neural network based on the predicted first gaze direction and the ground-truth image; predict a second gaze direction based on the image and an indication displayed on a display of the head mounted AR device when the image was captured; and train the neural network based on the predicted second gaze direction and a position on the display of the head mounted AR device associated with the indication.

30. The non-transitory computer-readable storage medium of claim 29, wherein the ground-truth image includes a cartesian coordinate corresponding to a location of a ground-truth indicator on the head mounted AR device display.

31. The non-transitory computer-readable storage medium of claim 29 or claim 30, wherein the predicting of the first gaze direction is based on a location of a ground-truth indicator on the head mounted AR device display.

32. The non-transitory computer-readable storage medium of any of claim 29 to claim 31, wherein the training of the neural network is based on an error associated with the first gaze direction and a location of a ground-truth indicator on the head mounted AR device display.

33. The non-transitory computer-readable storage medium of any of claim 29 to claim 32, wherein the predicting of the first gaze direction is g = WN(I) where g is the predicted first gaze direction, I is the ground-truth image, N(I) is an output of the neural network, and W is a weighting matrix.

34. The non-transitory computer-readable storage medium of claim 33, wherein N(I) = F, where the matrix W is a 2 rows x k columns matrix that maps a feature vector to final display coordinates, and F is the feature vector.

35. The non-transitory computer-readable storage medium of any of claim 29 to claim 34, wherein the predicting of the first gaze direction includes solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

36. The non-transitory computer-readable storage medium of any of claim 29 to claim 35, further comprising storing parameters associated with the training of the neural network for use when the neural network is used as a trained neural network.

37. The non-transitory computer-readable storage medium of any of claim 29 to claim 36, wherein the predicting of the second gaze direction is based on a location of an indication displayed on the head mounted AR device display.

38. The non-transitory computer-readable storage medium of any of claim 29 to claim 37, wherein the training of the neural network is based on an error associated with the second gaze direction and a location of an indication displayed on the head mounted AR device display.

39. The non-transitory computer-readable storage medium of any of claim 29 to claim 38, wherein the predicting of the second gaze direction is g = WN(I) where g is the predicted second gaze direction, I is the image captured by a backward facing camera of the head mounted AR device, N(I) is an output of the neural network, and W is a weighting matrix.

40. The non-transitory computer-readable storage medium of claim 39, wherein N(I) = F, where the matrix W is a 2 x k matrix that maps a feature vector to final display coordinates, and F is the feature vector.

41. The non-transitory computer-readable storage medium of any of claim 29 to claim 40, wherein the predicting of the second gaze direction includes solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

42. The non-transitory computer-readable storage medium of any of claim 29 to claim 41, wherein at least one of: the ground-truth image is one of a plurality of ground-truth images, and the image captured by the backward facing camera of the head mounted AR device is one of a plurality of images captured by the backward facing camera of the head mounted AR device, wherein each of the plurality of images is captured under different conditions.

Description:
GAZE TRACKING FOR AUGMENTED REALITY DEVICES

FIELD

[0001] Embodiments relate to gaze tracking for head mounted augmented reality (AR) devices.

BACKGROUND

[0002] Head mounted wearable devices may include, for example, smart glasses, headsets, goggles, ear buds, and the like. Gaze tracking systems for head mounted AR devices model the geometry of the eye and infer, from images, quantities such as pupil location, eye position and the optical axis. Finally, a calibration from the optical to visual axis is performed by showing two-dimensional (2D) coordinates on a display. These systems typically require a specific hardware configuration where at least one camera is coupled with multiple light emitting diodes (LEDs). The detection of a glint on the cornea of the eye and the camera-to-LED calibration allows determination of an accurate position of an eye in three-dimensional (3D) space. The position of the eye in 3D space is used to determine the gaze.

SUMMARY

[0003] Example implementations can generate a more robust and invariant gaze tracking system associated with a head mounted AR device using a neural network. Training the neural network is performed in a self-supervised manner using both ground-truth images and images captured by a backward facing camera.

[0004] In a general aspect, a device, a system, a non-transitory computer-readable medium (having stored thereon computer executable program code which can be executed on a computer system), and/or a method can perform a process with a method including generating a batch of calibration images including a ground-truth image, and an image captured by a backward facing camera of a head mounted AR device, predicting a first gaze direction based on the ground-truth image, training a neural network based on the predicted first gaze direction and the ground-truth image, predicting a second gaze direction based on the image and an indication displayed on a display of the head mounted AR device when the image was captured, and training the neural network based on the predicted second gaze direction and a position on the display of the head mounted AR device associated with the indication.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] Example embodiments will become more fully understood from the detailed description given herein below and the accompanying drawings, wherein like elements are represented by like reference numerals, which are given by way of illustration only and thus are not limiting of the example embodiments and wherein:

[0006] FIG. 1 illustrates a block diagram of a signal flow for calibrating a gaze tracking system according to an example implementation.

[0007] FIG. 2 illustrates a block diagram of a signal flow for training a neural network when calibrating the gaze tracking system according to an example implementation.

[0008] FIG. 3 illustrates a block diagram of a method for calibrating the gaze tracking system according to an example implementation.

[0009] FIG. 4A illustrates an example head mounted augmented reality (AR) device worn by a user according to an example implementation.

[0010] FIG. 4B is a front view, and FIG. 4C is a rear view of the example head mounted AR device shown in FIG. 4A according to an example implementation.

[0011] FIG. 5 illustrates a method for calibrating a gaze tracking system according to an example implementation.

[0012] FIG. 6 illustrates a block diagram of a system according to an example implementation.

[0013] It should be noted that these Figures are intended to illustrate the general characteristics of methods and/or structures utilized in certain example embodiments and to supplement the written description provided below. These drawings are not, however, to scale and may not precisely reflect the structural or performance characteristics of any given embodiment and should not be interpreted as defining or limiting the range of values or properties encompassed by example embodiments. For example, the positioning of modules and/or structural elements may be reduced or exaggerated for clarity. The use of similar or identical reference numbers in the various drawings is intended to indicate the presence of a similar or identical element or feature.

DETAILED DESCRIPTION

[0014] Head mounted AR devices can have strict requirements on power consumption, industrial design (e.g., thin frames, low weight), manufacturability, and/or the like. These constraints make the use of multiple LEDs out of reach. Therefore, hardware configurations where at least one camera is coupled with multiple LEDs do not meet the strict requirements of head mounted AR devices. As mentioned above, existing gaze tracking systems are designed for, and limited to, head mounted AR devices including at least one camera that is coupled with multiple LEDs. Therefore, existing gaze tracking systems do not function properly (e.g., lack robustness to device movement on the user's face) on the innovative head mounted AR devices with strict requirements on power consumption, industrial design, manufacturability, and/or the like.

[0015] Accordingly, new gaze tracking systems are needed for the innovative head mounted AR devices that include one backward facing camera (e.g., a camera used to track a gaze direction) and one LED per lens. Example implementations described herein illustrate gaze tracking systems and technologies for the innovative head mounted AR devices. For example, the gaze tracking system can use a machine learned (ML) algorithm to predict gaze direction. Training the ML algorithm can include a first training process based on ground-truth images and a second training process based on user gaze direction. This gaze tracking system can improve the user experience while wearing a head mounted AR device because the gaze tracking system can accurately predict the direction of a user’s gaze.

[0016] In some examples, systems and methods as described herein provide for calibrating the gaze tracking system of the head mounted AR device. The calibration can be based on calibration images including a ground-truth image and/or a captured image. A first gaze direction can be determined based on a ground-truth image and a second gaze direction can be determined based on a captured image. A neural network can be trained based on the predicted first gaze direction and/or the predicted second gaze direction. The training of the neural network can be self-supervised. Self-supervised training of a neural network can be a technique where the user of the head mounted AR device is guided through the training process.

[0017] FIG. 1 illustrates a block diagram of a signal flow for calibrating a gaze tracking system according to an example implementation. As shown in FIG. 1 the signal flow includes a calibration display module 105 block, a calibration image module 110 block, a user-calibration module 115 block, a neural network training module 120 block, and a calibration storage module 125 block.

[0018] The calibration display module 105 can be configured to cause the displaying of user information, and to initiate and/or direct calibration of the head mounted AR device. For example, the head mounted AR device can include a display projected onto a lens (or lenses) of the head mounted AR device. The display can be configured to display user information (e.g., as an image, sequence of images, a video, a frame of a video, and/or the like) associated with calibration of the head mounted AR device. The calibration display module 105 can be configured to generate the information and cause the display of the information. The information can be textual. For example, the information can be informative with regard to the status (e.g., start calibration, calibration in process, calibration error, and/or the like) of the calibration. For example, the information can be instructions (e.g., say "begin" to start calibration, look directly at the indication on the display, and/or the like) for the calibration.

[0019] The calibration image module 110 can be configured to store and/or select images for use as ground-truth images used in a training process based on the ground-truth images. The ground-truth images can include a ground-truth indicator associated with a location on the display of the head mounted AR device. For example, a ground-truth image can include a ground-truth indicator in the location of a pixel(s) (e.g., the upper (or lower) left (or right) corner) of the display. The location can have a cartesian coordinate. Accordingly, the ground-truth image can include metadata including the cartesian coordinate corresponding to the location of the ground-truth indicator on the display. The ground-truth indicator can include any object. For example, the ground-truth indicator can be a dot, a line, a letter, an animal, furniture, a plant, and/or the like. In order to provide contrast, the background of the ground-truth image can be, for example, a consistent color and/or a simple scene (e.g., a landscape, a sky, grass, and/or the like). The calibration image module 110 can be configured to provide the image(s) to the neural network training module 120. The calibration image module 110 can be configured to receive a request for the image(s) (e.g., a next or additional image(s)) from the neural network training module 120.
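As a purely illustrative aid (not part of the disclosed implementation), the following Python sketch shows one way a calibration sample and its display-coordinate metadata might be represented; the class and field names are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Tuple

import numpy as np


@dataclass
class CalibrationImage:
    """One calibration sample: an eye image plus the display location it corresponds to.

    For a ground-truth image the target is the stored location of the ground-truth
    indicator; for a camera capture it is the position of the indication that was
    rendered on the display when the image was taken.
    """
    pixels: np.ndarray              # eye image, e.g. H x W (grayscale) or H x W x C
    target_xy: Tuple[float, float]  # cartesian display coordinate of the indicator
    is_ground_truth: bool           # True for stored ground-truth images
    condition: str = "default"      # e.g. slippage position, illumination, makeup
```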

[0020] The user-calibration module 115 can be configured to cause the display of an indication at a location on the display of the head mounted AR device. The user-calibration module 115 can be configured to communicate the location of the indication to the neural network training module 120. The user-calibration module 115 can be configured to receive an instruction to display an indication from the neural network training module 120. The indication can include any object. For example, the indication can be a dot, a line, a letter, an animal, furniture, a plant, and/or the like. The user-calibration module 115 can be configured to communicate instructions to the calibration display module 105 in order to cause the calibration display module 105 to display, for example, the indication, information, and/or instructions. For example, the user-calibration module 115 can be configured to communicate instructions to the calibration display module 105 to cause the calibration display module 105 to display an instruction to, for example, look at the indication, change the environment (e.g., turn lights on or off), or position the head mounted AR device (e.g., move the head mounted AR device from a typical position on the user's nose toward the end of the nose), and/or the like. The user-calibration module 115 can be configured to cause the capture of an image using the backward facing camera and to communicate the image to the neural network training module 120.

[0021] The neural network training module 120 can be configured to train a neural network based on a ground-truth image and/or in response to the user-calibration module. For example, the neural network training module 120 can predict a gaze direction (e.g., a direction that a user is looking in relation to the user's eyes) based on a ground-truth image and then calculate an error based on the difference between the predicted gaze direction and the direction associated with metadata corresponding to the location of a ground-truth indicator in the ground-truth image. If the error is above a threshold value, a parameter(s) associated with the neural network can be changed and the prediction, comparison and error test can be repeated until the error is within an acceptable error range or threshold or the error has increased.

[0022] The neural network training module 120 can be configured to train the neural network in response to the user-calibration module. For example, the neural network training module 120 can receive an instruction to begin the calibration process (e.g., the second calibration process mentioned above). The neural network training module 120 can receive an image captured (or detected) by the backward facing camera and information (e.g., Cartesian coordinates) corresponding to a location that an object was displayed on the display of the AR device. The neural network training module 120 can predict a gaze direction based on the image captured by the backward facing camera then calculate an error based on the difference between the predicted gaze direction and the direction associated with information (e.g., Cartesian coordinates) corresponding to the location that an object was displayed on the display of the AR device. If the error is above a threshold value, a parameter(s) associated with the neural network can be changed and the prediction, comparison and error test can be repeated until the error is within an acceptable error range or threshold or the error has increased. The neural network training module 120 can be configured to communicate an instruction to cause the user-calibration module 115 to display an object and capture an image for the neural network calibration.

[0023] The neural network training module 120 can be configured to train the neural network based on a batch (e.g., a plurality) of calibration images. The batch of calibration images can include an image(s) captured (or detected) by the backward facing camera and information (e.g., Cartesian coordinates) corresponding to a location that an object was displayed on the display of the AR device. The batch of calibration images can further include a ground-truth image(s) including the location of a ground-truth indicator in the ground-truth image. The neural network training module 120 can be configured to train the neural network by iterating through the batch of calibration images. The image(s) captured (or detected) by the backward facing camera can be one of a plurality of images captured by the backward facing camera and each of the plurality of images can be captured under different conditions (or calibration conditions). The conditions can include, for example, slippage positions (e.g., positions on the nose), illumination conditions (e.g., environment), and facial conditions (e.g., makeup, suntan, and the like).
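A hedged sketch of how such a batch might be assembled is shown below; it reuses the hypothetical CalibrationImage type from the earlier sketch, and the pool names, batch size, and mixing ratio are illustrative assumptions rather than details taken from the disclosure.

```python
import random
from typing import List, Sequence


def generate_batch(ground_truth_pool: Sequence[CalibrationImage],
                   captured_pool: Sequence[CalibrationImage],
                   batch_size: int = 32,
                   captured_fraction: float = 0.5) -> List[CalibrationImage]:
    """Assemble one calibration batch mixing stored ground-truth images with
    backward facing camera captures taken under different conditions."""
    n_captured = min(int(batch_size * captured_fraction), len(captured_pool))
    n_ground = min(batch_size - n_captured, len(ground_truth_pool))
    batch = (random.sample(list(ground_truth_pool), n_ground)
             + random.sample(list(captured_pool), n_captured))
    random.shuffle(batch)  # interleave the two sources within the batch
    return batch
```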

[0024] The calibration storage module 125 can be configured to store the parameters as trained by the neural network training module 120. The calibration storage module 125 can be configured to store the parameters that were trained based on the ground-truth image(s). The calibration storage module 125 can be configured to store the parameters that were trained in response to the user-calibration module. These parameters can be stored together and/or separately. For example, the parameters that were trained based on the ground-truth image(s) can be stored separately in order to perform an updated calibration using the calibration process that includes training in response to the user-calibration module.

[0025] The training of the neural network can include using a self-supervised technique. The self-supervised neural network training technique can mimic a run-time use scenario during the training of the neural network. Instead of training a single neural network that maps images to gaze directions, a user-calibration (or personalization) is added to the training loop. The user-calibration can allow mixing calibration conditions that differ from the actual run-time, making the network more robust or invariant when the images differ from the calibration conditions. Therefore, example implementations include the training of the neural network that includes a first training process based on ground-truth images and a second training process based on a user interaction.

[0026] For example, a neural network used for gaze tracking (e.g., to predict a gaze direction over time) can be based on the following function:

g = WN(I) ,  (1)

where g is the predicted gaze direction, I is the input image, N(I) is the output of the neural network, and W is a weighting matrix. In addition:

N(I) = F ,  (2)

where the matrix W is a 2 rows x k columns matrix that maps the feature vector to the final display coordinates, and F is the feature vector.

[0027] The loss (e.g., the mean square error (MSE) loss) between the predicted gaze direction g = (x, y) and the ground-truth g_gt = (x_gt, y_gt) (where the subscript gt indicates a ground-truth image) can be minimized during the training. The loss can be back propagated through the neural network N(I), but not the weights W. In order to optimize the weights W, each example of the neural network can be assumed to have access not only to a pair of image I together with a ground-truth gaze g_gt, but also multiple calibration pairs C = <I_calib1, g_calib1>, ..., <I_calibN, g_calibN>. These pairs can be used to solve the following problem:

argmin_W ||G_calib - WN(I_calib)|| .  (3)
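The following NumPy sketch illustrates Eqns. (1) to (3) under the assumption that the feature vectors N(I_calib) have already been computed; the function names and the stacking convention (features as columns) are assumptions for illustration, not part of the disclosure.

```python
import numpy as np


def solve_w_closed_form(f_calib: np.ndarray, g_calib: np.ndarray) -> np.ndarray:
    """Closed form solution of Eqn. (3): argmin_W ||G_calib - W N(I_calib)||.

    f_calib: k x N matrix whose columns are the feature vectors N(I_calib_i).
    g_calib: 2 x N matrix of the corresponding display coordinates.
    Returns W, a 2 x k matrix mapping features to display coordinates.
    """
    # Least squares via the pseudo-inverse: W = G_calib * pinv(F_calib).
    return g_calib @ np.linalg.pinv(f_calib)


def predict_gaze(w: np.ndarray, feature: np.ndarray) -> np.ndarray:
    """Eqn. (1): g = W N(I), where `feature` is N(I) for a single image."""
    return w @ feature
```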

[0028] The above functions can be used in the abovementioned two-part training technique as described with regard to FIG. 2. FIG. 2 illustrates a block diagram of a signal flow for training a neural network when calibrating the gaze tracking system according to an example implementation. As shown in FIG. 2, the neural network training module 120 includes a ground-truth image training module 205 block and a user directed training module 210 block.

[0029] Referring to Eqns. (1) and (2), parameters associated with the function F, where N(I) is the output of the neural network or F = N(I), can be assigned random numbers. The parameters are associated with the neural network and can be weights. Therefore, at the initiation of the training, the weights can be initiated with random values. The training based on a calibration image can be performed in batches where each batch includes a plurality of calibration images. Each of the plurality of ground-truth images (e.g., of a batch) can be iterated through (e.g., each ground-truth image is used in a training iteration).

[0030] In each training iteration, Eqn. (3) can be solved. For example, Eqn. (3) can be solved using a closed form solution. A closed form solution is an expression for an exact solution given a finite amount of data. W can be the closed form solution of Eqn. (3). Then, the gaze direction for the calibration image can be predicted (calculated, generated). As mentioned above, the predicted gaze direction g can be predicted as a Cartesian (x, y) value. Therefore, for calibration image I, g = (x, y) = WF. An error can be calculated (e.g., MSE) based on the predicted gaze direction g and the location (e.g., metadata) on the display of the head mounted AR device for the calibration image I. The error can be backpropagated through F with respect to g_gt = (x_gt, y_gt). In an example implementation, the error does not backpropagate through the weights W.
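One possible PyTorch rendering of such an iteration is sketched below: W is obtained in closed form from the detached batch features, and the MSE loss is backpropagated through the network N only. The network architecture, optimizer, and tensor shapes are assumptions; this is a sketch of the described training step, not the patented implementation. The returned loss corresponds to the error that is compared against the threshold described in the next paragraph.

```python
import torch


def train_iteration(net: torch.nn.Module,
                    optimizer: torch.optim.Optimizer,
                    images: torch.Tensor,             # batch of calibration images
                    targets: torch.Tensor) -> float:  # B x 2 display coordinates
    """One calibration iteration: closed-form W, MSE error, backprop through N only."""
    features = net(images)                         # B x k feature vectors, F = N(I)
    # Closed form W for this batch (Eqn. (3)); detaching keeps gradients out of W.
    f = features.detach()                          # B x k
    w_t = torch.linalg.lstsq(f, targets).solution  # k x 2, i.e. W transposed
    pred = features @ w_t                          # g = W N(I), B x 2
    loss = torch.nn.functional.mse_loss(pred, targets)
    optimizer.zero_grad()
    loss.backward()                                # gradients reach N(I) only, not W
    optimizer.step()
    return loss.item()
```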

[0031] If the error is above a threshold value, the parameters (e.g., weights) associated with the neural network can be modified and the training process moves on to a next iteration. The training process continues iterating until the error is within an acceptable error range (e.g., threshold) or the error has increased.

[0032] The ground-truth image training module 205 can be configured to train the neural network based on a ground-truth image. The ground-truth images can include an indicator associated with a location on the display of the head mounted AR device. For example, a ground-truth image can include a ground-truth indicator in the location of a pixel(s) (e.g., the upper (or lower) left (or right) corner) of the display. The location can have a cartesian coordinate. Accordingly, the ground-truth image can include metadata including the cartesian coordinate corresponding to the location of the ground-truth indicator on the display. The ground-truth indicator can include any object. For example, the ground-truth indicator can be a dot, a line, a letter, an animal, furniture, a plant, and/or the like.

[0033] As mentioned above, the neural network can be calibrated using a batch of calibration images. The ground-truth image training module 205 can be configured to train the neural network based on a plurality of ground-truth images. Therefore, the batch of calibration images can be a plurality of ground-truth images. The plurality of ground-truth images can include information (e.g., metadata) including position data corresponding to the position of the object on the ground-truth image. The plurality of ground-truth images can be received from the calibration image module 110.

[0034] The user directed training module 210 can be configured to train the neural network based on images received from the user-calibration module 115. Therefore, the batch of calibration images can be at least one image captured by the backward facing camera received from the user-calibration module 115. The at least one image captured by the backward facing camera can include information (e.g., metadata) including position data corresponding to the position of the object when rendered on the display of the head mounted AR device.

[0035] In an example implementation, the at least one image captured by the backward facing camera can be received one at a time as the images are captured. In this implementation, the user directed training module 210 can be configured to interrupt the ground-truth image training module 205 such that when a training iteration is completed by the ground-truth image training module 205, the next training iteration can be performed by the user directed training module 210 then processing can return to the ground-truth image training module 205.

[0036] In an example implementation, the user-calibration module 115 can communicate the at least one image captured by the backward facing camera one at a time as the images are captured, and the user directed training module 210 can be configured to store each of the at least one image captured by the backward facing camera as the at least one image is received. The at least one image captured by the backward facing camera can be stored in memory to be used in a batch of calibration images. In response to a calibration iteration completing, the at least one image captured by the backward facing camera that is stored in memory can be used as the batch of calibration images in the next iteration.
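As an illustrative sketch only, images arriving one at a time could be buffered and then drained into the next user-directed iteration, as below; the buffer class, its size limit, and the run_iteration callback are assumptions introduced here, and the hypothetical CalibrationImage type from the earlier sketch is reused.

```python
from collections import deque
from typing import Callable, Iterable, List


class UserCaptureBuffer:
    """Stores backward facing camera captures as they arrive, one at a time,
    so they can serve as the calibration batch for the next iteration."""

    def __init__(self, max_size: int = 256):
        self._buffer: deque = deque(maxlen=max_size)

    def add(self, sample: CalibrationImage) -> None:
        self._buffer.append(sample)  # called once per captured image

    def drain(self) -> List[CalibrationImage]:
        batch = list(self._buffer)
        self._buffer.clear()
        return batch


def interleaved_calibration(ground_truth_batches: Iterable[List[CalibrationImage]],
                            capture_buffer: UserCaptureBuffer,
                            run_iteration: Callable[[List[CalibrationImage]], None]) -> None:
    """After each ground-truth iteration, run a user-directed iteration on any
    captures buffered in the meantime, then return to ground-truth training."""
    for gt_batch in ground_truth_batches:
        run_iteration(gt_batch)            # ground-truth image training
        user_batch = capture_buffer.drain()
        if user_batch:
            run_iteration(user_batch)      # user directed training
```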

[0037] By using both the ground-truth images and the at least one image captured by the backward facing camera, the neural network can be trained using different conditions of calibration mimicking a user's behavior at run-time. For example, when capturing the at least one image captured by the backward facing camera, a calibration sequence can be used such that the user can cause the capturing of images under different conditions. The conditions can include, for example, slippage positions (e.g., positions on the nose), illumination conditions (e.g., environment), and facial conditions (e.g., makeup, suntan, and the like). In addition, the at least one image captured by the backward facing camera can be stored and used in other calibration executions. Further, the at least one image captured by the backward facing camera can be recaptured after a very long period of time (e.g., months, and/or years) to update the images.

[0038] Using both the ground-truth images and the at least one image captured by the backward facing camera during training of the neural network can guide the neural network towards learning a feature vector F = N(I) that should be invariant to the different conditions. For example, a calibration completed with the user wearing little or no makeup should not affect a calibration performed with makeup on.

[0039] FIG. 3 illustrates a block diagram of a method for calibrating the gaze tracking system according to an example implementation. As shown in FIG. 3, in step S305 a ground-truth image is received. The ground-truth image can be used as a calibration image. For example, a ground-truth image can be an image including information associated with the position of an object on a display of a head mounted AR device. The ground-truth image can be stored in a memory. In an example implementation, a calibration can be performed in an iterative manner using a batch of calibration images. In this implementation a plurality of ground-truth images can be received as a batch (or plurality) of calibration images.

[0040] In step S310 a first gaze direction is predicted based on the ground-truth image. For example, Eqn. (3) can be solved. Eqn. (3) can be solved using a closed form solution. A closed form solution is an expression for an exact solution given a finite amount of data. W can be the closed form solution of Eqn. (3). Then, the gaze direction for the calibration image can be predicted (calculated, generated). As mentioned above, the predicted gaze direction g can be predicted as a Cartesian (x, y) value. Therefore, for calibration image I, gaze g = (x, y) = WF.

[0041] In step S315 a neural network is trained based on the predicted first gaze direction and the ground-truth image. For example, the neural network can be trained iteratively. In each iteration, an error can be calculated (e.g., MSE) based on the predicted gaze direction g and the location (e.g., metadata) on the display of the head mounted AR device for the calibration image I. The error can be backpropagated through F with respect to g_gt = (x_gt, y_gt). In an example implementation, the error does not backpropagate through the weights W. If the error is above a threshold value, the parameters (e.g., weights) associated with the neural network can be modified and the training process moves on to a next iteration. The training process continues iterating until the error is within an acceptable error range (e.g., threshold) or the error has increased. The next iteration can include processing returning to step S305. Alternatively, processing can continue at step S320.

[0042] In step S320 an indication is displayed on a head mounted AR device display. For example, the indication can be an object caused to be rendered on the display of the head mounted AR device. For example, the indication can be a dot, a line, a letter, an animal, furniture, a plant, and/or the like. In an example implementation, the user or wearer of the head mounted AR device can be instructed (e.g., visually and/or audibly) to look at the object. In step S325 an image is captured by a backward facing camera of the head mounted AR device. The image can be used as a calibration image.

[0043] In step S330 a second gaze direction is predicted based on the image as a calibration image. For example, Eqn. (3) can be solved. Eqn. (3) can be solved using a closed form solution. Then, the gaze direction for the calibration image can be predicted (calculated, generated). As mentioned above, the predicted gaze direction g can be predicted as a Cartesian (x, y) value. Therefore, for calibration image I, gaze g = (x, y) = WF.

[0044] In step S335 the neural network is trained based on the predicted second gaze direction and a position on the head mounted AR device display associated with the indication. For example, the neural network can be trained iteratively. In each iteration, an error can be calculated (e.g., MSE) based on the predicted gaze direction g and the location of the object when rendered on the display of the head mounted AR device. The error can be backpropagated through F with respect to g_gt = (x_gt, y_gt) (in this step, gt refers to the image captured by the head mounted AR device). In an example implementation, the error does not backpropagate through the weights W. If the error is above a threshold value, the parameters (e.g., weights) associated with the neural network can be modified and the training process moves on to a next iteration. The training process continues iterating until the error is within an acceptable error range (e.g., threshold) or the error has increased. The next iteration can include processing returning to step S305. Alternatively, processing can return to step S320. Sometimes the calibration based on a ground-truth image is referred to as calibration. Sometimes the calibration based on a captured image is referred to as validation. However, example implementations use a ground-truth image or a captured image interchangeably. This can also allow for mixing calibration conditions that differ from the actual calibration run-time and/or stored ground-truth images, resulting in the neural network being more robust and/or invariant when the images captured during operational run-time differ from the calibration conditions.
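Tying the pieces together, a hedged end-to-end sketch of the FIG. 3 flow (ground-truth training followed by user-directed training, stopping when the error is acceptably small or has increased) could look as follows; it reuses the hypothetical CalibrationImage and train_iteration sketches above, and the threshold and iteration limit are illustrative values.

```python
import torch


def to_tensors(batch):
    """Stack CalibrationImage samples (see the earlier sketch) into tensors."""
    images = torch.stack([torch.as_tensor(s.pixels, dtype=torch.float32) for s in batch])
    targets = torch.tensor([s.target_xy for s in batch], dtype=torch.float32)
    return images, targets


def calibrate(net, optimizer, ground_truth_batch, user_batch,
              error_threshold: float = 1e-3, max_iterations: int = 100) -> None:
    """Ground-truth training (S305-S315) followed by user-directed training (S320-S335)."""
    for batch in (ground_truth_batch, user_batch):
        images, targets = to_tensors(batch)
        prev_loss = float("inf")
        for _ in range(max_iterations):
            loss = train_iteration(net, optimizer, images, targets)
            if loss < error_threshold or loss > prev_loss:
                break  # within the acceptable error range, or the error has increased
            prev_loss = loss
```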

[0045] FIG. 4A illustrates a user wearing an example head mounted AR device 400 in the form of smart glasses, or augmented reality glasses, including display capability, eye/gaze tracking capability, and computing/processing capability. FIG. 4B is a front view, and FIG. 4C is a rear view, of the example head mounted AR device 400 shown in FIG. 4A. The example head mounted AR device 400 includes a frame 410. The frame 410 includes a front frame portion 420, and a pair of arm portions 430 rotatably coupled to the front frame portion 420 by respective hinge portions 440. The front frame portion 420 includes rim portions 423 surrounding respective optical portions in the form of lenses 427, with a bridge portion 429 connecting the rim portions 423. The arm portions 430 are coupled, for example, pivotably or rotatably coupled, to the front frame portion 420 at peripheral portions of the respective rim portions 423. In some examples, the lenses 427 are corrective/prescription lenses. In some examples, the lenses 427 are an optical material including glass and/or plastic portions that do not necessarily incorporate corrective/prescription parameters.

[0046] In some examples, the head mounted AR device 400 includes a display device 404 that can output visual content, for example, at an output coupler 405, so that the visual content is visible to the user. In the example shown in FIGS. 4B and 4C, the display device 404 is provided in one of the two arm portions 430, simply for purposes of discussion and illustration. Display devices 404 may be provided in each of the two arm portions 430 to provide for binocular output of content. In some examples, the display device 404 may be a see-through near eye display. In some examples, the display device 404 may be configured to project light from a display source onto a portion of teleprompter glass functioning as a beamsplitter seated at an angle (e.g., 30-45 degrees). The beamsplitter may allow for reflection and transmission values that allow the light from the display source to be partially reflected while the remaining light is transmitted through. Such an optic design may allow a user to see both physical items in the world, for example, through the lenses 427, next to content (for example, digital images, user interface elements, virtual content, and the like) output by the display device 404. In some implementations, waveguide optics may be used to depict content on the display device 404.

[0047] In some examples, the head mounted AR device 400 includes one or more of an audio output device 406 (such as, for example, one or more speakers), an illumination device 408, a sensing system 411, a control system 412, at least one processor 414, and an outward facing image sensor 416 (for example, a camera). In some examples, the sensing system 411 may include various sensing devices and the control system 412 may include various control system devices including, for example, one or more processors 414 operably coupled to the components of the control system 412. In some examples, the control system 412 may include a communication module providing for communication and exchange of information between the head mounted AR device 400 and other external devices.

[0048] In some examples, the head mounted AR device 400 includes a gaze tracking device 415 to detect and track eye gaze direction and movement. The gaze tracking device 415 can include a backward facing camera and a LED. The backward facing camera can be used to capture images of the eye (e.g., the pupil of the eye). The images can be used to determine a gaze direction. Backward facing can be a direction from (an inside, face side, arm side) the front frame portion 420 toward the user's face and eye(s). Backward facing can be a direction from (an inside, face side, arm side) the front frame portion 420 toward one of the two arm portions 430. Backward facing can be in a direction d such that an image(s) of the eye can be captured. A lens of the backward facing camera can be pointed in a backward direction. Alternatively, the lens of the backward facing camera can be pointed in another direction where the captured image can be a reflection of the eye.

[0049] As mentioned above, head mounted AR devices (e.g., the head mounted AR device 400) can have strict requirements on power consumption, industrial design, manufacturability, and/or the like. Therefore, the gaze tracking device 415 can be restricted to minimal camera and LED quantities. Data captured by the gaze tracking device 415 may be processed to detect and track gaze direction and movement as a user input. In the example shown in FIGS. 4B and 4C, the gaze tracking device 415 is provided in one of the two arm portions 430, simply for purposes of discussion and illustration. In the example arrangement shown in FIGS. 4B and 4C, the gaze tracking device 415 is provided in the same arm portion 430 as the display device 404, so that user eye gaze can be tracked not only with respect to objects in the physical environment, but also with respect to the content output for display by the display device 404. In some examples, gaze tracking devices 415 may be provided in each of the two arm portions 430 to provide for gaze tracking of each of the two eyes of the user. In some examples, display devices 404 may be provided in each of the two arm portions 430 to provide for binocular display of visual content.

[0050] Numerous different sizing and fitting measurements and/or parameters may be considered when selecting and/or sizing and/or fitting a wearable device, such as the example head mounted AR device 400 shown in FIGS. 4A-4C, for a particular user. This may include, for example, wearable fit parameters, or wearable fit measurements. Wearable fit parameters/measurements may consider how a particular frame 410 fits and/or looks and/or feels on a particular user. Wearable fit parameters/measurements may take into consideration numerous factors such as, for example, whether the rim portions 423 and bridge portion 429 are shaped and/or sized so that the bridge portion 429 rests comfortably on the bridge of the user's nose, whether the frame 410 is wide enough to be comfortable with respect to the temples, but not so wide that the frame 410 cannot remain relatively stationary when worn by the user, whether the arm portions 430 are sized to comfortably rest on the user's ears, and other such comfort related considerations. Wearable fit parameters/measurements may consider other as-worn considerations including how the frame 410 may be positioned based on the user's natural head pose/where the user tends to naturally wear his/her glasses. In some examples, aesthetic fit measurements or parameters may consider whether the frame 410 is aesthetically pleasing to the user/compatible with the user's facial features, and the like.

[0051] In a head mounted wearable device including display capability, display fit parameters, or display fit measurements may be considered in selecting and/or sizing and/or fitting the head mounted AR device 400 for a particular user. Display fit parameters/measurements may be used to configure the display device 404 for a selected frame 410 for a particular user, so that content output by the display device 404 is visible to the user. For example, display fit parameters/measurements may facilitate calibration of the display device 404, so that visual content is output within at least a set portion of the field of view of the user. For example, the display fit parameters/measurements may be used to configure the display device 404 to provide at least a set level of gazability, corresponding to an amount, or portion, or percentage of the visual content that is visible to the user at a periphery (for example, a least visible corner) of the field of view of the user.

[0052] Implementations can include one or more, and/or combinations thereof, of the following examples.

[0053] Example 1 : FIG. 5 illustrates a method for calibrating a gaze tracking system according to an example implementation. As shown in FIG. 5 the method includes in step S505 generating a batch of calibration images including a ground-truth image, and an image captured by a backward facing camera of a head mounted AR device. In step S510 predicting a first gaze direction based on the ground-truth image. In step S515 training a neural network based on the predicted first gaze direction and the ground-truth image. In step S520 predicting a second gaze direction based on the image and an indication displayed on a display of the head mounted AR device when the image was captured. In step S525 training the neural network based on the predicted second gaze direction and a position on the display of the head mounted AR device associated with the indication.

[0054] Example 2: The method of Example 1, wherein the ground-truth image can include a cartesian coordinate corresponding to a location of a ground-truth indicator on the head mounted AR device display.

[0055] Example 3: The method of Example 1, wherein the predicting of the first gaze direction can be based on a location of a ground-truth indicator on the head mounted AR device display.

[0056] Example 4: The method of Example 1, wherein the training of the neural network can be based on an error associated with the first gaze direction and a location of a ground-truth indicator on the head mounted AR device display.

[0057] Example 5: The method of Example 1, wherein the predicting of the first gaze direction can be g = WN(I) where g is the predicted first gaze direction, I is the ground-truth image, N(I) is an output of the neural network, and W is a weighting matrix.

[0058] Example 6: The method of Example 1, wherein N(I) = F, where the matrix W is a 2 x k matrix that maps a feature vector to final display coordinates, and F is the feature vector.

[0059] Example 7: The method of Example 1, wherein the predicting of the first gaze direction can include solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

[0060] Example 8: The method of Example 1 can further include storing parameters associated with the training of the neural network for use when the neural network is used as a trained neural network.

[0061] Example 9: The method of Example 1, wherein the predicting of the second gaze direction can be based on a location of an indication displayed on the head mounted AR device display.

[0062] Example 10: The method of Example 1, wherein the training of the neural network can be based on an error associated with the second gaze direction and a location of an indication displayed on the head mounted AR device display.

[0063] Example 11: The method of Example 1, wherein the predicting of the second gaze direction can be g = WN(I) where g is the predicted second gaze direction, I is the image captured by a backward facing camera of the head mounted AR device, N(I) is an output of the neural network, and W is a weighting matrix.

[0064] Example 12: The method of Example 11, wherein N(I) = F, where the matrix W is a 2 x k matrix that maps a feature vector to final display coordinates, and F is the feature vector.

[0065] Example 13: The method of Example 1, wherein the predicting of the second gaze direction can include solving argmin_W ||G_calib - WN(I_calib)|| using a closed form solution.

[0066] Example 14: The method of Example 1, wherein at least one of the ground-truth image can be one of a plurality of ground-truth images, and the image captured by the backward facing camera of the head mounted AR device can be one of a plurality of images captured by the backward facing camera of the head mounted AR device, wherein each of the plurality of images is captured under different conditions.

[0067] Example 15. A method can include any combination of one or more of Example 1 to Example 14.

[0068] Example 16. A non-transitory computer-readable storage medium comprising instructions stored thereon that, when executed by at least one processor, are configured to cause a computing system to perform the method of any of Examples 1-15.

[0069] Example 17. An apparatus comprising means for performing the method of any of Examples 1-14.

[0070] Example 18. An apparatus comprising at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform the method of any of Examples 1-14.

[0071] FIG. 6 illustrates a block diagram of a system according to an example implementation. In the example of FIG. 6, the system (e.g., the head mounted AR device 400, an augmented reality system, a virtual reality system, a companion device, and/or the like) can include a computing system or at least one computing device and should be understood to represent virtually any computing device configured to perform the techniques described herein. As such, the device may be understood to include various components which may be utilized to implement the techniques described herein, or different or future versions thereof. By way of example, the system can include a processor 605 and a memory 610 (e.g., a non-transitory computer readable memory). The processor 605 and the memory 610 can be coupled (e.g., communicatively coupled) by a bus 615.

[0072] The processor 605 may be utilized to execute instructions stored on the at least one memory 610. Therefore, the processor 605 can implement the various features and functions described herein, or additional or alternative features and functions. The processor 605 and the at least one memory 610 may be utilized for various other purposes. For example, the at least one memory 610 may represent an example of various types of memory and related hardware and software which may be used to implement any one of the modules described herein.

[0073] The at least one memory 610 may be configured to store data and/or information associated with the device. The at least one memory 610 may be a shared resource. Therefore, the at least one memory 610 may be configured to store data and/or information associated with other elements (e.g., image/video processing or wired/wireless communication) within the larger system. Together, the processor 605 and the at least one memory 610 may be utilized to implement the techniques described herein. As such, the techniques described herein can be implemented as code segments (e.g., software) stored on the memory 610 and executed by the processor 605. Accordingly, the memory 610 can include the calibration display module 105, the calibration image module 110, the user-calibration module 115, the neural network training module 120, the calibration storage module 125, the ground-truth image training module 205, and the user directed training module 210.

[0074] The example implementation shown in FIG. 6 is only one example hardware configuration. In other implementations operations can be shared between the head mounted AR device and other communicatively coupled computing devices. For example, at least one block can be performed on a companion computing device (e.g., a mobile phone) and/or a web-based device (e.g., a server). For example, the ground-truth image training module 205 and/or the user directed training module 210 could be implemented on a companion computing device (e.g., a mobile phone) and/or a web-based device (e.g., a server). Therefore, the head mounted AR device can be configured to communicate and/or receive data to/from the companion computing device (e.g., a mobile phone) and/or the web-based device (e.g., a server). For example, the head mounted AR device can be configured to communicate images to the ground-truth image training module 205 and/or the user directed training module 210 and receive calibration (e.g., neural network training) information from the companion computing device (e.g., a mobile phone) and/or the web-based device (e.g., a server).

[0075] Example implementations can include a non-transitory computer-readable storage medium comprising instructions stored thereon that, when executed by at least one processor, are configured to cause a computing system to perform any of the methods described above. Example implementations can include an apparatus including means for performing any of the methods described above. Example implementations can include an apparatus including at least one processor and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform any of the methods described above.

[0076] Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

[0077] These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term “machine-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor.

[0078] To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., an LED (light-emitting diode), OLED (organic LED), or LCD (liquid crystal display) monitor/screen) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.

[0079] The systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.

[0080] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

[0081] A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the specification.

[0082] In addition, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other embodiments are within the scope of the following claims.

[0083] Further to the descriptions above, a user may be provided with controls allowing the user to make an election as to both if and when systems, programs, or features described herein may enable collection of user information (e.g., information about a user’s social network, social actions or activities, profession, a user’s preferences, or a user’s current location), and if the user is sent content or communications from a server. In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed. For example, a user’s identity may be treated so that no personally identifiable information can be determined for the user, or a user’s geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined. Thus, the user may have control over what information is collected about the user, how that information is used, and what information is provided to the user.

[0084] While certain features of the described implementations have been illustrated as described herein, many modifications, substitutions, changes and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the scope of the implementations. It should be understood that they have been presented by way of example only, not limitation, and various changes in form and details may be made. Any portion of the apparatus and/or methods described herein may be combined in any combination, except mutually exclusive combinations. The implementations described herein can include various combinations and/or subcombinations of the functions, components and/or features of the different implementations described.

[0085] While example embodiments may include various modifications and alternative forms, embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit example embodiments to the particular forms disclosed, but on the contrary, example embodiments are to cover all modifications, equivalents, and alternatives falling within the scope of the claims. Like numbers refer to like elements throughout the description of the figures.

[0086] Some of the above example embodiments are described as processes or methods depicted as flowcharts. Although the flowcharts describe the operations as sequential processes, many of the operations may be performed in parallel, concurrently or simultaneously. In addition, the order of operations may be re-arranged. The processes may be terminated when their operations are completed but may also have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, subprograms, etc.

[0087] Methods discussed above, some of which are illustrated by the flow charts, may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine or computer readable medium such as a storage medium. A processor(s) may perform the necessary tasks.

[0088] Specific structural and functional details disclosed herein are merely representative for purposes of describing example embodiments. Example embodiments may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.

[0089] It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. As used herein, the term and/or includes any and all combinations of one or more of the associated listed items.

[0090] It will be understood that when an element is referred to as being connected or coupled to another element, it can be directly connected or coupled to the other element or intervening elements may be present. In contrast, when an element is referred to as being directly connected or directly coupled to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., between versus directly between, adjacent versus directly adjacent, etc.).

[0091] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms a, an and the are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms comprises, comprising, includes and/or including, when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.

[0092] It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may in fact be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.

[0093] Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

[0094] Portions of the above example embodiments and corresponding detailed description are presented in terms of software, or algorithms and symbolic representations of operation on data bits within a computer memory. These descriptions and representations are the ones by which those of ordinary skill in the art effectively convey the substance of their work to others of ordinary skill in the art. An algorithm, as the term is used here, and as it is used generally, is conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of optical, electrical, or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

[0095] In the above illustrative embodiments, reference to acts and symbolic representations of operations (e.g., in the form of flowcharts) that may be implemented as program modules or functional processes include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types and may be described and/or implemented using existing hardware at existing structural elements. Such existing hardware may include one or more Central Processing Units (CPUs), digital signal processors (DSPs), application-specific integrated circuits, field programmable gate arrays (FPGAs), computers, or the like.

[0096] It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, or as is apparent from the discussion, terms such as processing or computing or calculating or determining or displaying or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical, electronic quantities within the computer system’s registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

[0097] Note also that the software implemented aspects of the example embodiments are typically encoded on some form of non-transitory program storage medium or implemented over some type of transmission medium. The program storage medium may be magnetic (e.g., a floppy disk or a hard drive) or optical (e.g., a compact disk read only memory, or CD ROM), and may be read only or random access. Similarly, the transmission medium may be twisted wire pairs, coaxial cable, optical fiber, or some other suitable transmission medium known to the art. The example embodiments are not limited by these aspects of any given implementation.

[0098] Lastly, it should also be noted that whilst the accompanying claims set out particular combinations of features described herein, the scope of the present disclosure is not limited to the particular combinations hereafter claimed, but instead extends to encompass any combination of features or embodiments herein disclosed irrespective of whether or not that particular combination has been specifically enumerated in the accompanying claims at this time.