


Title:
METHOD AND SYSTEM FOR GENERATING A DECISION LOGIC AND ELECTRIC POWER SYSTEM
Document Type and Number:
WIPO Patent Application WO/2023/062191
Kind Code:
A1
Abstract:
To generate a decision logic (34) for an IED (30), at least one machine learning model is trained in an iterative machine learning model training. Weighting functions are used to weight samples in the iterative machine learning model training. Weighting function(s) associated with one or several training cases are automatically modified in the iterative machine learning model training.

Inventors:
DAWIDOWSKI PAWEL (PL)
OTTEWILL JAMES (PL)
CHAKRAVORTY JHELUM (CA)
Application Number:
PCT/EP2022/078651
Publication Date:
April 20, 2023
Filing Date:
October 14, 2022
Assignee:
HITACHI ENERGY SWITZERLAND AG (CH)
International Classes:
G05B23/02; G05B19/042; G06N3/08; H02H7/26; H02J13/00
Foreign References:
US20200409323A12020-12-31
US20210264111A12021-08-26
US20170344900A12017-11-30
Attorney, Agent or Firm:
VOSSIUS & PARTNER PATENTANWÄLTE RECHTSANWÄLTE MBB (DE)
Claims:

CLAIMS

1. A method of generating a decision logic (34) operative to process a time-series input and to generate a decision logic output, in particular for generating a decision logic (34) for an electric power system or industrial automation control system, the method being performed by at least one integrated circuit (131) and comprising:
retrieving, from a memory or storage medium (132; 140), at least one training dataset comprising a plurality of training cases, each training case comprising a training input time series (55; 56) and a target output time series (61);
initializing weighting functions (62; 70; 82), each weighting function (62; 70; 82) being respectively associated with a training case of the plurality of training cases; and
performing an iterative procedure comprising several iterations that respectively comprise:
performing at least one training step for training at least one machine learning, ML, model that reduces a value of an aggregated loss function, the aggregated loss function being dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function (62; 70; 82) associated with the respective training case, each loss function being dependent on a difference between the target output time series (61) of the respective training case and an output time series (61) provided by the ML model responsive to the training input time series (55; 56) of the respective training case;
selectively modifying the weighting function(s) (62; 70) associated with one or several of the training cases between at least some successive iterations of the iterative procedure; and
using the modified weighting function(s) (62'; 62") when performing at least one subsequent training step.

2. The method of claim 1, further comprising terminating the iterative procedure in response to determining that a termination criterion is fulfilled and storing an ML model of the at least one ML model trained in the iterative procedure as decision logic (34) for execution by at least one decision-making device, in particular for execution by an Intelligent Electronic Device, IED (30).

3. The method of claim 1 or claim 2, wherein selectively modifying the weighting function(s) (62; 70) comprises modifying weighting functions (62; 70) associated with different training cases independently of each other.

4. The method of any one of the preceding claims, wherein selectively modifying the weighting function(s) (62; 70) comprises shifting at least a rising flank (63) of the weighting function (62; 70; 82) associated with a training case relative to sample times of the target output time series (61) of the training case.

5. The method of claim 4, wherein the target output time series (61) of the training case changes its value at a sample time (67), and wherein shifting at least the rising flank (63) comprises reducing a delay (69; 69') of the rising flank (63) of the weighting function (62; 70; 82) relative to the sample time (67) at which the target output time series (61) changes its value.

6. The method of claim 5, wherein the weighting function (62; 70) has weighting function values that, for sample times between the sample time (67) at which the target output time series (61) changes its value and an onset time (68) of the rising flank (63) of the weighting function, are smaller than weighting function values for sample times prior to the sample time (67) at which the target output time series (61) changes its value and/or weighting function values for sample times subsequent to the onset time (68) of the rising flank of the weighting function.

7. The method of claim 5 or claim 6, wherein the weighting function (62; 70) is zero for sample times between the sample time (67) at which the target output time series (61) changes its value and the onset time (68) of the rising flank (63) of the weighting function (62; 70).

8. The method of any one of claims 5 to 7, wherein the delay (69) is decremented in several steps in the iterative procedure.

9. The method of any one of the preceding claims, wherein selectively modifying the weighting function(s) (62; 70) associated with one or several of the training cases comprises: determining that a modification criterion is fulfilled for the one or several of the training cases; and modifying the weighting function(s) (62; 70) associated with the one or several of the training cases for which the modification criterion is fulfilled, optionally wherein determining that the modification criterion is fulfilled comprises determining that a number of correct classifications fulfills a threshold comparison criterion.

10. The method of any one of the preceding claims, wherein the weighting functions (62; 70) comprise first weighting functions (62; 70) associated with training cases in which the target output time series (61) varies and second weighting functions (82) associated with training cases in which the target output time series (81) is constant, wherein initializing the weighting functions (62; 70; 82) comprises initializing the first weighting functions (62; 70) to have a dependency on sample time that is different from a dependency on sample time of the second weighting functions (82), optionally wherein initializing the weighting functions (62; 70; 82) comprises initializing the first weighting functions (62; 70) to vary as a function of sample time and initializing the second weighting functions (82) to be constant as a function of sample time.

11. The method of claim 10, wherein the first and second weighting functions (62; 70; 82) are time-continuous functions and the first weighting functions (62; 70) are initialized such that a time-integral of each first weighting function (62; 70) depends on a time-integral of each second weighting function (82), the integrals being respectively computed over a time period representing a period defined by all sample times of the target output time series (61; 81) of the training cases; or the first and second weighting functions (62; 70; 82) are time-discrete functions and the first weighting functions (62; 70) are initialized such that a sum of values of each first weighting function (62; 70) depends on a sum of values of each second weighting function (82), the sums being respectively computed by a summation of the weighting function values for all sample times represented by the target output time series (61; 81) of the training cases.

12. The method of any one of the preceding claims, wherein each ML model of the at least one ML model has an input layer (111) operative to receive one or several time series representative of electrical characteristics of an electric power system, and an output layer (113) operative to output a protection command for performing a protective or corrective action for an asset of the electric power system, optionally wherein the decision logic (34) is a distance protection or time domain protection logic (34), the one or several time series representative of electrical characteristics comprise current and/or voltage measurements for one or several phases or features determined from current and/or voltage measurements for one or several phases, and the protection command is operative to change between values corresponding to circuit breaker trip and restrain, and/or wherein each ML model of the at least one ML model further comprises at least one recurrent neural network layer (112), in particular a long short-term memory, LSTM, layer or gated recurrent unit, GRU, cell (120).

13. A method of performing asset protection or monitoring, comprising: generating a decision logic (34) for an intelligent electronic device, IED (30), using the method of any one of claims 1 to 12; and storing the decision logic (34) in a memory or storage device of the IED (30) for execution by the IED; wherein the method optionally further comprises executing, by the IED (30), the decision logic (34), comprising triggering corrective, protective, and/or mitigating actions responsive to the decision logic output.

14. A system (130) for generating a decision logic (34) operative to process a time-series input and to generate a decision logic output, in particular for generating a decision logic (34) for an electric power system or industrial automation control system, the system (130) comprising:
an interface (135) operative to retrieve, from a memory or storage medium (140), at least one training dataset comprising a plurality of training cases, each training case comprising a training input time series (55; 56) and a target output time series (61); and
at least one integrated circuit (131) operative to:
initialize weighting functions (62; 70; 82), each weighting function (62; 70; 82) being respectively associated with a training case of the plurality of training cases; and
perform an iterative procedure comprising several iterations that respectively comprise:
performing at least one training step for training at least one machine learning, ML, model that reduces a value of an aggregated loss function, the aggregated loss function being dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function (62; 70; 82) associated with the respective training case, each loss function being dependent on a difference between the target output time series (61) of the respective training case and an output time series (61) provided by the ML model responsive to the training input time series (55; 56) of the respective training case;
selectively modifying the weighting function(s) (62; 70) associated with one or several of the training cases between at least some successive iterations of the iterative procedure; and
using the modified weighting function(s) (62'; 62") when performing at least one subsequent training step.

15. An electric power system (10), comprising: an intelligent electronic device, IED (30); and the system (130) of claim 14 operative to generate a decision logic (34) and to provide the decision logic (34) to the IED (30) for execution.

Description:
METHOD AND SYSTEM FOR GENERATING A DECISION LOGIC AND ELECTRIC POWER SYSTEM

FIELD OF THE APPLICATION

Embodiments of the application relate to systems and methods for generating a decision logic and to an electric power system. Embodiments of the application relate in particular to devices and systems for performing asset protection, monitoring or control, and to techniques for generating a decision logic for such devices and/or systems. Embodiments of the application relate to devices, systems, and methods for generating a decision logic that can be used to detect a fault and to initiate a corrective, protective, and/or mitigating action in response to a fault detection.

BACKGROUND OF THE APPLICATION

There are a number of applications where protection, monitoring, and/or control systems must be included in order to prevent major failures. Examples of such applications include, amongst others, power system protection, where protection devices are used to disconnect faulted parts of an electrical network, and process monitoring systems used for identifying anomalous behaviors in an industrial plant which might be indicative of a developing failure. These protection, monitoring, and/or control systems may include high levels of automation because decisions might need to be made more quickly than is possible for a human operator, and/or because there are often too many devices (and signals recorded by those devices) to be monitored at any given time.

The automation may include a decision logic which is used to determine whether or not a mitigation action (e.g. tripping a circuit breaker, or signaling an alarm to an operator) is to be triggered. It is important that the correct decision be made as quickly as possible.

Various methods may be used for analyzing time series data for monitoring purposes. In recent years, recurrent neural networks (RNNs) have grown in popularity. RNN architectures such as long short-term memory (LSTM) or gated recurrent unit (GRU) allow machine learning (ML) models to be trained to detect a specific event or situation, with a training set comprising an input time series and a desired output value at each time step. Several challenges need to be properly addressed in order not to overfit the ML model and to ensure correct and robust ML-model-based fault detection or alarm raising.

When critical decisions need to be taken automatically (such as in electric power systems or industrial process control), security, dependability, and speed are key performance indicators. Security (also referred to as selectivity) may relate to restraining from operation for a normal state or for faults outside of a protected zone. Dependability (also referred to as sensitivity) means operating in case of a fault inside the protected zone. For illustration, when a trip decision needs to be taken by a relay or other protection device of an electric power system, there is a need to ensure that fault cases are reliably identified in a timely manner that mitigates the risk of system failure, while reducing or essentially eliminating incorrect trips.

SUMMARY

There is a need in the art for enhanced methods and systems for automatically or semi-automatically generating a decision logic that can be executed by an intelligent electronic device (IED) to automatically take decisions. There is also a need in the art for enhanced methods and systems operative to generate a decision logic that provides enhanced dependability and speed for critical decision making, without requiring human expert knowledge for distinguishing simpler and more challenging training cases during training. There is also a need in the art for enhanced methods and systems operative to generate a decision logic which, in field use, receives time-series input and provides a time-series output, the time-series output being indicative of whether a corrective, protective, and/or mitigating action is to be taken. There is also a need in the art for enhanced devices that execute such a decision logic and/or for enhanced protection methods that employ such a decision logic.

According to the application, methods and systems as recited in the independent claims are provided. The dependent claims define preferred embodiments.

Methods and systems according to embodiments are operative to generate a decision logic by training one or several machine learning (ML) models using a training kernel technique. The training kernels are adjusted automatically during ML model training so that the resulting trained ML model can provide a correct decision as quickly as possible. The speed may generally be dependent on the complexity of the specific training case. Training kernels associated with different training cases may be adjusted independently from each other, respectively in an automated manner that may invoke an objective criterion without being dependent on a human expert classification of the complexity of a training case.

The methods and systems may be used in association with a protection relay to provide power system protection with improved performance. The methods and systems may be used in association with distance protection or time domain protection, with the decision logic being operative to process a time series of input features that are or depend on measured electric characteristics, determine whether a trip is to be performed in a zone for which an intelligent electronic device (IED) that executes the decision logic is responsible, and generate an output that triggers a corrective, protective, and/or mitigating action (such as a circuit breaker (CB) trip). The methods and systems may be used more broadly in association with asset monitoring and control.

A method of generating a decision logic operative to process a time-series input and to generate a decision logic output is provided. The method may be performed by at least one integrated circuit and may comprise retrieving, from a memory or storage medium, at least one training dataset comprising a plurality of training cases, each training case comprising a training input time-series and a target output time-series. The method may comprise initializing weighting functions, each weighting function being respectively associated with a training case of the plurality of training cases. The method may comprise performing an iterative procedure comprising several iterations that respectively comprise performing at least one training step for training at least one machine learning (ML) model that reduces a value of an aggregated loss function. The aggregated loss function may be dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function associated with the respective training case. Each loss function may be dependent on a difference between the target output time-series of the respective training case and an output time-series provided by the ML model responsive to the training input time-series of the respective training case. The iterations of the iterative procedure may respectively comprise selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure. The iterations of the iterative procedure may comprise using the modified weighting function(s) when performing at least one subsequent training step.
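
For illustration only, the following minimal sketch shows how per-case weighting functions could enter such an iterative procedure. It is a toy example under assumed conventions; the names (cases, weights, model_output, aggregated_loss), the simple sigmoid stand-in model, and the numerical gradient are illustrative placeholders and are not part of the application.

    import numpy as np

    # Toy setup: each training case pairs an input time series with a target output time series,
    # and each case has its own weighting function over the sample times.
    rng = np.random.default_rng(0)
    T, n_cases = 64, 8
    cases = [(rng.normal(size=T), (np.arange(T) >= 32).astype(float)) for _ in range(n_cases)]
    weights = [np.ones(T) for _ in range(n_cases)]   # initialized weighting functions
    theta = np.zeros(2)                              # parameters of a toy (non-recurrent) stand-in model

    def model_output(x, theta):
        # Stand-in for an ML model mapping an input time series to an output time series.
        return 1.0 / (1.0 + np.exp(-(theta[0] * x + theta[1])))

    def aggregated_loss(theta):
        # Per-case loss: weighted absolute difference summed over sample times;
        # aggregated loss: sum over (a sub-set of) the training cases.
        return sum(np.sum(w * np.abs(y - model_output(x, theta)))
                   for (x, y), w in zip(cases, weights))

    for iteration in range(100):
        # One training step that reduces the aggregated loss (crude forward-difference gradient).
        grad = np.array([(aggregated_loss(theta + 1e-4 * e) - aggregated_loss(theta)) / 1e-4
                         for e in np.eye(2)])
        theta -= 0.05 * grad
        # Between iterations, the weighting functions of selected cases would be modified here,
        # e.g. by shifting a rising flank (see the criterion and kernel shape discussed below).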

The present application also relates to a method of performing asset protection or monitoring. The method comprises generating a decision logic for an intelligent electronic device (IED), the decision logic being operative to process a time-series input and to generate a decision logic output; and executing, by the IED, the decision logic, comprising triggering corrective, protective, and/or mitigating actions responsive to the decision logic output. The generating of the decision logic may be performed by at least one integrated circuit and comprises: retrieving, from a memory or storage medium, at least one training dataset comprising a plurality of training cases, each training case comprising a training input time series and a target output time series; and performing an iterative procedure comprising several iterations that respectively comprise: performing at least one training step for training at least one machine learning (ML) model that reduces a value of an aggregated loss function, the aggregated loss function being dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function associated with the respective training case, each loss function being dependent on a difference between the target output time series of the respective training case and an output time series provided by the ML model responsive to the training input time series of the respective training case; selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure; and using the modified weighting function(s) when performing at least one subsequent training step. The method may further comprise initializing the weighting functions before performing the iterative procedure. The corrective, protective, and/or mitigating actions may comprise at least one of tripping a circuit breaker or signaling an alarm to an operator.

The decision logic may be a decision logic for an electric power system.

The decision logic may be a sub-function of a protection function.

The decision logic may be a sub-component of a protection function.

The decision logic may be a distance protection or time domain protection logic for an electric power transmission system or a sub-component of these or other protection functions.

The decision logic may be a decision logic for an industrial automation control system.

For each training case, both the target output time series and the weighting function associated with the respective training case may be a function of sample time.

For each training case, the output time-series may have N time-sequential values corresponding to a series of consecutive sample times and the weighting function may have N time-sequential values corresponding to the series of consecutive sample times.

The method may comprise terminating the iterative procedure in response to determining that a termination criterion is fulfilled.

The method may comprise storing an ML model of the at least one ML model trained in the iterative procedure as decision logic for execution by at least one decision-making device, in particular for execution by an IED.

The generating of the decision logic may further comprise terminating the iterative procedure in response to determining that a termination criterion is fulfilled. The generating of the decision logic may further comprise storing an ML model of the at least one ML model trained in the iterative procedure as decision logic for execution by at least one decision-making device, in particular for execution by an IED.

Selectively modifying the weighting function(s) may comprise modifying weighting functions associated with different training cases independently of each other.

Selectively modifying the weighting function(s) may comprise shifting at least a rising flank of the weighting function associated with a training case relative to sample times of the target output time-series of the training case.

The target output time-series of the training case may change its value at a sample time of the time-series. The target output time-series of the training case may change its value abruptly at that sample time, e.g., by exhibiting a discontinuity.

The target output time-series of the training case may change its value abruptly from one of a set of discrete possible output values to another one of the set of discrete possible values at the sample time.

Shifting at least the rising flank may comprise reducing a delay of the rising flank of the weighting function relative to the sample time at which the target output time-series changes.

Shifting at least the rising flank may comprise reducing a delay of the rising flank of the weighting function, along a temporal dimension of the sample times, relative to the sample time at which a class indicated by the target output time-series of the training case changes its value.

The weighting function may have weighting function values that, for sample times between the sample time at which the target output time-series changes its value and an onset time of the rising flank of the weighting function, are smaller than weighting function values for sample times prior to the sample time at which the target output time-series changes its value and/or weighting function values for sample times subsequent to the onset time of the rising flank of the weighting function.

The weighting function may be zero for sample times between the sample time at which the target output time-series changes its value and the onset time of the rising flank of the weighting function.

The delay may be decremented in several steps in the iterative procedure.
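
Purely as one illustrative formalization of the preceding paragraphs (the symbols t_c, d_k, w_k, and Delta are introduced here for convenience and do not appear in the application), a time-discrete weighting function with a rising flank delayed by d_k samples relative to the sample time t_c at which the target output changes, and with the delay decremented between iterations, may be written as

    w_k(t) =
    \begin{cases}
    1, & t < t_c \\
    0, & t_c \le t < t_c + d_k \\
    1, & t \ge t_c + d_k
    \end{cases}
    \qquad
    d_{k+1} = \max(d_k - \Delta,\, 0)

Here t_c + d_k plays the role of the onset time of the rising flank, the zero segment suppresses the loss contribution immediately after the target output changes its value, and Delta is an assumed step by which the delay is decremented.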

Selectively modifying the weighting function(s) associated with one or several of the training cases may comprise determining that a modification criterion is fulfilled for the one or several of the training cases.

Selectively modifying the weighting function(s) associated with one or several of the training cases may comprise modifying the weighting function(s) associated with the one or several of the training cases for which the modification criterion is fulfilled.

Determining that the modification criterion is fulfilled may comprise determining that a number of correct classifications of the trained at least one ML model fulfills a threshold comparison criterion.

The threshold comparison criterion may be checked independently for each of several training cases. Weighting functions associated with different training cases may be modified in different iterations of the iterative procedure, depending on when the threshold comparison criterion is fulfilled for the respective training cases.
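
A minimal sketch of such a per-case modification criterion, under assumed conventions (the function name, the 0.5 classification threshold, and the delay bookkeeping are illustrative and not the application's prescribed implementation), could look as follows:

    import numpy as np

    def maybe_modify_weights(model_outputs, targets, weights, delays, step=4, threshold=0.9):
        """Per training case: check a threshold comparison criterion on the number of correct
        classifications and, only where it is fulfilled, shift the rising flank of that case's
        weighting function by decrementing its delay. Cases are treated independently."""
        for c, (y_hat, y) in enumerate(zip(model_outputs, targets)):
            correct = np.sum((y_hat > 0.5) == (y > 0.5))       # correct classifications over sample times
            if correct / len(y) >= threshold:                  # modification criterion fulfilled
                delays[c] = max(delays[c] - step, 0)           # decrement the delay for this case only
                changes = np.flatnonzero(np.diff(y) != 0)
                if changes.size:                               # only cases whose target output varies
                    t_change = int(changes[0]) + 1
                    weights[c] = np.ones_like(y)
                    weights[c][t_change:t_change + delays[c]] = 0.0  # zero until the onset of the rising flank
        return weights, delays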

The weighting functions may comprise first weighting functions associated with training cases in which the target output time-series varies and second weighting functions associated with training cases in which the target output time-series is constant. Initializing the weighting functions may comprise initializing the first weighting functions to have a dependency on sample time that is different from a dependency on sample time of the second weighting functions.

Initializing the weighting functions may comprise initializing the first weighting functions to vary as a function of sample time.

Initializing the weighting functions may comprise initializing the second weighting functions to be constant as a function of sample time.

The first and second weighting functions may be time-continuous functions.

The first weighting functions may be initialized such that a time-integral of each first weighting function depends on a time-integral of each second weighting function, the integrals being respectively computed over a time period representing a period defined by all sample times of the target output time-series of the training cases.

The first and second weighting functions may be time-discrete functions.

The first weighting functions may be initialized such that a sum of values of each first weighting function depends on a sum of values of each second weighting function, the sums being respectively computed by a summation of the weighting function values for all sample times represented by the target output time-series of the training cases.
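
For the time-discrete case, one simple illustrative reading of this condition (with assumed notation; the application only requires that the sums depend on each other) is that the summed weight over all N sample times is matched between the two groups:

    \sum_{t=1}^{N} w^{\text{first}}_{c}(t) \;=\; \sum_{t=1}^{N} w^{\text{second}}_{c'}(t)

For example, if a second (constant) weighting function takes the value 1 at each of the N sample times, a first weighting function that is non-zero over only M < N sample times could be initialized to the value N/M on those samples so that both sums equal N.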

Each ML model of the at least one ML model may have an input layer operative to receive one or several time series representative of electrical characteristics of an electric power system.

Each ML model of the at least one ML model may have an output layer operative to output a protection command for performing a protective or corrective action for an asset of the electric power system.

Each ML model of the at least one ML model may have at least one recurrent neural network (RNN) layer, such as a long short-term memory (LSTM) layer or gated recurrent unit (GRU) cells.
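
One possible realization of such a model, sketched here with PyTorch purely for illustration (the class name, feature count, and layer sizes are assumptions and not mandated by the application), combines an input layer for the electrical feature time series, a recurrent LSTM layer, and an output layer producing a per-sample-time protection command:

    import torch
    import torch.nn as nn

    class ProtectionDecisionLogic(nn.Module):
        """Illustrative sketch: input layer, recurrent (LSTM) layer, and output layer
        producing a trip/restrain command probability for every sample time."""
        def __init__(self, n_features: int = 6, hidden_size: int = 32):
            super().__init__()
            self.input_layer = nn.Linear(n_features, hidden_size)   # e.g. three-phase currents and voltages
            self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
            self.output_layer = nn.Linear(hidden_size, 1)           # protection command output

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, time, n_features) -> output time series of shape (batch, time, 1)
            h = torch.relu(self.input_layer(x))
            h, _ = self.rnn(h)
            return torch.sigmoid(self.output_layer(h))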

The decision logic may be a distance protection or time domain protection logic or any subfunction of these or other protection functions.

The one or several time series representative of electrical characteristics may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases. Alternatively or additionally, the one or several time series representative of electrical characteristics may comprise current and/or voltage measurements for a DC grid or features determined from current and/or voltage measurements for a direct current (DC) grid or a hybrid alternating current (AC) - DC grid. Alternatively or additionally, the one or several time series representative of electrical characteristics may comprise current and/or voltage measurements obtained on a distributed energy resource (DER).

These one or several time series may be received by an input layer of a ML model (during training) or of the trained decision logic (during field use). The protection command may be operative to change between values corresponding to circuit breaker trip and restrain. The protection command may be operative to change between two or more discrete states for a converter and/or coupler (e.g., in a DC grid, such as a high voltage DC grid, or a hybrid AC-DC grid; and/or for a system comprising one or several DERs).

The one or several time series representative of electrical characteristics may comprise measurements of transformer characteristics. The transformer characteristics may include any one or any combination of insulation oil temperature, insulation oil composition, dissolved gas concentration(s) of one or several gases, transformer breather characteristics, transformer breather desiccant characteristics, without being limited thereto. These one or several time series may be received by an input layer of a ML model (during training) or of the trained decision logic (during field use).

The protection command may be operative to change between values corresponding to normal and abnormal transformer conditions.

The one or several time series representative of electrical characteristics may comprise measurements of tap changer and/or tap changer switch characteristics. The tap changer and/or tap changer switch characteristics may include current and/or voltage measurements obtained at at least one terminal of the tap changer and/or tap changer switch, features derived therefrom (such as impedance or admittance), oil characteristics, etc. These one or several time series may be received by an input layer of a ML model (during training) or of the trained decision logic (during field use).

The protection command may be operative to change between values corresponding to different tap changer positions. The protection command may be operative to change between values corresponding to normal and abnormal tap changer and/or tap changer switch positions.

A training step may respectively comprise adjusting parameters of the at least one ML model.

Adjusting the parameters may comprise adjusting one or several of biases, forwarding functions, weights, or other parameters of an artificial neural network (ANN) ML model, in particular, of an RNN.

Adjusting the parameters may comprise adjusting the parameters using an optimization procedure with an objective of reducing the aggregated loss function.

Adjusting the parameters may comprise updating the parameters using gradient descent, stochastic gradient descent (SGD), a nonlinear conjugate gradient technique, a limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS), a Levenberg-Marquardt Algorithm (LMA), a population-based training algorithm such as an evolutionary algorithm (EA) or a particle swarm optimization (PSO), without being limited thereto.
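
As a brief, self-contained usage illustration of one of these options (stochastic gradient descent via the PyTorch optimizer API; the toy parameters, data, and learning rate are assumptions), a gradient-based parameter update on a weighted loss might look as follows:

    import torch

    theta = torch.zeros(2, requires_grad=True)        # toy trainable parameters
    x = torch.randn(64)                                # toy training input time series
    y = (torch.arange(64) >= 32).float()               # toy target output time series
    w = torch.ones(64)                                 # weighting function for this training case

    optimizer = torch.optim.SGD([theta], lr=0.1)
    for _ in range(50):
        optimizer.zero_grad()
        y_hat = torch.sigmoid(theta[0] * x + theta[1])
        loss = (w * (y_hat - y).abs()).sum()           # weighted loss for the training case
        loss.backward()
        optimizer.step()                               # one gradient-based parameter update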

The weighted loss function for a training case may be a sum of a modulus of a difference between a value of the target output time-series of the respective training case and a value of the output time-series provided by the ML model responsive to the training input time-series of the respective training case at the respective sample time, multiplied by the weighting function associated with the training case at the respective sample time, with the sum being taken over the sample times.

The aggregated loss function may be a sum of the loss functions weighted by the weighting function, with the sum of the loss functions being computed over the training cases.
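
In illustrative notation (symbols chosen here for convenience), the two preceding paragraphs correspond to

    L_c(\theta) = \sum_{t=1}^{N} w_c(t)\,\bigl|\hat{y}_c(t;\theta) - y_c(t)\bigr|,
    \qquad
    L_{\text{agg}}(\theta) = \sum_{c \in \mathcal{C}} L_c(\theta),

where y_c is the target output time-series of training case c, \hat{y}_c the output time-series provided by the ML model with parameters \theta responsive to the training input time-series of case c, w_c the weighting function associated with case c, and \mathcal{C} the (sub-)set of training cases used in the training step.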

Other techniques may be used to compute the loss function and aggregated loss function. For illustration, the loss function may be computed as entropy.

A method of generating a decision logic operative to process a time-series input and to generate a decision logic output according to another aspect is provided. The method may be performed by at least one integrated circuit and may comprise retrieving, from a memory or storage medium, at least one training dataset comprising a plurality of training cases, each training case comprising a training input time-series and a target output time-series. The method may comprise performing a training kernel machine learning procedure to train at least one ML model using the training cases. The training kernel machine learning procedure may comprise automatically modifying, by the at least one integrated circuit, training kernels used to weight loss functions of training cases in the training kernel machine learning procedure.

Automatically adjusting the training kernels may comprise automatically adjusting training kernels associated with different training cases independently of each other.

The decision logic may be a decision logic for an electric power system.

The decision logic may be a distance protection or time domain protection logic for an electric power transmission system.

The decision logic may be a decision logic for an industrial automation control system.

For each training case, both the target output time series and the training kernel function associated with the respective training case may be a function of sample time.

For each training case, the output time-series may have N time-sequential values corresponding to a series of consecutive sample times and the training kernel function may have N time-sequential values corresponding to the series of consecutive sample times.

The method may comprise terminating the training kernel machine learning procedure in response to determining that a termination criterion is fulfilled.

The method may comprise storing an ML model of the at least one ML model trained in the training kernel machine learning procedure as decision logic for execution by at least one decision-making device, in particular, for execution by an IED.

Automatically modifying the training kernels may comprise modifying training kernels associated with different training cases independently of each other. Automatically modifying the training kernels may comprise shifting at least a rising flank of the training kernel associated with a training case relative to sample times of the target output time-series of the training case.

The target output time-series of the training case may change its value at a sample time of the time-series.

The target output time-series of the training case may change its value abruptly at that sample time, e.g., by exhibiting a discontinuity.

The target output time-series of the training case may change its value abruptly from one of a set of discrete possible output values to another one of the set of discrete possible values at the sample time.

Shifting at least the rising flank may comprise reducing a delay of the rising flank of the training kernel relative to the sample time at which the target output time-series changes its value.

Shifting at least the rising flank may comprise reducing a delay of the rising flank of the weighting function, along a temporal dimension of the sample times, relative to the sample time at which a class indicated by the target output time-series of the training case changes its value.

The training kernel may have training kernel values that, for sample times between the sample time at which the target output time-series changes its value and an onset time of the rising flank of the training kernel, are smaller than training kernel values for sample times prior to the sample time at which the target output time-series changes its value and/or training kernel values for sample times subsequent to the onset time of the rising flank of the training kernel.

The training kernel may be zero for sample times between the sample time at which the target output time-series changes its value and the onset time of the rising flank of the training kernel.

The delay may be decremented in several steps in the training kernel machine learning procedure, independently for each training case.

Automatically modifying the training kernels may comprise determining that a modification criterion is fulfilled for one or several of the training cases.

Automatically modifying the training kernels may comprise modifying the training kernels for which the modification criterion is fulfilled.

Determining that the modification criterion is fulfilled may comprise determining that a number of correct classifications of the trained at least one ML model fulfills a threshold comparison criterion.

The threshold comparison criterion may be checked independently for each of several training cases. Training kernels associated with different training cases may be modified in different iterations of the training kernel machine learning procedure, depending on when the threshold comparison criterion is fulfilled for the respective training cases.

Training kernels associated with different training cases may be modified independently of each other.

The training kernels may comprise first training kernels associated with training cases in which the target output time-series varies and second training kernels associated with training cases in which the target output time-series is constant.

Initializing the training kernels may comprise initializing the first training kernels to have a dependency on sample time that is different from a dependency on sample time of the second training kernels.

Initializing the training kernels may comprise initializing the first training kernels to vary as a function of sample time and initializing the second training kernels to be constant as a function of sample time.

The first and second training kernels may be time-continuous functions and the first training kernels are initialized such that a time-integral of each first training kernel depends on a time-integral of each second training kernel, the integrals being respectively computed over a time period representing a period defined by all sample times of the target output time-series of the training cases.

The first and second training kernels may be time-discrete functions and the first training kernels are initialized such that a sum of values of each first training kernel depends on a sum of values of each second training kernel, the sums being respectively computed by a summation of the training kernel values for all sample times represented by the target output time-series of the training cases.

Each ML model of the at least one ML model may have an input layer operative to receive one or several time series representative of electrical characteristics of an electric power system.

Each ML model of the at least one ML model may have an output layer operative to output a protection command for performing a protective or corrective action for an asset of the electric power system.

Each ML model of the at least one ML model may have at least one recurrent neural network (RNN) layer, such as a long short-term memory (LSTM) layer or gated recurrent unit (GRU) cells.

The decision logic may be a distance protection or time domain protection logic.

The one or several time series representative of electrical characteristics may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases.

The protection command may be operative to change between values corresponding to circuit breaker trip and restrain.

A training step may respectively comprise adjusting parameters of the at least one ML model. Adjusting the parameters may comprise adjusting one or several of biases, forwarding functions, weights, or other parameters of an artificial neural network (ANN) ML model, in particular, of an RNN.

Adjusting the parameters may comprise adjusting the parameters using an optimization procedure with an objective of reducing an aggregated loss function that depends on weighted loss functions of at least a sub-set of the training cases, each weighted loss function being respectively weighted by the training kernel associated with the respective training case.

At least some of the training kernels may vary during the iterative training kernel machine learning procedure.

Adjusting the parameters may comprise updating the parameters using gradient descent, stochastic gradient descent (SGD), a nonlinear conjugate gradient technique, a limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS), a Levenberg-Marquardt Algorithm (LMA), a population-based training algorithm such as an evolutionary algorithm (EA) or a particle swarm optimization (PSO), without being limited thereto.

The weighted loss function for a training case may be a sum of a modulus of a difference between a value of the target output time-series of the respective training case and a value of the output time-series provided by the ML model responsive to the training input time-series of the respective training case at the respective sample time, multiplied by the training kernel associated with the training case at the respective sample time, with the sum being taken over the sample times.

The aggregated loss function may be a sum of the loss functions weighted by the training kernel, with the sum of the loss functions being computed over the training cases.

Other techniques may be used to compute the loss function and aggregated loss function. For illustration, the loss function may be computed as entropy.

A method of performing asset protection or monitoring according to an aspect comprises generating a decision logic for an intelligent electronic device (IED) using the method according to an embodiment and executing, by the IED, the decision logic, comprising performing corrective, protective, and/or mitigating actions responsive to the decision logic output.

The decision logic may receive one or several measurement time series from one or several measurement devices. The decision logic may process the one or several measurement time-series to generate the decision logic output.
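
A minimal field-use sketch (assuming a PyTorch model such as the illustrative one above; the function name and the 0.5 threshold are hypothetical) of processing measurement time series into a decision logic output could be:

    import torch

    def run_decision_logic(model: torch.nn.Module, measurement_series: torch.Tensor,
                           trip_threshold: float = 0.5) -> bool:
        """Feed one or several measurement time series (shape: time x features) into the
        trained decision logic and derive a binary decision from its output time series."""
        with torch.no_grad():
            output_series = model(measurement_series.unsqueeze(0))  # (1, time, 1)
        # True -> trigger a corrective, protective, and/or mitigating action (e.g. a CB trip)
        return bool((output_series.squeeze() > trip_threshold).any())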

The one or several measurement time series may be representative of electrical characteristics of the asset and/or of components of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The one or several measurement time series may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements for a DC grid or features determined from current and/or voltage measurements for a direct current (DC) grid or a hybrid alternating current (AC) - DC grid. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements obtained on a distributed energy resource (DER). These one or several measurement time series may be received by an input layer of the trained decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the asset or at least one component of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The corrective, protective, and/or mitigating action may comprise controlling an output interface. The output interface may be provided at a control center.

The corrective, protective, and/or mitigating action may comprise selectively causing a circuit breaker to trip.

The corrective, protective, and/or mitigating action may comprise controlling a converter and/or coupler (e.g., in a DC grid, such as a high voltage DC grid, or a hybrid AC-DC grid; and/or for a system comprising one or several DERs).

The one or several measurement time series may comprise measurements of transformer characteristics. The transformer characteristics may include any one or any combination of insulation oil temperature, insulation oil composition, dissolved gas concentration(s) of one or several gases, transformer breather characteristics, transformer breather desiccant characteristics, without being limited thereto. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the transformer or a circuit breaker connected to a transformer input and/or output. The corrective, protective, and/or mitigating action may comprise controlling a tap changer for the transformer.

The one or several measurement time series representative of electrical characteristics may comprise measurements of tap changer and/or tap changer switch characteristics. The tap changer and/or tap changer switch characteristics may include current and/or voltage measurements obtained at at least one terminal of the tap changer and/or tap changer switch, features derived therefrom (such as impedance or admittance), oil characteristics, etc. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the tap changer to change a tap changer position and/or controlling the tap changer switch. The IED may have an output operative to output at least one control signal to effect the corrective, protective, and/or mitigating action.

The method may comprise storing the decision logic in a memory or storage device of the IED for execution by the IED, or otherwise deploying the decision logic to the IED.

The IED may be a relay.

The IED may be a distance protection or time domain protection relay for an electric power transmission system.

A method of generating and/or preparing an intelligent electronic device (IED) for field use according to an aspect comprises generating a decision logic for an intelligent electronic device (IED) using the method according to an embodiment and storing the decision logic in a memory or storage device of the IED for execution by the IED, or otherwise deploying the decision logic to the IED.

The decision logic may receive one or several measurement time series from one or several measurement devices. The decision logic may process the one or several measurement time-series to generate the decision logic output.

The one or several measurement time series may be representative of electrical characteristics of the asset and/or of components of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The one or several measurement time series may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements for a DC grid or features determined from current and/or voltage measurements for a direct current (DC) grid or a hybrid alternating current (AC) - DC grid. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements obtained on a distributed energy resource (DER). These one or several measurement time series may be received by an input layer of the trained decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the asset or at least one component of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The corrective, protective, and/or mitigating action may comprise controlling an output interface. The output interface may be provided at a control center.

The corrective, protective, and/or mitigating action may comprise selectively causing a circuit breaker to trip. The corrective, protective, and/or mitigating action may comprise controlling a converter and/or coupler (e.g., in a DC grid, such as a high voltage DC grid, or a hybrid AC-DC grid; and/or for a system comprising one or several DERs).

The one or several measurement time series may comprise measurements of transformer characteristics. The transformer characteristics may include any one or any combination of insulation oil temperature, insulation oil composition, dissolved gas concentration(s) of one or several gases, transformer breather characteristics, transformer breather desiccant characteristics, without being limited thereto. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the transformer or a circuit breaker connected to a transformer input and/or output. The corrective, protective, and/or mitigating action may comprise controlling a tap changer for the transformer.

The one or several measurement time series representative of electrical characteristics may comprise measurements of tap changer and/or tap changer switch characteristics. The tap changer and/or tap changer switch characteristics may include current and/or voltage measurements obtained at at least one terminal of the tap changer and/or tap changer switch, features derived therefrom (such as impedance or admittance), oil characteristics, etc. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the tap changer to change a tap changer position and/or controlling the tap changer switch.

The IED may have an output operative to output at least one control signal to effect the corrective, protective, and/or mitigating action.

The IED may be a relay.

The IED may be a distance protection or time domain protection relay for an electric power transmission system.

Computer-executable instruction code according to an embodiment comprises instructions that, when executed by at least one integrated circuit, cause the at least one integrated circuit to perform the method according to an embodiment.

A storage medium according to an embodiment has stored thereon computer-executable instructions that, when executed by at least one integrated circuit, cause the at least one integrated circuit to perform the method according to an embodiment.

A system for generating a decision logic operative to process a time-series input and to generate a decision logic output is provided. The system comprises an interface to retrieve at least one training dataset comprising a plurality of training cases, each training case comprising a training input time-series and a target output time-series. The system comprises at least one integrated circuit operative to initialize weighting functions, each weighting function being respectively associated with a training case of the plurality of training cases. The at least one integrated circuit may be operative to perform an iterative procedure comprising several iterations that respectively comprise performing at least one training step for training at least one machine learning (ML) model that reduces a value of an aggregated loss function. The aggregated loss function may be dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function (e.g., a kernel) associated with the respective training case. Each loss function may be dependent on a difference between the target output time-series of the respective training case and an output time-series provided by the ML model responsive to the training input time-series of the respective training case. The iterations of the iterative procedure may respectively comprise selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure. The iterations of the iterative procedure may comprise using the modified weighting function(s) when performing at least one subsequent training step. The at least one integrated circuit may be operative to perform the method according to an embodiment.

The present application also relates to an electric power system, comprising: an intelligent electronic device (IED), operative to execute a decision logic operative to process a time-series input and to generate a decision logic output, the IED being operative to perform at least one action responsive to the decision logic output, and a system for generating the decision logic. The system comprises an interface operative to retrieve, from a memory or storage medium, at least one training dataset comprising a plurality of training cases, each training case comprising a training input time series and a target output time series; and at least one integrated circuit operative to: perform an iterative procedure comprising several iterations that respectively comprise: performing at least one training step for training at least one machine learning, ML, model that reduces a value of an aggregated loss function, the aggregated loss function being dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function associated with the respective training case, each loss function being dependent on a difference between the target output time series of the respective training case and an output time series provided by the ML model responsive to the training input time series of the respective training case; selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure; and using the modified weighting function(s) when performing at least one subsequent training step. The at least one integrated circuit may be operative to initialize the weighting functions before the iterative procedure is performed.

According to another aspect, there is provided a system comprising an interface to retrieve at least one training dataset comprising a plurality of training cases, each training case comprising a training input time-series and a target output time-series, and at least one integrated circuit operative to perform the method according to an embodiment.

The system may be operative to generate a decision logic for an electric power system.

The system may be operative to generate a distance protection or time domain protection logic for an electric power transmission system.

The system may be operative to generate a decision logic for an industrial automation control system.

The system may be operative to terminate the iterative procedure in response to determining that a termination criterion is fulfilled.

The system may be operative to cause an ML model of the at least one ML model trained in the iterative procedure to be stored as decision logic for execution by at least one decision-making device, in particular, for execution by an IED.

The system may be operative such that selectively modifying the weighting function(s) may comprise modifying weighting functions associated with different training cases independently of each other.

The system may be operative such that selectively modifying the weighting function(s) may comprise shifting at least a rising flank of the weighting function associated with a training case relative to sample times of the target output time-series of the training case.

The system may be operative such that the target output time-series of the training case may change its value at a sample time of the time-series.

The target output time-series of the training case may change its value abruptly at that sample time, e.g., by exhibiting a discontinuity.

The target output time-series of the training case may change its value abruptly from one of a set of discrete possible output values to another one of the set of discrete possible values at the sample time.

The system may be operative such that shifting at least the rising flank may comprise reducing a delay of the rising flank of the weighting function relative to the sample time at which the target output time-series changes its value.

The system may be operative such that shifting at least the rising flank may comprise reducing a delay of the rising flank of the weighting function, along a temporal dimension of the sample times, relative to the sample time at which a class indicated by the target output time-series of the training case changes its value.

The system may be operative such that the weighting function may have weighting function values that, for sample times between the sample time at which the target output time-series changes its value and an onset time of the rising flank of the weighting function, are smaller than weighting function values for sample times prior to the sample time at which the target output time-series changes its value and/or weighting function values for sample times subsequent to the onset time of the rising flank of the weighting function.

The system may be operative such that the weighting function may be zero for sample times between the sample time at which the target output time-series changes its value and the onset time of the rising flank of the weighting function.

The system may be operative such that the delay may be decremented in several steps in the iterative procedure.

The system may be operative such that selectively modifying the weighting function(s) associated with one or several of the training cases may comprise determining that a modification criterion is fulfilled for the one or several of the training cases.

The system may be operative such that selectively modifying the weighting function(s) associated with one or several of the training cases may comprise modifying the weighting function(s) associated with the one or several of the training cases for which the modification criterion is fulfilled.

The system may be operative such that determining that the modification criterion is fulfilled may comprise determining that a number of correct classifications of the trained at least one ML model fulfills a threshold comparison criterion.

The system may be operative such that the threshold comparison criterion may be checked independently for each of several training cases. Weighting functions associated with different training cases may be modified in different iterations of the iterative procedure, depending on when the threshold comparison criterion is fulfilled for the respective training cases.

The system may be operative such that the weighting functions may comprise first weighting functions associated with training cases in which the target output time-series varies and second weighting functions associated with training cases in which the target output time-series is constant.

The system may be operative such that initializing the weighting functions may comprise initializing the first weighting functions to have a dependency on sample time that is different from a dependency on sample time of the second weighting functions.

The system may be operative such that initializing the weighting functions may comprise initializing the first weighting functions to vary as a function of sample time and initializing the second weighting functions to be constant as a function of sample time.

The system may be operative such that the first and second weighting functions may be time-continuous functions and the first weighting functions are initialized such that a time-integral of each first weighting function depends on a time-integral of each second weighting function, the integrals being respectively computed over a time period representing a period defined by all sample times of the target output time-series of the training cases.

The system may be operative such that the first and second weighting functions may be time-discrete functions and the first weighting functions are initialized such that a sum of values of each first weighting function depends on a sum of values of each second weighting function, the sums being respectively computed by a summation of the weighting function values for all sample times represented by the target output time-series of the training cases.

The system may be operative such that each ML model of the at least one ML model may have an input layer operative to receive one or several time series representative of electrical characteristics of an electric power system.

The system may be operative such that each ML model of the at least one ML model may have an output layer operative to output a protection command for performing a protective or corrective action for an asset of the electric power system.

The system may be operative such that each ML model of the at least one ML model may have at least one recurrent neural network (RNN) layer, such as a long short-term memory (LSTM) layer or gated recurrent unit (GRU) cells.

The system may be operative such that the one or several time series representative of electrical characteristics may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases.

The system may be operative such that the protection command may be operative to change between values corresponding to circuit breaker trip and restrain.

The system may be operative such that a training step may respectively comprise adjusting parameters of the at least one ML model.

The system may be operative such that adjusting the parameters may comprise adjusting one or several of biases, forwarding functions, weights, or other parameters of an artificial neural network (ANN) ML model, in particular, of an RNN.

The system may be operative such that adjusting the parameters may comprise adjusting the parameters using an optimization procedure with an objective of reducing the aggregated loss function.

The system may be operative such that adjusting the parameters may comprise updating the parameters using gradient descent, stochastic gradient descent (SGD), a nonlinear conjugate gradient technique, a limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS), a Levenberg-Marquardt algorithm (LMA), a population-based training algorithm such as an evolutionary algorithm (EA) or a particle swarm optimization (PSO), without being limited thereto.

The system may be operative such that the weighted loss function for a training case may be a sum of a modulus of a difference between a value of the target output time-series of the respective training case and a value of the output time-series provided by the ML model responsive to the training input time-series of the respective training case at the respective sample time, multiplied by the weighting function associated with the training case at the respective sample time, with the sum being taken over the sample times.

The system may be operative such that the aggregated loss function may be a sum of the loss functions weighted by the respective weighting functions, with the sum of the loss functions being computed over the training cases.

The system may be operative such that other techniques may be used to compute the loss function and aggregated loss function. For illustration, the loss function may be computed as entropy.

An intelligent electronic device (IED) according to an embodiment comprises at least one integrated circuit operative to execute a decision logic generated by a method or system according to an embodiment.

The IED may be operative to perform at least one action (e.g., a corrective and/or protective action) responsive to a decision logic output.

The decision logic may receive one or several measurement time series from one or several measurement devices. The decision logic may process the one or several measurement time-series to generate the decision logic output.

The one or several measurement time series may be representative of electrical characteristics of the asset and/or of components of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The one or several measurement time series may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements for a direct current (DC) grid or a hybrid alternating current (AC)-DC grid, or features determined from such current and/or voltage measurements. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements obtained on a distributed energy resource (DER). These one or several measurement time series may be received by an input layer of the trained decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the asset or at least one component of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The corrective, protective, and/or mitigating action may comprise controlling an output interface. The output interface may be provided at a control center.

The corrective, protective, and/or mitigating action may comprise selectively causing a circuit breaker to trip.

The corrective, protective, and/or mitigating action may comprise controlling a converter and/or coupler (e.g., in a DC grid, such as a high voltage DC grid, or a hybrid AC-DC grid; and/or for a system comprising one or several DERs).

The one or several measurement time series may comprise measurements of transformer characteristics. The transformer characteristics may include any one or any combination of insulation oil temperature, insulation oil composition, dissolved gas concentration(s) of one or several gases, transformer breather characteristics, transformer breather desiccant characteristics, without being limited thereto. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the transformer or a circuit breaker connected to a transformer input and/or output. The corrective, protective, and/or mitigating action may comprise controlling a tap changer for the transformer.

The one or several measurement time series representative of electrical characteristics may comprise measurements of tap changer and/or tap changer switch characteristics. The tap changer and/or tap changer switch characteristics may include current and/or voltage measurements obtained at at least one terminal of the tap changer and/or tap changer switch, features derived therefrom (such as impedance or admittance), oil characteristics, etc. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the tap changer to change a tap changer position and/or controlling the tap changer switch.

The IED may have an output operative to output at least one control signal to effect the corrective, protective, and/or mitigating action.

The IED may be a relay.

The IED may be a distance protection or time domain protection relay.

The IED may have an output operative to cause at least one circuit breaker (CB) to trip.

An electric power system according to an aspect comprises an intelligent electronic device (IED) and a system according to an embodiment for generating a decision logic for the at least one IED.

The electric power system may comprise at least one asset.

An input of the IED may be coupled to one or several measurement devices that are operative to sense electrical characteristics associated with the IED.

An output of the IED may be coupled to the asset and/or other components of the electric power system, for causing a corrective, protective, and/or mitigating action to be performed responsive to an output of the decision logic.

The IED may be operative to perform at least one action (e.g., a corrective and/or protective action) responsive to a decision logic output.

The decision logic may receive one or several measurement time series from one or several measurement devices. The decision logic may process the one or several measurement time-series to generate the decision logic output.

The one or several measurement time series may be representative of electrical characteristics of the asset and/or of components of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The one or several measurement time series may comprise current and/or voltage measurements for one or several phases (e.g., for three phases) or features determined from current and/or voltage measurements for one or several phases. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements for a direct current (DC) grid or a hybrid alternating current (AC)-DC grid, or features determined from such current and/or voltage measurements. Alternatively or additionally, the one or several measurement time series representative of electrical characteristics may comprise current and/or voltage measurements obtained on a distributed energy resource (DER). These one or several measurement time series may be received by an input layer of the trained decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the asset or at least one component of an electric power generation, transmission, and/or distribution system that comprises the asset or that is coupled to the asset.

The corrective, protective, and/or mitigating action may comprise controlling an output interface. The output interface may be provided at a control center.

The corrective, protective, and/or mitigating action may comprise selectively causing a circuit breaker to trip.

The corrective, protective, and/or mitigating action may comprise controlling a converter and/or coupler (e.g., in a DC grid, such as a high voltage DC grid, or a hybrid AC-DC grid; and/or for a system comprising one or several DERs).

The one or several measurement time series may comprise measurements of transformer characteristics. The transformer characteristics may include any one or any combination of insulation oil temperature, insulation oil composition, dissolved gas concentration(s) of one or several gases, transformer breather characteristics, transformer breather desiccant characteristics, without being limited thereto. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the transformer or a circuit breaker connected to a transformer input and/or output. The corrective, protective, and/or mitigating action may comprise controlling a tap changer for the transformer.

The one or several measurement time series representative of electrical characteristics may comprise measurements of tap changer and/or tap changer switch characteristics. The tap changer and/or tap changer switch characteristics may include current and/or voltage measurements obtained at at least one terminal of the tap changer and/or tap changer switch, features derived therefrom (such as impedance or admittance), oil characteristics, etc. These one or several time series may be received by an input layer of the decision logic.

The corrective, protective, and/or mitigating action may comprise controlling the tap changer to change a tap changer position and/or controlling the tap changer switch.

The IED may have an output operative to output at least one control signal to effect the corrective, protective, and/or mitigating action.

The electric power system may comprise at least one circuit breaker (CB).

The IED may be operatively coupled to the at least one CB to cause a trip in case of a fault that occurs in a zone protected by the IED.

The following items refer to particular embodiments:

1. A method of performing asset protection or monitoring, comprising: generating a decision logic for an intelligent electronic device (IED), the decision logic being operative to process a time-series input and to generate a decision logic output; and executing, by the IED, the decision logic, comprising triggering corrective, protective, and/or mitigating actions responsive to the decision logic output, wherein generating the decision logic is performed by at least one integrated circuit and comprises: retrieving, from a memory or storage medium, at least one training dataset comprising a plurality of training cases, each training case comprising a training input time series and a target output time series; and performing an iterative procedure comprising several iterations that respectively comprise: performing at least one training step for training at least one machine learning (ML) model that reduces a value of an aggregated loss function, the aggregated loss function being dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function associated with the respective training case, each loss function being dependent on a difference between the target output time series of the respective training case and an output time series provided by the ML model responsive to the training input time series of the respective training case; selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure; and using the modified weighting function(s) when performing at least one subsequent training step, wherein the method further comprises initializing the weighting functions before performing the iterative procedure.

2. The method of item 1, wherein generating the decision logic further comprises terminating the iterative procedure in response to determining that a termination criterion is fulfilled and storing an ML model of the at least one ML model trained in the iterative procedure as decision logic for execution by at least one decision-making device, in particular for execution by an Intelligent Electronic Device, IED.

3. The method of item 1 or item 2, wherein selectively modifying the weighting function(s) comprises modifying weighting functions associated with different training cases independently of each other.

4. The method of any one of the preceding items, wherein selectively modifying the weighting function(s) comprises shifting at least a rising flank of the weighting function associated with a training case relative to sample times of the target output time series of the training case.

5. The method of item 4, wherein the target output time series of the training case changes its value at a sample time, and wherein shifting at least the rising flank comprises reducing a delay of the rising flank of the weighting function relative to the sample time at which the target output time series changes its value.

6. The method of item 5, wherein the weighting function has weighting function values that, for sample times between the time at which the target output time series changes its value and an onset time of the rising flank of the weighting function, are smaller than weighting function values for sample times prior to the sample time at which the target output time series changes its value and/or weighting function values for sample times subsequent to the onset time of the rising flank of the weighting function.

7. The method of item 5 or item 6, wherein the weighting function is zero for sample times between the sample time at which the target output time series changes its value and the onset time of the rising flank of the weighting function.

8. The method of any one of items 5 to 7, wherein the delay is decremented in several steps in the iterative procedure.

9. The method of any one of the preceding items, wherein selectively modifying the weighting function(s) associated with one or several of the training cases comprises: determining that a modification criterion is fulfilled for the one or several of the training cases; and modifying the weighting function(s) associated with the one or several of the training cases for which the modification criterion is fulfilled, optionally wherein determining that the modification criterion is fulfilled comprises determining that a number of correct classifications fulfills a threshold comparison criterion.

10. The method of any one of the preceding items, wherein the weighting functions comprise first weighting functions associated with training cases in which the target output time series varies and second weighting functions associated with training cases in which the target output time series is constant, wherein initializing the weighting functions comprises initializing the first weighting functions to have a dependency on sample time that is different from a dependency on sample time of the second weighting functions, optionally wherein initializing the weighting functions comprises initializing the first weighting functions to vary as a function of sample time and initializing the second weighting functions to be constant as a function of sample time.

11. The method of item 10, wherein the first and second weighting functions are time-continuous functions and the first weighting functions are initialized such that a time-integral of each first weighting function depends on a time-integral of each second weighting function, the integrals being respectively computed over a time period representing a period defined by all sample times of the target output time series of the training cases; or the first and second weighting functions are time-discrete functions and the first weighting functions are initialized such that a sum of values of each first weighting function depends on a sum of values of each second weighting function, the sums being respectively computed by a summation of the weighting function values for all sample times represented by the target output time series of the training cases.

12. The method of any one of the preceding items, wherein each ML model of the at least one ML model has an input layer operative to receive one or several time series representative of electrical characteristics of an electric power system, and an output layer operative to output a protection command for performing a protective or corrective action for an asset of the electric power system, optionally wherein the decision logic is a distance protection or time domain protection logic, the one or several time series representative of electrical characteristics comprise current and/or voltage measurements for one or several phases or features determined from current and/or voltage measurements for one or several phases, and the protection command is operative to change between values corresponding to circuit breaker trip and restrain, and/or wherein each ML model of the at least one ML model further comprises at least one recurrent neural network layer, in particular a long short-term memory, LSTM, layer or gated recurrent unit, GRU, cell.

13. The method of any one of the preceding items, wherein the decision logic is a decision logic for an electric power system or an industrial automation control system.

14. An electric power system, comprising: an intelligent electronic device, IED, operative to execute a decision logic operative to process a time-series input and to generate a decision logic output, the IED being operative to perform at least one action responsive to the decision logic output, and a system for generating the decision logic, the system comprising: an interface operative to retrieve, from a memory or storage medium, at least one training dataset comprising a plurality of training cases, each training case comprising a training input time series and a target output time series; and at least one integrated circuit operative to: perform an iterative procedure comprising several iterations that respectively comprise: performing at least one training step for training at least one machine learning, ML, model that reduces a value of an aggregated loss function, the aggregated loss function being dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function associated with the respective training case, each loss function being dependent on a difference between the target output time series of the respective training case and an output time series provided by the ML model responsive to the training input time series of the respective training case; selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure; and using the modified weighting function(s) when performing at least one subsequent training step; wherein the at least one integrated circuit is operative to initialize the weighting functions before the iterative procedure is performed.

15. The electric power system of item 14, wherein an output of the IED is coupled to an asset and/or other components of the electric power system, for causing a corrective, protective, and/or mitigating action to be performed responsive to an output of the decision logic.

Various effects and advantages are attained by the methods, systems, and devices according to embodiments. A decision logic can be automatically generated and can be deployed for execution by an intelligent electronic device (IED) to automatically take decisions. The decision logic provides enhanced dependability and speed for critical decision making, without requiring human expert knowledge for distinguishing simpler and more challenging training cases during training. In field use, the decision logic receives time-series input and provides a time-series output, the time-series output being indicative of whether a corrective, protective, and/or mitigating action is to be taken. The techniques can be employed to generate a decision logic for a digital relay, e.g. for a digital substation relay for performing distance protection or time domain protection for power transmission system protection, without being limited thereto.

BRIEF DESCRIPTION OF THE DRAWINGS

The subject-matter of the application will be explained in more detail with reference to preferred exemplary embodiments which are illustrated in the attached drawings, in which:

Figure 1 is a schematic representation of a system comprising an asset protection, monitoring, or control device.

Figure 2 is a flow chart of a method.

Figures 3 and 4 show decision logic outputs.

Figures 5 and 6 show decision logic inputs.

Figure 7 is a schematic representation of a weighting function associated with a training case used in machine learning (ML) model training.

Figure 8 is a schematic representation of the weighting function of Figure 7 adjusted during ML model training.

Figure 9 is a schematic representation of a weighting function associated with a training case used in machine learning (ML) model training.

Figure 10 is a schematic representation of the weighting function of Figure 9 adjusted during ML model training.

Figure 11 is a schematic representation of the weighting function of Figure 9 adjusted during ML model training for a training case that is more challenging than the training case underlying Figure 10, after a same number of iterations of the training as in Figure 10.

Figure 12 is a schematic representation of a weighting function.

Figure 13 is a schematic representation of another weighting function associated with a training case for which a target output time-series does not change its value.

Figure 14 is a flow chart of a method.

Figure 15 is a schematic representation of a logic executed by an intelligent electronic device (IED) for asset protection, monitoring, or control.

Figure 16 is a schematic representation of an ML model.

Figure 17 is a schematic representation of a gated recurrent unit (GRU) of an ML model.

Figure 18 is a schematic representation of a system comprising a computing system and an IED.

DETAILED DESCRIPTION OF EMBODIMENTS

Exemplary embodiments of the application will be described with reference to the drawings in which identical or similar reference signs designate identical or similar elements. While some embodiments will be described in the context of distance protection or time domain protection of power distribution or transmission systems, the methods and devices described in detail below may be used in a wide variety of systems.

The features of embodiments may be combined with each other, unless specifically noted otherwise.

While some embodiments are described in association with a decision logic executed by a digital protection relay for power system protection (e.g., for taking a trip or restrain decision, as a function of time, for a circuit breaker (CB) in response to detection of a fault in a zone protected by the digital protection relay), the embodiments are not limited thereto. Embodiments of the application can be generally employed for critical decision-making systems, for which there is a particular need for dependability and speed of the decision-making.

Weighting functions used in a machine learning (ML) model training are automatically adjusted during the training. The weighting functions are used to weight loss functions associated with training cases, with the weighting being dependent on the sample time in the respective training case. The weighting function may quantify, in dependence on sample time, how strongly a deviation of the time-dependent output of the ML model from the target output time-series of the respective training case is to be weighted. The weighting function may be used to weight a modulus of a difference between the time-dependent output of the ML model and the target output time-series of the respective training case, and/or an entropy.

The training may be performed by adjusting parameters and/or hyperparameters of the ML model in a manner which reduces, e.g., minimizes, an aggregated loss function. The aggregated loss function may be dependent on the loss functions of the training cases, each weighted by the weighting function.

The weighting function may be a training kernel employed in a training kernel ML training procedure.

The aggregated loss function may also be referred to as a "cost function" in the art. It should be understood that the term "cost function" does not refer to financial or business costs but quantifies the suitability of the ML model to generate a desired target output time-series responsive to an input time-series of a training case, respectively aggregated (e.g., summed) over several training cases. The target output time-series of at least some training cases may "toggle." As used herein, the term "toggling" of a signal refers to a change (e.g., from one sample time to a successive sample time) between different output values. The change by "toggling" may be a discontinuous, abrupt change between different output values of a set of two, three or more discrete values or value ranges.

The systems, methods, and devices address the challenges related to training of an ML model for critical decision making where decision quality is of the highest importance, but decision speed is also an important factor. Recurrent neural networks (RNNs) may be trained to classify samples in a data series; for example, periodically recorded samples in a time series may be labelled as either acceptable or faulty. However, this often creates a class-imbalanced dataset, with the number of samples in the training set associated with each class being unequal. Typically, when classifying samples as either acceptable or faulty, the training samples labelled as acceptable vastly outnumber the samples labelled as faulty.

While RNNs represent ML models which may be used for time-series classification, for challenging cases the classification might fail, and false positive or false negative predictions might be present. This may be due to low signal-to-noise ratios in the inputs and/or to the output class being dependent on relatively small changes in the inputs.

A weighting function, e.g., a training kernel, is employed for training an ML model (e.g., an artificial neural network (ANN)); it is an additional weight applied to the loss function per training case and per sample. The weighting function (e.g., training kernel) defines which samples influence the loss function and therefore have the biggest influence on the updates to the ML model parameters.

Methods, systems, and devices according to embodiments automatically adjust (e.g., optimize) the weighting functions for various training cases during training so that the resulting trained ML model can provide a correct decision as quickly as possible, depending on the complexity of the specific case.

The decision logic may be a decision logic for use by a relay to provide power system protection with improved performance.

The techniques may be used more generally for asset monitoring and control (e.g., for industrial process control). For illustration, there are numerous applications where it is necessary to include monitoring and/or protection systems (henceforth referred to as protection systems) in order to prevent major failures. Examples of relevant applications include, amongst others, power system protection, where protection devices are used to disconnect faulty parts of an electrical network, or where process monitoring systems are used for identifying anomalous behaviors in an industrial plant which might be indicative of a developing failure.

The ML model may include or may be an RNN. RNNs such as long short-term memory (LSTM) or gated recurrent unit (GRU) networks allow ML models to be trained with an architecture which solves the problem of exploding and vanishing gradients during training on time series data. Such models can be trained to detect a specific event or situation in a classification training setup, with a training case comprising an input time series and a desired output class at each time step. The training cases retain causality (i.e., fault detection occurs on the basis of certain preceding events / variations). In such temporal setups there are several challenges which need to be properly addressed in order not to overfit the ML model and to ensure correct and robust ML model-based fault detection or alarm raising:

Samples need to be correctly labelled according to their appropriate class. In case of synthetic, simulation-based data it is known where a change of the state or structure in the modeled system occurs. However, it is usually the case that the change is not reflected immediately in measured physical quantities, such as current, voltage, speed, temperature, etc. Especially during the training of RNNs, in such situations the model will learn that an output class changes at a specific moment in the time series, rather than changing on the basis of the information provided in the input time series signals. Therefore, it may be desired to delay a class change in order to avoid the situation where an output class changes before any new information can be gathered from the input signal.

Whilst in some training datasets an event occurrence might be clearly evident in the input signals, for some fault cases the response in the input signals is hard to identify. Without an additional delay during the model training procedure, the samples of such cases will not be classified correctly. One can use expert knowledge in order to distinguish "hard" and "easy" cases and then set a delay for each case separately. However, such expert knowledge may not be readily available and/or may be prone to human error.

The training procedure may be unbalanced by nature, as for correct restraining operation one requires the model output to be below a predefined threshold for all samples along the time dimension, while to decide whether an alarm needs to be raised or a fault indicated it is sufficient that only a single sample crosses a predefined threshold.

At least some of these challenges can be addressed during ML model training in order to ensure that a model delivers optimum decision quality as quickly as possible.

By providing an automatic adjustment of a weighting function (e.g., training kernel) during ML model training, using objective criteria, it is not required to use human expert knowledge to distinguish hard and easy training cases.

Methods and systems address the trade-off between decision quality and speed through altering a kernel during an ML model training. The typical training procedure is to prepare a set of candidate models with different architectures and parameters, such as number of layers, nodes in each layer, gated units in each layer, learning rate, kernel, batch size etc. Then each model is trained using training cases. A training dataset may comprise a plurality of training cases and optional test cases. Each training case and, if present, test case may have a number N of samples of an input time series, and a number N of samples of a target output time series. The training case and, if present, test case may have information indicating which one of several decisions is expected.

The input time series may comprise multi-dimensional samples. For illustration, the input time series may comprise samples, each of which represents one or several electrical characteristics at the respective sample time. The one or several electrical characteristics may be real- or complex-valued. The input time series may comprise one or several features derived from one or several electrical characteristics. For a distance protection or time domain protection logic, the input time series may comprise three phase current measurements and three phase voltage measurements.

For time series data with inputs and outputs which are at least 3-dimensional tensors, an example of the specific dimension configuration may be represented as follows: (training case, timeline, signal channels). For a power system protection example, the channels may be three phase current and voltage measurements.

The ML model may be trained to operate as a classifier that provides a time-dependent classification (i.e., which embeds a time dimension). The trained ML model is required to provide a classification with respect to time, indicating, depending on the input time series, whether an alarm is to be raised or not at the respective sample time.
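For illustration only, the following is a minimal sketch of such a per-sample classifier, assuming PyTorch; the class name TimeSeriesClassifier and all parameter values are illustrative and not part of the application. The six input channels assume three phase currents and three phase voltages.

```python
# Minimal sketch (assumption: PyTorch). A GRU-based classifier that emits one
# score per sample time, so that its output is itself a time series.
import torch
import torch.nn as nn


class TimeSeriesClassifier(nn.Module):  # hypothetical name, not from the application
    def __init__(self, n_channels: int = 6, hidden_size: int = 32):
        super().__init__()
        # n_channels = 6 could correspond to three phase currents and three phase voltages
        self.gru = nn.GRU(input_size=n_channels, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has the dimension configuration (training case, timeline, signal channels)
        h, _ = self.gru(x)                   # (cases, time, hidden)
        return torch.sigmoid(self.head(h))   # (cases, time, 1): per-sample score in [0, 1]


# Example: 4 training cases, 1000 sample times, 6 measurement channels
scores = TimeSeriesClassifier()(torch.randn(4, 1000, 6))
print(scores.shape)  # torch.Size([4, 1000, 1])
```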

Figure 1 shows an electric power system comprising an asset protection, monitoring, or control device 30. The device 30 may be an intelligent electronic device (IED). The device 30 may be operative to execute a decision logic generated using the methods and/or systems disclosed herein. The asset protection, monitoring, or control device may be a protection device 30. The protection device 30 may be a protection relay.

The protection device 30 may be arranged at an end of a power transmission line 11 or a power distribution line. The protection device 30 is operative to cause a circuit breaker (CB) 15 to trip responsive to detection of a fault and, optionally, responsive to detecting that the fault is in a zone for which the protection device 30 is responsible.

The protection device 30 has an input interface 31 to receive measurements. The measurements may include voltage measurements at a local bus 12 provided by a voltage transformer (VT) 13 and current measurements provided by a current transformer (CT) 14. The inputs received at the input interface 31 may be provided to a decision-making logic, it being understood that some pre-processing (such as filtering, Fourier transform, principal component analysis, or other statistical techniques) may be performed on the inputs as they pass from the interface 31 to the decision-making logic.

The protection device 30 may be operative to process the current and voltage measurements to determine whether there is a fault which requires a mitigating or protective action, such as trip of the CB 15. The protection device 30 may have one or several integrated circuit(s) (IC(s)) 32 to perform the processing. The one or several IC(s) 32 may comprise one or several processors, controllers, application specific integrated circuit(s) (ASIC(s)), field programmable gate arrays (FPGAs), or combination(s) thereof.

The protection device 30 has an output interface 33 to output a control signal to effect an action, such as a protective or mitigating action. The protection device 30 may be communicatively coupled to other devices in the system 10. For illustration, the protection device 30 may be communicatively coupled to a control center 20. The protection device 30 may output information on the detection of the fault to the control center 20 for outputting via a human-machine interface (HMI) 23. The control center 20 may have IC(s) 21 to process messages received from the protection device 30 at an interface 22 and for controlling the HMI 23 responsive thereto.

The device 30 executes a decision-making logic 34. The decision logic 34 may receive voltage and/or current measurements or other inputs and may process the inputs to generate a decision logic output. The decision logic output may be a time series. The input and/or output time series may have samples that may correspond to discrete, constant time intervals. The input and/or output time series may be a time series of scalars or may be a time series of vectors.

When the decision logic output is a time series of scalars, the time series may be binary. The time series may change its value between a first value and a second value. The first value may be a first indicator value indicating that the decision logic 34 considers a fault to be present and the second value may be a second indicator value indicating that the decision logic 34 considers the fault to be absent. The time series may be real- or complex-valued, without being necessarily limited to just the first and second values.

When the decision logic output is a time series of vectors, the time series of vectors may have one or several vector elements that are binary and that may change their value between a first value and a second value. The first value may be a first indicator value indicating that the decision logic 34 considers a fault to be present and the second value may be a second indicator value indicating that the decision logic 34 considers the fault to be absent. The time series of vectors may comprise real- or complex-valued vector elements.

The decision logic 34 may be a distance protection or time domain protection logic, which outputs a time series of values that indicate whether a certain fault is deemed to be present or absent at the respective sample time. The fault may be a ground fault, a phase-to-phase fault, a phase-to-phase-to-ground fault, or a three-phase fault in a zone for which the protection device 30 is responsible. The decision logic 34 may be operative to distinguish between these faults or between any sub-set of two or more of these faults. Figure 3 shows an example of a binary decision logic output 51 with an alarm raise (e.g., where a CB trip is triggered by the rising flank of the decision logic output 51). Figure 4 shows a binary decision logic output 51 without an alarm being raised (e.g., where a restrain decision is taken).

The decision logic 34 may be generated by a computing system by performing a training of at least one machine learning (ML) model, as will be explained with reference to Figures 2 to 17.

Figure 2 is a flow chart of a method 40. The method 40 may be performed automatically by at least one integrated circuit of a computing system to generate the decision logic 34. The method comprises training an ML model using a plurality of training cases. As explained above, each training case may comprise an input time series (which may include a number N of samples of an input scalar or tensor) and a target output time series (which may include a number N of samples of an output scalar or tensor). The target output time series may indicate the desired behavior for which the ML model is trained, in response to receiving the input time series. Each training case may include additional data, such as an indicator specifying whether the training case reflects a training case where no alarm is raised (e.g., restrain decision for distance protection or time domain protection) or a training case where an alarm is raised (e.g., trip decision for distance protection or time domain protection).
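For illustration only, a single training case could be represented as follows; the field names, the number of samples, and the number of channels are hypothetical choices of this sketch and are not prescribed by the method 40.

```python
# Minimal sketch of a possible training-case representation; field names and
# values are illustrative only.
import numpy as np

N = 1000  # number of samples per training case
training_case = {
    "inputs": np.zeros((N, 6)),   # e.g., three phase currents and three phase voltages per sample time
    "targets": np.zeros(N),       # target output time series (e.g., 0 = restrain, 1 = trip)
    "alarm_raised": False,        # indicator: constant target (restrain) vs. toggling target (trip)
}
```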

At step 41, weighting functions are initialized. Each weighting function may be a function of sample time. Each weighting function may have N weighting function values corresponding to N consecutive sample times. Each weighting function may respectively be associated with one of the training cases.

Step 41 may comprise, for any first training case for which the target output time series changes between different values, initializing the weighting function associated with the training case such that it is non-constant. The weighting functions for the first training cases for which the output time series changes its value (i.e., an alarm is raised or a fault condition is detected) may be initialized to the same function at step 41. As will be explained below, weighting functions associated with different training cases may be adjusted independently of each other during the method 40 in an automatic manner. The weighting function associated with each first training case for which the target output time series changes its value may be initialized such that, for samples in a first time period after the target output time series changes its value, the weighting function has a first value (which may be zero). The weighting function associated with each first training case for which the target output time series changes its value may be initialized such that, for samples in a second time period that starts at a delay after the target output time series changes its value, the weighting function has a second value greater than the first value. Thereby, samples in the second time period are weighted more strongly than samples in the first time period when adjusting parameters and/or hyperparameters of an ML model. As will be explained in more detail below, the delay may be decreased during the method 40. The delay may respectively be decreased, individually for each training case, in a monotonic (but not necessarily strictly monotonic) manner.

Step 41 may comprise, for any second training case for which the target output time series does not change its value between different values, initializing the weighting function associated with the training case such that it is constant.

For balancing reasons, the weighting functions may be initialized in such a manner that an area under the weighting functions associated with training cases in which the target output time series is non-constant (i.e., in which the ML model is expected to raise an alarm or take another protective or corrective action) is equal to an area under the weighting functions associated with training cases in which the target output time series is constant (i.e., in which the ML model is expected not to raise an alarm or does not take any other protective or corrective action). The areas may be computed by summation or integration, depending on whether the weighting functions are defined as continuous or time-discrete functions.
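A minimal numpy sketch of one possible realization of step 41 is given below; the rectangular profile, the function names, and the parameter values are illustrative assumptions rather than required choices.

```python
# Minimal sketch of step 41 (assumptions: rectangular profiles, illustrative
# function names and parameter values).
import numpy as np

def init_alarm_weighting(n_samples: int, t_change: int, delay: int) -> np.ndarray:
    """First weighting function: zero up to t_change + delay, then one (delayed rising flank)."""
    w = np.zeros(n_samples)
    w[min(t_change + delay, n_samples):] = 1.0
    return w

def init_restrain_weighting(n_samples: int, area: float) -> np.ndarray:
    """Second weighting function: constant, scaled so that its sum matches 'area'."""
    return np.full(n_samples, area / n_samples)

N = 1000
w_alarm = init_alarm_weighting(N, t_change=400, delay=200)    # non-constant target output case
w_restrain = init_restrain_weighting(N, area=w_alarm.sum())   # constant target output case, balanced area
assert np.isclose(w_alarm.sum(), w_restrain.sum())
```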

At step 42, a training step for at least one ML model is performed. The training step may comprise adjusting one or several parameters and/or hyperparameters of an ML model. The parameters and/or hyperparameters may be adjusted in such a manner that a value of an aggregated loss function (e.g., a cost function of a training kernel ML training technique) is reduced. Finding adjusted parameter values and/or hyperparameter values may be performed using various techniques. Adjusting the parameters may comprise updating the parameters using gradient descent, stochastic gradient descent (SGD), a nonlinear conjugate gradient technique, a limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS), a Levenberg-Marquardt Algorithm (LMA), a population-based training algorithm such as an evolutionary algorithm (EA) or a particle swarm optimization (PSO), without being limited thereto. These techniques or trial-and-error may be used for adjusting hyperparameters (such as a number of hidden layers and/or a number of nodes of an RNN).
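For illustration, a single training step of step 42 could look as follows in a PyTorch-based sketch, with the per-sample weighting functions entering the loss before the optimizer update; the names model, optimizer, inputs, targets, and weights are placeholders assumed for this sketch.

```python
# Minimal sketch of one training step of step 42 (assumption: PyTorch).
import torch

def training_step(model, optimizer, inputs, targets, weights):
    # inputs: (cases, time, channels); targets and weights: (cases, time, 1)
    optimizer.zero_grad()
    outputs = model(inputs)
    # Aggregated weighted loss: the weighting function is applied per case and per sample
    loss = (weights * (targets - outputs).abs()).sum()
    loss.backward()
    optimizer.step()  # e.g., optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    return loss.item()
```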

The training at step 42 is dependent on the weighting functions. In particular, the aggregated loss function (e.g., a cost function of a training kernel ML training technique) is dependent on the weighting functions for training cases that are used, with the weighting functions respectively quantifying how strongly a deviation of the ML model output (obtained responsive to the input time series of the training case) relative to the target output time series of the training case is penalized. Thus, the ML model parameters as adjusted in the training step 42 are generally dependent on the weighting functions. As training continues, one, some, or all of the weighting functions may be adjusted. This adjustment is made automatically and individually for each training case and facilitates obtaining a decision logic with fast decision speed without requiring a human expert input that distinguishes easy and hard training cases.

At step 43, one or several of the weighting functions may be adjusted. Only weighting functions associated with training cases in which the target output time series changes its value may be adjusted. The adjustment of such weighting functions may comprise selectively and automatically decreasing a delay between an onset (sample) time at which the weighting function exhibits a rising flank and a sample time at which the target output time series changes its value.

A decision on whether and, optionally, by how strongly each weighting function is adjusted is made individually for each training case. An objective criterion, such as a threshold comparison or other performance evaluation, may be used to determine, individually for each training case in which the target output time series varies as a function of sample time, whether, and optionally by how many samples, the weighting function is to be adjusted.

The criterion for adjusting the weighting function may be based on a performance evaluation. For illustration, step 43 may comprise determining whether the ML model trained at step 42 correctly classifies the training case (e.g., by identifying it as a case for which a trip or restrain decision is taken, or a case for which an alarm is raised or no alarm is raised). A counter value that counts consecutive correct classifications for the training case may be incremented in case of a correct classification. The counter may be reset in case of an incorrect classification. The counter value may be compared to a threshold. If the counter value reaches or exceeds the threshold, the weighting function may be adjusted and the counter may be reset. If the counter value does not reach the threshold, the weighting function may not be adjusted during this iteration. These acts may be performed respectively for each training case in which the target output time series varies.
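A minimal sketch of this counter-based modification criterion of step 43 is shown below; the threshold of 5 consecutive correct classifications and the shift of 10 samples are illustrative values only.

```python
# Minimal sketch of the counter-based modification criterion of step 43
# (assumptions: threshold and step sizes are illustrative).
def update_delays(correct_now, counters, delays, threshold=5, step=10):
    """correct_now[j] is True if the current ML model classifies training case j correctly."""
    for j, correct in enumerate(correct_now):
        if not correct:
            counters[j] = 0                          # reset on misclassification
            continue
        counters[j] += 1
        if counters[j] >= threshold:
            delays[j] = max(0, delays[j] - step)     # move the rising flank 'step' samples earlier
            counters[j] = 0
    return counters, delays
```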

At step 44, it is determined whether the ML model is acceptable. The determination at step 44 may comprise testing ML model performance against test cases not used in step 42. If the ML model is not acceptable (according to some performance metric), the method returns to step 42. Otherwise, the method may proceed to step 45.

At step 45, the trained ML model may be deployed. Deploying the ML model may comprise storing the ML model for execution by an IED. Deploying the ML model may comprise providing the ML model to the IED.

The method may further comprise executing, by the IED, the decision logic in field use. In field use, the decision logic may receive time-series input representative of current and/or voltage and/or other measurements and may generate time-series output that determines whether a protective, corrective, or mitigating action is to be taken.

The time-series input may be received from one or several measurement devices, such as current and/or voltage transformers and/or other sensors. The time-series input may be received from one or several measurement devices that sense electrical characteristics of an asset and/or lines or other components of an electric power system coupled to the asset.

The IED may cause the protective, corrective, or mitigating action to be performed. The IED may issue command(s) to cause the protective, corrective, or mitigating action to be executed. For illustration, the IED may issue a command to cause a circuit breaker to trip and/or to control an interface in a control center.
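For illustration only, an IED-side loop applying the trained decision logic in field use could be sketched as follows; read_sample, trip_circuit_breaker, and the threshold are hypothetical placeholders, and re-evaluating the whole buffer at each step is done here only for simplicity (a stateful RNN evaluation would avoid it).

```python
# Minimal sketch of an IED-side inference loop (placeholders are hypothetical).
import torch

def protection_loop(decision_logic, read_sample, trip_circuit_breaker, threshold=0.5):
    buffer = []
    while True:
        buffer.append(read_sample())                     # e.g., 3 phase currents + 3 phase voltages
        x = torch.tensor([buffer], dtype=torch.float32)  # shape (1, time, channels)
        with torch.no_grad():
            score = decision_logic(x)[0, -1, 0].item()   # latest per-sample output
        if score > threshold:
            trip_circuit_breaker()                       # protective/mitigating action
            break
```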

The weighting function may be an additional weight applied to a loss function per training case and per sample, and quantifies how strongly individual samples influence the loss function and therefore the updates to the model weights. Various ways of calculating a loss function can be used. The weighted loss function for a training case may be computed by weighting a metric of a difference, e.g., as

L_{j,k} = Σ_{i=1…N} W_j(i; k) · ||t_j(i) − o_k(inp_j; i)||     (1)

where:

j is a label for a training case (with j being an integer from 1 to J, where J designates the number of training cases);

i designates a sample time (with i being an integer from 1 to N, where N designates the number of samples);

L_{j,k} is the weighted loss function for a training case j at an iteration k of the ML model training;

W_j(i; k) is the weighting function for training case j at an iteration k of the ML model training at sample time i;

t_j(i) is the target output time series of training case j at sample time i;

o_k(inp_j; i) is the output of the ML model at iteration k of the ML model training at sample time i, responsive to the input time series samples of training case j for the sample times 1 … i; and

|| · || designates a norm calculated according to a metric (such as a modulus, an L1 norm, an L2 norm, etc.).

Other ways of quantifying and weighting the discrepancy between the ML model output and the target output time series may be used. For example, the weighted loss function for a training case may be computed by weighting an entropy expression, e.g., as

L_{j,k} = Σ_{i=1…N} W_j(i; k) · ||t_j(i) − o_k(inp_j; i)|| · ln( ||t_j(i) − o_k(inp_j; i)|| )     (2)

where ln denotes the natural logarithm. Logarithmic functions with other bases may be used.

The weighted loss function is dependent on the weighting function and, by virtue of its dependency on the ML model output, on the parameters of the ML model in the respective iteration. The aggregated loss function (e.g., a cost function of a training kernel ML training) may be determined as

C_k = Σ_{j=1…J} L_{j,k}     (3)

Adjusting the parameters and/or hyperparameters of the ML model at step 42 may be performed so as to reduce the aggregated loss function C_k.
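A minimal numpy sketch of equations (1) and (3) is given below; array and function names are illustrative.

```python
# Minimal numpy sketch of equations (1) and (3).
import numpy as np

def weighted_loss(w_j, t_j, o_j):
    """Equation (1): w_j, t_j, o_j are arrays over the N sample times of training case j."""
    return np.sum(w_j * np.abs(t_j - o_j))

def aggregated_loss(weights, targets, outputs):
    """Equation (3): sum of the weighted loss functions over the J training cases."""
    return sum(weighted_loss(w, t, o) for w, t, o in zip(weights, targets, outputs))
```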

Adjusting the weighting function for a training case j at step 43 may comprise shifting at least part of the weighting function along the sample time axis. For illustration, a weighting function as updated for the subsequent iteration k+1 may be determined as

W_j(i; k + 1) = W_j(i + s; k)   for i ≤ N − s     (4)

W_j(i; k + 1) = b   for N − s < i ≤ N     (5)

with b being a constant. This essentially shifts (part of) the weighting function to earlier sample times by s samples. The integer s may be the same for all training cases j or may vary from one training case to another, e.g., based on ML model performance. Different weighting functions W_j for different training cases j may be adjusted independently of each other. Some of the weighting functions (e.g., constant weighting functions) may not need to be adjusted at all during the method 40.
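A minimal numpy sketch of the shift defined by equations (4) and (5) is given below, assuming the weighting function of a training case is stored as an array over the N sample times.

```python
# Minimal numpy sketch of equations (4) and (5): shift the weighting function
# of a training case to earlier sample times by s samples and pad the tail
# with the constant b.
import numpy as np

def shift_weighting(w_j: np.ndarray, s: int, b: float) -> np.ndarray:
    shifted = np.empty_like(w_j)
    shifted[:len(w_j) - s] = w_j[s:]   # W_j(i; k+1) = W_j(i + s; k) for i <= N - s
    shifted[len(w_j) - s:] = b         # W_j(i; k+1) = b for N - s < i <= N
    return shifted
```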

Since the information about the occurrence of an event in the input signals is delayed, a portion of the weighting function that has larger values (thereby giving more weight to the loss function at the respective sample times) may be delayed by a number of samples. The weighting function associated with a training case having a non-constant target output time series may include a Gauss curve profile for at least some of the samples. ML model performance may be enhanced by a training in which the weighting function weights the loss function. Samples are weighted so as to achieve a good trade-off between decision quality and speed.

Use of the weighting functions and automatic adjustment of the weighting functions in the course of the iterative ML model training allows the training to focus on utilizing the information in the input for correct classification just after the occurrence of an event. The weighting functions facilitate balancing the classes (e.g., a first class of training cases in which the target output time series is non-constant and a second class of training cases in which the target output time series is constant). The total area under the weighting function for the event class (alarm raised) may be set equal to the total area under the weighting function for the no-event class (no alarm raised), for balancing along the time direction.

By adjusting the weighting function, independently for different training cases, the weighting functions (e.g., training kernels) may be automatically fine-tuned in order to correctly train the ML model. The automatic adjustment of the weighting function reflects that, for easier cases in which the input time series exhibits a pronounced change in characteristics (as illustrated in Figure 5), greater weight can be given to samples closer to the time at which the target output time series changes its value earlier on in the iterative training process than for more challenging cases (as illustrated in Figure 6).

An adjustment of a weighting function is explained with reference to Figures 7 to 11.

Figures 7 and 8 show a weighting function 62 prior to adjustment and the weighting function 62' after adjustment, respectively, as a function of sample time. The weighting function 62 has a rising flank 63. As illustrated, the weighting function may include a Gauss curve profile. The rising flank 63 starts at an onset time t_0 68. The rising flank 63 is delayed by a delay 69 relative to a time t_t 67 at which the target output time series 61 of the respective training case changes its value. The delay 69 may initially be set to be large. Such a setting facilitates the initial stages of ML model training to attain a robust ML model, because samples with a significant delay after the time t_t 67 are given greater weight. It is not necessary to attain fast decision speed at the early iterative stages of the ML model training. As ML model training proceeds, the weighting function is modified to the adjusted weighting function 62'. The adjusted weighting function 62' has a rising flank 63 which is delayed by an adjusted delay 69' relative to the time t_t 67 at which the target output time series 61 of the respective training case changes its value. This adjustment of the weighting function may be made only once the ML model has proven to be robust for the respective training case (e.g., by correctly classifying the training case in a number of consecutive ML model training iterations which reaches or exceeds a threshold). By reducing the delay of the rising flank 63 from an initial, larger delay 69 to an adjusted, smaller delay 69', samples closer to the time t_t 67 are given greater weight. This ensures that the ML model, which has already proven to be robust for the training case when using the previous weighting function 62, is more likely to take the correct decision more rapidly.
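One possible, purely illustrative parameterization of such a Gauss-profile weighting function with an adjustable delay is sketched below; the exact width, peak placement, and baseline are assumptions, not prescriptions of the described method.

import numpy as np

def gaussian_weighting_function(n_samples, t_event, delay, width, baseline=0.0, peak=1.0):
    """Weighting function with a Gauss curve profile whose rising flank is
    delayed by `delay` samples relative to the sample time t_event at which
    the target output time series changes its value (cf. Figures 7 and 8)."""
    i = np.arange(n_samples)
    center = t_event + delay + width   # peak placed after the onset t_0 = t_event + delay
    w = baseline + (peak - baseline) * np.exp(-0.5 * ((i - center) / width) ** 2)
    w[(i >= t_event) & (i < t_event + delay)] = 0.0   # suppress samples between t_t and the onset t_0
    return w

Reducing the argument delay between iterations shifts the rising flank closer to t_event, giving greater weight to samples just after the event, as described above.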

The weighting function used for training cases in which the target output time series changes its value may have another functional dependency on sample time, but generally exhibits increased values (reflected by a rising flank) at times after the time t_t at which the target output time series changes its value.

Figures 9 and 10 show a weighting function 62 prior to adjustment and the weighting function 62' after adjustment, respectively, as a function of sample time. The weighting function 62 has a rising flank 63 which may be a step-like change. As illustrated, the weighting function may include a rectangular curve profile. The rising flank 63 starts at an onset time t_0 68. The rising flank 63 is delayed by a delay 69 relative to a time t_t 67 at which the target output time series 61 of the respective training case changes its value. The delay 69 may initially be set to be large. Such a setting facilitates the initial stages of ML model training to attain a robust ML model, because samples with a significant delay after the time t_t 67 are given greater weight. It is not necessary to attain fast decision speed at the early iterative stages of the ML model training. As ML model training proceeds, the weighting function is modified to the adjusted weighting function 62'. The adjusted weighting function 62' has a rising flank 63 which is delayed by an adjusted delay 69' relative to the time t_t 67 at which the target output time series 61 of the respective training case changes its value. This adjustment of the weighting function may be made only once the ML model has proven to be robust for the respective training case (e.g., by correctly classifying the training case in a number of consecutive ML model training iterations which reaches or exceeds a threshold). By reducing the delay of the rising flank 63 from an initial, larger delay 69 to an adjusted, smaller delay 69', samples closer to the time t_t 67 are given greater weight. This ensures that the ML model, which has already proven to be robust for the training case when using the previous weighting function 62, is more likely to take the correct decision more rapidly.

The weighting function can (and generally will) be different for different training cases in which the target output time series changes its value, as the iterative ML model training with automatic weighting function adjustment proceeds. For illustration, in an iteration k of the iterative ML model training with automatic weighting function adjustment, the rising flank 63 of the weighting function 62 for one training case may have shifted forward by a shift s 64, closer to (but still after) the time t_t 67. As shown in Figure 11, the rising flank 63 of the weighting function 62 for another training case may have shifted forward by a shift s" 64", which is different from the shift s 64 shown in Figure 10. By adjusting weighting functions individually and by making the adjustment conditionally dependent on a performance of the ML model (e.g., a performance for the respective training case), the decision logic is trained to robustly take correct decisions for more challenging training cases, while accepting that the training process has to continue for more iterations to also ensure that the correct decisions for the more challenging training cases are taken more rapidly.

The weighting functions used for training cases in which the target output time series changes its value may comprise a section that weights samples more strongly. Various weighting functions may be used. For illustration, weighting functions for training cases in which the target output time series changes its value may have values that are less than or equal to a first weighting function threshold value wt_1 for times subsequent to the time t_t, and values that are greater than the first weighting function threshold value wt_1 for at least some sample times subsequent to the time t_t plus a delay d (which is varied in the course of the iterative ML model training). E.g., for any training case j for which the target output time series toggles at a time t_t,

W_j(i; k) ≤ wt_1   for t_t < i < t_t + d,   (6)

W_j(i; k) > wt_1   for t_t + d ≤ i ≤ t_f,   (7)

where t_f is a positive integer greater than t_t + d = t_0. The first weighting function threshold value wt_1 may be equal to zero. In some cases, the weighting function value may remain greater than the first weighting function threshold value wt_1 for sample times i up to N, i.e.,

W_j(i; k) > wt_1   for t_t + d ≤ i ≤ N.   (8)

As the iterative ML model training proceeds, d may be decreased depending on ML model performance for the respective training case. I.e., d is a monotonically (but not necessarily strictly monotonically) decreasing function of the number k of iterations of the iterative training.
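A minimal sketch of such a conditional, monotonic reduction of the delay d is given below; the streak counter, the step size, and the function name are illustrative assumptions.

def update_delay(d, correct_streak, streak_threshold, step=1, d_min=0):
    """Decrease the delay d for a training case only when the ML model has
    correctly classified that case for a sufficient number of consecutive
    iterations; d is therefore monotonically non-increasing over k."""
    if correct_streak >= streak_threshold:
        return max(d_min, d - step)
    return d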

The weighting functions 70 associated with training cases in which the target output time series changes its value may be section-wise constant, as illustrated in Figure 12. For illustration, a weighting function 70 associated with a training case for which the target output time series changes its value may have a first weighting function value wv_1 for sample times less than the time t_t 67 and/or for sample times greater than or equal to the final time t_f 78 of the weight increase in the weighting function. A weighting function 70 associated with a training case for which the target output time series changes its value may have a second weighting function value wv_2 for sample times greater than or equal to the time t_t 67 and less than the sum of the time at which the target output time series changes its value and the delay, t_t + d. The second weighting function value wv_2 may be less than the first weighting function value wv_1. A weighting function 70 associated with a training case for which the target output time series changes its value may have a third weighting function value wv_3 for sample times greater than or equal to the sum of the time at which the target output time series changes its value and the delay, t_t + d, and less than the final time t_f 78 of the weight increase. The third weighting function value wv_3 may be greater than the first and second weighting function values wv_1, wv_2. I.e., for any training case j for which the target output time series changes its value at a time t_t, the weighting function may, without limitation, be defined as

W_j(i; k) = wv_1   for i < t_t and t_f ≤ i ≤ N,   (9)

W_j(i; k) = wv_2   for t_t ≤ i < t_t + d,   (10)

W_j(i; k) = wv_3   for t_t + d ≤ i < t_f,   (11)

where wv_2 < wv_1 < wv_3. The second weighting function value wv_2 may be zero. Other functional dependencies may be used. As the iterative ML model training proceeds, d may be decreased depending on ML model performance for the respective training case. I.e., d is a monotonically (but not necessarily strictly monotonically) decreasing function of the number k of iterations of the iterative training.
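A minimal sketch of the section-wise constant weighting function of Equations (9) to (11), assuming NumPy arrays and purely illustrative values for wv_1, wv_2, and wv_3, is:

import numpy as np

def piecewise_constant_weighting_function(n_samples, t_event, d, t_final,
                                          wv1=1.0, wv2=0.0, wv3=4.0):
    """Section-wise constant weighting function per Equations (9) to (11)
    (cf. Figure 12), with wv2 < wv1 < wv3."""
    w = np.full(n_samples, wv1)        # Equation (9): outside the event window
    w[t_event : t_event + d] = wv2     # Equation (10): zone just after t_t
    w[t_event + d : t_final] = wv3     # Equation (11): strongly weighted section
    return w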

As illustrated in Figures 9 to 12, a weighting function having a rectangular window section can be used. Moreover, setting the weighting function value to zero for an additional zone 72 after an event occurrence allows the correct classification of an event to be further improved. Setting the weighting function value to zero means that the period just after an event occurrence does not contribute to the training procedure. An example of a rectangular kernel setup is presented in Figures 9 to 12.

The adjustment to the weighting function that is automatically performed in the iterative ML model training with weighting function adjustment can be considered as shrinking the zone 72 for each case depending on the model performance for that case. A synthetic example of a setup for two cases at the beginning and the end of a training is illustrated in Figures 9 to 11. The aforementioned cases start with an initial position t_0 set to 40 samples. The inputs in a first training case may contain sufficient information to obtain the desired alarm raise. Consequently, during training the model predicts the correct classification. This causes a shift s 64 in the rising flank 63, resulting in a shrinkage of the zone 72 for this specific case, and pushes the model output to being both correct and fast. A second case that is more challenging for the ML model to learn results in a more gradual and less pronounced shift s" 64" of the rising flank 63. While a correct classification is still obtained by the ML model for this more challenging case, the ML model requires a greater amount of time for the alarm to be raised.

As shown in Figure 13, the weighting functions 82 may be set to be constant for training cases for which the target output time series 81 is constant. The area under the weighting function 82 and the area under the weighting functions 70 may be made equal to each other (e.g., by appropriately choosing the values wv_1 and/or wv_3).
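A minimal sketch of such a constant, area-balanced weighting function is given below; choosing the constant level so that the areas match is one illustrative way of satisfying the equal-area condition, not the only one.

import numpy as np

def constant_weighting_function(n_samples, reference_w):
    """Constant weighting function for a no-event training case (cf. Figure 13),
    with its level chosen so that its area equals the area under the reference
    (event-case) weighting function, balancing the two classes."""
    level = np.sum(reference_w) / n_samples
    return np.full(n_samples, level)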

Thus, for cases in which no alarm is raised, all samples are required to correctly classify the absence of an event, even if transients are present in the input signal. This is important, e.g., for power system protection cases, which require a robustly trained model that raises an alarm only if it is assured that an event has happened.

As illustrated in Figures 7 to 13, the weighting functions associated with at least some of the training cases are automatically altered during an ML model training procedure. The initial position of the portion of the weighting function that weights samples more strongly can be set to be high (thus indicating a high delay). This allows the ML model to correctly indicate the event detection by crossing a predefined threshold at the model output for most or all training cases. If the classification holds for a predefined number of consecutive samples, the weighting function may be adjusted to give greater weight to samples closer to the event occurrence, which will force the trained ML model to react more quickly.

Figure 14 is a flow chart of a method 90. The method 90 may be performed automatically by a computing system to generate a decision logic.

At step 91, an initialization is performed. This may comprise retrieving training cases, optionally retrieving test cases, initializing ML model parameters and/or hyperparameters, and initializing weighting functions (e.g., training kernels).

At step 92, a training step is performed. The training step may comprise adjusting parameters and/or hyperparameters of an ML model in a manner which reduces a value of an aggregated loss function (e.g., of a cost function of a training kernel ML model training).

At step 93, a performance metric may be calculated. Calculating the performance metric may comprise calculating weighted loss functions, attained by the ML model after the training step, for the training cases. Calculating the performance metric may comprise calculating a loss function for test cases not included in the training set.

At step 94, it is determined, based on the performance metric, whether the ML model with its parameters set at step 92 outperforms the best-performing ML model previously identified. If the ML model outperforms the best-performing ML model previously identified, the ML model parameters and/or hyperparameters may be stored at 95 and the method proceeds to step 96. If the ML model does not outperform the best-performing ML model previously identified, the method proceeds to step 96.

At step 96, it is determined whether a termination criterion for the training is fulfilled. This may comprise determining whether a number of iterations of the training has reached an iteration threshold and/or whether performance continues to improve and/or whether the performance meets a performance criterion. If the termination criterion is fulfilled, the training may be finalized at step 97. This may comprise storing the ML model as trained in the iterative procedure.

At step 98, if the termination criterion is not fulfilled, the weighting functions associated with one or several training cases may be adjusted. Adjustment of the weighting function(s) may be done independently for the weighting functions associated with different training cases. Adjustment of the weighting function(s) may comprise gradually weighting samples closer to the event occurrence more strongly. This may comprise shifting (part of) a weighting function along the sample time axis. The method may then return to step 92.
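A minimal skeleton of the loop of steps 91 to 98 is sketched below; the callables train_step, evaluate, and adjust_weights, as well as the model interface (e.g., get_params), are placeholders for whatever ML framework is used and are not part of the described method.

def train_with_weighting_adjustment(model, cases, weight_fns, max_iters,
                                    train_step, evaluate, adjust_weights):
    """Skeleton of method 90: training step (step 92), performance metric and
    best-model tracking (steps 93 to 95), termination check (steps 96/97), and
    weighting function adjustment (step 98)."""
    best_metric, best_params = None, None
    for k in range(max_iters):
        train_step(model, cases, weight_fns)              # step 92: reduce aggregated loss C_k
        metric = evaluate(model, cases)                   # step 93: e.g., weighted losses and/or test-case losses
        if best_metric is None or metric < best_metric:   # step 94: compare to best model so far
            best_metric, best_params = metric, model.get_params()  # step 95: store best model (placeholder API)
        if k + 1 >= max_iters:                            # step 96: termination criterion (iteration budget here)
            break                                         # step 97: finalize training
        weight_fns = adjust_weights(weight_fns, model, cases)      # step 98: per-case weighting adjustment
    return best_params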

The trained ML model may be combined with additional processing modules to implement the decision logic 34. The additional processing modules may be operative to process measurements into features that are input into the ML model and/or to process the ML model output.

Figure 15 shows a schematic diagram of a decision logic 34 that comprises an ML model 101 trained using the iterative ML model training with automatic weighting function adjustment disclosed herein. The decision logic 34 may comprise a counter-threshold-counter mechanism 102 that processes the ML model output. The counter-threshold-counter mechanism 102 may use a counter to count a number of consecutive samples during which the indicator level has exceeded its threshold. A threshold comparison may be performed to compare an output of the ML model 101 to a threshold. If the output is equal to or exceeds the threshold, a counter is incremented. If the output is less than the threshold, the counter is reset. An action may be triggered when the counter reaches a further threshold.
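A minimal sketch of such a counter-threshold-counter mechanism (class and attribute names are illustrative) is:

class CounterThresholdCounter:
    """Post-processing of the ML model output (cf. Figure 15): count consecutive
    samples at or above an output threshold and signal an action once the count
    reaches a further (counter) threshold."""

    def __init__(self, output_threshold, counter_threshold):
        self.output_threshold = output_threshold
        self.counter_threshold = counter_threshold
        self.counter = 0

    def step(self, model_output):
        """Process one sample of the ML model output; return True when the
        protective, corrective, or mitigating action should be triggered."""
        if model_output >= self.output_threshold:
            self.counter += 1   # consecutive sample at or above the threshold
        else:
            self.counter = 0    # reset on any sample below the threshold
        return self.counter >= self.counter_threshold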

The ML model(s) that are trained may have various configurations and characteristics. The ML model(s) may comprise RNNs, without being limited thereto.

The decision logic 34 may have an input layer to receive time-series input from one or several measurement devices, such as current and/or voltage transformers and/or other sensors. The time-series input may be received from one or several measurement devices that sense electrical characteristics of an asset and/or lines or other components of an electric power system coupled to the asset.

Based on an output of the counter-threshold-counter mechanism 102, the IED that comprises the decision logic 34 may cause the protective, corrective, or mitigating action to be performed. The IED may issue command(s) to cause the protective, corrective, or mitigating action to be executed. For illustration, the IED may issue a command to cause a circuit breaker to trip and/or to control an interface in a control center. The decision logic 34 may have an output layer that provides a (time-series) output that selectively causes the protective, corrective, or mitigating action to be executed.

Figure 16 schematically illustrates an ML model 110 that may be trained to generate the decision logic. The ML model 110 has an input layer 111. The input layer 111 may be operative to receive measurements indicative of electrical characteristics (e.g., several current and/or voltage measurements) or features derived from such measurements of electrical characteristics.

The ML model 110 has an output layer 113. The output layer 113 may output a time-dependent signal or tensor that may indicate whether the ML model 110 considers an event (e.g., a fault in zone protected by the ML model 110) to be present.

The ML model 110 may have one or several RNN layers 112. The one or several RNN layers 112 may comprise LSTM or GRU cells. Training the ML model may comprise adjusting parameters (such as biases and/or kernel weights and/or recurrent weights) of an RNN.

For illustration rather than limitation, the ML model may include a GRU cell as illustrated in Figure 17. The GRU cell may be defined by the following set of equations:

z_t = σ(W_z · x_t + U_z · h_{t−1})   (12)

r_t = σ(W_r · x_t + U_r · h_{t−1})   (13)

h̃_t = tanh(W_H · x_t + U_H · (r_t ⊙ h_{t−1}))   (14)

h_t = z_t ⊙ h_{t−1} + (1 − z_t) ⊙ h̃_t   (15)

In Equations (12)-(15), the following notation is used:

x_t: ML model input at time t;
h_{t−1}: previous GRU layer output (at time t−1);
h_t: current GRU layer output (at time t);
σ: recurrent activation function (e.g., a sigmoid; e.g., σ(v) = 0 for v < 0, σ(0) = 0.5, σ(v) = 1 for v > 0);
h̃_t: candidate for the next GRU cell state;
W_z, W_r, W_H: kernel weights;
U_z, U_r, U_H: recurrent weights;
⊙: Hadamard product.

Equations (12)-(14) may optionally include biases. Inclusion of the biases allows further fine-tuning of the ML model. Various modifications may be used. For illustration, an activation function other than a hyperbolic tangent may be used in Equation (14). For further illustration, the coefficients of h_{t−1} and h̃_t in Equation (15) may be exchanged.

Training the ML model may comprise adjusting one or several of kernel weights, recurrent weights, and/or biases of a GRU cell.
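A minimal sketch of one GRU cell update per Equations (12) to (15), without biases and using a smooth logistic sigmoid for σ (the step-like sigmoid given above may be substituted), could look as follows; the function and argument names are illustrative.

import numpy as np

def sigmoid(v):
    # Smooth logistic sigmoid used here as the recurrent activation function.
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU cell update per Equations (12)-(15), without biases."""
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev)             # update gate, Equation (12)
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev)             # reset gate, Equation (13)
    h_cand = np.tanh(Wh @ x_t + Uh @ (r_t * h_prev))  # candidate state, Equation (14)
    h_t = z_t * h_prev + (1.0 - z_t) * h_cand         # new layer output, Equation (15)
    return h_t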

The generation of the decision logic may be performed automatically by a computing system.

Figure 18 is a block diagram of a computing system 130 comprising an interface 135, a storage device or memory 132, and one or several integrated circuit(s) (IC(s)) 131. The one or several IC(s) 131 may comprise one or several processors, controllers, application specific integrated circuit(s) (ASIC(s)), field programmable gate arrays (FPGAs), or combination(s) thereof.

The computing system 130 may be communicatively coupled via interface 135 to a storage device 140 storing training and/or test cases.

The computing system 130 may be operative to retrieve at least one training dataset comprising a plurality of training cases via interface 135. Each training case may comprise a training input time series and a target output time series. The at least one IC 131 is operative to initialize weighting functions, each weighting function being respectively associated with a training case of the plurality of training cases. The at least one IC 131 may be operative to perform an iterative procedure comprising several iterations that respectively comprise performing at least one training step for training at least one machine learning (ML) model that reduces a value of an aggregated loss function. The aggregated loss function may be dependent on loss functions for at least a sub-set of the training cases with each of the loss functions being respectively weighted by a weighting function associated with the respective training case. Each loss function may be dependent on a difference between the target output time series of the respective training case and an output time series provided by the ML model responsive to the training input time series of the respective training case. The iterations of the iterative procedure may respectively comprise selectively modifying the weighting function(s) associated with one or several of the training cases between at least some successive iterations of the iterative procedure. The iterations of the iterative procedure may comprise using the modified weighting function(s) when performing at least one subsequent training step.

Candidate ML model(s) 134 and/or at least part of the training and/or test cases may be stored in the storage device 132 of the computing system 130. The computing system 130 may be operative to store the generated decision logic for execution by the IED 30. The computing system 130 may be operative to output the decision logic, e.g. via interface 135, for storing and/or to otherwise deploy the generated decision logic to the IED 30.

In field use, the IED 30 may execute the decision logic. The decision logic may implement a distance protection or time domain protection function. The IED 30 may cause a corrective, protective, and/or mitigating action (such as CB trip or restrain) by processing electrical measurements, using the decision logic.

Example

A case study was performed based on reach calculation, which is an important component of time domain protection. It should again be noted that, whilst power system protection is a preferred embodiment of this application, it is also applicable to a wider range of applications requiring classification of samples in a time / data series. The time domain protection reach defines the line length within which, if a fault occurs, the function should trip the circuit breaker. The experiments show that weighting function adjustment can properly address the trade-off between speed and accuracy, improving the overall performance. The study has been conducted on a simulated 400 kV overhead transmission line, where the relevant simulation parameters were drawn from preselected distributions. The fixed parameters of the simulations were:

Line length 200km,

Reach set to 70% of line length, 140km.

The varying parameters of simulations were:

Fault location (3 - 97%),

Fault resistance (0 - 20 Ohms),

System source to line impedance ratio (0.1 - 1 [per unit]),

Fault type: AG, BG, CG, AB, BC, CA, ABG, BCG, CAG, ABC, ABCG,

Load conditions: −800 A to 800 A,

Fault inception angle: 0 - 360 deg.

In total, 30,000 simulations were generated, of which 21,000 were used for training, and the trained ML model was tested on 9,000 cases which were not included in the training set. For decision logics based on machine learning that achieve 100% accurate decisions on the 9,000 test cases (i.e., no false alarms), training without adjustment of the weighting function (a Gaussian curve centered 5 ms after the fault inception for all cases) results in a reach calculation time of 3.51 ms. An ML model training with automatic adjustment of the weighting functions was performed for comparison. The portion 73 of the weighting function 70 was initially centered 5 ms after the fault inception and was automatically moved relative to the fault inception for each case. The reach calculation of the obtained decision logic took 1.14 ms. I.e., the reach calculation of the ML model is 2.37 ms faster for the technique that implements an automatic adjustment of the weighting functions, as compared to a fixed weighting function (e.g., training kernel).

Methods, systems, and devices disclosed herein provide a decision logic having good dependability and speed. The decision logic, obtained by iterative ML model training with automatic adjustment of weighting functions, can correctly distinguish between (training and actual) cases for which a longer period is required to achieve better prediction and the cases for which it can be achieved quickly. Various effects and advantages can be attained.

For illustration, the trained ML model is optimized in order to provide correct decisions as quickly as possible. Such attributes are critical in many monitoring and protection applications.

No manual or expert knowledge is required in order to distinguish challenging training cases before training. This can be an arduous task with the possibility of erroneous data labeling, particularly if many cases need to be included in the training set.

The procedure can be performed in a fully automated manner, as only the parameters of the moving weighting function (e.g., training kernel) need to be tuned, such as the initial position, the threshold for the output, and the counter threshold that causes adjustment of the weighting function.

The position of the rising flank of the weighting function relative to the event occurrence is available after the training and can be utilized to explore the challenging training cases for additional conclusions and/or for training a dedicated model to solve subproblems.

While embodiments may be used for power system protection, the embodiments are not limited thereto.