

Title:
EMBEDDED COMPRESSION FOR PRODUCT LIFECYCLE DATA MANAGEMENT
Document Type and Number:
WIPO Patent Application WO/2024/043932
Kind Code:
A1
Abstract:
A system described herein can perform lightweight data compression that optimizes an embedded database for high-performance throughput, low memory usage, and fast query/response times. The embedded compression and decompression can be performed without libraries or branches, so as to define library-free and branch-free compression with highly predictable runtime behavior for lifecycle data management.

Inventors:
WEN CHENGTAO (US)
WANG LINGYUN (US)
SOLOWJOW EUGEN (US)
CHANDAK SHUBHAM (US)
TANDON PULKIT (US)
WEISSMAN TSACHY (US)
Application Number:
PCT/US2022/075452
Publication Date:
February 29, 2024
Filing Date:
August 25, 2022
Assignee:
SIEMENS AG (DE)
UNIV LELAND STANFORD JUNIOR (US)
International Classes:
H03M7/30
Domestic Patent References:
WO2020197526A1 (2020-10-01)
Other References:
WANG MIAO-QIONG ET AL: "Survey of Time Series Data Processing in Industrial Internet", 2019 IEEE INTERNATIONAL CONFERENCES ON UBIQUITOUS COMPUTING & COMMUNICATIONS (IUCC) AND DATA SCIENCE AND COMPUTATIONAL INTELLIGENCE (DSCI) AND SMART COMPUTING, NETWORKING AND SERVICES (SMARTCNS), IEEE, 21 October 2019 (2019-10-21), pages 736 - 741, XP033705778, DOI: 10.1109/IUCC/DSCI/SMARTCNS.2019.00151
Attorney, Agent or Firm:
BRAUN, Mark E. (US)
Claims:
CLAIMS

What is claimed is:

1. A method performed within an industrial control network that includes a plurality of edge devices and a database, the method comprising: monitoring, by the edge devices, the industrial control network so as to receive time series data; compressing the time series data so as to define compressed data; and storing the compressed data in the database, so as to define an embedded database of the industrial control network.

2. The method as recited in claim 1, wherein compressing the time series data further comprises: selecting a model for performing compression of the time series data; and detecting a delay associated with the time series data.

3. The method as recited in claim 1, wherein compressing the time series data further comprises compressing multiple time series data in parallel, so as to compress respective time series data simultaneously.

4. The method as recited in claim 1, the method further comprising: decompressing, by the database, the compressed data so as to define reconstructed original data.

5. The method as recited in claim 4, wherein the compressing and decompressing are performed without libraries or branches, so as to define library-free and branch-free compression and decompression with predictable run-time behavior.

6. The method as recited in claim 4, further comprising: receiving, by the database, a query for the time series data; and responsive to the query, displaying the reconstructed original data associated with the query.

7. The method as recited in claim 1, wherein the plurality of edge devices comprises sensors, and the time series data comprises respective time stamps and sensor data detected from the sensors, the method further comprising: compressing the time stamps together with the sensor data, such that the compressed data comprises compressed time stamps and sensor data.

8. A system comprising: a plurality of edge devices configured to monitor an industrial control network so as to receive time series data from the industrial control network; a database communicatively coupled to the plurality of edge devices; a memory having a plurality of application modules stored thereon; and a processor for executing the application modules, the modules configured to: compress the time series data so as to define compressed data; and store the compressed data in the database, so as to define an embedded database.

9. The system as recited in claim 8, the modules further configured to: select a model for performing compression of the time series data; and detect a delay associated with the time series data.

10. The system as recited in claim 8, the modules further configured to: compress multiple time series data in parallel, so as to compress respective time series data simultaneously.

11. The system as recited in claim 8, the database further configured to: decompress the compressed data so as to define reconstructed original data.

12. The system as recited in claim 11, wherein the time series data is compressed and the compressed data is decompressed without libraries or branches, so as to define library-free and branch-free compression and decompression with predictable run-time behavior.

13. The system as recited in claim 11, the database further configured to: receive a query for the time series data; responsive to the query, display the reconstructed original data associated with the query.

14. The system as recited in claim 10, wherein the plurality of edge devices comprises sensors, and the time series data comprises respective time stamps and sensor data detected from the sensors, the modules further configured to: compress the time stamps together with the sensor data, such that the compressed data comprises compressed time stamps and sensor data.

Description:
EMBEDDED COMPRESSION FOR PRODUCT LIFECYCLE DATA MANAGEMENT

BACKGROUND

[0001] Industrial automation or manufacturing systems can be used to control the operation of machines and other components in a systematic manner. Automation systems can include various automation domains such as factory automation, process automation, warehouse automation, building automation, energy automation, and the like. It is recognized herein that real-time compression of time-series data is often of crucial importance to manufacturing and process industries, which often require that collected data is understood so that decisions can be made in real time. It is further recognized herein that current approaches to data compression in the lifecycle of data management for various automation systems, including manufacturing and process industries, lack efficiency.

BRIEF SUMMARY

[0002] Embodiments of the invention address and overcome one or more of the shortcomings or technical problems described herein by providing methods, systems, and apparatuses that achieve high-efficiency data compression for manufacturing and process industries. In particular, for example, a system described herein can perform lightweight data compression that optimizes an embedded database for high-performance throughput, low memory usage, and fast query/response times.

[0003] In an example aspect, operations are performed within an industrial control network that includes a plurality of edge devices and a database. The edge devices can monitor the industrial control network so as to receive time series data from the industrial control network. The edge devices and/or the database can compress the time series data so as to define compressed data. The compressed data can be stored in the database, so as to define an embedded database of the industrial control network. In particular, for example, to compress the time series data, the edge devices or database can select a model for performing compression of the time series data. Further, the edge devices or database can detect a delay associated with the time series data. The database can decompress the compressed data, so as to define reconstructed original data. In an example, a query for the time series data can be received by the database. Responsive to the query, the database can display the reconstructed original data associated with the query. The plurality of edge devices can include sensors, and the time series data can include respective time stamps and sensor data detected from the sensors. Thus, in some examples, the time stamps are compressed together with the sensor data, such that the compressed data comprises compressed time stamps and sensor data.

BRIEF DESCRIPTION OF THE DRAWINGS

[0004] The foregoing and other aspects of the present invention are best understood from the following detailed description when read in connection with the accompanying drawings. For the purpose of illustrating the invention, there is shown in the drawings embodiments that are presently preferred, it being understood, however, that the invention is not limited to the specific instrumentalities disclosed. Included in the drawings are the following Figures:

[0005] FIG. 1 is a block diagram of an example manufacturing or automation system that includes a plurality of edge devices configured to perform data compression, in accordance with an example embodiment.

[0006] FIG. 2 is a block diagram of an example compression computing system, in accordance with an example embodiment.

[0007] FIG. 3 is a flow diagram that illustrates example operations that can be performed by the compression computing system, in accordance with an example embodiment.

[0008] FIG. 4 shows an example of a computing environment within which embodiments of the disclosure may be implemented.

DETAILED DESCRIPTION

[0009] As an initial matter, it is recognized herein that smart machines and sensors that provide ubiquitous connectivity typically produce measurements at very high frequencies. This can lead to a need to transmit a huge amount of data from edge devices to remote servers through constrained computation and bandwidth. In addition, it is further recognized herein that some manufacturing data must always be instantly accessible (e.g., guidance and control of automated guided vehicles). This can lead to a need to archive high-frequency process data in local historians of edge devices (e.g., smart machines or Programmable Logic Controllers (PLCs)). Thus, time-series data collected by edge devices is often high-frequency and high-accuracy, which can lead to severe constraints on the bandwidth to transmit, and the space to archive, the data.

[0010] An embedded database can be used to log manufacturing and process data. In some cases, embedded databases can log large amounts of measured data using a comparatively small amount of storage space. An embedded database can require managing the entire lifecycle of live real-time data streams. The lifecycle can include receiving or capturing live data, processing the data, and then distributing the processed data, for instance to deliver visualization and analytics that can enable meaningful manufacturing decisions. In accordance with an example embodiment, a compression and management platform or system enables various applications to efficiently monitor, communicate, store, process, analyze, and visualize industrial data. Industrial data may include, for example and without limitation, data related to pressure, temperature, fluid flow rate, velocity, acceleration, and other physical, chemical, or biological parameters.

[0011] By way of further background, there are various fundamental classes of file compression algorithms. A first class identifies repeating elements in the original data (e.g., Winzip/7zip file compression). These algorithms generally define lossless compression, in which the original data is represented without losing any information, and the process is reversible. The compressing and decompressing are often computationally intensive, such that wider application of these compressors can be limited in time-sensitive compression tasks (e.g., fast-sampling manufacturing processes such as vibration control in CNC machines).

[0012] A second class of compressors identifies redundant data, which can be discarded based on a predefined compression accuracy. Typical examples are collector compression and archive compression. Collector compression examines the values of measured data, and discards those within a defined value range (e.g., ±1 mm in distance measurements and ±10 Pascal in pressure measurements). Collector compression stores data based on the amount of change in the data. It records a value only when the new value deviates too much from the last recorded value. By comparison, archive compression stores data based on its rate of change. It examines the slope of the measured data, and discards values that fall within a predefined slope range. Thus, this algorithm can store data that "changes direction" beyond a configured range. Archive compression is also called swinging door compression, and often runs after collector compression. In general, both collector compression and archive compression are lossy compressors. However, they are typically more scalable to large data sizes, and suitable for time-sensitive compression tasks. QVZ is an example lossy compressor for quality values in genomic data, which allows for a parameter to control the tradeoff between accuracy and compression ratio. GTRAC is an example compressor for genomic variants that allows fast random access directly from the compressed data without needing to decompress the entire compressed archive.
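By way of a non-limiting illustrative sketch (Python; the function name and deadband value are assumptions of this sketch, not taken from the application), the value-deadband rule of collector compression can be written as:

```python
def collector_compress(samples, deadband):
    """Collector (deadband) compression sketch: keep a sample only when
    its value deviates from the last kept value by more than the deadband."""
    kept = []
    last = None
    for t, v in samples:
        if last is None or abs(v - last) > deadband:
            kept.append((t, v))
            last = v
    return kept
```

For example, with a deadband of 0.1, the samples (0, 1.0), (1, 1.05), (2, 1.5), (3, 1.52) reduce to (0, 1.0) and (2, 1.5): the dropped samples can later be approximated by the last kept value, which is why this is a lossy scheme.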

[0013] Time series data is common to various applications. The Gorilla database from Facebook uses a simple delta-encoding-based compressor that exploits the fact that time series data typically changes slowly. In an example, a lossy compressor uses a first-order regression model to predict the next element in the time series, and then stores the error from the predicted value. An example higher-order prediction model can be used for floating-point data, in which a prediction for the next value is calculated based on many previous observations instead of just the previous one. The time-series compressors mentioned above are useful for generic time series, but they are designed under one assumption: the data-generating processes are linear and slowly time-variant systems. Therefore, it is recognized herein that these methods do not capture nonlinear structures in time-series data well. As another example, the LFZip compressor is developed based on the prediction-quantization-entropy coder framework, and benefits from improved prediction using nonlinear models and deep neural networks for multivariate floating-point time series data, providing guaranteed reconstruction up to a user-specified maximum absolute error. LFZip achieves significant improvement in compression over previous state-of-the-art compressors, as demonstrated by evaluating the compressor on several time series datasets.
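The delta encoding mentioned above can be illustrated with a minimal sketch (Python; illustrative names, integer data, and without the bit-packing that a production compressor such as Gorilla adds). Because slowly changing series produce small deltas, the encoded stream compresses well downstream:

```python
def delta_encode(values):
    """Store the first value, then successive differences
    (small integers when the series changes slowly)."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out
```

The round trip is lossless: decoding the encoded stream reproduces the original series exactly.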

[0014] Referring initially to FIG. 1, an example automation or industrial control network or system 100 can include one or more plants or production networks 104 that contain control logic, host web servers, and the like. For example, the industrial control network 100 can include an enterprise or IT network 102 and multiple operational plant or production networks 104 communicatively coupled to the IT network 102. The production network 104 or enterprise network 102 can include a plurality of edge devices or modules 106 connected within the production network 104. The edge devices or modules 106 can define a compression system, as further described herein. An example edge device 106 is connected to the IT network 102. The arrangement of edge devices or modules 106 can vary as desired, and all such arrangements are contemplated as being within the scope of this disclosure.

[0015] Still referring to FIG. 1, the production network 104 can include various production machines configured to work together to perform one or more manufacturing operations. Example production machines of the production network 104 can include, without limitation, robots 108 and other field devices that can be controlled by a respective PLC 114, such as sensors 110, actuators 112, or other machines, such as automatic guided vehicles (AGVs) 108. The PLC 114 can send instructions to respective field devices. In some cases, a given PLC 114 can be coupled to a human-machine interface (HMI) 116. It will be understood that the industrial control network 100 is simplified for purposes of example. That is, the industrial control network 100 may include additional or alternative nodes or systems, for instance other network devices, that define alternative configurations, and all such configurations are contemplated as being within the scope of this disclosure.

[0016] The network or system 100, in particular each production network 104, can define a field portion or level 118 and plant level or portion 120. For example, and without limitation, the plant level 120 can define one or more industrial plants or systems that can be geographically and functionally separate from or independent of each other. For example, the plant level 120 can include Brownfield plants and Greenfield plants that are each connected to respective field devices within the field level 118. The field level 118 can include various field devices such as the robots 108, PLC 114, sensors 110, actuators 112, HMIs 116, and AGVs. The field portion 118 can define one or more production lines or control zones associated with a given plant in the plant level 120. The PLC 114, sensors 110, actuators 112, and HMI 116 within a given production line can communicate with each other via a respective field bus 122. Each control zone can be defined by a respective PLC 114, such that the PLC 114, and thus the corresponding control zone, can connect to the respective plant portion 120 via an Ethernet connection 124. In some cases, the robots 108 and AGVs can be configured to communicate with other devices within the fieldbus portion 118 via a Wi-Fi connection 126. Similarly, the robots 108 and AGVs can communicate with the Ethernet portion 120, in particular a Supervisory Control and Data Acquisition (SCADA) server 128, via the Wi-Fi connection 126. In various examples, a respective edge device or module 106 is communicatively coupled between the PLC 114 and the respective plant in the plant level 120, for instance via the Ethernet connection 124 or the Wi-Fi connection 126. In some examples, the edge module 106 is defined by the PLC 114.

[0017] The plant level 120 of a given production network 104 can include various computing devices or subsystems communicatively coupled together via the Ethernet connection 124. Example computing devices or subsystems in the plant portion 120 include, without limitation, a mobile data collector 130, HMIs 132, the SCADA server 128, the edge devices 106, a wireless router 134, a manufacturing execution system (MES) 136, an engineering system (ES) 138, and a log server 140. The ES 138 can include one or more engineering workstations. In an example, the MES 136, HMIs 132, ES 138, and log server 140 are connected to the production network 104 directly. The wireless router 134 can also connect to the production network 104 directly. Thus, in some cases, mobile users, for instance the mobile data collector 130 and robots 108 (e.g., AGVs), can connect to the production network 104 via the wireless router 134.

[0018] Referring also to FIG. 2, an example compression computing system 200 can be defined by one or more of the edge devices or modules 106 or PLCs 114. The compression computing system 200 can be configured to compress data in real-time. The computing system 200 can include one or more processors and memory having stored thereon applications, agents, and computer program modules including, for example, a predictor module 202, a quantization module 204 communicatively coupled to the predictor module 202, and a database 206 communicatively coupled to the quantization module 204. In various examples, the database 206 defines a real-time database (RTDB). The computing system 200 can define a compressor module 201 that includes the predictor module 202 and the quantization module 204, and the database 206 can define a decompressor module 203, such that the compressor module 201 is communicatively coupled to the decompressor module 203, and thus to the database 206.

[0019] It will be appreciated that the program modules, applications, computer-executable instructions, code, or the like depicted in FIG. 2 are merely illustrative and not exhaustive, and that processing described as being supported by any particular module may alternatively be distributed across multiple modules or performed by a different module. In addition, various program module(s), script(s), plug-in(s), Application Programming Interface(s) (API(s)), or any other suitable computer-executable code may be provided to support functionality provided by the program modules, applications, or computer-executable code depicted in FIG. 2 and/or additional or alternate functionality. Further, functionality may be modularized differently such that processing described as being supported collectively by the collection of program modules depicted in FIG. 2 may be performed by a fewer or greater number of modules, or functionality described as being supported by any particular module may be supported, at least in part, by another module. In addition, program modules that support the functionality described herein may form part of one or more applications executable across any number of systems or devices in accordance with any suitable computing model such as, for example, a client-server model, a peer-to-peer model, and so forth. In addition, any of the functionality described as being supported by any of the program modules depicted in FIG. 2 may be implemented, at least partially, in hardware and/or firmware across any number of devices, for instance the edge devices 106 or the PLCs 114.

[0020] The database 206 can define a real-time database configured to perform embedded compression. Thus, in some cases, the compressor module 201 can be distributed across edge devices 106 and the database 206. In some examples, the real-time database 206 can define an InfluxDB, which refers to a time series platform that empowers developers to build IoT, analytics, and monitoring software. The database 206 can process time-stamped data produced by sensors, applications, and infrastructure. In various examples, compression operations (e.g., operations 300 shown in FIG. 3) are embedded onto a platform defined by the database 206, for instance the InfluxDB platform, though it will be understood that the compression operations can be embedded into alternative databases, for instance other time series databases having open programming APIs, and all such implementations and databases are contemplated as being within the scope of this disclosure.

[0021] With continuing reference to FIG. 2, in various examples, the predictor module 202 can calculate a one-step-ahead prediction ŷ_t using the N previous quantized values {ŷ_{t−N}, …, ŷ_{t−1}}, where t is the current sampling time and N is the model order. The error ε_t of the prediction with respect to the true value y_t can be quantized by the quantization module 204, so as to define the vector of quantized errors {ε_1, …, ε_t}. The vector of quantized errors can be transferred and stored in an embedded time series database, for instance the database 206. The database 206 can include a general-purpose compressor (e.g., 7-zip) that can compress the vector of quantized errors. In an example, the quantized value of y_t is calculated as the sum of the quantized error ε_t and the predicted value ŷ_t. The predictor module 202 can define a supervised learning model that can learn the parameters of the predictor module 202, so as to minimize the 2-norm of the quantized errors {ε_1, …, ε_t}. In some examples, the compressor module 201 runs on the edge devices 106, for instance the sensors 110 or PLCs 114. The quantization module 204 can send the quantized errors to the decompressor module 203 of the database 206, which, in some examples, can run on a cloud platform, for instance in the IT network 102.
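The predict-quantize loop above can be sketched as follows (Python; the names, the uniform quantizer, and the pluggable predictor are assumptions of this sketch, not taken from the application). Quantizing the prediction error on a grid of cell width 2·step bounds the reconstruction error by step:

```python
def compress(y, predict, step):
    """Prediction-quantization sketch: quantize eps_t = y_t - yhat_t
    uniformly so that |y_t - reconstruction_t| <= step at every sample."""
    history = []    # quantized reconstructions fed back to the predictor
    q_errors = []   # integer symbols for a downstream entropy coder
    for y_t in y:
        y_pred = predict(history)
        q = round((y_t - y_pred) / (2 * step))
        q_errors.append(q)
        history.append(y_pred + q * 2 * step)
    return q_errors, history
```

A decompressor can rerun the same predictor over the reconstructed history and add back q · 2 · step, which is why only the integer symbols need to be stored and entropy-coded.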

[0022] In particular, the decompressor module 203 can be embedded in the real-time database 206. The quantized vectors can be fed into the decompressor module 203 to reconstruct the original data for query and visualization. In various examples, only the compressed data is stored in the database 206, which can result in a significant reduction of storage and query response time.

[0023] In various examples, the predictor module 202 defines a Normalized Least Mean Square (NLMS) predictor. The NLMS predictor can define an adaptive linear prediction filter. The parameters of the linear filter can be initialized with a fixed value and can be updated at each time-step based on the mean square prediction error. The update can be performed similarly to stochastic gradient descent, where the gradients are normalized before the update. As the predictor contains very few parameters, it is recognized herein that in practice the predictor can require no pre-training and can adapt very quickly to changing input statistics. Without being bound by theory, in practice the NLMS predictor can work well on various types of inputs, though it will be understood that the predictor module 202 is not limited to implementing an NLMS predictor.
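A minimal sketch of such an NLMS update follows (Python; the names are illustrative, and for simplicity the filter here runs on the raw past samples rather than the quantized reconstructions that an actual codec would feed back):

```python
def nlms_run(series, order=4, mu=0.5, eps=1e-8):
    """Adaptive linear predictor: predict each sample from the last
    `order` samples, then take a normalized gradient step on the weights."""
    w = [0.0] * order
    preds = []
    for t, d in enumerate(series):
        # input vector of past samples (zero-padded at the start)
        x = [series[t - 1 - k] if t - 1 - k >= 0 else 0.0 for k in range(order)]
        y = sum(wi * xi for wi, xi in zip(w, x))
        preds.append(y)
        e = d - y
        norm = eps + sum(xi * xi for xi in x)  # normalizes the gradient step
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
    return preds, w
```

On a constant input the prediction converges to the input value within a handful of steps, illustrating the fast adaptation noted above; no pre-training is required.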

[0024] For example, the predictor module 202 can define one or more neural network predictors, such as different variants of the Fully Connected (FC) and biGRU networks for univariate time series. In some cases, the NLMS predictor can define a single-layer linear neural network, and the FC and biGRU networks can define stronger models in comparison (in terms of expressiveness). In some cases, however, the larger number of parameters in FC and biGRU networks makes them adapt more slowly to the changing statistics in the time series as compared to the NLMS predictor. To resolve this issue, in an example, offline training is performed for the neural network-based predictors before the encoding step. During the offline training, a given model can be trained on given training data with early stopping performed with respect to validation data. The trained model can then be used as the predictor during compression, and the parameters can be optionally updated online during the compression.

[0025] Still referring to FIG. 2, the input of the decompressor module 203 can include archived data that is stored in the database 206. The output of the decompressor module 203 can include reconstructed time series data to visualize on a graphic user interface (GUI), for instance in response to query commands from end users. In an example, the decompressor module 203 defines an NLMS decompressor whose parameters are calculated from those of the NLMS predictor. Similarly, the weights of a neural network decompressor can be trained at the same time as the neural network predictors, for instance using labelled data via supervised learning or unlabeled data via unsupervised learning.

[0026] Referring also to FIG. 3, example operations 300 can be performed by the industrial control network 100, in particular the compression computing system 200. At 302, data is received by a given edge device 106, for instance the compressor module 201 of an edge device. At 304, the compressor module 201 can select a model for performing compression of the received data. For example, in time series analysis, the partial autocorrelation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, controlling for the values of the time series at all shorter lags. The compressor module 201 can use the partial autocorrelation to determine when the given time series has no or low dependence on past values. The model order can be selected by sampling a set of data from the time series. The sampling set can be obtained by sampling the historical data set or by taking the first few samples. The compressor module 201 can perform a model selection algorithm that can run periodically for streaming applications, for example, to fine-tune the model on the fly. In some examples, the compressor module 201 can perform a multi-armed bandit search to speed up the model selection process. Thus, in various examples, the compressor module 201 can select a compression model with the lowest model order and fewest model parameters, thereby resulting in a higher compression ratio.
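As a simplified, non-authoritative stand-in for the PACF-based order selection (Python; a plain autocorrelation threshold is used here instead of the true partial autocorrelation, and the names and threshold are illustrative):

```python
def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0  # constant series: no dependence on past values
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var

def select_order(sample, max_order=8, threshold=0.2):
    """Pick the smallest lag beyond which correlation with past values
    is negligible; a crude stand-in for the PACF criterion."""
    for lag in range(1, max_order + 1):
        if abs(autocorr(sample, lag)) < threshold:
            return lag - 1
    return max_order
```

A strongly trended sample keeps high correlation at every lag and so falls back to the maximum order, while a constant (dependence-free) sample selects order zero, matching the low-order, few-parameter preference described above.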

[0027] In an example, at 304, the compressor module 201 can select a general, lossy, floating-point, time-series compressor, such as LFZip. The resulting compression can define a prediction-quantization-entropy coder framework that uses a maximum deviation metric. Continuing with the example, the predictor module 202 can define various predictors, such as an adaptive linear model (NLMS) or neural networks (NNs). In various examples, an NLMS predictor results in an order of magnitude speed improvement as compared to NN models. The number of parameters, which corresponds to the size of the past window from which the predictor module 202 attempts to predict the current observed datapoint, can define the main hyperparameter in NLMS model-based prediction. In some cases, in various time series, even having no predictor can produce state-of-the-art results. Alternatively, or additionally, the optimal number of parameters can vary depending on the time-series properties and the user-specified maximum allowed deviation at each timepoint. In particular, in various examples, the compressor module 201 automatically determines the optimal model order for time series with low or no dependence on past values. In these cases, for example, selecting the model order automatically can be advantageous both in terms of speed and compression ratio over LFZip (NLMS). For example, the first few samples can define a training dataset so as to determine time-series properties as described above, and then the chosen optimal model order can be used on the remaining time series. Alternatively, or additionally, the chosen model can be fine-tuned on the fly by defining a training set periodically.

[0028] With continuing reference to FIG. 3, after the model is selected, a delay can be detected or determined, at 306. In some cases, time-series data may have a built-in delay due, for example, to seasonality in the recorded data or a mismatch in sensor synchronization. In the first example case, a single time series may define a periodic time structure. In the second example case, multiple time series might share this periodic time structure. The system can perform a cross-correlation function (CCF) offline on the recorded datasets to determine the delays. The CCF can define a measure of similarity of two time series at different time shifts. After calculating the cross-correlation between two time series, the maximum of the CCF (or the minimum, if the two time series are negatively correlated) can indicate the point in time where the two series are best aligned. In particular, for example, the time delay between the two time series is determined by the argument of the maximum. The identified delays can be used to determine a better prediction for future timestamps given already observed data. In some cases, delay detection increases the compression accuracy of the system 200. For example, the compressor module 201 can perform a lower-order compression model (e.g., with a smaller number of parameters) for a given accuracy requirement.
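The argmax-of-CCF rule above can be sketched as follows (Python; illustrative names, and an unnormalized correlation is used for brevity, whereas a full CCF would demean and normalize the series):

```python
def detect_delay(a, b, max_lag):
    """Return the lag at which b best aligns with a, i.e. the argmax of
    the (unnormalized) cross-correlation over lags in [-max_lag, max_lag]."""
    best_lag, best_corr = 0, float("-inf")
    n = len(a)
    for lag in range(-max_lag, max_lag + 1):
        # overlap-only correlation at this shift
        corr = sum(a[t] * b[t + lag] for t in range(n) if 0 <= t + lag < n)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag
```

For two series where the second repeats the first two samples later, the argmax lands at lag 2; a predictor for the delayed series can then exploit already-observed values of the leading series.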

Alternatively, or additionally, the compressor module 201 can perform an autocorrelation function (ACF) or cross-correlation function (CCF) on offline recorded datasets to determine the delay, which can then be used to achieve a better prediction for future timestamps given already observed data.

[0029] After the delay is detected, the system 200 can perform multi-time series compression, at 308. For example, at 308, the compressor module 201 can compress multiple time series together, and can compress the data and the time stamps together so as to define compressed data. Such compression can reduce storage because of, for example, dependencies between the multiple time series; further, the same time stamps can be shared by multiple data points in different time series. In particular, compressing multiple time series together can lead to better compression by taking into account the correlations between observed values in the different series. By way of example, LFZip supports compression of multiple time series. Additionally, it is recognized herein that industrial sensor measurements (e.g., represented as multiple time series) often share timestamps, and the global structure of the data can look similar across series. Such series can occur in pairs (e.g., two time series sharing the same timestamps). In such cases, without being bound by theory, compressing both time series together can result in nearly twice the compression of running LFZip on the individual files. Furthermore, sensor synchronization and the delay detection techniques described herein can further enhance the compression results.
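The shared-timestamp saving described for such paired series can be illustrated with a small sketch (Python; illustrative names, and a real implementation would additionally delta-encode both the timestamp and value columns):

```python
def pack_shared(timestamps, series_list):
    """Store the timestamp column once for a group of series that share
    a clock, instead of repeating it once per series."""
    return {"t": list(timestamps), "v": [list(s) for s in series_list]}

def unpack_shared(packed):
    """Rebuild per-series (timestamp, value) pairs from the packed form."""
    return [list(zip(packed["t"], s)) for s in packed["v"]]
```

For k series of length n this avoids storing n·(k−1) redundant timestamps before any entropy coding is applied, which is one source of the pairwise gain noted above.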

[0030] With continuing reference to FIG. 3, at 310, the compressed data can be transferred over the associated industrial network, for instance the industrial network 102. In some cases, referring also to FIG. 1, industrial networks can generally be categorized into different classes, such as an automation system, a SCADA system, and a business system 120. In an example, the primary requirement of an industrial automation system is real-time operation and reliability, and the primary requirement of a business network may be high bandwidth and low operation costs. These requirements drive the use of real-time fieldbus protocols within control system processes and control loops, while business networks might utilize fast, low-cost Ethernet networks and TCP/IP. SCADA systems can sit between these two very different networks. In many ways, for example, SCADA systems can share the requirements of the control system itself. For example, SCADA systems might need to be able to operate in real time. SCADA systems also can communicate with business systems over TCP/IP. In various examples, the compression described herein is designed to meet the requirements of the classes of industrial networks identified above, so as to achieve goals related to lifecycle data management. For example, the embedded lightweight compression described herein can meet the real-time requirements of automation and SCADA networks, and the relatively high compression ratio can meet the requirements of a low-cost business network with limited bandwidth. At 312, the compressed data that is transferred over the industrial network is decompressed so as to define reconstructed original data. For example, an embedded decompressor can decode the compressed data transferred over the industrial network. In various examples, the decompressor can be implemented in a data historian on edge devices 106, for instance the SCADA server 128, mobile data collector 130, HMIs 132, and PLC 114.
[0031] Thus, as described herein, the compression computing system 200 can define lightweight and computationally inexpensive mathematical operators (e.g., multiplication, subtraction). Such operators can be supported by mainstream time series databases (TSDBs). For example, the compression described herein can be integrated into a time series platform that is optimized for real-time applications, such as applications for manufacturing process monitoring and control (e.g., InfluxDB). The resulting embedded compressor integrated into a database can define superior performance in terms of compression ratio, speed, storage, and accuracy when compared with state-of-the-art compressors. It will be understood that InfluxDB is presented by way of example; the compression can be embedded in other real-time databases or industrial PLCs, and all such databases are contemplated as being within the scope of this disclosure.
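To illustrate the kind of operator set involved, consider a fixed second-order predictor built only from multiplications and subtractions, with no per-sample branching. The coefficients and function names below are hypothetical illustrations, not values taken from the disclosure.

```python
import numpy as np

def residuals(x):
    """Residuals against the fixed predictor 2*x[t-1] - x[t-2], which
    extrapolates a locally linear trend. The entire series is processed
    with multiplications and subtractions only, with no branches."""
    pred = 2.0 * x[1:-1] - x[:-2]
    return x[2:] - pred

def reconstruct(first_two, res):
    """Invert the predictor; the first two samples are stored verbatim."""
    out = np.empty(len(res) + 2)
    out[:2] = first_two
    for i, r in enumerate(res):
        out[i + 2] = 2.0 * out[i + 1] - out[i] + r
    return out

x = np.arange(12.0) ** 2                 # quadratic test series
r = residuals(x)
x_hat = reconstruct(x[:2], r)
```

Because the operator sequence is fixed and branch-free, its runtime is data-independent, which is what makes the predictable runtime behavior described herein possible inside a storage engine.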

[0032] Thus, the compression computing system 200 can define an embedded and portable compressor with self-learning capability that can be deployed in different and challenging industrial settings (e.g., flexible manufacturing systems, process systems, industrial IoT platforms, supply chain management and industrial data visualization, etc.). The real-time compression with learning capability can be achieved by transmitting and archiving compressed data with an embedded decompressor to reconstruct the original data on the fly, thereby reducing redundancies in generic time-series data by utilizing features learned automatically by deep neural networks.
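The self-learning behavior can be illustrated with a simplified normalized least-mean-squares (NLMS) predictor of the kind used by LFZip: both sides replay the identical weight-update rule, so the learned model never needs to be transmitted, only the residuals. The function names, filter order, and step size below are illustrative assumptions.

```python
import numpy as np

def nlms_residuals(x, order=4, mu=0.5, eps=1e-6):
    """Adaptive NLMS linear predictor: weights learn from the data as it
    arrives, and only the prediction residuals need to be stored."""
    w = np.zeros(order)
    hist = np.zeros(order)                 # most recent sample first
    res = np.empty(len(x))
    for t, v in enumerate(x):
        e = v - w @ hist
        res[t] = e
        w += mu * e * hist / (hist @ hist + eps)   # normalized LMS update
        hist = np.roll(hist, 1)
        hist[0] = v
    return res

def nlms_reconstruct(res, order=4, mu=0.5, eps=1e-6):
    """Replay the identical update rule, so the decompressor stays in
    lockstep with the compressor without transmitting learned weights."""
    w = np.zeros(order)
    hist = np.zeros(order)
    x = np.empty(len(res))
    for t, e in enumerate(res):
        v = w @ hist + e
        x[t] = v
        w += mu * e * hist / (hist @ hist + eps)
        hist = np.roll(hist, 1)
        hist[0] = v
    return x

t = np.arange(300)
signal = np.sin(0.2 * t)
res = nlms_residuals(signal)
```

After the filter converges, the residuals are much smaller than the raw samples, which is what produces the compression gain; the disclosure's deep-neural-network features generalize the same predict-and-store-residuals idea.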

[0033] Thus, as described herein, in an example aspect, operations are performed within an industrial control network that includes a plurality of edge devices and a database. The edge devices can monitor the industrial control network so as to receive time series data from the industrial control network. The edge devices and/or the database can compress the time series data so as to define compressed data. The compressed data can be stored in the database, so as to define an embedded database of the industrial control network. In particular, for example, to compress the time series data, the edge devices or database can select a model for performing compression of the time series data. Further, the edge devices or database can detect a delay associated with the time series data. Multiple time series can be compressed in parallel, so as to compress respective time series simultaneously. The compressed data can be transferred or sent throughout the industrial network. The database can decompress the compressed data, so as to define reconstructed original data. In various examples, the compressing and decompressing are performed without libraries or branches, so as to define library-free and branch-free compression and decompression with predictable runtime behavior for lifecycle data management. In an example, a query for the time series data can be received by the database. Responsive to the query, the database can display the reconstructed original data associated with the query. The plurality of edge devices can include sensors, and the time series data can include respective time stamps and sensor data detected from the sensors. Thus, in some examples, the time stamps are compressed together with the sensor data, such that the compressed data comprises compressed time stamps and sensor data.
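The summarized method can be sketched end to end as follows. All names are hypothetical, and zlib stands in purely as a placeholder compressor for illustration; the disclosed embedded compressor is library-free and branch-free, unlike this stand-in.

```python
from concurrent.futures import ThreadPoolExecutor
import struct
import zlib

def compress_series(samples):
    """Pack (timestamp, value) pairs into bytes and compress them.
    zlib is an illustrative stand-in for the embedded compressor."""
    raw = b"".join(struct.pack("<df", t, v) for t, v in samples)
    return zlib.compress(raw)

def decompress_series(blob):
    """Invert compress_series: each record is 8 timestamp bytes plus
    4 value bytes, i.e., 12 bytes per (timestamp, value) pair."""
    raw = zlib.decompress(blob)
    return [struct.unpack_from("<df", raw, i) for i in range(0, len(raw), 12)]

# Multiple time series compressed in parallel, then stored keyed by sensor,
# emulating the embedded database of the industrial control network.
series = {f"sensor{i}": [(float(t), float(i * t)) for t in range(100)]
          for i in range(4)}
with ThreadPoolExecutor() as pool:
    database = dict(zip(series, pool.map(compress_series, series.values())))

# Responding to a query: decompress so as to define reconstructed data.
restored = decompress_series(database["sensor2"])
```

Note that the timestamps are compressed together with the sensor values, mirroring the claim that the compressed data comprises compressed time stamps and sensor data.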

[0034] Without being bound by theory, the compression computing system 200 was verified using an example test dataset consisting of 1000 time series of various natures (e.g., fast and slow dynamics, linear or highly nonlinear behavior, short-term or long-term dependencies, etc.). The compression computing system 200 exhibited significant improvement in both memory and disk storage over previous state-of-the-art compressors. In an example, the query response time is comparable whether the embedded decompressor is activated or deactivated.

[0035] FIG. 4 illustrates an example of a computing environment within which embodiments of the present disclosure may be implemented. A computing environment 800 includes a computer system 810 that may include a communication mechanism such as a system bus 821 or other communication mechanism for communicating information within the computer system 810. The computer system 810 further includes one or more processors 820 coupled with the system bus 821 for processing the information. The industrial control network 100, in particular the compression computing system 200, may include, or be coupled to, the one or more processors 820.

[0036] The processors 820 may include one or more central processing units (CPUs), graphical processing units (GPUs), or any other processor known in the art. More generally, a processor as described herein is a device for executing machine-readable instructions stored on a computer readable medium for performing tasks, and may comprise any one or combination of hardware and firmware. A processor may also comprise memory storing machine-readable instructions executable for performing tasks. A processor acts upon information by manipulating, analyzing, modifying, converting or transmitting information for use by an executable procedure or an information device, and/or by routing the information to an output device. A processor may use or comprise the capabilities of a computer, controller or microprocessor, for example, and be conditioned using executable instructions to perform special purpose functions not performed by a general purpose computer. A processor may include any type of suitable processing unit including, but not limited to, a central processing unit, a microprocessor, a Reduced Instruction Set Computer (RISC) microprocessor, a Complex Instruction Set Computer (CISC) microprocessor, a microcontroller, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a System-on-a-Chip (SoC), a digital signal processor (DSP), and so forth. Further, the processor(s) 820 may have any suitable microarchitecture design that includes any number of constituent components such as, for example, registers, multiplexers, arithmetic logic units, cache controllers for controlling read/write operations to cache memory, branch predictors, or the like. The microarchitecture design of the processor may be capable of supporting any of a variety of instruction sets. A processor may be coupled (electrically and/or as comprising executable components) with any other processor enabling interaction and/or communication therebetween.
A user interface processor or generator is a known element comprising electronic circuitry or software or a combination of both for generating display images or portions thereof. A user interface comprises one or more display images enabling user interaction with a processor or other device.

[0037] The system bus 821 may include at least one of a system bus, a memory bus, an address bus, or a message bus, and may permit exchange of information (e.g., data (including computer-executable code), signaling, etc.) between various components of the computer system 810. The system bus 821 may include, without limitation, a memory bus or a memory controller, a peripheral bus, an accelerated graphics port, and so forth. The system bus 821 may be associated with any suitable bus architecture including, without limitation, an Industry Standard Architecture (ISA), a Micro Channel Architecture (MCA), an Enhanced ISA (EISA), a Video Electronics Standards Association (VESA) architecture, an Accelerated Graphics Port (AGP) architecture, a Peripheral Component Interconnects (PCI) architecture, a PCI-Express architecture, a Personal Computer Memory Card International Association (PCMCIA) architecture, a Universal Serial Bus (USB) architecture, and so forth.

[0038] Continuing with reference to FIG. 4, the computer system 810 may also include a system memory 830 coupled to the system bus 821 for storing information and instructions to be executed by processors 820. The system memory 830 may include computer readable storage media in the form of volatile and/or nonvolatile memory, such as read only memory (ROM) 831 and/or random access memory (RAM) 832. The RAM 832 may include other dynamic storage device(s) (e.g., dynamic RAM, static RAM, and synchronous DRAM). The ROM 831 may include other static storage device(s) (e.g., programmable ROM, erasable PROM, and electrically erasable PROM). In addition, the system memory 830 may be used for storing temporary variables or other intermediate information during the execution of instructions by the processors 820. A basic input/output system 833 (BIOS) containing the basic routines that help to transfer information between elements within computer system 810, such as during start-up, may be stored in the ROM 831.
RAM 832 may contain data and/or program modules that are immediately accessible to and/or presently being operated on by the processors 820. System memory 830 may additionally include, for example, operating system 834, application programs 835, and other program modules 836. Application programs 835 may also include a user portal for development of the application program, allowing input parameters to be entered and modified as necessary.

[0039] The operating system 834 may be loaded into the memory 830 and may provide an interface between other application software executing on the computer system 810 and hardware resources of the computer system 810. More specifically, the operating system 834 may include a set of computer-executable instructions for managing hardware resources of the computer system 810 and for providing common services to other application programs (e.g., managing memory allocation among various application programs). In certain example embodiments, the operating system 834 may control execution of one or more of the program modules depicted as being stored in the data storage 840. The operating system 834 may include any operating system now known or which may be developed in the future including, but not limited to, any server operating system, any mainframe operating system, or any other proprietary or non-proprietary operating system.

[0040] The computer system 810 may also include a disk/media controller 843 coupled to the system bus 821 to control one or more storage devices for storing information and instructions, such as a magnetic hard disk 841 and/or a removable media drive 842 (e.g., floppy disk drive, compact disc drive, tape drive, flash drive, and/or solid state drive). Storage devices 840 may be added to the computer system 810 using an appropriate device interface (e.g., a small computer system interface (SCSI), integrated device electronics (IDE), Universal Serial Bus (USB), or FireWire). Storage devices 841, 842 may be external to the computer system 810.

[0041] The computer system 810 may also include a field device interface 865 coupled to the system bus 821 to control a field device 866, such as a device used in a production line. The computer system 810 may include a user input interface or GUI 861, which may comprise one or more input devices, such as a keyboard, touchscreen, tablet and/or a pointing device, for interacting with a computer user and providing information to the processors 820.

[0042] The computer system 810 may perform a portion or all of the processing steps of embodiments of the invention in response to the processors 820 executing one or more sequences of one or more instructions contained in a memory, such as the system memory 830. Such instructions may be read into the system memory 830 from another computer readable medium of storage 840, such as the magnetic hard disk 841 or the removable media drive 842. The magnetic hard disk 841 and/or removable media drive 842 may contain one or more data stores and data files used by embodiments of the present disclosure. The data store 840 may include, but is not limited to, databases (e.g., relational, object-oriented, etc.), file systems, flat files, distributed data stores in which data is stored on more than one node of a computer network, peer-to-peer network data stores, or the like. The data stores may store various types of data such as, for example, skill data, sensor data, or any other data generated in accordance with the embodiments of the disclosure. Data store contents and data files may be encrypted to improve security. The processors 820 may also be employed in a multi-processing arrangement to execute the one or more sequences of instructions contained in system memory 830. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions. Thus, embodiments are not limited to any specific combination of hardware circuitry and software.

[0043] As stated above, the computer system 810 may include at least one computer readable medium or memory for holding instructions programmed according to embodiments of the invention and for containing data structures, tables, records, or other data described herein. The term “computer readable medium” as used herein refers to any medium that participates in providing instructions to the processors 820 for execution. A computer readable medium may take many forms including, but not limited to, non-transitory, non-volatile media, volatile media, and transmission media. Non-limiting examples of non-volatile media include optical disks, solid state drives, magnetic disks, and magneto-optical disks, such as magnetic hard disk 841 or removable media drive 842. Non-limiting examples of volatile media include dynamic memory, such as system memory 830. Non-limiting examples of transmission media include coaxial cables, copper wire, and fiber optics, including the wires that make up the system bus 821. Transmission media may also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.

[0044] Computer readable medium instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.

[0045] Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, may be implemented by computer readable medium instructions.

[0046] The computing environment 800 may further include the computer system 810 operating in a networked environment using logical connections to one or more remote computers, such as remote computing device 880. The network interface 870 may enable communication, for example, with other remote devices 880 or systems and/or the storage devices 841, 842 via the network 871. Remote computing device 880 may be a personal computer (laptop or desktop), a mobile device, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer system 810. When used in a networking environment, computer system 810 may include modem 872 for establishing communications over a network 871, such as the Internet. Modem 872 may be connected to system bus 821 via user network interface 870, or via another appropriate mechanism.

[0047] Network 871 may be any network or system generally known in the art, including the Internet, an intranet, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a direct connection or series of connections, a cellular telephone network, or any other network or medium capable of facilitating communication between computer system 810 and other computers (e.g., remote computing device 880). The network 871 may be wired, wireless or a combination thereof. Wired connections may be implemented using Ethernet, Universal Serial Bus (USB), RJ-6, or any other wired connection generally known in the art. Wireless connections may be implemented using Wi-Fi, WiMAX, Bluetooth, infrared, cellular networks, satellite or any other wireless connection methodology generally known in the art. Additionally, several networks may work alone or in communication with each other to facilitate communication in the network 871.

[0048] It should be appreciated that the program modules, applications, computer-executable instructions, code, or the like depicted in FIG. 4 as being stored in the system memory 830 are merely illustrative and not exhaustive and that processing described as being supported by any particular module may alternatively be distributed across multiple modules or performed by a different module. In addition, various program module(s), script(s), plug-in(s), Application Programming Interface(s) (API(s)), or any other suitable computer-executable code hosted locally on the computer system 810, the remote device 880, and/or hosted on other computing device(s) accessible via one or more of the network(s) 871, may be provided to support functionality provided by the program modules, applications, or computer-executable code depicted in the figures and/or additional or alternate functionality. Further, functionality may be modularized differently such that processing described as being supported collectively by the collection of program modules depicted in the figures may be performed by a fewer or greater number of modules, or functionality described as being supported by any particular module may be supported, at least in part, by another module. In addition, program modules that support the functionality described herein may form part of one or more applications executable across any number of systems or devices in accordance with any suitable computing model such as, for example, a client-server model, a peer-to-peer model, and so forth. In addition, any of the functionality described as being supported by any of the program modules depicted in the figures may be implemented, at least partially, in hardware and/or firmware across any number of devices.

[0049] It should further be appreciated that the computer system 810 may include alternate and/or additional hardware, software, or firmware components beyond those described or depicted without departing from the scope of the disclosure. More particularly, it should be appreciated that software, firmware, or hardware components depicted as forming part of the computer system 810 are merely illustrative and that some components may not be present or additional components may be provided in various embodiments. While various illustrative program modules have been depicted and described as software modules stored in system memory 830, it should be appreciated that functionality described as being supported by the program modules may be enabled by any combination of hardware, software, and/or firmware. It should further be appreciated that each of the above-mentioned modules may, in various embodiments, represent a logical partitioning of supported functionality. This logical partitioning is depicted for ease of explanation of the functionality and may not be representative of the structure of software, hardware, and/or firmware for implementing the functionality. Accordingly, it should be appreciated that functionality described as being provided by a particular module may, in various embodiments, be provided at least in part by one or more other modules. Further, one or more depicted modules may not be present in certain embodiments, while in other embodiments, additional modules not depicted may be present and may support at least a portion of the described functionality and/or additional functionality. Moreover, while certain modules may be depicted and described as sub-modules of another module, in certain embodiments, such modules may be provided as independent modules or as sub-modules of other modules.
[0050] Although specific embodiments of the disclosure have been described, one of ordinary skill in the art will recognize that numerous other modifications and alternative embodiments are within the scope of the disclosure. For example, any of the functionality and/or processing capabilities described with respect to a particular device or component may be performed by any other device or component. Further, while various illustrative implementations and architectures have been described in accordance with embodiments of the disclosure, one of ordinary skill in the art will appreciate that numerous other modifications to the illustrative implementations and architectures described herein are also within the scope of this disclosure. In addition, it should be appreciated that any operation, element, component, data, or the like described herein as being based on another operation, element, component, data, or the like can be additionally based on one or more other operations, elements, components, data, or the like. Accordingly, the phrase “based on,” or variants thereof, should be interpreted as “based at least in part on.”

[0051] Although embodiments have been described in language specific to structural features and/or methodological acts, it is to be understood that the disclosure is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as illustrative forms of implementing the embodiments. Conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments could include, while other embodiments do not include, certain features, elements, and/or steps. Thus, such conditional language is not generally intended to imply that features, elements, and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements, and/or steps are included or are to be performed in any particular embodiment.

[0052] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.