WU WENBO (US)
CLAIMS
What is claimed is:
1. A system for providing on-site weather forecasting for a building in a predetermined location, comprising: a source of crowdsourced weather data representing a plurality of personal weather stations in an area surrounding the predetermined location; and a platform interconnected to the source of crowdsourced weather data, wherein the platform includes a processor programmed to provide a prediction of weather in the future at the predetermined location based upon the source of crowdsourced weather data using a spatial-temporal correlation of a predetermined set of weather features.
2. The system of claim 1, wherein the spatial-temporal correlation comprises a point-to-point correlation between each of the plurality of personal weather stations and the predetermined location of the building.
3. The system of claim 2, wherein the source of crowdsourced weather data includes a plurality of forecasts from the plurality of personal weather stations, respectively.
4. The system of claim 3, wherein the prediction of weather comprises an ensemble of the plurality of forecasts.
5. The system of claim 4, wherein the processor is programmed to provide the prediction of weather based on a machine learning model that has been trained using a set of historical weather features from the plurality of personal weather stations.
6. The system of claim 5, wherein the machine learning model comprises a random forest.
7. The system of claim 6, wherein the predetermined set of weather features comprises at least one endogenous temporal feature, at least one exogenous temporal feature, and at least one spatial feature.
8. The system of claim 7, wherein the processor is programmed to provide the prediction of weather according to a schedule.
9. The system of claim 8, wherein the schedule is every fifteen minutes.
10. The system of claim 9, wherein the historical weather features encompass up to a two day time period.
11. The system of claim 10, further comprising a source of public weather data representing at least one public weather station in the area surrounding the predetermined location.
12. The system of claim 11, wherein the at least one public weather station is a National Oceanic and Atmospheric Administration (NOAA) airport weather station.
13. The system of claim 12, wherein the source of public weather data from the at least one public weather station includes a series of forecasts.
14. The system of claim 13, wherein the prediction of weather comprises an ensemble of the plurality of forecasts from the plurality of personal weather stations and the series of forecasts from the at least one public weather station.
[0052] Note that the size of the training data at each time t is RN_t and might be different at different time points depending on the availability of the data at t. The neighborhood parameter R not only controls the number of data points used for training the model, but also provides additional insight into the spatial impact of the surrounding stations on the target location. In a real implementation, N_t may be several hundred PWSs depending on data availability. Due to computing limitations, R would be chosen to be the same as in the experiments. [0053] Most machine learning models require a large number of observations to accurately learn and capture the underlying model structure. Hence, a potential improvement to the model training and forecasting illustrated in FIG.5 is to introduce a rolling window such that more data are used for training the machine learning models. As a trade-off, it is necessary to assume that the model structure remains unchanged over the rolling period; that is, the longer the rolling period, the stronger the assumption of a fixed model structure that is imposed. An additional benefit of introducing a rolling window is to verify the persistence of the model structure and hence reduce the number of times a new model must be trained to make the forecast. The updated local model training and forecasting process is illustrated in FIG.5. [0054] Following the convention in the literature, when a single rolling window is considered (W = 1), the values of the response variable (first column in Table 2) at time t + F are denoted by a vector of length RN_t. Note that the values of the response vector are independent of L. The input features corresponding to the fixed values of F and L are denoted by a (RN_t) x (2L + 2) matrix. When W rolling windows are considered as illustrated in FIG.5, the response vector is extended to a vector whose length is the sum of RN_{t-w} over the W windows (w = 0, ..., W - 1), and the input feature matrix is extended accordingly to a matrix with the same number of rows and (2L + 2) columns.
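The rolling-window assembly of the training set described in paragraphs [0052] through [0054] can be sketched as follows. This is a minimal illustration assuming a single weather variable observed at a fixed set of stations on a common 15-minute grid, ignoring the spatial features; the function name and array layout are hypothetical, not from the source.

```python
import numpy as np

def build_training_set(obs, t, F, L, W):
    # obs: (n_times, n_stations) array of one weather variable.
    # Returns X of shape (W * n_stations, L) and y of shape (W * n_stations,),
    # pairing the L lagged values ending at time tau with the observation at
    # tau + F, for tau = t - F, t - F - 1, ..., t - F - W + 1 (the W windows),
    # so every response used in training is already observed by time t.
    n_times, n_stations = obs.shape
    X_rows, y_rows = [], []
    for w in range(W):
        tau = t - F - w
        if tau - L + 1 < 0:
            raise ValueError("not enough history for the requested lags")
        for s in range(n_stations):
            X_rows.append(obs[tau - L + 1 : tau + 1, s])
            y_rows.append(obs[tau + F, s])
    return np.asarray(X_rows), np.asarray(y_rows)
```

Each additional window multiplies the number of training rows by the station count, which is the trade-off against the fixed-structure assumption noted above.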
[0055] Without direct access to the crowdsourced data, weather forecasts at local airports are often used to predict building energy consumption. While airport weather forecasts such as those provided by NOAA are usually made using sophisticated models, they may not be accurate for specific locations because an urban heat island effect can cause a temperature difference between downtown and suburban areas. Nevertheless, the airport forecasts carry important weather information for the region and can potentially improve the forecasting accuracy of the proposed model. Platform 10 may thus use one of two approaches for incorporating airport forecasts into the model framework. [0056] NOAA provides weather forecasts from 1-hour-ahead up to 24-hour-ahead. Hence, these airport forecasts at different forecast horizons can be added to the model (1) as model inputs, and the model (1) becomes [0057] where the added input contains up to 24 temperature forecasts at different forecast horizons for time t + F. When F is small, such as for 1-hr-ahead forecasts, all 24 airport forecasts are available to use because those forecasts are made before time t. When F is set to 24-hr-ahead, it is possible to use only the 24-hr-ahead airport forecast as the external model input because the airport forecasts at smaller forecast horizons for time t + F have not yet been made. Thus, the added input contains more forecast information for smaller F. Note that this input does not depend on L or R. In the training data preparation, the airport forecasts are always the forecasts for the same time stamp as the response variable.
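The horizon-dependent availability of airport forecasts described in paragraph [0057] can be illustrated with a small sketch. Time is measured in 15-minute ticks, and the dictionary layout keyed by (issue time, horizon in hours) is a hypothetical convention for this example, not from the source.

```python
def airport_features(noaa, t, F, max_h=24, steps_per_hour=4):
    # noaa: dict mapping (issue_time, horizon_hours) -> forecast value.
    # A forecast with horizon h targeting time t + F was issued at
    # t + F - h * steps_per_hour; it can only be used as a model input
    # if it was issued no later than the current time t. Hence small F
    # admits all 24 horizons, while F = 96 (24 hours ahead) leaves only
    # the 24-hour-ahead airport forecast, as described in the text.
    target = t + F
    feats = []
    for h in range(1, max_h + 1):
        issue = target - h * steps_per_hour
        if issue <= t:
            feats.append(noaa.get((issue, h)))
    return feats
```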
[0058] Since the airport forecasts are available in this example, it is possible to further ensemble the forecasts from equation (3) with the airport forecasts at the same forecast horizon by taking the average of the two predictions, as in equation (5). Equation (5) can potentially be improved by using a weighted average, with weights depending on the geographical locations, similar to the concept mentioned in equation (2). In this way, the variation of the prediction at each time stamp can be further reduced if one believes that these external forecasts are reliable. In the empirical studies, it is demonstrated that the present invention is already accurate without ensembling with external forecasts. However, the ensemble step can often further boost model performance. EXAMPLE [0059] The term "resolution" of model (1) refers to how frequently the forecast is made, and it is usually determined by how frequently weather forecasts are needed for building energy simulation. In the implementations in this study, the resolution is 15 minutes; that is, the model is trained and a forecast made every 15 minutes. The term "frequency" in this study refers to how frequently the data are collected, or equivalently the time span between any two adjacent time ticks t. In the present example, the weather variables at each PWS are collected roughly at a 5-minute frequency, which allows one to choose the model resolution to be any interval greater than 5 minutes. The data were aggregated into 15-minute frequencies because it was assumed that the variations in the weather variables are negligible within this interval. [0060] As a critical step in the modeling framework, one needs to obtain an estimate of the regression function based on the training data. Since no restriction was imposed on its form, many traditional models may be used, including linear regression models, time-series models, spatial models, etc.
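The ensemble step of equation (5) is a simple average of the two predictions; a minimal sketch follows, with an optional weight for the geographically weighted variant suggested above (the weight parameter is illustrative, not from the source):

```python
def ensemble_forecast(model_pred, airport_pred, w=0.5):
    # Convex combination of the local model forecast and the airport
    # forecast at the same horizon. w = 0.5 reproduces the plain average
    # of equation (5); other values of w could be set from geographical
    # proximity, as suggested for equation (2).
    return w * model_pred + (1.0 - w) * airport_pred
```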
As an alternative, the general modeling framework may be equipped with advanced machine learning models, given their demonstrated capability of capturing the underlying complex relationship between the response variable and the model inputs. In the present example, random forest (RF) and gradient boosting were used as representatives of advanced machine learning models, given their flexibility (e.g., nonparametric) and computational efficiency. Other advanced machine learning or deep learning models can also be adopted in a similar way with the permitted computational resources. [0061] Random forest is an ensemble method built on classification and regression trees (CART). The basic CART works by means of a recursive partitioning of the data that can be represented within a basis function framework. The basic CART is prone to over-fitting, and RF is an ensemble approach to improve on the basic CART. With a random sample of input features, the predictions made by each of the trees are more independent. As a result, averaging over a large number of independent trees leads to a great improvement in prediction accuracy. Working with a subsample of a smaller number of features, the RF model is capable of handling high-dimensional data as well as correlated features. When two features are highly correlated, either one can be used as the splitting feature. However, the two features would not split the data in exactly the same way given the hierarchical structure of the regression trees. Without imposing any parametric functional form, an RF model is very flexible in accommodating highly localized features of the data. In addition, averaging over the out-of-bag observations significantly reduces the risk of over-fitting. There are three major tuning parameters: 1) the terminal node size for a single tree, 2) the number of trees, and 3) the proportion of predictors to be sampled.
These tuning parameters can be chosen following a cross-validation scheme similar to that described in the previous subsection. Following conventional settings, a value of 5 was used for the terminal node size, the number of trees was set to 10,000, and the proportion of features selected in each subsample was set to 1/3 in the exemplary analysis. [0062] The boosting method assigns more weight in the fitting procedure to observations responsible for highly localized features. After a large number of iterations, by giving relatively more weight to the difficult-to-fit observations, one can combine the predictions from each iteration in a sensible way to reduce over-fitting. In the present example, a specific approach called stochastic gradient boosting was used to estimate the complex regression function in model (1). The minimum number of observations in a terminal node was set to 1, all observations were sampled, and a shrinkage parameter of 0.02 was employed. The interaction depth ranged from 5 to 15 based on five-fold cross-validation, with the total number of iterations capped at 1,000. [0063] The input features in model (1) can be grouped into three categories: endogenous temporal features, exogenous temporal features, and spatial features. The endogenous temporal features are the lagged values of the response variable at each neighboring location, which are included to capture the temporal effects of the response variable at that location on the target location. Using the endogenous features alone without the other two types of input features, the model reduces to a traditional time series model if the target location is also the observation location. The exogenous temporal features are extra (idiosyncratic) variables other than the response variable that are collected at each location up to certain lags into the past, because the target weather variable may be influenced by many other weather variables as well.
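The two learners with the hyperparameters reported above might be configured as follows in scikit-learn. This is a sketch under the assumption of scikit-learn's parameterization; the source does not name the software, and other packages (e.g., R's randomForest and gbm) use different argument names.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Random forest with the reported tuning-parameter values.
rf = RandomForestRegressor(
    n_estimators=10_000,   # number of trees
    min_samples_leaf=5,    # terminal node size
    max_features=1 / 3,    # proportion of predictors sampled per split
)

# Stochastic gradient boosting with the reported settings.
gb = GradientBoostingRegressor(
    n_estimators=1_000,    # total iterations capped at 1,000
    learning_rate=0.02,    # shrinkage parameter
    min_samples_leaf=1,    # minimum observations in a terminal node
    subsample=1.0,         # sample all observations
    max_depth=5,           # interaction depth (tuned over 5 to 15 by CV)
)
```

Both objects would then be fitted on the (X, y) training matrices assembled from the PWS data.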
[0064] The endogenous temporal features are the same as the response variable, which depends on the specific research interest. In the present example, temperature was used as the response variable and as the endogenous feature because it plays a determinant role in building energy simulation. The choices of exogenous temporal features depend on the availability of the data collected from the PWSs. In the present example, the variables denoted by "input" in Table 2 were used as the initial set of exogenous features since they are collected at each PWS. It is also conceivable that not all the exogenous temporal features are indeed useful in predicting the response variable at a future time. Including irrelevant features not only introduces additional biases in the model estimation but also significantly increases the unnecessary computational burden. Hence, a practical variable selection procedure may be used to identify the set of relevant features. More details about the variable selection procedure are provided herein. Table 2 - Model response and input features in the case study [0065] The spatial features contain information about the spatial relevance of each location to the target location. They are included to capture the spatial impact on the target location from the surrounding locations. A common choice of spatial feature is the geo-distance between the two locations. However, in the present example, the relative differences in latitude and longitude were used to allow the machine learning models to capture more complex spatial relationships than the linear relationship induced by the geo-distance. Note that the latitude and longitude differences provide more information than the geo-distance because, for any two locations s_i and s, the geo-distance can always be computed from those differences, but not vice versa. [0066] Besides choosing a specific model form, the parameters F, L, and R need to be determined corresponding to the chosen model. The choice of F solely depends on when the future weather forecast is needed.
For example, with a 15-minute model resolution, the predictive model for making a one-day-ahead forecast corresponds to F = 96. [0067] The value of L determines how far back into the past the input features fed to the model extend. For example, with a 15-minute model frequency, setting L = 192 allows the model to use historical weather variables up to two days prior to the present time when the forecast is made. The appropriate value for L can be determined by carefully assessing the autocorrelation function (ACF) plot to see how strong the temporal association is among the weather variables. Alternatively, one can adopt a cross-validation approach to choose the optimal value of L that minimizes the prediction error. In this way, the selected optimal value can shed light on the temporal dependence of the response variable on the input weather variables. [0068] The parameter R restricts the model structure to be the same within the neighborhood of the target location. In general, a larger value of R allows the model to use more data from the neighboring PWSs in training but imposes a more stringent assumption on the model. Similar to selecting the optimal value of L, one can follow a cross-validation approach to select the optimal value of R and thereby gain insight into the spatial dependence of the response variable on the input weather variables. Details on how the selection of optimal values for L and R is implemented are provided herein. [0069] The spatial-temporal model equipped with random forest (RF), using weather data collected by PWSs located in the city of San Antonio, Texas, was used to demonstrate the implementation of platform 10. Temperature was selected as the response variable as it is the most important weather input for building energy simulation. The forecasting accuracy was compared with existing benchmark methods to illustrate the usefulness of the model.
[0070] Data were obtained from the Weather Underground (WU) Rapid Historical Observations API service, which supplies high-frequency (roughly 5-minute interval) weather information collected by PWSs over the city of San Antonio, TX. The available fields for WU API data collection are known in the art. For a typical day, on average, there are around 300 PWSs reporting weather data to WU on a voluntary basis. Since the data reporting process from PWSs is voluntary and availability also depends on the functional and network status of the device, not all the listed fields are reported at all time points by all PWSs. Therefore, the number of available stations at the time the data are collected may vary from one time point to another. In addition, some stations contain missing values for some data fields during the collection period for various reasons, such as connectivity issues. A typical imputation approach was used to impute the missing values through interpolation using adjacent available values in time and location. There are also measurement errors and outliers in the collected weather data. To ensure model accuracy and reduce noise in the training data, each observation was assessed to filter out erroneous data values or impute them with more reasonable values. A detailed description of how the data were collected and prepared is provided herein. [0071] To demonstrate the advantages of the proposed model for platform 10, its performance was compared to some commonly used weather forecasting methods in the literature. Typically, when crowdsourced data are not available, temperatures from public weather sources (e.g., airports) are used. One limitation of using the airport temperature is that most available airport temperature data are at hourly or coarser frequency, which may miss rapidly changing patterns of the temperature within an hour.
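The interpolation-based imputation described in paragraph [0070] could be sketched as follows with pandas. This is a minimal illustration: the fallback to a row-wise mean across stations is a crude stand-in for "adjacent in location", not the source's exact procedure, and the function name is hypothetical.

```python
import numpy as np
import pandas as pd

def impute_pws(df):
    # df: time-indexed DataFrame of one weather field, one column per PWS.
    # First interpolate each station's gaps along the time axis (also
    # extending to leading/trailing gaps), then fill any column that is
    # still missing from the mean of the other stations at that time.
    out = df.interpolate(method="time", limit_direction="both")
    return out.apply(lambda col: col.fillna(out.mean(axis=1)))
```

Downstream filtering of outliers and erroneous values would follow this step before the training matrices are assembled.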
More importantly, airports are often located in suburban areas far away from the target building, where the temperature may differ from that at the target location. In this example, the temperature forecasts at the airport provided by NOAA predict temperatures from 1-hr-ahead up to 24-hr-ahead depending on the forecast horizon. Details of the NOAA forecast data are provided herein. [0072] Persistence models are also widely used in the literature as a "model-free" approach to forecasting the future temperature. One of the simplest persistence models is to assume that the forecasted temperatures are the same as the temperatures 24 hours earlier at the target location. Thus, the persistence model could also be called the "Like-Yesterday Model" in this study. In this example, if one observes the temperature 24 hours before time t + F at a given PWS, one can take that value directly as the forecast for time t + F at that PWS. However, the proposed framework for platform 10 is for making a forecast at an arbitrary location, which may not necessarily be a location where historical observations are available. Hence, the traditional persistence model may be modified using a similar approach to that discussed herein; that is, the local forecasts from each PWS are combined using the persistence model (6). Note that, if on-site historical observations are available for the target location, one can directly use the persistence model forecast. [0073] To evaluate forecasting accuracy and compare different methods, it is possible to use the mean absolute percentage error (MAPE): MAPE = (100/n) * sum from i = 1 to n of |(y_i - yhat_i) / y_i|, (7) where the yhat_i's are the forecasted values, the y_i's are the observed values of the response variable, and n is the total number of evaluated data points. [0074] In this example, the forecast resolution was set to every 15 minutes.
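The MAPE of equation (7) and the "Like-Yesterday" persistence baseline can be sketched as follows (function names are illustrative; time is in 15-minute ticks, so 96 steps correspond to 24 hours):

```python
import numpy as np

def mape(y_hat, y):
    # Mean absolute percentage error of equation (7), in percent.
    y_hat = np.asarray(y_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

def like_yesterday(obs, t, F, steps_per_day=96):
    # Persistence ("Like-Yesterday") forecast: the forecast for time
    # t + F is simply the observation 24 hours before t + F.
    return obs[t + F - steps_per_day]
```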
Following the model framework and implementation procedure described above, the optimal model parameters were identified and the best subset of input features selected for a selected period. Then, the optimal model was used to make temperature forecasts at an arbitrarily located building, the San Antonio Technology Center (SATC), where the temperature is measured at a 15-minute frequency when available. [0075] In practice, historical observations may not be available at the target location. In such a case, it is possible to identify the optimal model parameters and select the best subset of input features based on the weather variables collected at surrounding PWSs, such that the trained model captures the characteristics of the target station according to the weather information from the closest neighboring PWSs. Therefore, to preserve generality, the closest PWS to SATC was selected according to station availability in different seasons and treated as the target location for model parameter selection. In this way, it is possible to identify the optimal parameters and the set of input features that produce the best out-of-sample forecast accuracy for the target station. If historical data are available at the target location, the target location can be viewed as a PWS so that more direct information can be utilized. With the identified optimal parameters, the model performance may be evaluated during selected 8-day periods of different seasons, which are summarized in Table 3 in chronological order of when the data were collected. Table 3 - Selected periods and target stations at different seasons [0076] As noted above, the local forecast model structure depends on the size of the neighborhood of PWSs surrounding the target location and on the temporal lags of the weather input features that are predictive of the future temperature.
As part of the optimal parameter identification process, different choices of rolling window size were considered to see how much additional data are needed to improve forecasting accuracy and to evaluate the persistence of the model structure. [0077] Referring to FIGS.6A through 6D, the performance of the RF model during the evaluation periods of different seasons using different values of the model parameters is shown. The left, middle, and right plots present the distribution of MAPE for different values of R, L, and W, respectively, over different choices of the other model parameters. For example, the left plot presents the average MAPE over different choices of L and W given different values of R. Each boxplot represents the conditional distribution of MAPE for a given value of a single model parameter over the other model parameters. [0078] In FIGS.6A through 6D, each season is annotated with the optimal parameters R, L, and W in parentheses, selected according to the lowest MAPE across the other parameters. The optimal value of R suggests that the effective size of the neighborhood is 5 to 15 PWSs around the target location for forecasting the future temperature. The evident improvements indicate that including crowdsourced data around the target location is useful compared with using only the on-site weather data at a single location. Temporal lags of the input features up to around 4 days prior to the time when the forecasts are made are sufficient to build a reliable forecasting model. Finally, the most significant gains in MAPE are observed when the rolling window is 18 or 24 hours, indicating that the model structure and parameters roughly persist over a 24-hour time span. With the computation time in mind, W = 72 or 96 was chosen for the forecasting model at a 15-minute frequency. [0079] Once the optimal parameters are identified, a forward-selection procedure is used to obtain the optimal set of model input features.
To implement the procedure, one begins by building the forecasting models using a single weather variable. The single feature that performs best according to the average MAPE is selected, and an additional feature to pair with the selected feature is added in the second step to train two-feature forecasting models. The second weather input feature is then selected by choosing the feature that produces the best performance when paired with the feature selected in the first step. This process is carried out until all features have been sequentially included in the model. In FIG.7, the order of the features is shown along the horizontal axis, and it may be seen that different sets of weather variables are included in the optimal set for different seasons. However, tempAvg and dLatdLon appear to be the most important predictive features in all seasons, which means that the temporal and spatial information is crucial most of the time, while dewptAvg, humidityAvg, windgustAvg, precipRate, pressureTrend, pressureMax, precipTotal, windchillAvg, and heatindexAvg are also found useful for further improving the forecasting accuracy depending on the season. If the model performs as well as with the full set of input features, then fewer inputs can be used for computational efficiency. Ultimately, the first few features up to the red-marked variable in FIG.7 constitute the selected optimal feature set for the temperature forecast model. [0080] Using the model with optimal parameters, the performance is first validated by making weather forecasts for the same target PWSs mentioned in Table 3, but treated as arbitrary locations. This means that the actual weather information of the target PWSs is treated as unknown, so that the first station included among the neighboring stations according to the parameter R is the closest PWS instead of the target PWS itself. Since a unique set of optimal parameters has been settled for each season, the computing speed is much faster than during the model validation process.
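The forward-selection loop described above can be sketched generically. Here score_fn is a stand-in for training and evaluating the forecasting model on a candidate feature set and returning its average MAPE; the function names are illustrative, not from the source.

```python
def forward_select(features, score_fn):
    # Greedy forward selection: start from the single best feature by the
    # error returned by score_fn, then repeatedly add the feature that
    # performs best when paired with those already chosen, until all
    # features have been included. Returns the inclusion order together
    # with the error at each step, mirroring the horizontal axis of FIG.7.
    selected, order = [], []
    remaining = list(features)
    while remaining:
        best = min(remaining, key=lambda f: score_fn(tuple(selected + [f])))
        selected.append(best)
        remaining.remove(best)
        order.append((best, score_fn(tuple(selected))))
    return order
```

The optimal feature set is then the prefix of this ordering at which the error stops improving materially.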
Thus, the forecasts were evaluated over a longer period, as shown in Table 4, to assess the assumption that the model holds in each season with the corresponding optimal parameters and feature sets selected in FIGS.6A through 6D and FIG.7. Using the airport temperature forecasts and the persistence model described above as benchmark models, the average MAPE of the RF model using the proposed framework during the evaluation periods of different seasons is shown in FIG.8. The average MAPE is calculated for different forecasting horizons up to 24-hour-ahead (F = 96). Table 4 - Selected forecast periods for arbitrary stations at different seasons [0081] Each line in FIG.8 represents the MAPE performance of its corresponding model at different forecasting horizons F. RF_NOAA represents the random forest model with the NOAA airport forecast as a feature input, and RFNOAA_EnsembleNOAA is the random forest model with NOAA as a feature input, further ensembled with the NOAA airport forecast. The notation works similarly for the gradient boosting method. Compared with the persistence model (green dashed line), the proposed model demonstrates evident advantages for most values of F, as the majority of each line is below the green dashed line. For forecasts where F is large and close to one-day-ahead, the proposed model performs similarly to the persistence model. This is expected due to the daily seasonality of the temperature, as the persistence model always takes yesterday's temperature. Comparing the overall average MAPE, the proposed model performs better in summer and fall than in spring and winter. This is due to the stability of the temperature in San Antonio during the evaluation periods. Hence, the time series of the forecast and the observed temperature were inspected at the target PWSs during each season.
[0082] In terms of different machine learning methods, random forest slightly outperforms the gradient boosting model, with or without the NOAA airport forecast features, for most forecasting horizons F through all four seasons. For instance, the light orange line, RF_NOAA, is below the light blue line corresponding to GB_NOAA for most values of F in summer, except for F greater than 80. [0083] In fact, these target PWSs are located about 5 miles from the airport, where the actual temperature may differ only slightly from the temperature at the target PWSs. Since the airport forecasts are retrieved from NOAA, they come from a relatively mature, high-performance model. The model framework is significantly improved when the airport forecasts are included as model inputs for all seasons, and by averaging with the airport forecasts, the accuracy can be further improved, especially for larger forecasting horizons F. Although the model did not demonstrate an advantage over the NOAA airport forecasts here, the scenario where the model outperforms the NOAA airport forecast is discussed herein. [0084] FIGS.9A and 9B show time series plots of the observed temperature and the forecast temperatures from different models at different forecast horizons F (F = 1, 48, and 96) in all four seasons. The "Airport" here is the NOAA airport forecast mentioned above. The temperatures in San Antonio in summer are more stable than in other seasons. The selected periods in all other seasons include abnormal days on which the daily seasonal patterns are disrupted. This caused the deterioration of the overall model performance, especially for spring and winter, as shown in FIG.8. Nevertheless, the proposed model still consistently outperforms the persistence models for any F before 16, and the model is even better when more accurate airport forecast information is included.
The average MAPE at different look-ahead times for the different models is reported in Table 5, and the lowest MAPE for each selected F is highlighted in bold. [0085] Based on Table 5, the proposed model framework is able to outperform the persistence model by a large margin. For example, the mean MAPE of the persistence model is 12.21 across all forecasting horizons F in Table 5, while that of RFNOAA_EnsembleNOAA is 4.21, which implies a 65% improvement. In addition, similar to the findings in FIG.8, the proposed model framework demonstrates an advantage in short-term forecasts, where F is smaller than 24 (equivalent to six hours), as most bold numbers come from the proposed model framework when F is small. For instance, in winter the MAPE of RF is 3.44 versus 7.12 for the NOAA airport forecast, which represents a 51.68% improvement. The improvement in short-term forecasting compared with the benchmark NOAA airport forecast is significant for building control load forecasting, as the system mostly uses temperature forecasting results ranging from F = 1 (15 minutes ahead) to F = 24 (six hours ahead) as inputs. In general, the proposed model framework performs best in summer, fall, and winter in terms of mean MAPE, but not in spring. The reason can be found in FIGS.9A and 9B, where an abnormal temperature pattern occurring in spring may adversely affect the forecasting accuracy. Table 5 - Model performance (MAPE) at selected look-ahead windows
[0086] It is often of interest in building energy simulation to know during which hours of the day the weather forecasts are more reliable than others. For each season, two plots are presented in FIGS.10A through 10D. The top-panel plots compare the distribution of MAPE at different hours of the day for the three models. They reveal that the proposed RF model is much more stable (with a smaller range in the boxplots) than the airport forecast and the persistence model. To further assess the gain in forecast accuracy of the proposed model, the differences in MAPE are plotted in the bottom panel by subtracting the MAPE of the proposed model from the MAPE of the airport forecast and the persistence model at the corresponding look-ahead period. If the difference is above the 0-reference line, the proposed model demonstrates an absolute advantage at that look-ahead forecast horizon. As a result, a commonly observed pattern is that the proposed model consistently outperforms the benchmark models in almost all seasons, except for winter, where the proposed model may slightly under-perform from midnight to sunrise. The improvements are more significant in summer than in other seasons. [0087] Up to this point, the model performance for the target PWSs was evaluated as if each were an arbitrary station with unknown weather information. Since the goal is to forecast the weather at the SATC location, as mentioned above when selecting the target PWSs near the SATC, the model performance should be evaluated at the actual SATC location. However, due to data availability, only observed temperatures for the SATC building in 2019 were available to calculate performance accuracy. Hence, the model framework was also evaluated under the assumption that it holds annually for the same seasons, such that the same optimal parameters and feature sets could be applied to forecast temperatures in 2019.
Since this project started collecting PWS temperatures in the summer of 2019, with relatively less availability than the 2020 data collection, it is only possible to evaluate the SATC forecasts for summer, fall, and winter of 2019; the forecasting period for SATC is shown in Table 6. [0088] The forecast performance for the SATC location in 2019 is seen in FIG.11. Similarly, the temperature in summer is relatively stable, and the model performance is improved by adding the airport forecast information as part of the input. Table 6 - Selected forecast periods for SATC in 2019 [0089] However, according to FIG.12, when there is a sudden change of temperature, such as the large temperature drop after October 11th in fall, the model's performance is significantly impacted. A similar event happens between December 8th and December 11th in winter. This explains why the model underperforms the NOAA airport forecasts in fall and winter. [0090] Even though the airport forecasts provided by NOAA are generated by a high-performance model, a single airport forecast would be too broad to represent the temperatures at every station in the city. In this example, the target locations are relatively close to the airport, and thus the models have not shown a significant advantage from the crowdsourced weather data. Thus, the model implementation was extended to more target PWSs as arbitrary locations, and the forecast performance was evaluated at various locations within the city to compare with the airport forecasts. 50 PWSs scattered around the San Antonio area were selected; these are always available for retrieving observed temperatures for the sake of calculating the forecast accuracy. The rest of the settings are consistent with the study discussed above for the four seasons. FIG.13 is a map of the San Antonio area marked with the 50 selected target PWSs. The airport and SATC locations are also labeled with blue icons.
[0091] During the implementation for each target PWS, all forecast accuracies are calculated with MAPE. From a large-scale point of view, since it is desired to analyze the overall advantage of the model framework compared with the airport forecasts, the number of target PWSs that outperform the airport forecasts under the various model methods is counted in FIG.14. When a line is above 50, it implies that more than half of the locations outperform the NOAA airport forecast with that model in that season, and vice versa. It is also clear that even the classic random forest (RF – the first plot on the left) can beat the NOAA airport forecast at most locations through all four seasons when F is smaller than eight. Evolving the proposed model framework from weather inputs alone, to adding airport forecasts as inputs (RF_NOAA), and later to ensembling with the airport forecasts (RFNOAA_EnsembleNOAA), the percentage of target PWSs whose forecast accuracy outperforms the airport forecasts increases when F is greater than eight. In this case, including a promising external forecast in the model framework significantly improves the forecast performance, and the built-in parameters in the framework can further exploit the spatial and temporal advantages of the crowdsourced data from nearby PWSs. [0092] The model demonstrates clear advantages over forecasts made using the airport temperature and the persistence model. Generally speaking, the model framework equipped with random forest can improve forecasting accuracy by 50% compared with the persistent model based on Table 5. Additionally, according to FIG.8 and FIG.11, the model framework equipped with random forest has approximately a 90% chance of beating the airport forecast in short-term forecasts (F smaller than eight – two hours ahead) at any arbitrary location in San Antonio and Syracuse through the four seasons.
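The counting behind FIG.14 can be sketched as follows: for each look-ahead horizon F, count how many target PWSs achieve a lower MAPE than the airport forecast. The station names and MAPE values below are hypothetical placeholders, not figures from the study.

```python
# Sketch of the FIG.14 counting logic: per horizon F, count stations
# where the model's MAPE beats (is lower than) the benchmark's MAPE.

def stations_beating_benchmark(model_mape, benchmark_mape):
    """model_mape/benchmark_mape: {station: {F: mape}}; returns {F: count}."""
    counts = {}
    for station, by_horizon in model_mape.items():
        for F, m in by_horizon.items():
            if m < benchmark_mape[station][F]:
                counts[F] = counts.get(F, 0) + 1
    return counts

# Hypothetical MAPE tables for two stations at horizons F = 1 and F = 8.
model = {"PWS1": {1: 2.0, 8: 3.5}, "PWS2": {1: 2.5, 8: 5.0}}
airport = {"PWS1": {1: 4.0, 8: 4.0}, "PWS2": {1: 4.1, 8: 4.2}}
print(stations_beating_benchmark(model, airport))  # {1: 2, 8: 1}
```

With 50 target PWSs, a count above 25 at a given F corresponds to a line above the 50% level in FIG.14.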
Including the airport forecasts as model inputs and ensembling the model's forecasts with the airport forecasts can further improve the model's performance. In a real-time setting, the proposed model framework is able to provide more accurate temperature forecasting results for the CoE building compared with using the airport temperature forecast for most forecast horizons F. For instance, based on FIG.8, the model using the random forest machine learning method, including the NOAA airport forecast as a feature input and ensembling with the NOAA airport forecast, outperforms the airport forecast all the way until F = 72 (18 hours). A 2-day temporal lag is found sufficient for capturing the association between the future temperature and historical weather data. When studying the effect of a rolling window, it was found that the model structure can hold for as long as eight weeks. Finally, tempAvg, humidityAvg, precipTotal, windchillAvg, pressureMin, dewptAvg, precipRate and pressureMax are found to be the most important features in predicting future temperature, and having additional features may negatively influence the model performance. [0093] FIG.15 shows the steps followed to prepare clean data for modeling. The first issue with the data collected from PWSs is the misalignment of time-ticks. Since the data from individual PWSs are uploaded to the server through the local network using different devices, there is a large amount of variation in the final time-ticks at which the data are recorded to the server, although WU requires each PWS to report the weather data at a 5-minute frequency. [0094] Since the forecasting aims at a 5-minute frequency, which has a total of 288 time-ticks per day (from 00:00:00 to 23:55:00), each reported time-tick was matched to the nearest whole 5-minute mark. For example, an observation recorded at 00:09:58 am is matched to the 00:10:00 am mark.
If the raw time-tick is exactly in the middle of two whole 5-minute time-ticks (e.g., 03:02:30 am), it is rounded to the previous exact 5-minute mark (e.g., 03:00:00 am). If two raw time-ticks (e.g., 03:02:36 am and 03:07:17 am) are rounded to the same exact 5-minute mark, the average of the two observations is recorded for the matching 5-minute mark. Table 7 - Example of time-tick matching [0095] There are two scenarios to be considered when dealing with missing values in the data. First, a check is made whether the data are completely missing or missing in a significant portion (e.g., over 70% of the data are missing). In this case, the quality or reliability of the data for that date may be questionable. Hence, it was possible to impute the missing data, or to replace the small portion of observed data (which might be of poor quality), using the complete data from the dates closest to the missing date. Denote the data to be imputed at time t on a certain date d by x_t^(d), the complete data at time t on date d- by x_t^(d-), where d- is the most recent date before d for which complete data is available, and the complete data at time t on date d+ by x_t^(d+), where d+ is the most recent date after d for which complete data is available. The missing data at time t on date d is then imputed by interpolating between x_t^(d-) and x_t^(d+). After imputing the dates with completely or largely missing values, the missing values at a local scope are dealt with next. At this stage, the missing values may occur at different weather stations and at different time-ticks. Since the imputation at this stage focuses on a specific PWS, denote the data at time t on a certain date d for the k-th PWS by x_t^(d,k). A missing value is then imputed from the nearest time-tick t' on the same date for which x_t'^(d,k) is not missing. [0096] Finally, the collected data may contain some abnormal values for weather variables (e.g., for temperature) due to the defectiveness of the device or other issues.
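The time-tick matching rules of paragraph [0094] and above can be sketched in a few lines. This is a minimal illustration: the function names and the seconds-since-midnight representation are choices made here for clarity, not part of the original procedure.

```python
# Sketch of time-tick matching: snap each raw timestamp to the nearest
# whole 5-minute mark, round exact midpoints down to the previous mark,
# and average observations that map to the same mark.

def snap_to_5min(seconds):
    """Snap a time (seconds since midnight) to the nearest 5-minute mark."""
    step = 300  # 5 minutes
    lower = (seconds // step) * step
    remainder = seconds - lower
    # an exact midpoint (150 s) rounds down to the previous mark
    return lower if remainder <= step // 2 else lower + step

def match_observations(records):
    """records: list of (seconds, value); returns {mark: mean value}."""
    buckets = {}
    for t, v in records:
        buckets.setdefault(snap_to_5min(t), []).append(v)
    return {mark: sum(vs) / len(vs) for mark, vs in sorted(buckets.items())}

# 00:09:58 -> 00:10:00, and the midpoint 03:02:30 -> 03:00:00
assert snap_to_5min(9 * 60 + 58) == 10 * 60
assert snap_to_5min(3 * 3600 + 150) == 3 * 3600
```

For example, raw ticks at 03:02:36 and 03:07:17 both snap to 03:05:00, so their values are averaged for that mark, matching the worked example in the text.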
Hence, in order to avoid the impact of these erroneous values in the modeling process, the tail 2.5% of values of each weather variable are truncated. That is, for any weather variable x, if x falls below the empirical 2.5th percentile it is set to that percentile, and if x exceeds the empirical 97.5th percentile it is set to that percentile, where the empirical 2.5th and 97.5th percentiles of the weather variable are computed at time t on date d over all PWSs where the data are not missing. [0097] The data collected through the personal weather stations are uniformly close to each other within a certain range due to the nature of weather. For instance, FIG.16 shows the CoE building in Syracuse (red star icon) with the closest PWS to CoE (yellow flag icon). Blue points represent the 20 PWSs closest to the one that is closest to CoE, with an average distance of 3.12 miles. [0098] Table 8 shows the Pearson correlation between each station and the PWS closest to CoE in the winter period. Each column represents a PWS (R1, R2, …, R15 is the ranking in terms of geographic distance to the closest PWS) and each row is a different forecast time horizon, with F = 0 (true temperature), F = 8 (two hours ahead), F = 48 (12 hours ahead), and F = 96 (24 hours ahead). As the table shows, the recorded temperatures of the PWSs are highly correlated with each other, as the correlation values are all close to one in the first row. This may adversely affect a forecasting model using data from the PWSs, because temperature data from additional PWSs will not provide more information about the temperature in that area. Table 8. Correlation table [0099] Besides collecting weather data from PWSs, historical and real-time weather forecasting data was collected from NOAA as the model's benchmark and as feature inputs.
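The two-sided truncation in paragraph [0096] amounts to winsorizing each variable at its empirical 2.5th and 97.5th percentiles. The sketch below uses a linear-interpolation percentile estimator, which is an assumption made here for illustration; the text does not specify the estimator, and the temperature values are hypothetical.

```python
# Sketch of tail truncation (winsorization) at the empirical 2.5th and
# 97.5th percentiles. Percentile estimator (linear interpolation) is an
# assumption for this illustration.

def percentile(sorted_vals, p):
    """Empirical percentile (0-100) with linear interpolation."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = int(k), min(int(k) + 1, len(sorted_vals) - 1)
    frac = k - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def truncate_tails(values, low_p=2.5, high_p=97.5):
    s = sorted(values)
    lo, hi = percentile(s, low_p), percentile(s, high_p)
    return [min(max(v, lo), hi) for v in values]

# Two clearly defective readings among otherwise plausible temperatures.
temps = [-40.0] + [float(t) for t in range(60, 80)] + [150.0]
clipped = truncate_tails(temps)
assert min(clipped) > -40.0 and max(clipped) < 150.0
```

Values in the interior of the distribution pass through unchanged; only the extreme tails are pulled in toward the percentile bounds.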
NOAA’s weather forecasting data is obtained from its High-Resolution Rapid Refresh (HRRR) model, which provides detailed forecasts of weather variables such as temperature in the contiguous United States for the next 18 hours (f = 00, 01, …, 17, 18) at 15-minute resolution; the HRRR model runs on an hourly cycle at 00Z, 01Z, …, 22Z, 23Z (coordinated universal time). In this section, the step-by-step process of preparing historical weather forecasting data from NOAA’s HRRR model is explained. [00100] The first step is to download the data files needed for the target period. NOAA stores its weather forecasting data generated by the HRRR model in the grib2 file format, a binary format containing gridded meteorological data. Each individual grib2 file can be retrieved through a curl command from NOAA’s Amazon AWS cloud source. For instance, the following command can be used to download the historical weather forecasting data for the next one hour (f = 01) at 1 a.m. (t = 01Z) on 01/01/2020 in the contiguous United States: curl https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20200101/conus/hrrr.t01z.wrfsubhf01.grib2 --output 20200101_hrrr.t01z.wrfsubhf01.grib2 (Note: use the cd command to change the directory to the desired folder for storing data before downloading.) [00101] The next step is to configure wgrib2, which is commonly used for retrieving data from grib2 files. To download wgrib2, it is possible to use the link << ftp://ftp.cpc.ncep.noaa.gov/wd51we/wgrib2/Windows10/>> (open the link through Internet Explorer) and then copy the (version) / *.dll and (version) / wgrib2.exe files to the directory where the grib2 files are stored. [00102] After configuring wgrib2 in the data directory, weather forecasting data can be retrieved for a target location, given its latitude and longitude, from the grib2 file.
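The file-naming pattern in the curl example above (date folder, cycle hour tHHz, sub-hourly forecast file wrfsubhfFF) can be generated programmatically when downloading many files. This sketch only constructs the URL string following the pattern shown in the text; it performs no network access, and the helper name is a choice made here.

```python
# Sketch: build an HRRR grib2 URL from date, cycle hour, and forecast
# hour, following the naming pattern of the curl example above.

def hrrr_url(date_yyyymmdd, cycle_hour, forecast_hour):
    base = "https://noaa-hrrr-bdp-pds.s3.amazonaws.com"
    return (f"{base}/hrrr.{date_yyyymmdd}/conus/"
            f"hrrr.t{cycle_hour:02d}z.wrfsubhf{forecast_hour:02d}.grib2")

# Same file as the curl example: f = 01 at cycle t = 01Z on 01/01/2020.
print(hrrr_url("20200101", 1, 1))
```

Iterating the cycle hour over 00–23 and the forecast hour over the desired horizons yields the full set of files for a target period.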
For instance, one can use the following code to generate the weather forecasting data around San Antonio International Airport (29.5312°N, 98.4683°W) for the next one hour (f = 01) at 1 a.m. (t = 01Z) on 01/01/2020: wgrib2 Data_Dir\20200101_hrrr.t01z.wrfsubhf01.grib2 -undefine out-box -98.48:-98.44 29.43:29.63 -csv Data_Dir\subset_20200101_hrrr.t01z.wrfsubhf01.csv [00103] The code above draws a rectangle centered around San Antonio International Airport (red plane marker), shown in FIG.17. All weather forecasting data from locations represented by grid points (black dots on the map) within this rectangle will be retrieved from the grib2 file. [00104] Table 9 is a data dictionary for the NOAA data; some variables are measured at different levels, e.g., temperature is available at the surface level, at 2 meters above ground, and at the 500 mb, 700 mb, 850 mb, 925 mb, and 1000 mb pressure levels. In the present invention, data is collected from 2 meters above the ground. Table 9 - NOAA data dictionary
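The bounding-box subsetting that the wgrib2 command above performs can be illustrated as a simple filter over grid points. The grid coordinates below are hypothetical; this sketch only mimics the keep-inside-the-box behavior, not wgrib2's grib2 decoding.

```python
# Sketch of bounding-box subsetting: keep only grid points whose
# (longitude, latitude) fall inside the lon/lat rectangle used above.

def in_box(lon, lat, lon_min, lon_max, lat_min, lat_max):
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max

# Hypothetical grid points (lon, lat); box matches the wgrib2 example.
grid = [(-98.46, 29.53), (-98.40, 29.50), (-98.47, 29.60)]
box = (-98.48, -98.44, 29.43, 29.63)
kept = [p for p in grid if in_box(*p, *box)]
print(kept)  # keeps the first and third points
```

Only points inside the rectangle around the airport survive the filter, corresponding to the black dots retained in FIG.17.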
EXAMPLE 2 [00105] In this example, the results of implementing the models in other cities (Syracuse and Chicago) that have different climate types from San Antonio are shown. Following a process similar to that above, the PWSs closest to CoE (the target building in Syracuse) that are also always available in each season were chosen for the process of selecting optimal parameters. Table 10 summarizes the PWSs that were selected. KNYHOMER2 is located 0.73 miles away from CoE, and the distance between KNYSYRAC37 and CoE is 0.25 miles. Table 10. Selected periods and target stations at different seasons [00106] FIG.18 is a map of Syracuse with the selected PWSs. The orange flag icon represents PWS KNYHOMER2 and the red flag icon is KNYSYRAC37. The red star marks the CoE building, and the black solid circles are other PWSs in the surrounding area. [00107] After finalizing the PWSs for model validation purposes, the optimal parameters were identified for each season in Syracuse following the procedures above. Table 11 displays the optimal parameters selected for each season in Syracuse. Note that a feature selection process was only conducted in spring, as it was assumed that the optimal feature set stays unchanged in all other seasons. Table 11. Selected optimal parameters and features in different seasons [00108] FIG.19 shows the final performance of the various models with optimal parameters in the selected periods of all four seasons. Each line represents the MAPE of its corresponding model through the different forecasting horizons F. The green dashed line is the performance of the persistent model. In general, all models perform better in summer and fall than in spring and winter. On top of that, all models outperform the persistent model, as the majority of the solid lines are below the green dashed line. The performance of the NOAA airport forecast (dark blue) stays almost constant through all forecasting horizon points.
Random forest and gradient boosting machine learning algorithms produce similar results with or without using the NOAA ensemble features before F = 72. Gradient boosting slightly outperforms random forest when F is greater than 72. Finally, the performance of both random forest and gradient boosting is significantly improved after using the NOAA forecasts as input features, and the performance of both models is further improved by ensembling, e.g., line RF_NOAA (light orange) is below line RF (dark red) and line RFNOAA_EnsembleNOAA (orange) is below RF_NOAA. [00109] Moreover, the NOAA airport forecasts usually outperform both the random forest and gradient boosting models in long-term forecasts (F > 24). For instance, NOAA has the lowest MAPE among all models in spring when F is greater than eight, based on Table 12. Meanwhile, the model shows an advantage against NOAA in short-term forecasts (F < 24), as most bold values in that range come from the model. The MAPE of RF is 2.03 versus 4.45 for NOAA, which implies that RF outperforms NOAA by 54.36%. NOAA beats the models in spring and winter in terms of mean, and RFNOAA_EnsembleNOAA is better in summer and fall. Table 12 - Model performance (MAPE) at selected look-ahead windows [00110] FIG.20 is the time series plot of the observed temperature from the target station as well as the forecast temperatures (persistent, NOAA airport, and random forest ensembled with NOAA) at different forecasting horizons. It can be observed that the temperature changes in summer and fall follow a strong daily pattern, while there are no clear patterns in spring and winter. This explains the finding in FIG.19 and Table 12 that the model performs better in summer and fall but not in spring and winter. The model framework integrates the temporal factors for forecasting by utilizing lagged features (L). The weaker the temperature pattern is, the harder it is to predict the future change of temperature based on the past.
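The lagged features (L) mentioned above can be sketched as follows. At 15-minute resolution a 2-day lag corresponds to L = 192 ticks (96 ticks per day), consistent with the horizon labeling used in this section (F = 96 is 24 hours). The construction below is a minimal single-variable illustration; in the actual framework the inputs include multiple weather features per tick.

```python
# Sketch of building lagged temporal features: each training sample's
# input is the previous L observations, and its target is the current
# observation. Temperature values below are hypothetical.

def lagged_features(series, L):
    """Return (X, y) where X[i] is the L values preceding y[i]."""
    X, y = [], []
    for i in range(L, len(series)):
        X.append(series[i - L:i])
        y.append(series[i])
    return X, y

temps = [60.0 + 0.1 * i for i in range(200)]  # hypothetical 15-min series
X, y = lagged_features(temps, 192)            # 2-day lag at 15-min ticks
assert len(X) == len(temps) - 192 and len(X[0]) == 192
```

A weak or irregular daily pattern, as in spring and winter, means these lagged inputs carry less predictive signal, which matches the seasonal performance differences reported above.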
In addition, the time-series lines from each model almost overlap with each other, except for the persistent model. However, the model deviates from the true observed temperature as F increases. The most obvious case is the figure in the top right corner, where F = 96 in spring: the orange line (temperature forecast from RF_EnsembleNOAA) deviates from the blue line (observed temperature from the personal weather station), while NOAA can still accurately predict the temperature, as there is no obvious deviation of the light blue line (NOAA airport) from the true observed temperature. [00111] A simulation of deploying the model in a real-time setting, with the CoE building as the target location during the period of 2021/10/04 – 2021/10/12, was also conducted. All parameters (R, W, and L) are set equal to the optimal parameters that were selected for the fall season in Syracuse in Table 11. FIG.21 shows the results of this simulation. It may be concluded that, in a real-time setting, the model framework can provide a more accurate forecast than the NOAA airport forecast for most forecasting horizons F with random forest including NOAA airport forecast inputs and further ensembled with NOAA (RFNOAA_EnsembleNOAA), and only loses the advantage over NOAA when F > 72, which is equivalent to an 18-hour look-ahead forecast. [00112] To further investigate how the rolling window (W) influences the model's performance, a test was conducted by increasing the rolling window size from W = 96 (one day) to W = 5376 (eight weeks). FIG.22 shows the results of these tests. From the dark red line (average MAPE over F) in the middle, one can see that increasing W improves the overall performance of both models. For instance, the MAPE of the random forest ensembled with the NOAA feature (RFNOAA) decreases from 2.74 with W = 96 to 2.29 with W = 5376. However, the improvement is marginal and comes with much longer computing time, because the size of the training data grows with W.
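The rolling-window experiment above can be sketched as follows: at each forecast time, the model is retrained only on the most recent W observations, so the training-set size (and hence training time) grows directly with W. The helper name and series values are illustrative choices, not from the original implementation.

```python
# Sketch of a rolling training window: keep only the W most recent
# observations ending just before index t. W = 96 is one day of
# 15-minute ticks; W = 5376 is eight weeks.

def rolling_window(series, t, W):
    """Return the up-to-W most recent observations before index t."""
    start = max(0, t - W)
    return series[start:t]

series = list(range(10000))  # hypothetical observation indices
assert len(rolling_window(series, 6000, 96)) == 96
assert len(rolling_window(series, 6000, 5376)) == 5376
```

This illustrates the trade-off noted in paragraph [00112]: a 56x larger window (96 to 5376) multiplies the training data accordingly, while the MAPE gain reported is only from 2.74 to 2.29.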
[00113] To test the general performance of the model across an entire city, ten stations were randomly chosen for each season in Syracuse as target stations. FIG.23 is a map with all the selected stations (red solid circles). [00114] FIG.24 shows the percentage of stations that beat the NOAA forecast in each season through the different forecasting horizon points F. The model demonstrates obvious advantages over the NOAA airport forecasts in the short term (F < 24, equivalent to a six-hour look-ahead forecast). For instance, the random forest model with the NOAA forecast as a feature input (RF_NOAA) is able to outperform the NOAA airport forecast at all locations in winter (blue line) at both F = 16 (four hours ahead) and F = 20 (five hours ahead). In addition, random forest with the NOAA airport forecast as a feature input and ensembled with the airport forecast (RFNOAA_EnsembleNOAA) can outperform the NOAA airport forecast at the majority of selected stations through all forecasting horizon points in all seasons except winter, as the spring (green line), summer (orange line), and fall (dark red line) lines are all above 50%. Although the NOAA airport forecast is able to provide accurate temperature forecasts for the CoE building, crowdsourced data shows its advantage over airport forecasts when used to predict at various locations that are far from the airport or have different landforms within the city. [00115] Similar to the above, ten stations were randomly selected around the city for each season in 2020, and the model's performance was then compared with the NOAA airport forecasts through the different forecast horizon points in each season. [00116] FIG.26 shows the results of the model implementation in Chicago. Although the model demonstrates advantages over the NOAA airport forecasts at various locations in Syracuse, that is not the case here. In the short-term forecast (F < 24), the model can still outperform NOAA.
For example, random forest with NOAA as a feature input and ensembled with NOAA (RFNOAA_EnsembleNOAA) outperforms NOAA at all selected locations at F = 1 (15 minutes) and F = 4 (one hour) in summer, as the orange line in the last figure is at 100. However, the model loses its advantage over NOAA as the look-ahead forecasting range increases. [00117] There are two reasons that may potentially cause the model to underperform the NOAA airport forecast in Chicago but not in Syracuse. The first reason is the difference in terrain characteristics between Chicago and Syracuse. Chicago lies in a relatively flat glacial plain, while there are rolling hills, flat plains, lakes, and streams in Syracuse. The landform of Syracuse is much more complicated than that of Chicago, which may cause relatively larger temperature differences between different locations in Syracuse. In this kind of scenario, the NOAA airport temperature cannot well represent the temperature at other locations in Syracuse. The temperature in Chicago, on the other hand, is much more uniform, so the NOAA airport forecast can be applied to different locations. The second reason can be seen from FIG.25: most of the selected stations (red solid circles) are located in the suburban area of Chicago, so the heat island effect of an urban city may not be strong in these areas. In this case, the NOAA airport forecast can perform better, as most airports are also located in suburban areas of cities. [00118] FIG.27 shows an exemplary application design for implementing the present invention in software, where the left side panel allows the user to enter model inputs such as city, building name, location, and forecast start and end datetimes. The main panel is where all outputs (e.g., log messages, the figure of temperature forecasting results, and the map) are generated by the model. FIG.28 shows an exemplary output for a software application implementing the present invention.