Dynamic Continuous Flows on Networks

There are many cases in which one has continuous ﬂows over networks, and there is interest in predicting and monitoring such ﬂows. This paper provides Bayesian models for two types of networks—those in which ﬂow can be bidirectional, and those in which ﬂow is unidirectional. The former is illustrated by an application to electrical transmission over the power grid, and the latter is examined with data on volumetric water ﬂow in a river system. Both applications yield good predictive accuracy over short time horizons. Predictive accuracy is important in these applications—it improves the eﬃciency of the energy market and enables ﬂood warnings and water management.


INTRODUCTION
Problems concerning the statistical modeling and monitoring of dynamic network flows arise naturally in areas such as energy, hydrology, and transportation. Our goal is to extend previous methodology for Bayesian dynamic flow models of discrete data [3] to the modeling of continuous flows. Rather than modeling Poisson counts, we use Normal and Gamma models for real-valued and positive flows, respectively. Our example applications concern the electrical power grid and hydrology, but the methodology applies to many more situations.
The first focus is the United States energy network, where key problems arise in managing electricity distribution. An excess of supply leads to lost revenue while an excess of demand leads to outages. Balancing authorities (BAs) help coordinate the demand and supply through interchange and power generation.
Most BAs produce electricity within their balancing authority area and can directly serve consumers. In addition, BA systems manage the flow of electricity in or out of their system to other BAs with which they are networked. This flow is called interchange. Electricity flow is directed, and power transfer occurs between two BA systems when one agrees to sell electricity to the other. The exact commercial value is determined by bids and offers within the market, and payments must be made by the end of the next business day. BAs which can accurately forecast demand over short horizons will be financially advantaged in these transactions.
We seek to model the dynamic flows between BAs as well as the self-flow within a BA. Each vertex in the network represents a balancing authority. We use data from the U.S. Energy Information Administration (EIA), which features hourly interchange, generation, and demand for 64 balancing authorities in the U.S. See Fig. 1 for a visualization of the BA network. Our data cover the time period from May 1 to May 31, 2020.
Our second contribution is an application to hydrology. Hydrologic problems arise in environmental studies and flood modeling. Flooding occurs when the volume of water exceeds the capacity of a channel. To monitor river status, stream gauges at various points on a river measure the water flow and other properties such as precipitation and river height. Streamflow, or discharge, is the volume of water passing through rivers and other channels. Runoff from rainfall and evaporation are two of many mechanisms that can increase or decrease streamflow.
Hydrologic modeling has a long history [16]. Most previous statistical modeling of hydrologic activity focuses upon estimating the probability of extreme events, such as floods [8], but there have also been efforts to model flows using stochastic processes [4]. We use Bayesian modeling to make near-term horizon forecasts of changing flows, which is important for reservoir management and flood control (e.g., decisions on dam retention and spillway operation).
In the hydrology model, vertices represent stream gauges and the edges represent directed, downstream flow from one part of the river to the next. We used the dataRetrieval package from R to obtain instantaneous discharge data in 15 minute intervals from the U.S. Geological Survey for the time period from January 1 to May 31, 2020 for the Eel River tributary system in northern California. See Fig. 2 for a map of the waterway with stream gauge locations. Discharge is calculated as the water velocity multiplied by the area in a cross section of the river. We obtain temperature and precipitation data from the National Oceanic and Atmospheric Administration.
The BA model easily extends to other applications with bidirectional continuous flows, such as banking networks or internet traffic. The hydrology model extends to networks with unidirectional continuous flows, such as gas pipelines or sewer systems. (Arguably, some of these flows might be discrete pennies or packets, but these are approximately continuous and neither satisfies the Poisson assumption used in Chen et al. [3].) Nonetheless, any new application will probably require some handfitted modification of the methods we describe.
In both case studies, we are interested in modeling seasonality and day-of-week effects. Electricity demand is generally regular; the demand on the previous Monday is likely similar to the demand of the current Monday. Similarly, precipitation and evaporation follow annual patterns that affect water levels. We take advantage of discounting and conjugacy to provide a general structure for sequential and seasonal analysis. Our work is able to characterize the node dynamics that are inherent in the networks. In addition, the model presents opportunities for Bayesian monitoring and intervention.
Sections 2 and 3 describe the statistical analyses for the electricity model and the hydrology model, respectively. Section 4 concludes.

THE ELECTRICAL GRID
Power demand forecasting has been done almost since the beginning of electrification. Previous research addresses different time horizons: annual usage drives the creation of new power plants [11], seasonal demand generally reflects weather trends [18], and hourly fluctuations are central to the energy market maintained by the BAs [17]. Our work focuses on the short horizon case.
Ghalehkhondabi et al. [5] reviews methodology for demand forecasting that was introduced between 2005 and 2015, especially for short term prediction. The methods listed include time series analysis, fuzzy logic, artificial neural networks, genetic algorithms, and hybrid methods. The review does not mention Bayesian methods, but these exist. Wang et al. [19] presents a Bayesian hierarchical regression model for predicting residential demand, and Bassamzadeh and Ghanem [2] presents a Bayesian network formulation. Our paper proposes a different Bayesian approach, one that ensures computational speed through decoupling and conjugacy, and which respects the connectivity of the electrical grid.
Suppose we have a connected network with v vertices and n edges. Let y * ijt denote the electrical flow in megawatt hours (MWh) between vertices i and j at time t; a positive flow runs from i to j; a negative flow runs from j to i. A flow y * iit denotes electricity created or used within the ith balancing authority (BA). Physics requires that n j=1 y * ijt = 0. This constraint will be enforced during the recoupling.
Exploratory data analysis confirmed the need to transform the data. We compared the standard logarithmic transformation to Fisher's arctanh transformation. The latter showed markedly better performance at stabilizing the variance. This transformation requires us to rescale the observations to lie in [−1, 1], and to avoid infinite values, we rescaled to (−1, 1). Specifically, let m ij1 be the minimum flow between BAs i and j, m ij2 be the second smallest value, M ij1 be the maximum flow, and M ij2 be the second largest value. We then applied the transformation This avoids infinities by replacing the sample maximum and minimum with pseudovalues: 2M 1 − M 2 is slightly larger than the maximum, 2m 1 − m 2 is slightly smaller than the minimum, and these pseudovalues preserve the observed tail behavior. We shall compare three models for these data. One is autoregressive with lag 168 (one week earlier) and covariate x jt , which is the centered temperature at the jth BA at time t. Experts believe that there are weekly patterns in electricity usage and temperature is also a major driver of power demand [10]. The second model is autoregressive with lags 1 and 168. Temperature is implicitly present in the electrical demand during the previous hour. The third model is autoregressive with lag 1. As in Wilke [20], we shall compare these models in terms of predictive mean squared error for the next time step. Also, we use the first week of data to initialize the model for the autoregression using lag T = 168 and one hour of data to initialize for the simple lag 1 model.
The centering is done so that x jt has mean zero, to prevent nonidentifiability due to confounding with the mean of the level process φ ijt described below. Specifically, if the temperature at the jth BA at time t is K jt , then The y ijt is a time series with y ijt |φ ijt ∼ N (φ ijt , σ 2 ijt ) that is conditionally independent over t = 1, 2, . . ., where φ ijt is a latent level process and σ 2 ijt is the variance of y ijt at time t. The φ ijt process evolves via a Markov model. To account for the influence of temperature on the electricity demand at destination node j as well as the day of the week effect, we have is the discount factor that controls dependence upon the past. The innovation term t is independent of s and φ s for s < t. The discount factor is determined as where d i is the constant baseline discount factor for node i and l > 0 is a specified constant that determines how close δ t is to d i when information is high. The second autoregressive model is like the first, except where ijt is the innovation term defined previously. The β ij are not dynamic; we can determine a value for the β ij by maximizing the model marginal likelihood with a normal prior on β such that β ∼ N (β 0 , σ 2 β ), Using the specification below, we have Let φ and σ 2 have the usual normal-inverse gamma priors, which provides conjugacy and enables rapid computation. The hyperparameters for the mean and variance incorporate time series information on historical weekly and seasonal trends in electrical demand among the BAs. Specifically, at time t = 0, specify the priors as The time t → t + 1 update/evolve steps are: 1. Time t prior: 2. Updates to the posterior: 3. Evolution of the time t + 1 prior: The third autoregressive model simply depends on the flow in the preceding hour. Aside from dropping the lag at T = 168, the model specification is exactly the same as for the immediately previous model.
We fit these three models with prior parameters m ij,0:T initialized with the first week of data, k ij,0:T , c ij,0:T , r ij,0:T = 1, β 0 = 0, and σ 2 β = 1. We compared the results in terms of one-step-ahead predictive squared error (OPSE). The autoregressive model that included temperature had an OPSE of 0.186, the autoregressive model with two lags had an OPSE of 0.100, and the model that depended only on the previous hour had OPSE equal to 0.281. The AIC value for the two-lag model is -1083.52 and the AIC for the one-lag model is 1125.84, so the former is preferred. Figure 3 shows the predicted and observed values from the two-lag model, along with a 95% pointwise confidence band, for the City of Homestead BA. We selected that BA because it was easy to gather its temperature data and because it had relatively few edges in the network. Based on similar plots made for other BAs, we believe it is typical.
Similarly, Fig. 4 plots the model's predicted values against the actual values. It serves as a useful diagnostic. For example, note the tendency towards higher variation among the smaller values.
Using the two-lag model, we calculated the flows between all connected pairs of the 64 BAs. It took a total of 40 minutes to perform the computation using a laptop with a 2.5GHz CPU and 12 GB RAM. There are 296 edges in the network, and our decoupling, which considers only the flows between linked pairs of the BAs, enables massive parallelism. Further speed accrues from the conjugacy in our model.
This application seeks to advise BAs on how to manage their energy markets more efficiently, by accurately forecasting future needs and enabling detection of changepoints. So the sum-to-zero constraint that arises from the physics of energy transmission is not essential. (In fact, because of power loss due to resistance and other factors, the constraint only holds approximately). Nonetheless, it is interesting to address that case.
For the predicted valuesŷ ijt , letŷ * ijt = tanh(ŷ ijt ) so that the measurements are back on their original scale. Let N i be the set of indices corresponding to the BAs to which the ith BA is connected. Letŷ + ijt denote the modified estimates that satisfy the sum-to-zero constraint, so that: This does not have a unique solution, so we add the condition that the y + ijt minimize at each value of t. This formulation is a linear equation with a convex constraint, so it is easily solved with standard software. Ideally, this application would close with a comparison the predictive accuracy of the model described to the predictive accuracy of the forecasts made by BAs using commercial software (and, of course, show that our methodology is superior). Unfortunately, such software is proprietary and our efforts to work with relevant practitioners have so far been unsuccessful. Consequently, this publication represents what we believe to be the first statistically sophisticated model for predicting power markets among BAs in the literature.

THE HYDROLOGIC NETWORK
Let y ijt denote the river flow in cubic feet (ft 3 ) from node i to node j at time t. In this study, we focus on the flows between nodes rather than self-loops. Hence, we are not applying sum-to-zero constraints on the overall hydrologic network. The y iit denotes the change in volume at node i at time t.
Rivers do not reverse their flow directions unless affected by tides or by rare geological activities. Thus, y ijt ≥ 0 if i and j are connected. To account for the non-negativity of the flows, we follow [13, section 5.5] and use a gamma distribution. Specifically, we have y ijt |φ ijt ∼ Gamma(α ij , α ij /φ ijt ), where α ij is a fixed shape parameter specified for the river between nodes i and j and E(y ijt |φ ijt ) = φ ijt . Here φ ijt is a latent level process, which evolves via the following Markov model: The discount factor is determined as where d i is the constant baseline discount factor for node i and l > 0 is a specified constant that determines how close δ t is to d i when information is high.
A generalized beta prime distribution is formed by compounding two gamma distributions: If a random variable X ∼ β (α, β, p, q) then kX ∼ β (α, β, p, kq), which is useful in accelerating computation.
Hence, we have the following equations by setting α ij as the value that maximizes the model marginal likelihood, where the generalized beta prime distribution (or inverted beta distribution) has density α, β, p, q > 0, and B(·, ·) is the usual beta function.
It is possible to include covariates x jt , such as precipitation and temperature, at node j and time t to the model. As before, we center the covariates. The resulting model is where ijt ∼ Normal(0, σ 2 ij ). At time t = 0, specify the prior as 1 φij0 | y ij0 ∼ Gamma(r 0 , c 0 ), where r 0 > 0 and c 0 > 0 are known. Then the time t → t + 1 update/evolution steps are: 1. Time t prior: 2. Updates to the posterior: where r ijt = α ij + r ij,t−1 and c ijt = α ij y ijt + c ij,t−1 . 3. Evolves to the t + 1 prior: For the Eel River hydrology data, we fit three models on the log of river flow: one without covariates, one with the log of precipitation, and one with the natural logarithm of precipitation and temperature. The logarithmic transformations were chosen based upon a preliminary exploratory data analysis. We set priors 1 φij0 ∼ Gamma(r ij0 , c ij0 ) with r ij0 = c ij0 = 1. The model without covariates had an OPSE of 0.00187, the model with one covariate had an OPSE of 0.00185, and the two-covariate model had an OPSE of 0.00185. The AIC for the zero covariate model is -8133.64, for the precipitation-only model it is -10571.28, and for the model with precipitation and temperature it is -10573.36. Since the two-covariate model has a lower AIC, it is preferred. Figure 5 shows the predicted and observed river flow values for the two-covariate model for the flow at the gauge near Fortuna. We see that the predicted flow (red) closely follows the observed flow, as expected. Plots of the other stream gauges also show similar results. Figure 6 shows the predicted flow against the observed flow. From the plot, it appears that the model tends to underestimate more than overestimate river discharge. This is reasonable since our analysis did not use relative humidity as a covariate, which, in that area during this time period, should reduce the evaporation rate.
Hydrology researchers measure the performance of a model in several ways, including root mean squared error and the percent bias in runoff ratio [6], and measures of percent bias in model predictions [12]. But the most traditional metric is the Nash-Sutcliffe model efficiency coefficient (NSE) [14]. This measure is equivalent to the coefficient of determination or r 2 value between predicted values and observations.
A second criterion important to the hydrological community is model agility [7]. Model agility refers to the ability of the same model to apply with high accuracy to different river segments and across different river systems. For many years, hydrology tried to micromodel the physical processes that drove water flow, including such quantities as leaf reflectance, maximum rate of carboxylation at 25 • , a monthly leaf area index, and a plethora of other quantities Figure 5: This figure shows the observed log river flow and the log river flow predicted by the model with log precipitation and temperature as covariates (the best model we considered) for the stream gauge closest to Fortuna. It also shows a pointwise 95% confidence band (which is so tight that it is nearly invisible). that encoded beliefs about the physical mechanisms that drive evaporation, runoff, and rainfall. Our review of the literature did not discover any discussion of the Curse of Dimensionality [cf. 9, section 2.5], but surely that would an issue in these models too.
Researchers found that these physics-based models lacked agility [12,15]. They pushed for simpler models that worked well in general. To study the performance of our model with respect to agility and NSE, we computed the NSE for every gauge in the Eel River system. We see that the values are all very close to 1, demonstrating that our models have agility. Table 1 shows the results.
Values of NSE close to 1 correspond to good performance, and values near 0 indicate poor performance. Table 1 shows good performance across all the river segments, indicating that, at least within the Eel River system, our model demonstrates agility.
We initially considered a model that was lagged to reflect the time required for a bolus of water to move between adjacent gauges. That model did not (with one gauge pair exception) outperform the model without distance-based lagging. One possible reason for this lack of dependence is that the Eel River system is only 111 miles long, so rainfall, evaporation, and runoff are similar across the entire geography. Another issue is that flow rates are not constant. Typically they are slower at the headwaters and faster at the coast, but they also depend upon volume. After a heavy rainfall, rivers flow faster. For these reasons, we chose to use the simpler model which made no assumptions about travel time between gauges.

CONCLUSION
This paper addresses the problem of modeling continuous flows in networks, which arise in many physical and engineering situations. The models are widely applicable, although they will surely need to be tailored to specific applications. The examples in this paper provide some guidance on how to do such tailoring, through transformations, the use of covariates, and model selection based on such criteria as the AIC. The paper treats both directed and bidirectional flows.
The models use decoupling to enable massively parallel solution, which is essential because the number of edges in a network often scale combinatorially. For the relatively sparse networks in our examples, it was possible to do the computation upon a single laptop, but for many applications it would be infeasible.
Similarly, we leverage conjugacy in the modeling to achieve rapid computation. Such conjugacy is not always faithful to the physical reality of the problem, but it is a convenient starting point for the analysis. In our examples, the conjugacy approximation performs well.
The electrical grid example has an interesting sum-tozero constraint that should be respected when the decoupled analyses are recoupled. In contrast, the hydrology example does not, since precipitation runoff and evaporation are poorly measured and highly variable, and can differ over even relatively short geographic distances.