Date on Master's Thesis/Doctoral Dissertation


Document Type

Master's Thesis

Degree Name

M. Eng.


Computer Engineering and Computer Science

Committee Chair

Desoky, Ahmed H.

Author's Keywords

Principal component; SCADA


Pumping stations--Computer programs; Supervisory control systems; Waterworks--Data processing


This research examined Supervisory Control and Data Acquisition (SCADA) data from a water distribution system using Principal Component Analysis (PCA). PCA is a mathematical method that converts a set of possibly correlated variables into a smaller set of variables called principal components. In a SCADA environment, possibly hundreds of data points, such as booster pumps, storage tanks, and pressure reducing valves, constantly report operational statistics including water pressure and tank capacity. This volume of data can be difficult to analyze in its entirety, especially when trying to detect issues in the distribution system. PCA was used to detect abnormalities in these SCADA readings. Each SCADA data point, such as the pressure flow through a pump, was used as an input variable to PCA. The analysis could be run on data points from a specific pressure zone or from the entire system. Restricting the observations to a specific area made it easier to identify the location of a problem. Each SCADA data point provides an updated reading every minute. At the same interval, the principal component is calculated, along with the variance of the prior twenty minutes' worth of principal component results. The difference between the current variance and the previous minute's variance highlights possible issues when compared with normal operations. For instance, at 11:00 am the current principal component is computed, and the principal component results from 10:40 am to 11:00 am are used as inputs in determining the current variance. This variance, compared with the previous minute's variance, is plotted to show deviations in the data. One principal component was calculated each minute, yielding a single value that summarizes all provided inputs regardless of the number of SCADA data points analyzed.
Analysis of normal operations still produces varying outputs, because demand on the water distribution system rises and falls throughout the day, but those outputs follow a regular pattern. Data from a main break condition show an irregular pattern that signals a possible fault. PCA interpretation can serve as an additional monitoring tool for the distribution system, providing advance warning of main breaks or other system issues.
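
The per-minute procedure described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. The abstract does not say how the single principal component value is obtained each minute, so the sketch assumes that readings are standardized, that the first principal direction is fitted over the whole record, and that each minute's readings are projected onto it. The function names, the 20-sample window, and the synthetic three-sensor data with a simulated pressure drop at minute 90 are all hypothetical.

```python
import numpy as np

def first_pc_scores(readings):
    """Return one first-principal-component score per row (one row = one minute)."""
    X = np.asarray(readings, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant sensors
    Z = (X - X.mean(axis=0)) / std           # standardize each sensor (assumption)
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[0]

def rolling_variance(scores, window=20):
    """Variance of the most recent `window` scores; NaN until enough data exists."""
    scores = np.asarray(scores, dtype=float)
    out = np.full(scores.shape, np.nan)
    for t in range(window - 1, len(scores)):
        out[t] = scores[t - window + 1 : t + 1].var()
    return out

def variance_change(scores, window=20):
    """Current rolling variance minus the previous minute's rolling variance."""
    return np.diff(rolling_variance(scores, window), prepend=np.nan)

# Synthetic example: three sensors following a slow daily demand cycle,
# with a sudden pressure drop at minute 90 standing in for a main break.
rng = np.random.default_rng(0)
minutes = np.arange(120)
demand = 50 + 2 * np.sin(2 * np.pi * minutes / 1440)
readings = np.column_stack([demand * g for g in (1.0, 0.8, 1.2)])
readings += rng.normal(scale=0.01, size=readings.shape)
readings[90:] -= 10.0                        # simulated fault

scores = first_pc_scores(readings)
change = variance_change(scores, window=20)
spike = int(np.nanargmax(np.abs(change)))    # minute with the largest jump
```

Under these assumptions the slow demand cycle produces only small minute-to-minute changes in the rolling variance, while the simulated fault produces a large jump when it enters or leaves the 20-minute window. Fitting the principal direction on known-normal data only, or refitting it each minute, would be equally plausible readings of the abstract.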