Direct measurement of the flow rate in sanitary sewer lines is not always feasible, yet flow rate is an important parameter for normalizing data used in wastewater-based epidemiology applications. The use of machine learning to estimate past wastewater influent flow rates in support of public health applications has not been studied. The aim of this study was to assess wastewater treatment plant influent flow rates in relation to weather data and to retrospectively estimate flow rates in Louisville, Kentucky (USA), from other data types using machine learning. A random forest model was trained on a range of variables, including feces-related indicators, weather data that could be associated with dilution in sewage systems, and area demographics. The developed algorithm estimated the flow rate with an accuracy of 91.7%, although it did not perform as well for short-term (1-day) high flow rates. The results suggest that variables such as precipitation (mm/day) and population size are the more important predictors for wastewater flow estimation, while fecal indicator concentrations (cross-assembly phage and pepper mild mottle virus) were less important. Our study challenges currently accepted opinions by showing the public health potential of artificial intelligence in wastewater treatment plant flow rate estimation for wastewater-based epidemiological applications.
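The modeling approach summarized above can be illustrated with a minimal sketch. It is not the authors' code or data: it assumes Python with NumPy and scikit-learn, and every feature value, coefficient, and variable name below (`precip_mm`, `population`, `crassphage`, `pmmov`) is synthetic and hypothetical. It fits a random forest regressor to made-up flow data, reports impurity-based feature importances, and computes a mean absolute percentage error (MAPE). Reporting 100 minus MAPE is one common way to express "accuracy" for a regression model, but the abstract does not state how its 91.7% figure was defined.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600

# Synthetic, made-up predictors (NOT the study's data).
precip_mm = rng.gamma(shape=0.8, scale=6.0, size=n)                   # daily precipitation, mm/day
population = rng.choice([20_000, 60_000, 150_000, 350_000], size=n)  # hypothetical sewershed populations
crassphage = rng.normal(7.0, 0.5, size=n)                             # hypothetical log10 copies/L
pmmov = rng.normal(6.0, 0.5, size=n)                                  # hypothetical log10 copies/L

# Hypothetical flow: a per-capita dry-weather term, inflated by rainfall,
# with multiplicative noise. The fecal indicators are deliberately unrelated
# to flow in this toy data set.
flow = 0.0004 * population * (1.0 + 0.03 * precip_mm) * rng.lognormal(0.0, 0.05, size=n)

feature_names = ["precip_mm", "population", "crassphage", "pmmov"]
X = np.column_stack([precip_mm, population, crassphage, pmmov])

X_train, X_test, y_train, y_test = train_test_split(
    X, flow, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Mean absolute percentage error on held-out days.
mape = float(np.mean(np.abs((y_test - pred) / y_test)) * 100.0)

# Impurity-based importances; they sum to 1 across features.
importances = dict(zip(feature_names, model.feature_importances_))

print(f"MAPE: {mape:.1f}%  (100 - MAPE: {100.0 - mape:.1f}%)")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {value:.3f}")
```

Because the synthetic flow depends only on precipitation and population, the importance ranking here is built in by construction and says nothing about real sewersheds. The sketch only shows the mechanics of training, scoring, and ranking predictors.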


© 2022 The Authors

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited.

Original Publication Information

Dhiraj Kanneganti, Lauren E. Reinersman, Rochelle H. Holm, Ted Smith; Estimating sewage flow rate in Jefferson County, Kentucky, using machine learning for wastewater-based epidemiology applications. Water Supply 1 December 2022; 22 (12): 8434–8439. doi: