Pinpointing potential risks in recycled water through data analytics

Potable reuse systems generate massive amounts of monitoring data. Hazen designed a data analytics program that cuts through it all to find early signs of potential public health threats.

William J. Raseman uses data science and systems optimization to find solutions for a wide range of water sector issues, including potable reuse challenges.

Utilities must respond rapidly to potential issues with direct potable reuse systems due to the risks (real and perceived) of using those systems to recycle municipal wastewater.
Through a Water Research Foundation project with Trussell Technologies, Hazen built a data analytics program that can quickly spot potential problems in potable reuse systems, helping operators address them before they can threaten public health or violate regulations.
Of 8,000 potential monitoring variables, the team identified 22 to focus on for early detection of a wide range of critical control point failures.
Water quality and process performance are crucial to monitor. Alongside them, the program can track an additional key metric: the integrity of the monitoring devices themselves.
The team’s final report also includes guidance on how to establish this kind of risk identification system and how the numerous disciplines it involves can effectively work together.

“Critical control points are all the treatment processes designed to protect public health. It’s crucial for them to be working properly and for operators to have accurate data about them. We created this program to focus on the most important CCP data, so it can catch problems before they compromise water quality.”

~ Billy Raseman, PhD, PE, Engineer, Hazen

Potable reuse is the process of treating municipal wastewater until it’s safe to drink. Indirect potable reuse (IPR) involves sending the recycled water to an underground aquifer or large reservoir upstream of a drinking water plant. Days—even months—can pass before the water travels through the plant and into people’s faucets.

Direct potable reuse (DPR) systems have no such buffer: The recycled water is piped straight from the reuse facility to a drinking water plant or distribution system, sometimes within hours. For that reason, it’s critical for DPR system operators to spot and address potential problems immediately.

That’s easier said than done. On top of the treatment needed to make wastewater clean enough to put back into rivers, streams, and other natural areas, potable reuse systems use multiple advanced processes, from reverse osmosis to UV disinfection. That means massive amounts of data to monitor—and, in DPR systems, little time to analyze it.

Hazen and Trussell Technologies built a computer program that can quickly analyze data from potable reuse systems (both DPR and IPR) to spot potential public health issues. It’s a way to cut through the fog of data, zero in on what’s most important, and address potential problems early.

How the program works:

Data is gathered at frequent intervals from devices across the entire potable reuse system, then stored in the program.
The program screens out data from periods when parts of the system are offline, undergoing maintenance, or starting up, because those readings do not reflect normal operation.
Using customized commands and statistical tests, it flags data points that could indicate problems.
The flags help the program identify potential events, from malfunctioning equipment to changes in upstream water quality, then classify them by type and rank them by level of urgency.
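
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's script and not the Pecos API: the `Reading` record, the function names, the range limits, the z-score threshold, and the sample values are all hypothetical. It also ranks events by duration as a simple stand-in for urgency and leaves out classification by type.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Reading:
    minute: int    # minutes since the start of the record
    value: float   # sensor reading (hypothetical units)
    online: bool   # False while offline, in maintenance, or starting up


def screen(readings):
    """Drop readings taken while the process was not in normal operation."""
    return [r for r in readings if r.online]


def flag(readings, low, high, baseline_n=5, z_limit=3.0):
    """Flag readings outside [low, high] or far from the baseline mean.

    The first `baseline_n` screened readings are assumed to be normal.
    """
    baseline = [r.value for r in readings[:baseline_n]]
    mu, sigma = mean(baseline), stdev(baseline)
    flagged = []
    for r in readings[baseline_n:]:
        out_of_range = not (low <= r.value <= high)
        z = abs(r.value - mu) / sigma if sigma > 0 else 0.0
        if out_of_range or z > z_limit:
            flagged.append(r)
    return flagged


def group_events(flagged, max_gap=5):
    """Merge flags close in time into events; longest events come first."""
    events = []
    for r in flagged:
        if events and r.minute - events[-1][-1].minute <= max_gap:
            events[-1].append(r)
        else:
            events.append([r])
    return sorted(events, key=len, reverse=True)


# Hypothetical record: a start-up spike at minute 6, a sustained excursion
# at minutes 8 to 10, and a single isolated spike at minute 20.
values = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 55.0, 10.0, 25.0, 26.0, 27.0, 10.1]
readings = [Reading(m, v, online=(m != 6)) for m, v in enumerate(values)]
readings.append(Reading(20, 30.0, online=True))

screened = screen(readings)
flagged = flag(screened, low=0.0, high=20.0)
events = group_events(flagged)

for event in events:
    print(f"event from minute {event[0].minute} to {event[-1].minute}, "
          f"{len(event)} flagged readings")
```

The start-up reading at minute 6 would have been flagged had it not been screened out first, which is why screening comes before the statistical tests.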

The program can even use the data to pinpoint specific devices that might need attention—for example, one reverse osmosis unit in a bundle of such units in a large reverse osmosis system.
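
One simple way to single out a unit is to compare each unit's reading with the median of its peers. The sketch below assumes that approach; it is not necessarily how the program does it. The unit names, conductivity values, and 25% tolerance are hypothetical.

```python
from statistics import median


def outlier_units(readings, rel_tol=0.25):
    """Return IDs of units whose reading differs from the median of the
    other units by more than `rel_tol` (as a fraction of that median)."""
    suspects = []
    for unit, value in readings.items():
        peers = [v for u, v in readings.items() if u != unit]
        ref = median(peers)
        if ref and abs(value - ref) / abs(ref) > rel_tol:
            suspects.append(unit)
    return suspects


# Hypothetical permeate conductivity for four parallel reverse osmosis units.
permeate_conductivity = {"RO-1": 18.0, "RO-2": 19.0, "RO-3": 31.0, "RO-4": 17.5}
suspects = outlier_units(permeate_conductivity)
print(suspects)
```

Comparing against peers rather than a fixed limit means a change in upstream water quality, which shifts every unit together, is less likely to be mistaken for a fault in a single unit.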

What we used to create it:

Extensive technical knowledge of potable reuse systems, which helped us identify the 22 most critical variables (out of 8,000 potential monitoring variables) to track for public health protection
Python, an open-source programming language
Pecos, an open-source Python package that automates quality control and performance monitoring so operators can detect issues quickly
Several months of data from a pilot potable reuse system at a partner utility, which we used to train and test the program

What’s next:

The team is fine-tuning the computer script, which is free and publicly available. We also published a report that details the basic framework for establishing this kind of risk identification system, along with guidance for how the numerous disciplines involved can effectively work together to get the most out of their data.