Protection & Restoration of Ecosystem Services
Development of a PostgreSQL database for accessible research, operations, and communications of real-time Harmful Algal Bloom data
Overview and Objectives
In August 2014, a harmful algal bloom (HAB) contaminated the water supply of Toledo, OH, leaving over 400,000 residents without drinking water for two days. Real-time observations of temperature, HAB-related optical parameters (chlorophyll, phycocyanin, turbidity), and nutrients (phosphorus, nitrogen) are important components of the HAB modeling and forecasting efforts that serve Lake Erie drinking water managers, but these observations have traditionally been stored in plain-text files, which poses a data management challenge: such a setup does not allow for expedient querying, analysis, or quick visualization of parameters of interest.
We are developing a database using the robust, industry-standard PostgreSQL database management system. In addition, we will create a workflow that receives data from a station or buoy and executes quality checks on the data before they are backed up and inserted into the database. The goal is a robust and efficient research and communications platform for real-time data, as well as a system that could be rapidly cloned into an operational setting. For the communications component of this proposal, we will develop an online interface that allows external users to easily browse the real-time data.
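The ingestion workflow described above can be sketched as follows. This is a minimal illustration, not the final implementation: the table name (`observations`), column names, delimited record format, and quality-check thresholds are all assumptions chosen for the example, and the actual schema and QC criteria will be defined during development. The INSERT statement is built in the parameterized form expected by a PostgreSQL client library such as psycopg2.

```python
"""Sketch of the proposed ingestion workflow: parse one plain-text
observation record, run simple range-based quality checks, and build a
parameterized INSERT for PostgreSQL.  Names and thresholds are
illustrative assumptions, not the deployed schema."""
from datetime import datetime, timezone

# Hypothetical plausible-value ranges (units: deg C, ug/L, NTU)
QC_RANGES = {
    "temperature": (-2.0, 35.0),
    "chlorophyll": (0.0, 400.0),
    "turbidity": (0.0, 1000.0),
}

def parse_line(line):
    """Parse an assumed comma-separated record: timestamp,station,temp,chl,turb."""
    ts, station, temp, chl, turb = line.strip().split(",")
    return {
        "observed_at": datetime.fromisoformat(ts).replace(tzinfo=timezone.utc),
        "station_id": station,
        "temperature": float(temp),
        "chlorophyll": float(chl),
        "turbidity": float(turb),
    }

def quality_check(record):
    """Return the parameters whose values fall outside plausible ranges."""
    return [p for p, (lo, hi) in QC_RANGES.items()
            if not (lo <= record[p] <= hi)]

def build_insert(record, flags):
    """Build a parameterized INSERT (suitable for psycopg2's cursor.execute).
    Flagged records are stored with their QC flags rather than discarded."""
    sql = ("INSERT INTO observations "
           "(observed_at, station_id, temperature, chlorophyll, turbidity, qc_flags) "
           "VALUES (%s, %s, %s, %s, %s, %s)")
    params = (record["observed_at"], record["station_id"], record["temperature"],
              record["chlorophyll"], record["turbidity"], flags)
    return sql, params

# Example: one clean record and one with an implausible chlorophyll value
rec = parse_line("2014-08-02T06:00:00,WE2,24.1,62.5,14.3")
sql, params = build_insert(rec, quality_check(rec))
```

Keeping flagged values in the database with an explicit QC flag, rather than dropping them at ingest, preserves the raw record for later review while letting downstream queries filter on data quality.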
Over the past two years, NOAA-GLERL and CILER have developed an interactive, online real-time buoy data display for the Western Lake Erie (WLE) Harmful Algal Bloom (HAB) season. The primary challenge in operating this system has been parsing high-density text data and displaying it on clients' computers. While the data are made available this way, general users are greatly hindered by the speed at which their computers can download, process, and display the data; for users with slower machines, the data may be effectively inaccessible and their usefulness is therefore not fully realized. The proposed system would delegate the processing and delivery of small chunks of real-time data to a far faster server, avoiding the need to develop custom, potentially inaccessible client-side algorithms for delivering data to users.
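Server-side chunking of this kind could look like the sketch below: rather than shipping the full text archive to the browser, the server answers each request with a small, pre-aggregated slice built by PostgreSQL itself (here via `date_trunc`). The table and column names are assumptions for illustration, and the aggregation choices (per-station hourly averages) stand in for whatever summaries the display ultimately needs.

```python
"""Sketch of server-side delivery of small data chunks: build a
parameterized PostgreSQL query returning per-bucket averages for one
station over a requested time window.  Table/column names are
illustrative assumptions."""

def window_query(station_id, start, end, bucket="hour"):
    """Return (sql, params) for a client library such as psycopg2."""
    # Whitelist the bucket size, since identifiers inside date_trunc()
    # cannot be passed as query parameters.
    if bucket not in ("minute", "hour", "day"):
        raise ValueError("unsupported bucket size")
    sql = (
        f"SELECT date_trunc('{bucket}', observed_at) AS t, "
        "AVG(temperature) AS temperature, "
        "AVG(chlorophyll) AS chlorophyll "
        "FROM observations "
        "WHERE station_id = %s AND observed_at BETWEEN %s AND %s "
        "GROUP BY t ORDER BY t"
    )
    return sql, (station_id, start, end)

# Example: one HAB-season week for a single station, hourly averages
sql, params = window_query("WE2", "2014-08-01", "2014-08-08")
```

Because the aggregation happens inside the database, a week of high-frequency observations reaches the client as at most a few hundred rows, regardless of how dense the underlying record is.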