Applying Machine Learning to Understand Water Security and Water Access Inequality in Underserved Colonia Communities

03/29/2023
by Zhining Gu, et al.

This paper explores the application of machine learning to improve our understanding of water accessibility issues in underserved communities called Colonias, located along the northern part of the United States-Mexico border. We analyzed more than 2,000 such communities using data from the Rural Community Assistance Partnership (RCAP) and applied hierarchical clustering and the adaptive affinity propagation algorithm to automatically group Colonias into clusters with different water access conditions. The Gower distance was introduced so that the algorithms can process complex datasets containing both categorical and numerical attributes. To better understand and explain the clustering results, we further applied a decision tree analysis algorithm that associates the input data with the derived clusters and identifies and ranks the factors that characterize the water access conditions in each cluster. Our results complement experts' priority rankings of water infrastructure needs, providing a more in-depth view of the water insecurity challenges that the Colonias face. As an automated and reproducible workflow combining a series of tools, the proposed machine learning pipeline represents an operationalized solution for conducting data-driven analysis to understand water access inequality. This pipeline can be adapted to analyze different datasets and decision scenarios.
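The Gower distance mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `gower_distance_matrix`, the three attributes, and the four sample records are hypothetical and are not taken from the RCAP data. The sketch assumes no missing values and weights all attributes equally. Numerical attributes contribute a range-normalized absolute difference, and categorical attributes contribute 0 on a match and 1 otherwise.

```python
from typing import List, Sequence, Union

Value = Union[float, int, str]


def gower_distance_matrix(
    rows: Sequence[Sequence[Value]], is_categorical: Sequence[bool]
) -> List[List[float]]:
    """Pairwise Gower distances for records with mixed attribute types.

    Assumes equal attribute weights and no missing values.
    """
    n = len(rows)
    n_features = len(is_categorical)

    # Range (max - min) of each numerical attribute, used for normalization.
    ranges: List[float] = []
    for j in range(n_features):
        if is_categorical[j]:
            ranges.append(0.0)  # unused for categorical attributes
        else:
            column = [float(r[j]) for r in rows]
            ranges.append(max(column) - min(column))

    dist = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            total = 0.0
            for j in range(n_features):
                if is_categorical[j]:
                    # Simple matching: 0 if equal, 1 if different.
                    total += 0.0 if rows[a][j] == rows[b][j] else 1.0
                elif ranges[j] > 0:
                    # Absolute difference scaled to [0, 1] by the attribute range.
                    total += abs(float(rows[a][j]) - float(rows[b][j])) / ranges[j]
            # Average the per-attribute dissimilarities.
            dist[a][b] = dist[b][a] = total / n_features
    return dist


# Hypothetical records: (population, water source, wastewater service).
communities = [
    (120, "public", "yes"),
    (150, "public", "yes"),
    (900, "hauled", "no"),
    (520, "well", "no"),
]
distances = gower_distance_matrix(communities, is_categorical=[False, True, True])

for row in distances:
    print(["%.3f" % d for d in row])
```

A matrix like this can be passed to clustering routines that accept precomputed dissimilarities. For example, SciPy's hierarchical `linkage` takes the condensed form of a distance matrix, and scikit-learn's `AffinityPropagation` accepts `affinity="precomputed"` with similarities, for which negated distances are one common choice.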

research
06/05/2020

Utilizing machine learning to prevent water main breaks by understanding pipeline failure drivers

Data61 and Western Water worked collaboratively to apply engineering exp...
research
10/28/2022

Not Another Day Zero: Design Hackathons for Community-Based Water Quality Monitoring

This study looks at water quality monitoring and management as a new for...
research
11/28/2019

Analysis of Hydrological and Suspended Sediment Events from Mad River Wastershed using Multivariate Time Series Clustering

Hydrological storm events are a primary driver for transporting water qu...
research
12/08/2022

Mining Explainable Predictive Features for Water Quality Management

With water quality management processes, identifying and interpreting re...
research
12/05/2021

Smart IoT-Biofloc water management system using Decision regression tree

The conventional fishing industry has several difficulties: water contam...
research
02/13/2023

A Novel Poisoned Water Detection Method Using Smartphone Embedded Wi-Fi Technology and Machine Learning Algorithms

Water is a necessary fluid to the human body and automatic checking of i...
