As prominent examples such as software for predicting future criminal behavior show, the fairness of AI-based decision support tools that affect the social domain is an important area of research. In this contribution, we investigate the applications of AI to socioeconomically relevant infrastructures such as water distribution networks (WDNs), where fairness issues have yet to gain a foothold. To establish the notion of fairness in this domain, we propose an appropriate definition of protected groups and group fairness in WDNs as an extension of existing definitions. We demonstrate that typical methods for the detection of leakages in WDNs are unfair in this sense. Further, we propose a remedy to increase fairness, which can be applied even to the non-differentiable ensemble classification methods used in this context.
https://github.com/jstrotherm/FairnessInWDNS https://doi.org/10.1007/978-3-031-43085-5_10
Many Machine Learning models are vulnerable to adversarial attacks: One can specifically design inputs that cause the model to make a mistake. Our study focuses on adversarials in the security-critical domain of leakage detection in water distribution networks (WDNs). As model input in this application consists of sensor readings, standard adversarial methods face a challenge. They have to create new inputs that still comply with the underlying physics of the network. We propose a novel approach to construct adversarial attacks against Machine Learning based leakage detectors in WDNs. In contrast to existing studies, we use a hydraulic model to simulate leaks in the water network. The adversarial attacks are then constructed based on these simulations, which makes them intrinsically physics-constrained. The adversary maximizes water loss by finding the least sensitive point, that is, the point at which the largest possible undetected leak could occur. We provide a mathematical formulation of the least sensitive point problem together with a taxonomy of adversarials in WDNs, in order to relate our work to other possible approaches in the field. The problem is then solved using three different algorithmic approaches on two benchmark WDNs. Finally, we discuss the results and reflect on potentials to enhance model robustness based on knowledge about adversarial weaknesses.
https://doi.org/10.1007/978-3-031-43078-7_37
Concept drift refers to the phenomenon that the distribution generating the observed data changes over time. If drift is present, machine learning models can become inaccurate and need adjustment. While there do exist methods to detect concept drift or to adjust models in the presence of observed drift, the question of explaining drift, i.e., describing the potentially complex and high dimensional change of distributions in a human-understandable fashion, has hardly been considered so far. This problem is of importance since it enables an inspection of the most prominent characteristics of how and where drift manifests. Hence, it allows human understanding of the change and it increases acceptance of life-long learning models. In this paper, we present a novel technology characterizing concept drift in terms of the characteristic change of spatial features based on various explanation techniques. To do so, we propose a methodology to reduce the explanation of concept drift to an explanation of models that are trained in a suitable way to extract relevant information from the drift. This way, a large variety of explanation schemes is available, and a suitable method can be selected for the problem at hand. We outline the potential of this approach and demonstrate its usefulness in several examples.
https://doi.org/10.1016/j.neucom.2023.126640
We investigate the task of missing value estimation in graphs as given by water distribution systems (WDS) based on sparse signals as a representative machine learning challenge in the domain of critical infrastructure. The underlying graphs have a comparably low node degree and high diameter, while information in the graph is globally relevant; hence, graph neural networks face the challenge of long-term dependencies. We propose a specific architecture based on message passing which displays excellent results for a number of benchmark tasks in the WDS domain. Further, we investigate a multi-hop variation, which requires considerably fewer resources and opens an avenue towards big WDS graphs.
https://doi.org/10.1007/978-3-031-30047-9_3
In many real-world scenarios, data are provided as a potentially infinite stream of samples that are subject to changes in the underlying data distribution, a phenomenon often referred to as concept drift. A specific facet of concept drift is feature drift, where the relevance of a feature to the problem at hand changes over time. High-dimensionality of the data poses an additional challenge to learning algorithms operating in such environments. Common scenarios of this nature can for example be found in sensor-based maintenance operations of industrial machines or inside entire networks, such as power grids or water distribution systems. However, since most existing methods for incremental learning focus on classification tasks, efficient online learning for regression is still an underdeveloped area. In this work, we introduce an extension to the SAM-kNN Regressor that incorporates metric learning in order to improve the prediction quality on data streams, gain insights into the relevance of different input features and based on that, transform the input data into a lower dimension in order to improve computational complexity and suitability for high-dimensional data. We evaluate our proposed method on artificial data, to demonstrate its applicability in various scenarios. In addition to that, we apply the method to the real-world problem of water distribution network monitoring. Specifically, we demonstrate that sensor faults in the water distribution network can be detected by monitoring the feature relevances computed by our algorithm.
https://doi.org/10.1080/08839514.2023.2198846
Introduction
To foster usefulness and accountability of machine learning (ML), it is essential to explain a model's decisions in addition to evaluating its performance. Accordingly, the field of explainable artificial intelligence (XAI) has resurfaced as a topic of active research, offering approaches to address the “how” and “why” of automated decision-making. Within this domain, counterfactual explanations (CFEs) have gained considerable traction as a psychologically grounded approach to generate post-hoc explanations. To do so, CFEs highlight what changes to a model's input would have changed its prediction in a particular way. However, despite the introduction of numerous CFE approaches, their usability has yet to be thoroughly validated at the human level.
Methods
To advance the field of XAI, we introduce the Alien Zoo, an engaging, web-based and game-inspired experimental framework. The Alien Zoo provides the means to evaluate usability of CFEs for gaining new knowledge from an automated system, targeting novice users in a domain-general context. As a proof of concept, we demonstrate the practical efficacy and feasibility of this approach in a user study.
Results
Our results suggest the efficacy of the Alien Zoo framework for empirically investigating aspects of counterfactual explanations in a game-type scenario and a low-knowledge domain. The proof of concept study reveals that users benefit from receiving CFEs compared to no explanation, both in terms of objective performance in the proposed iterative learning task, and subjective usability.
We have witnessed in recent years an ever-growing volume of information becoming available in a streaming manner in various application areas. As a result, there is an emerging need for online learning methods that train predictive models on-the-fly. A series of open challenges, however, hinder their deployment in practice. These are: learning as data arrive in real-time one-by-one, learning from data with limited ground truth information, learning from nonstationary data, and learning from severely imbalanced data, all while occupying a limited amount of memory for data storage. We propose the ActiSiamese algorithm, which addresses these challenges by combining online active learning, siamese networks, and a multi-queue memory. It develops a new density-based active learning strategy which considers similarity in the latent (rather than the input) space. We conduct an extensive study that compares the role of different active learning budgets and strategies, the performance with/without memory, the performance with/without ensembling, in both synthetic and real-world datasets, under different data nonstationarity characteristics and class imbalance levels. ActiSiamese outperforms baseline and state-of-the-art algorithms, and is effective under severe imbalance, even when only a fraction of the arriving instances’ labels is available. We publicly release our code to the community.
https://www.sciencedirect.com/science/article/pii/S0925231222011481
A key challenge in designing algorithms for leakage detection and isolation in drinking water distribution systems is the performance evaluation and comparison between methodologies using benchmarks. For this purpose, the Battle of the Leakage Detection and Isolation Methods (BattLeDIM) competition was organized in 2020 with the aim to objectively compare the performance of methods for the detection and localization of leakage events, relying on supervisory control and data acquisition (SCADA) measurements of flow and pressure sensors installed within a virtual water distribution system. Several teams from academia and the industry submitted their solutions using various techniques including time series analysis, statistical methods, machine learning, mathematical programming, metaheuristics, and engineering judgment, and were evaluated using realistic economic criteria. This paper summarizes the results of the competition and conducts an analysis of the different leakage detection and isolation methods used by the teams. The competition results highlight the need for further development of methods for leakage detection and isolation, and also the need to develop additional open benchmark problems for this purpose.
https://ascelibrary.org/doi/full/10.1061/%28ASCE%29WR.1943-5452.0001601
Numerical optimization is gradually finding its way into drinking water practice. For successful introduction of optimization into the sector, it is important that researchers and water utility experts work together on the problem formulation. Water utilities heed the solutions provided by optimization techniques only when the underlying approach and performance criteria match their specific goals. In this contribution, we demonstrate the application of numerical optimization on a real-life problem. The Belgian utility De Watergroep is looking to not only reinforce its distribution networks but to also structurally modify the network’s topology to enhance the quality of water delivered in the future. To help the utility explore the possibilities of these far-reaching changes in the most flexible way possible, an optimization problem was formulated to optimize topology and pipe sizing simultaneously for the distribution network of a Belgian city. The objective of the problem is to minimize the volume of the looped network and thereby work towards a situation where most of the customers are fed by branched extremities of the network. This objective is constrained by pressure and fire flow requirements and thresholds on the number of customers on the branched sections. The requirements for continuity of supply under failure scenarios are guaranteed by these constraints, as verified in the final solution. The results of the optimization process show that it is possible to design a network which is 18.5% cheaper than the currently existing network. Moreover, it turns out that the topology, previously completely meshed, can be restructured so that 67% of the network length is turned into branched clusters, with a meshed superstructure of 33% of the length remaining.
Water distribution networks (WDNs) evolve continuously over time. Changes in water
demands and pipe deterioration require construction upgrades to be performed on the
network during its entire lifecycle. However, strategically planning WDNs, especially for the
long term, is a challenging task. This is because parameters that are essential for the
description of WDNs in the future, such as climate, population and demand transitions, are
characterized by deep uncertainty. To cope with future uncertainty, and avoid overdesign or
costly unplanned and reactive interventions, research is moving away from the static design
of WDNs. Dynamic design approaches aim to make water networks adaptive to changing
conditions over long planning horizons. A promising dynamic design approach is the staged
design of WDNs, in which the planning horizon is divided into construction phases. This
approach allows short-term interventions to be made, while simultaneously considering the
expected long-term network growth outcomes. The aim of this paper is to summarize the
current state of the art in staged design of water distribution networks. To achieve that, we
critically examined relevant publications and classified them according to their shared key
characteristics, such as the nature of the design problem (new or existing network design,
expansion, strengthening, and rehabilitation), problem formulation (objective functions,
length of planning horizon), optimization method, and uncertainty considerations. In the
process, we discuss the latest findings in the literature, highlight the major contributions of
staged design on water distribution networks, and suggest future research directions.
The percentage of the world population living in urban settlements is expected to increase to
70% of 9.7 billion by 2050. Historically, as cities grew, the development of new water
infrastructures followed as needed. However, these developments had less to do with real
planning than with reacting to crisis situations and urgent needs, due to the inability of urban
water planners to consider long-term, deeply uncertain and ambiguous factors affecting urban
development and water demand. The “Smart Water Futures: Designing the Next Generation of
Urban Drinking Water Systems” or “Water-Futures” project, which was funded by the
European Research Council (ERC), aims to develop a new theoretical framework for the
allocation and development decisions on drinking water infrastructure systems so that they
are: (i) socially equitable, (ii) economically efficient, and (iii) environmentally resilient, as
advocated by the UN Agenda 2030, Sustainable Development Goals. The ERC Synergy grant
project tackles the “wicked problem” of transitioning water distribution systems in a holistic
manner, involving civil engineering, control engineering, machine learning, decision theory
and environmental economics expertise. Developing a theoretical foundation for designing
smart water systems that can deliver optimally robust and resilient decisions for short/long-
term planning is one of the biggest challenges that future cities will be facing. This paper
presents an overview of related past research on this topic, the knowledge gaps in terms of
investigating the problem in a holistic manner, and the key early outcomes of the project.
Vaquet V., Artelt A., Brinkrolf J. and Hammer B., "Taking Care of Our Drinking Water: Dealing with Sensor Faults in Water Distribution Networks", ICANN 2022
The water supply is part of the critical infrastructure, as access to clean drinking water is essential to ensure the health of the people. To guarantee the availability of fresh water, efficient and reliable water distribution networks are crucial. Monitoring these systems is necessary to avoid deterioration in water quality, deal with leakages and prevent cyber-physical attacks. While the installation of a growing number of sensors is increasing the possibilities to monitor the system, checking the sensors themselves becomes another challenge, as sensor faults negatively influence the reliability of systems dealing with leakages and monitoring water quality. In this work, we aim to overcome the negative implications induced by sensor faults by using a sensor fault monitoring system based on three steps. First, established residual-based fault detection is applied. In a second step, we extend this method to a fault isolation technique, and finally we propose fault accommodation by standard imputation techniques and different types of virtual sensors.
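The three steps named above (detection, isolation, accommodation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the method of the paper: the "virtual sensor" is simply the mean of the other sensors, the sensor names and readings are made up, and the threshold is hand-picked.

```python
from statistics import mean

def virtual_sensor(readings, target):
    """Estimate one sensor from the others (hypothetical stand-in: their mean)."""
    others = [v for name, v in readings.items() if name != target]
    return mean(others)

def detect_and_isolate(readings, threshold):
    """Flag every sensor whose residual |measured - estimated| exceeds threshold."""
    residuals = {
        name: abs(value - virtual_sensor(readings, name))
        for name, value in readings.items()
    }
    flagged = [name for name, r in residuals.items() if r > threshold]
    return flagged, residuals

def accommodate(readings, faulty):
    """Impute each faulty reading with the mean of the healthy sensors."""
    healthy = {n: v for n, v in readings.items() if n not in faulty}
    fixed = dict(readings)
    for name in faulty:
        fixed[name] = mean(healthy.values())
    return fixed

# Made-up pressure readings; "p4" carries an implausible value.
readings = {"p1": 50.1, "p2": 49.8, "p3": 50.3, "p4": 80.0}
faulty, residuals = detect_and_isolate(readings, threshold=20.0)
repaired = accommodate(readings, faulty)
```

In a real network the estimate for each sensor would presumably come from a learned regression model over neighboring sensors instead of a plain mean; the residual-and-threshold logic stays the same.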
Jakob J., Artelt A., Hasenjäger M. and Hammer B., "SAM-kNN Regressor for Online Learning in Water Distribution Networks", ICANN 2022
Water distribution networks are a key component of modern infrastructure for housing and industry. They transport and distribute water via widely branched networks from sources to consumers. In order to guarantee a working network at all times, the water supply company continuously monitors the network and takes actions when necessary – e.g. reacting to leakages, sensor faults and drops in water quality. Since real-world networks are too large and complex to be monitored by a human, algorithmic monitoring systems have been developed. A popular type of such system is the residual-based anomaly detection system, which can detect events such as leakages and sensor faults. For continuous, high-quality monitoring, these systems must adapt to changed demands and to the presence of various anomalies.
In this work, we propose an adaptation of the incremental SAM-kNN classifier for regression to build a residual-based anomaly detection system for water distribution networks that is able to adapt to any kind of change.
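The idea of an incrementally updated regressor driving a residual-based detector can be sketched as follows. This is not SAM-kNN: it is a plain kNN regressor over a fixed-size sliding window, used here only as a simplified stand-in. The class name, the toy stream, and the threshold are hypothetical.

```python
from collections import deque

class SlidingWindowKNNRegressor:
    """Plain kNN regression over the most recent samples (not SAM-kNN itself)."""

    def __init__(self, k=3, window=50):
        self.k = k
        self.memory = deque(maxlen=window)  # oldest samples drop out automatically

    def predict(self, x):
        if not self.memory:
            return 0.0
        # Take the k stored samples closest to x (squared Euclidean distance).
        nearest = sorted(
            self.memory,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)),
        )[: self.k]
        return sum(y for _, y in nearest) / len(nearest)

    def update(self, x, y):
        self.memory.append((tuple(x), y))

def monitor(stream, model, threshold):
    """Test-then-train loop: flag a step when the residual exceeds threshold."""
    alarms = []
    for t, (x, y) in enumerate(stream):
        residual = abs(y - model.predict(x))
        # Only raise alarms once the memory holds at least k samples.
        if len(model.memory) >= model.k and residual > threshold:
            alarms.append(t)
        model.update(x, y)
    return alarms

# Toy stream: the target follows y = 2 * x, with one corrupted value at step 30.
stream = [((float(i % 10),), 2.0 * (i % 10)) for i in range(40)]
stream[30] = (stream[30][0], 100.0)
alarms = monitor(stream, SlidingWindowKNNRegressor(k=3, window=50), threshold=5.0)
```

Note that this sketch also stores the flagged sample in memory; a practical system might skip or down-weight updates on samples that raised an alarm.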
Artelt A., Vrachimis S., Eliades D., Polycarpou M. and Hammer B., "One Explanation to Rule them All -- Ensemble Consistent Explanations", XAI workshop at IJCAI 2022
Transparency is a major requirement of modern AI-based decision-making systems deployed in the real world. A popular approach for achieving transparency is by means of explanations. A wide variety of different explanations have been proposed for single decision-making systems. In practice, however, a set (i.e. an ensemble) of decisions is often used instead of a single decision, in particular in complex systems. Unfortunately, explanation methods for single decision-making systems are not easily applicable to ensembles -- i.e. they would yield an ensemble of individual explanations which are not necessarily consistent, hence less useful and more difficult to understand than a single consistent explanation of all observed phenomena. We propose a novel concept for consistently explaining an ensemble of decisions locally with a single explanation -- we introduce a formal concept, as well as a specific implementation using counterfactual explanations.
Pittis N., Koundouri P., Samartzis P., Englezos N. and Papandreou A., "Ambiguity aversion, modern Bayesianism and small worlds" [version 1; peer review: 2 approved], Open Research Europe 2021, 1:13
The central question of this paper is whether a rational agent under uncertainty can exhibit ambiguity aversion (AA). The answer to this question depends on the way the agent forms her probabilistic beliefs: classical Bayesianism (CB) vs modern Bayesianism (MB). We revisit Schmeidler's coin-based example and show that a rational MB agent operating in the context of a "small world", cannot exhibit AA. Hence we argue that the motivation of AA based on Schmeidler's coin-based and Ellsberg's classic urn-based examples, is poor, since they correspond to cases of "small worlds". We also argue that MB, not only avoids AA, but also proves to be normatively superior to CB because an MB agent (i) avoids logical inconsistencies akin to the relation between her subjective probability and objective chance, (ii) resolves the problem of "old evidence" and (iii) allows psychological detachment from actual evidence, hence avoiding the problem of "cognitive dissonance". As far as AA is concerned, we claim that it may be thought of as a (potential) property of large worlds, because in such worlds MB is likely to be infeasible.
Alamanos, A.; Koundouri, P.; Papadaki, L.; Pliakou, T.; Toli, E. Water for Tomorrow: A Living Lab on the Creation of the Science-Policy-Stakeholder Interface. Water 2022, 14, 2879.
The proactive sustainable management of scarce water across vulnerable agricultural areas of South Europe is a timely issue of major importance, especially under the recent challenges affecting complex water systems. The Basin District of Thessaly, Greece’s driest rural region, has a long history of multiple issues of an environmental, planning, economic or administrative nature, as well as a history of conflict. For the first time, the region’s key-stakeholders, including scientists and policymakers, participated in tactical meetings during the 19-month project “Water For Tomorrow”. The goal was to establish a common and holistic understanding of the problems, assess the lessons learned from the failures of the past and co-develop a list of policy recommendations, placing them in the broader context of sustainability. These refer to enhanced and transparent information, data, accountability, cooperation/communication among authorities and stakeholders, capacity building, new technologies and modernization of current practices, reasonable demand and supply management, flexible renewable energy portfolios and circular approaches, among others. This work has significant implications for the integrated water resources management of similar south-European cases, including the Third-Cycle of the River Basin Management Plans and the International Sustainability Agendas.
Currently, in the water distribution systems literature, fault detection methods are typically evaluated on benchmark water networks that do not include real-time experimental data, or on private commercial datasets, which prohibit the reproducibility of the results. Moreover, realistic modeling of faults on hydraulic system components, sensors and actuators is often unavailable. In this work, we provide a framework for the application of fault-diagnosis methodologies on WaterSafe, a water network benchmark for fault diagnosis. The WaterSafe benchmark is a small-scale replica of a water transport network constructed using industrial components and devices, while the communications are implemented in a way that resembles a water utility's Supervisory Control and Data Acquisition system. A general problem formulation for fault diagnosis on water systems is provided, in accordance with the mathematical model of the benchmark. Moreover, we provide a calibrated simulation model including system, sensor and actuator faults, based on observations from the real system. Finally, we provide open access to the datasets generated from the experiments containing the aforementioned faults.
https://www.sciencedirect.com/science/article/pii/S2405896322005870