Why we need knowledge plumbers

(This article has been published on Medium)
Is data alone enough to tackle the challenges of climate change?
When we find answers and build solutions relying on climate data only, we risk misusing climate model outputs and drawing incorrect conclusions. To understand our world and its climate, there is something just as crucial as data: knowledge.
Nevertheless, before we can bring in knowledge, we must first collect data. Whenever you deal with climate change, collecting climate data is a critical step, because numerous organisations rely on that data to estimate risks and make decisions. Collecting it lets the whole organisation analyse the data and create insights from it, connecting it to existing data workflows and corporate processes.
Lost without a paddle: analysing climate data without knowledge
So how would that work? Let’s say a company needs to analyse the risk posed by climate change in Perth, Western Australia. Every year the company asks its experts to analyse the risk to the company’s assets over the next 30 years. This assessment is based on the latest climate change projections distributed by research institutions and by the Australian government. The process starts with collecting the climate data: downloading the simulation data from the climate models up to 2050. This data consists of hundreds of gigabytes of meteorological data (for example, rainfall and maximum daily temperature) simulated by the models for the entire globe. It is downloaded using data pipelines: sets of automated procedures performing a predefined sequence of steps such as data download, error checking and metadata collection. At the end of the process, the data is ready to be analysed on the company servers.
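To make the idea of a data pipeline a little more concrete, here is a minimal sketch in Python of the three steps mentioned above: download, error checking and metadata collection. Everything in it is an assumption made for illustration: the names `Dataset`, `download`, `check_errors`, `collect_metadata` and `run_pipeline` are hypothetical, the "download" returns a handful of made-up numbers instead of fetching real files, and the only check is that rainfall is never negative. A real pipeline would be far larger.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """One model output plus what the pipeline has learned about it."""
    model: str
    variable: str
    values: List[float]
    metadata: Dict[str, object] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)


def download(model: str, variable: str) -> Dataset:
    # Placeholder for a real download: these numbers are invented.
    fake_archive = {
        "model_a": [1.2, 0.0, 3.4],
        "model_b": [0.9, -5.0, 2.1],
    }
    return Dataset(model=model, variable=variable, values=fake_archive[model])


def check_errors(ds: Dataset) -> Dataset:
    # Illustrative check only: rainfall cannot be negative.
    for i, value in enumerate(ds.values):
        if value < 0:
            ds.errors.append(f"negative {ds.variable} at index {i}: {value}")
    return ds


def collect_metadata(ds: Dataset) -> Dataset:
    # Record basic facts about the data next to the data itself.
    ds.metadata.update(
        {"model": ds.model, "variable": ds.variable, "n_values": len(ds.values)}
    )
    return ds


# The predefined sequence of steps applied after the download.
STEPS = [check_errors, collect_metadata]


def run_pipeline(model: str, variable: str) -> Dataset:
    ds = download(model, variable)
    for step in STEPS:
        ds = step(ds)
    return ds
```

Calling `run_pipeline("model_b", "rainfall")` returns a dataset whose `errors` list records the negative value and whose `metadata` describes what was downloaded. Note what the sketch does not capture: nothing in it says why a value looks the way it does.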
Using this pipeline, the analysts develop ways to evaluate the probability of having extreme precipitation in some specific areas using the simulated rainfall values from the dozens of climate change models used for the projections. But while doing this they see that a couple of models simulate rainfall values drastically lower than the others. How should they interpret this difference? Are the values caused by an error in post-processing or by a different way to simulate the interactions between land and atmosphere? Or perhaps those values are just due to randomness: in other words, the natural fluctuations of weather simulated by the model. The climate data collected does not provide any answer alone. Something is missing.
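Spotting the models that stand out is the easy part, and it can be sketched in a few lines of Python. The example below is only an illustration under my own assumptions: the function name `flag_low_models`, the threshold of 3 and the rainfall figures are all invented, and the method (distance from the median, measured in median absolute deviations) is one common choice among many, not the approach of any specific team.

```python
from statistics import median
from typing import Dict, List


def flag_low_models(mean_rainfall: Dict[str, float], threshold: float = 3.0) -> List[str]:
    """Return the models whose rainfall is far below the ensemble median.

    Distance is measured in median absolute deviations (MAD), which is
    less sensitive to the outliers themselves than the standard deviation.
    """
    values = list(mean_rainfall.values())
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    if mad == 0:
        # All models (nearly) agree: nothing can be flagged this way.
        return []
    return sorted(
        name for name, v in mean_rainfall.items() if (centre - v) / mad > threshold
    )


# Invented annual rainfall (mm) for seven hypothetical models.
ensemble = {
    "m1": 610, "m2": 640, "m3": 655, "m4": 590,
    "m5": 620, "m6": 310, "m7": 295,
}
low_models = flag_low_models(ensemble)
```

With these made-up numbers, `low_models` contains `m6` and `m7`. The code tells the analysts which models differ, but not why: a post-processing error, a different land-atmosphere scheme and natural variability all produce the same flag.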
To generate insights and make decisions, you cannot use data directly. You need to process it, organise it, interpret it. And due to the complexity of the climate processes that this data tries to explain and simulate, you often need something more: knowledge about the data itself. That means understanding how the data has been produced, what has been produced, the known limitations, the uncertainty — but also the processes that the data is simulating and representing.
So our analysts in Perth have copied the data into the company systems, as a necessary step to produce useful information. But, as we have seen before, the analysts are unsure how to interpret some of the values coming from the models: this may potentially lead to some incorrect estimations and — eventually — wrong decisions. To improve the process, the analysts should collect knowledge too.
What kind of knowledge should the analysts try to collect for their assessment? They could start with the known limitations of the climate change models that produced the data that the team collected. Maybe someone knows if the quality of the simulations has ever been evaluated for Western Australia? This would help in interpreting the different outputs from the models. Or perhaps the analysts would like to know the year-over-year variability of the rainfall that the models simulate. This would help to understand if very low (or very high) values can be the result of natural fluctuations or not.
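The year-over-year variability mentioned above is something the analysts can estimate themselves once they know to look for it. Here is a minimal sketch, again with invented numbers and hypothetical function names (`interannual_variability`, `within_natural_range`). It assumes that "natural fluctuation" can be approximated by a band of two standard deviations around the mean of the annual totals, which is a crude rule of thumb: rainfall is rarely normally distributed, and five years is far too short a record for a real assessment.

```python
from statistics import mean, stdev
from typing import List, Tuple


def interannual_variability(annual_totals: List[float]) -> Tuple[float, float]:
    """Return the mean and the sample standard deviation of annual totals."""
    return mean(annual_totals), stdev(annual_totals)


def within_natural_range(
    value: float, annual_totals: List[float], n_sigma: float = 2.0
) -> bool:
    """Check whether a value lies within n_sigma standard deviations of the mean."""
    avg, sd = interannual_variability(annual_totals)
    return abs(value - avg) <= n_sigma * sd


# Invented annual rainfall totals (mm) simulated by one hypothetical model.
totals = [700, 650, 720, 680, 750]

plausible = within_natural_range(640, totals)   # a somewhat dry year
suspicious = within_natural_range(400, totals)  # a drastically dry year
```

With these numbers, 640 mm falls inside the band while 400 mm falls well outside it, which suggests that the second value needs an explanation other than year-to-year noise. Knowing which explanation applies is still a matter of knowledge, not of data.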
Even if you have a climate research team in your organisation, some of the above questions cannot be answered easily: this would require very specific knowledge about how the climate models have been implemented or how the experiments have been designed. Data collection is just the first step, but it is not enough: you still need to collect knowledge. And knowledge collection is a whole other story.
Casting the net: starting with the collection
So how does this work? For data, we have several protocols and software frameworks: there are literally hundreds of books discussing different ways of collecting data. But the same is not true for gathering knowledge. Instead, it’s all about talking, discussing, writing, reading, building wikis, etc.
For both data and knowledge, we need to establish a flow. In the same way as you build a software channel to bring data into your systems — the data pipeline I mentioned earlier — we need to build knowledge pipelines. A knowledge pipeline is a workflow moving outside knowledge in: from the world to your organisation. And just as plumbers design and maintain physical pipelines, we need people ensuring that the knowledge flows smoothly and effectively. Let’s call these people knowledge plumbers.
So just as we need data pipelines, we need knowledge pipelines and people to build and maintain them. In other words, we need to treat climate knowledge like we treat data. Just like data, it needs retrieving, storing, structuring and sharing internally.
But the comparison between knowledge and data has its limitations. For one, creating and sharing data is a discrete process, commonly limited only by technology and infrastructure. It’s something that can be divided into tasks and steps that are normally automated and performed by software. Knowledge, on the other hand, is created and exchanged through a more organic process. This process is not driven by software, it’s hard to generalise, and it’s sometimes unpredictable.
In other words, building a knowledge pipeline seems more challenging than building a data pipeline. The question is: how do we begin?
I should probably clarify that we are not starting from scratch. We can already find around us similar processes in place, even if they cannot yet be called knowledge pipelines, exactly. Take, for example, climate services, defined by the World Meteorological Organization as “the provision and use of climate data, information and knowledge to assist decision-making”. There are hundreds of examples of climate services and, in most cases, they are proving their worth: bringing climate knowledge to the people who need it to make decisions.
All the discussion so far indicates that at the centre of a knowledge pipeline is the flow of knowledge. A knowledge pipeline moves knowledge from the outside in, where the outside is anything external to the organisation: universities, research centres and data providers, for example. In our Perth case, the outside is the Australian government and the research institutions running the climate simulations.
However, sometimes climate knowledge is too complex to be used in an organisation straight away and therefore needs translating. In her 2021 article “Business risk and the emergence of climate analytics”, Dr Tanya Fiedler writes about “climate translators”:
Their critical role lies in translating the complex climate information compiled from observations, meteorological reanalyses, and near- and long-term model predictions for non-expert users. Climate translators are needed to communicate an understanding of the direct risks arising from climate change on physical assets, businesses, portfolios and supply chains, but also potentially more disruptive impacts associated with tipping points.
So while knowledge plumbers can connect different places to let the knowledge flow smoothly, translators can help in shaping this knowledge — making it accessible.
To put it simply, for knowledge to be used properly, it must be understood by all users — and that’s why sometimes translators are also a fundamental part of an effective knowledge pipeline.
The long and short of it
At the end of the day, when it comes to climate change we simply cannot afford not to build these knowledge pipelines.
Failing to do so risks, first of all, overestimating the capability of climate models to represent reality. That means losing the ability to distinguish between “model-land” (as Erica Thompson describes it) and the real world; not everything in the data is “real”, in other words. There’s a lyric from a Radiohead song that captures this well: “Just ’cause you feel it / doesn’t mean it’s there”.
The second risk is that you will ultimately be less effective. The lack of proper knowledge may reduce the users’ trust in the process: we can imagine knowledge as shedding light on the data; in the darkness, you are slow and uncertain. To illustrate, suppose climate projections indicate a significant increase in the risk of flooding for a particular building. Without sufficient knowledge, we may feel hesitant to implement costly mitigation measures, such as building a new drainage system or a reservoir. The impact of this hesitance could be severe: it might lead to serious damage in the future and, more generally, it might erode the credibility of the entire decision-making process and undermine the relationship between the people involved in it.
In case I haven’t said it enough times: climate knowledge is as important as climate data. We need to permeate our organisations with climate knowledge, making it a core part of our decision processes. What’s more, knowledge collection does not work in a single direction. Knowledge cannot be ordered from a menu, like falafel at a street kiosk. Rather, to collect knowledge we need to engage with the scientific community, and we need climate translators to explain what we are trying to achieve with it. Knowledge creation does not happen in a vacuum; it requires context and collaboration.
Indeed, given the urgency of the climate emergency, we must start considering data as an asset that reveals its true value only with knowledge. To achieve this, we need to build robust knowledge pipelines that can transport the necessary insights and understanding along with the data itself. Just as physical pipelines move resources from one place to another, these knowledge pipelines must facilitate the smooth flow of climate knowledge into our organisations and into our decisions.
(I want to thank Marcello Petitta for the interesting discussion on this topic)