In the last few years I have often needed to aggregate the data in a gridded dataset (for example, ERA5 meteorological data) into time series according to specific borders (for example, administrative regions). In the past I used an R package that I developed myself and that I used, for example, for the ERA-NUTS dataset. However, when I recently had to deal with a 5 km resolution hydrological model (1M grid cells per day over Europe), I had to find a solution that allowed me to work out-of-memory.
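To make the idea concrete, here is a minimal Python sketch of the aggregation step. It assumes the borders have already been rasterised into a region mask on the same grid as the data, and it computes a plain unweighted mean per region (no cell-area weighting). The function name `aggregate_by_region` and the toy values are hypothetical, and this is not the R package mentioned above. Because the grids are consumed one time step at a time, only a single day needs to sit in memory.

```python
def aggregate_by_region(daily_grids, region_mask):
    """Yield one {region: mean value} dict per time step.

    daily_grids: iterable of 2D grids (lists of rows), one per time step.
    region_mask: 2D grid of region labels, None for cells outside any region.
    """
    for grid in daily_grids:  # one time step at a time
        sums, counts = {}, {}
        for row_values, row_labels in zip(grid, region_mask):
            for value, label in zip(row_values, row_labels):
                if label is None:  # cell not assigned to any region
                    continue
                sums[label] = sums.get(label, 0.0) + value
                counts[label] = counts.get(label, 0) + 1
        # Unweighted mean of the cells falling in each region
        yield {label: sums[label] / counts[label] for label in sums}


# Toy example: a 2x2 grid, two regions, two days
mask = [["A", "A"], ["B", None]]
days = [
    [[1.0, 3.0], [5.0, 9.0]],
    [[2.0, 4.0], [6.0, 0.0]],
]
series = list(aggregate_by_region(days, mask))
```

Passing a generator that reads one day from disk at a time, instead of the in-memory list used here, keeps the memory footprint independent of the length of the time series.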

Girolamo Ruscelli, an Italian cartographer and writer, drew this map of West Africa in 1561. Today, almost 500 years later, we can access incredibly high-resolution satellite images from literally everywhere, like the following one produced by NASA/NOAA Suomi-NPP and downloaded from Worldview. These two pictures roughly summarise the impressive advancement in our knowledge of the planet but, still today, some parts of the world have far fewer observations than others.

NOTE: this post has been updated. The previous code was based on the conversion of the GRIB files into NetCDF, which unfortunately introduces some issues. Among the data products of the Copernicus Climate Change Service (C3S) available through the Climate Data Store, there is a collection of seasonal forecasts which, as of 13th November, consists of five different models (ECMWF, UK Met Office, Meteo-France, DWD and CMCC). One of the interesting things you can do with multiple climate models is to combine them into a multi-model ensemble.

Public sector information (PSI, the information held by governments) is very valuable and its value is expected to increase in the coming decades. The first European Union PSI directive dates from 2003 (2003/98/EC) and was amended ten years later, in 2013 (2013/37/EU). However, it had many limitations, and in fact the European Commission launched a public consultation in late 2017. The consultation received many responses (slightly fewer than 300), including one from the energy modelling community (download) that is definitely worth reading, because in its 23 pages it summarises the current limitations in licenses, data access and quality, and fair use.

Last summer the water temperature in some important rivers in central Europe (especially the Rhine and the Rhone) was so high that some nuclear and coal power plants in Germany, France and Switzerland had to limit their generation or even shut down, because regulations prohibit them from discharging the warm water used for cooling the plant when the river temperature is too high. It was July, during a heatwave strong enough to deserve a Wikipedia page.

We can easily say that the Copernicus Climate Change Service (C3S) initiative is definitely shaping the field of climate services. I might have said “Climate Science” instead of “Climate Services”, but I want to focus here on the applied side of climate science.

The Copernicus Climate Change Service (C3S) initiative and the CDS

The best thing about C3S is that it is trying to foster the creation of an ecosystem of data services and, not surprisingly, software (design, development, architecture) plays a critical role here.

I have recently moved to North Holland, and in the past weeks the weather has been particularly fortunate: for many consecutive days there was no rain, and the temperature has been very high for this area (the maximum temperature was easily above 25 °C). Given that I have no experience yet of the weather here, I asked around how frequently this happens and got a variety of answers. So I decided to look at some historical time series of temperature and precipitation to try to satisfy my curiosity.

Skill scores are very commonly used to evaluate the performance of the probabilistic forecasts provided by seasonal forecasting. A skill score is normally defined as a ratio between a specific accuracy measure computed on the forecast and the same measure applied to a reference forecast. In the case of seasonal forecasts, the reference forecast commonly used is the climatological probability, that is, the observed frequency of the target event (the event predicted by the forecast) in the past.
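As an illustration of this definition, here is a minimal Python sketch of one widely used skill score, the Brier skill score, which takes the form of one minus the ratio between the Brier score of the forecast and the Brier score of the climatological reference. The choice of the Brier score as the accuracy measure, the function names and the sample numbers are assumptions made for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, past_outcomes):
    """Skill of a probabilistic forecast relative to climatology.

    past_outcomes: 0/1 occurrences of the target event in the past, used to
    estimate the climatological probability (the reference forecast).
    """
    climatology = sum(past_outcomes) / len(past_outcomes)
    bs_forecast = brier_score(probs, outcomes)
    bs_reference = brier_score([climatology] * len(outcomes), outcomes)
    if bs_reference == 0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    # 1 = perfect forecast, 0 = no better than climatology, < 0 = worse
    return 1.0 - bs_forecast / bs_reference


# Illustrative numbers only
forecast_probs = [0.9, 0.1, 0.8, 0.2]
observed = [1, 0, 1, 0]
history = [1, 0, 0, 1, 1, 0]  # event occurred in half of the past cases
bss = brier_skill_score(forecast_probs, observed, history)
```

With these made-up numbers the forecast's Brier score is one tenth of the reference score, so the skill score comes out at roughly 0.9.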

Recently I had the opportunity to contribute to a document providing feedback on the European Commission's public consultation on directives 2003/98/EC and 2013/37/EU. The work was coordinated by Robbie Morrison, an important contributor to the OpenMod Initiative, among other things. The document is available at this link, and it discusses the concerns related to the (re-)use of public information and the needs of the energy modelling community. It is also related to my recent experience in the C3S ECEM service where, with a bunch of smart and enthusiastic people from the energy and meteorology community, I was able to really understand the potential and the limitations of using publicly available data to model European power systems and their link with meteorological variables.

I am not exaggerating when I say that the main reason to develop in R is the presence of the tidyverse and the plethora of packages that let you explore data and deal easily with the most common tasks (splitting, grouping, temporal aggregation, etc.). Unfortunately, the biggest limitation of R is its ability to manipulate arrays of data (data cubes) effectively, especially nowadays when it is normal to deal with very large data sets (please, don’t say “big”).