EFAS reporting point layer

by Corentin Carton de Wiart, Louise Arnal, Maurizio Latini, Blazej Krzeminski, Tiago Quintino and Christel Prudhomme, EFAS Computational Centre and ECMWF

On 8 October EFAS released an improved hydrological forecast layer (figure above). The layer shows a combination of 1) dynamically created reporting points where a flood signal is forecasted in the next 10 days (red and yellow squares), and 2) physical stations for which no flood is forecasted, but where hydrological information is shared by EFAS partners (grey and blue squares). This new design gives users a complete and coherent overview of all stations at which EFAS medium-range forecasts are accessible. It also enables partners who have shared their hydrological data with the EFAS consortium to monitor the hydrological evolution of their catchments of interest at any time, which was not previously possible.

The improved design facilitates the work of the forecasters on duty at the EFAS dissemination centres (the Hydro-Meteorological Services of Sweden, the Netherlands and Slovakia), as stations with an expected flood are quickly identifiable (red and yellow squares). In addition, EFAS partners can easily identify stations for which post-processed forecasts are available, as those stations are highlighted in blue. For all reporting points, detailed forecast information in the form of discharge hydrographs, time series of temperature or precipitation, forecast overview tables, etc. (figure below) is available.

Figure 2. Example of EFAS reporting point products.

On the production side, the number of points for which detailed forecast information is produced at each forecast run has increased from a few dozen dynamic points (where a flood signal is forecasted) to around 2,000 static points (points where EFAS partners share hydrological data with EFAS). With the legacy system, this increased complexity and computational demand would have prevented ECMWF from meeting its Service Level Agreement on forecast dissemination time.

The expected ever-increasing number of EFAS partners – and therefore number of reporting points – together with the envisaged increase in resolution of the EFAS system motivated a full review of the existing product generation chain to enable greater scalability and prepare for future upgrades. This was also an opportunity to reduce the complexity of the legacy tools that make up the operational suite, which used a mix of programming languages, such as Python, R, PCRaster, C++ and bash scripts. The new product generation was designed with the following ideas in mind:

  • Gather common functionalities of flood forecasting in a shared framework
  • Improve modularity of the system, allowing easier implementation of new features
  • Improve maintainability using modern software engineering processes
  • Promote collaborative development practices between hydrologists and computer scientists
  • Introduce parallel computing and data hypercubes into the process, preparing the system for the next resolution upgrade

A new Python framework called “danu” was developed to support these improvements. As pure Python can be inefficient for heavy operations, typically those looping over grid points, the framework delegates these parts to dedicated libraries, such as:

  • Numpy/Scipy
  • Xarray/Dask/Multiprocessing
  • PCRaster (Python interface only)
  • GDAL
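To illustrate this delegation, the sketch below contrasts a per-cell Python loop with the equivalent vectorised NumPy expression. The variable names and the anomaly computation are illustrative only and are not taken from danu.

```python
import numpy as np

def anomaly_loop(discharge, climatology):
    """Per-cell Python loop: clear, but slow on large grids."""
    out = np.empty_like(discharge)
    for i in range(discharge.shape[0]):
        for j in range(discharge.shape[1]):
            out[i, j] = discharge[i, j] - climatology[i, j]
    return out

def anomaly_vectorised(discharge, climatology):
    """Same computation delegated to NumPy's compiled loop."""
    return discharge - climatology

# Illustrative data: a small random discharge field and climatology.
rng = np.random.default_rng(0)
q = rng.random((100, 100))
clim = rng.random((100, 100))

# Both implementations agree; the vectorised form is far faster at scale.
assert np.allclose(anomaly_loop(q, clim), anomaly_vectorised(q, clim))
```

The same pattern generalises: wherever a loop over grid points appears, the work is handed to NumPy, Xarray/Dask, PCRaster or GDAL rather than executed in the Python interpreter.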
Figure 3. Runtime for different EFAS components and resulting speedup.

Using parallelism in compute-intensive parts of the workflow, for instance when calculating statistics over the ensemble members, also led to a dramatic improvement, with runtimes one to two orders of magnitude faster. Porting the multiple processes into a single framework also made it possible to optimize filesystem I/O by avoiding data transfers, which had been a major performance bottleneck. Having demonstrated the increased efficiency after code refactoring, the same exercise was conducted for the flash flood product generation (ERIC), sharing the libraries between the different product generation chains as much as possible. The reduction in runtime on key parts of the EFAS workflow is illustrated in the figure above.
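A minimal sketch of this kind of ensemble parallelism is shown below, using Python's standard multiprocessing pool to compute one threshold-exceedance map per worker. The function names, thresholds and grid sizes are hypothetical and do not reproduce the operational EFAS code.

```python
import numpy as np
from multiprocessing import Pool

def exceedance_fraction(args):
    """Fraction of ensemble members exceeding a threshold at each cell."""
    members, threshold = args
    return (members > threshold).mean(axis=0)

def ensemble_statistics(members, thresholds):
    """Compute one exceedance map per threshold, in parallel workers."""
    with Pool(processes=min(len(thresholds), 4)) as pool:
        return pool.map(exceedance_fraction,
                        [(members, t) for t in thresholds])

if __name__ == "__main__":
    # Illustrative data: 51 ensemble members on a 100 x 100 grid.
    rng = np.random.default_rng(1)
    members = rng.random((51, 100, 100))
    maps = ensemble_statistics(members, [0.5, 0.9, 0.99])
    print(len(maps), maps[0].shape)
```

In practice the heavy lifting per worker is still done by NumPy, so the parallelism multiplies the vectorisation gains rather than replacing them.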

These performance improvements were achieved thanks to a close collaboration between the scientists who developed the system and the computer scientists who reworked it, creating a positive feedback loop in which technical developments empower the science to go further. The tools developed in the Python framework “danu” and this successful collaboration between teams are now being applied to other environmental forecast modelling chains, such as those developed for the EU-funded SMUFF project ‘Seamless probabilistic multi-source forecasting of heavy rainfall hazards for European flood awareness’.