The aim of the paper is to compare several data-driven models using different Numerical Weather Prediction (NWP) input data and then to build an outperforming Multi-Model Ensemble (MME) together with its prediction intervals. Statistical, stochastic and hybrid machine-learning algorithms were developed, with NWP data from the IFS and WRF models used as input. It was found that the same machine-learning algorithm can perform differently when driven by NWP inputs of comparable accuracy. This apparent inconsistency depends on the capability of the machine-learning model to correct the bias error of the input data. The stochastic and hybrid models using the same WRF input, as well as the stochastic and non-linear statistical models using the same IFS input, produce very similar results. The MME obtained by averaging the best data-driven forecasts improves on the accuracy of the best-performing ensemble member, raising the skill score from 42% to 46%. To reach this performance, the ensemble should include forecasts of similar accuracy that are generated with the widest possible variety of data-driven techniques and NWP inputs. The new performance metrics defined in the paper help to explain the reasons behind the differences in model performance.
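The averaging step behind the MME can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the skill score is assumed to be 1 minus the ratio of the forecast RMSE to the RMSE of a reference forecast, the function names (`rmse`, `skill_score`, `ensemble_mean`) are hypothetical, and all numbers are synthetic values chosen only to show how partially offsetting member errors shrink under averaging. They do not reproduce the 42% and 46% figures.

```python
from math import sqrt


def rmse(forecast, observed):
    """Root mean square error between a forecast and the observations."""
    return sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))


def skill_score(forecast, reference, observed):
    """ASSUMED definition: 1 - RMSE(forecast) / RMSE(reference forecast)."""
    return 1.0 - rmse(forecast, observed) / rmse(reference, observed)


def ensemble_mean(members):
    """Unweighted average of the member forecasts at each time step."""
    return [sum(values) / len(values) for values in zip(*members)]


# Synthetic data (illustrative only, not from the paper).
observed = [10.0, 20.0, 30.0, 40.0]
reference = [14.0, 16.0, 34.0, 36.0]   # stand-in for a baseline forecast
member_a = [12.0, 19.0, 33.0, 38.0]    # errors: +2, -1, +3, -2
member_b = [9.0, 22.0, 28.0, 43.0]     # errors: -1, +2, -2, +3

mme = ensemble_mean([member_a, member_b])

for name, forecast in [("member_a", member_a), ("member_b", member_b), ("MME", mme)]:
    print(f"{name}: skill score = {skill_score(forecast, reference, observed):.3f}")
```

In this toy case the two members have the same RMSE, but their errors partly cancel, so the averaged forecast scores higher than either member. This mirrors the abstract's point that the ensemble benefits from members of similar accuracy but different origin.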