Growing variability in hydrological regimes and more frequent and intense extreme events necessitate a transition from conventional rule-based reservoir management to adaptive operations. This transition can be facilitated by harnessing improved hydro-meteorological forecasts and implementing feedback and feedforward control strategies. While the importance of assessing the value of hydrological forecasts for reservoir operation has long been recognized and studied, new opportunities and challenges are now emerging thanks to the wealth of hydro-meteorological forecast products available over different time scales. These products often include information on uncertainty via ensemble forecasts, which are the current standard in operational forecasting. When multiple forecasts from different systems are available, users must address a number of challenges, including the selection of the forecast product, the lead time, the variable aggregation, the bias correction, and the treatment of forecast uncertainty. Usually, these choices are reservoir-specific and based on the operators’ experience, and the operational rules and guidelines supporting such critical choices are often not transparently reported. Here we explore the potential for advancing forecast-informed reservoir operations via Reinforcement Learning (RL) algorithms. Our RL approach is evaluated on the Lake Como system in Northern Italy, a regulated lake primarily operated for flood control and water supply. The sub-alpine basin of the lake is characterized by mixed slow and fast dynamics resulting from its snow- and rain-dominated hydrology.
In this context, forecasts over short and seasonal time scales may be valuable, and today the lake operator has access to different forecast products: short-term (i.e., 60-hour lead time) deterministic forecasts produced with locally calibrated models, as well as the sub-seasonal and seasonal forecasts of the Copernicus Emergency Management Service's European Flood Awareness System. Our results show that RL can support the extraction of valuable information from the available forecast products. Specifically, we first extended the Evolutionary Multi-Objective Direct Policy Search method to explore a wider decision space that includes, along with the operating policy parameters, hyperparameters determining how the forecast information is processed: the selection of the forecast product, the lead time, and the variable aggregation. The strength of this approach is that information extraction is fully integrated with the multi-objective policy design, producing a seamless RL approach able to extract the most valuable information for different Pareto-optimal tradeoffs. In addition, we show that performing the optimization through Monte Carlo simulations that use all members of the forecast ensemble yields more robust performance than informing the operations with deterministic forecasts or statistics extracted from the forecast ensemble.
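The extended decision space and the ensemble-based evaluation can be illustrated with a minimal sketch. It is not the implementation used in the study: the linear release rule, the toy mass balance, the two objectives, the product labels, the aggregation options, and every numerical value are assumptions chosen only for illustration. The sketch shows a single decision vector `theta` that holds both the policy parameters and three hyperparameters (forecast product, aggregation, lead time), and an objective evaluation that averages over all ensemble members.

```python
import numpy as np

PRODUCTS = ("short_term", "seasonal")   # hypothetical product labels
AGGREGATORS = (np.mean, np.max)         # hypothetical aggregation choices


def decode(theta, n_policy=3):
    """Split theta into policy parameters and forecast-processing hyperparameters."""
    policy = np.asarray(theta[:n_policy], dtype=float)
    # Hyperparameters are encoded in [0, 1) and mapped to discrete choices.
    h = np.clip(np.asarray(theta[n_policy:n_policy + 3], dtype=float), 0.0, 1.0 - 1e-9)
    product = PRODUCTS[int(h[0] * len(PRODUCTS))]
    aggregate = AGGREGATORS[int(h[1] * len(AGGREGATORS))]
    return policy, product, aggregate, h[2]


def simulate(policy, feature, inflow, s0=100.0, flood_level=150.0, demand=80.0):
    """Toy reservoir simulation with an assumed linear release rule."""
    a, b, c = policy
    storage, flood_days, deficit = s0, 0, 0.0
    for t in range(len(inflow)):
        target = a * storage + b * feature[t] + c
        # Release cannot be negative or exceed the available water.
        release = float(np.clip(target, 0.0, storage + inflow[t]))
        storage = storage + inflow[t] - release
        flood_days += int(storage > flood_level)
        deficit += max(demand - release, 0.0) ** 2
    n = len(inflow)
    return np.array([flood_days / n, deficit / n])  # both to be minimized


def evaluate(theta, forecasts, inflow):
    """Average the objectives over all members of the selected forecast product."""
    policy, product, aggregate, lead_frac = decode(theta)
    fc = forecasts[product]                      # shape: (members, days, leads)
    n_lead = 1 + int(lead_frac * fc.shape[2])    # number of lead steps used
    features = aggregate(fc[:, :, :n_lead], axis=2)  # shape: (members, days)
    objs = np.array([simulate(policy, f, inflow) for f in features])
    return objs.mean(axis=0)


# Synthetic data standing in for observed inflows and forecast archives.
rng = np.random.default_rng(0)
inflow = rng.gamma(2.0, 40.0, size=60)
forecasts = {
    "short_term": rng.gamma(2.0, 40.0, size=(1, 60, 3)),
    "seasonal": rng.gamma(2.0, 40.0, size=(5, 60, 30)),
}
theta = [0.1, 0.5, 10.0, 0.9, 0.2, 0.5]  # 3 policy parameters + 3 hyperparameters
J = evaluate(theta, forecasts, inflow)
```

In a direct policy search setting, a multi-objective evolutionary algorithm would search over `theta` and call `evaluate` for each candidate, so that the forecast-processing choices and the policy parameters are optimized together. For a deterministic product with a single member, the ensemble average reduces to one simulation.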