Information on crop yield at scales ranging from the field to the global level is essential for farmers and decision makers. First, farmers need reliable crop yield estimates, because knowledge of crop yield at the field level helps them monitor the yield effects of management choices, identify potential threats (e.g. the consequences of increasing drought occurrence during the growing season) and explore potential opportunities. Knowledge of average yield and yield variability is also essential for insurance purposes. Second, crop yield data are needed for decision making and strategic planning. For example, regional crop yield data are indispensable for policy and decision makers who set up agricultural development programs. Current data sources for monitoring crop yield, such as regional agricultural statistics, often lack the spatial and temporal resolution needed to monitor crop yield at the field level. Crop yield data derived from remotely sensed vegetation indices (VIs) could offer a higher spatial and temporal resolution: the spatial resolution of VIs can be as fine as a few meters, and some VIs are available daily. VIs monitor particular properties of crops that can be related to final crop yield at the farm and regional scale. A VI that is often used to monitor crop yield is the normalized difference vegetation index (NDVI), an indicator of the photosynthetically active biomass. Several studies have shown that the NDVI during the growing season, or at particular stages of it, is related to crop yield. VIs such as NDVI can therefore be used to estimate crop yield with empirical modelling strategies. Empirical crop yield models are less data intensive than mechanistic VI-based crop models, since they relate VIs to crop yield using statistical techniques such as linear regression or random forests.
In this research we evaluated the applicability of empirical NDVI-based crop yield models for sugar beet, potato and winter wheat in northern Belgium. Farm-level crop yield data for 468 sugar beet, 685 potato and 666 winter wheat fields from 2016-2018 were available from the Department of Agriculture. The openEO platform (https://openeo.org/) was used to extract 5-daily Sentinel-2 NDVI pixels (10 m resolution) within each field, apply a cloud mask based on the Sentinel-2 scene classification layer and compute the average NDVI series for each field from the extracted pixels. For each field the NDVI integral was calculated using the trapezoidal rule. The growing season was defined as day of year (DOY) 91-273 (i.e. beginning of April to end of September) for sugar beet, DOY 121-273 (i.e. end of April to end of September) for potato and DOY 1-202 (i.e. beginning of January to end of July) for winter wheat, and only NDVI values within these windows were considered. Random forest models were built for each crop. In a first model, the NDVI integral was the only predictor of crop yield. In a second model, weather variables (i.e. monthly precipitation (P) and maximum temperature (Tmax) during the growing season) were added. For sugar beet and potato a third model was built, in which yield was modelled as a function of the NDVI integral and the monthly root zone soil water depletion during the growing season. Soil water depletion was modelled for each field with AquaCrop-OS, based on crop-specific parameters, soil texture and weather data (i.e. minimum and maximum temperature, precipitation and reference evapotranspiration). These models allowed us to evaluate whether adding soil texture information, by including the root zone soil water depletion, improved the crop yield model for sugar beet and potato.
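The NDVI integral step can be illustrated with a minimal sketch. It is not the code used in the study: the function name `ndvi_integral`, the `SEASON_DOY` dictionary and the convention that cloud-masked dates are stored as NaN are assumptions made for illustration. Only the growing-season windows and the trapezoidal rule come from the text.

```python
import math

# Growing-season windows in day of year (DOY), as stated in the text.
SEASON_DOY = {
    "sugar beet": (91, 273),
    "potato": (121, 273),
    "winter wheat": (1, 202),
}


def ndvi_integral(doy, ndvi, crop):
    """Trapezoidal integral (NDVI x days) of a field-average NDVI series.

    Assumption: cloud-masked observations are stored as NaN and are skipped,
    so the trapezoids bridge the gaps between valid observations.
    """
    start, end = SEASON_DOY[crop]
    # Keep valid observations inside the growing season, sorted by date.
    points = sorted(
        (d, v)
        for d, v in zip(doy, ndvi)
        if start <= d <= end and not math.isnan(v)
    )
    if len(points) < 2:
        return math.nan  # an integral needs at least two observations
    total = 0.0
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        # Interval width times the mean of the two NDVI values.
        total += (d1 - d0) * (v0 + v1) / 2.0
    return total


# Hypothetical 5-daily series with one cloud-masked date (NaN).
example_doy = [86, 91, 96, 101, 106]
example_ndvi = [0.15, 0.20, float("nan"), 0.40, 0.50]
print(ndvi_integral(example_doy, example_ndvi, "sugar beet"))
```

Skipping masked dates is equivalent to interpolating linearly across cloud gaps. Whether the study smoothed or gap-filled the series before integrating is not stated.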
Finally, a fourth model was built for sugar beet and potato in which crop yield was modelled as a function of the NDVI integral and the monthly P, Tmax and root zone soil water depletion during the growing season. All random forest models consisted of 500 trees. The out-of-bag prediction error (mean squared error, MSE) and the corresponding explained variance (R²) were used to evaluate model performance. The variable importance of the predictors was calculated for each model to determine which predictors explained most of the crop yield variability.
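The text does not spell out how the explained variance follows from the out-of-bag error. A common convention, assumed here rather than taken from the study, compares the out-of-bag MSE with the variance of the observed yields. The symbols are illustrative: y_i is the observed yield of field i, ŷ_i^OOB is the prediction for field i averaged over the trees that did not see that field during training, ȳ is the mean observed yield and n is the number of fields.

```latex
\[
\mathrm{MSE}_{\mathrm{OOB}} = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i^{\mathrm{OOB}}\bigr)^2,
\qquad
R^2 = 1 - \frac{\mathrm{MSE}_{\mathrm{OOB}}}{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \bar{y}\bigr)^2}
\]
```

Under this definition R² is 0 when the out-of-bag predictions are no better than predicting the mean yield for every field, and it becomes negative when they are worse.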
We found that empirical crop models based only on the NDVI integral were not able to explain winter wheat and potato yield variability in northern Belgium. For sugar beet, the random forest model based only on the NDVI integral explained only part of the yield variability (R²=0.16). The NDVI series of winter wheat and potato were not sensitive enough to the yield-affecting weather and soil water conditions during important phenological stages. In the period from 2016 to 2018, winter wheat yield variability was better predicted by the NDVI integral combined with monthly precipitation during tillering and anthesis (R²=0.66) than by the NDVI integral alone. For both sugar beet and potato, the models based on the NDVI integral and monthly root zone soil water depletion performed similarly to the models based on the NDVI integral and monthly weather variables. Combining both weather variables and root zone soil water depletion with the NDVI integral did not improve model performance compared with using either set of predictors with the NDVI integral. However, the variable importance plots show that Tmax and root zone soil water depletion in certain months, in combination with the NDVI integral, explain a large part of the sugar beet and potato yield variability. Maximum temperature in September, root zone soil water depletion in June and the NDVI integral were the most important variables for explaining potato yield variability (R²=0.56). For sugar beet, soil water depletion in April explained a large part of the yield variability (R²=0.84); the NDVI integral and maximum temperature in September were also important variables.
Our findings confirm the importance of meteorological variables during sensitive phenological stages. We conclude that information on yield-affecting weather and soil water conditions during important phenological stages is needed in addition to the NDVI integral to model the yield variability of sugar beet, potato and winter wheat in northern Belgium with empirical crop models.
D. B. Lobell, D. Thau, C. Seifert, E. Engle, and B. Little, “A scalable satellite-based crop yield mapper,” Remote Sens. Environ., vol. 164, pp. 324–333, Jul. 2015, doi: 10.1016/j.rse.2015.04.021.
P. C. Doraiswamy, B. Akhmedov, L. Beard, A. Stern, and R. Mueller, “Operational prediction of crop yields using MODIS data and products,” in Workshop proceedings: Remote sensing support to crop yield forecast and area estimates, ISPRS Archives XXXVI-8/W48, p. 5, 2007.
I. Becker-Reshef, E. Vermote, M. Lindeman, and C. Justice, “A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data,” Remote Sens. Environ., vol. 114, no. 6, pp. 1312–1323, Jun. 2010, doi: 10.1016/j.rse.2010.01.010.
A. Vannoppen et al., “Wheat Yield Estimation from NDVI and Regional Climate Models in Latvia,” Remote Sens., vol. 12, p. 2206, Jul. 2020, doi: 10.3390/rs12142206.
A. Vannoppen and A. Gobin, “Estimating Farm Wheat Yields from NDVI and Meteorological Data,” Agronomy, vol. 11, no. 5, Art. no. 5, May 2021, doi: 10.3390/agronomy11050946.
Y. Ö. Durgun, A. Gobin, G. Duveiller, and B. Tychon, “A study on trade-offs between spatial resolution and temporal sampling density for wheat yield estimation using both thermal and calendar time,” Int. J. Appl. Earth Obs. Geoinformation, vol. 86, p. 101988, Apr. 2020, doi: 10.1016/j.jag.2019.101988.
G. Genovese, C. Vignolles, T. Nègre, and G. Passera, “A methodology for a combined use of normalised difference vegetation index and CORINE land cover data for crop yield monitoring and forecasting. A case study on Spain,” Agronomie, vol. 21, no. 1, pp. 91–111, Jan. 2001, doi: 10.1051/agro:2001111.
O. Rojas, “Operational maize yield model development and validation based on remote sensing and agro‐meteorological data in Kenya,” Int. J. Remote Sens., vol. 28, no. 17, pp. 3775–3793, Sep. 2007, doi: 10.1080/01431160601075608.
M. Schramm et al., “The openEO API–Harmonising the Use of Earth Observation Cloud Services Using Virtual Data Cube Functionalities,” Remote Sens., vol. 13, no. 6, Art. no. 6, Jan. 2021, doi: 10.3390/rs13061125.
T. Foster et al., “AquaCrop-OS: An open source version of FAO’s crop water productivity model,” Agric. Water Manag., vol. 181, pp. 18–22, Feb. 2017, doi: