|Paper title||Leveraging Deep-Learning and Computer Vision to Provide Automated and Scalable Insight for Fresh Produce Crops|
|Form of presentation||Poster|
Recent advances in drone technology and computer vision techniques provide opportunities to improve yield and reduce chemical inputs in the fresh produce sector. We demonstrate a novel real-world approach that combines remote sensing and deep learning to provide accurate, reliable and efficient counting and sizing of fresh produce crops, such as lettuce, in agricultural production hot spots. In production regions across the world, including the UK, the USA and Spain, unmanned aerial vehicles (UAVs) equipped with multispectral sensors are flown over fields to acquire high-resolution (~1 cm ground sample distance [GSD]) georeferenced image maps during the growing season. Field boundaries and batch-level zone boundaries are catalogued for each field, giving growers a way to monitor growth in separate regions of the same field and so account for different crop varieties or growth stages. The UAV images undergo an orthomosaic process that stitches and geometrically corrects the data. Next, for counting and sizing metrics, we use a Mask R-CNN architecture with an edge agreement loss to provide fast object instance segmentation [1,2]. We optimised and trained the architecture on over 75,000 manually annotated training images from diverse geographies worldwide. Objects belonging to the crop class are vastly outnumbered by background objects on the field, such as machinery, rocks, soil, weeds, fleece material and dense patches of vegetation. Crop objects and background objects are also not co-located in the same areas of the field, so a single training image suffers from class imbalance, and in many cases training samples rich in background class labels do not contain a single crop label to discriminate against. We therefore incorporate a novel on-the-fly inpainting approach that inserts positive crop labels into completely crop-negative training samples, encouraging the Mask R-CNN model to learn as many background objects as possible.
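The abstract does not describe how the on-the-fly inpainting is implemented. As a rough illustration only, the idea of inserting a positive crop label into a crop-negative training sample can be sketched as a masked paste at a random location. The function names `paste_crop` and `augment_negative_tile`, the array layouts and the hard-edged paste (no blending) are all assumptions made for this sketch, not the authors' method.

```python
import numpy as np


def paste_crop(background, crop_patch, crop_mask, top, left):
    """Paste one masked plant cut-out into a crop-negative tile.

    background: (H, W, C) image tile that contains no crop labels
    crop_patch: (h, w, C) image cut-out around one annotated plant
    crop_mask:  (h, w) boolean mask of the plant inside the cut-out
    Returns the augmented tile and its (H, W) boolean instance mask.
    """
    h, w = crop_mask.shape
    out = background.copy()
    region = out[top:top + h, left:left + w]    # view into the copy
    region[crop_mask] = crop_patch[crop_mask]   # overwrite plant pixels only
    instance_mask = np.zeros(background.shape[:2], dtype=bool)
    instance_mask[top:top + h, left:left + w] = crop_mask
    return out, instance_mask


def augment_negative_tile(background, crop_patch, crop_mask, rng):
    """Choose a random position that keeps the cut-out fully inside the tile."""
    tile_h, tile_w = background.shape[:2]
    h, w = crop_mask.shape
    top = int(rng.integers(0, tile_h - h + 1))
    left = int(rng.integers(0, tile_w - w + 1))
    return paste_crop(background, crop_patch, crop_mask, top, left)
```

Applied inside a data loader, a step like this would give every background-only tile at least one positive instance mask to train against. A production version would probably also blend the pasted edges and avoid placing plants on top of machinery or other implausible locations.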
Our approach achieves a segmentation Intersection over Union (IoU) score of 0.751 and a Dice score of 0.846, with an object detection precision of 0.999 and a recall of 0.995. We also developed a fast, novel computer vision approach to detect crop row orientation, so that counting and sizing information can be displayed to the grower at different levels of granularity with improved readability. This approach gives growers an unprecedented level of large-scale insight into their crop and supports a number of valuable metrics such as establishment rates, growth stage, plant health and homogeneity, whilst also assisting in forecasting optimum harvest dates and yield (Figure 1a). These science products in turn help reduce waste by optimising and reducing inputs and by informing key actionable decisions on the field. In addition, counting and sizing allow the generation of bespoke variable rate Nitrogen application maps that can be uploaded straight to machinery; these increase crop homogeneity and yield whilst reducing chemical usage by as much as 70%, depending on the treatment plan (Figure 1b). This brings additional environmental benefits through reduced Nitrogen leaching and promotes more sustainable agriculture.
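For readers unfamiliar with the segmentation and detection metrics quoted above, the following minimal sketch shows how IoU, Dice, precision and recall are conventionally computed. The function names are illustrative, and the abstract does not state how the scores were aggregated over images or instances, so this is not a reproduction of the evaluation protocol.

```python
import numpy as np


def iou_and_dice(pred, target):
    """IoU and Dice for a single pair of binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0, 1.0
    return float(intersection / union), float(2 * intersection / total)


def precision_recall(tp, fp, fn):
    """Detection precision and recall from true/false positive and false negative counts.

    Assumes at least one detection and at least one ground-truth object.
    """
    return tp / (tp + fp), tp / (tp + fn)
```

For a single mask pair the two segmentation scores are linked by Dice = 2·IoU / (1 + IoU). Scores averaged over many masks need not satisfy this identity exactly, because the mean of the per-mask Dice values is not the same as the Dice value of the mean IoU.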
Figure 1. Example plant counting and sizing outputs. (a) Sizing information per detected plant (measured in cm²) using the Mask R-CNN model trained with edge agreement loss. (b) Variable rate Nitrogen application plan clustered into three rates based on plant size, orientated to the direction of the crop row.
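The outputs in Figure 1 can be related to the segmentation masks with a small sketch. With a ground sample distance of about 1 cm, each mask pixel covers roughly 1 cm², so plant area follows from a pixel count. The abstract says the application plan is clustered into three rates but does not name the clustering method. The tercile binning below is a simple stand-in chosen for illustration, and `mask_area_cm2` and `three_rate_zones` are hypothetical names.

```python
import numpy as np


def mask_area_cm2(mask, gsd_cm=1.0):
    """Plant area: number of mask pixels times the ground area of one pixel."""
    return float(np.count_nonzero(mask)) * gsd_cm ** 2


def three_rate_zones(areas_cm2):
    """Assign each plant a size class 0, 1 or 2 using tercile cut points.

    Stand-in for the unspecified clustering step; class 0 holds the
    smallest plants and class 2 the largest.
    """
    areas = np.asarray(areas_cm2, dtype=float)
    cuts = np.quantile(areas, [1 / 3, 2 / 3])
    return np.digitize(areas, cuts)
```

How each size class maps to an actual Nitrogen rate is an agronomic decision that the abstract does not describe, so the sketch stops at the class labels.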
[1] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2020. Mask R-CNN. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2), pp.386-397.
[2] Zimmermann, R. and Siems, J., 2019. Faster training of Mask R-CNN by focusing on instance boundaries. Computer Vision and Image Understanding, 188, p.102795.