Unsupervised domain adaptation using transformers for sugarcane rows and gaps detection
Detailed Description
Deep learning has represented an impressive advance in the field of machine learning and continues to break records in dozens of areas of artificial intelligence, such as image recognition. Nevertheless, the success of these architectures depends on a large amount of labeled data, and annotating training data is a costly process that is often performed manually. The cost of labeling and the difficulty of generalizing model knowledge to unseen data pose an obstacle to the use of these techniques in real-world agricultural challenges. In this work, we propose an approach to this problem for detecting crop rows and gaps; our findings can be extended to related problems with few modifications. Our approach generates approximated segmentation maps from annotated one-pixel-wide lines using morphological dilation. This method speeds up the pixel-labeling process and reduces line detection to a semantic segmentation problem. We considered the transformer-based method SegFormer and compared it with the ConvNet segmentation models PSPNet and DeepLabV3+ on datasets containing aerial images of four different sugarcane farms. To evaluate the ability to transfer knowledge learned from source datasets to target datasets, we used a recent state-of-the-art unsupervised domain adaptation (UDA) model, DAFormer, which has achieved strong results in adapting knowledge from synthetic data to real data; here, we evaluate its performance using only real-world images from different but related domains. Even without domain adaptation, the transformer-based model SegFormer performed significantly better than the ConvNets on unseen data, and applying UDA with DAFormer improved the results further, reaching between 71.1% and 94.5% of the average F1-score achieved with supervised training on labeled data.
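The dilation-based labeling step can be illustrated in a few lines. The sketch below is a minimal example of the general technique, not the authors' implementation: it assumes annotations are stored as binary images of one-pixel-wide lines, and the file names and kernel size are hypothetical.

```python
import cv2
import numpy as np

def lines_to_segmentation_map(line_mask: np.ndarray, kernel_size: int = 5) -> np.ndarray:
    """Dilate one-pixel-wide line annotations into an approximate
    segmentation map usable as a semantic-segmentation target."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    return cv2.dilate(line_mask, kernel, iterations=1)

# Illustrative usage: load a binary line annotation (0 = background,
# 255 = crop row) and widen each row into a band of foreground pixels.
line_mask = cv2.imread("row_annotation.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
seg_map = lines_to_segmentation_map(line_mask, kernel_size=7)
cv2.imwrite("row_segmentation_target.png", seg_map)
```

The kernel size controls the width of the resulting bands, trading annotation speed against how closely the approximated map matches the true row extent; the abstract does not state the value used, so it would need to be tuned per dataset.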