Cross-Scale Query-Support Alignment Approach for Small Object Detection in the Few-Shot Regime

Abstract

Small object detection is a challenging task in computer vision, and the few-shot regime makes it harder still. Leveraging useful information from only a few examples is difficult, particularly for small objects. We hypothesize that features extracted from small objects are noisy and often dominated by background information. In addition, recent detectors rely on multi-scale features, and visually similar objects of different sizes may have unaligned representations. We address these issues with Cross-Scale Query-Support Alignment (XQSA), a novel attention mechanism that combines features from query and support images at different scales. This allows matching objects of different sizes and therefore improves Few-Shot Object Detection (FSOD) performance. Extensive experiments are conducted on four distinct datasets, including natural images (Pascal VOC and MS COCO) and aerial images (DOTA and DIOR). XQSA improves the detection of small objects on all tested datasets. On aerial images, which contain smaller objects, it yields significant gains in overall detection performance and outperforms state-of-the-art results on DOTA and DIOR.
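
To make the idea of attending across scales concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the XQSA implementation: the function name `cross_scale_attention`, the absence of learned projections, and the toy feature sizes are all hypothetical. Each query feature level attends over support features concatenated from every level, so a query location at one scale can match a support object represented at another.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_scale_attention(query_levels, support_levels):
    """Hypothetical sketch: every query level attends over ALL support levels.

    query_levels / support_levels: lists of arrays of shape (N_i, d), one per
    feature-pyramid level, with spatial positions flattened into N_i.
    No learned projections are used here (an assumption for brevity).
    """
    support = np.concatenate(support_levels, axis=0)  # (sum_j N_j, d)
    d = support.shape[1]
    aligned = []
    for q in query_levels:
        # Scaled dot-product similarity between query and support positions.
        weights = softmax(q @ support.T / np.sqrt(d), axis=-1)
        # Each query position becomes a weighted mix of support features.
        aligned.append(weights @ support)
    return aligned

rng = np.random.default_rng(0)
d = 8
query_levels = [rng.normal(size=(n, d)) for n in (16, 4)]    # two query scales
support_levels = [rng.normal(size=(n, d)) for n in (9, 1)]   # two support scales
aligned = cross_scale_attention(query_levels, support_levels)
```

Concatenating the support levels before attention is the simplest way to let any query scale match any support scale; a real detector would typically add learned projections and combine the aligned features with the original query features.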

Pierre Le Jeune
PhD Student in Deep Learning

My research interests include computer vision, deep learning, and applications in low-data regimes.