How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?

¹SketchX, CVSSP, University of Surrey, United Kingdom ²iFlyTek-Surrey Joint Research Centre on Artificial Intelligence

(Left): Freehand sketches exhibit varied levels of abstraction (G: good, R: reasonable, A: abstract). (Right): Unlike existing feature-vector embeddings, we learn a feature matrix representation in the joint embedding space, regularised by a pre-trained StyleGAN's disentangled latent space and an abstraction-aware retrieval loss. The abstraction identification head dynamically decides the row-dimension of the matrix embedding based on the abstraction/completeness of the query sketch.

Abstract

In this paper, we propose a novel abstraction-aware sketch-based image retrieval framework capable of handling sketch abstraction at varied levels. Whereas prior works mainly focused on tackling sub-factors such as drawing style and order, we instead attempt to model abstraction as a whole, and propose feature-level and retrieval granularity-level designs so that the system builds into its DNA the necessary means to interpret abstraction. To learn abstraction-aware features, we for the first time harness the rich semantic embedding of a pre-trained StyleGAN model, together with a novel abstraction-level mapper that deciphers the level of abstraction and dynamically selects the appropriate dimensions of the feature matrix, to construct a feature matrix embedding that can be freely traversed to accommodate different levels of abstraction. For granularity-level abstraction understanding, we dictate that the retrieval model should not treat all abstraction levels equally, and introduce a differentiable surrogate Acc.@q loss to inject that understanding into the system. Unlike the gold-standard triplet loss, our Acc.@q loss uniquely allows a sketch to narrow or broaden its focus in terms of how stringent the evaluation should be: the more abstract a sketch, the less stringent (higher q). Extensive experiments show that our method outperforms the existing state of the art in standard SBIR tasks as well as in challenging scenarios such as early retrieval, forensic sketch-photo matching, and style-invariant retrieval.
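
To make the idea of a differentiable surrogate for Acc.@q concrete, here is a minimal sketch in plain Python. It is not the formulation used in the paper, which this page does not spell out. It assumes a common sigmoid relaxation: the hard rank of the paired photo is replaced by a smooth count of gallery photos that lie closer to the sketch, and the hard test "rank ≤ q" is replaced by a second sigmoid. The function names, the temperatures `tau` and `step_tau`, and the example distances are all hypothetical.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_rank(d_pos: float, d_negs: Sequence[float], tau: float = 0.05) -> float:
    """Smooth estimate of the paired photo's rank (1 = retrieved first).

    Each negative photo contributes roughly 1 if it is closer to the sketch
    than the paired photo, and roughly 0 otherwise.
    """
    return 1.0 + sum(sigmoid((d_pos - d_neg) / tau) for d_neg in d_negs)


def soft_acc_at_q_loss(
    d_pos: float,
    d_negs: Sequence[float],
    q: int,
    tau: float = 0.05,
    step_tau: float = 0.25,
) -> float:
    """Assumed surrogate: 1 - soft indicator that the paired photo is in the top q."""
    rank = soft_rank(d_pos, d_negs, tau)
    # The 0.5 offset centres the soft step halfway between rank q and rank q + 1.
    hit = sigmoid((q + 0.5 - rank) / step_tau)
    return 1.0 - hit


# Hypothetical distances: two gallery photos are closer than the paired photo,
# so the paired photo sits at roughly rank 3.
d_pos, d_negs = 0.5, [0.2, 0.3, 0.9, 1.0]
strict = soft_acc_at_q_loss(d_pos, d_negs, q=1)   # stringent evaluation
lenient = soft_acc_at_q_loss(d_pos, d_negs, q=5)  # relaxed evaluation
```

Under this relaxation the same ranking is penalised heavily at q=1 and barely at q=5, which mirrors the stated intent that a more abstract sketch is evaluated less stringently (higher q). How q is tied to the predicted abstraction level is left open here.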

Demonstration video of our FG-SBIR interface. (Best viewed when full-screened and unmuted)

Pilot Study

Pilot Study I: StyleGAN latent-disentanglement via optimising different groups of latent codes (coarse, medium, and fine).
 
Pilot Study II: Evaluate retrieval consistency by comparing entropy of separation in the embedding space, evaluated over successive stages of sketch completion. Inset images show how our method directs the query to a single gallery image (blue) while pushing others away as sketching progresses.
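
One way to quantify how sharply a query singles out one gallery image is the Shannon entropy of a softmax over its distances to the gallery. The sketch below assumes that reading of "entropy of separation"; the exact definition used in the pilot study is not given on this page. The function name, the `temperature` parameter, and the example distances are hypothetical.

```python
import math
from typing import Sequence


def retrieval_entropy(distances: Sequence[float], temperature: float = 0.1) -> float:
    """Shannon entropy (in nats) of a softmax over negative query-to-gallery distances.

    Low entropy: the query is close to one gallery image and far from the rest.
    High entropy: the query is about equally close to many gallery images.
    """
    logits = [-d / temperature for d in distances]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical distances from one query to four gallery photos.
early_sketch = [0.9, 1.0, 1.1, 1.0]     # few strokes: nothing stands out
complete_sketch = [0.1, 1.0, 1.1, 1.2]  # finished: one photo is clearly closest

entropy_early = retrieval_entropy(early_sketch)
entropy_late = retrieval_entropy(complete_sketch)
```

With these made-up numbers the entropy drops as the sketch is completed, which is the behaviour the inset images describe: the query is drawn towards a single gallery image while the others are pushed away.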

Architecture

Our method learns a feature matrix representation in the joint embedding space, regularised by a pre-trained StyleGAN, trained with a weighted summation of reconstruction, abstraction identification, and Acc.@q losses.
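
The following sketch illustrates how a matrix embedding with a dynamically chosen row-dimension could be compared at retrieval time. It is an illustration under stated assumptions, not the released implementation: it assumes that a more abstract sketch uses fewer leading rows, that the row budget grows linearly with the predicted level, and that distance is a mean squared difference over the selected rows. `rows_for_level`, `truncated_distance`, and the toy matrices are hypothetical.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def rows_for_level(level: int, num_levels: int, num_rows: int) -> int:
    """Map a predicted abstraction level to a row budget.

    Assumption: level 0 is the most abstract and gets the fewest rows;
    the highest level uses the full matrix.
    """
    if not 0 <= level < num_levels:
        raise ValueError("level must lie in [0, num_levels)")
    return max(1, (num_rows * (level + 1)) // num_levels)


def truncated_distance(sketch: Matrix, photo: Matrix, k: int) -> float:
    """Squared Euclidean distance over the first k rows, averaged per row."""
    if k < 1 or k > min(len(sketch), len(photo)):
        raise ValueError("k must be between 1 and the number of rows")
    total = 0.0
    for s_row, p_row in zip(sketch[:k], photo[:k]):
        total += sum((s - p) ** 2 for s, p in zip(s_row, p_row))
    return total / k


# Toy 3x2 embeddings that agree on the first two rows and differ on the last.
sketch_emb = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
photo_emb = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]]

k_abstract = rows_for_level(level=1, num_levels=3, num_rows=3)  # 2 rows
k_detailed = rows_for_level(level=2, num_levels=3, num_rows=3)  # 3 rows
d_abstract = truncated_distance(sketch_emb, photo_emb, k_abstract)
d_detailed = truncated_distance(sketch_emb, photo_emb, k_detailed)
```

In this toy case the abstract query matches the photo exactly on the rows it is allowed to use, while the detailed query is also held accountable for the final row.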

Results

Proposed (blue) method's efficacy over Triplet-SN (green) against different sketching styles of the same shoe (red bordered).

Quantitative results on ShoeV2 for the early retrieval setup, plotted against the percentage of the sketch drawn. A higher area under the curve indicates better early retrieval performance.
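
For readers who want to reproduce such a summary number, the area under an early-retrieval curve can be approximated with the trapezoidal rule. The sketch below assumes the curve is sampled at increasing sketch-completion percentages and normalises by the x-range, so the result stays on the same scale as the plotted metric. The function name is hypothetical, and the sample values are made up for illustration; they are not results from the paper.

```python
from typing import Sequence


def area_under_curve(percentages: Sequence[float], scores: Sequence[float]) -> float:
    """Trapezoidal area under a retrieval curve, normalised by the x-range."""
    if len(percentages) != len(scores) or len(percentages) < 2:
        raise ValueError("need at least two (percentage, score) pairs")
    area = 0.0
    for i in range(1, len(percentages)):
        width = percentages[i] - percentages[i - 1]
        if width <= 0:
            raise ValueError("percentages must be strictly increasing")
        area += width * (scores[i] + scores[i - 1]) / 2.0
    return area / (percentages[-1] - percentages[0])


# Made-up curves for two hypothetical methods, sampled at 10%..100% completion.
completion = [10, 30, 50, 70, 100]
method_a = [0.10, 0.30, 0.50, 0.60, 0.70]
method_b = [0.05, 0.15, 0.35, 0.55, 0.70]

auc_a = area_under_curve(completion, method_a)
auc_b = area_under_curve(completion, method_b)
```

Both made-up methods reach the same final score, but the one that retrieves well from fewer strokes has the larger normalised area.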

Top-10 retrieved images for inputs abstracted (by GDSA) at different budgets (10%, 30%, 100%). Paired photo is red bordered.

Proposed (blue) method's efficacy over Triplet-SN (green) against different sketching styles of the same chair (red bordered).

Top-10 qualitative retrieval results of the proposed method on sketches from ShoeV2 dataset. Paired photo is red bordered.

Top-10 qualitative retrieval results of the proposed method on sketches from ChairV2 dataset. Paired photo is red bordered.

BibTeX

@inproceedings{koley2024handle,
title={{How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?}},
author={Koley, Subhadeep and Bhunia, Ayan Kumar and Sain, Aneeshan and Chowdhury, Pinaki Nath and Xiang, Tao and Song, Yi-Zhe},
booktitle={CVPR},
year={2024}
}

Copyright: CC BY-NC-SA 4.0 © Subhadeep Koley | Last updated: 25 April 2024 | Good artists copy, great artists steal.