Image feature aggregation using attention mechanisms
I demonstrate a use case of feature aggregation: obtaining a condensed representation of an image by applying attention to it. I use pretrained CLIP embeddings, which already encode knowledge of images. The attention mechanism works by splitting each image into patches, embedding each patch, and computing attention over the patch embeddings.
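The patch-and-attend idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes plain scaled dot-product self-attention with no learned projections, followed by mean pooling, and it uses a hypothetical `embed_patches` helper that applies a random projection as a stand-in for the pretrained CLIP image encoder. The names `split_into_patches`, `embed_patches`, and `attention_aggregate`, and the patch size and embedding dimension, are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def split_into_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping square patches.

    Rows and columns that do not fill a whole patch are cropped away.
    Returns an array of shape (num_patches, patch_size, patch_size, C).
    """
    h, w, c = image.shape
    rows, cols = h // patch_size, w // patch_size
    image = image[: rows * patch_size, : cols * patch_size]
    patches = image.reshape(rows, patch_size, cols, patch_size, c).swapaxes(1, 2)
    return patches.reshape(rows * cols, patch_size, patch_size, c)


def embed_patches(patches, dim=64, seed=0):
    """Hypothetical placeholder for a CLIP image encoder (assumption).

    Flattens each patch and applies a fixed random projection, then
    L2-normalizes. In a real pipeline each patch would be passed through
    the pretrained CLIP model instead.
    """
    rng = np.random.default_rng(seed)
    flat = patches.reshape(patches.shape[0], -1)
    projection = rng.standard_normal((flat.shape[1], dim))
    emb = flat @ projection
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)


def attention_aggregate(patch_embeddings):
    """Aggregate patch embeddings into one vector with self-attention.

    Scores are scaled dot products between patch embeddings; each row of
    the softmax-normalized weight matrix says how much one patch attends
    to every other patch. The attended embeddings are mean-pooled.
    """
    n, d = patch_embeddings.shape
    scores = patch_embeddings @ patch_embeddings.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)      # shape (n, n), rows sum to 1
    attended = weights @ patch_embeddings   # shape (n, d)
    return attended.mean(axis=0), weights


if __name__ == "__main__":
    # Synthetic image standing in for real input data.
    image = np.random.default_rng(42).random((224, 224, 3))
    patches = split_into_patches(image, patch_size=56)
    embeddings = embed_patches(patches)
    image_vector, attn = attention_aggregate(embeddings)
    print("patches:", patches.shape)
    print("aggregated feature:", image_vector.shape)
    print("attention matrix:", attn.shape)
```

Mean pooling is only one way to collapse the attended patches into a single vector; weighting patches by the total attention they receive, or attending from a separate query vector, would be equally reasonable variants under the same description.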