Building a reverse image search system with Towhee
Reverse image search helps you find similar or related images given an input image. It is a content-based image retrieval (CBIR) technique in which the system is given a query image and bases its search on the content of that image itself, unlike traditional image search, which typically relies on text queries over user-generated labels.
A few applications of reverse image search include finding the original source of an image, searching for similar content, and product recommendation.
Building a reverse image search system typically involves the following steps:
- selecting a model and pipeline
- computing embedding vectors for the existing image dataset
- inserting the generated embedding vectors into a vector database
- processing search queries
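Before diving in, the four steps above can be sketched end to end in a few lines. The snippet below is a toy illustration, not Towhee or Milvus code: embed() stands in for a real embedding pipeline, and the "database" is just an in-memory numpy array searched by brute force.

```python
import zlib
import numpy as np

def embed(image):
    # stand-in for a real embedding model: a deterministic pseudo-random
    # unit vector derived from the image name (illustrative only)
    rng = np.random.default_rng(zlib.crc32(image.encode()))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

# steps 1-3: "select" a model, embed the dataset, "insert" into a database
dataset_images = ['cat.jpg', 'dog.jpg', 'car.jpg']
db = np.stack([embed(img) for img in dataset_images])

# step 4: embed the query and return the closest stored image
query_vec = embed('cat.jpg')
best = int(np.argmin(np.linalg.norm(db - query_vec, axis=1)))
print(dataset_images[best])  # cat.jpg
```

In a real system, the embedding model does the heavy lifting and the in-memory array is replaced by a vector database; the rest of this tutorial fills in those two pieces.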
A block diagram for a basic reverse image search system is shown in the images below. The first image shows how an existing image dataset is transformed into embedding vectors and inserted into a vector database, while the second image shows how the system processes query images.
In the upcoming sections, we will first walk you through some of the prep work required for this tutorial. After that, we will elaborate on each of the four steps mentioned above.
In this step, we will download the image dataset, install Towhee, and set up Milvus, an open-source vector database.
In this tutorial, we will use a subset of the ImageNet dataset (100 classes, 10 images per class). We will use gdown to download and unzip the data from Google Drive:
$ pip3 install gdown
$ gdown "https://drive.google.com/uc?id=1bg1RtUjeZlOfV2BiA2nf7sn5Jec9b-9I"
$ unzip -q image_dataset.zip
The downloaded data contains two directories: dataset for the image dataset and query for the query images.
We'll use pip in this tutorial. We also support installing Towhee via conda as well as from source; check out this page for more information.
$ pip3 install towhee
Milvus is an open-source vector database built to power embedding similarity search and AI applications. More info about Milvus is available here.
We'll be using docker-compose to install Milvus standalone. Before installing Milvus (see the official Milvus installation guide), make sure you have the necessary prerequisites.
# download the latest docker-compose file
$ wget https://github.com/milvus-io/milvus/releases/download/v2.0.0-pre-ga/milvus-standalone-docker-compose.yml -O docker-compose.yml
# start the Milvus service
$ docker-compose up -d
# check the state of the containers
$ docker-compose ps
We will also need to install Python bindings for Milvus.
$ pip3 install pymilvus==2.0.0rc9
The first step in building a reverse image search system is selecting an appropriate embedding model and one of its associated pipelines. Within Towhee, all pipelines can be found on the Towhee hub. Clicking on any of the categories on the right-hand side of the page will filter the results by task; selecting the image-embedding category will reveal all image embedding pipelines that Towhee offers. We also provide a summary of popular image embedding pipelines here.
Resource requirements, accuracy, and inference latency are the key trade-offs when selecting a pipeline. Towhee provides a multitude of pipelines to meet various application demands. The current state-of-the-art embedding pipelines are ensembles that combine multiple models (our best ensemble combines the Swin Transformer with EfficientNet and ResNet-101), but these pipelines are fairly computationally expensive. If a slightly less accurate but much faster pipeline is acceptable for your application, we recommend EfficientNet (image-embedding-efficientnetb7). For demonstration purposes, we will use ResNet-50 (image-embedding-resnet50) in this tutorial.
from towhee import pipeline
embedding_pipeline = pipeline('towhee/image-embedding-resnet50')
With a pipeline selected, the next step is computing embedding vectors for our image dataset. All image-embedding Towhee pipelines output an embedding vector given an image path.
import numpy as np
from pathlib import Path

dataset_path = './image_dataset/dataset/'
images = []
vectors = []

# embed each image and L2-normalize the resulting vector
for img_path in Path(dataset_path).glob('*'):
    vec = embedding_pipeline(str(img_path))
    norm_vec = vec / np.linalg.norm(vec)
    vectors.append(norm_vec.tolist())
    images.append(str(img_path.resolve()))
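Note the normalization step: dividing each vector by its Euclidean norm gives it unit length, so the L2 distances we compute later behave like cosine distances. A tiny numpy check with a made-up 4-dimensional "embedding" standing in for a real ResNet-50 output:

```python
import numpy as np

# made-up vector for illustration
vec = np.array([3.0, 4.0, 0.0, 0.0])

# divide by the Euclidean norm so the result has unit length
norm_vec = vec / np.linalg.norm(vec)

print(norm_vec)                   # [0.6 0.8 0.  0. ]
print(np.linalg.norm(norm_vec))   # 1.0
```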
While brute-force computation of distances between queries and all image dataset vectors is perfectly fine for small datasets, scaling to billions of image dataset items requires a production-grade vector database that utilizes a search index to greatly speed up the query process. Here, we'll insert the vectors computed in the previous section into a Milvus collection.
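For intuition, here is what that brute-force approach looks like in plain numpy: compute the L2 distance from the query to every stored vector and take the k smallest. A vector database like Milvus replaces this linear scan with a search index; the data below is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 stored "embeddings", all L2-normalized
dataset = rng.standard_normal((1000, 128))
dataset /= np.linalg.norm(dataset, axis=1, keepdims=True)

# use a stored vector as the query so the best match is known in advance
query = dataset[42]

# brute force: L2 distance to every vector, then the 5 smallest
distances = np.linalg.norm(dataset - query, axis=1)
top_k = np.argsort(distances)[:5]

print(top_k[0])  # 42 -- the query matches itself exactly
```

This scan is O(n) per query, which is why billion-scale datasets need the indexed search a vector database provides.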
import pymilvus as milvus
collection_name = 'reverse_image_search'
vec_dim = len(vectors[0])
# connect to local Milvus service
milvus.connections.connect(host='127.0.0.1', port=19530)
# create collection
id_field = milvus.FieldSchema(name="id", dtype=milvus.DataType.INT64, is_primary=True, auto_id=True)
vec_field = milvus.FieldSchema(name="vec", dtype=milvus.DataType.FLOAT_VECTOR, dim=vec_dim)
schema = milvus.CollectionSchema(fields=[id_field, vec_field])
collection = milvus.Collection(name=collection_name, schema=schema)
# insert data to Milvus
res = collection.insert([vectors])
collection.load()
img_dict = {}
# maintain mappings between primary keys and the original images for image retrieval
for i, key in enumerate(res.primary_keys):
    img_dict[key] = images[i]
We can use the same pipeline to generate an embedding vector for each query image. We can then search across the collection using the vector.
query_img_path = './image_dataset/query/'
query_images = []
query_vectors = []
top_k = 5
for img_path in Path(query_img_path).glob('*'):
    vec = embedding_pipeline(str(img_path))
    norm_vec = vec / np.linalg.norm(vec)
    query_vectors.append(norm_vec.tolist())
    query_images.append(str(img_path.resolve()))
query_results = collection.search(data=query_vectors, anns_field="vec", param={"metric_type": 'L2'}, limit=top_k)
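Because the vectors were normalized, the L2 metric used here produces the same ranking as cosine similarity: for unit vectors a and b, ||a - b||² = 2 - 2·(a·b). A quick numpy check of that identity with random unit vectors (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(64)
b = rng.standard_normal(64)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

# squared L2 distance vs. cosine similarity of two unit vectors
l2_sq = np.sum((a - b) ** 2)
cosine = np.dot(a, b)

print(np.isclose(l2_sq, 2 - 2 * cosine))  # True
```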
Display the results.
import matplotlib.pyplot as plt
from PIL import Image
for i in range(len(query_results)):
    results = query_results[i]
    query_file = query_images[i]

    result_files = [img_dict[result.id] for result in results]
    distances = [result.distance for result in results]

    # show the query image
    fig_query, ax_query = plt.subplots(1, 1, figsize=(5, 5))
    ax_query.imshow(Image.open(query_file))
    ax_query.set_title("Searched Image\n")
    ax_query.axis('off')

    # show the top-k results with their distances
    fig, ax = plt.subplots(1, len(result_files), figsize=(20, 20))
    for x in range(len(result_files)):
        ax[x].imshow(Image.open(result_files[x]))
        ax[x].set_title('dist: ' + str(distances[x])[0:5])
        ax[x].axis('off')
That's it! We hope you had fun with this tutorial - if you'd like to know more about Towhee, feel free to visit the Towhee website or the Towhee GitHub repository.