Imagine a world where technology seamlessly connects vision and action, transforming how we interact with our environment. In this article, we explore how computer vision and vector similarity together can revolutionize visual data management, enhancing search capabilities and enabling smarter, more intuitive image retrieval systems.
Introduction
The project is executed in two phases. In the first phase, we gather sample images of various cars along with their prices and convert these images into numerical vectors. In the second phase, we use the collected data to compare an input image with the stored images and display the similar images using the Streamlit framework.
Discover the full code and implementation details on GitHub.
Setting Up the Environment
We'll be using ImageBind, an open-source library developed by Meta, to convert all the images into their respective embeddings.
The code begins by installing the ImageBind library, which can't be installed directly via the pip command. Instead, it has to be cloned from its GitHub repository for proper integration and usage.
Execute the commands below in a shell to download the library.
git clone https://github.com/facebookresearch/ImageBind.git
cd ImageBind
pip install -e .
Installing Necessary Libraries
Additionally, we require a few other libraries, including ultralytics and qdrant-client, to ensure the project functions correctly and efficiently.
pip install ultralytics
pip install qdrant-client
pip install streamlit
Data Gathering
We've gathered a set of images representing various types of cars, along with their respective prices, from the internet. I've then created two Python lists: one to store the names of the car images and another to store their corresponding prices.
cars_img_list = ["img01","img02","img03","img04","img05","img06","img07","img08","img09","img10","img11","img12","img13","img14","img15"]
cars_cost_list = ["6.49","3.99","6.66","6.65","7.04","5.65","61.85","11.00","11.63","11.56","11.86","46.05","75.90","13.59","13.99"]
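Since later steps look up a price by the index of its image, it is worth confirming that the two lists stay aligned. Here is a minimal sanity check (my addition, not part of the original write-up):
# Every image name must have a corresponding price at the same index.
assert len(cars_img_list) == len(cars_cost_list), "image and price lists are misaligned"

for name, cost in zip(cars_img_list, cars_cost_list):
    print(f"{name}: {cost} Lakhs")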
Importing Libraries
Now we'll import all the libraries required to convert the images into their embeddings.
from ultralytics import YOLO
import cv2
import os
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType
Detecting Cars in the Images
We will use the YOLOv8 algorithm to detect the cars in the images and crop them out, thereby removing unnecessary noise from each image.
To achieve this, we first draw a bounding box around the car and then use OpenCV to crop that region from the image.
Once all the images are cropped, we save them in a new directory named "cropped_imgs".
model = YOLO('yolov8n.pt')

for im in cars_img_list:
    img = cv2.imread("cars_imgs/" + im + ".jpg")
    img = cv2.resize(img, (320, 245))
    results = model(img, stream=True)
    for r in results:
        boxes = r.boxes
        for box in boxes:
            # Bounding box of the detected car, used to crop out the background.
            x1, y1, x2, y2 = box.xyxy[0]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
            cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 1)
            cropped_img = img[y1:y2, x1:x2]
            cv2.imwrite("cropped_imgs/" + im + "_cropped.jpg", cropped_img)
This code saves cropped versions of all the images we collected earlier into the "cropped_imgs" directory.
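As an optional check (my addition, assuming the "cropped_imgs" directory used above), you can confirm that a cropped file was written for every source image; if YOLO found no car in an image, it will show up as missing here:
cropped = [f for f in os.listdir("cropped_imgs") if f.endswith("_cropped.jpg")]
print(f"{len(cropped)} of {len(cars_img_list)} images were cropped")

# Images without a crop should be re-checked or removed from cars_img_list.
missing = [im for im in cars_img_list if im + "_cropped.jpg" not in cropped]
print("missing crops:", missing)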
Converting the Images to Embeddings
Next, we convert the cropped images into numeric form by transforming them into vector embeddings using the ImageBind library.
Note: This is a time-consuming step, as it downloads a model from the internet that is around 4 GB in size.
machine = "cpu"model_embed = imagebind_model.imagebind_huge(pretrained=True)
model_embed.eval()
model_embed.to(machine)
embedding_list = []
for i in vary(1,len(cars_img_list)):
img_path = "cropped_imgs/img"+str(i)+"_cropped.jpg"
print(img_path)
vision_data = knowledge.load_and_transform_vision_data([img_path], machine)
with torch.no_grad():
image_embeddings = model_embed({ModalityType.VISION: vision_data})
embedding_list.append(image_embeddings)
for i in embedding_list:
print(i['vision'][0])
To reduce processing time, you can set pretrained=False, although the randomly initialized weights will make the resulting embeddings far less meaningful, so this is only useful for testing the pipeline.
Next, we save the embeddings for later use in the code.
import pickle
with open('embedded_data.pickle', 'wb') as file:
    pickle.dump(embedding_list, file)
Similar Image Search
Moving on to the second part of the code, we now take an image as input and identify the most similar images. We then display these images along with their respective prices.
Importing Libraries
A few of the libraries are the same as before, along with some new ones that are also used in this part of the code.
import streamlit as st
from PIL import Image
import base64
import os
from io import BytesIO
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType
import pickle
from qdrant_client import QdrantClient
from qdrant_client.http.models import VectorParams, Distance
from qdrant_client.http.models import PointStruct
import cv2
import numpy as np
from ultralytics import YOLO
We start the code by initializing the ImageBind model, which will convert the uploaded input image into vector embeddings.
machine = "cpu"
model_embed = imagebind_model.imagebind_huge(pretrained=True)
model_embed.eval()
model_embed.to(machine)
Let's proceed by opening the saved file "embedded_data.pickle", which contains the vector data for our image dataset.
with open('embedded_data.pickle', 'rb') as file:
    embedding_list = pickle.load(file)
Storing Vector Data
We will use Qdrant, an open-source vector database, to store and compare all the image embeddings we created and saved in pickle format.
client = QdrantClient(":memory:")

client.recreate_collection(
    collection_name='vector_comparison',
    vectors_config=VectorParams(size=1024, distance=Distance.COSINE)
)

client.upsert(
    collection_name='vector_comparison',
    points=[
        PointStruct(id=i, vector=embedding_list[i]['vision'][0].tolist()) for i in range(15)
    ]
)
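As a quick optional verification (my addition, not in the original code), you can confirm that all 15 vectors landed in the collection:
# Count the points stored in the in-memory collection; expected: 15.
count_result = client.count(collection_name='vector_comparison', exact=True)
print(count_result.count)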
Comparing Images
Next, we compare every vector embedding stored in the Qdrant database with the input image supplied to the program.
This is carried out in 3 steps:
- Cropping the car from the image.
- Converting the cropped image into a vector embedding.
- Comparing that vector with the vectors of all the other images.
In this process, we use cosine similarity to assess the similarity between the embeddings.
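For intuition, here is a minimal sketch (my own illustration, not part of the project code) of the cosine similarity that Qdrant computes between two of the stored 1024-dimensional ImageBind embeddings:
import numpy as np

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes; 1.0 means identical direction.
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = embedding_list[0]['vision'][0].numpy()
v2 = embedding_list[1]['vision'][0].numpy()
print(cosine_similarity(v1, v2))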
We've declared a function that takes an image as input and returns the indices of the 4 most similar images.
def image_to_similar_index(cv2Image):
    img = cv2.resize(cv2Image, (320, 245))
    model = YOLO('yolov8n.pt')
    results = model(img, stream=True)
    for r in results:
        boxes = r.boxes
        for box in boxes:
            x1, y1, x2, y2 = box.xyxy[0]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
            cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 1)
            cropped_img = img[y1:y2, x1:x2]
            cv2.imwrite("test_cropped.jpg", cropped_img)
    vision_data = data.load_and_transform_vision_data(["test_cropped.jpg"], device)
    with torch.no_grad():
        test_embeddings = model_embed({ModalityType.VISION: vision_data})
    # Insert the query embedding as a temporary point (id 20), then search with the same vector;
    # the first hit is the query itself, so the next four are the most similar stored images.
    client.upsert(
        collection_name='vector_comparison',
        points=[
            PointStruct(id=20, vector=test_embeddings['vision'][0].tolist()),
        ]
    )
    search_result = client.search(
        collection_name='vector_comparison',
        query_vector=test_embeddings['vision'][0].tolist(),
        limit=20  # retrieve the top similar vectors (excluding the new vector itself)
    )
    return [search_result[1].id, search_result[2].id, search_result[3].id, search_result[4].id]
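As a standalone usage example (my addition, with a hypothetical file name), the function can be exercised outside Streamlit like this:
# "my_test_car.jpg" is a placeholder; any car photo readable by OpenCV will do.
test_img = cv2.imread("my_test_car.jpg")
similar_ids = image_to_similar_index(test_img)
for idx in similar_ids:
    print(cars_img_list[idx], cars_cost_list[idx], "Lakhs")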
Deploying the Model
We will now develop a frontend web application for our model to make it more interactive and user-friendly.
To accomplish this, we will use Streamlit, a tool that makes it simple and efficient to create web interfaces for Python applications.
We'll begin by configuring the page and adding a file uploader widget to the web page.
st.set_page_config(layout="wide")
st.title('Similar Cars Finder')
st.markdown("""
    <style>
    .block-container {
        padding-top: 3rem;
        padding-bottom: 0rem;
        padding-left: 5rem;
        padding-right: 5rem;
    }
    </style>
""", unsafe_allow_html=True)

# Create a file uploader widget
uploaded_file = st.file_uploader("Upload an image of a car", type=["jpg", "jpeg", "png"])
Now we will create a function to display the images, along with their prices, with proper padding and margins. It takes lists of images and prices as input and shows them on the webpage in a formatted manner.
def display_images_with_padding_and_price(images, prices, width, padding, gap):
    cols = st.columns(len(images))
    for col, img, price in zip(cols, images, prices):
        with col:
            col.markdown(
                f"""
                <div style="text-align: center;">
                    <img src="data:image/jpeg;base64,{img}" width="{width}" style="margin-right: {gap}px;">
                    <p style="font-size: 20px;">₹{price} Lakhs</p>
                </div>
                """,
                unsafe_allow_html=True,
            )
Finally, we read the uploaded image, convert it into a NumPy array, and pass it to the image_to_similar_index function defined earlier, which returns the indices of the images most similar to the input.
We then retrieve the images and prices corresponding to the returned indices and supply them to the display_images_with_padding_and_price function, which formats the images and displays them on the webpage.
if uploaded_file is not None:
    # Open and display the uploaded image
    car_image = Image.open(uploaded_file)
    img_array = np.array(car_image)
    st.image(car_image, caption='Uploaded Car Image', use_column_width=False, width=300)
    results = image_to_similar_index(img_array)
    print(results)

    # Directory where the car images are stored
    car_images_dir = "cars_imgs"

    # Ensure the directory exists
    if os.path.exists(car_images_dir):
        # Sort the file names so their order matches the indices stored in Qdrant.
        car_images = [os.path.join(car_images_dir, img) for img in sorted(os.listdir(car_images_dir)) if img.endswith(('jpg', 'jpeg', 'png'))]
        print(car_images)
    else:
        st.error(f"Directory {car_images_dir} does not exist")
        car_images = []

    # Check if there are enough images
    if len(car_images) < 4:
        st.error("Not enough car images in the local storage")
    else:
        similar_car_images = [car_images[i] for i in results]
        car_prices = [cars_cost_list[a] for a in results]
        car_images_pil = []
        for img_path in similar_car_images:
            try:
                # Encode each image as base64 so it can be embedded in the HTML markup.
                img = Image.open(img_path)
                buffered = BytesIO()
                img.save(buffered, format="JPEG")
                img_str = base64.b64encode(buffered.getvalue()).decode()
                car_images_pil.append(img_str)
            except Exception as e:
                st.error(f"Error processing image {img_path}: {e}")
        if car_images_pil:
            st.subheader('Similar Cars with Prices')
            display_images_with_padding_and_price(car_images_pil, car_prices, width=200, padding=10, gap=20)
Final Output
Upon uploading an image to the webpage, the ImageBind model starts loading, which may take a moment. Once the model is fully loaded, the image is converted into embeddings and compared with the others to identify the most similar ones. The similar images are then displayed on the webpage.
Video Demonstration
Conclusion
In summary, this project showcases the power of combining computer vision, vector embeddings, and web development tools like Streamlit to create a user-friendly system for image similarity detection. Through efficient processing and comparison of image embeddings, we've demonstrated the potential for enhancing search and recommendation systems.