Image Search Notes

I recently had a thought. If you need to note down something related to a physical object (for example, the specific URL of that item in an online store), how would you link the notes with the object?

The most precise way is probably some kind of barcode or RFID tag, but that's cumbersome. I thought it might be possible to link them using image search instead: you attach a photo of the object to the notes, and later you find the notes again by taking another photo of the same object.

So I implemented this prototype. It consists of a camera view and a list of notes below. You can take a photo using the camera, and it will show the most relevant notes (I preloaded a few notes as an example) and allow you to create new ones. The data is stored using IndexedDB in the browser.


It works by computing an embedding of the photo with an image feature extraction model running in the browser via Transformers.js. That embedding is then compared against the embedding of each note's photo using cosine similarity, and the notes with the highest similarity are shown as the most relevant. I wrote about using Transformers.js in more detail in a previous post.
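To make the comparison step concrete, here is a minimal sketch of how ranking by cosine similarity could look. It is not the prototype's actual code: the `Note` shape, the `rankNotes` helper, and the `topK` parameter are hypothetical names made up for illustration, and the embeddings are assumed to have been computed already and stored as plain number arrays.

```typescript
// Hypothetical note shape; the field names are assumptions for illustration.
interface Note {
  text: string;
  embedding: number[]; // embedding of the photo attached to the note
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Returns 0 if either vector has zero length, to avoid dividing by zero.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every note against the query embedding and keep the best matches,
// sorted from most to least similar.
function rankNotes(
  query: number[],
  notes: Note[],
  topK = 3
): { note: Note; score: number }[] {
  return notes
    .map((note) => ({ note, score: cosineSimilarity(query, note.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

With a handful of notes, a linear scan like this is plenty fast, so there is no need for an index or an approximate nearest-neighbour search.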

Even though I just used the very simple example code provided by Transformers.js, it worked remarkably well. It easily tells different kinds of objects apart, even against different backgrounds, and in my testing it could even tell different mugs apart.

You can check out the source code here.

Credits for the initial photos: