The Upload Photo -> ID pill feature was a hit. App downloads increased, and pill photos came in daily, just enough for me to ID each one. I also created a way for users to invite their pharmacists to help. From v3 to v5, one of my main focuses was making the app’s ID flow faster. To date, I’ve collected about 10k pill photos in all kinds of environments from all over the world. Of these, I’ve been able to ID only about 8k as US drugs, many of which top sites do not have an image for (try searching Round Pink E 345). I firmly believe that PillSync will soon have the most comprehensive database of pill photos on the internet.
Surprisingly, there are about 1,500 pills that I cannot locate even after a thorough Google search. These include international generics, street drugs, and fake drugs that exploded during the opioid crisis. Here’s a foreign generic Viagra pill. I’ve been able to ID about 200 drugs from Canada, Australia, and the UK. Many more are probably still in the batch of unidentified photos, so feel free to help!
Today, I can confidently tell you that even with a pill photo, tracking down a drug can be hard and time-consuming: many pills are so similar that the basic parameters of imprint, shape, color, and score are not enough. Try this search of Round White Logo. Google is severely inadequate here. As someone who IDs pills every day and has ID’d thousands of them, I wish there were a better way. After all, I cannot scale this up by hand as the need grows to thousands of pills a day.
As it turns out, another tech trend comes to the rescue. The past few years have seen a democratization of Machine Learning (ML) that allows computers to see images. More specifically, Object Detection can pinpoint an object in an image, and Classification can tell you what that object is. The challenge is training it with a large amount of real-world data. There is no replacement for data found in wild and chaotic environments. You cannot simulate this. This is why Tesla leads the pack, with real-world self-driving data flowing in every day from Teslas on the road.
Likewise, PillSync is the largest and only pill dataset of its kind, leading the APR pill AI revolution. In 2016, a different team from NLM launched an APR challenge using about 40k pill photos taken under different background/lighting conditions. I’ve been analyzing these photos and concluded that they differ from the photos users have submitted over the years. At this time, I’m not sure if I can use the NLM set in training, as it may skew the algorithm’s ability to ID pills found in real-world conditions.
Of the 8k US drugs identified over the last 10 years, about 3,000 are unique pills. This means that:
Of the ~10k prescription/OTC drugs, only ~3k account for ~90% of user-submitted pills.
To auto-ID these 3,000 pills accurately, I would need ~100 sample photos of each pill for training. That’s a total of 300,000 images. This will take time to collect.
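The arithmetic behind that target, as a quick sketch. The numbers are the rough figures quoted in this post; the coverage figure is an upper bound, since it assumes every photo collected so far counts toward the target, which is not the case.

```python
# Rough figures quoted in the post.
UNIQUE_PILLS = 3_000        # pills that cover ~90% of submissions
SAMPLES_PER_PILL = 100      # approximate training photos needed per pill
PHOTOS_COLLECTED = 10_000   # approximate photos collected so far

target_images = UNIQUE_PILLS * SAMPLES_PER_PILL

# Upper bound: assumes every collected photo is usable for training.
coverage = PHOTOS_COLLECTED / target_images

print(f"target: {target_images:,} images")
print(f"collected so far: at most {coverage:.1%} of the target")
```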
However, given PillSync’s 10k images, there are ways to accelerate this collection by shortening the pill indexing process with AI. Currently, let’s say it takes me ~30 seconds to ID one pill. I have to enter the imprint, select a shape, choose the color, pick the score, then sift through the results.
What if I could reduce that to 10 seconds? To 3 seconds?
What if I could detect the shape, score, and imprint automatically?
What if I could reduce the choices from 50 pills to 5, then pick the right one?
If I can use ML to read some traits and return the top 3–5 pills, I can even have the user pick the right match and validate the pill collection for me. This will scale quite well and get us closer to those 300k images in a few years.
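Here is a minimal sketch of that "return the top 3–5 pills" step, assuming the traits have already been read from the photo. Everything in it is hypothetical and for illustration only: the `Pill` record, the `match_score` weighting, and the tiny catalog with placeholder names are not PillSync’s actual schema or ranking. The imprint is compared fuzzily with Python’s standard `difflib`, so a misread character still lands near the top; the other traits are compared exactly.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Pill:
    name: str      # placeholder label, not a real drug name
    imprint: str
    shape: str
    color: str
    scored: bool


def match_score(pill, imprint, shape, color, scored):
    """Higher is better. Imprint match is fuzzy; other traits are exact."""
    imprint_sim = SequenceMatcher(None, pill.imprint.upper(), imprint.upper()).ratio()
    exact = (pill.shape == shape) + (pill.color == color) + (pill.scored == scored)
    # Assumed weighting: the imprint counts as much as the other three traits combined.
    return 3 * imprint_sim + exact


def shortlist(pills, imprint, shape, color, scored, k=5):
    """Return the k best-matching pills, best first."""
    ranked = sorted(
        pills,
        key=lambda p: match_score(p, imprint, shape, color, scored),
        reverse=True,
    )
    return ranked[:k]


# Made-up catalog entries for illustration.
catalog = [
    Pill("pill-A", "E 345", "ROUND", "PINK", False),
    Pill("pill-B", "E 845", "ROUND", "PINK", False),
    Pill("pill-C", "E 345", "OVAL", "WHITE", True),
    Pill("pill-D", "XYZ 10", "CAPSULE", "BLUE", False),
]

top = shortlist(catalog, imprint="E 345", shape="ROUND", color="PINK", scored=False, k=3)
print([p.name for p in top])
```

The user (or I) would then pick the right pill from `top`, and that choice becomes a validated label for the photo.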
Pill AI is here. With about 50 samples of each shape (ROUND, OVAL, and CAPSULE), I was already able to train a model to recognize and classify each pill in a given photo. This means a photo of 5 pills will return the location of each pill.
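To make "return the location of each pill" concrete, here is a sketch of what the output of such a model could look like and how it might be post-processed. The `Detection` record, the pixel box format, and the 0.5 confidence threshold are assumptions for illustration, not the actual model’s output format, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    shape: str                       # "ROUND", "OVAL", or "CAPSULE"
    confidence: float                # model score in [0, 1]
    box: Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def keep_confident(detections, threshold=0.5):
    """Drop low-confidence detections and order the rest left to right."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.box[0])


# Made-up detections for one photo.
detections = [
    Detection("OVAL", 0.91, (220, 40, 300, 90)),
    Detection("ROUND", 0.88, (30, 50, 90, 110)),
    Detection("CAPSULE", 0.32, (400, 60, 480, 95)),
]

for d in keep_confident(detections):
    print(d.shape, d.box)
```

Each surviving box can then be cropped out and passed on to the next step, such as reading the imprint.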