Images are a great way to communicate without text but oftentimes images are used/abused to spread text within social media and advertisements. Text in images also presents an accessibility issue. The truth is that it’s important, for any number of reasons, to be able to detect text in image files. The amazing open source tool that makes detecting text in images possible is tesseract OCR!
I recommend using Homebrew to install tesseract:
brew install tesseract
To run tesseract to read text from an image, you can run the following from command line:
tesseract ~/Downloads/MyImage.png ~/Downloads/MyImage.txt -l eng
The command above extracts detected text in the English language (
-l eng) into a text file (
MyImage.txt). The process is very quick and there are dozens of supported languages.
Let’s look at the following example:
The following text is detected:
International ‘Champions Cup ~- TOUR SQUAD #AFCTour2018 CECH MUSTAFI GUENDOUZI oziL LENO SOKRATIS NELSON IWOBI MARTINEZ MAVROPANOS SMITHROWE = NKETIAH BELLERIN OSEI-TUTU WILLOCK PEREZ KOLASINAC ELNENY RAMSEY LACAZETTE CHAMBERS MAITLAND-NILES MKHITARYAN AUBAMEYANG HOLDING
There are a number of utilities in different programming languages that plug into tesseract’s functionality, but it’s important to know the underlying tool! tesseract is an unbelievable tool that you should take advantage of if you need an open source utility for detecting text in an image!
Create a CSS Flipping Animation
CSS animations are a lot of fun; the beauty of them is that through many simple properties, you can create anything from an elegant fade in to a WTF-Pixar-would-be-proud effect. One CSS effect somewhere in between is the CSS flip effect, whereby there’s…
I love almost every part of being a tech blogger: learning, preaching, bantering, researching. The one part about blogging that I absolutely loathe: dealing with SPAM comments. For the past two years, my blog has registered 8,000+ SPAM comments per day. PER DAY. Bloating my database…