Things Computers Are Bad at #1: Reading pictures

Sometimes I feel like I run a small but very inefficient computer support company.

My main customers are a collection of aunts, uncles, parents and elderly friends of the family, who regard my ability to point out the “bold” button to them in Word as nothing less than miraculous. Most of the questions I get are relatively simple and easy enough to explain. But there is one recurring question that I find difficult to answer, and that really sums up the difference between computers and the human brain. It is:

What is the difference between this:

This is some text

and this:

this is some text

The first one is text and the second one is an image of some text. The weird thing is that although these are almost indistinguishable to a human, they could not be more different to a computer.
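For the curious, here is a rough Python sketch of that difference. The tiny bitmap is hand-drawn for illustration (it isn't taken from any real scan), and real image files hold far more pixels, usually as colour values rather than 0s and 1s. Treat it as a toy, not as how any particular program stores things.

```python
# Real text: the computer stores one number per character.
text = "Hi"
char_codes = [ord(ch) for ch in text]  # 72 is "H", 105 is "i"

# Because it knows which characters are there, editing is trivial.
shouted = text.upper()

# A picture of the same word: just a grid of dots.
# "#" is ink, "." is paper. (Hypothetical hand-drawn bitmap.)
BITMAP = [
    "#...#..#",
    "#...#...",
    "#####..#",
    "#...#..#",
    "#...#..#",
]
pixels = [[1 if dot == "#" else 0 for dot in row] for row in BITMAP]
flat_pixels = [value for row in pixels for value in row]

# The text knows it contains an "H". The picture has no idea:
# the character code for "H" appears nowhere in the pixel data.
text_has_h = "H" in text
picture_has_h = ord("H") in flat_pixels

print(char_codes)
print(shouted)
print(text_has_h, picture_has_h)
```

You and I can squint at the grid and read "Hi". The computer only sees forty ones and zeros, and working out which letters they form is the whole job of OCR.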

The scenario comes up most frequently around scanned documents. I can see why users get confused. They both look like text. But turning squiggles on the page into text is one of those things that humans are better at than computers.

What you have to do is explain Optical Character Recognition, and then suggest they download some software that translates the image into text. Surprisingly, it’s 2013, and converting images to text is still not a solved problem. To paraphrase XKCD, “I like how we’ve had computers for decades, yet ‘editing text’ is something early adopters are still figuring out how to do”.

file_transfer

Thankfully, there is now an array of cloud services (I’ve recently developed a somewhat unhealthy obsession with Google Drive).

But OCR-ing text is still difficult. I Googled OCR recently, and the first match was a UK GCSE awarding body. The sixth match on Google (and the penultimate one on the first page) is Free OCR, a free web service that allows you to upload image files and have them converted into text.

I uploaded this:

this is some text

I considered it to be a small and very clear file. But Free OCR felt differently, and couldn’t find any text in the image. This might be unfair to Free OCR; I’m sure it’s a wonderful website, built by kind, caring people who feed puppies and so on. But in this one-off test, it absolutely failed.

Doonesbury

Usually, I don’t really get Doonesbury. (And, man, have I tried.) Most times I don’t even understand where the joke is. Even here, I’m not quite sure I get the joke. This is pretty much exactly my experience of text recognition.

In practice, and it pains me to say this, if Aunt Mildred is asking why she can’t edit the recipe she’s scanned out of Waitrose Magazine, the easiest thing to do is still just to type it out manually.
