You know those distorted images of text that Google makes you read to prove you are human? Google is redesigning them so that they rely less on distorted text.
CAPTCHAs, or reCAPTCHAs as Google calls its version, are designed to keep software agents from logging into sites where they don’t belong. The technology giant is undertaking this revision because “over the last few years advances in artificial intelligence have reduced the gap between human and machine capabilities in deciphering distorted text,” according to a posting Friday on the Google blog by reCAPTCHA Product Manager Vinay Shet.
In what may be a related development, a San Francisco-based startup called Vicarious announced last Sunday, two days after the Google posting, that it had developed algorithms that could “reliably solve modern CAPTCHAs,” including Google’s. Vicarious said a CAPTCHA can be considered broken if software can decipher it at least one percent of the time. The company claims that its success rate for Google’s reCAPTCHA, the most widely used version, is as high as 90 percent. For individual letters, Vicarious said its software could achieve 95 percent accuracy.
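To see why a success rate as low as one percent can count as a break, consider a bot that can retry cheaply. The Python sketch below is illustrative only: the attempt budget and the function names are assumptions made for this example, not figures or code from Vicarious or Google, and it treats every attempt as independent.

```python
def expected_solves(success_rate: float, attempts: int) -> float:
    """Expected number of CAPTCHAs solved over independent attempts."""
    return success_rate * attempts


def chance_of_at_least_one(success_rate: float, attempts: int) -> float:
    """Probability that at least one of several independent attempts succeeds."""
    return 1.0 - (1.0 - success_rate) ** attempts


# Hypothetical attempt budget, chosen only for illustration.
ATTEMPTS = 1000

# 0.01 is the "broken" threshold quoted above; 0.90 is Vicarious's claimed rate.
for rate in (0.01, 0.90):
    solved = expected_solves(rate, ATTEMPTS)
    any_hit = chance_of_at_least_one(rate, ATTEMPTS)
    print(f"rate={rate:.0%}: about {solved:.0f} expected solves, "
          f"P(at least one)={any_hit:.5f}")
```

Under these assumptions, even the one percent threshold lets an automated client get through repeatedly, which is why the bar for calling a CAPTCHA broken is set so low.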
Shet said on the Google blog that the new reCAPTCHA is the result of extensive research, and is now “more adaptive and better equipped to distinguish legitimate users from automated software.”
The new system uses advanced risk techniques that help to elicit clues about whether the would-be entrant is a carbon- or a silicon-based unit. Google will now be presenting different kinds of reCAPTCHAs for different kinds of users, and the company said it will help to serve easier-to-solve visual puzzles to “our legitimate users.” But if you’re a bot reading this, beware, as the system will somehow suspect you and will make you burn up cycles over more difficult images.
Optical Character Recognition
Google is rather mysterious about how this actual separation of sentient beings from software takes place. It does say that humans have an easier time with numeric reCAPTCHAs compared to ones that contain both numbers and letters. It promised that more developments in its war against the bots would be announced in coming months.
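Since Google has not published how it does this, the following Python sketch is purely hypothetical. It only illustrates the general idea of serving easier puzzles to low-risk visitors and harder ones to suspected bots. The risk score, its thresholds, the distortion scale, and every name in the code are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    kind: str        # "numeric" or "alphanumeric"
    distortion: int  # 1 (light) to 3 (heavy); an invented scale


def pick_challenge(risk_score: float) -> Challenge:
    """Map a hypothetical risk score in [0, 1] to a puzzle difficulty."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score < 0.3:
        # Likely human: numbers only, which Google says people find easier.
        return Challenge(kind="numeric", distortion=1)
    if risk_score < 0.7:
        # Uncertain: mixed letters and digits with moderate distortion.
        return Challenge(kind="alphanumeric", distortion=2)
    # Likely automated: the hardest puzzle in this toy scheme.
    return Challenge(kind="alphanumeric", distortion=3)


print(pick_challenge(0.1))
print(pick_challenge(0.9))
```

How a real system would arrive at the risk score is exactly the part Google keeps to itself, so the sketch takes it as an input.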
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. Originated at Carnegie Mellon University in 2000, it was quickly adopted by many sites to keep bots away from protected pages and information.
In 2009, Google bought the company reCAPTCHA, whose co-founder, Luis von Ahn, had been involved in the original development of CAPTCHAs at Carnegie Mellon. reCAPTCHA’s angle was that it not only offered the distorted images for blocking bots, but that the images were derived from scanned books.
By having humans read the distorted images and enter accurate text, the optical character recognition software behind reCAPTCHA iteratively improves. Google uses large-scale text scanning in book and news projects, and the company has reported that the reCAPTCHA approach has helped it achieve more than 99.5 percent transcription accuracy at the word level for its optical character recognition software.
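One simple way to turn many human answers into a trusted transcription is to accept a word only once enough readers agree on it. The Python sketch below shows that idea under stated assumptions: the function name, the vote threshold, and the agreement share are hypothetical, and this is not a description of Google's actual pipeline.

```python
from collections import Counter
from typing import Optional


def agreed_transcription(
    answers: list[str],
    min_votes: int = 3,       # assumed minimum number of matching answers
    min_share: float = 0.75,  # assumed minimum fraction of answers that match
) -> Optional[str]:
    """Return the word most readers typed, or None if agreement is too weak."""
    # Normalize whitespace and case; a real system might preserve case.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if not cleaned:
        return None
    word, votes = Counter(cleaned).most_common(1)[0]
    if votes >= min_votes and votes / len(cleaned) >= min_share:
        return word
    return None


# Four hypothetical readers transcribe the same scanned word.
print(agreed_transcription(["Upon", "upon ", "upon", "up0n"]))
# Two readers disagree, so no transcription is accepted yet.
print(agreed_transcription(["a", "b"]))
```

Words accepted this way could then serve as labeled examples for the recognition software, which is the feedback loop the paragraph above describes.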
Posted: 2013-10-31 @ 4:26am PT
Of course Google does not say how it discriminates between "silicon and carbon units", because if the carbon units knew, they would be up in arms. There is research out there showing that data miners like Google or Facebook can reliably detect a carbon unit's age, gender, race, and sexual and political orientation using their secret recipe, which is nothing less than peeping over the carbon unit's shoulder as it surfs from web page to web page. Discrimination, for example in the context of the job market, is back with a vengeance, as the tracking enables Google to discern the WASPs from the rest and conveniently display their partner's job ads only to those targeted. Wake up, free world, before it is too late and the hard work of the post-war generation is flushed down the silicon drain.