Google Will Buy reCAPTCHA To Help Scan Books

By Jennifer LeClaire
September 17, 2009 8:27AM

Google plans to purchase startup reCAPTCHA to improve its book-scanning project. Free CAPTCHAs from reCAPTCHA already help protect more than 100,000 Web sites, and Google plans to continue that service. The words in those CAPTCHAs come from scanned text that computers struggle to read, and Google plans to use the technology to teach computers to read degraded text.

On Wednesday, Google announced plans to acquire a startup that helps Web sites combat spam and fraud. Google is investing an undisclosed amount to bring reCAPTCHA into its technology fold to address scanning challenges in the Google Books project.

reCAPTCHA is a free anti-bot service that helps digitize books. The company also provides CAPTCHAs to help protect more than 100,000 Web sites. A CAPTCHA is a program that can detect whether its user is a human or a computer.

CAPTCHAs appear as images with distorted text at the bottom of Web registration forms and are used by many Web sites to prevent abuse from automated programs written to generate spam. But Google also sees them as a way to teach computers to read.

Teaching Computers To Read

Luis von Ahn, cofounder of reCAPTCHA, and Google product manager Will Cathcart explained the reCAPTCHA twist: The words in many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books.

"Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text," von Ahn and Cathcart explained. "In this way, reCAPTCHA's unique technology improves the process that converts scanned images into plain text, known as optical character recognition (OCR)."
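How might typed answers from a crowd turn into a trusted reading of a scanned word? The article does not describe reCAPTCHA's actual algorithm, so the following Python snippet is only a minimal sketch of one plausible approach: collect several people's transcriptions of the same word and accept the most common answer once enough of them agree. The function name `resolve_word` and the thresholds `min_votes` and `min_agreement` are hypothetical, chosen for illustration.

```python
from collections import Counter
from typing import Optional


def resolve_word(transcriptions: list[str], min_votes: int = 3,
                 min_agreement: float = 0.6) -> Optional[str]:
    """Return the crowd's consensus reading of one scanned word, or None.

    Hypothetical helper: the thresholds are illustrative assumptions,
    not values used by reCAPTCHA.
    """
    # Normalize so that " Morning" and "morning" count as the same answer.
    cleaned = [t.strip().lower() for t in transcriptions if t.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough human answers collected yet
    answer, votes = Counter(cleaned).most_common(1)[0]
    # Accept the top answer only when a clear majority of typists agree.
    if votes / len(cleaned) >= min_agreement:
        return answer
    return None


if __name__ == "__main__":
    # Three of four typists agree, so the word is accepted as "morning".
    print(resolve_word(["morning", "Morning ", "moming", "morning"]))
```

A word that never reaches agreement would simply stay unresolved in this sketch and could be shown to more people.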

Now here's the Google-reCAPTCHA connection: OCR also powers large-scale text-scanning projects like Google Books and Google News Archive Search. As Google sees it, having the text version of documents is important because plain text can be searched, easily rendered on mobile devices, and displayed to visually impaired users.

Google plans to apply the reCAPTCHA technology not only to increase fraud and spam protection for Google products but also to improve the books and newspaper scanning process. Google will also continue to allow Web-site owners to use reCAPTCHA free of charge to protect their digital assets.

Between the Lines

Google is embroiled in a legal controversy over its Google Books project. Last October, Google reached a settlement in a class-action copyright suit filed by the Authors Guild and the Association of American Publishers. But Amazon and Microsoft, among others, are speaking out against the deal, which has not yet been approved in federal court.

"Having an archive of the world's knowledge is not something Google feels is outside the scope of its interests," said Brad Shimmin, an analyst at Current Analysis. "This acquisition is Google saying they are going to continue scanning books because they know they are within their rights to do so, and now they are going to do it better with this technology."

With reCAPTCHA, Shimmin sees opportunities for Google to stand out in the small crowd of players scanning out-of-copyright books. Although it may not seem like an earth-shattering acquisition, Shimmin said, it may help Google compete against Microsoft and Yahoo in the long term.

"With some of the moves Microsoft and Yahoo have been making lately, it's not a done deal that Google is going to be the leading search destination for the next 10 years," Shimmin said. "Google recognizes that danger and is constantly looking to not only broaden its portfolio but also deepen its capabilities in a way that differentiates the company from Microsoft and Yahoo. reCAPTCHA helps that cause."

© Copyright 2000-2014 NewsFactor Network. All rights reserved.