News & Information for Technology Purchasers
July 08, 2008
Journey to the Internet's Unknown Regions
By Tim McDonald
April 24, 2002 4:41PM

The average Web surfer cannot uncover information on the deep Web because of the ways in which most search engines get their listings.
 

You heard it a couple of years ago: The World Wide Web is far deeper and more vast than previously imagined. That is terrific news. But where are all those fascinating, hidden pockets of information that are supposedly out there, and why can't we find them on Yahoo?

"No one can argue with the fact the Web is huge and it continues to grow at an astronomical rate," Giga Information Group analyst Laura Ramos told NewsFactor.

"Really, what people are dealing with is how to harvest the valuable information out there, because there is a lot of good stuff and there is a lot of junk," Ramos said.

Billions More Documents

Experts estimate that the "surface Web" contains 1 billion to 2 billion documents, while the "deep Web" could contain as many as 550 billion. Put another way, the surface Web contains about 19 terabytes of information, while the deep Web contains about 7,500 terabytes.

A terabyte is a measure of data storage. One terabyte is the equivalent of about 1,600 CDs or 1,000 gigabytes.
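For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses only the estimates quoted above; the constant and function names are illustrative, and the 1,000-megabytes-per-gigabyte conversion is an assumption chosen to match the decimal figures in the text.

```python
# Estimates quoted in the article (not measurements).
SURFACE_TB = 19        # surface Web, in terabytes
DEEP_TB = 7_500        # deep Web, in terabytes
CDS_PER_TB = 1_600     # CDs per terabyte
GB_PER_TB = 1_000      # gigabytes per terabyte


def size_ratio(deep_tb, surface_tb):
    """How many times larger the deep Web estimate is than the surface Web estimate."""
    return deep_tb / surface_tb


def implied_cd_capacity_mb(gb_per_tb=GB_PER_TB, cds_per_tb=CDS_PER_TB):
    """Capacity per CD implied by the article's figures, assuming 1,000 MB per GB."""
    return gb_per_tb * 1_000 / cds_per_tb


if __name__ == "__main__":
    print(round(size_ratio(DEEP_TB, SURFACE_TB)))  # deep Web vs. surface Web, by volume
    print(implied_cd_capacity_mb())                # megabytes per CD
```

By these figures, the deep Web is roughly 395 times the size of the surface Web, and each CD is assumed to hold about 625 megabytes.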

There are more than 200,000 deep Web sites, more than half of which are located in topic-specific databases. About 95 percent of information on the deep Web is available to the public and is not subject to subscription fees.

Why Is It Hidden?

Many surfers cannot find these sites, however, because few other pages link to them.

Full-text search engines get their listings in one of two ways: Site developers can submit addresses to a search engine, asking to be indexed; or a search engine can use "spiders," which depend on links from existing sites to discover new ones.

While there is a huge amount of information on the deep Web, much of it is valuable primarily to researchers, scholars and the merely curious, so it may have few, if any, links. Without such links, search engines can find such sites only by chance.
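The link-following behavior described above can be illustrated with a toy model. The sketch below is not how any particular search engine works; it is a simple breadth-first traversal over a made-up link graph, and every site name in it is hypothetical. It shows only that a page with no inbound links is never reached from the starting point.

```python
from collections import deque


def crawl(link_graph, seeds):
    """Return the set of pages reachable by following links from the seed pages."""
    discovered = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in discovered:
                discovered.add(target)
                queue.append(target)
    return discovered


# Hypothetical toy Web: each key is a page, each value lists the pages it links to.
toy_web = {
    "portal.example": ["news.example", "shop.example"],
    "news.example": ["portal.example"],
    "shop.example": [],
    "archive.example": ["news.example"],  # links out, but nothing links in
}

found = crawl(toy_web, ["portal.example"])
missed = set(toy_web) - found
print(sorted(found))
print(sorted(missed))
```

In this toy example the spider starts at one well-known page and never discovers "archive.example," because no page it visits points there. Submitting the address directly, the other route the article mentions, would amount to adding it to the seed list.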

Also, more and more information is being stored by governments, universities and corporations in monster databases. These databases cannot be accessed by conventional search engines, which identify "static" pages rather than the "dynamic" pages used by large databases. Information in such databases can be accessed only by a direct query.
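The difference between static and dynamic pages can be sketched the same way. In the hypothetical example below, an indexer reads only the fixed pages of a made-up site, while the records themselves live in a database that produces results only in response to a query. The page paths, record titles and function names are all invented for illustration.

```python
# Hypothetical static pages: fixed text that an indexer can read directly.
STATIC_PAGES = {
    "/index.html": "Welcome to the records office. Use the search form.",
    "/about.html": "About the records office.",
}

# Hypothetical database records: never published as fixed pages.
RECORDS = [
    {"id": 1, "title": "Harbor survey 1987"},
    {"id": 2, "title": "Rainfall tables 1990"},
]


def build_index(pages):
    """Map each word found on the static pages to the set of URLs containing it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word.strip(".,"), set()).add(url)
    return index


def query_database(term):
    """Return titles of records matching the term, as a direct query would."""
    term = term.lower()
    return [r["title"] for r in RECORDS if term in r["title"].lower()]


index = build_index(STATIC_PAGES)
print(index.get("rainfall", set()))  # what the static index knows about "rainfall"
print(query_database("rainfall"))    # what a direct query returns
```

The static index has no entry for "rainfall," even though the database holds a matching record, which is the gap the article describes.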

Theoretically, search engines create and maintain their own databases in an effort to index the entire Web. But even the biggest and best search engines can index only between one-third and one-half of all publicly available documents.

© Copyright 2000-2008 NewsFactor Network. All rights reserved.