What is the Deep Web?

M. McGee

The deep web is the portion of the Internet that standard search methods cannot reach. A standard search engine finds webpages by loading a single page and following every link on it. This lets the engine extend out from that page like a giant spider web, finding page after page through linking. The process captures only a fraction of the pages that exist on the Internet; huge amounts of data are never indexed, for one of many reasons. These pages will never come up in a standard search engine and are, therefore, invisible to most web users.
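The link-following process described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine is built: the page names and the `LINKS` table are hypothetical, and a real crawler would download pages rather than read them from a dictionary.

```python
from collections import deque

# Hypothetical toy web: each page maps to the pages it links to.
LINKS = {
    "home": ["news", "about"],
    "news": ["story-1", "home"],
    "about": [],
    "story-1": [],
    "orphan": ["home"],  # links out, but no page links to it
}


def crawl(seed, links):
    """Return every page reachable from `seed` by following links."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


indexed = crawl("home", LINKS)
missed = set(LINKS) - indexed  # pages the crawler never discovers
```

Starting from "home", the crawler reaches every page except "orphan", because no page it visits links there. In this toy model, "orphan" plays the role of a deep web page.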

The surface web is the part of the Internet with which most users are familiar. This portion contains the standard webpages and web services that most users know about. The deep web is composed of information that only specific groups of Internet users are aware of or have access to. The deep web is enormous compared to the surface web; in the year 2000, it was estimated to be nearly 50 times larger than the surface web.

The deep web exists mainly because of limitations on search engines. As search engines follow links, they are unable to access certain types of web pages. These pages never enter the system and, therefore, are never indexed. A user who searches for one of these pages will never find it, because the search engine records neither its existence nor its failure to access it.

Several page types are difficult or impossible for a search engine to index. Dynamic and database-driven webpages are practically impossible, as they require specific input to exist. These web pages are generated on the spot, often in response to user input. Since a dynamic page doesn't exist until it is needed, search engines skip it because they don't know what to ask for.
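A small Python sketch may make the dynamic-page problem concrete. Everything here is hypothetical: the `CATALOG` data, the `search_page` function, and the `HOME_PAGE` markup stand in for a database-backed site whose result pages are built only when someone submits a search term.

```python
from html.parser import HTMLParser

# Hypothetical database behind the site.
CATALOG = {
    "python": ["Learning Python", "Python Cookbook"],
    "gardening": ["The Kitchen Garden"],
}

# The only static page: a search form with no ordinary links on it.
HOME_PAGE = '<form action="/search"><input name="q"></form>'


def search_page(term):
    """Build a results page on the spot for the given search term."""
    titles = CATALOG.get(term.lower(), [])
    items = "".join(f"<li>{title}</li>" for title in titles)
    return f"<h1>Results for {term}</h1><ul>{items}</ul>"


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, as a link-following crawler would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Calling `extract_links(HOME_PAGE)` returns an empty list, so a crawler that only follows links has nowhere to go. The results page for "python" exists only once `search_page("python")` is called, and the crawler has no way of knowing which terms to submit.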

Private or gated webpages make up another large portion of the deep web. Since these pages require credentials or login information and the search engine has neither, it is blocked from accessing information on the other side of the login. Even so, some login-based sites are part of the surface web. These websites set up special provisions that allow engines to search their pages. This is common among sites that have open registration and want to generate additional traffic.

Another large portion of the deep web is made up of unlinked or restricted websites. Either no outside pages link to them, or they actively block crawlers from following the links that do exist. This prevents search engines from ever stumbling upon the pages, so they are never added to any listings. This used to be common among personal webpages, but changes in modern web use have made most personal pages linked and indexed.
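One common way for a site to restrict crawlers is a robots.txt file, which the article does not name; it is used here as an assumed example of active blocking. The sketch below uses Python's standard `urllib.robotparser` module; the rules, the `ExampleBot` name, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: keep all crawlers out of /private/.
RULES = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(RULES)


def may_crawl(url, agent="ExampleBot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


public_ok = may_crawl("https://example.com/index.html")
private_ok = may_crawl("https://example.com/private/diary.html")
```

Under these rules a well-behaved crawler would fetch the index page but skip anything under /private/, so those pages would stay out of the search listings even though links to them exist.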

Discussion Comments


This is the dimension of the web where the C.I.A stores all of our information and the records of extraterrestrial life.
