The surface web, also called the visible web, is the portion of the Web that is freely available to the general public. Almost any page with a simple web address [http://www.servername.domain/filename] is a surface web page. These pages are indexed [crawled] by web crawlers that are used to build the databases of search engines such as Google and Bing.
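As a rough illustration of how a crawler discovers surface web pages, the sketch below pulls the links out of one page so that they could be visited next. It is a minimal example, not how Google or Bing actually work: the `LinkCollector` class, the `extract_links` helper, and the sample page are all hypothetical, and a real crawler would download pages over HTTP instead of reading a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own address.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link found in the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content standing in for a downloaded page.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <a href="http://www.example.org/news.html">News</a>
</body></html>
"""

links = extract_links("http://www.example.com/index.html", SAMPLE_PAGE)
print(links)
```

A crawler repeats this step for each link it finds, which is why a page that no other page links to, or that exists only after a form is submitted, tends to stay out of the index.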
The Deep Web, also known as the Invisible Web, is a portion of the web not reached by standard search engines such as Google and Bing. "It’s almost impossible to measure the size of the Deep Web. While some early estimates put the size of the Deep Web at 4,000-5,000 times larger than surface web, the changing dynamic of how information is accessed and presented means that the Deep Web is growing exponentially and at a rate that defies quantification" (BrightPlanet: "How large is the deep web?").
Most search engines do not find content on the Deep Web because it is stored in databases rather than in static HTML pages. Google and Bing might lead you to a front door [a search interface], but they generally can't search the contents of the database. It is up to you to search the database; the results of your search are then loaded into a dynamically generated HTML page for viewing.
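The idea of a dynamically generated results page can be sketched as follows. This is an illustrative example under stated assumptions: the in-memory `articles` table, its sample titles, and the `results_page` function are hypothetical and do not describe any particular database provider.

```python
import sqlite3
from html import escape

# Hypothetical catalog; a real deep web database would hold far more records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [
        ("Reading habits of undergraduates", 2011),
        ("Library anxiety revisited", 2015),
        ("Reading on screens", 2018),
    ],
)


def results_page(keyword):
    """Build an HTML results page for one search request."""
    rows = conn.execute(
        "SELECT title, year FROM articles WHERE title LIKE ? ORDER BY year",
        (f"%{keyword}%",),
    ).fetchall()
    items = "".join(f"<li>{escape(title)} ({year})</li>" for title, year in rows)
    return (
        f"<html><body><h1>Results for {escape(keyword)}</h1>"
        f"<ul>{items}</ul></body></html>"
    )


# The page below does not exist anywhere until someone runs this search.
page = results_page("reading")
print(page)
```

Because the HTML is assembled only in response to a search, there is no fixed page for a crawler to follow a link to, which is why this content usually stays out of search engine indexes.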
Some database providers have found it valuable to program their database content to show up on the surface web. For example, if you are searching for a product for sale such as a Halloween costume, a Google search will send you directly to the matching product pages in the databases of Amazon, Party City, and Spirit Halloween. A search for a movie title will lead you to the IMDb page for that movie.
The Hidden Web, also known as the Private Web, is the portion of the web viewable by a restricted set of people. Web resources can be restricted with a firewall, by IP address [UW Libraries databases], or by password [UW Canvas]. Neither search engines nor tools designed to search the deep web will find this content.
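As one illustration of IP address restriction, the sketch below checks whether a visitor's address falls inside an approved range before granting access. The `ALLOWED_NETWORKS` list and the `is_allowed` function are hypothetical, and the address range is a reserved documentation block, not a real campus network.

```python
import ipaddress

# Hypothetical approved range (reserved for documentation, not a real network).
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_allowed(client_ip):
    """Return True if the visitor's address is inside an approved network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


print(is_allowed("192.0.2.17"))   # an address inside the approved range
print(is_allowed("203.0.113.5"))  # an address outside the approved range
```

A search engine's crawler connects from outside the approved range, so it is turned away in the same way as any other off-campus visitor.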
For more information about web crawlers, see "A brief history of web crawlers" from CASCON '13: Proceedings of the 2013 Conference of the Center for Advanced Studies on Collaborative Research.
|Type of content||Examples|
|Database content [dynamically generated for a particular inquiry]||ERIC|
|Subscription databases||EBSCOhost, LexisNexis Academic|
|Deep websites||Library of Congress, U.S. Census Bureau|
|Real-time content|| |
|Formats||Any new format|
|Sites that require login|| |
|Sites that require that forms be filled out||Sites offering travel directions, job hunting sites|
|New content||Any new websites or content newly added to an existing website|
|Sites with a no-index protocol||Private websites|
|Social networking sites||Facebook, LinkedIn, etc.|
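One common way a site asks crawlers to stay away is a robots.txt file; the sketch below assumes that is the kind of "no-index protocol" the table refers to. It uses Python's standard `urllib.robotparser` module, and the rules, the `ExampleBot` crawler name, and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to skip the /private/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler checks each address before requesting it.
public_ok = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
private_ok = parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html")

print(public_ok, private_ok)
```

These rules are a request, not a lock: they keep cooperative search engines from indexing a page, but they do not stop a person who already knows the address from opening it.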
Content adapted from: http://library.laguardia.edu/invisibleweb/characteristics