Revision as of 10:48, 15 August 2005
- For the world's first web browser, see WorldWideWeb.
This article's factual accuracy is disputed. Relevant discussion may be found on the talk page.
The World Wide Web ("WWW", "W3", or simply "Web") is an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URIs). The term is often mistakenly used as a synonym for the Internet, but the Web is actually a service that operates over the Internet.
Basic terms
Hypertext is viewed using a program called a web browser which retrieves pieces of information, called "documents" or "web pages", from web servers and displays them, typically on a computer monitor. One can then follow hyperlinks on each page to other documents or even send information back to the server to interact with it. The act of following hyperlinks is often called "surfing" or "browsing" the Web. Web pages are often arranged in collections of related material called "web sites."
Although the English word worldwide is normally written as one word (without a space or hyphen), the proper name World Wide Web and the abbreviation WWW are now well established even in formal English. The earliest references to the Web called it the WorldWideWeb (an example of computer programmers' fondness for intercaps) or the World-Wide Web (with a hyphen; this version of the name is the closest to normal English usage).
How the web works
When you type into your browser the address of the website that you want to view, how does your computer know where to find the computer that contains that website? You can think of it as a global phone network, and what you are actually trying to do is find out the phone number of the website so that you can call it. These phone numbers are called IP addresses.
Finding that IP address is the job of the Domain Name System (DNS). Every computer that has access to the Internet has access to a computer that runs a DNS server. Usually your ISP provides this DNS server, and everyone who uses that ISP has access to it. Larger companies have their own DNS server, which is accessible to everyone who works in that company.
So, your computer sends the website address, e.g. wikipedia.com, to the DNS server, which reads the name from right to left, starting with the letters to the right of the last dot, in this case "com". The DNS server first asks one of the root servers; there are thirteen root server addresses, served from locations around the world. The root servers do not hold a list of websites. Instead, they know which servers are responsible for each top-level domain, such as ".com", ".org", ".net" and ".us".
The top-level domain servers, in turn, hold a list of each registered name and the IP address (phone number) of the computer that is responsible for knowing the exact IP address of the computer that contains the website. The top-level domain server doesn't actually know the exact address for the website, but it does have the address of the computer that has the answer.
The top-level domain server receives the request from your DNS server, looks up the IP address of the responsible computer and sends the information back to your DNS server. Your DNS server then sends out another request, this time to the computer that knows the exact address, and gets the specific IP address of the computer where the website actually resides. The DNS server then hands this IP address to your browser, which opens a connection to that IP address by way of your local router and sends the HTTP request for the web page.
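The chain of referrals described above can be sketched as three lookup tables: one for the root, one for the top-level domain servers, and one for the computer that holds the final answer. This is only a toy model under stated assumptions: every server name and IP address below is made up (the addresses come from ranges reserved for documentation), the function name resolve is hypothetical, and no real DNS messages are sent.

```python
# Toy model of a DNS lookup. All names and addresses are invented for
# illustration; a real resolver exchanges network messages instead.

# Root level: which server is responsible for each top-level domain.
ROOT = {"com": "tld-com.example", "org": "tld-org.example"}

# Top-level domain level: which server knows the answer for each name.
TLD = {
    "tld-com.example": {"wikipedia.com": "ns.wikipedia.example"},
    "tld-org.example": {"wikipedia.org": "ns.wikipedia.example"},
}

# Final level: the servers that know the actual IP addresses.
AUTHORITATIVE = {
    "ns.wikipedia.example": {
        "wikipedia.com": "192.0.2.10",
        "wikipedia.org": "192.0.2.20",
    },
}


def resolve(name):
    """Follow the referrals root -> TLD -> final server.

    Returns the IP address and the list of (asked, answer) steps taken.
    Only handles two-part names such as "wikipedia.com".
    """
    steps = []

    # Step 1: break off the letters to the right of the last dot.
    tld = name.rsplit(".", 1)[-1]
    tld_server = ROOT[tld]
    steps.append(("root", tld_server))

    # Step 2: the TLD server points to the computer that has the answer.
    answer_server = TLD[tld_server][name]
    steps.append((tld_server, answer_server))

    # Step 3: that computer returns the IP address of the website.
    ip_address = AUTHORITATIVE[answer_server][name]
    steps.append((answer_server, ip_address))

    return ip_address, steps


if __name__ == "__main__":
    ip, trail = resolve("wikipedia.com")
    for asked, answer in trail:
        print(f"asked {asked}: got {answer}")
```

Each step only hands back a pointer to the next computer to ask, which is why no single server needs to know the address of every website.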
When a web browser is commanded to open a web page, it communicates on behalf of the client (or computer being used) using the Hypertext Transfer Protocol (HTTP) and makes a web page request. The letters "http" at the beginning of a web page address indicate this protocol, and the rest of the address tells the browser which server to talk to. Once the request is sent, the client computer waits for a hypertext data stream from the server. When the server gets the request, it looks for the requested file and, if present, sends it to the client as requested.
For example, if 'http://en.wikipedia.org' is entered into a browser, the client computer will connect to the server known as 'en.wikipedia.org' and send it an HTTP request. The server gets the request and, since it is running web server software and is set up to handle such a request, responds by sending the hypertext it was programmed to send. The client computer then accepts the hypertext (according to web standards) and begins rendering the web page in the browser's window.
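To make the request concrete, the following sketch builds the text a client might send for a given address. It is an illustration rather than what any particular browser does: the function name build_get_request is hypothetical, the request uses HTTP/1.1 with only a minimal set of headers, and nothing is actually sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, request_text) for a simple HTTP GET of the given URL."""
    parts = urlsplit(url)

    # An address with no path, such as http://en.wikipedia.org, asks for "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    lines = [
        f"GET {path} HTTP/1.1",    # request line: method, resource, version
        f"Host: {parts.netloc}",   # which site on the server is wanted
        "Connection: close",       # ask the server to close when finished
        "",                        # blank line marks the end of the headers
        "",
    ]
    # HTTP separates header lines with carriage return + line feed.
    return parts.hostname, "\r\n".join(lines)


if __name__ == "__main__":
    host, request = build_get_request("http://en.wikipedia.org")
    print("connect to:", host)
    print(request)
```

The host name is what gets handed to DNS for the lookup, while the request text is what travels to the server once the connection is open.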
Origins
See also: History of the Internet
The Web can be traced back to a project at the European Organization for Nuclear Research (CERN) in 1980, when Tim Berners-Lee built ENQUIRE (short for Enquire Within Upon Everything, a book Berners-Lee recalled from his youth). While it was rather different from the Web we use today, it contained many of the same core ideas (and even some of the ideas of Berners-Lee's next project, the Semantic Web). Berners-Lee mentions that much of the motivation behind the project was so that he could access library information that was scattered on several different servers at CERN.
Tim Berners-Lee and Robert Cailliau published a more formal proposal for the World Wide Web on November 12, 1990, and Berners-Lee wrote the first Web page on November 13 on a NeXT workstation. Over Christmas of that year Berners-Lee built all the tools necessary for a working Web: the first Web browser (which was a web editor as well) and the first web server. On August 6, 1991, he posted a short summary of the World Wide Web project on the alt.hypertext newsgroup.
The primary underlying concept of hypertext came from earlier efforts, such as Ted Nelson's Project Xanadu and Douglas Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar Bush's microfilm-based "memex," which was described in the 1945 essay "As We May Think".
Berners-Lee's brilliant breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he explains that he had repeatedly suggested that a marriage between the two technologies was possible to members of both technical communities, but when no one took up his invitation, he finally tackled the project himself. In the process, he developed a system of globally unique identifiers for resources on the Web and elsewhere: the Uniform Resource Identifier.
The World Wide Web had a number of differences from other hypertext systems that were then in place.
- The WWW required only unidirectional links rather than bidirectional ones. This made it possible for someone to link to another resource without action by the owner of that resource. It also significantly reduced the difficulty of implementing Web servers and browsers (in comparison to earlier systems), but in turn presented the chronic problem of broken links.
- Unlike certain applications such as HyperCard or Gopher, the World Wide Web was non-proprietary, making it possible to develop servers and clients independently and to add extensions without licensing restrictions.
On April 30, 1993, CERN announced that the World Wide Web would be free to anyone, with no fees due.
Web standards
At its core, the Web is made up of three standards:
- The Uniform Resource Locator (URL), which is a universal system for addressing individual pages;
- The HyperText Transfer Protocol (HTTP), which specifies how the browser and server communicate with each other; and
- The HyperText Markup Language (HTML), which allows a page to control how its information is presented.
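As a rough illustration of how two of these standards fit together, the sketch below reads a small piece of HTML and turns each hyperlink into a full URL. The sample page, the class name LinkCollector and the function name extract_links are assumptions made for this example; only Python's standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> links as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are resolved against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


# A made-up page with one relative and one absolute hyperlink.
PAGE = (
    "<html><body><p>See <a href=\"/wiki/Hypertext\">hypertext</a> and "
    "<a href=\"http://www.w3.org/\">the W3C</a>.</p></body></html>"
)


def extract_links(html, base_url):
    """Return every hyperlink found in the HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    for link in extract_links(PAGE, "http://en.wikipedia.org/wiki/World_Wide_Web"):
        print(link)
```

A browser does something similar when a link is followed: the HTML supplies the link target, the URL rules turn it into a complete address, and HTTP is then used to fetch it.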
Berners-Lee now heads the World Wide Web Consortium (W3C), which develops and maintains these and other standards that enable computers on the Web to effectively store and communicate different forms of information.
Beyond text
The initial "www" program at CERN displayed styled text and images, and it was a WYSIWYG HTML editor as well as the browser.
As it ran only on NeXT machines, CERN released a simple, text-only version to the world. Some journalists first encountered the Web through the text browser written by Nicola Pellow and this engendered a myth that the Web was text-only until Mosaic came along. The Web had graphics from the start, at least for NeXT users.
Meanwhile, browsers such as Tony Johnson's "Midas" and Pei-Yuan Wei's Viola (1991) added the ability to display graphics on other Unix machines as well. Marc Andreessen of NCSA released a browser called "Mosaic for X" in 1993 that sparked a tremendous rise in the popularity of the Web among novice users. Andreessen went on to found Mosaic Communications Corporation (now Netscape Communications Corporation, a unit of Time Warner). Additional features such as dynamic content, music and animation can be found in modern browsers.
Browser makers do not always adhere to the standards set forth by the W3C, so it is not uncommon for these newer features not to work properly on all browsers. The ever-improving technical capability of the WWW has enabled the development of real-time web-based services such as webcasts, Internet radio and live web cams.
Java and JavaScript
Another significant advance in the technology was Sun Microsystems' Java programming language. It initially enabled web servers to embed small programs (called applets) directly into the information being served; these would run on the user's computer, allowing faster and richer user interaction. Java later came to be more widely used as a tool for generating complex server-side content as it is requested.
JavaScript, however, is a scripting language that was developed for Web pages (the standardized version is ECMAScript). While its name is similar to Java, it was developed by Netscape and not Sun Microsystems, and the two languages have almost nothing to do with each other. In conjunction with the Document Object Model, JavaScript has become a much more powerful language than its creators originally envisioned. Its usage is sometimes described by the term Dynamic HTML (DHTML), to emphasize a shift away from static HTML pages.
Sociological implications
The exponential growth of the Internet was primarily attributed to the emergence of the web browser Mosaic, followed by its commercial offspring Netscape Navigator, during the mid-1990s.
It brought unprecedented attention to the Internet from media, industries, policy makers, and the general public.
Eventually, it led to several visions of how modern societies might change into information societies, although some point out that those visions are not unique to the Internet, but repeated with many new technologies (especially information and communications technologies) of various eras.
Because the Web is global in scale, some suggested that it would nurture mutual understanding on a global scale.
Publishing web pages
The Web is available to individuals outside mass media. In order to "publish" a web page, one does not have to go through a publisher or other media institution, and potential readers could be found in all corners of the globe.
Unlike books and documents, hypertext does not have a linear order from beginning to end. It is not broken down into the hierarchy of chapters, sections, subsections, etc.
Many different kinds of information are now available on the Web, and for those who wish to know other societies, their cultures and peoples, it has become easier. When travelling in a foreign country or a remote town, one might be able to find some information about the place on the web, especially if the place is in one of the developed countries. Local newspapers, government publications, and other materials are easier to access, and therefore the variety of information obtainable with the same effort may be said to have increased, for the users of the Internet.
Although some websites are available in multiple languages, many are in the local language only. Also, not all software supports all special characters, and RTL languages. These factors would challenge the notion that the World Wide Web will bring a unity to the world.
The increased opportunity to publish materials is certainly observable in the countless personal pages, as well as pages by families, small shops, etc., facilitated by the emergence of free web hosting services.
Statistics
According to a 2001 study, there were more than 550 billion documents on the Web, mostly in the "invisible web". A 2002 survey of 2,024 million web pages determined that by far the most Web content was in English: 56.4%; next were pages in German (7.7%), French (5.6%) and Japanese (4.9%). These numbers are no longer accurate, as there has been a recent surge in Chinese websites. A more recent study, which used web searches in 75 different languages to sample the web, determined that there were over 11.5 billion web pages in the publicly indexable web as of January 2005.
Speed issues
Frustration over congestion in the Internet infrastructure and the high latency that results in slow browsing has led to an alternative name for the Web: the World Wide Wait. How to speed up the Internet is the subject of ongoing discussion, including the use of peering and QoS technologies. Other proposals for reducing the World Wide Wait can be found at the W3C.
Academic conferences
The major academic event covering the WWW is the World Wide Web series of conferences, promoted by IW3C2. There is a list with links to all conferences in the series.
Pronunciation of "www"
Most English-speaking people pronounce the 9-syllable letter sequence www used in some domain names for websites as "double U, double U, double U", but many shorter pronunciations can be heard: "triple double U", "double U, double U" (omitting one W), "dub, dub, dub", "hex u", etc. Some speakers, mostly those with southern United States accents, pronounce the sequence "dubya, dubya, dubya."
Some languages do not have the letter w in their alphabet (for example, Italian), which leads some people to pronounce www as "vou, vou, vou." In some languages (such as Czech) the w is substituted by a v, so Czechs pronounce www as "veh, veh, veh" rather than the correct but much longer pronunciation "dvojité veh, dvojité veh, dvojité veh." Several other languages (e.g. German, Dutch etc.) simply pronounce the letter W as a single syllable, so this problem doesn't occur.
Depending on how the domain and web server are set up, a www website can often be accessed without entering the "www.", as long as the ".com" or other appropriate top-level domain is appended. Even this is not always necessary as some browsers will automatically try adding "www." and ".com" to typed URIs if a web page isn't found without them.
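The fallback described above can be sketched as a short list of host names to try in order. This is a hypothetical illustration only: browsers differ in whether and how they do this, and the function name candidate_hosts and the exact order of attempts are assumptions.

```python
def candidate_hosts(typed):
    """Return the host names to try, in order, for what the user typed."""
    candidates = [typed]

    # If no top-level domain was typed, assume ".com".
    with_tld = typed if "." in typed else typed + ".com"
    if with_tld not in candidates:
        candidates.append(with_tld)

    # As a last resort, try the same name with "www." in front.
    if not with_tld.startswith("www."):
        candidates.append("www." + with_tld)

    return candidates


if __name__ == "__main__":
    print(candidate_hosts("wikipedia"))
    print(candidate_hosts("wikipedia.org"))
```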
In English pronunciation, saying the full words "World Wide Web" takes one-third as many syllables as saying the initialism "www". According to Berners-Lee, others mentioned this fact as a reason to choose a different name, but he persisted.
See also
- History of the Internet
- Semantic Web
- Media studies
- Smartphone
- List of websites
- Search engine
- Web directory
- Hypertext
- First image on the Web
- Streaming media
- Cyberzine
External links
- Open Directory - Computers: Internet: Web Design and Development
- World Wide Web, the first known web page.
- Internet Statistics: Growth and Usage of the Web and the Internet
Standards
The following is a cursory list of the documents that define the World Wide Web's three core standards:
- Uniform Resource Locator (URL)
- RFC 1738, URL Specification
- Hypertext Markup Language (HTML)
- HyperText Transfer Protocol (HTTP)
- RFC 1945, HTTP version 1.0
- RFC 2616, HTTP version 1.1