HTTP: Difference between revisions

imported>Eric M Gearhart
(spelling and added a little flow to the section)
imported>Howard C. Berkowitz
(More generalization)
Revision as of 10:54, 17 July 2008

HTTP (the Hypertext Transfer Protocol) is the network protocol on which the World Wide Web is based. Its original purpose was the transfer of HTML pages, but it is now used to transfer any type of document. It supports metadata about the transfer, but not the presentation, of information. It does not define the content of the pages themselves; that is the function of the Hypertext Markup Language and other page description methods such as cascading style sheets (CSS).

HTTP is a relatively simple protocol, which relies on the Transmission Control Protocol to ensure its traffic is carried, free from errors, over Internet Protocol networks. It works in the same manner whether the users or servers are connected to the public Internet, an intranet, or an extranet. HTTP does not itself secure the message transfer; for that it needs to be supplemented, for example by running it over TLS.[1]

The World Wide Web is more than HTML and HTTP alone. It also includes a wide range of administrative techniques and performance-enhancing methods such as web caches and content distribution networks, which build on HTTP's robust caching system.

History

HTTP was created at CERN by Tim Berners-Lee, following his March 1989 proposal, as a way to share hypertext documents.[2] After 1990, the protocol began to be used by other sites, primarily in the scientific world. Notable developments were the Mosaic web browser, written by Marc Andreessen and Eric Bina, and the NCSA HTTPd web server, written by Rob McCool, both developed at the National Center for Supercomputing Applications.

The first (1990) version of HTTP, referred to as HTTP/0.9, was a simple protocol for raw data transfer across the Internet. HTTP/1.0, as defined by RFC 1945 (1996), improved the protocol by allowing messages to be in a MIME-like format, containing metadata about the data transferred and modifiers that tell the recipient how to handle the request and response.

Experience with the operational Web showed, however, that HTTP/1.0 did not deal well with real-world needs such as hierarchical proxies, web caches, persistent connections for long sessions, and virtual web servers. It also had enough optional features that a client and server needed to exchange information about their capabilities before the transfer of user information could begin. To meet those needs, HTTP/1.1 was developed.[3]

Technical details

HTTP follows a client-server model, in which the client sends the server a request for a resource and the server returns a response. Requests and responses consist of a start line, several headers and, optionally, a body. Resources are identified using a URI (Uniform Resource Identifier).
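
The message layout described above can be illustrated with a minimal Python sketch. It assumes an HTTP/1.x message already held in memory as bytes; the function name `split_message` and the host `www.example.org` are hypothetical, and the sketch ignores details such as repeated or folded header lines.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.x message into (start_line, headers, body)."""
    # An empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# A hypothetical GET request with two headers and no body.
request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: www.example.org\r\n"
    b"Accept: text/html\r\n"
    b"\r\n"
)
start_line, headers, body = split_message(request)
print(start_line)       # GET /index.html HTTP/1.1
print(headers["host"])  # www.example.org
```

The same function works for a response, whose start line carries a status code instead of a method and URI.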

Request methods

HTTP/1.1 defines eight request methods that clients can use:

  • HEAD
  • GET
  • POST
  • PUT
  • DELETE
  • TRACE
  • OPTIONS
  • CONNECT

Typically, only GET, HEAD and POST methods are used in web applications, although protocols like WebDAV make use of others.
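
As a rough illustration of how the method appears on the wire, the following Python sketch builds a request message and rejects any method outside the eight listed above. The helper name `build_request` and the host are hypothetical, and a real server may also accept extension methods (such as those of WebDAV) that this allow-list would refuse.

```python
# The eight request methods listed above; method names are case-sensitive.
HTTP11_METHODS = frozenset(
    {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"}
)


def build_request(method: str, target: str, host: str, body: bytes = b"") -> bytes:
    """Build a minimal HTTP/1.1 request message as bytes."""
    if method not in HTTP11_METHODS:
        raise ValueError(f"unknown request method: {method!r}")
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    if body:
        # Tell the server how many bytes of body follow the headers.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


print(build_request("HEAD", "/", "www.example.org"))
```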

Status codes

Server responses begin with a status line, which informs the client whether the request succeeded. The status line includes a "status code" and a "reason phrase" (descriptive text).
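
The status code and reason phrase can be separated with a few lines of Python. This is a minimal sketch that assumes the first line of a response has the form "version code reason"; the function name `parse_status_line` is hypothetical.

```python
def parse_status_line(line: str):
    """Split the first line of a response into (version, code, reason)."""
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"malformed status line: {line!r}")
    version, code = parts[0], int(parts[1])
    # The reason phrase may contain spaces, and may be empty.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
# ('HTTP/1.1', 404, 'Not Found')
```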

Status code classes

Status codes are grouped into classes:

  • 1xx (informational): Request received, continuing process
  • 2xx (success): The action was successfully received, understood, and accepted
  • 3xx (redirect): Further action must be taken in order to complete the request
  • 4xx (client error): The request contains bad syntax or cannot be fulfilled
  • 5xx (server error): The server failed to fulfill an apparently valid request

For example, if the client requests a non-existent document, the status code will be "404 Not Found".
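
Because the class is determined by the first digit alone, it can be derived with integer division. The following is a minimal Python sketch; the names `STATUS_CLASSES` and `status_class` are hypothetical.

```python
# Class names as given in the list above, keyed by the first digit.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class name for a three-digit status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    return STATUS_CLASSES[code // 100]


print(status_class(404))  # client error
```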

According to RFC 2616:

HTTP applications are not required to understand the meaning of all registered status codes, though such understanding is obviously desirable. However, applications MUST understand the class of any status code, as indicated by the first digit, and treat any unrecognized response as being equivalent to the x00 status code of that class, with the exception that an unrecognized response MUST NOT be cached.
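
The fallback rule quoted above could be sketched in Python as follows. The set `KNOWN_CODES` is a deliberately small, hypothetical stand-in for the codes a particular client understands, and the function name `interpret` is likewise an assumption made for illustration.

```python
# Hypothetical subset of codes this client understands.
KNOWN_CODES = frozenset({100, 200, 301, 304, 400, 404, 500})


def interpret(code: int):
    """Return (effective_code, recognized) for a response status code.

    An unrecognized code is treated as the x00 code of its class, and the
    caller must not cache a response whose code was not recognized.
    """
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    if code in KNOWN_CODES:
        return code, True
    return (code // 100) * 100, False


print(interpret(431))  # (400, False)
print(interpret(404))  # (404, True)
```

Returning the `recognized` flag alongside the effective code lets the caller apply the "MUST NOT be cached" exception without repeating the lookup.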

All status codes

All the codes are described in Section 10 of RFC 2616.

HTTP header and cache management

The HTTP message header includes a number of fields used to facilitate cache management. One of these, ETag (entity tag), is a string-valued field whose value must change whenever the page (or other resource) changes in any way (strong entity tag) or whenever it changes in a semantically significant way (weak entity tag). This allows browsers or other clients to determine whether or not the entire resource needs to be downloaded. The HEAD method, which returns the same message header that would be included in the response to a GET request, can be used to determine if a cached copy of the resource is up to date without actually downloading a new copy. Other elements of the message header can be used to indicate when a copy should expire (no longer be considered valid), or that it should not be cached at all. The latter is useful when data is generated dynamically (for example, the number of visits to a web site).
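
The ETag check described above can be sketched in Python. This sketch assumes the headers of a HEAD response have already been collected into a dictionary. The function name `cached_copy_is_current` and the tag values are hypothetical, and the comparison is a plain string match that does not treat weak tags (those prefixed with `W/`) specially.

```python
def cached_copy_is_current(cached_etag, head_headers) -> bool:
    """Compare a stored ETag with the one returned by a HEAD request."""
    # Header names are case-insensitive, so normalize them before lookup.
    headers = {name.lower(): value for name, value in head_headers.items()}
    current_etag = headers.get("etag")
    if not cached_etag or not current_etag:
        # Without a tag on both sides there is nothing to compare;
        # fall back to downloading the resource again.
        return False
    return cached_etag.strip() == current_etag.strip()


# Hypothetical header values from a HEAD response.
print(cached_copy_is_current('"abc123"', {"ETag": '"abc123"'}))  # True
print(cached_copy_is_current('"abc123"', {"ETag": '"def456"'}))  # False
```

A client that gets `False` would issue a normal GET and replace its cached copy and stored tag.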

References

  1. Rescorla, E. (May 2000), HTTP Over TLS, Internet Engineering Task Force, RFC 2818, http://www.ietf.org/rfc/rfc2818.txt
  2. Berners-Lee, Tim (March 1989), Tim Berners-Lee's proposal: "Information Management: a Proposal", http://info.cern.ch/Proposal.html
  3. Fielding, R. et al. (June 1999), Hypertext Transfer Protocol -- HTTP/1.1, Internet Engineering Task Force, RFC 2616, http://www.ietf.org/rfc/rfc2616.txt