doc:urlencoding

This is my understanding of how escaping is done in URLs (also known as URL encoding).

Standards

A related and often-mentioned format is application/x-www-form-urlencoded, from the HTML 4.01 Specification.

RFC 3986

RFC 2396

RFC 1738

application/x-www-form-urlencoded

Quote from HTML 4.01 Spec:

  1. Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as “CR LF” pairs (i.e., `%0D%0A').
  2. The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'.
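The two quoted steps can be sketched in a few lines of Python. This is a minimal sketch under a strict reading of the quote (every non-alphanumeric character except space is escaped), not a description of what any particular browser or library does. The function names `form_escape` and `form_encode` are made up for illustration, and encoding non-ASCII text as UTF-8 is an assumption, since the quoted text only talks about ASCII codes.

```python
import re
import string

# ASCII letters and digits are the only bytes left untouched (strict reading).
ALNUM_BYTES = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def form_escape(text: str) -> str:
    """Escape one control name or value (hypothetical helper)."""
    # Line breaks are represented as CR LF pairs before escaping.
    text = re.sub(r"\r\n|\r|\n", "\r\n", text)
    out = []
    # Assumption: non-ASCII characters are encoded as UTF-8 bytes.
    for byte in text.encode("utf-8"):
        if byte in ALNUM_BYTES:
            out.append(chr(byte))
        elif byte == 0x20:
            out.append("+")  # space becomes '+'
        else:
            out.append("%{:02X}".format(byte))  # '%HH', two hex digits
    return "".join(out)


def form_encode(pairs) -> str:
    """Join escaped name/value pairs with '=' and '&', keeping their order."""
    return "&".join(form_escape(n) + "=" + form_escape(v) for n, v in pairs)


print(form_encode([("name", "John Smith"), ("note", "a&b=c\nd")]))
# name=John+Smith&note=a%26b%3Dc%0D%0Ad
```

Note that under this strict reading even `-` and `.` are escaped (`form_escape("a-b.c")` gives `a%2Db%2Ec`); real implementations usually leave a few such characters alone.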

Comparison

Unreserved character comparison.

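Character                  RFC 3986   RFC 2396   RFC 1738   application/x-www-form-urlencoded
A-Z                        yes        yes        yes        yes
a-z                        yes        yes        yes        yes
0-9                        yes        yes        yes        yes
exclamation point (!)      no         yes        yes        ?
dollar ($)                 no         no         yes        ?
single quote (')           no         yes        yes        ?
opening parenthesis (()    no         yes        yes        ?
closing parenthesis ())    no         yes        yes        ?
asterisk (*)               no         yes        yes        ?
plus (+)                   no         no         yes        ?
comma (,)                  no         no         yes        ?
hyphen (-)                 yes        yes        yes        ?
period (.)                 yes        yes        yes        ?
underscore (_)             yes        yes        yes        ?
tilde (~)                  yes        yes        no         ?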
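To see what the differing unreserved sets mean in practice, here is a small sketch that percent-encodes the same string under each RFC's unreserved set. The sets are taken from the grammars in the three RFCs. The names `UNRESERVED` and `percent_encode` are made up for illustration, and encoding the input as UTF-8 is an assumption. Reserved characters are not treated specially here: everything outside the unreserved set is escaped.

```python
import string

ALNUM = string.ascii_letters + string.digits

# Unreserved characters per RFC (alphanumerics plus the listed punctuation).
UNRESERVED = {
    "RFC 3986": frozenset(ALNUM + "-._~"),
    "RFC 2396": frozenset(ALNUM + "-_.!~*'()"),
    "RFC 1738": frozenset(ALNUM + "$-_.+!*'(),"),
}


def percent_encode(text: str, unreserved) -> str:
    """Escape every byte whose character is not in the given unreserved set."""
    out = []
    # Assumption: the text is encoded as UTF-8 before escaping.
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in unreserved:
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


for name, chars in UNRESERVED.items():
    print(name, percent_encode("a~b*c", chars))
# RFC 3986 a~b%2Ac
# RFC 2396 a~b*c
# RFC 1738 a%7Eb*c
```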

Implementations

Java

PHP

Reference

doc/urlencoding.txt · Last modified: 2012/04/09 15:55 by 114.45.60.96