This is my understanding of how escaping is done in URLs (also known as URL encoding or percent-encoding).
Standards
- RFC 3986 (2005; obsoletes RFC 2396)
- RFC 2396 (1998)
- RFC 1738 (1994)
A related and often-mentioned format is application/x-www-form-urlencoded, defined in the HTML 4.01 Specification.
application/x-www-form-urlencoded
Quote from HTML 4.01 Spec:
- Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as “CR LF” pairs (i.e., `%0D%0A').
- The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'.
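As a rough illustration of the quoted rules, Python's standard `urllib.parse.urlencode` produces this format. The field names and values below are made up for the example. Note that Python does not follow HTML 4.01 to the letter: it leaves `-`, `.`, `_` and `~` unescaped, and it does not normalize line breaks, so the CR LF pair is written explicitly here.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical form controls, listed in document order.
fields = [
    ("name", "John Smith"),
    ("comment", "a&b=c\r\nnext line"),
]

# Spaces become '+', '&' and '=' inside values are percent-escaped,
# and the CR LF pair becomes %0D%0A.
encoded = urlencode(fields)
print(encoded)

# Decoding reverses the process and preserves the original order.
decoded = parse_qsl(encoded)
print(decoded == fields)
```

This prints `name=John+Smith&comment=a%26b%3Dc%0D%0Anext+line` followed by `True`.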
Comparison
Comparison of the characters each standard treats as unreserved, i.e. allowed to appear unescaped (∨ = unreserved).
Character | RFC 3986 | RFC 2396 | RFC 1738 | application/x-www-form-urlencoded |
---|---|---|---|---|
A-Z | ∨ | ∨ | ∨ | ∨ |
a-z | ∨ | ∨ | ∨ | ∨ |
0-9 | ∨ | ∨ | ∨ | ∨ |
exclamation point (!) |  | ∨ | ∨ | ? |
dollar ($) |  |  | ∨ | ? |
single quote (') |  | ∨ | ∨ | ? |
opening parenthesis (() |  | ∨ | ∨ | ? |
closing parenthesis ()) |  | ∨ | ∨ | ? |
asterisk (*) |  | ∨ | ∨ | ? |
plus (+) |  |  | ∨ | ? |
comma (,) |  |  | ∨ | ? |
hyphen (-) | ∨ | ∨ | ∨ | ? |
period (.) | ∨ | ∨ | ∨ | ? |
underscore (_) | ∨ | ∨ | ∨ | ? |
tilde (~) | ∨ | ∨ |  | ? |
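
To make the differences concrete, here is a minimal sketch of a percent-encoder parameterized by the unreserved set. The sets are transcribed from the grammar in each RFC; the function name `percent_encode` and the sample string are only for illustration, and encoding non-ASCII text as UTF-8 first is an assumption (the older RFCs do not prescribe a character encoding).

```python
import string

ALNUM = string.ascii_letters + string.digits

# Unreserved characters per RFC, transcribed from each grammar.
UNRESERVED = {
    "RFC 3986": ALNUM + "-._~",
    "RFC 2396": ALNUM + "-_.!~*'()",
    "RFC 1738": ALNUM + "$-_.+" + "!*'(),",
}

def percent_encode(text: str, unreserved: str) -> str:
    """Escape every byte that is not an ASCII character in `unreserved`."""
    out = []
    # Assumption: text is converted to bytes with UTF-8 before escaping.
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and ch in unreserved:
            out.append(ch)
        else:
            out.append(f"%{byte:02X}")
    return "".join(out)

sample = "it's (a) test~!"
for name, chars in UNRESERVED.items():
    print(name, percent_encode(sample, chars))
```

For the sample string this prints:

    RFC 3986 it%27s%20%28a%29%20test~%21
    RFC 2396 it's%20(a)%20test~!
    RFC 1738 it's%20(a)%20test%7E!

The same input therefore has three different "fully escaped" forms depending on which standard an encoder follows, which is why encoders written against different RFCs can disagree on characters such as `'`, `!` and `~`.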