This is my understanding of how escaping is done in URLs (also known as URL encoding).
Standards
- RFC 3986
- RFC 2396
- RFC 1738

A related and often-mentioned format is application/x-www-form-urlencoded, from the HTML 4.01 Specification.
application/x-www-form-urlencoded
Quote from HTML 4.01 Spec:
- Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as “CR LF” pairs (i.e., `%0D%0A').
- The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'.
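As a rough illustration, the two quoted rules can be followed literally in a few lines of Python. This is only a sketch of my reading of the quote, not what browsers or libraries actually do: the function name `form_urlencode_html401` is made up for this example, and it assumes ASCII-only input because the quote only talks about ASCII codes.

```python
import re


def form_urlencode_html401(pairs):
    """Encode (name, value) pairs by following the quoted HTML 4.01 rules literally.

    Hypothetical helper for illustration; assumes ASCII-only input.
    """

    def escape(text):
        # Line breaks are represented as CR LF pairs, so normalize them first.
        text = re.sub(r"\r\n|\r|\n", "\r\n", text)
        out = []
        for byte in text.encode("ascii"):  # raises on non-ASCII input
            ch = chr(byte)
            if ch == " ":
                out.append("+")  # space characters are replaced by '+'
            elif ch.isalnum():
                out.append(ch)  # alphanumeric characters are kept as-is
            else:
                out.append("%{:02X}".format(byte))  # everything else becomes %HH
        return "".join(out)

    # Pairs keep their document order; '=' joins name and value, '&' joins pairs.
    return "&".join(escape(name) + "=" + escape(value) for name, value in pairs)


print(form_urlencode_html401([("name", "a b"), ("note", "x&y=z\n")]))
# name=a+b&note=x%26y%3Dz%0D%0A
```

Under this literal reading even `-` and `~` are escaped (`%2D`, `%7E`). Real implementations tend to be more lenient; Python's `urllib.parse.urlencode`, for example, leaves `-`, `.` and `_` unescaped.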
Comparison
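Comparison of which characters each standard treats as unreserved, that is, allowed to appear without escaping. ∨ marks a character as unreserved; ? marks cases where the status is unclear.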
| Character | RFC 3986 | RFC 2396 | RFC 1738 | application/x-www-form-urlencoded |
|---|---|---|---|---|
| A-Z | ∨ | ∨ | ∨ | ∨ |
| a-z | ∨ | ∨ | ∨ | ∨ |
| 0-9 | ∨ | ∨ | ∨ | ∨ |
| exclamation point (!) | | ∨ | ∨ | ? |
| dollar ($) | | | ∨ | ? |
| single quote (') | | ∨ | ∨ | ? |
| opening parenthesis (() | | ∨ | ∨ | ? |
| closing parenthesis ()) | | ∨ | ∨ | ? |
| asterisk (*) | | ∨ | ∨ | ? |
| plus (+) | | | ∨ | ? |
| comma (,) | | | ∨ | ? |
| hyphen (-) | ∨ | ∨ | ∨ | ? |
| period (.) | ∨ | ∨ | ∨ | ? |
| underscore (_) | ∨ | ∨ | ∨ | ? |
| tilde (~) | ∨ | ∨ | | ? |
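To make the differences concrete, here is a small Python sketch that percent-encodes a string using the unreserved set of a chosen RFC. The sets are taken from the RFCs' own definitions of "unreserved". The names `UNRESERVED` and `percent_encode` are made up for this example, and encoding non-ASCII characters as UTF-8 is my assumption (the older RFCs do not specify a character encoding).

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# Unreserved characters per RFC (dictionary name is hypothetical).
UNRESERVED = {
    "RFC 3986": ALNUM | set("-._~"),
    "RFC 2396": ALNUM | set("-_.!~*'()"),
    "RFC 1738": ALNUM | set("$-_.+!*'(),"),
}


def percent_encode(text, standard):
    """Escape every character that is not unreserved in the given standard."""
    keep = UNRESERVED[standard]
    out = []
    for ch in text:
        if ch in keep:
            out.append(ch)
        else:
            # Assumption: characters are converted to UTF-8 bytes before escaping.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


for name in UNRESERVED:
    print(name, percent_encode("a!b~c$", name))
# RFC 3986 a%21b~c%24
# RFC 2396 a!b~c%24
# RFC 1738 a!b%7Ec$
```

Note that this escapes reserved characters such as `/` and `?` too, so it is only suitable for a single URL component, not a whole URL. On Python 3.7 and later, `urllib.parse.quote(text, safe="")` should give the same result as the RFC 3986 case.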