URL Encoder / Decoder

Encode and decode URLs and URL components.

URL Encoding:

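  • encodeURIComponent: Encodes every character except A–Z a–z 0–9 - _ . ! ~ * ' ( ), so delimiters such as : / ? # @ & = + $ are all percent-encoded
  • encodeURI: Preserves URL structure characters (: / ? # @ & = + $ , ;) and encodes everything else that is not URL-safe, such as spaces, double quotes, and non-ASCII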

RFC 3986: Reserved and Unreserved Characters

RFC 3986 divides URL characters into two categories. Unreserved characters, namely A–Z, a–z, 0–9, hyphen, period, underscore, and tilde, may appear in a URL without encoding. Reserved characters, such as : / ? # [ ] @ ! $ & ' ( ) * + , ; =, have defined structural roles (separating scheme, host, path, query, fragment) and must be percent-encoded if they appear as literal data values rather than delimiters. Any other character, including spaces and non-ASCII Unicode, must also be percent-encoded.
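
The two categories can be expressed as a small lookup. This is a minimal sketch, not a full URL parser: classifyChar is a hypothetical helper name, and it assumes it is given exactly one character.

```javascript
// Character classes from RFC 3986 (unreserved and reserved sets).
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;
const RESERVED = ":/?#[]@!$&'()*+,;=";

// classifyChar is a hypothetical helper used only for illustration.
// Assumption: ch is a single character.
function classifyChar(ch) {
  if (UNRESERVED.test(ch)) return "unreserved"; // safe without encoding
  if (RESERVED.includes(ch)) return "reserved"; // encode when used as data
  return "other"; // must always be percent-encoded
}

console.log(classifyChar("~")); // unreserved
console.log(classifyChar("&")); // reserved
console.log(classifyChar(" ")); // other
```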

Percent-Encoding Mechanism

Percent-encoding replaces a byte with a percent sign followed by the two-digit uppercase hexadecimal value of that byte. A space (ASCII 32, hex 0x20) becomes %20. A forward slash (ASCII 47, hex 0x2F) becomes %2F. Multi-byte UTF-8 characters encode each byte separately: the euro sign € is UTF-8 bytes E2 82 AC, so it encodes as %E2%82%AC.

Common encodings: space = %20, @ = %40, / = %2F, ? = %3F, = = %3D, & = %26, + = %2B, # = %23
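
The byte-by-byte mechanism can be sketched as follows. percentEncodeAll is a hypothetical helper written for this illustration (it is not a built-in); it assumes the input should be treated as a single data value, so every byte outside the unreserved set is encoded.

```javascript
// percentEncodeAll is a hypothetical helper: it percent-encodes every UTF-8
// byte that is not an RFC 3986 unreserved character.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the input
  let out = "";
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    if (/[A-Za-z0-9\-._~]/.test(ch)) {
      out += ch; // unreserved: keep as-is
    } else {
      // "%" followed by the two-digit uppercase hex value of the byte
      out += "%" + b.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncodeAll(" "));  // %20
console.log(percentEncodeAll("/"));  // %2F
console.log(percentEncodeAll("€"));  // %E2%82%AC
```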

encodeURI vs encodeURIComponent

JavaScript provides two encoding functions with different scopes. encodeURI() encodes a complete URL and deliberately leaves reserved structural characters (: / ? # @ & = + $ , ;) unchanged because they are needed to preserve URL structure. Use it when you have a full URL and want to encode only the unsafe characters. encodeURIComponent() encodes everything except the unreserved characters and a small legacy set (! ' ( ) *), so all of the structural delimiters above are percent-encoded. Use it for individual query parameter keys and values, where a literal & or = in the data would be misinterpreted as a separator.
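
A quick way to see the difference in scope is to pass the same delimiter string through both functions. The variable names below are only for illustration.

```javascript
// Structural delimiters that encodeURI leaves alone.
const delimiters = ":/?#@&=+$,;";

const viaEncodeURI = encodeURI(delimiters);
const viaComponent = encodeURIComponent(delimiters);

console.log(viaEncodeURI); // :/?#@&=+$,;
console.log(viaComponent); // %3A%2F%3F%23%40%26%3D%2B%24%2C%3B

// Both functions leave these characters unescaped.
const leftAlone = "-_.~!'()*";
console.log(encodeURIComponent(leftAlone) === leftAlone); // true
```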

The HTML form encoding type application/x-www-form-urlencoded uses a variation where space is encoded as + rather than %20. This differs from RFC 3986 percent-encoding and applies to form data (POST bodies and the query strings that form submissions generate), not to URL paths.
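
The two conventions can be compared side by side. This sketch uses the standard URLSearchParams class, which serializes with the form-urlencoded rules; the field names and values are made up for the example.

```javascript
// Hypothetical form fields used only for illustration.
const fields = { name: "Alice Smith", note: "1+1" };

// Form-urlencoded serialization: space becomes "+", a literal "+" becomes %2B.
const formEncoded = new URLSearchParams(fields).toString();

// RFC 3986 style percent-encoding of each key and value.
const percentEncoded = Object.entries(fields)
  .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
  .join("&");

console.log(formEncoded);    // name=Alice+Smith&note=1%2B1
console.log(percentEncoded); // name=Alice%20Smith&note=1%2B1
```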

Internationalised Domain Names (IDN)

Domain names are restricted to ASCII characters in the DNS system. Internationalised domain names containing non-ASCII characters (such as Arabic, Chinese, or accented Latin) are converted to an ASCII-compatible encoding called Punycode before DNS lookup. For example, the domain münchen.de becomes xn--mnchen-3ya.de in Punycode. Browsers display the Unicode form to users in the address bar while using the Punycode form for actual network requests, balancing readability with protocol compatibility.
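
In JavaScript, the standard URL class performs this conversion when it parses a hostname. The sketch below assumes a runtime with a WHATWG-compliant URL implementation (current browsers and Node.js).

```javascript
// Parsing a URL converts a non-ASCII hostname to its ASCII (Punycode) form.
const idnUrl = new URL("https://münchen.de/");

console.log(idnUrl.hostname); // xn--mnchen-3ya.de
```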

Worked Examples

Example 1: Query string with special characters

A search query for C++ & Rust tutorials must travel as a URL parameter. Using encodeURIComponent, each unsafe character is percent-encoded: space becomes %20, + becomes %2B, & becomes %26. The final URL: https://example.com/search?q=C%2B%2B%20%26%20Rust%20tutorials. Without encoding, the & would be misread as a parameter separator, breaking the query entirely.
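
The same example as a short sketch. buildSearchUrl is a hypothetical helper name, and the parameter name q is taken from the example URL above.

```javascript
// buildSearchUrl is a hypothetical helper for this example.
function buildSearchUrl(base, query) {
  // Encode only the value, never the whole URL.
  return base + "?q=" + encodeURIComponent(query);
}

const searchUrl = buildSearchUrl("https://example.com/search", "C++ & Rust tutorials");
console.log(searchUrl);
// https://example.com/search?q=C%2B%2B%20%26%20Rust%20tutorials

// Parsing the URL back recovers the original query text.
console.log(new URL(searchUrl).searchParams.get("q")); // C++ & Rust tutorials
```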

Example 2: Full URL vs query parameter

Encoding the URL https://example.com/path?q=hello world with encodeURI yields https://example.com/path?q=hello%20world, with the scheme, host, path, and query delimiters preserved. Using encodeURIComponent on the same string yields https%3A%2F%2Fexample.com%2Fpath%3Fq%3Dhello%20world, which destroys the URL structure. Pick the right function based on whether you have a full URL or a single value.

Example 3: Unicode character encoding

The Japanese word "東京" (Tokyo) is UTF-8 bytes E6 9D B1 E4 BA AC (six bytes total). Percent-encoded: %E6%9D%B1%E4%BA%AC. In a URL path segment it might appear as https://example.com/city/%E6%9D%B1%E4%BA%AC. Modern browsers display "東京" in the address bar but send the percent-encoded form on the wire.
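
The byte sequence and its percent-encoded form can be checked directly. The variable names are illustrative only.

```javascript
const word = "東京";

// UTF-8 bytes of the word, shown as two-digit uppercase hex.
const hexBytes = Array.from(new TextEncoder().encode(word), (b) =>
  b.toString(16).toUpperCase().padStart(2, "0")
);
console.log(hexBytes.join(" ")); // E6 9D B1 E4 BA AC

const encodedWord = encodeURIComponent(word);
console.log(encodedWord); // %E6%9D%B1%E4%BA%AC

// Decoding reverses the process.
console.log(decodeURIComponent(encodedWord) === word); // true
```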

Example 4: Form submission differs from URL encoding

An HTML form posts name=Alice&age=30. Submitting "Alice Smith" as the name with the default application/x-www-form-urlencoded type produces name=Alice+Smith&age=30: the space becomes "+" rather than "%20". In a URL path, "+" means a literal plus sign. This distinction causes subtle bugs when servers misinterpret form bodies as URL path components.
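
On the decoding side, the difference matters because decodeURIComponent does not treat "+" as a space. A minimal sketch, assuming the input is a single form-encoded value; decodeFormValue is a hypothetical helper name.

```javascript
// decodeFormValue is a hypothetical helper for form-urlencoded values.
function decodeFormValue(raw) {
  // In form data "+" means space; a literal plus arrives as %2B,
  // so the replacement must happen before percent-decoding.
  return decodeURIComponent(raw.replace(/\+/g, " "));
}

console.log(decodeFormValue("Alice+Smith"));    // Alice Smith
console.log(decodeFormValue("1%2B1"));          // 1+1
console.log(decodeURIComponent("Alice+Smith")); // Alice+Smith
```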

Common Pitfalls

  • Using encodeURI on query parameter values. It leaves &, =, and ? unencoded, which breaks the query string if those characters appear in user data. Always use encodeURIComponent for individual parameter keys and values.
  • Double-encoding. Encoding %20 again yields %2520, which decodes to %20 instead of a space. Track whether a string is already encoded before passing it to another layer.
  • Decoding untrusted input without validation. Attacker-supplied URL parameters can carry null bytes, directory-traversal sequences, or control characters after decoding. Validate the decoded value against your expected format.
  • Confusing "+" with space. In form bodies (application/x-www-form-urlencoded), "+" means space; in URL paths, it is a literal "+". Misreading form data as URL data produces strings with unexpected plus signs.
  • Encoding the whole URL when only one value needs it. Full-URL encoding breaks structural characters. Split the URL, encode each component separately, reassemble.
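
The validation pitfall above can be sketched as a decode-then-check step. decodeAndValidate is a hypothetical helper, and the rejection rules (control characters and ".." path segments) are assumptions chosen for the example; a real application should validate against its own expected format.

```javascript
// decodeAndValidate is a hypothetical helper: it returns the decoded value,
// or null when the input is malformed or looks dangerous.
function decodeAndValidate(raw) {
  let decoded;
  try {
    decoded = decodeURIComponent(raw);
  } catch (err) {
    return null; // malformed percent sequence (decodeURIComponent throws URIError)
  }
  // Reject control characters, including the null byte.
  if (/[\u0000-\u001F\u007F]/.test(decoded)) return null;
  // Reject directory-traversal segments such as "../".
  if (decoded.split(/[\\\/]/).includes("..")) return null;
  return decoded;
}

console.log(decodeAndValidate("hello%20world"));          // hello world
console.log(decodeAndValidate("file%00.txt"));            // null
console.log(decodeAndValidate("..%2F..%2Fetc%2Fpasswd")); // null
```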

Frequently Asked Questions

Why are spaces sometimes "%20" and sometimes "+"?

%20 is standard RFC 3986 percent-encoding, used in URL paths and query strings emitted by encodeURIComponent. The + substitution is specific to the application/x-www-form-urlencoded MIME type used by HTML forms. Modern APIs should prefer %20 everywhere; the + form is a historical quirk.

Do I need to encode hash fragments?

Yes, if the fragment contains reserved or non-ASCII characters. The fragment (#...) is not sent to the server but is parsed by the browser; unencoded special characters can still confuse client-side routers. Encode fragment values with encodeURIComponent the same way as query values.

What is the maximum URL length?

There is no formal spec limit, but practical limits from browsers, proxies, and servers cluster around 2,000–8,000 characters. For anything approaching that size, switch to POST with the data in the request body. Each percent-encoded byte takes three characters, so special-character-heavy data typically grows 2–3× (and non-ASCII text more, since one character can span several bytes), which pushes against this limit faster than expected.
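
The growth is easy to measure before deciding between GET and POST. encodedGrowth is a hypothetical helper; it compares character counts, not bytes on the wire.

```javascript
// encodedGrowth is a hypothetical helper: ratio of encoded length to raw length.
function encodedGrowth(text) {
  if (text.length === 0) return 1; // nothing to encode
  return encodeURIComponent(text).length / text.length;
}

console.log(encodedGrowth("abc"));  // 1 (unreserved characters are untouched)
console.log(encodedGrowth("&&&&")); // 3 (each character becomes %26)
console.log(encodedGrowth("東京")); // 9 (three UTF-8 bytes per character)
```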

Can I percent-encode safe characters too?

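Yes. Decoders must accept percent-encoded versions of unreserved characters (e.g., %41 = "A"). Some URLs intentionally over-encode to avoid parsing ambiguities. Over-encoding unreserved characters is safe; under-encoding corrupts the URL. The exception is a reserved character that is acting as a delimiter: percent-encoding it turns it into data and changes how the URL is parsed.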

Why do I see "%25" in some URLs?

%25 is the encoded form of "%" itself. It appears whenever the original string contained a literal percent sign, or when a URL has been double-encoded. Decoding %2520 twice yields a literal space; decoding once yields %20, which makes it a useful diagnostic for spotting double-encoding.
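
That diagnostic can be turned into a small heuristic that decodes repeatedly until the string stops changing. countEncodingLayers is a hypothetical helper, and the result is only a hint: data that legitimately contains the text "%20" looks double-encoded once it has been encoded.

```javascript
// countEncodingLayers is a hypothetical helper: it reports how many rounds of
// decoding were needed before the value stopped changing.
function countEncodingLayers(value, maxLayers = 5) {
  let layers = 0;
  let current = value;
  while (layers < maxLayers) {
    let next;
    try {
      next = decodeURIComponent(current);
    } catch (err) {
      break; // not valid percent-encoding any more; stop here
    }
    if (next === current) break; // nothing left to decode
    current = next;
    layers += 1;
  }
  return { layers, decoded: current };
}

console.log(countEncodingLayers("hello%2520world").layers); // 2
console.log(countEncodingLayers("hello%20world").layers);   // 1
console.log(countEncodingLayers("hello").layers);           // 0
```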

Disclaimer

This calculator is provided for educational and informational purposes only. While we strive for accuracy, users should verify all calculations independently. We are not responsible for any errors, omissions, or damages arising from the use of this calculator.

