DNS, ASCII & “Fancy Characters” Attacks
Why browsers convert names to ASCII, how attackers abuse that, and tiny Deno scripts to experiment — with ASCII diagrams you can read in any terminal.
Sweet summary up front: DNS only speaks plain ASCII. To let people use accents, Chinese, emoji, etc., we convert those names to Punycode (an ASCII form starting with xn--). That lets the old DNS work, but attackers can register look-alike names (homographs) and trick users. Browsers try to help, but a little code and awareness go a long way.
The simple pipeline (ASCII diagram)
Type this into your brain:
You type: fâcebook.com
        |
        v
Browser: normalize & encode (Punycode)
        |
        v
Punycode: xn--fcebook-pwa.com
        |
        v
DNS lookup: ask DNS about xn--fcebook-pwa.com
        |
        v
DNS returns IP
        |
        v
Browser: decides display (Unicode or xn-- form) + checks TLS + Safe Browsing
        |
        v
Connection established (or blocked if malicious)
This ASCII pipeline shows why conversion is necessary: DNS can't handle â or 你.
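To see the "normalize & encode" step without any dependency, here is a minimal sketch that leans on the standard URL parser built into Deno, Node, and browsers. The helper name toAsciiHost is made up for this post, and the https:// prefix is only there so the parser accepts a bare hostname.

```typescript
// Hypothetical helper (not part of any library): returns the ASCII form
// of a hostname by letting the built-in URL parser do the conversion.
function toAsciiHost(host: string): string {
  // The scheme is only a vehicle; we read back the parsed hostname.
  return new URL(`https://${host}`).hostname;
}

console.log(toAsciiHost("mañana.org"));     // xn--maana-pta.org
console.log(toAsciiHost("bücher.example")); // xn--bcher-kva.example
console.log(toAsciiHost("Facebook.COM"));   // facebook.com (ASCII names are only lowercased)
```

Whatever this prints is the name that actually goes into the DNS query.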
Why ASCII only? Short and sweet
- DNS was designed before Unicode existed; it accepts only ASCII letters (a–z), digits (0–9), and hyphen (-) in labels.
- To support Unicode labels, the internet uses IDNA (Internationalized Domain Names in Applications), which uses Punycode (RFC 3492).
- Punycode maps any Unicode label into an ASCII label beginning with xn-- so DNS can route it.
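If you are curious what the mapping looks like inside, here is a simplified sketch of the encoding half of RFC 3492. It is for reading, not for production: it skips overflow checks and the normalization and validation that full IDNA performs first, and the function names (encodeLabel, toAsciiLabel) are invented for this post.

```typescript
// Parameters defined by RFC 3492 for Punycode.
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Digit values 0..25 map to a..z, 26..35 map to 0..9.
function digitToChar(d: number): string {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta: number, numPoints: number, firstTime: boolean): number {
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - TMIN) * TMAX) / 2) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Encodes one label; the result does NOT include the xn-- prefix.
function encodeLabel(label: string): string {
  const input = Array.from(label, (c) => c.codePointAt(0)!);
  const basic = input.filter((c) => c < 128);
  let output = String.fromCodePoint(...basic);
  const b = basic.length;
  let h = b; // number of code points handled so far
  if (b > 0) output += "-";
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (h < input.length) {
    // Next smallest code point that has not been encoded yet.
    const m = Math.min(...input.filter((c) => c >= n));
    delta += (m - n) * (h + 1);
    n = m;
    for (const c of input) {
      if (c < n) delta++;
      if (c === n) {
        // Emit delta as a variable-length base-36 integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, h + 1, h === b);
        delta = 0;
        h++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

// Adds the xn-- prefix only when the label has non-ASCII characters.
function toAsciiLabel(label: string): string {
  return /^[\x00-\x7f]*$/.test(label) ? label : "xn--" + encodeLabel(label);
}

console.log(toAsciiLabel("bücher")); // xn--bcher-kva
console.log(toAsciiLabel("mañana")); // xn--maana-pta
```

The ASCII characters are copied first, then each non-ASCII character is encoded as a "how far to skip, where to insert" number after the last hyphen.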
What’s a homograph (confusable) attack?
Attackers register domains that look like famous ones:
- Replace Latin a with Cyrillic а (visually identical in many fonts).
- Use accented letters: fâcebook.com vs facebook.com.
- Use non-Latin glyphs, emoji, or mixtures of scripts.
Goal: a user glances at the URL, thinks it’s the real site (e.g., facebook.com), enters credentials, and gets phished.
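A quick way to convince yourself that two look-alike names really are different strings is to print their code points. This is just an illustration: the helper name codePoints is made up, and the fake domain is written with a \u escape so you can see exactly which character is swapped in.

```typescript
// Hypothetical helper: lists each character of a string as U+XXXX.
function codePoints(s: string): string[] {
  return Array.from(
    s,
    (ch) => "U+" + ch.codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"),
  );
}

const real = "facebook.com";
const fake = "f\u0430cebook.com"; // second letter is Cyrillic а (U+0430)

console.log(real === fake);                // false
console.log(codePoints(real).slice(0, 3)); // [ "U+0066", "U+0061", "U+0063" ]
console.log(codePoints(fake).slice(0, 3)); // [ "U+0066", "U+0430", "U+0063" ]
```

Your eyes see the same word, but the computer (and DNS) sees two unrelated names.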
What browsers do to protect you
- Convert the typed domain to Punycode to query DNS.
- If the domain looks suspicious (confusable or mixing scripts), many browsers show the Punycode (xn--...) instead of the pretty Unicode to make the trick visible.
- Browsers also check Safe Browsing / phishing blacklists and TLS certificates. Those help, but they aren’t perfect defenses against new or unlisted threats.
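Here is a tiny detector in the spirit of the "mixing scripts" check. It is a rough sketch, not what any browser actually ships: real rules are more elaborate, the script list below is an arbitrary short selection, and the names SCRIPTS, scriptsInLabel, and looksMixed are made up for this post.

```typescript
// A few scripts to look for; this short list is an assumption for the demo.
const SCRIPTS: Record<string, RegExp> = {
  Latin: /\p{Script=Latin}/u,
  Cyrillic: /\p{Script=Cyrillic}/u,
  Greek: /\p{Script=Greek}/u,
  Han: /\p{Script=Han}/u,
};

// Returns the set of known scripts used by the characters in one label.
// Digits and hyphens belong to none of the scripts above, so they are ignored.
function scriptsInLabel(label: string): Set<string> {
  const found = new Set<string>();
  for (const ch of label) {
    for (const [name, re] of Object.entries(SCRIPTS)) {
      if (re.test(ch)) found.add(name);
    }
  }
  return found;
}

// Flags a domain when any single label mixes two or more scripts.
function looksMixed(domain: string): boolean {
  return domain.split(".").some((label) => scriptsInLabel(label).size > 1);
}

console.log(looksMixed("facebook.com"));       // false
console.log(looksMixed("f\u0430cebook.com"));  // true  (Latin + Cyrillic а)
console.log(looksMixed("例子.测试"));            // false (Han only)
console.log(looksMixed("fâcebook.com"));       // false (â is still Latin)
```

Notice the last line: accented look-alikes and names written entirely in one other script slip through, so a check like this only catches the most obvious tricks.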
Deno code example — encode/decode Punycode
Save as punycode_demo.ts and run:
deno run punycode_demo.ts
// punycode_demo.ts
import punycode from "https://esm.sh/punycode@2.3.1";

// Domains to round-trip: Unicode, plain ASCII, and one already in Punycode.
const examples = [
  "fâcebook.com",
  "facebook.com",
  "mañana.org",
  "例子.测试", // Chinese example
  "xn--fcebook-pwa.com",
];

for (const d of examples) {
  const ascii = punycode.toASCII(d); // Unicode -> Punycode (ASCII)
  const unicode = punycode.toUnicode(ascii); // Punycode -> Unicode
  console.log("Input:   ", d);
  console.log("Punycode:", ascii);
  console.log("Unicode: ", unicode);
  console.log("");
}
What it does:
- toASCII converts Unicode → Punycode (ASCII).
- toUnicode converts back.
- Also try adding other domains to the examples array and rerunning deno run punycode_demo.ts.
Quick recap
Punycode bridges Unicode-friendly input and the ASCII-only DNS but creates an attack surface for visually confusable domains — learn to convert to Punycode and use simple detectors to spot obvious tricks.
I hope this post was helpful to you.