DNS, ASCII & “Fancy Characters” Attacks

Why browsers convert names to ASCII, how attackers abuse that, and tiny Deno scripts to experiment — with ASCII diagrams you can read in any terminal.

Sweet summary up front: DNS only speaks plain ASCII. To let people use accents, Chinese, emoji, etc., we convert those names to Punycode (ASCII form starting with xn--). That lets the old DNS work — but attackers can register look-alike names (homographs) and trick users. Browsers try to help, but a little code and awareness go a long way.


The simple pipeline (ASCII diagram)

Picture the flow from keyboard to connection:

You type:      fâcebook.com
                |
Browser: normalize & encode (Punycode)
                |
             Punycode: xn--fcebook-pwa.com
                |
DNS lookup: ask DNS about xn--fcebook-pwa.com
                |
DNS returns IP
                |
Browser: decides display (Unicode or xn-- form) + checks TLS + Safe Browsing
                |
Connection established (or blocked if malicious)

This pipeline shows why conversion is necessary: DNS labels can't carry â or any other non-ASCII character.
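The encode step is easy to try without any library, because the standard URL parser (a global in Deno, Node, and browsers) applies the IDNA conversion to hostnames. A minimal sketch; the helper name toDnsName is made up for this example and is not part of any API:

```typescript
// Hypothetical helper: returns the ASCII hostname a client would look up in DNS.
function toDnsName(typed: string): string {
  // The URL parser lowercases, normalizes, and Punycode-encodes the hostname.
  return new URL(`https://${typed}`).hostname;
}

console.log(toDnsName("fâcebook.com")); // xn--fcebook-pwa.com
console.log(toDnsName("facebook.com")); // facebook.com (already ASCII, unchanged)
```

Plain ASCII names pass through untouched; only labels containing non-ASCII characters get the xn-- treatment.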


Why ASCII only? Short and sweet

  • DNS was designed before Unicode existed. By long-standing convention, hostname labels contain only ASCII letters (a–z, case-insensitive), digits (0–9), and the hyphen (-).
  • To support Unicode labels, the internet uses IDNA (Internationalized Domain Names in Applications), which relies on Punycode (RFC 3492) as its encoding.
  • Punycode turns a Unicode label into an ASCII string; IDNA prefixes that string with xn-- so software can recognize the label as encoded and DNS can resolve it like any other name.
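To make the mapping less magical, here is a rough sketch of the Punycode encoder described in RFC 3492. The constants come from the RFC; the function names are my own, and the sketch skips overflow checks and the rest of IDNA processing (normalization, validity rules), so treat it as a learning aid rather than a replacement for a real library.

```typescript
// Parameters from RFC 3492, section 5.
const BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Digit values 0..25 map to a-z, 26..35 map to 0-9.
function digitToChar(d: number): string {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation (RFC 3492, section 6.1).
function adapt(delta: number, numPoints: number, first: boolean): number {
  delta = Math.floor(delta / (first ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - TMIN) * TMAX) / 2)) {
    delta = Math.floor(delta / (BASE - TMIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - TMIN + 1) * delta) / (delta + SKEW));
}

// Encodes ONE label that contains at least one non-ASCII character.
// The caller adds the "xn--" prefix; that part belongs to IDNA, not Punycode.
function punycodeEncode(label: string): string {
  const input = Array.from(label, (ch) => ch.codePointAt(0)!);
  const basic = input.filter((c) => c < 128);
  let out = String.fromCodePoint(...basic);
  const b = basic.length;
  let h = b; // number of code points handled so far
  if (b > 0) out += "-"; // delimiter between the ASCII part and the encoded part
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (h < input.length) {
    // Next smallest code point that has not been encoded yet.
    const m = Math.min(...input.filter((c) => c >= n));
    delta += (m - n) * (h + 1);
    n = m;
    for (const c of input) {
      if (c < n) delta++;
      if (c === n) {
        // Write delta as a variable-length base-36 integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
          if (q < t) break;
          out += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        out += digitToChar(q);
        bias = adapt(delta, h + 1, h === b);
        delta = 0;
        h++;
      }
    }
    delta++;
    n++;
  }
  return out;
}

console.log("xn--" + punycodeEncode("bücher")); // xn--bcher-kva
console.log("xn--" + punycodeEncode("mañana")); // xn--maana-pta
```

Notice how the ASCII letters are copied to the front and everything after the last hyphen encodes which non-ASCII characters go where.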

What’s a homograph (confusable) attack?

Attackers register domains that look like famous ones:

  • Replace Latin a with Cyrillic а (visually identical in many fonts).
  • Use accented letters: fâcebook.com vs facebook.com.
  • Use non-Latin glyphs, emoji, or mixtures of scripts.

Goal: a user glances at the URL, thinks it’s the real site (e.g., facebook.com), enters credentials, gets phished.
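To see why the eye can't be trusted here, the following sketch compares the real name with a spoof whose second letter is Cyrillic. The codePoints helper is a name invented for this example.

```typescript
const latin: string = "facebook.com";
const spoof: string = "f\u0430cebook.com"; // second letter is Cyrillic а (U+0430)

// List each character's code point so the hidden difference becomes visible.
function codePoints(s: string): string[] {
  return Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(latin === spoof);               // false
console.log(codePoints(latin).slice(0, 3)); // U+0066, U+0061, U+0063
console.log(codePoints(spoof).slice(0, 3)); // U+0066, U+0430, U+0063

// The spoof is a different DNS name: its ASCII form starts with xn--.
console.log(new URL(`https://${spoof}`).hostname.startsWith("xn--")); // true
```

The two strings render almost identically in many fonts, yet they are different domains that anyone can register separately.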


What browsers do to protect you

  • Convert the typed domain to Punycode to query DNS.
  • If the domain looks suspicious (confusable or mixing scripts), many browsers show the Punycode (xn--...) instead of the pretty Unicode to make the trick visible.
  • Browsers also check Safe Browsing / phishing blacklists and TLS certificates. Those help, but they aren’t perfect defenses against new or unlisted threats.
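The mixed-script check can be approximated in a few lines. This is a crude heuristic of my own, not what any browser actually ships: the script list is an arbitrary sample, the function names are invented, and it expects the Unicode form of the name (not the xn-- form). It also misses spoofs written entirely in one non-Latin script, and it would flag legitimate Japanese labels that mix Han with kana.

```typescript
// A small, arbitrary sample of scripts to look for (assumption for this sketch).
const SCRIPTS = ["Latin", "Cyrillic", "Greek", "Han", "Hiragana", "Katakana", "Arabic", "Hebrew"];

// Built with the RegExp constructor; \p{Script=...} needs the "u" flag.
const SCRIPT_TESTS = SCRIPTS.map((name) => ({
  name,
  re: new RegExp(`\\p{Script=${name}}`, "u"),
}));

// Returns the set of known scripts used in one label.
// Digits and hyphens belong to the "Common" script and are ignored.
function scriptsInLabel(label: string): Set<string> {
  const found = new Set<string>();
  for (const ch of Array.from(label)) {
    for (const { name, re } of SCRIPT_TESTS) {
      if (re.test(ch)) found.add(name);
    }
  }
  return found;
}

// Flags a hostname if any single label mixes two or more scripts.
function looksMixedScript(hostname: string): boolean {
  return hostname.split(".").some((label) => scriptsInLabel(label).size > 1);
}

console.log(looksMixedScript("facebook.com"));      // false
console.log(looksMixedScript("f\u0430cebook.com")); // true (Latin + Cyrillic)
console.log(looksMixedScript("例子.测试"));          // false (Han only)
```

Checking per label rather than per hostname avoids flagging ordinary names such as a Chinese label under a .com suffix.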

Deno code example — encode/decode Punycode

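Save as punycode_demo.ts and run:

deno run punycode_demo.ts

The script makes no network requests of its own, so it needs no --allow-net flag; Deno fetches the static import on the first run.

// punycode_demo.ts
// Round-trips a few domains through Punycode using the punycode.js package.
import punycode from "https://esm.sh/punycode@2.3.1";

const examples = [
  "fâcebook.com",
  "facebook.com",        // plain ASCII: passes through unchanged
  "mañana.org",
  "例子.测试",            // Chinese example
  "xn--fcebook-pwa.com", // already encoded: the ASCII form of fâcebook.com
];

for (const d of examples) {
  const ascii = punycode.toASCII(d);         // Unicode labels -> xn-- labels
  const unicode = punycode.toUnicode(ascii); // xn-- labels -> Unicode labels
  console.log("Input:   ", d);
  console.log("Punycode:", ascii);
  console.log("Unicode: ", unicode);
  console.log("");
}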

What it does:

  • toASCII converts Unicode → Punycode (ASCII).
  • toUnicode converts back.
  • To try other domains, add them to the examples array and run deno run punycode_demo.ts again.

Quick recap

Punycode bridges Unicode-friendly input and the ASCII-only DNS but creates an attack surface for visually confusable domains — learn to convert to Punycode and use simple detectors to spot obvious tricks.

I hope this post was helpful to you.

Leave a reaction if you liked this post!