Published: April 26, 2026
Steganography is the practice of hiding information inside something that looks ordinary. Unicode steganography hides executable code or data inside text by using characters that are valid Unicode but render as invisible — zero pixels wide, no glyph, completely undetectable to the human eye.
Unlike traditional steganography (hiding data in images or audio), Unicode steganography embeds payloads directly in source code, configuration files, or text that developers copy and paste every day.
The Unicode standard defines over 149,000 characters across 161 scripts. Among them are dozens of characters specifically designed to be invisible:
These characters exist for legitimate typographic reasons. But attackers exploit the gap between what the Unicode standard allows and what code editors show.
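To make that gap concrete, here is a minimal Python sketch. The five code points are only an illustrative subset chosen for this example, not a complete inventory, and `describe` is a hypothetical helper name. It shows that two strings can look the same on screen while differing in content.

```python
import unicodedata

# A few well-known invisible code points (illustrative subset, not a full list).
INVISIBLE_SAMPLES = [0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF]

def describe(cp: int) -> tuple[str, str, str]:
    """Return (U+XXXX label, official name, general category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))

for cp in INVISIBLE_SAMPLES:
    print(*describe(cp))

clean = "admin"
tainted = "ad\u200bmin"  # zero-width space hidden between "d" and "m"

print(clean == tainted)          # False
print(len(clean), len(tainted))  # 5 6
```

All five samples fall in the Unicode general category `Cf` (format), which is one reason a category check is a common starting point for detection.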
The simplest technique. Each byte of malicious code is converted to binary, then each bit is represented by one of two invisible characters:
- U+200B (zero-width space) = 0
- U+200C (zero-width non-joiner) = 1

Eight invisible characters encode one byte. A 500-byte payload requires 4,000 invisible characters, all hidden between visible lines of code.
Example: The letter "A" (0x41 = 01000001) would be encoded as: U+200B U+200C U+200B U+200B U+200B U+200B U+200B U+200C. Most editors display this sequence as nothing at all by default.
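The bit mapping can be sketched in a few lines of Python. The function names `encode_bits` and `decode_bits` are hypothetical, and the decoder only recovers bytes; it does not execute anything. This is a sketch of the scheme as described here, not code from any specific attack.

```python
ZERO = "\u200b"  # zero-width space      -> bit 0
ONE = "\u200c"   # zero-width non-joiner -> bit 1

def encode_bits(data: bytes) -> str:
    """Turn each byte into 8 invisible characters, most significant bit first."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(ONE if bit == "1" else ZERO for bit in bits)

def decode_bits(hidden: str) -> bytes:
    """Recover bytes from the two carrier characters, ignoring all other text."""
    bits = "".join("1" if ch == ONE else "0" for ch in hidden if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # drop any trailing partial byte
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

hidden = encode_bits(b"A")
print([f"U+{ord(ch):04X}" for ch in hidden])
print(len(encode_bits(bytes(500))))  # 4000
print(decode_bits("x = 1" + hidden))  # b'A'
```

Because the decoder skips every character outside the two-symbol alphabet, the payload can sit in the middle of otherwise ordinary text.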
Used by Glassworm. Each ASCII character is mapped to a Unicode Private Use Area codepoint or Variation Selector Supplement character. The mapping is arbitrary but consistent — a small decoder function reverses it at runtime.
This technique is harder to detect because the invisible characters don't follow a simple binary pattern.
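A mapping of this kind might look like the following Python sketch. The offset scheme (ASCII code added to U+E0100, the start of the Variation Selectors Supplement block) is an assumption made for illustration; it is not claimed to be the mapping Glassworm uses, and `hide_ascii` and `reveal_ascii` are hypothetical names.

```python
# Start of the Variation Selectors Supplement block (U+E0100..U+E01EF).
VS_SUPPLEMENT_BASE = 0xE0100

def hide_ascii(text: str) -> str:
    """Map each ASCII character to one variation selector (assumed offset scheme)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("only ASCII input is supported in this sketch")
        out.append(chr(VS_SUPPLEMENT_BASE + code))
    return "".join(out)

def reveal_ascii(carrier: str) -> str:
    """Reverse the mapping, skipping characters outside the assumed range."""
    return "".join(
        chr(ord(ch) - VS_SUPPLEMENT_BASE)
        for ch in carrier
        if VS_SUPPLEMENT_BASE <= ord(ch) < VS_SUPPLEMENT_BASE + 0x80
    )

carrier = "ok" + hide_ascii("hi")
print(len(carrier))           # 4 code points, only 2 of them visible
print(reveal_ascii(carrier))  # hi
```

Here one hidden character carries a whole ASCII character, so the payload is eight times denser than the two-symbol binary scheme and shows no repeating pair of code points.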
A newer technique that uses half-width and full-width Hangul (Korean) character variants. Each ASCII byte is split into bits represented by specific Hangul characters that some systems render as invisible or near-invisible.
Hidden characters alone are inert. The attack requires a decoder, a small piece of visible JavaScript that:

- extracts the invisible characters from a string
- reverses the encoding to recover the payload
- executes the result with eval() or Function()

The decoder is typically 3–5 lines of code, often disguised as a string utility or configuration parser. It is the only visible part of the attack.
| Location | Why It Works |
|---|---|
| npm package source files | Developers rarely read every line of dependencies |
| VS Code extension code | Extensions run with full system access |
| GitHub repository files | Code review UIs hide invisible characters |
| AI-generated code | LLMs may propagate invisible chars from training data |
| Copy-pasted Stack Overflow answers | Browser copy can include hidden characters from page source |
| Configuration files (JSON, YAML) | Parsers may silently accept invisible characters |
Vibe Check scans for all 14 invisible Unicode character ranges used in known steganographic attacks. It detects individual invisible characters (warning) and consecutive sequences of 3+ invisible characters (critical — almost certainly a payload). Everything runs in your browser.
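The warning and critical rule described above can be approximated with a short Python sketch. The character ranges below are a hypothetical subset picked for this example; they are not Vibe Check's actual list of 14 ranges, and `scan` is an assumed name.

```python
import re

# Hypothetical subset of invisible ranges (not Vibe Check's actual list).
INVISIBLE_CLASS = (
    "\u00ad"                 # soft hyphen
    "\u200b-\u200f"          # zero-width space/joiners, directional marks
    "\u2060-\u2064"          # word joiner, invisible operators
    "\u3164\uffa0"           # Hangul filler, halfwidth Hangul filler
    "\ufe00-\ufe0f"          # variation selectors
    "\ufeff"                 # zero-width no-break space (BOM)
    "\U000e0100-\U000e01ef"  # variation selectors supplement
)
INVISIBLE_RUN = re.compile(f"[{INVISIBLE_CLASS}]+")

def scan(text: str, critical_run: int = 3):
    """Return (line, column, run_length, severity) for each invisible run."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in INVISIBLE_RUN.finditer(line):
            length = len(match.group())
            severity = "critical" if length >= critical_run else "warning"
            findings.append((lineno, match.start() + 1, length, severity))
    return findings

sample = "const a = 1;\nconst b\u200b = 2;\nlet c = 3;\u200b\u200c\u200b\u200c\n"
for finding in scan(sample):
    print(finding)
```

A single variation selector is often legitimate (for example, U+FE0F after an emoji), which is why isolated hits are only warnings in this sketch while longer runs are escalated.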
- Turn on whitespace rendering in your editor (editor.renderWhitespace: all in VS Code)
- Review eval() usage: the decoder must use eval or Function to execute the hidden payload
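A rough way to surface candidate decoders is to search JavaScript source for dynamic execution calls. The Python sketch below is a heuristic under stated assumptions: the regular expression and the name `find_dynamic_exec` are illustrative, and obfuscated access such as building the string "eval" at runtime will not be caught.

```python
import re

# Matches direct calls such as eval(...), Function(...), and new Function(...).
DYNAMIC_EXEC = re.compile(r"\b(?:eval|Function)\s*\(")

def find_dynamic_exec(js_source: str):
    """Return (line_number, stripped_line) for lines with a dynamic execution call."""
    hits = []
    for lineno, line in enumerate(js_source.splitlines(), start=1):
        if DYNAMIC_EXEC.search(line):
            hits.append((lineno, line.strip()))
    return hits

sample_js = """function parseConfig(s) { return JSON.parse(s); }
const run = new Function(decoded);
eval(payload);
const medieval = retrieval(1);
"""

for hit in find_dynamic_exec(sample_js):
    print(hit)
```

The word boundary keeps identifiers like `retrieval(` from matching, and the case-sensitive pattern ignores ordinary `function` declarations. A hit is a prompt for manual review, not proof of an attack.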