Invisible Unicode Characters Used in Attacks

Published: April 26, 2026

This is a complete reference of Unicode characters that render as invisible (zero-width or non-printing) and have been observed in code injection attacks, supply chain malware, and steganographic payloads. These are the exact characters that Vibe Check scans for.

Quick Summary

14 character ranges covering 401 individual codepoints
13 ranges are high-risk: rarely legitimate in source code
1 range is lower-risk: variation selectors are sometimes used legitimately with emoji
A single invisible character is suspicious. Three or more consecutive invisible characters are almost certainly a steganographic payload.

Complete Reference Table

Codepoint(s)	Name	Count	Risk	Attack Use
`U+200B`	Zero-Width Space	1	High	Binary encoding (bit = 0), string splitting, payload separation
`U+200C`	Zero-Width Non-Joiner	1	High	Binary encoding (bit = 1), text fingerprinting
`U+200D`	Zero-Width Joiner	1	High	Binary encoding, delimiter in multi-byte schemes
`U+200E`	Left-to-Right Mark	1	High	Bidi reordering attacks, filename spoofing, code flow manipulation
`U+200F`	Right-to-Left Mark	1	High	Bidi reordering attacks, filename spoofing, making code read backwards
`U+2028`	Line Separator	1	High	JavaScript line terminator injection, breaking string literals
`U+2029`	Paragraph Separator	1	High	JavaScript line terminator injection
`U+202A`–`U+202E`	Bidi Embedding/Override Controls	5	High	Trojan Source attacks: make code appear different than what executes. `U+202E` (RLO) is especially dangerous — reverses text direction
`U+2060`	Word Joiner	1	High	Payload padding, evading word-boundary regex
`U+2061`–`U+2064`	Invisible Math Operators	4	High	Rarely legitimate outside math rendering; used as steganographic bits
`U+FE00`–`U+FE0F`	Variation Selectors	16	Low	Sometimes legitimate (emoji rendering). In source code without emoji, presence is suspicious
`U+FEFF`	Byte Order Mark / Zero-Width No-Break Space	1	High	Legitimate as first character of a file (BOM). Elsewhere: payload marker, string poisoning
`U+E0100`–`U+E01EF`	Variation Selectors Supplement	240	High	Primary encoding range used by Glassworm. 240 codepoints are enough to map the entire alphabet plus symbols
`U+E0001`–`U+E007F`	Tag Characters	127	High	Complete invisible ASCII alphabet. Can encode any text payload character-for-character. Originally for language tagging (deprecated)

Total: 401 individual codepoints across 14 ranges. Vibe Check scans for all of them.
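The total can be checked mechanically. The sketch below transcribes the 14 table rows into Python and sums the range sizes. The names INVISIBLE_RANGES, codepoint_total, and is_invisible are illustrative and are not taken from Vibe Check.

```python
# Each entry is (first codepoint, last codepoint, name), transcribed from the table above.
INVISIBLE_RANGES = [
    (0x200B, 0x200B, "Zero-Width Space"),
    (0x200C, 0x200C, "Zero-Width Non-Joiner"),
    (0x200D, 0x200D, "Zero-Width Joiner"),
    (0x200E, 0x200E, "Left-to-Right Mark"),
    (0x200F, 0x200F, "Right-to-Left Mark"),
    (0x2028, 0x2028, "Line Separator"),
    (0x2029, 0x2029, "Paragraph Separator"),
    (0x202A, 0x202E, "Bidi Embedding/Override Controls"),
    (0x2060, 0x2060, "Word Joiner"),
    (0x2061, 0x2064, "Invisible Math Operators"),
    (0xFE00, 0xFE0F, "Variation Selectors"),
    (0xFEFF, 0xFEFF, "Byte Order Mark / Zero-Width No-Break Space"),
    (0xE0100, 0xE01EF, "Variation Selectors Supplement"),
    (0xE0001, 0xE007F, "Tag Characters"),
]


def codepoint_total() -> int:
    """Sum the inclusive size of every range."""
    return sum(last - first + 1 for first, last, _ in INVISIBLE_RANGES)


def is_invisible(ch: str) -> bool:
    """True if the single character ch falls in any listed range."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last, _ in INVISIBLE_RANGES)


print(len(INVISIBLE_RANGES), codepoint_total())  # 14 401
```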

How These Characters Are Used in Attacks

Steganographic Payloads (Glassworm/KOI)

Sequences of 3+ consecutive invisible characters almost always indicate an encoded payload. Each invisible character maps to a byte (or bit) of executable code. A decoder function — the only visible part of the attack — reverses the mapping and passes the result to eval().
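When a suspicious run turns up, it is safer to decode it for inspection than to run the surrounding code. The sketch below assumes the simplest scheme implied by the table (U+200B = bit 0, U+200C = bit 1, eight bits per byte, most significant bit first). That mapping is an assumption for illustration; real payloads, including Glassworm's, may use other ranges and layouts.

```python
ZERO = "\u200b"  # Zero-Width Space, treated as bit 0 (assumed mapping)
ONE = "\u200c"   # Zero-Width Non-Joiner, treated as bit 1 (assumed mapping)


def decode_zero_width_bits(text: str) -> bytes:
    """Collect every ZWSP/ZWNJ in text and pack the bits into bytes, MSB first.

    Trailing bits that do not fill a whole byte are dropped. The result is
    returned for inspection only and is never executed.
    """
    bits = [1 if ch == ONE else 0 for ch in text if ch in (ZERO, ONE)]
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def encode_for_test(payload: bytes) -> str:
    """Inverse mapping, included only to build a harmless test fixture."""
    return "".join(
        ONE if (b >> shift) & 1 else ZERO
        for b in payload
        for shift in range(7, -1, -1)
    )


sample = "const x = 1;" + encode_for_test(b"hi") + " // looks harmless"
print(decode_zero_width_bits(sample))  # b'hi'
```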

Trojan Source (Bidi Attacks)

Characters U+202A–U+202E can make code display differently than it executes. A condition like if (isAdmin) can be visually reordered to appear as a different check. Researchers at the University of Cambridge demonstrated this in 2021, and it still works in any editor, diff viewer, or review tool that renders these controls without highlighting them.
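One way to review a file for this is to replace each bidi control with a visible label before reading it. The following is a minimal sketch; the function names reveal_bidi and lines_with_bidi are illustrative.

```python
# The five controls in U+202A-U+202E and their standard abbreviations.
BIDI_CONTROLS = {
    "\u202a": "LRE",  # Left-to-Right Embedding
    "\u202b": "RLE",  # Right-to-Left Embedding
    "\u202c": "PDF",  # Pop Directional Formatting
    "\u202d": "LRO",  # Left-to-Right Override
    "\u202e": "RLO",  # Right-to-Left Override
}


def reveal_bidi(line: str) -> str:
    """Replace each bidi control with a visible <NAME> marker."""
    return "".join(
        f"<{BIDI_CONTROLS[ch]}>" if ch in BIDI_CONTROLS else ch for ch in line
    )


def lines_with_bidi(source: str) -> list[tuple[int, str]]:
    """Return (1-based line number, revealed line) for every affected line."""
    return [
        (number, reveal_bidi(line))
        for number, line in enumerate(source.split("\n"), start=1)
        if any(ch in BIDI_CONTROLS for ch in line)
    ]


source = "ok = True\naccess = 'user\u202e' # admin\u202c\n"
print(lines_with_bidi(source))
# [(2, "access = 'user<RLO>' # admin<PDF>")]
```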

Text Fingerprinting

Unique patterns of zero-width characters can be embedded in text to trace leaks, with each recipient receiving a different pattern. This is not malicious per se, but it does involve invisible manipulation of content.
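A fingerprint can be surfaced by separating the visible text from the positions of the zero-width characters. The sketch below assumes the marks come from a small set of zero-width codepoints; both the set and the name split_fingerprint are illustrative choices. Two copies with identical visible text but different signatures have been marked differently.

```python
# Zero-width codepoints considered here (an illustrative subset of the table).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def split_fingerprint(text: str) -> tuple[str, tuple[tuple[int, str], ...]]:
    """Return (visible_text, signature).

    The signature lists each zero-width character as
    (position in the visible text, "U+XXXX").
    """
    visible: list[str] = []
    marks: list[tuple[int, str]] = []
    for ch in text:
        if ch in ZERO_WIDTH:
            marks.append((len(visible), f"U+{ord(ch):04X}"))
        else:
            visible.append(ch)
    return "".join(visible), tuple(marks)


copy_a = "Quarterly\u200b results"
copy_b = "Quarterly results\u200c"

text_a, sig_a = split_fingerprint(copy_a)
text_b, sig_b = split_fingerprint(copy_b)
print(text_a == text_b)  # True
print(sig_a)             # ((9, 'U+200B'),)
print(sig_b)             # ((17, 'U+200C'),)
```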

Prompt Injection

Invisible characters embedded in AI coding assistant configuration files can contain hidden instructions that manipulate the AI's code generation behavior.
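Hidden instructions of this kind can be made readable again. The sketch below assumes the text was hidden with tag characters, which shadow printable ASCII at an offset of 0xE0000 (U+E0020–U+E007E map to U+0020–U+007E). Other encodings need other decoders, and the sample string is invented for the demonstration.

```python
TAG_BASE = 0xE0000  # tag character = TAG_BASE + ASCII code


def reveal_tag_text(text: str) -> str:
    """Translate tag characters U+E0020..U+E007E to the ASCII they shadow.

    Every other character, visible or not, is ignored.
    """
    return "".join(
        chr(ord(ch) - TAG_BASE)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


# Build a fixture: a visible rule followed by an invisible suffix.
hidden = "".join(chr(TAG_BASE + ord(c)) for c in "hidden instruction")
config_line = "Use 4-space indentation." + hidden

print(len(config_line))              # 42 (24 visible + 18 invisible)
print(reveal_tag_text(config_line))  # hidden instruction
```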

How to Detect Them

Vibe Check scans for all 401 invisible codepoints across all 14 ranges. It classifies findings by severity: individual characters get a warning, sequences of 3+ get flagged as critical steganographic payloads. Everything runs client-side in your browser.
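The classification rule can be sketched in a few lines of Python. This is not Vibe Check's implementation. It assumes that runs of one or two characters are warnings, that runs of three or more are critical, and that a lone U+FEFF at the very start of the text is an ordinary byte order mark and is skipped. RANGES is a merged form of the 14 table rows.

```python
from dataclasses import dataclass

# Merged ranges covering the same 401 codepoints as the reference table.
RANGES = [
    (0x200B, 0x200F), (0x2028, 0x202E), (0x2060, 0x2064),
    (0xFE00, 0xFE0F), (0xFEFF, 0xFEFF),
    (0xE0001, 0xE007F), (0xE0100, 0xE01EF),
]


@dataclass
class Finding:
    offset: int    # index of the first invisible character in the run
    length: int    # number of consecutive invisible characters
    severity: str  # "warning" or "critical"


def is_invisible(ch: str) -> bool:
    cp = ord(ch)
    return any(first <= cp <= last for first, last in RANGES)


def scan(text: str, critical_run: int = 3) -> list[Finding]:
    """Group invisible characters into runs and grade each run."""
    findings: list[Finding] = []
    i, n = 0, len(text)
    while i < n:
        if not is_invisible(text[i]):
            i += 1
            continue
        start = i
        while i < n and is_invisible(text[i]):
            i += 1
        length = i - start
        # Assumption: a single leading U+FEFF is a legitimate BOM.
        if start == 0 and length == 1 and text[0] == "\ufeff":
            continue
        severity = "critical" if length >= critical_run else "warning"
        findings.append(Finding(start, length, severity))
    return findings


for finding in scan("\ufeffa\u200bb\u200b\u200c\u200dc"):
    print(finding)
# Finding(offset=2, length=1, severity='warning')
# Finding(offset=4, length=3, severity='critical')
```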


Detection in Other Tools

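VS Code: "editor.renderWhitespace": "all" marks ordinary spaces and tabs only. "editor.unicodeHighlight.invisibleCharacters": true and "editor.renderControlCharacters": true highlight some but not all of the characters listed above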
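grep (GNU grep with PCRE support, UTF-8 locale): grep -nP '[\x{200B}-\x{200F}\x{2028}-\x{202E}\x{2060}-\x{2064}\x{FE00}-\x{FE0F}\x{FEFF}\x{E0001}-\x{E007F}\x{E0100}-\x{E01EF}]' file.js
Hex editor: Any hex editor will show the raw UTF-8 bytes. Look for E2 80 8B–8F, E2 80 A8–AE, E2 81 A0–A4, EF B8 80–8F, EF BB BF, and F3 A0 (tag characters and the Variation Selectors Supplement). E2 80 on its own also matches ordinary punctuation such as curly quotes and dashes, so check the third byte
File size: A file carrying a hidden payload is larger on disk than its visible content warrants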
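The byte signatures and the file-size heuristic can both be checked with a short script. The sketch below assumes UTF-8 input; utf8_hex and size_report are illustrative names, and RANGES is a merged form of the 14 table rows.

```python
# Merged ranges covering the same 401 codepoints as the reference table.
RANGES = [
    (0x200B, 0x200F), (0x2028, 0x202E), (0x2060, 0x2064),
    (0xFE00, 0xFE0F), (0xFEFF, 0xFEFF),
    (0xE0001, 0xE007F), (0xE0100, 0xE01EF),
]


def is_invisible(ch: str) -> bool:
    cp = ord(ch)
    return any(first <= cp <= last for first, last in RANGES)


def utf8_hex(cp: int) -> str:
    """Hex dump of one codepoint's UTF-8 encoding, for hex-editor searches."""
    return chr(cp).encode("utf-8").hex(" ").upper()


def size_report(data: bytes) -> tuple[int, int]:
    """Return (total bytes, bytes spent on invisible codepoints)."""
    text = data.decode("utf-8")
    hidden = sum(len(ch.encode("utf-8")) for ch in text if is_invisible(ch))
    return len(data), hidden


for cp in (0x200B, 0x2060, 0xFE0F, 0xFEFF, 0xE0001, 0xE0100):
    print(f"U+{cp:04X} -> {utf8_hex(cp)}")
# U+200B -> E2 80 8B
# U+2060 -> E2 81 A0
# U+FE0F -> EF B8 8F
# U+FEFF -> EF BB BF
# U+E0001 -> F3 A0 80 81
# U+E0100 -> F3 A0 84 80

# Six visible bytes followed by fifty 4-byte variation selectors.
sample = ("x = 1\n" + "\U000E0100" * 50).encode("utf-8")
print(size_report(sample))  # (206, 200)
```

A file in which most bytes belong to invisible codepoints, as in the sample, is a strong candidate for closer inspection.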