Unicode Converter

Bidirectional conversion between text and \uXXXX Unicode escape sequences. Supports the native2ascii format used in Java .properties files.

// Unicode Escapes

The \uXXXX format represents one UTF-16 code unit as four hexadecimal digits. Java .properties files traditionally store characters outside their ISO-8859-1 encoding in this format.

// about this tool

What is a Unicode Converter?

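Unicode is a universal character encoding standard that assigns a unique code point to each character across the world's writing systems. A character in the Basic Multilingual Plane (U+0000 to U+FFFF) can be represented as a single \uXXXX escape sequence with four hexadecimal digits; characters beyond that range need two escapes (a surrogate pair).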

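Java's .properties files have traditionally been read as ISO-8859-1 (Latin-1), so characters outside that encoding cannot be stored directly. The native2ascii tool, which shipped with the JDK through Java 8, converts characters like Korean or Chinese into \uXXXX escape sequences. This tool replicates that conversion instantly in the browser.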

Three modes are supported: 'Encode All' converts every character to \u format, 'Non-ASCII Only' converts characters outside the ASCII range (above 127), and 'Java Properties' mode matches the exact behavior of native2ascii.
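The three modes can be sketched as a single escaping function with a different cutoff per mode. This is a minimal illustration, not the tool's actual implementation: the mode keys (`encode_all`, `non_ascii`, `java_properties`) and the lowercase hex output are assumptions, and the 0x100 cutoff for the Java Properties mode is taken from this page's FAQ.

```python
def to_utf16_units(ch: str) -> list[int]:
    """Return the UTF-16 code units (one or two) for a single character."""
    data = ch.encode("utf-16-be", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# Hypothetical mode keys mapped to the first code point that gets escaped.
THRESHOLDS = {
    "encode_all": 0x0,         # 'Encode All': every character
    "non_ascii": 0x80,         # 'Non-ASCII Only': above 127
    "java_properties": 0x100,  # 'Java Properties': above Latin-1, per the FAQ
}


def escape(text: str, mode: str = "non_ascii") -> str:
    """Replace characters at or above the mode's cutoff with \\uXXXX escapes."""
    threshold = THRESHOLDS[mode]
    out = []
    for ch in text:
        if ord(ch) >= threshold:
            # Supplementary characters yield two code units, hence two escapes.
            out.extend(f"\\u{unit:04x}" for unit in to_utf16_units(ch))
        else:
            out.append(ch)
    return "".join(out)


print(escape("é가", "non_ascii"))        # \u00e9\uac00
print(escape("é가", "java_properties"))  # é\uac00
```

Keeping the cutoff in a table makes the difference between the modes explicit: the only thing that changes is which characters are considered safe to leave unescaped.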

Use Cases

▸Convert Korean messages in Java .properties files to native2ascii format
▸Interpret and debug \uXXXX escape sequences in source code
▸Look up Unicode code points of emoji or special characters
▸Prepare or verify i18n resource files for Java applications
▸Understand Unicode escape sequences in regular expressions
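For the decoding direction (escapes back to text), a rough sketch follows. The function name `unescape` is hypothetical, and the sketch deliberately ignores edge cases such as an escaped backslash before the `u` (`\\u0041`), which a full .properties parser would have to handle.

```python
import re

# Matches a backslash, 'u', and exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape(text: str) -> str:
    """Turn \\uXXXX escapes back into characters, merging surrogate pairs."""
    # Step 1: replace each escape with its UTF-16 code unit.
    # Halves of a surrogate pair are still separate at this point.
    units = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Step 2: round-trip through UTF-16 so that a high surrogate followed by
    # a low surrogate is combined into one supplementary character.
    raw = units.encode("utf-16-le", "surrogatepass")
    return raw.decode("utf-16-le", "surrogatepass")


print(unescape("\\uc548\\ub155"))    # 안녕
print(unescape("\\ud83d\\ude00"))    # 😀
```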

FAQ

Q. What is the difference between \u and U+?

U+XXXX is the Unicode standard notation for a code point. \uXXXX is the escape syntax used in programming languages like Java, JavaScript, and C# to represent a Unicode character in source code.
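To make the distinction concrete, here is a small sketch that converts between a character and its U+ notation. The helper names `to_u_plus` and `from_u_plus` are made up for illustration. Note that U+ notation names the code point itself, so it may have more than four digits, unlike a single \uXXXX escape.

```python
def to_u_plus(ch: str) -> str:
    """Format a single character's code point in U+ notation."""
    return f"U+{ord(ch):04X}"


def from_u_plus(notation: str) -> str:
    """Return the character named by a U+ notation string."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"not in U+ notation: {notation!r}")
    return chr(int(notation[2:], 16))


print(to_u_plus("가"))          # U+AC00
print(to_u_plus("😀"))          # U+1F600 (five digits: no single \uXXXX form)
print(from_u_plus("U+00E9"))    # é
```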

Q. Why can't some emoji be represented with a single \uXXXX?

Most emoji are in the Supplementary Planes (above U+FFFF). In UTF-16, these require a surrogate pair: two \u escapes, a high surrogate in the range \uD800–\uDBFF followed by a low surrogate in the range \uDC00–\uDFFF. A single 4-digit \u cannot represent them.
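The surrogate pair for a supplementary code point follows from the standard UTF-16 arithmetic: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. A minimal sketch, with `surrogate_pair` as a hypothetical helper name:

```python
def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits    -> D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> DC00..DFFF
    return high, low


high, low = surrogate_pair(0x1F600)     # 😀 GRINNING FACE
print(f"\\u{high:04X}\\u{low:04X}")     # \uD83D\uDE00
```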

Q. What is the difference between Java Properties mode and Non-ASCII mode?

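Non-ASCII mode converts every character above code point 127 (U+007F). Java Properties mode follows native2ascii conventions and converts only characters above the Latin-1 range (code point 256, U+0100, and above) to \uXXXX, so a Latin-1 character such as é (U+00E9) is escaped in Non-ASCII mode but left as-is in Java Properties mode.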

// related tools

Encode and decode Base64 strings. Supports text, URLs, and binary data.

Encode or decode URL strings with encodeURIComponent or encodeURI.

HTML Entity Encoder

Encode HTML special characters into entities or decode them. & → &amp;amp; and more.

Format, validate, and minify JSON. Supports nested structures and diff comparison.