What is a Unicode Text Editor?

Kurt Inman

A Unicode text editor is software for creating, editing or viewing text in a wide variety of writing systems. It stores text in Unicode, an evolving international standard for encoding the characters of human languages. A Unicode text editor is particularly useful with non-Latin scripts, including those that are read from right to left. Unicode editors are used around the world to create documents, web page content and text for software applications in many languages.

The Unicode standard was first proposed in the late 1980s by the early members of the Unicode Consortium, the non-profit organization that coordinates the standard's development worldwide. Early versions of Unicode were designed to accommodate most languages in use at the time. In 1996, version 2.0 expanded its capacity to over one million code points (1,114,112 in total), making room for historic scripts such as ancient Egyptian hieroglyphs, which were added in a later version and can now be input and displayed with a Unicode text editor. As of version 5.2, the standard defined more than 107,000 characters, and later versions have added many more. Even more complex letters and symbols can be built by combining these pre-defined characters, for example a base letter followed by one or more accent marks.
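The capacity figure and the "building blocks" idea can be checked with a few lines of Python. This is only a minimal sketch using the standard `unicodedata` module; the variable names are illustrative, and the character data available depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# The Unicode code space runs from U+0000 to U+10FFFF inclusive.
MAX_CODE_POINT = 0x10FFFF
total_code_points = MAX_CODE_POINT + 1

# Build an accented letter from two building blocks:
# the base letter "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# NFC normalization replaces the pair with the single precomposed character.
composed = unicodedata.normalize("NFC", decomposed)

# Code points above U+FFFF exist only because of the 1996 expansion;
# U+13000 is the first character of the Egyptian Hieroglyphs block.
hieroglyph_name = unicodedata.name(chr(0x13000))

print(total_code_points)               # 1114112
print(len(decomposed), len(composed))  # 2 1
print(hieroglyph_name)
```

Both `decomposed` and `composed` display as the same letter, which is why editors usually normalize text before comparing or searching it.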

Unicode is supported to some extent in most modern web browsers, software applications and operating systems. Prior to Unicode, there were many different methods for representing non-Latin scripts, most of them incompatible with each other. This made it very difficult to enter or display text in several languages at once. A Unicode text editor represents and stores such content in a consistent, well-defined way, so the text it creates can be easily shared with other Unicode-compliant applications and web pages worldwide.

A full-featured Unicode text editor generally allows text to be input from the keyboard in a way that is natural for a particular language. For example, Hebrew, Arabic and other languages which are written right to left can be entered and displayed in that direction with a Unicode editor. Multiple languages can be included in the same document, even if they are written in different directions. Not all characters can be easily entered using a localized keyboard, so alternate input methods are usually provided, including choosing a character from an on-screen list and typing its numeric code point directly.
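Numeric entry and direction handling can be sketched as follows. The helper names `char_from_hex` and `direction_of` are hypothetical and not taken from any particular editor, and the direction check is a deliberate simplification: real editors implement the full Unicode Bidirectional Algorithm rather than classifying characters one at a time.

```python
import unicodedata

def char_from_hex(code):
    """Turn a typed hex code point such as '05D0' or 'U+05D0' into a character."""
    cleaned = code.strip().upper()
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    return chr(int(cleaned, 16))

def direction_of(ch):
    """Classify one character by its Unicode bidirectional class (simplified)."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):   # Hebrew-style and Arabic-style strong right-to-left
        return "rtl"
    if bidi == "L":           # strong left-to-right
        return "ltr"
    return "other"            # digits, spaces, punctuation and similar

# Enter the Hebrew letter alef by its code point instead of the keyboard.
alef = char_from_hex("U+05D0")

# A line mixing Latin and Hebrew letters, as a multilingual document might.
mixed = "abc \u05d0\u05d1\u05d2"
directions = [direction_of(ch) for ch in mixed]

print(unicodedata.name(alef))  # HEBREW LETTER ALEF
print(directions)
```

The per-character classes are what a layout engine starts from when it decides which runs of a mixed line to display right to left.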

A Unicode text editor can import files stored in a variety of legacy encodings, such as Unified Hangul Code for Korean or the national encodings used for Thai. While loading, any characters written as numeric references, such as "&#233;" in place of an accented e, can be automatically converted to the actual Unicode characters. Text files can usually be saved in a Unicode encoding or in American Standard Code for Information Interchange (ASCII), with non-ASCII characters represented numerically. Content can frequently be stored in HyperText Markup Language (HTML) format with the UTF-8 encoding of Unicode, enabling correct display in modern web browsers.
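The import and export steps described above might look like this in Python. It is a rough sketch rather than how any specific editor works: it assumes the source encoding is already known (`cp949` is Python's codec for Unified Hangul Code, `tis_620` one of its Thai codecs), and the sample byte strings are created in place only so the example is self-contained.

```python
import html

# Bytes as they might arrive from legacy files (built here for the demo).
korean_bytes = "\ud55c\uae00".encode("cp949")          # Korean word "Hangul"
thai_bytes = "\u0e44\u0e17\u0e22".encode("tis_620")    # Thai word "Thai"

# Importing: decode each file with its own legacy codec to get Unicode text.
korean_text = korean_bytes.decode("cp949")
thai_text = thai_bytes.decode("tis_620")

# Loading: turn decimal and hexadecimal numeric references into characters.
decoded_refs = html.unescape("caf&#233; &#x05D0;")

# Saving as ASCII: anything outside ASCII becomes a numeric reference again.
ascii_bytes = decoded_refs.encode("ascii", "xmlcharrefreplace")

# Saving as UTF-8: every Unicode character is stored directly.
utf8_bytes = decoded_refs.encode("utf-8")

print(ascii_bytes)  # b'caf&#233; &#1488;'
print(len(utf8_bytes), "bytes in UTF-8")
```

The ASCII form survives any channel that only handles 7-bit text, at the cost of being harder to read; the UTF-8 form is what an HTML page would normally declare and store.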

Unicode text editors often allow different fonts and colors to be selected for individual languages, making it easier to work with a mix of content. "Combining characters," which some languages require in order to attach accents, vowel signs and other marks to a base character, can usually be hidden or displayed. While editing, blocks of text can be reordered. They can often be converted from one case to another or from HTML entities to Unicode characters. Many editors also include features which simplify entering and editing Asian languages, for example converting text between Simplified Chinese and Traditional Chinese or between transliterations and Unicode representations.
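A few of these editing features can be approximated with Python's standard library. The function name `reveal_marks` is hypothetical, and the tag format it prints is just one possible way an editor could make combining characters visible.

```python
import html
import unicodedata

def reveal_marks(text):
    """Show each combining mark as a visible <U+XXXX> tag instead of drawing it."""
    pieces = []
    for ch in text:
        # General categories Mn, Mc and Me are the combining marks.
        if unicodedata.category(ch).startswith("M"):
            pieces.append("<U+%04X>" % ord(ch))
        else:
            pieces.append(ch)
    return "".join(pieces)

# "e" followed by a combining acute accent normally renders as one letter.
revealed = reveal_marks("e\u0301")

# Case conversion follows Unicode rules, so the result can change length.
upper = "stra\u00dfe".upper()

# Named HTML entities converted to the characters they stand for.
greek = html.unescape("&alpha;&beta;")

print(revealed)  # e<U+0301>
print(upper)     # STRASSE
print(greek)
```

The case example shows why editors cannot convert case one byte at a time: the German sharp s becomes two letters in upper case.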

Many Unicode text editors are available commercially or through the open source community. Most modern proprietary and open source word processors can also act as Unicode editors, as can several web page design tools and email editors. Unicode text editors are generally available for all of the major operating systems, and several web-based tools also exist.
