What Is Character Encoding?

By Eugene P.
Updated: May 16, 2024

Character encoding, in computer programming, is a method of mapping each character, glyph or symbol to a representation, usually numerical, that a computer can store. Character encoding is necessary because information within computer memory and on computer-readable media is stored as sequences of bits. Encoding translates the non-numerical characters used for display or human-readable output into a form that a computer can manipulate. In a more specific application, HyperText Markup Language (HTML) documents that are read by web browsers can declare which character encoding they use, so the browser knows which character set to apply when displaying the information in the document. There are several encoding schemes in use, though many proprietary and legacy sets are slowly being replaced by the Unicode® encoding standard.
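The round trip from characters to stored bits and back can be sketched in a few lines of Python. This is only an illustration: the helper name to_bits and the sample text are assumptions made for the example, and UTF-8 is chosen simply because it is a common encoding.

```python
def to_bits(text: str, encoding: str = "utf-8") -> list[str]:
    """Encode text and return each resulting byte as an 8-character bit string."""
    # to_bits is a hypothetical helper written for this example.
    return [format(byte, "08b") for byte in text.encode(encoding)]


message = "Hi!"
stored = message.encode("utf-8")     # characters -> bytes
bits = to_bits(message)              # the same bytes, written out as bits
restored = stored.decode("utf-8")    # bytes -> characters

print(list(stored))   # [72, 105, 33]
print(bits)           # ['01001000', '01101001', '00100001']
print(restored)       # Hi!
```

The decode step only recovers the original text because it uses the same encoding that produced the bytes.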

In the early days of computers, when memory was limited, the basic characters of the English alphabet, including punctuation and digits, were stored in 7-bit sequences, allowing for 128 different characters. In this original scheme, known as ASCII, each 7-bit code represented one character, numbered in sequence. This character encoding was efficient and was eventually standardized and used in most of the computers that were produced. Although encoding has since evolved into the Unicode® standard, which keeps the 128 ASCII assignments as its first 128 values, the concept remains the same. Namely, each character in a language is assigned a single number, called a code point, within a large standard character set, and that number is what a computer uses to store, process and index the character.
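As a rough illustration of the one-character, one-number idea, Python's built-in ord and chr functions convert between a character and its Unicode number. The helper fits_in_7_bits is a hypothetical name introduced here; it checks whether every character falls within the 128 values of the original 7-bit scheme (standardized as ASCII).

```python
def fits_in_7_bits(text: str) -> bool:
    """Return True if every character's number is below 128 (fits in 7 bits)."""
    # fits_in_7_bits is a hypothetical helper written for this example.
    return all(ord(ch) < 128 for ch in text)


print(ord("A"))                             # 65
print(chr(65))                              # A
print(fits_in_7_bits("Hello, world! 123"))  # True
print(fits_in_7_bits("caf\u00e9"))          # False: the accented e is number 233
print(ord("\u00e9"), ord("\u20ac"))         # 233 8364
```

The letter A has the same number, 65, in both the 7-bit scheme and Unicode, while the accented letter and the euro sign need numbers beyond the 7-bit range.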

Other types of character encoding were developed for different reasons. Some that were geared specifically to the English alphabet and intended for text only used 7-bit codes but packed them back to back across 8-bit bytes, or octets, instead of storing one character per octet. This saved 1 bit per character, so eight characters fit into seven octets, effectively using character encoding as a type of compression. Other encoding schemes stored a base character followed by additional codes representing accents, which could be used when writing in a different language, although these were largely abandoned for the simpler one-to-one encoding methods.
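The packing idea can be sketched as follows. This is a minimal illustration, not the bit layout of any particular real-world scheme: the function names pack_7bit and unpack_7bit are hypothetical, the bits are simply laid end to end with the most significant bit first, and the final octet is padded with zeros.

```python
def pack_7bit(text: str) -> bytes:
    """Pack 7-bit character codes back to back into 8-bit octets."""
    codes = [ord(ch) for ch in text]
    if any(code > 127 for code in codes):
        raise ValueError("only 7-bit characters can be packed")
    bits = "".join(format(code, "07b") for code in codes)
    bits += "0" * (-len(bits) % 8)   # pad the last octet with zero bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def unpack_7bit(data: bytes, count: int) -> str:
    """Recover `count` characters from octets produced by pack_7bit."""
    bits = "".join(format(octet, "08b") for octet in data)
    return "".join(chr(int(bits[i * 7:(i + 1) * 7], 2)) for i in range(count))


text = "ENCODING"                     # 8 characters
packed = pack_7bit(text)
print(len(text), len(packed))         # 8 7
print(unpack_7bit(packed, len(text))) # ENCODING
```

The character count is passed to unpack_7bit because the zero padding in the last octet could otherwise be mistaken for an extra character.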

In HTML documents, character encoding is roughly the same as the broader concept, except that the encoding being declared applies to the entire set of characters in the document. This can be important not only for text in other languages, but also for documents that use specific symbols for science or mathematics that are not present in all character sets. It also can be useful for punctuation and other glyphs that might not be present or are mapped differently across encoding schemes. Documents that do not properly declare the encoding they use could display incorrectly or be filled with nonsensical characters and placeholders instead of readable information.
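The effect of reading a document with the wrong encoding can be reproduced in Python. In this sketch the sample text and variable names are assumptions made for the example: text saved as UTF-8 is decoded once as Latin-1, which yields nonsensical characters, and once as ASCII with replacement enabled, which yields placeholder characters.

```python
# Sample text with two accented letters, written with escapes so the
# example does not depend on how this file itself is encoded.
original = "na\u00efve caf\u00e9"

stored = original.encode("utf-8")     # what a UTF-8 document contains

# Decoding with the wrong character set produces nonsensical characters.
garbled = stored.decode("latin-1")

# Decoding as ASCII with errors="replace" substitutes the placeholder
# character U+FFFD for every byte that ASCII cannot represent.
placeholders = stored.decode("ascii", errors="replace")

# Decoding with the encoding that was actually used restores the text.
correct = stored.decode("utf-8")

print(garbled)
print(placeholders)
print(correct == original)            # True
```

A browser faces the same choice when it loads a page, which is why an HTML document that declares its encoding is displayed more reliably than one that leaves the browser to guess.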

https://www.easytechjunkie.com/what-is-character-encoding.htm