15 Jan 2010
Character encoding
A character encoding is a code that pairs each character from a given character set with something else, such as a number or a sequence of octets, so that text can be stored and transmitted over telecommunication networks. The collection of characters before encoding is called the character set.
Simple character sets
Conventionally, character set and character encoding were considered the same thing, because the same standard specified both of them (i.e. which characters are available and how they are encoded into a stream of code).
Modern encoding model
Unicode and its parallel standard, the ISO/IEC 10646 Universal Character Set, are now the most modern character encoding. They broke away from the idea of simple character sets to establish a universal set of characters that can be encoded in a variety of ways. To make this happen we need more terms than the simple character sets required.
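As a small illustration of one universal set of characters being encoded in a variety of ways, the Python sketch below encodes the same character with three Unicode encodings. The sample character and the variable names are chosen only for this example.

```python
# One abstract character, one code point, several byte representations.
ch = "\u20ac"  # EURO SIGN

code_point = ord(ch)

# Three encodings of the same code point (big-endian variants, no byte order mark).
encodings = ["utf-8", "utf-16-be", "utf-32-be"]
encoded = {name: ch.encode(name) for name in encodings}

print(f"U+{code_point:04X}")
for name, data in encoded.items():
    print(name, data.hex())
```

Each byte sequence decodes back to the same character, which is the point of separating the character set from the way it is encoded.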
Terms of modern character encoding
1- Character repertoire
It is the full set of abstract characters that a system supports. It may be closed, meaning that no additions are allowed without creating a new standard, as with ASCII and most of the ISO-8859 series, or it may be open, allowing additions, as in the case of Unicode.
The characters in a given repertoire reflect decisions about how to divide writing systems into linear information units.
Some alphabets can be broken down into letters, digits, punctuation, and a few special characters, which can all be arranged in simple linear sequences that are displayed in the same order they are read. Even these alphabets have a complication: diacritics can be regarded either as part of a single character containing a letter and its diacritic, or as separate characters.
Other writing systems, such as Arabic, are represented with more complex character repertoires due to the need to accommodate things like bidirectional text and glyphs that are joined together in different ways in different situations.
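Unicode is one repertoire that allows both views of an accented letter: as a single character, or as a base letter followed by a separate combining mark. The sketch below, using Python's standard `unicodedata` module, shows the two forms and how normalization converts between them. The variable names are only illustrative.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one character
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT, two characters

# The two strings usually display the same, but they differ in length
# and compare unequal because comparison works code point by code point.
print(len(precomposed), len(decomposed))
print(precomposed == decomposed)

# Normalization converts between the two views.
composed_again = unicodedata.normalize("NFC", decomposed)
split_again = unicodedata.normalize("NFD", precomposed)
print(composed_again == precomposed, split_again == decomposed)
```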
2- Coded character set
It specifies how to represent a repertoire using a number of non-negative integer codes called code points. A complete set of characters and their corresponding integers is a coded character set.
Several coded character sets may share the same repertoire but map its characters to different codes. In a coded character set, each code point represents only one character.
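To make the idea of code points concrete, here is a minimal Python sketch. It uses the built-in `ord` and `chr` for Unicode code points, and two single-byte codecs that ship with Python, `latin-1` and `cp037` (an EBCDIC code page), as examples of coded character sets that contain the same letter under different codes. For single-byte sets the encoded byte value equals the code, which is the assumption this example relies on.

```python
ch = "A"

# Unicode coded character set: character <-> code point.
unicode_code = ord(ch)
round_trip = chr(unicode_code)

# Two single-byte coded character sets that both contain "A"
# but assign it different integer codes.
latin1_code = ch.encode("latin-1")[0]
ebcdic_code = ch.encode("cp037")[0]

print(unicode_code, latin1_code, ebcdic_code)
```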
3- Character encoding form (CEF)
It specifies the conversion of a coded character set's integer codes into a set of limited-size integer code values that facilitate storage in a system that represents numbers in binary form using a fixed number of bits. For example, a system that stores numeric information in 16-bit units can represent only the integers from 0 to 65,535 in each unit, but it can represent larger values if it uses two units. This is what a CEF accommodates: it defines a way of mapping a single code point from a large range to a series of one or more code values from the smaller range that a single unit can hold.
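UTF-16 is a well-known encoding form of this kind: code points that fit in 16 bits use one unit, and larger ones are split across two units called a surrogate pair. The function below is a minimal sketch of that mapping; the name `to_utf16_units` is hypothetical, and the result is checked against Python's own UTF-16 encoder rather than taken on trust.

```python
from typing import List


def to_utf16_units(code_point: int) -> List[int]:
    """Map one Unicode code point to one or two 16-bit code values."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point <= 0xFFFF:
        return [code_point]            # fits in a single 16-bit unit
    offset = code_point - 0x10000      # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go in the first unit
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go in the second unit
    return [high, low]


units = to_utf16_units(0x1F600)
print([hex(u) for u in units])

# Cross-check against the built-in encoder (big-endian, two bytes per unit).
expected = "\U0001F600".encode("utf-16-be")
print(b"".join(u.to_bytes(2, "big") for u in units) == expected)
```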
4- Character encoding scheme (CES)
It specifies how the fixed-size integer code values are mapped into an octet sequence so that they can be saved to a file or transmitted over a network.
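For 16-bit code values, the main decision an encoding scheme makes is byte order. The sketch below serializes the same list of code values in big-endian and little-endian order with Python's `struct` module and compares the result with the built-in `utf-16-be` and `utf-16-le` codecs. The sample values and variable names are assumptions made for the example.

```python
import struct

# UTF-16 code values for EURO SIGN followed by U+1F600 (a surrogate pair).
code_values = [0x20AC, 0xD83D, 0xDE00]
text = "\u20ac\U0001F600"

# "H" is an unsigned 16-bit integer; ">" and "<" select the byte order.
big_endian = struct.pack(f">{len(code_values)}H", *code_values)
little_endian = struct.pack(f"<{len(code_values)}H", *code_values)

print(big_endian.hex())
print(little_endian.hex())
print(big_endian == text.encode("utf-16-be"))
print(little_endian == text.encode("utf-16-le"))
```

The code values are identical in both cases; only the order of the octets within each unit changes, which is why a receiver has to know the scheme (or read a byte order mark) before decoding.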