Code: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 00:32, 11 December 2022 by Allforrous (new key for Category:Encodings using HotCat). Latest revision as of 11:57, 26 December 2024 by Ultraodan (implementing talk page edit requested by Chapel1337: fixing tense as per MOS:TENSE). 9 intermediate revisions by 8 users not shown.
Latest revision as of 11:57, 26 December 2024

System of rules to convert information into another form or representation

For other uses, see Code (disambiguation).
"Encoding" redirects here. For other uses, see Encoding (disambiguation).
For technical reasons, terms beginning with "Code#" redirect here. For the EPs by Ladies' Code, see Code 01 Bad Girl and Code 02 Pretty Pretty.

In communications and information processing, code is a system of rules to convert information (such as a letter, word, sound, image, or gesture) into another form, sometimes shortened or secret, for communication through a communication channel or storage in a storage medium. An early example is the invention of language, which enabled a person, through speech, to communicate what they thought, saw, heard, or felt to others. But speech limits the range of communication to the distance a voice can carry and limits the audience to those present when the speech is uttered. The invention of writing, which converted spoken language into visual symbols, extended the range of communication across space and time.

The process of encoding converts information from a source into symbols for communication or storage. Decoding is the reverse process, converting code symbols back into a form that the recipient understands, such as English or Spanish.

One reason for coding is to enable communication in places where ordinary plain language, spoken or written, is difficult or impossible. One example is semaphore, in which the configuration of flags held by a signaler or the arms of a semaphore tower encodes parts of the message, typically individual letters and numbers. Another person standing a great distance away can interpret the flags and reproduce the words sent.

Theory

Main article: Coding theory

In information theory and computer science, a code is usually considered an algorithm that uniquely represents symbols from some source alphabet by encoded strings, which may be in some other target alphabet. An extension of the code for representing sequences of symbols over the source alphabet is obtained by concatenating the encoded strings.

Before the mathematically precise definition, here is a brief example. The mapping

$$C=\{\,a\mapsto 0,\ b\mapsto 01,\ c\mapsto 011\,\}$$

is a code whose source alphabet is the set $\{a,b,c\}$ and whose target alphabet is the set $\{0,1\}$. Using the extension of the code, the encoded string 0011001 can be grouped into codewords as 0 011 0 01, and these in turn can be decoded to the sequence of source symbols acab.

Using terms from formal language theory, the precise mathematical definition of this concept is as follows: let $S$ and $T$ be two finite sets, called the source and target alphabets, respectively. A code $C:\,S\to T^{*}$ is a total function mapping each symbol from $S$ to a sequence of symbols over $T$. The extension $C'$ of $C$ is a homomorphism of $S^{*}$ into $T^{*}$, which naturally maps each sequence of source symbols to a sequence of target symbols.
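The example code and its extension can be sketched in a few lines of Python. This is only an illustration: the names `CODE`, `encode`, and `decodings` are made up here, and the decoder simply tries every way of splitting the string into codewords rather than using an efficient method.

```python
# The example code C = {a -> 0, b -> 01, c -> 011} as a lookup table.
CODE = {"a": "0", "b": "01", "c": "011"}


def encode(message, code=CODE):
    """Extension of the code: concatenate the codeword of each source symbol."""
    return "".join(code[symbol] for symbol in message)


def decodings(bits, code=CODE):
    """Return every source string whose encoding is exactly `bits`.

    A brute-force search over all splits into codewords; a uniquely
    decodable code yields at most one result for any input.
    """
    if bits == "":
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            for rest in decodings(bits[len(word):], code):
                results.append(symbol + rest)
    return results


print(encode("acab"))
print(decodings("0011001"))
```

Because the extension is a homomorphism, `encode(x + y)` equals `encode(x) + encode(y)` for any two source strings `x` and `y`.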

Variable-length codes

Main article: Variable-length code

In this section, we consider codes that encode each source (clear text) character by a code word from some dictionary; the concatenation of such code words gives an encoded string. Variable-length codes are especially useful when clear text characters have different probabilities; see also entropy encoding.

A prefix code is a code with the "prefix property": no valid code word in the system is a prefix (start) of any other valid code word in the set. Huffman coding is the best-known algorithm for deriving prefix codes. Prefix codes are widely referred to as "Huffman codes" even when the code was not produced by a Huffman algorithm. Other examples of prefix codes are country calling codes, the country and publisher parts of ISBNs, and the Secondary Synchronization Codes used in the UMTS WCDMA 3G Wireless Standard.
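As a rough illustration of the prefix property and of Huffman's algorithm, the following sketch builds a binary code from character frequencies and checks that no codeword is a prefix of another. The function names are hypothetical, the target alphabet is assumed to be {0, 1}, and ties between equal frequencies are broken by insertion order, which is one of several valid choices.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a binary Huffman code (symbol -> bit string) for `text`."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a non-empty codeword.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie-breaker, partial code table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codewords with 0 and 1.
        merged = {sym: "0" + word for sym, word in left.items()}
        merged.update({sym: "1" + word for sym, word in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def is_prefix_code(code):
    """True if no codeword is a prefix of (or equal to) another codeword."""
    words = sorted(code.values())
    # After sorting, any word that extends `a` directly follows `a`.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


code = huffman_code("abracadabra")
print(code, is_prefix_code(code))
```

Frequent symbols receive short codewords: in "abracadabra" the letter a, which occurs five times, gets a one-bit codeword.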

Kraft's inequality characterizes the sets of codeword lengths that are possible in a prefix code. Every uniquely decodable code, not necessarily a prefix code, must also satisfy Kraft's inequality.
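For a target alphabet of size r and codeword lengths l_1, ..., l_n, Kraft's inequality states that the sum of r^(-l_i) over all codewords is at most 1. A small check can be sketched as follows; the function name `kraft_sum` is an assumption made for this example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def kraft_sum(lengths, radix=2):
    """Sum of radix**(-length) over all codeword lengths, as an exact fraction."""
    return sum(Fraction(1, radix ** n) for n in lengths)


# Codeword lengths 1, 2, 3 (for example 0, 01, 011) satisfy the inequality.
print(kraft_sum([1, 2, 3]) <= 1)

# Lengths 1, 1, 2 cannot belong to any uniquely decodable binary code.
print(kraft_sum([1, 1, 2]) <= 1)
```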

Error-correcting codes

Main article: Error detection and correction
See also: Block code

Codes may also be used to represent data in a way more resistant to errors in transmission or storage. Such an error-correcting code works by including carefully crafted redundancy with the stored (or transmitted) data. Examples include Hamming codes, Reed–Solomon, Reed–Muller, Walsh–Hadamard, Bose–Chaudhuri–Hocquenghem, Turbo, Golay, algebraic geometry codes, low-density parity-check codes, and space–time codes. Error-detecting codes can be optimised to detect burst errors or random errors.
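To make the idea of crafted redundancy concrete, here is a minimal sketch of the Hamming(7,4) code, which adds three parity bits to four data bits and can correct any single flipped bit. The function names and the bit layout (parity bits at positions 1, 2, and 4) follow a common textbook convention and are assumptions of this example, not something specified in the article.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word):
    """Correct at most one flipped bit and return the 4 data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the error, 0 if none
    if position:
        w[position - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]


def flip(word, index):
    """Return a copy of `word` with the bit at `index` (0-based) inverted."""
    return [bit ^ 1 if i == index else bit for i, bit in enumerate(word)]


codeword = hamming74_encode([1, 0, 1, 1])
print(codeword)
print(hamming74_decode(flip(codeword, 4)))
```

Two or more flipped bits in the same codeword are beyond what this particular code can correct.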

Examples

Codes in communication used for brevity

Main article: Brevity code

A cable code replaces words (e.g. ship or invoice) with shorter words, allowing the same information to be sent with fewer characters, more quickly, and less expensively.

Codes can be used for brevity. When telegraph messages were the state of the art in rapid long-distance communication, elaborate systems of commercial codes that encoded complete phrases into single words (commonly five-letter groups) were developed, so that telegraphers became conversant with such "words" as BYOXO ("Are you trying to weasel out of our deal?"), LIOUY ("Why do you not answer my question?"), BMULD ("You're a skunk!"), or AYYLU ("Not clearly coded, repeat more clearly."). Code words were chosen for various reasons: length, pronounceability, etc. Meanings were chosen to fit perceived needs: commercial negotiations, military terms for military codes, diplomatic terms for diplomatic codes, any and all of the preceding for espionage codes.

Codebooks and codebook publishers proliferated, including one operated as a front for the American Black Chamber, run by Herbert Yardley between the First and Second World Wars. The purpose of most of these codes was to save on cable costs. The use of data coding for data compression predates the computer era; an early example is the telegraph Morse code, in which more frequently used characters have shorter representations. Techniques such as Huffman coding are now used by computer-based algorithms to compress large data files into a more compact form for storage or transmission.
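A commercial codebook of this kind is essentially a two-way lookup table. The sketch below uses only the four code words quoted above; the function names and the convention of separating code words with spaces are assumptions made for illustration.

```python
# Code word -> phrase, using only the examples quoted in the text.
CODEBOOK = {
    "BYOXO": "Are you trying to weasel out of our deal?",
    "LIOUY": "Why do you not answer my question?",
    "BMULD": "You're a skunk!",
    "AYYLU": "Not clearly coded, repeat more clearly.",
}

# Reverse table for encoding: phrase -> code word.
PHRASE_TO_WORD = {phrase: word for word, phrase in CODEBOOK.items()}


def encode_message(phrases):
    """Replace each whole phrase with its five-letter code word."""
    return " ".join(PHRASE_TO_WORD[phrase] for phrase in phrases)


def decode_message(cable):
    """Look up each code word of a cable in the codebook."""
    return [CODEBOOK[word] for word in cable.split()]


cable = encode_message(["Why do you not answer my question?", "You're a skunk!"])
print(cable)
print(decode_message(cable))
```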

Character encodings

Main article: Character encoding

Character encodings are representations of textual data. A given character encoding may be associated with a specific character set (the collection of characters which it can represent), though some character sets have multiple character encodings and vice versa. Character encodings may be broadly grouped according to the number of bytes required to represent a single character: there are single-byte encodings, multibyte (also called wide) encodings, and variable-width (also called variable-length) encodings.

The earliest character encodings were single-byte, the best-known example of which is ASCII. ASCII remains in use today, for example in HTTP headers. However, single-byte encodings cannot model character sets with more than 256 characters. Scripts that require large character sets such as Chinese, Japanese and Korean must be represented with multibyte encodings. Early multibyte encodings were fixed-length, meaning that although each character was represented by more than one byte, all characters used the same number of bytes ("word length"), making them suitable for decoding with a lookup table.

The final group, variable-width encodings, is a subset of multibyte encodings. These use more complex encoding and decoding logic to efficiently represent large character sets while keeping the representations of more commonly used characters shorter or maintaining backward compatibility properties. This group includes UTF-8, an encoding of the Unicode character set; UTF-8 is the most common encoding of text media on the Internet.
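The difference between fixed-length and variable-width encodings can be observed with Python's built-in `str.encode`. In this small sketch the sample characters and helper names are chosen for illustration; UTF-32 stands in as an example of a fixed-length multibyte encoding.

```python
# Four sample characters: Latin A, e with acute accent, a CJK ideograph, an emoji.
SAMPLE = "A\u00e9\u4e2d\U0001F600"


def byte_lengths(text, encoding):
    """Number of bytes each character of `text` occupies in `encoding`."""
    return [len(ch.encode(encoding)) for ch in text]


def fits_ascii(text):
    """True if every character of `text` can be represented in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(byte_lengths(SAMPLE, "utf-8"))      # variable width
print(byte_lengths(SAMPLE, "utf-32-be"))  # fixed width
print(fits_ascii("Code"), fits_ascii(SAMPLE))
```

Text that fits in ASCII produces the same bytes under UTF-8, which is the backward compatibility property mentioned above.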

Genetic code

Main article: Genetic code

Biological organisms contain genetic material that is used to control their function and development. This is DNA, which contains units named genes from which messenger RNA is derived. Messenger RNA in turn produces proteins through a genetic code in which each triplet (codon) of four possible nucleotides is translated into one of twenty possible amino acids. A sequence of codons results in a corresponding sequence of amino acids that form a protein molecule; a type of codon called a stop codon signals the end of the sequence.
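Translation by codon can be sketched as a table lookup over consecutive triplets. The table below is deliberately a small subset of the standard genetic code (5 of the 61 amino-acid codons plus the three stop codons), and the names `CODON_TABLE` and `translate` are assumptions of this example.

```python
STOP = "Stop"

# Partial codon table (messenger RNA alphabet A, C, G, U); a subset only.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": STOP, "UAG": STOP, "UGA": STOP,
}


def translate(mrna):
    """Read `mrna` three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError for codons outside the subset
        if amino_acid == STOP:
            break
        protein.append(amino_acid)
    return protein


print(translate("AUGUUUGGCUAAAAA"))
```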

Gödel code

In mathematics, a Gödel code is the basis for the proof of Gödel's incompleteness theorem. Here, the idea is to map mathematical notation to a natural number (using a Gödel numbering).
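One classical form of Gödel numbering encodes a sequence of positive integers (standing for symbols) as a product of prime powers: the k-th prime is raised to the k-th number. The sketch below assumes that symbols have already been assigned positive integer codes, which is a hypothetical choice made for illustration; decoding relies on unique prime factorization.

```python
def primes():
    """Yield 2, 3, 5, 7, ... by trial division (fine for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def godel_encode(sequence):
    """Map positive integers s1, s2, ... to 2**s1 * 3**s2 * 5**s3 * ..."""
    number = 1
    for p, s in zip(primes(), sequence):
        if s < 1:
            raise ValueError("symbol codes must be positive integers")
        number *= p ** s
    return number


def godel_decode(number):
    """Recover the sequence by reading off successive prime exponents."""
    if number < 1:
        raise ValueError("expected a positive integer")
    sequence = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        if exponent == 0:
            raise ValueError("not the code of a sequence of positive integers")
        sequence.append(exponent)
    return sequence


print(godel_encode([3, 1, 2]))
print(godel_decode(600))
```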

Other

There are codes using colors, such as traffic lights, the color code that marks the nominal value of electrical resistors, and the colors of trashcans devoted to specific types of garbage (paper, glass, organic, etc.).

In marketing, coupon codes can be used for a financial discount or rebate when purchasing a product from a (usually internet) retailer.

In military environments, specific cornet calls are used for different purposes: to mark certain moments of the day, to command the infantry on the battlefield, etc.

Communication systems for sensory impairments, such as sign language for deaf people and braille for blind people, are based on movement or tactile codes.

Musical scores are the most common way to encode music.

Specific games have their own code systems to record the matches, e.g. chess notation.
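As one concrete instance of the color codes mentioned above, the resistor color code maps ten colors to the digits 0 to 9. A sketch for the common three-band reading (two significant digits and a power-of-ten multiplier) might look like this; the function name is hypothetical, and tolerance bands and the gold and silver multipliers are left out.

```python
# Standard digit colors of the resistor color code.
COLOR_DIGITS = {
    "black": 0, "brown": 1, "red": 2, "orange": 3, "yellow": 4,
    "green": 5, "blue": 6, "violet": 7, "grey": 8, "white": 9,
}


def resistor_ohms(band1, band2, multiplier):
    """Nominal resistance in ohms from two digit bands and a multiplier band."""
    digits = 10 * COLOR_DIGITS[band1] + COLOR_DIGITS[band2]
    return digits * 10 ** COLOR_DIGITS[multiplier]


print(resistor_ohms("yellow", "violet", "red"))
```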

Cryptography

In the history of cryptography, codes were once common for ensuring the confidentiality of communications, although ciphers are now used instead.

Secret codes intended to obscure the real messages, ranging from serious (mainly espionage in military, diplomacy, business, etc.) to trivial (romance, games), can be any kind of imaginative encoding: flowers, game cards, clothes, fans, hats, melodies, birds, etc. The sole requirement is that the sender and the receiver agree on the meaning in advance.

Codes and acronyms

Acronyms and abbreviations can be considered codes, and in a sense, all languages and writing systems are codes for human thought.

International Air Transport Association airport codes are three-letter codes used to designate airports and used for bag tags. Station codes are similarly used on railways but are usually national, so the same code can be used for different stations if they are in different countries.

Occasionally, a code word achieves an independent existence (and meaning) while the original equivalent phrase is forgotten or at least no longer has the precise meaning attributed to the code word. For example, '30' was widely used in journalism to mean "end of story", and has been used in other contexts to signify "the end".

References

  1. Kogan, Hadass "So Why Not 29" Archived 2010-12-12 at the Wayback Machine American Journalism Review. Retrieved 2012-07-03.
  2. "Western Union "92 Code" & Wood's "Telegraphic Numerals"". Signal Corps Association. 1996. Archived from the original on 2012-05-09. Retrieved 2012-07-03.

Further reading

  • Codes and Abbreviations for the Use of the International Telecommunication Services (2nd ed.). Geneva, Switzerland: International Telecommunication Union. 1963. OCLC 13677884.