Revision as of 07:49, 10 August 2007 by 84.58.20.210 (edit summary: "This article is totally revised by the author") | Latest revision as of 23:03, 27 February 2024 by Citation bot (edit summary: "Add: date, authors 1-1.")
(51 intermediate revisions by 35 users not shown)
Latest revision as of 23:03, 27 February 2024
SDXF (Structured Data eXchange Format) is a data serialization format defined by RFC 3072. It allows arbitrary structured data of different types to be assembled in one file for exchange between arbitrary computers.
The ability to serialize arbitrary data into a self-describing format is reminiscent of XML, but unlike XML, SDXF is a binary format rather than a text format, so it cannot be edited with a text editor. The maximum length of a datum (composite or elementary) encoded using SDXF is 16777215 bytes (one less than 16 MiB).
Technical structure format
SDXF data can express arbitrary levels of structural depth. Data elements are self-documenting, meaning that the metadata (numeric, character string or structure) are encoded into the data elements. The design of this format is simple and transparent: computer programs access SDXF data with the help of well-defined functions, exempting programmers from learning the precise data layout.
The word "exchange" in the name reflects another kind of transparency: the SDXF functions provide a computer architecture independent conversion of the data. Serializations can be exchanged among computers (via direct network, file transfer or CD) without further measures. The SDXF functions on the receiving side handle architectural adaptation.
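The idea of architecture-independent conversion can be illustrated with a minimal Python sketch. It assumes a fixed big-endian (network) byte order on the wire; the helper names `encode_u32` and `decode_u32` are hypothetical and are not part of the SDXF function set.

```python
import struct
import sys


def encode_u32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer in big-endian byte order."""
    return struct.pack(">I", value)


def decode_u32(data: bytes) -> int:
    """Decode a big-endian unsigned 32-bit integer on any host."""
    return struct.unpack(">I", data)[0]


wire = encode_u32(123456)
# The wire bytes are the same whether the host is little- or big-endian.
print(sys.byteorder, wire.hex(), decode_u32(wire))
```

Because the byte order is fixed by the format rather than by the host, sender and receiver need no further negotiation.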
Structured data is data with predictable patterns more complex than strings of text.
Example
A commercial example: two companies want to exchange digital invoices. The invoices have the following hierarchical nested structure:
INVOICE
│
├─ INVOICE_NO
├─ DATE
├─ ADDRESS_SENDER
│  ├─ NAME
│  ├─ NAME
│  ├─ STREET
│  ├─ ZIP
│  ├─ CITY
│  └─ COUNTRY
├─ ADDRESS_RECIPIENT
│  ├─ NAME
│  ├─ NAME
│  ├─ STREET
│  ├─ ZIP
│  ├─ CITY
│  └─ COUNTRY
├─ INVOICE_SUM
├─ SINGLE_ITEMS
│  ├─ SINGLE_ITEM
│  │  ├─ QUANTITY
│  │  ├─ ITEM_NUMBER
│  │  ├─ ITEM_TEXT
│  │  ├─ CHARGE
│  │  └─ SUM
│  └─ ...
├─ CONDITIONS
...
Structure
The basic element is a chunk. An SDXF serialization is itself a chunk. A chunk can consist of a set of smaller chunks. Chunks are composed of a header prefix of six bytes, followed by data. The header contains a chunk identifier as a 2-byte binary number (Chunk_ID), the chunk length and type. It may contain additional information about compression, encryption and more.
The chunk type indicates whether the data consists of text (a string of characters), a binary number (integer or floating point) or a composite of other chunks.
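A six-byte header of this kind could be packed and unpacked as in the following Python sketch. The field order (2-byte ID, 1-byte type, 3-byte length), the big-endian byte order and the numeric type codes are assumptions made for illustration; the 3-byte length is inferred from the 16777215-byte limit, and the authoritative layout is the one in RFC 3072.

```python
import struct

HEADER_SIZE = 6
MAX_LENGTH = (1 << 24) - 1  # 16777215, the largest value a 3-byte length can hold

# Hypothetical type codes, used only in this sketch.
TYPE_STRUCTURED, TYPE_NUMERIC, TYPE_CHAR = 1, 3, 4


def pack_header(chunk_id: int, type_code: int, length: int) -> bytes:
    """Build a 6-byte header: 2-byte ID, 1-byte type, 3-byte length."""
    if not 1 <= chunk_id <= 0xFFFF:
        raise ValueError("chunk ID must fit in 2 bytes and be non-zero")
    if not 0 <= length <= MAX_LENGTH:
        raise ValueError("length must fit in 3 bytes")
    return struct.pack(">HB", chunk_id, type_code) + length.to_bytes(3, "big")


def unpack_header(header: bytes) -> tuple[int, int, int]:
    """Split a 6-byte header back into (chunk_id, type_code, length)."""
    chunk_id, type_code = struct.unpack(">HB", header[:3])
    length = int.from_bytes(header[3:HEADER_SIZE], "big")
    return chunk_id, type_code, length


header = pack_header(7, TYPE_CHAR, 10)
print(header.hex(), unpack_header(header))
```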
Structured chunks enable the programmer to pack hierarchical constructions such as the INVOICE above into an SDXF structure as follows: Every named term (INVOICE, INVOICE_NO, DATE, ADDRESS_SENDER, etc.) is given a unique number in the range 1 to 65535 (a 2-byte unsigned binary integer). The outermost chunk is constructed with the ID INVOICE (that is, with the associated numerical Chunk_ID) as a structured chunk on level 1. This INVOICE chunk is filled with other chunks on level 2 and beyond: INVOICE_NO, DATE, ADDRESS_SENDER, ADDRESS_RECIPIENT, INVOICE_SUM, SINGLE_ITEMS, CONDITIONS. Some level 2 chunks are structured in turn, such as the two addresses and SINGLE_ITEMS.
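The numbering and nesting described here can be sketched in Python as follows. The specific ID values, the type codes, the 4-byte integer encoding and the header field order are all assumptions chosen for illustration, not values taken from RFC 3072.

```python
from enum import IntEnum


class ChunkId(IntEnum):
    """Hypothetical numbering of the named terms (any unique values in 1..65535)."""
    INVOICE = 1
    INVOICE_NO = 2
    DATE = 3
    ADDRESS_SENDER = 4
    NAME = 5
    COUNTRY = 6


STRUCTURED, NUMERIC, CHAR = 1, 3, 4  # hypothetical type codes


def encode(chunk_id: int, value) -> bytes:
    """Encode a value as a chunk; a list of (id, value) pairs becomes a structured chunk."""
    if isinstance(value, list):
        type_code = STRUCTURED
        body = b"".join(encode(cid, v) for cid, v in value)
    elif isinstance(value, int):
        type_code = NUMERIC
        body = value.to_bytes(4, "big", signed=True)
    else:
        type_code = CHAR
        body = value.encode("utf-8")
    if len(body) > 0xFFFFFF:
        raise ValueError("chunk body exceeds 16777215 bytes")
    header = int(chunk_id).to_bytes(2, "big") + bytes([type_code]) + len(body).to_bytes(3, "big")
    return header + body


invoice = encode(ChunkId.INVOICE, [
    (ChunkId.INVOICE_NO, 123456),
    (ChunkId.DATE, "2005-06-17"),
    (ChunkId.ADDRESS_SENDER, [
        (ChunkId.NAME, "Peter Somebody"),
        (ChunkId.COUNTRY, "France"),
    ]),
])
print(len(invoice), invoice[:6].hex())
```

The level 1 chunk simply contains the bytes of its level 2 chunks, each with its own header, so the nesting depth is limited only by the total length.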
For a precise description, see page 2 of the RFC or the format description on Pinpi.com.
SDXF lets programmers work on SDXF structures with a compact set of functions. There are only a few of them:
To read Chunks, the following functions are used:
- init: Initializes the parameter structure and links it to the existing Chunk.
- enter: Steps into a structured Chunk; the first Chunk of this structure is ready to process.
- leave: Leaves the current structure. This structure is already current.
- next: Goes to the next Chunk if one exists (otherwise it leaves the current structure).
- extract: Transfers (and adapts) data from the current Chunk into a program variable.
- select: Searches for the next Chunk with a given Chunk ID and makes it current.
To build Chunks, the following functions are used:
- init: Initializes the parameter structure and links it to an empty output buffer in which a new Chunk is created.
- create: Creates a new Chunk and appends it to the current structure (if one exists).
- append: Appends a complete Chunk to an SDXF structure.
- leave: Leaves the current structure. This structure is already current.
The following pseudocode creates invoices:
init   (sdx, buffersize=1000);                                  // initialize the SDXF parameter structure sdx
create (sdx, ID=INVOICE, datatype=STRUCTURED);                  // start of the main structure
create (sdx, ID=INVOICE_NO, datatype=NUMERIC, value=123456);    // create an elementary Chunk
create (sdx, ID=DATE, datatype=CHAR, value="2005-06-17");       // once more
create (sdx, ID=ADDRESS_SENDER, datatype=STRUCTURED);           // Substructure
create (sdx, ID=NAME, datatype=CHAR, value="Peter Somebody");   // element. Chunk inside this substructure
...
create (sdx, ID=COUNTRY, datatype=CHAR, value="France");        // the last one inside this substructure
leave;                                                          // closing the substructure ADDRESS_SENDER
...
leave;                                                          // closing the substructure INVOICE
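The create/leave pattern of this pseudocode can be mimicked with a small runnable Python sketch. The `Builder` class, its method signatures, the type codes and the header layout (2-byte ID, 1-byte type, 3-byte length, big-endian) are hypothetical stand-ins for illustration and should not be read as the SDXF library interface.

```python
STRUCTURED, NUMERIC, CHAR = 1, 3, 4  # hypothetical type codes


class Builder:
    """Toy builder: structured chunks stay open on a stack until leave() closes them."""

    def __init__(self):
        self.stack = []     # open structures as (chunk_id, list of encoded children)
        self.result = None  # the finished outermost chunk

    def create(self, chunk_id, datatype, value=None):
        if datatype == STRUCTURED:
            self.stack.append((chunk_id, []))  # open a new structure
            return
        if datatype == NUMERIC:
            body = value.to_bytes(4, "big", signed=True)
        else:
            body = value.encode("utf-8")
        self._emit(self._chunk(chunk_id, datatype, body))

    def leave(self):
        chunk_id, children = self.stack.pop()  # close the innermost open structure
        self._emit(self._chunk(chunk_id, STRUCTURED, b"".join(children)))

    def _emit(self, chunk):
        if self.stack:
            self.stack[-1][1].append(chunk)  # append to the enclosing structure
        else:
            self.result = chunk

    @staticmethod
    def _chunk(chunk_id, datatype, body):
        return chunk_id.to_bytes(2, "big") + bytes([datatype]) + len(body).to_bytes(3, "big") + body


INVOICE, INVOICE_NO, DATE = 1, 2, 3  # hypothetical chunk IDs
sdx = Builder()
sdx.create(INVOICE, STRUCTURED)
sdx.create(INVOICE_NO, NUMERIC, 123456)
sdx.create(DATE, CHAR, "2005-06-17")
sdx.leave()
print(len(sdx.result), sdx.result.hex())
```

Deferring the header of a structured chunk until `leave()` is one simple way to know its length; a real implementation writing into a fixed buffer might instead reserve the header and patch the length afterwards.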
Pseudocode to extract the INVOICE structure could look like:
init  (sdx, container=pointer to an SDXF-structure);   // initialize the SDXF parameter structure sdx
enter (sdx);                                           // join into the INVOICE structure.
//
while (sdx.rc == SDX_RC_ok)
{
    switch (sdx.Chunk_ID)
    {
        case INVOICE_NO:
            extract (sdx);
            invno = sdx.value;             // the extract function put integer values into the parameter field 'value'
            break;
        //
        case DATE:
            extract (sdx);
            strcpy (invdate, sdx.data);    // sdx.data is a pointer to the extracted character string
            break;
        //
        case ADDRESS_SENDER:
            enter (sdx);                   // we use 'enter' because ADDRESS is a structured Chunk
            do while (sdx.rc == SDX_RC_ok) // inner loop
            ...
            break;
        ...
    }
}
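For comparison, a runnable Python sketch of the same walk is given below: it iterates over the chunks in a buffer and descends into structured chunks, much as the enter/extract steps do in the pseudocode. The header layout, the type codes, the chunk IDs and the helper names (`make_chunk`, `iter_chunks`, `decode_chunks`) are assumptions for illustration only.

```python
STRUCTURED, NUMERIC, CHAR = 1, 3, 4  # hypothetical type codes


def make_chunk(chunk_id, datatype, body):
    """Assumed layout: 2-byte ID, 1-byte type, 3-byte length, then the data."""
    return chunk_id.to_bytes(2, "big") + bytes([datatype]) + len(body).to_bytes(3, "big") + body


def iter_chunks(buf):
    """Yield (chunk_id, datatype, body) for each chunk laid end to end in buf."""
    pos = 0
    while pos < len(buf):
        chunk_id = int.from_bytes(buf[pos:pos + 2], "big")
        datatype = buf[pos + 2]
        length = int.from_bytes(buf[pos + 3:pos + 6], "big")
        yield chunk_id, datatype, buf[pos + 6:pos + 6 + length]
        pos += 6 + length


def decode_chunks(buf):
    """Decode a buffer into nested (chunk_id, value) pairs, recursing into structures."""
    out = []
    for chunk_id, datatype, body in iter_chunks(buf):
        if datatype == STRUCTURED:
            out.append((chunk_id, decode_chunks(body)))  # comparable to 'enter'
        elif datatype == NUMERIC:
            out.append((chunk_id, int.from_bytes(body, "big", signed=True)))
        else:
            out.append((chunk_id, body.decode("utf-8")))
    return out


INVOICE, INVOICE_NO, DATE = 1, 2, 3  # hypothetical chunk IDs
sample = make_chunk(
    INVOICE,
    STRUCTURED,
    make_chunk(INVOICE_NO, NUMERIC, (123456).to_bytes(4, "big", signed=True))
    + make_chunk(DATE, CHAR, b"2005-06-17"),
)
decoded = decode_chunks(sample)
print(decoded)
```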
SDXF is not designed for readability or to be modified by text editors. A related editable structure is SDEF - Structured Data Editable Format.
See also
- External Data Representation
- Protocol Buffers
- Abstract Syntax Notation One
- Apache Thrift
- Etch (protocol)
- Internet Communications Engine
- Comparison of data serialization formats
References
- Wildgrube, Max (March 2001). "RFC-3072".
- It may be argued that "structured" is used here in the same sense as in structured programming — like there are no gotos in a (strictly) structured program, there are no pointers/references in SDXF. This need not be how the name arose, however.
- "SDXF - 2. Description of the SDXF Format". Pinpi.com. Retrieved 2013-09-10.
- "6.3 The Project PRNT: a complete example". PINPI. Retrieved 2013-09-10.
- "SDEF Site (from Archive.org)". Archived from the original on 2016-03-07.