SDXF: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 01:05, 9 December 2007

SDXF stands for "Structured Data eXchange Format", and was published as Internet RFC 3072.

It allows arbitrary structured data of different types to be assembled for exchange between computers of different architectures.

The ability to structure data arbitrarily and serialize it into a self-describing format is reminiscent of XML, but unlike XML, SDXF is not a text format: you cannot manipulate an SDXF structure with a text editor.

Technical structure format

SDXF data can be nested to an arbitrary depth. The individual data elements are self-describing: the type of the data (numeric, character string or structure) is encoded in the data element itself. The format is very simple, yet transparent: programs access SDXF data through well-defined functions, so the programmer does not need to care about the precise layout of the data.

The word "exchange" in the name SDXF reflects another kind of transparency: the SDXF functions convert the data independently of the computer architecture, so the composed structure can be passed to other computers (via direct network connection, file transfer or CD) without further measures. The SDXF functions on the receiving side take care of the correct adaptation.

What exactly does "structured data" mean? It is in fact difficult to find an example of non-structured data: even ordinary text is structured into lines, paragraphs, etc. In this context, however, the structures have to be recognized and processed by the programs concerned. Here is an example from the commercial area: two companies want to exchange their invoices electronically. An invoice is then an electronic document with the following hierarchically nested structure:

INVOICE
│
├─ INVOICE_NO  
├─ DATE
├─ ADDRESS_SENDER
│    ├─ NAME
│    ├─ NAME
│    ├─ STREET
│    ├─ ZIP
│    ├─ CITY
│    └─ COUNTRY
├─ ADDRESS_RECIPIENT
│    ├─ NAME
│    ├─ NAME
│    ├─ STREET
│    ├─ ZIP
│    ├─ CITY
│    └─ COUNTRY
├─ INVOICE_SUM
├─ SINGLE_ITEMS
│    ├─ SINGLE_ITEM
│    │    ├─ QUANTITY
│    │    ├─ ITEM_NUMBER
│    │    ├─ ITEM_TEXT
│    │    ├─ CHARGE
│    │    └─ SUM
│    └─ ...           
├─ CONDITIONS
...

Similar and more complex structures have to be covered by SDXF. How is an SDXF structure built up?

The basic building block is a "Chunk". The whole SDXF structure is a chunk, and a chunk can consist of a set of chunks. A chunk itself has a very simple construction: it is composed of a header prefix of six bytes, followed by the actual data. The header contains the identification of the chunk as a 2-byte binary number (Chunk_ID), the length, and a specification of the type of the data that follows. It may also contain additional information about compression, encryption and other properties.

The type specification tells whether the data consists of text (a string of characters), represents a binary number (an integer or a floating-point value), or whether the chunk is built up from a sequence of other chunks (giving a structured chunk).
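
The header layout described above can be sketched in a few lines of Python. This is only an illustration: the text states that the header has six bytes and a 2-byte Chunk_ID, while the split of the remaining bytes into a 1-byte type field and a 3-byte length, the big-endian byte order, and the numeric type codes are assumptions made for this sketch. The names `pack_chunk` and `unpack_header` are hypothetical and not part of any SDXF library.

```python
import struct

# Illustrative type codes (assumption); the authoritative values are in RFC 3072.
TYPE_STRUCTURED = 1
TYPE_NUMERIC = 3
TYPE_CHAR = 4

HEADER_SIZE = 6  # header prefix of six bytes, as stated in the text


def pack_chunk(chunk_id, data_type, payload):
    """Prefix payload with an assumed header: ID (2 bytes), type (1), length (3)."""
    if not 1 <= chunk_id <= 0xFFFF:
        raise ValueError("chunk ID must fit in an unsigned 2-byte integer")
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload too long for a 3-byte length field")
    header = struct.pack(">HB", chunk_id, data_type) + len(payload).to_bytes(3, "big")
    return header + payload


def unpack_header(chunk):
    """Return (chunk_id, data_type, length) read from the first six bytes."""
    chunk_id, data_type = struct.unpack(">HB", chunk[:3])
    length = int.from_bytes(chunk[3:HEADER_SIZE], "big")
    return chunk_id, data_type, length


if __name__ == "__main__":
    example = pack_chunk(2, TYPE_CHAR, b"2005-06-17")
    print(unpack_header(example))
```

Because the length is stored in the header, a reader can skip over a chunk it does not understand without interpreting its data.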

Structured chunks in particular enable the programmer to pack hierarchical constructions such as the INVOICE above into an SDXF structure in a simple way. In a first step, every named term (INVOICE, INVOICE_NO, DATE, ADDRESS_SENDER, etc.) is assigned a unique number in the range 1 to 65535 (an unsigned 2-byte binary integer). Next, a first chunk is constructed with the ID INVOICE (that is, with the associated numerical Chunk_ID) as a structured chunk on level 1. This INVOICE chunk is then filled step by step with a sequence of other chunks on level 2: INVOICE_NO, DATE, ADDRESS_SENDER, ADDRESS_RECIPIENT, INVOICE_SUM, SINGLE_ITEMS, CONDITIONS. Some of these level 2 chunks are structured in turn, namely the two addresses and SINGLE_ITEMS. The date can be specified either in text format, e.g. in the form YYYY-MM-DD (as standardized in ISO 8601), or as a structure consisting of the three numerical chunks YEAR, MONTH, DAY.
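
As a rough illustration of the first step, the ID assignment and the two ways of representing the date could be modelled as follows. The concrete numbers, the class name `ChunkID`, the helper functions and the tuple-based model of a chunk are all hypothetical choices for this sketch; SDXF itself only requires that each ID is unique and lies in the range 1 to 65535.

```python
from enum import IntEnum, unique


@unique  # rejects two names that share the same number
class ChunkID(IntEnum):
    """Hypothetical numbering; any unique values in 1..65535 would do."""
    INVOICE = 1
    INVOICE_NO = 2
    DATE = 3
    YEAR = 4
    MONTH = 5
    DAY = 6


def check_ids(ids):
    """Verify that every ID fits in an unsigned 2-byte integer and is not zero."""
    for member in ids:
        if not 1 <= member.value <= 65535:
            raise ValueError(f"{member.name} is outside 1..65535")


# A chunk is modelled here as (chunk_id, value); a list value means "structured".
date_as_text = (ChunkID.DATE, "2005-06-17")  # ISO 8601 text
date_as_structure = (
    ChunkID.DATE,
    [(ChunkID.YEAR, 2005), (ChunkID.MONTH, 6), (ChunkID.DAY, 17)],
)

invoice = (ChunkID.INVOICE, [(ChunkID.INVOICE_NO, 123456), date_as_structure])


def depth(chunk):
    """Nesting level of a chunk; an elementary chunk has level 1."""
    _, value = chunk
    if isinstance(value, list):
        return 1 + max((depth(child) for child in value), default=0)
    return 1


if __name__ == "__main__":
    check_ids(ChunkID)
    print(depth(invoice))
```

With the structured date, the invoice reaches three levels (INVOICE, DATE, YEAR); with the text date it would have only two.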

For a precise description see page 2 of the RFC or alternatively (http://pinpi.com/en/SDXF_2.htm)

The SDXF concept also requires that the programmer works on SDXF structures through a well-defined set of functions. There are only a few of them:


To read Chunks, the following functions are used:
init: Initializes the parameter structure and links it to the existing Chunk.
enter: Steps into a structured Chunk; the first Chunk of this structure is then ready to be processed.
leave: Leaves the current structure. This structure is already current.
next: Goes to the next Chunk if one exists (otherwise it leaves the current structure).
extract: Transfers (and adapts) data from the current Chunk into a program variable.
select: Searches for the next Chunk with a given Chunk ID and makes it current.

To build Chunks, the following functions are used:
init: Initializes the parameter structure and links it to an empty output buffer in which a new Chunk is created.
create: Creates a new Chunk and appends it to the current structure (if one exists).
append: Appends a complete Chunk to an SDXF structure.
leave: Leaves the current structure. This structure is already current.

For the invoice example above, a program that creates the structure might look like the following code:

 init (sdx, buffersize=1000);   // initialize the SDXF parameter structure sdx
 create (sdx, ID=INVOICE, datatype=STRUCTURED); // start of the main structure
 create (sdx, ID=INVOICE_NO, datatype=NUMERIC, value=123456); // create an elementary Chunk
 create (sdx, ID=DATE, datatype=CHAR, value="2005-06-17"); // once more
 create (sdx, ID=ADDRESS_SENDER, datatype=STRUCTURED); // substructure
 create (sdx, ID=NAME, datatype=CHAR, value="Peter Somebody"); // elementary Chunk inside this substructure
 ...
 create (sdx, ID=COUNTRY, datatype=CHAR, value="France"); // the last one inside this substructure
 leave (sdx); // closing the substructure ADDRESS_SENDER
 ...
 leave (sdx); // closing the main structure INVOICE

The syntax of this code is fictitious, to simplify matters; for a complete example in the programming language C, look here.
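
To make the create/leave sequence more concrete, here is a minimal runnable sketch in Python. It is not the real SDXF API: the class `Builder`, its method `result`, the chunk IDs, the type codes and the assumed header layout (2-byte ID, 1-byte type, 3-byte length, big-endian) are all illustrative assumptions. The sketch only shows how a stack of open structures lets `leave` close the innermost structure and fill in its length.

```python
import struct

TYPE_STRUCTURED, TYPE_NUMERIC, TYPE_CHAR = 1, 3, 4  # illustrative type codes
INVOICE, INVOICE_NO, DATE = 1, 2, 3                 # hypothetical chunk IDs


def _header(chunk_id, data_type, length):
    """Assumed 6-byte header: ID (2 bytes), type (1 byte), length (3 bytes)."""
    return struct.pack(">HB", chunk_id, data_type) + length.to_bytes(3, "big")


class Builder:
    """Toy counterpart of init/create/leave; not the real SDXF functions."""

    def __init__(self):
        # Each stack entry is [chunk_id, list of already encoded child chunks].
        self._stack = [[None, []]]

    def create(self, chunk_id, data_type, value=None):
        if data_type == TYPE_STRUCTURED:
            self._stack.append([chunk_id, []])  # becomes the current structure
            return
        if data_type == TYPE_NUMERIC:
            payload = struct.pack(">i", value)  # 4-byte big-endian integer
        else:
            payload = value.encode("utf-8")
        encoded = _header(chunk_id, data_type, len(payload)) + payload
        self._stack[-1][1].append(encoded)

    def leave(self):
        if len(self._stack) == 1:
            raise RuntimeError("no open structure to leave")
        chunk_id, children = self._stack.pop()
        payload = b"".join(children)
        encoded = _header(chunk_id, TYPE_STRUCTURED, len(payload)) + payload
        self._stack[-1][1].append(encoded)

    def result(self):
        if len(self._stack) != 1:
            raise RuntimeError("unclosed structure")
        return b"".join(self._stack[0][1])


sdx = Builder()
sdx.create(INVOICE, TYPE_STRUCTURED)
sdx.create(INVOICE_NO, TYPE_NUMERIC, 123456)
sdx.create(DATE, TYPE_CHAR, "2005-06-17")
sdx.leave()  # closes INVOICE and computes its length
encoded = sdx.result()
```

The length of a structured chunk is only known once all of its children have been created, which is why the header of INVOICE is written in `leave` rather than in `create`.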

Code to read out the INVOICE structure looks like this:

 init (sdx, container=pointer to an SDXF-structure);   // initialize the SDXF parameter structure sdx
 enter (sdx); // step into the INVOICE structure
 while (sdx.rc == SDX_RC_ok)
 {
     switch (sdx.Chunk_ID)
     {
         case INVOICE_NO:
           extract (sdx);
           invno = sdx.value;  // the extract function puts integer values into the parameter field 'value'
           break;
         case DATE:
           extract (sdx);
           strcpy (invdate, sdx.data); // sdx.data is a pointer to the extracted character string
           break;
         case ADDRESS_SENDER:
           enter (sdx);  // we use 'enter' because ADDRESS_SENDER is a structured Chunk
           while (sdx.rc == SDX_RC_ok) // inner loop
            ...
           break;
        ...
     }
     next (sdx);  // go to the next Chunk (leaves the current structure if there is none)
 }
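
The reading side can be sketched in the same spirit. The following Python code is an assumption-laden illustration, not the SDXF library: the header layout (2-byte ID, 1-byte type, 3-byte length, big-endian), the type codes, the chunk IDs and the function names `chunk`, `iter_chunks` and `extract` are all hypothetical. Stepping to the next chunk corresponds to "next", and recursing into a structured chunk corresponds to "enter".

```python
import struct

TYPE_STRUCTURED, TYPE_NUMERIC, TYPE_CHAR = 1, 3, 4  # illustrative type codes
INVOICE, INVOICE_NO, DATE = 1, 2, 3                 # hypothetical chunk IDs


def chunk(chunk_id, data_type, payload):
    """Encode one chunk with the assumed 6-byte header (used to build test data)."""
    header = struct.pack(">HB", chunk_id, data_type) + len(payload).to_bytes(3, "big")
    return header + payload


def iter_chunks(buffer):
    """Yield (chunk_id, data_type, payload) for each chunk on one nesting level."""
    pos = 0
    while pos < len(buffer):
        chunk_id, data_type = struct.unpack_from(">HB", buffer, pos)
        length = int.from_bytes(buffer[pos + 3:pos + 6], "big")
        yield chunk_id, data_type, buffer[pos + 6:pos + 6 + length]
        pos += 6 + length  # corresponds to 'next'


def extract(data_type, payload):
    """Convert a payload into a Python value, recursing into structures ('enter')."""
    if data_type == TYPE_STRUCTURED:
        return {cid: extract(dt, body) for cid, dt, body in iter_chunks(payload)}
    if data_type == TYPE_NUMERIC:
        return struct.unpack(">i", payload)[0]
    return payload.decode("utf-8")


message = chunk(
    INVOICE,
    TYPE_STRUCTURED,
    chunk(INVOICE_NO, TYPE_NUMERIC, struct.pack(">i", 123456))
    + chunk(DATE, TYPE_CHAR, b"2005-06-17"),
)

invoice_id, invoice_type, invoice_body = next(iter_chunks(message))
invoice = extract(invoice_type, invoice_body)
```

Returning a dictionary keeps the sketch short, but it would keep only the last of several chunks with the same ID (such as repeated SINGLE_ITEM chunks); a list of pairs would preserve them.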

As the example shows, the individual SDXF Chunks are constructed step by step. At first glance this appears somewhat cumbersome, but:

  1. This reflects exactly the programmer's everyday situation: the individual elements are distributed over various program variables or database fields, and in the case of external data these elements are not all accessible at the same time.
  2. Step-by-step processing ensures that the data in a structure of mixed data types is transformed to a normalized form, so that the resulting structure can be transferred from one computer to another without further adaptation, even if these computers differ in their architecture. The problems of differing data representation concern the representation of characters and the byte order (the byte swapping problem).
    With the use of the SDXF functions the application programmer is completely freed from these problems.
  3. Encryption (e.g. AES) and compression (e.g. ZIP) are handled by the SDXF functions too. It is also possible to encrypt or compress only parts of a structure. When Chunks are extracted, decompression and decryption are done automatically (if the encryption key matches).
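
The byte swapping problem mentioned in point 2 can be illustrated with Python's `struct` module. The sketch assumes that the normalized form stores integers in big-endian ("network") byte order; that choice and the example value are assumptions for illustration, not a statement about the exact SDXF encoding.

```python
import struct
import sys

value = 123456

native = struct.pack("=i", value)      # byte order of the machine running this code
little = struct.pack("<i", value)      # as stored on a little-endian architecture
normalized = struct.pack(">i", value)  # assumed normalized (big-endian) form

# The same number yields different byte sequences depending on the architecture.
print(sys.byteorder, native.hex(), little.hex(), normalized.hex())

# A receiver on any architecture decodes the normalized form to the same value.
decoded = struct.unpack(">i", normalized)[0]
```

Reading the little-endian bytes as if they were big-endian would give a different number, which is the kind of error a normalizing exchange format avoids.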

SDXF is not designed to be read by humans or modified with text editors. For those who wish to do so, a special format named SDEF is available, which reflects an SDXF structure one to one.
