
GENEDI Generic EDI toolbox

Version 2.1

User Guide

European Commission – Eurostat User Guide

© European Commission 2005 Page 2

Table of contents

1 Introduction
  1.1 Objectives
  1.2 Audience
  1.3 Outline
  1.4 Terminology
2 Application overview
  2.1 Introduction
    2.1.1 Import Process
    2.1.2 Mapping Process
    2.1.3 Archive Process
    2.1.4 Validation Process
    2.1.5 Sending Process
  2.2 File description
  2.3 System requirement
3 Getting started
  3.1 Running GENEDI
  3.2 GENEDI layout
    3.2.1 Process Page
    3.2.2 Configure Page
4 File Management
5 Action Options
6 View Options
  6.1 Session Log
  6.2 Select File
  6.3 Validation Logs
7 Configuration
  7.1 Global Configuration
  7.2 Domain Configuration
  7.3 Dataset Configuration
  7.4 Statistical Concept Configuration
  7.5 Encryption Settings
8 Updates
  8.1 Update configuration files
  8.2 Import
9 Command Line
  9.1 batch
  9.2 command <command>
  9.3 import
  9.4 help
APPENDIX A: Error and warning description
APPENDIX B: GESMES format
APPENDIX C: Flat file format
APPENDIX D: Quick help to add/customise statistical domains in GENEDI
APPENDIX E: Mapping configuration file
APPENDIX F: Configuration files
APPENDIX G: Validation file
  File Structure
  Introduction to the generic advanced validation rules
    Generic rule n°1
    Generic rule n°2
    Generic rule n°3
    Generic rule n°4
    Generic rule n°5
    Generic rule n°6
    Generic rule n°7
    Generic rule n°8
    Generic rule n°9
    Generic rule n°10, 10b and 10c
    Generic rule n°11
    Generic rule n°12
    Generic rule n°13
    Generic rule n°14
    Generic rule n°15
    Generic rule n°16
    Generic rule n°17
    Generic rule n°18
    Generic rule n°19
    Generic rule n°20
    Generic rule n°21
    Generic rule A
    Generic rule C
    Generic rule D
APPENDIX H: Codelists
  General format
  Code list with hidden codes
  Code list with associated codes or complementary codes
APPENDIX I: Transcoding tables and Transcoding rules
APPENDIX J: Hide codes configuration file
APPENDIX K: Dataset Naming Convention


1 Introduction

1.1 Objectives

This document aims to supply Competent National Authorities (CNA) with guidance on the use of the Generic EDI Toolbox (GENEDI) to automatically reorganise, validate, convert into GESMES/TS format, and send statistical data to Eurostat. CNA users who do not have specific computing skills should be able to operate GENEDI with the help of this document.

GENEDI has been designed to operate on a wide variety of systems, including MS Windows and Unix. Throughout this document, explanations are given for both types of operating system. Note that this document does not describe features specific to an operating system; it covers only the functionality included in GENEDI.

1.2 Audience

The guide is addressed to GENEDI users who will be responsible for installing GENEDI and using it to send statistical data to Eurostat. As a rule, using GENEDI does not require any particular computing or IT skills, apart from familiarity with user applications.

1.3 Outline

The document deals with the following subjects:

• Application overview: Describes GENEDI functionality and usage.

• Getting started: Provides a quick introduction to the commonly used features of GENEDI.

• File Management: Describes the management of files.

• Process: Describes the “Process” scenario (Pre-process, Validate, Create, Send).

• View: Describes the “Session log”, “Select file” and “View validation log” features.

• Configuration: Describes the configuration process (Global, Domain, Dataset, Statistical concept configuration).

• Domain Management: Describes the management of domains (Import, Export, Hide/Show).

1.4 Terminology

CIRCA      Communication & Information Resource Center Administrator
CNA        Competent National Authority
CSV        Comma Separated Value
EDI        Electronic Data Interchange
GESMES     Generic Statistical MESsage
GESMES/TS  Subset of GESMES for time series data
IDA        Interchange of Data between Administrations
FLR        Fixed Length Record
MIG        Message Implementation Guide
NSI        National Statistical Institute


2 Application overview

2.1 Introduction

GENEDI has been designed to support all the tasks related to the transmission of statistical data to Eurostat, including reorganisation and validation of the input file, conversion into a GESMES formatted file and automatic transmission to Eurostat.

GENEDI is in essence a generic toolbox: it can process any flat input file, because an optional mapping module allows users to make their files compliant with any of the implemented GESMES dataset structures. The mapping module generates a CSV file compliant with the GESMES structure that the user selects from a list of available dataset structures. Additionally, GENEDI can process multiple files and offers a Batch mode.

A CSV file compliant with the selected dataset format is then submitted to the toolbox. An automatic process (which can be disabled) verifies that the data in the file comply with specific validation rules defined in a configuration menu (see “Configuration” for information) and translates them into a GESMES message.

At the end, all GESMES messages are stored in an output folder and can be sent automatically to Eurostat by electronic mail addressed to the STADIUM email server. Optionally, the user can send the GESMES messages to Eurostat encrypted.

Moreover, GENEDI is able to validate data contained in a GESMES file produced by other applications. Note that in that case GENEDI outputs an “easy GESMES” formatted file (see Appendix B for more details).

Finally, in addition to the graphical interface, GENEDI provides command line processing. From the command line the user can import a file, process multiple files or run the Batch command.


[Figure: GENEDI workflow diagram showing the User Input folder, the tray folders (0_PreIntray, 1_Intray, 2_Validated, 3_Gesmes, 4_Sent), the Archive and the User Output folder, together with the processes that link them: Import, Mapping, Validation, CSV to GESMES conversion, GESMES to CSV conversion, Sending and Export.]


2.1.1 Import Process

This process imports the files to be processed by GENEDI, mainly by defining the appropriate metadata such as Domain, Periodicity, etc. (see paragraph 4). Select File -> Import from the menu. To complete the import, you must fill in the Properties Form, which contains information about the file. The file name is built from the details entered in the form, so that the naming conventions are respected.

Alternatively, if there are many files, you can copy them directly into the folders. GENEDI will add the files to the Process Tree, but the properties must then be entered manually.

Note: File extensions are either “.csv”, “.ges” or “.flr”.

2.1.2 Mapping Process

The goal of this process is to create or modify a mapping (correspondence) to apply to input files whose structure is not compliant with the GESMES key family structure (see the GESMES Message Implementation Guide relevant to your statistical domain). The mapping process configuration is accessible from the configuration tree. In addition, during the mapping process the input time format can be transformed to make it GESMES compliant.

2.1.3 Archive Process

This module automatically archives all files that are to be processed.

2.1.4 Validation Process

This process validates files (flat files with the CSV extension, Edifact files with the GES extension) in terms of structure and data. The controls follow the specifications defined by Eurostat:

• GESMES Message Implementation Guide (MIG) for each statistical domain

• Structured flat file EDI solution

• The easy GESMES EDI solution

This module checks only formats, values and code list membership. It does not verify the GESMES syntax.

2.1.5 Sending Process

This process automatically compresses and sends GESMES files (e.g. GES files) to Eurostat.

Note:
- To use Statel, you must install it first. If you do not have Statel and want to use it, contact the Eurostat Help-desk ([email protected]).
- The Disk space type of transmission copies the file to a user-specified folder, which corresponds to the input folder of the STATEL Robot Service (SRS) for automated transmission.


2.2 File description

The Generic EDI toolbox deals with 3 flat file formats:

Format | Meaning | Appendix
CSV | The CSV files are flat text files. This is the standard data format used by GENEDI. | B
FLR | The Fixed Length Record files are flat text files. Each line contains contiguous fixed length blocks of characters. They are automatically converted into a CSV file via the mapping module. | B
GESMES | GESMES files are EDI type files. They can be used as input in the ‘PreIntray’ directory for validation. They are automatically converted into a CSV file before any processing. | A


2.3 System requirement

GENEDI is written in Perl, which runs on most known operating systems. The table below gives a summary list of the platforms and processor families that support the Perl interpreter. Note that this list is not exhaustive; the latest version can be found on the Perl web site (http://www.perl.com).

Family | OS | Processor
Unix | AIX | Aix
Unix | FREEBSD | Free-bsd-i386
Unix | LINUX | i386-linux
Unix | HPUX | PA-RISC1.1
Unix | OSF1 | Alpha_dec_osf
Unix | IRIX | Irix
Unix | SOLARIS | Sun4-solaris / i86pc-solaris
Unix | SUN OS | Sun4-sunos
DOS | MS-DOS / PC-DOS | Dos
DOS | OS/2 | Os2
DOS | MSWIN32 | MSWin32-x86 / MSWin32-alpha / MSWin32-ppc
Mac OS | MACOS | Mac OS
VMS | VMS | VMS
VOS | VOS | VOS
EBCDIC | OS/390 | OS390
EBCDIC | VM/ESA | vmesa
Acorn | RISC OS | riscos

Note: Unix users should add the following Perl modules (if not already installed) for GENEDI to function properly:

• Tk version 804.027 or later
• Archive::Tar
• XML::Parser::Expat
• IO::String
• Compress::Zlib

These modules can be found at the following address: http://search.cpan.org. GENEDI was developed and tested with Perl 5.8.4, so it is recommended to have Perl 5.8.4 or later installed on your Unix system.
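The module check described in the note can be automated. The following is a minimal sketch, not part of GENEDI: it assumes `perl` is on the PATH, uses the standard `-M` switch to try loading each module, and relies on Perl's built-in `VERSION` method to test the Tk minimum version. The XML module is spelled `XML::Parser::Expat` here, the name under which it is published on CPAN. The function names are hypothetical.

```python
import shutil
import subprocess

# Modules named in the note on Unix requirements.
REQUIRED_MODULES = [
    "Tk",
    "Archive::Tar",
    "XML::Parser::Expat",
    "IO::String",
    "Compress::Zlib",
]
TK_MIN_VERSION = "804.027"


def check_command(module: str) -> list[str]:
    """Build a perl command that exits non-zero if `module` is unusable."""
    if module == "Tk":
        # Perl's VERSION method dies when the installed version is too old.
        return ["perl", "-MTk", "-e", f"Tk->VERSION({TK_MIN_VERSION})"]
    return ["perl", f"-M{module}", "-e", "1"]


def missing_modules() -> list[str]:
    """Return the required modules that perl cannot load on this system."""
    if shutil.which("perl") is None:
        raise RuntimeError("perl was not found on the PATH")
    missing = []
    for module in REQUIRED_MODULES:
        result = subprocess.run(check_command(module), capture_output=True)
        if result.returncode != 0:
            missing.append(module)
    return missing
```

Calling `missing_modules()` on the target machine returns an empty list when every module loads; any names it returns still have to be installed from CPAN.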


3 Getting started

3.1 Running GENEDI

1. Running GENEDI on Windows. Activate the “Menu” file in the “Start” menu.

2. Running GENEDI on UNIX. Change to the directory where you installed GENEDI and, on the command line, type:

   perl genedi.pl


3.2 GENEDI layout

When GENEDI is activated, its main window is displayed. This window gives access to all the functions of the Generic EDI toolbox.

Note: When GENEDI is used for the first time, it scans the Tray folders and tries to identify the files present. When an input file respects the Dataset Naming Convention (according to the Eurostat DSNC document), GENEDI automatically creates an entry in the tree structure corresponding to the domain and dataset found in the file name. The user can also import a file manually.


3.2.1 Process Page

The window displayed at start-up is the Process Page. It has the following menu options: File, Process, View, Help (these options are described in paragraphs 4, 5 and 6). The Process Tree appears on the left side of the Process Page.

The Tree contains the following folders: PreIntray, Intray, Validated, Gesmes, Sent and Archive. Each folder contains the files that have reached the corresponding stage of processing.

• PreIntray: Contains the files to be pre-processed. The following extensions are recognized: CSV, FLR, GES and TXT.

• Intray: Contains compliant CSV files. The following extensions are recognized: CSV and TXT.

• Validated: Contains reporting files created during the validation process.

• Gesmes: Contains GESMES output files.

• Sent: Contains sent files.

• Archive: Contains all the files processed. It also contains the archived files and their corresponding log files, as well as the old session log (the session log from the previous GENEDI session).

Each folder, except Archive, organises its files in the following structure: Domain → Dataset → File. This structure is present only when files exist.

• Domain / Dataset: Clicking on a Domain or a Dataset displays its properties on the right side of the Process Page.

• File: Clicking on a file displays the Properties Form, with the file properties, on the right side of the Process Page. The user can modify these properties.

Note: Modifying one or more file properties changes the name of the file, since the file properties are used to name the file.
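The extension rules for the two input folders can be summarised in a few lines. This is an illustrative sketch only, not GENEDI code: the function name is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import PurePath

# Extensions listed in this section for the two input folders.
RECOGNIZED = {
    "PreIntray": {".csv", ".flr", ".ges", ".txt"},
    "Intray": {".csv", ".txt"},
}


def is_recognized(folder: str, filename: str) -> bool:
    """Return True if `filename` has an extension recognized in `folder`."""
    allowed = RECOGNIZED.get(folder, set())
    # Assumption: upper- and lower-case extensions are treated alike.
    return PurePath(filename).suffix.lower() in allowed
```

For example, `is_recognized("PreIntray", "data.flr")` is true, while the same file would not be recognized in the Intray, which accepts only compliant CSV content.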


3.2.2 Configure Page

The Configure Page has the following menu options: File, View, Domain, Updates, Tools, Help (these options are described in paragraphs 7 and 8). The Configuration Tree appears on the left side of the Configure Page. It shows, in a hierarchical view, all the domains, datasets and statistical concepts currently available in GENEDI.


4 File Management

• Import: A browser opens so that the user can select the files to import into GENEDI.

The user can select “.csv”, “.flr” and “.ges” files. If the file name respects the Dataset Naming Convention provided by Eurostat (see Appendix K), GENEDI reads the information in the name and uses it to fill in the Properties Form. GENEDI first tries to identify the domain in the file name and proposes it automatically in the top-left list box. It then tries to identify the dataset ID and fill in the top-right list box automatically, and so on. Otherwise, to complete the import, the user has to fill in the Properties Form manually. The form contains information about the file, and the file name is built from the details entered in it.

In the Properties Form, the user can change the options selected when the file was imported. (The name of the file changes when the form is saved.) Form details:

Domain: Statistical domain (name and description).
Dataset Id: The dataset id (name and description).
Periodicity: Periodicity of the data to be reported. Available options: Annual, Semester, Quarterly, Monthly, Weekly, Daily, Other, Non-periodic.
Country: The country code.
Year: Reporting year of the data.
Period: Precise reporting period of the data.
Action: (Optional) Options: New, Replace, Append.

Note: If a dataset covers several years or periods, only the last year or period must be specified.

Note: The Action option may not be visible for some domains, at the request of the Eurostat Production units. The value selected is used in the STS segment of the GESMES header (see Appendix B).

• Export: The user can save files to another location without deleting them from GENEDI.

• Delete: The user can delete the selected file from GENEDI.

• Save: The user can save changes to a file selected in the Tree.

• Exit: Exits GENEDI.


5 Action Options

• Prepare file: Converts a non-compliant CSV file into a compliant one.

• Validate file: Validates the dataset. If the dataset is non-compliant, runs the Pre-process option first.

• Process file: Pre-processes and validates the dataset, then creates it in GESMES format.

• Send: Pre-processes, validates, creates and sends the dataset. Depending on the channel chosen in the configuration tool (Mail, STATEL or Disk space), GENEDI sends the file using the Mail parameters, the STATEL Nick Name (SNN) or the Destination folder respectively. To use STATEL, the software must be installed and an SNN must be requested from Eurostat.

• Convert GESMES file to CSV: Converts a GESMES file to CSV format.

• Process all files: Processes all imported files.

Note: The processes from Prepare to Send are cumulative: earlier steps are performed if needed. For example, if a file is in the Intray and ‘Send’ is selected, GENEDI performs the steps from validation through sending, but not pre-processing.
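The cumulative behaviour of the Prepare, Validate, Process and Send actions can be sketched as follows. This is a hedged illustration rather than GENEDI's implementation: the step names and the `steps_to_run` function are hypothetical, and the sketch assumes that a file in the PreIntray always needs pre-processing while a file in the Intray is already compliant.

```python
# Processing steps in the order in which they are performed.
STEPS = ["pre-process", "validate", "create", "send"]

# Last step performed by each menu action.
LAST_STEP = {
    "Prepare file": "pre-process",
    "Validate file": "validate",
    "Process file": "create",
    "Send": "send",
}

# First step still needed, depending on the folder that holds the file.
FIRST_STEP = {
    "PreIntray": "pre-process",
    "Intray": "validate",
}


def steps_to_run(folder: str, action: str) -> list[str]:
    """Return the steps performed when `action` is applied to a file in `folder`."""
    first = STEPS.index(FIRST_STEP[folder])
    last = STEPS.index(LAST_STEP[action])
    return STEPS[first:last + 1]
```

With these assumptions, `steps_to_run("Intray", "Send")` yields validate, create and send, which matches the example in the note: pre-processing is skipped for a file that is already in the Intray.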


6 View Options

6.1 Session Log

The session log is displayed on the right side of the Process Window. It reports every action taken since the beginning of the current GENEDI session. At the end of the session, the log file is moved to the Archive.

6.2 Select File

The details of the selected file are displayed on the right side of the Process Window.

6.3 Validation Logs

A screen appears listing the validated files. If there is more than one file, select the file to view and press Ok. To exit the Viewer, press Cancel.

The Validation Log Viewer is then displayed. It shows the results of the Validation process. The left side of the window contains all the information about the processed file: File name, Date, Time Begin, Time End, Total Records, Records Rejected, Warnings, Valid Records, “Perfect Records” and “Duplicate Records”. At the bottom of the window there are 4 buttons.

Button “Details”
The right side of the window shows a list of all the records with errors or warnings, the entry of the selected record, and the details of the errors and/or warnings for the selected record. You can choose to see all the records, only the records with errors or only the records with warnings (this choice appears only if there are both errors and warnings). For each record, the details are: Error Level, Input Pos., Gesmes Pos, Field Name, Validation Rule and Message.

Button “Summary”
The right side of the window shows a summary of errors or warnings by field. For every field, the following are shown: the number of problems, the Validation Rule and the Message.

Button “Duplicate Records”
The right side of the window shows the records with duplicate keys.

Button “Close”
Closes the Validation Log Viewer window.

Note:
- If the validation process is successful, all the buttons are disabled except Button “Close”.
- If there are no duplicate records, the Button “Duplicate Records” is disabled.
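The relationship between the “Details” and “Summary” views can be illustrated with a small sketch. It is not taken from GENEDI: the record layout (dictionaries with `level`, `field`, `rule` and `message` keys), the level names and the sample entries are all assumptions made for the example.

```python
from collections import Counter


def filter_by_level(details, level=None):
    """Keep every entry, or only those with the given level ("error" or "warning")."""
    return [d for d in details if level is None or d["level"] == level]


def summarise_by_field(details):
    """Count problems per (field, validation rule, message), as in the Summary view."""
    counts = Counter((d["field"], d["rule"], d["message"]) for d in details)
    return [
        {"field": field, "rule": rule, "message": message, "problems": n}
        for (field, rule, message), n in counts.items()
    ]


# Made-up entries, for illustration only.
sample = [
    {"level": "error", "field": "COUNTRY", "rule": "codelist", "message": "unknown code"},
    {"level": "error", "field": "COUNTRY", "rule": "codelist", "message": "unknown code"},
    {"level": "warning", "field": "VALUE", "rule": "format", "message": "too many decimals"},
]
```

Here `summarise_by_field(sample)` reports two problems for the hypothetical COUNTRY field and one for VALUE, while `filter_by_level(sample, "warning")` keeps only the last entry.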


7 Configuration

• Load Configuration: Loads a saved configuration (ini file) into the current form.

• Save: Saves the current changes.

• Save Configuration as…: Saves the current configuration under a name other than that of the default ini file.

When the user clicks on a tree node (see paragraph 3.2.2), the corresponding item can be configured on the right side of the window. There are two configuration types: (a) general and (b) concept.

a) The General Configuration covers contact parameters, send parameters, etc. and is structured in three levels: (i) global, (ii) domain and (iii) dataset. The global and domain configurations serve only to make the dataset level configuration easier; only the dataset configuration is taken into account during processing (see paragraphs 7.1, 7.2 and 7.3).

b) The Concept Configuration is the lowest level of configuration, where the concept parameters are configured (see paragraph 7.4).


7.1 Global Configuration

The global configuration exists to make configuration easier. Once it is filled in and saved, it can be used for the configuration of domains whose own configuration is empty. The global configuration is saved in the ‘global.ini’ file (see Appendix F, Configuration files). Select GENEDI from the Configuration Tree; a form appears on the right side of the window.

The information that can be provided in the form is explained below.

Decimal Separator to be taken into account
- Decimal separator: Indicate whether you use "." or "," as the decimal separator for numeric fields. Be careful not to select the comma if your CSV files use the comma as field separator!

Contact parameters (optional)
- Contact function: The possible values are:
    None
    CC: responsible person for information production
    CP: responsible person for computer data processing
    CF: Head of Unit for information production
    CE: Head of Unit for computer data processing
- Contact name: The name of the contact person.
- Contact identity: (optional) The identity of the contact (department id) as known on the sender's side (e.g. BoP, M&B, EDP, …).
- Com. channel: The possible values are:
    EM: email
    TE: telephone
    FX: fax
    XF: X.400
- Com. number: Telephone number, fax number, e-mail address, etc.

Transmission parameters
- Auto compression: Check the button if you need to compress the created GESMES files (using gzip).
- Type of channel: Indicates the way data are transmitted to Eurostat. The possible values are:
    Nothing (blank)
    Mail
    Statel
    Disk space
- Recipient identifier: Should be left as the code 4D0 for Eurostat.
- Sender email: Your full email address, or that of the person responsible for the data transmission.
- Recipient email: The email address to which you want to send your GESMES file. It can be an address at Eurostat, such as [email protected], or an internal email address.
- SMTP server: The address of your SMTP server. It can be an alias such as "smtp.mydomain" or a numeric address such as 105.212.256.005.
- Disk space Directory: The path to which the file will be copied when the Disk space type of channel is used.
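The warning about the decimal separator can be made concrete with a short sketch. It is illustrative only and does not reflect GENEDI internals: both function names are hypothetical, and the field separator is passed in explicitly because this guide does not say how GENEDI determines it.

```python
def check_separators(decimal_sep: str, field_sep: str) -> None:
    """Reject a decimal separator that is unsupported or clashes with the field separator."""
    if decimal_sep not in (".", ","):
        raise ValueError("decimal separator must be '.' or ','")
    if decimal_sep == field_sep:
        raise ValueError(
            "decimal separator and CSV field separator must differ, "
            "otherwise numeric values would be split into two fields"
        )


def parse_number(text: str, decimal_sep: str) -> float:
    """Convert a numeric field to float, honouring the configured decimal separator."""
    return float(text.strip().replace(decimal_sep, "."))
```

With a semicolon as the assumed field separator, `check_separators(",", ";")` passes and `parse_number("12,5", ",")` gives 12.5, whereas `check_separators(",", ",")` raises an error because a value such as 12,5 could no longer be told apart from two separate fields.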


7.2 Domain Configuration

Select the Domain from the Configuration Tree. A form appears on the right side of the window. It is identical to the Global Configuration form, except that it shows the global defaults if all of its own fields have been left blank. The domain configuration files are located in ‘codelists\DOMAIN\genedi_DOMAIN.ini’ (see Appendix F, Configuration files).
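The fallback from domain to global values can be sketched as below. This is one reading of the behaviour described in this guide, namely that the global defaults are shown only when every domain field is blank; the function names and the configuration keys in the usage note are hypothetical.

```python
def is_blank(value) -> bool:
    """Treat None, empty strings and whitespace-only strings as blank."""
    return value is None or not str(value).strip()


def form_values(domain_cfg: dict, global_cfg: dict) -> dict:
    """Return the values shown in the domain form.

    Assumption: the fallback is all-or-nothing. The global defaults are
    used only when every field of the domain configuration is blank.
    """
    if all(is_blank(v) for v in domain_cfg.values()):
        return dict(global_cfg)
    return dict(domain_cfg)
```

For instance, with hypothetical keys, `form_values({"sender_email": ""}, {"sender_email": "me@example.org"})` returns the global value, while a domain configuration with any field filled in is shown unchanged.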


7.3 Dataset Configuration

Select the Dataset from the Configuration Tree. The configuration of each dataset is stored in the ‘codelists\DOMAIN\genedi_DATASET.ini’ file (see Appendix F, Configuration files). This is the configuration that is actually used when processing a file that refers to the dataset.

A form appears on the right side of the window. It is identical to the Domain form and additionally contains the following:

Other parameters
- Compact GESMES: If the button is checked, the time range technique is used to write several observations per data segment, which makes the GESMES output file considerably smaller. Otherwise the GESMES output file contains one observation per line. Note that this option can be applied only to GESMES/TS datasets.
- Check Duplicate Record: Check the button if you need to check for duplicate records during validation.
- Validate: Check the button if you need to go through the validation process.
- Code for optional values: Specify the code used for optional values that are to be handled as missing on purpose in the dataset during validation.
- Message Reference: The default is MREF000001, but you can enter free text of up to 14 characters to identify the message uniquely.
- GESMES date format (ges2csv): Check the button when the input file has a full GESMES format, that is, when its input date is GESMES formatted. Leave the button unchecked only in the specific cases where the input file does not have a full GESMES format because its input date is not GESMES formatted.
- Encrypt: Check the button if you want to send the file to Eurostat encrypted. The Encryption option requires the GnuPG application. The management of the keys is configured outside GENEDI; GENEDI can only set the folder where GnuPG is located (see paragraph 7.5 for more details).
- Public key: When the Encrypt button is selected, the Public Key field should be filled in with the Production unit responsible for the domain (recipient production unit).

Button “Save”
Saves the parameters shown in the current form.


7.4 Statistical Concept Configuration

The Concept form is used for configuring a statistical concept. This configuration includes mapping, transcoding and validation rules. To open the Concept form, select a Statistical concept from the Configuration Tree.

Mapping

Mapping enables the user to create a correspondence between the fields of CSV or FLR input files and the fields of the GESMES key family structure (see appendix for mapping configuration file).
- Position: If selected, change or enter the position (if the input file is CSV) or the range length (if FLR) of the input fields. A field that does not correspond to a GESMES concept can be left blank.
- Default value: You can give a default value instead. If you use the FLR input file format, fill the list with a length format such as “x-y” instead of a position number.
- Hide Code: If you want to hide the codes for the selected statistical concept, select the button. To specify the code to be displayed for each hidden code, see the Appendices Codelists and Hide code configuration file. If no replacement code is specified, GENEDI writes a string of ‘8’ characters of the same length as the hidden code.

Transcoding

Transcoding is used when the input file, for the specified concept, uses codes that differ from the GESMES structure code lists.
- Transcode: If the concept needs transcoding, check the button and enter a name for the transcoding rule. The name of the rule corresponds to a file named “rule_name.trc” where the actual transcoding is defined for the selected concept.
- Edit Rule: Press the button to create or edit the rule. The “rule_name.trc” file opens in Notepad or vi, depending on the OS. The user should enter one line for every different code. The structure of this line is:
GESMES code;Comment on GESMES code;User code;Comment of user code
When finished, the user has to save the file. (See Appendix Transcoding tables and rules for more information)
- Time Transcoding: This option is enabled only when the concept corresponds to the time period. If you use a non-standard date reference, you have to tick the "time transcoding" option. Then you have to select the Input Time Format and optionally the User Codes (see below). See appendix for Mapping configuration files for more details.
- Input Time Format: A list box gives the list of input time formats for your dataset. Select the correct format in the list. If a standard GESMES format is selected, it is not saved in the configuration (the input date does not need any change).
- User code for: If the Input Time Format uses period codes that differ from the default ones, you must also define the list of your own date references using "," as separator. For example, type A,B,C,D or 03,06,09,12 in the text entry labelled "user code for" if you want GENEDI to use this reference for Quarter rather than the default list for Quarter (01,02,03,04). The exception is the time formats for year, which have no period code. The default (standard) period codes are proposed in the drop-down list.
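The four-column line structure of a “rule_name.trc” file described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name parse_trc and the sample codes are hypothetical, and GENEDI itself may read the file differently.

```python
def parse_trc(text):
    """Build a {user code: GESMES code} table from .trc content.

    Each non-empty line is expected to look like:
    GESMES code;Comment on GESMES code;User code;Comment of user code
    """
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(";")
        if len(parts) != 4:
            raise ValueError(f"expected 4 ';'-separated columns: {line!r}")
        gesmes_code, _gesmes_comment, user_code, _user_comment = (
            p.strip() for p in parts
        )
        table[user_code] = gesmes_code
    return table


# Hypothetical sample content (comments may be empty, semicolons may not).
sample_trc = "DE;Germany;276;numeric country code\nFR;;250;\n"
user_to_gesmes = parse_trc(sample_trc)
```

Here user_to_gesmes maps each code found in the input file to the code expected by the GESMES code list.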

Validation
- Validation Type: Helps you define the validation rules for a given statistical concept in a given dataset (see appendix for validation files). The available rules are:

• Numeric
• Alphabetic
• Alphanumeric
• Codelist
• Defined Value

For each check, the error level can be specified. The Error Level has two values: warning and reject. When a check fails:

If its level is “Reject”, the dataset has major errors and cannot be converted to GESMES.

If its level is “Warning”, the message is written to the validation log files, but the dataset can still be converted to GESMES because the error is not major.

- Mandatory: If the check button is selected, the statistical concept value must be present in the input file data.

- Numeric: You can verify that the field is numeric and that each value is lower than a Max Value or greater than a Min Value. The Max and Min Value can be entered by the user. You can also verify the number of characters of each value. Additionally, with the “Comparison” check, you can verify that the field value is greater than, lower than or equal to a linear expression of another field value.

- Alphabetic: You can verify that a field is alphabetic and check the number of letters of its value. You can verify that each value is lower than a Max Value or greater than a Min Value. The Max and Min Value can be entered by the user.


- Alphanumeric: you can verify if a field is alphanumeric and check the number of characters of its value. You can verify that each value is inferior to a Max Value or superior to a Min Value. The Max and Min Value can be entered by the user.

- Codelist: you can verify if the field belongs to a code list. This rule is available only if a default code list is assigned to this field.

- Defined Value: you can verify that the field is equal to a target value.

Button “Save” updates the configuration files according to your settings. Button “Remove rules” allows you to remove the current rules for the concept from the configuration files.

Important note: You cannot select several validation rules from the graphical interface. If you need that option, please contact the edamis support at [email protected].

7.5 Encryption Settings

To set the Encryption Settings, select Tools -> Encryption Settings from the menu.


The following form is shown:

Select the location of the GnuPG executable (e.g. gpg.exe for Windows). GnuPG is used during encryption, if required.


8 Updates

8.1 Update configuration files

A file chooser is displayed to specify the file and the type of file to be imported to GENEDI. If the file is a zip archive, GENEDI shows a pop-up asking the user to confirm the update of each domain in the archive. If the selected file is a codelist (.txt), an ini file, or a configuration file (.cod), GENEDI prompts the user to select the domain(s) in which the file has to be updated (next figure).

8.2 Import

A file chooser is displayed to specify the package that contains the domains to be imported to GENEDI. When parsing the package, GENEDI shows a pop-up asking the user to confirm each domain in the package. The newly imported domains are set as “Shown” and are displayed in the Configuration Tree.


Note: the zip archive should respect the following folder structure:
codelist
  └ [Domain short name]
      └ files
      └ …
validation
  └ [Domain short name]
      └ files
      └ …
Example:
codelist
  └ AIR
      └ aircraft.txt
validation
  └ AIR
      └ AIR_A1_Q_validation.txt

Pop up window for selecting the domain in which files have to be updated


9 Command Line

GENEDI 2.1 can also be run from the command line. The usage is:
genedi.pl [-options] files
The options include: batch, command, import and help.

9.1 batch

Runs GENEDI in batch mode. All files that respect the file name convention are processed automatically. The other files have to be identified manually, either using the GUI or the import command described in paragraph 9.3 below. For example:
perl genedi.pl -batch
Note: If the path to the Perl executable is not set in the default system paths, the command may be:
On Windows platforms, when located in the genedi root folder:
/perl/bin/perl genedi.pl -batch
On UNIX platforms, when the path to the perl executable is “mypath”:
mypath/perl genedi.pl -batch

9.2 command <command>

Executes a specific command. The commands available are:

preproc: preprocesses a file
validation: validates a file
create: creates a GESMES file
send: sends a GESMES file
ges2csv: converts a GESMES file back to CSV

An example of the command is:
perl genedi.pl -command validation 1_Intray\LCI_Q_A1_2005_0001_V0002.csv

9.3 import

Imports a file. The command must be given with the following parameters:
-domain <domain name>
-ds_id <dataset name>
[-periodicity <periodicity>] (A: Annual, S: Semester, Q: Quarterly, M: Monthly, W: Weekly, D: Daily, O: Other, N: Non Periodic, 2 to 9 for every 2 to 9 years)
-country <country>
-year <year>
-period <period> (0000 for annual, 0001 to 9999 for others)
[-action <action>] (N: New, R: Replace, A: Append, P: Partial, O: Other)
-version <version>
-compliant (0 or 1)
An example of the command is:
perl genedi.pl -import -domain LCI -ds_id LCI_Q -country A1 -year 2005 -period 0001 -version 0001 -compliant 1 "C:\lci-q-compact.csv"
Note: The full path of the file to be imported is needed. On Windows it is important to enclose this path in double quotes.

9.4 help

Displays the command line options.
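When the import command of paragraph 9.3 is launched from a script, it can help to assemble the arguments as a list so that a path containing spaces needs no manual quoting. The Python sketch below only builds the argument list from the options documented above; the helper name build_import_command and its keyword names are hypothetical, and nothing is executed.

```python
def build_import_command(path, domain, ds_id, country, year, period,
                         version, compliant, periodicity=None, action=None,
                         perl="perl", script="genedi.pl"):
    """Return the argument list for 'genedi.pl -import' (hypothetical helper)."""
    cmd = [perl, script, "-import", "-domain", domain, "-ds_id", ds_id]
    if periodicity is not None:          # optional parameter
        cmd += ["-periodicity", periodicity]
    cmd += ["-country", country, "-year", str(year), "-period", period]
    if action is not None:               # optional parameter
        cmd += ["-action", action]
    cmd += ["-version", version, "-compliant", str(compliant), path]
    return cmd


# Same values as the example in the guide.
command = build_import_command(
    r"C:\lci-q-compact.csv", domain="LCI", ds_id="LCI_Q", country="A1",
    year=2005, period="0001", version="0001", compliant=1,
)
```

A list of this kind can be passed to subprocess.run(command) from the GENEDI root folder; that call is left out here because it needs an installed GENEDI.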


APPENDIX A: Error and warning description

There are several predefined validation rules that you can apply to each field of your data file.
Numeric rules: You can verify that a field is numeric. You can verify the number of digits. You can verify that a field is lower than or equal to a value. You can verify that a field is greater than or equal to a value.
Alphabetic rules: You can verify that a field is alphabetic. You can verify the number of letters.
Alphanumeric rules: You can verify that a field is alphanumeric. You can verify the number of characters.
Comparison rules: You can verify that f1 (>,<,=) A * f2 + B, where f1 and f2 are two field values, and A and B are two parameters you can define.
Defined value rules: You can verify that the field is equal to a particular value.
Code list rules: You can verify that the field’s value belongs to a code list.
You can decide to verify one among several types of rules. For more information and to set those rules, see the Appendix for setting validation rules.

Example of errors:

REJECTED 5 1 ECON_IND belongTo IE not found in code list ECONOMIC_INDICATOR
Means that the value “IE” of field 5 in the input structure (field 1 in the GESMES structure, named ECON_IND) is not part of the code list ECONOMIC_INDICATOR.

REJECTED 3 1 SEATS minValue 4 is lower than 5
Means that the value “4” of field 3 in the input structure (field 1 in the GESMES structure, named SEATS) is too small and should be greater than or equal to 5.

REJECTED 2 1 SEATS maxValue 4 is greater than 3
Means that the value “4” of field 2 in the input structure (field 1 in the GESMES structure, named SEATS) is too big and should be lower than or equal to 3.

REJECTED 1 1 TYPE isAlphanumeric A2C has wrong number of characters, it should be between 0 and 2 characters.
Means that the value “A2C” of field 1 in the input structure (field 1 in the GESMES structure, named TYPE) has the wrong number of characters, which should be between 0 and 2.

REJECTED 2 4 NAME isAlphabetic ABCD has wrong number of characters, it should be less than 3
Means that the value “ABCD” of field 2 in the input structure (field 4 in the GESMES structure, named NAME) has the wrong number of characters, which should be lower than 3.

REJECTED 1 1 NUM isNumeric 123 has wrong number of characters, it should be between 2 and 5 characters
Means that the value “123” of field 1 in the input structure (field 1 in the GESMES structure, named NUM) has the wrong number of characters, which should be between 2 and 5.

REJECTED 1 8 TYPE isAlphanumeric A2# is not a alphanumeric value
Means that the value “A2#” of field 1 in the input structure (field 8 in the GESMES structure, named TYPE) is not an alphanumeric value.

REJECTED 2 3 NAME isAlphabetic AB1 is not a isAlphabetic value
Means that the value “AB1” of field 2 in the input structure (field 3 in the GESMES structure, named NAME) is not an alphabetic value.

REJECTED 1 1 YEAR isNumeric 12A is not a numeric value
Means that the value “12A” of field 1 in the input structure (field 1 in the GESMES structure, named YEAR) is not a numeric value.

REJECTED 1 5 NAME targetValue TOTO is not equal to TATA
Means that the value “TOTO” of field 1 in the input structure (field 5 in the GESMES structure, named NAME) is not the right value and should be “TATA”.

REJECTED 1 5 NUM Comparison 400 is not lower than 5 * 100
Means that the value of field 1 in the input structure (field 5 in the GESMES structure, named NUM) should be lower than 5 * 100.
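The comparison rule f1 (>,<,=) A * f2 + B from this appendix can be written out as a small Python sketch. The function name comparison_holds and the sample numbers are assumptions made for illustration; this is not GENEDI's own implementation.

```python
import operator

# The three comparison signs named in the rule description.
_OPERATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def comparison_holds(f1, sign, a, f2, b=0.0):
    """Return True when f1 <sign> a * f2 + b holds."""
    if sign not in _OPERATORS:
        raise ValueError(f"unsupported comparison sign: {sign!r}")
    return _OPERATORS[sign](f1, a * f2 + b)


# Hypothetical values: f1 = 12, f2 = 5, A = 2, B = 1, so A * f2 + B = 11.
greater = comparison_holds(12, ">", 2, 5, 1)
lower = comparison_holds(12, "<", 2, 5, 1)
```

With these values the ">" check passes and the "<" check fails; a failed check would then be reported at the configured error level (warning or reject).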


APPENDIX B: GESMES format

Principles and Transmission means

Some very simple principles shall be applied when generating and then transmitting a GESMES message to Eurostat:
• One data set = one GESMES message = one file = one email or one STADIUM consignment
• All GESMES messages shall be sent to Eurostat via STADIUM (client software that can be provided by Eurostat), or using the following structured e-mail solution:

Message to: Internet: [email protected] (or X.400:C=INT;A=RTT;O=CEC;P=EU;S=ESTAT;G=STADIUM)
Subject field-naming convention: “eDAMIS #[input file name].ges”
Body: "GESMES File (compressed) automatically sent by GENEDI version 2.1"


Overall structure of the easy GESMES

The easy GESMES message can change from one statistical domain to another. Its global format is shown below. The variable part of the message is shown in bold and italics and is detailed in the next pages. Each line starts with an identifier (called a segment) and ends with a quote ('). For readability purposes, each segment should start on a new line; this also avoids possible transmission problems if the message is too large.

A data set “DATASET” for country “CC”, year “YY” and period “PP” (identified as file “[input file name].ges”) can have the following structure:

The header, whose model can be found in the “gesmes_xxx.txt” files where “xxx” designates the short name of the statistical domain (e.g. BOP,MRTM,FA,FT,IS,AV)
UNA:+.? '
Or
UNA:+decimal_separator? '
UNB+UNOC:3+Sender identifier+EUROSTAT+current date:time+Interchange reference number++GESMES/TS'
UNH+Message reference number+GESMES:2:1:E6'
BGM+74'
NAD+Z02+ECB'
NAD+MR+EUROSTAT'
NAD+MS+Sender identifier'
CTA+contact_function+contact_id:contact_name'
COM+com_number:com_channel'
IDE+10+Message identity'
DSI+data set identifier'
STS+3+7'
DTM+242:date time:203'
DTM+Z02+period of reference:date time format'
IDE+5+dataset structure'
GIS+AR3'
GIS+1:::-'

The data section (a complete description is provided in the next section)
ARR++ dataset'

The trailer section
UNT+Number of segments+Message reference'
UNZ+1+Interchange reference number'


Important remark: In the GESMES header, if the UNA segment contains the variable decimal_separator, the GESMES output is assumed not to be GESMES/TS but another subset of GESMES (e.g. GESMES/DSIS, GESMES/PRODCOM, ...), and the input decimal separator is used in the GESMES output. If the UNA segment has the first form (e.g. UNA:+.? '), GENEDI uses the GESMES/TS subset. In that case the decimal separator must be a dot, so the input decimal separator is converted to a dot automatically.

List of segments

Name | Description | Example of usage
UNA:+ | Service String Advice | UNA:+.? '
UNB+ | Interchange Header | UNB+UNOC:3+IT3+EUROSTAT+010630:1234+IREF00001++GESMES/TS'
UNH+ | Message Header | UNH+MREF00001+GESMES:2:1:E6'
BGM+ | Beginning of Message | BGM+74'
NAD+Z02+ | Name and Address (organisation maintaining the code lists) | NAD+Z02+ECB'
NAD+MR+ | Name and Address (recipient) | NAD+MR+EUROSTAT'
NAD+MS+ | Name and Address (sender) | NAD+MS+IT3'
CTA+ | Name of the function of the person (on the sending site) whose name follows | CTA+CC+IS/BoP:Mr John'
COM+ | Telephone or fax number or e-mail address etc... | COM+004969:TE'
IDE+10+ | Message identity | IDE+10+my text
DSI+ | Dataset identifier | DSI+BOP_Q'
STS+ | Status Report | STS+3+7'
DTM+242: | Date/Time/Period (preparation time) | DTM+242:20002241345:203'
DTM+Z02: | Date/Time/Period (reporting period) | DTM+Z02:19982000:702'
IDE+ | Dataset structure Identity | IDE+5+EUROSTAT_BOP_01'
GIS+ | General Indicator | GIS+AR3'
GIS+1 | General indicator (character used for the missing value in ARR segment) | GIS+1:::-'
ARR++ | Array | ARR++1997:3:ESHUV:2:ESBIO:0111:1::::::5023'
UNT+ | Message Trailer | UNT+59+MREF000001'
UNZ+ | Interchange Trailer | UNZ+1+IREF000001
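Because every segment ends with a quote and the UNA segment declares the service characters, a message can be cut back into segments with a few lines of code. The Python sketch below relies on the standard UN/EDIFACT layout of UNA (release character in seventh position, segment terminator in ninth position); the function name split_segments and the sample message are illustrative assumptions, not part of GENEDI.

```python
def split_segments(message):
    """Split an EDIFACT-style message into segments (terminator removed)."""
    terminator, release = "'", "?"          # defaults when no UNA is present
    segments = []
    body = message
    if message.startswith("UNA") and len(message) >= 9:
        # UNA is fixed-length: read the service characters it declares.
        release, terminator = message[6], message[8]
        segments.append(message[:8])
        body = message[9:]
    current = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == release and i + 1 < len(body):
            current.append(body[i:i + 2])   # keep the escaped character as data
            i += 2
            continue
        if ch == terminator:
            segment = "".join(current).strip("\r\n")
            if segment:
                segments.append(segment)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip("\r\n")
    if tail:
        segments.append(tail)
    return segments


# Sample built from the usage examples above, plus one escaped quote.
sample = (
    "UNA:+.? '\n"
    "UNB+UNOC:3+IT3+EUROSTAT+010630:1234+IREF00001++GESMES/TS'\n"
    "UNH+MREF00001+GESMES:2:1:E6'\n"
    "IDE+10+it?'s a test'\n"
)
segments = split_segments(sample)
```

The escaped quote in the IDE segment stays inside the segment instead of ending it, which is the purpose of the release character declared in UNA.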


Variable parts

Most of the variable parts are linked to the configuration and set-up parameters, and to a field contained in the array segments (ARR+). This is the purpose of the “Links” entries. Each entry below gives the variable part name with the list of segments where it is present, its possible value, and its links.

Sender identifier (set up, in UNB+ and NAD+MS+)
Possible value: List of sender identifiers contained in the code list cl_organisation_lit.cod
Links: Given during the installation

Current date (in UNB+)
Possible value: Format YYMMDD (e.g. 010630 for 30 June 2001)
Links: System date when the message is built

Time of preparation (in UNB+)
Possible value: Format HHMM (e.g. 1415 for 2:15 pm)
Links: System time when the message is built

Interchange reference number (in UNB+ and UNZ+)
Possible value: A unique reference that identifies the interchange. Example: an incremental number
Links: Always IREF000001

Message reference (in UNH+ and UNT+)
Possible value: Reference that must be unique within the interchange. Free text up to 14 characters.
Links: By default MREF000001. It can be changed in the main menu window

contact_function (in CTA+)
Possible value: CC: responsible person for information production; CP: responsible person for computer data processing; CF: head of unit for information production; CE: head of unit for computer data processing

contact_id (in CTA+)
Possible value: Free text up to 17 characters, e.g. BoP, M&B, EDP, ISCD, etc.

contact_name (in CTA+)
Possible value: Free text up to 35 characters, e.g. John Smith

com_number (in COM+)
Possible value: Free text up to 512 characters, e.g. 0049 69 1344 0

com_channel (in COM+)
Possible value: EM: e-mail; TE: telephone; FX: fax; XF: X.400

Message identity (in IDE+10+)
Possible value: Free text (up to 25 characters) to give message identity

Dataset identifier (in DSI+)
Possible value: In order to distinguish messages containing data sets structured according to the same key family, a different identifier has to be given for each of them. The lists of data set identifiers differ according to the statistical domain. They can be found in file XXX_dataset_id.cod, where XXX is the short name of the statistical domain.

Date time (in DTM+242)
Possible value: Date and local time of the extraction of the dataset
Links: System date and time when the message is built

Period of reference - second term; Date time format - first term (in DTM+Z02)
Possible value: For specific periods: 602 for CCYY, 608 for CCYYQ, 610 for CCYYMM. For ranges: 702 for CCYY-CCYY, 708 for CCYYQ-CCYYQ, 710 for CCYYMM-CCYYMM. Where: CC - century, YY - year, Q - quarter (1,2,3,4), MM - month (01,02…12)
Links: In the ARR++ segments, linked to the “time period” or to the (“reference year” and “reference quarter”)

Data set structure (in IDE+)
Possible value: It can be deduced from the dataset identifier in the code lists XXX_dataset_id.cod
Links: Linked to the dataset identifier

Number of segments (in UNT+)
Possible value: Contains a count of the number of segments in the message, between UNH and UNT (but without UNA, UNB, and UNZ)
Links: Equal to (number of ARR++ segments) + (number of Header segments) - 2

Remark: if updates (correction, append, delete…) must be achieved on a data set already transmitted to Eurostat, then the whole data set with the updates must be retransmitted to Eurostat.
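The date time format codes listed for DTM+Z02 (602, 608, 610 for specific periods; 702, 708, 710 for ranges) follow directly from the shape of the period. The Python sketch below derives the code from the period strings; the function name reference_period is hypothetical, and the range value is written as the two periods joined together, as in the usage example DTM+Z02:19982000:702'.

```python
# Length of a single period -> format code (CCYY, CCYYQ, CCYYMM).
_SINGLE_CODES = {4: "602", 5: "608", 6: "610"}


def reference_period(start, end=None):
    """Return (period value, date time format code) for DTM+Z02."""
    code = _SINGLE_CODES.get(len(start))
    if code is None or not start.isdigit():
        raise ValueError(f"unsupported period: {start!r}")
    if end is None:
        return start, code
    if len(end) != len(start) or not end.isdigit():
        raise ValueError("both ends of a range must use the same format")
    # Range codes are the single-period codes plus 100 (602 -> 702, ...).
    return start + end, str(int(code) + 100)


yearly_range = reference_period("1998", "2000")
single_quarter = reference_period("20011")
```

Only the six codes listed in this guide are covered; other periodicities would need their own entries.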


Structure of the data set segment (ARR++)

The content of the data set is included in the ARR segment. The structure required for the transmission from the Competent National Administrations to Eurostat depends first on the statistical domain, and then on the dataset identifier. It is not possible to give a complete description of each dataset structure used in the framework of GENEDI. However, to illustrate the purpose, the following table shows the EUROSTAT_POPSTA_RD key family structure:

KEY FAMILY MNEMONIC: ESTAT_POPSTA_RD
Organization and statistical concepts involved

Position in key | Attachment level | Usage status (C, M) | Concept Mnemonic | Concept Name | Format | Code list Mnemonic | Code List Name

DIMENSIONS
1 | | | FREQ | Frequency | AN1 | CL_FREQ | Frequency code list (BIS,ECB)
2 | | | REF_AREA | Reporting Country | AN2 | CL_TERTORY | Area/Reference Region code list
3 | | | DEMOIND | Demographic Indicator | AN..14 | CL_DEMOIND | Demographic Indicator code list
4 | | | REF_REGION | Reference Region | AN..16 | CL_TERTORY | Area/Reference Region code list
5 | | | GENDER | GENDER | AN..5 | CL_GENDER | GENDER code list
6 | | | AGE | Age | AN..10 | CL_AGE | Age code list

In ARR: Time reference and observation
7 | | | TIME_PERIOD | Time period or range | AN..35 | - |
8 | | | TIME_FORMAT | Time format code | AN3 | - |
9 | | | OBS_VALUE | Observation value | N..15 | - |

OBSERVATION ATTACHED ATTRIBUTES (in the main ARR segment)
10 | Observation | C | OBS_STATUS | Observation status | AN1 | CL_OBS_STATUS | Observation status code list (BIS,ECB, Eurostat-BoP)
11 | Observation | C | OBS_CONF | Observation confidentiality | AN1 | CL_OBS_CONF | Observation confidentiality code list (Eurostat-BoP, ECB)

OTHER ATTRIBUTES
 | Observation | C | OBS_COM | Observation comment | AN..350 | |

Message Administration
 | | | ORGANISATION | Organisation | AN3 | CL_ORGANISATION |


APPENDIX C: Flat file format

Purpose

Eurostat supports 2 EDI-compatible formats:
• A GESMES standard message compliant with the UN/EDIFACT standard
• A structured flat file format (as a temporary/intermediate solution)
The purpose of this section is to detail the complete “structured flat file format” EDI solution for those Competent National Authorities who have chosen this option to transmit statistics to Eurostat.

Why use a structured flat file EDI solution

This structured flat file solution, which is an alternative to the UN/EDIFACT GESMES/TS solution, offers the following main advantages:
• The cost of this EDI solution is close to zero, as the transmission tool (e-mail or STADIUM) is available and the format can be generated automatically (without any need for additional software) from most existing databases (e.g. export format “csv” of Oracle, save as “csv” in Excel…),

• It is compatible with full EDI solutions based on servers (with the possibility of having a full automatic process to extract data from a central database and to transmit it to Eurostat),

• But it is also compatible with a PC user that extracts manually data to transmit it to Eurostat,

• The “structured flat file + structured e-mail” EDI solution is flexible enough to permit the transmission of explanatory notes with the data, also the compression, and if necessary encryption of the data,

• It can be understood easily just by reading the GESMES Message Implementation Guides (in which the structure and codes are detailed)

• It can be a good standard input format for the development of an EDI toolbox.

CSV format

Principles:
• One dataset = one structured flat file
• One record in the dataset = one line in the associated structured file
• The field separator can be chosen among “;” (semicolon), “,” (comma) or tabulation.

FLR format

Principles:
• One dataset = one structured flat file
• One record in the dataset = one line in the associated structured file
• Each line is composed of a succession of data without separator. To recover each field value, one uses a table that provides the length (number of digits) of each field.


Example:
This is an FLR line:
200007 IE 10101110000IE US
Year: 4 digits -> 2000
Month: 2 digits -> 07
Country code: 3 digits -> IE
…
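Recovering the fields of an FLR line from a table of lengths can be sketched in Python as follows. The helper name split_flr, the field names and the sample line are hypothetical; the sample is a shortened line made up for the illustration, since only the first three field lengths are given above.

```python
def split_flr(line, layout):
    """Cut a fixed-length record into named fields.

    layout is a list of (field name, length) pairs, in file order.
    """
    record = {}
    position = 0
    for name, length in layout:
        # Pad characters inside a field are dropped from the value.
        record[name] = line[position:position + length].strip()
        position += length
    return record


layout = [("year", 4), ("month", 2), ("country", 3)]
record = split_flr("200007IE 101", layout)
```

Characters after the last declared field are ignored, so the layout can be extended one field at a time while the remaining lengths are being worked out.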


APPENDIX D: Quick help to add/customise statistical domains in GENEDI

Documents required:
• GESMES MIG or the output structure
• Description of the input structure

Procedure to add a new statistical domain called XXX

1. Create a sub folder called XXX in folder GENEDI\Codelists\
2. Copy the codelists into sub folder XXX
3. Create a text file called "XXX_dataset_structure.cod" and write the output statistical structure, strictly following this model:
NAME_OF_STRUCTURE_1
@duplicate_keys = list of positions
@time_period = position of the year
Concept1;Definition;Codelist_name (optional)
Concept2;Definition;Codelist_name (optional)
...
;
NAME_OF_STRUCTURE_2
Concept1;Definition;Codelist_name (optional)
Concept2;Definition;Codelist_name (optional)
...
;
Example:
MY_TEST_STRUCTURE
Value;;
Date;;
Code;;my_test_codelist
;
Remark:
1. The code list name should be present only for fields whose values belong to a list of codes.
2. The concept definition is mandatory.
3. The parameters (words starting with @) correspond to the following information:
• @duplicate_keys = list of fields GENEDI has to take into account when checking for duplicate records
• @time_period = position of the field that contains the year, or
@time_period = position of the field that contains the year, position of the field that contains the months, the quarters, or the semesters


• @period_format: This parameter is necessary only if you want to use your own references for the date (different from the GESMES references) for the months, the quarters, or the semesters. In that case, enter the list of references using the following model: my first reference, corresponding GESMES reference, my second reference, corresponding GESMES reference,...
Example: Q1,1,Q2,2,Q3,3,Q4,4, means that Q1 is used for the first quarter and that this reference corresponds to value 1 in GESMES, etc.

• @file_name: This parameter is necessary only if you use several families of references in the same dataset, for example if you are using Q1 or 21, Q2 or 22, etc. In that case, enter, in quotes, the full name (with the path) of the file containing the correspondences between the different families of references.
Example of file content:
Q1; 21; 1st Quarter
21; 21; 1st Quarter (second choice)
Q2; 22; 2nd Quarter
22; 22; 2nd Quarter (second choice)
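The alternating list used by @period_format (own reference, GESMES reference, own reference, GESMES reference, ...) can be turned into a lookup table with a short Python sketch. The function name parse_period_format is hypothetical and the sketch only illustrates the pairing; it is not how GENEDI reads the parameter.

```python
def parse_period_format(spec):
    """Pair each user reference with the GESMES reference that follows it."""
    items = [item.strip() for item in spec.split(",") if item.strip()]
    if len(items) % 2 != 0:
        raise ValueError("references must come in (user, GESMES) pairs")
    # Even positions hold user references, odd positions the GESMES ones.
    return dict(zip(items[0::2], items[1::2]))


# The example from the guide, including its trailing comma.
quarters = parse_period_format("Q1,1,Q2,2,Q3,3,Q4,4,")
```

An odd number of items is rejected because it would leave one user reference without its GESMES counterpart.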

The table below summarises the different combination of parameters according to the date format in the input dataset:

Case | Example of date format for the 1st quarter 2001 | Number of positions for @time_period | Presence of @period_format | Presence of @file_name
1 | 20011 | 1 | No | Not used
2 | 2001;1 | 2 | No | Not used
3 | 2001;Q1 | 2 | Yes | No
4 | 2001Q1 | 1 | Yes | No
5 | 2001;21 | 2 | Yes | Yes
6 | 200121 | 1 | Yes | Yes

Save the file in sub folder XXX.

4. Create a text file called "XXX_dataset_id.cod" and write the following lines:
NAME_OF_DATASET_A; NAME_OF_STRUCTURE_1
NAME_OF_DATASET_B; NAME_OF_STRUCTURE_2
...
Remark: one structure name can be used for several datasets
Example: My_test_dataset;MY_TEST_STRUCTURE

5. Open file "list_of_units.cod" located in folder "GENEDI\codelists", and add a new line as follows:
XXX;long name of the new domain
Example: TEST;New domain for test

6. Create the files for the GESMES header and trailer. Generally, these files are similar for every domain in GENEDI. However, be careful since some parameters may change. You can find these files in folder GENEDI\Codelists\STS for instance. Their names are gesmesSTS.txt and gesmesTrailerSTS.txt. Copy these files into your sub folder Codelists\XXX and rename them gesmesXXX.txt and gesmesTrailerXXX.txt.

7. If you need to validate data, create validation rules:
Create a sub folder called XXX in folder GENEDI\Validation
Launch GENEDI
Choose the domain you have created in the list of statistical domains
Select the dataset in the right frame
Click on the "configure" button
Go to the menu bar and click on "File"
Choose "Set Validation rules" in the list
Follow the procedure described in the "GENEDI configuration for validation rules"

8. If you need to transcode data:
Create a sub folder called XXX in folder "GENEDI\tools\transcoding"
Create the transcoding tables, each of which consists of a text file with four columns as follows:
OUTPUT CODE;COMMENT;INPUT CODE;COMMENT
Remark:
1) Comments are optional BUT NOT the semicolons. If you omit the comments then the line should be: OUTPUT CODE;;INPUT CODE;
2) Save the file with an explicit name (free), but with extension ".trc"
Repeat this step for each concept requiring a transcoding.

9. Launch the transcoding tool, select the newly created domain, and the data set id in the right frame. Then click on the "configure transcoding" button, select each concept in the right frame whose values must be transcoded, and enter the name of the corresponding table created in step 8 (without the file extension).
Example: if the file containing the table is called "ACTIVITY_TABLE.trc", and the table is for the concept "Activity", then enter the name ACTIVITY_TABLE

10. Click on button "save".
11. See the Appendix for Transcoding to use the transcoding tool.


APPENDIX E: Mapping configuration file

Mapping concept

To make the mapping operation as easy as possible, the full name of the GESMES statistical concepts should appear in this list. The mapping tool can therefore be seen as a window with two columns: one that contains the position or the length of the input fields, and another that shows the name of the statistical concept.

Example:
Positions: 1;5;6 5 2 3
Default values: I2 F
Gesmes concept definition: Reference period, Threshold indicator, Productivity, Partner country, Flag, Other partner country

Or
Range of length: 1-5;16-21 6-6;22-22 7-9 10-15
Default values: I2 F
Gesmes concept definition: Reference period, Threshold indicator, Productivity, Partner country, Flag, Other partner country

A field that does not correspond to a GESMES concept can be left blank, but you can give a default value instead. If you use the FLR input file format, fill the list with a length format such as “x-y” instead of a position number. When the configuration is complete, you have to save the parameters; the mapping is saved into a “X.map” file, where X designates the dataset identifier name (e.g. EXTRA-M.map).

Note: it is possible to map several input fields to one GESMES concept. This situation occurs when the same variable is used several times in one record. In that case, you can indicate either the positions or the ranges of length several times, separated by a semicolon (see tables above). The next figure shows an example where this functionality is useful.


[Figure: FLR file structure (Blocks 1 to 5, some of them repeated) mapped to the GESMES structure; label: "Different values for the same variable".]

The mapping configuration can be performed in the User Interface in a Statistical Concept Form (see User Guide par. 7.4).

Mapping file format

When a mapping configuration is saved, a “.map” file is written. The following example shows a .map file for an FLR input file. The input time format is provided on the first line of the file codelist\[Stat domain]\[Dataset id].map.

Quarterly:01,02,03,04 QQYY;14
:9:608
:11:A
:12:F
:13:
:6:1
1-2:2:
4-4:4:
5-8:5:
9-9:3:
10-10:1:
11-14:7:
17-20;33-36;49-52;65-68:8:
21-32;37-48;53-64;69-80:10:

The first value gives GENEDI the user's time nomenclature for the input time. In the example above, the values 01, 02, 03 and 04 are used to indicate each quarter. If your input time uses different values to reference each quarter, you can change them by:

- Checking the “time code transcoding” button
- Entering the list of values for each quarter, separated by commas

The second value tells GENEDI the input date format, so that it can convert it correctly to the GESMES time period format.


The possible formats are: MMYY, YYMM, MMCCYY, CCYYMM, SCCYY, CCYYS, SSYY, YYSS, SYY, YYS, QQYY, YYQQ, QYY, YYQ, CCYYQ, QCCYY, WWYY, YYWW, WWCCYY, CCYYWW, CCYYMMDD, DDMMCCYY, YYMMDD, DDMMYY, xxYY.

The notation is: CC for century, YY for year, MM for month, WW for week number, DD for day number, S for semester, Q for quarter and xx for anything which must be omitted.

Note: If the input time format selected is not a standard GESMES output format, then it is saved in the first line of the map file, thus no time transcoding is needed. The number following the input time format is an index used by the graphic interface.

Many-to-one mapping

GENEDI is also able to concatenate input field values and map the result to a single GESMES concept. For example, your dataset structure may use two fields for the date and the periodicity (e.g. ...;2002;3... in a CSV file). As the date and periodicity are coded in a single field in GESMES/CB, such as 20023, you need to concatenate your input fields. To use this functionality in the mapping module, put X.Y in the position column that corresponds to the selected GESMES concept, where X and Y are the positions of your fields in your dataset structure. The following extract of a mapping file uses the concatenation option:

:12:F
:13:
:6:1
1:2:
2:4:
3.4:5:

This means that the input fields at positions 3 and 4 will be concatenated and put at the 5th position in the GESMES dataset structure.
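To illustrate the position syntax described in this appendix, the following Python sketch parses the mapping lines of a .map file. It is not GENEDI's own code. It assumes, based only on the examples shown here, that each mapping line has the form input_spec:gesmes_position:default_value, that ";" separates repeated occurrences of the same variable, that "x-y" is an FLR range and that "X.Y" requests a concatenation. The names MappingEntry, parse_span and parse_mapping_line are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # inclusive (start, end); a single position n is (n, n)


@dataclass
class MappingEntry:
    occurrences: List[List[Span]]  # each occurrence lists the spans to concatenate
    gesmes_position: int
    default: Optional[str]


def parse_span(text: str) -> Span:
    """Parse '7' or '7-9' into an inclusive (start, end) pair."""
    start, _, end = text.partition("-")
    return int(start), int(end or start)


def parse_mapping_line(line: str) -> MappingEntry:
    """Parse one mapping line, e.g. '17-20;33-36:8:', '3.4:5:' or ':9:608'."""
    # Assumed layout: input_spec:gesmes_position:default_value
    spec, position, default = line.strip().split(":", 2)
    occurrences = [
        [parse_span(part) for part in occurrence.split(".")]
        for occurrence in spec.split(";")
        if occurrence  # an empty spec means "no input field, default value only"
    ]
    return MappingEntry(occurrences, int(position), default or None)


if __name__ == "__main__":
    for example in ("17-20;33-36;49-52;65-68:8:", "3.4:5:", ":9:608"):
        print(parse_mapping_line(example))
```

The first line of the .map file (time nomenclature, date format and interface index) is deliberately left out of this sketch, since its exact layout is only shown by example.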


APPENDIX F: Configuration files

Dataset configuration file

These files are located in Codelist/Domain/genedi_Dataset.ini. An example of the structure of such a file is:

#
# updated by confManager.pm
# date : 12/01/2005
#
#
# csv parameters
#
field_separator = ";"
decimal_separator = "."
#
# contact parameters
#
contact_function = ""
contact_name = ""
contact_id = ""
com_channel = ""
com_number = ""
#
# transmission parameters
#
compress_mail = "yes" #compress automatically output files
type_of_channel = "Statel" #use of Mail, Statel or Disk space to send data
recipient_identifier = "4D0" #code that identifies the recipient (usually code for Eurostat)
recipient_email_address = "[email protected]" #email of the recipient
sender_email_address = "" #email of the sender
smtp_server = "" #IP address of the sender
disk_space = "" #Destination folder for Disk space channel
#
# other parameters
#


gesmes_factorization = "" #control the GESMES factorisation option
checkDuplicateRecord = "yes" #use to check duplicated record
validation_boolean = "yes" #control the validation option
optional_values = "" #definition of values in the record that will be handled as optional. More than one value separated by comma.
Message_ref_num = "MREF000001"
ges_X_boolean = "no" #indicates if the date has GESMES format for the GESMES to CSV process
encrypt_boolean = "no" #indicates if the file to be sent will be encrypted
public_key = "" #the public key of the recipient for the encryption

The # symbol means that the line is a comment. The first four lines give the time of creation of the file. The other lines follow the format parameter = "value". Each parameter value is updated by the graphic interface. It is possible to change any of these parameter values manually, but no checking is done on such changes, which could therefore lead to errors.
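Because manual changes to this file are not checked, it can be useful to read it back before running GENEDI. The following Python sketch reads the parameter = "value" lines into a dictionary. It assumes only what the example above shows: comment lines start with #, values are enclosed in double quotes, and anything after the closing quote is a comment. The function name parse_genedi_ini is hypothetical and is not part of GENEDI.

```python
import re
from typing import Dict

# parameter = "value", optionally followed by a trailing "# comment"
_PARAM_RE = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*"([^"]*)"')


def parse_genedi_ini(text: str) -> Dict[str, str]:
    """Return {parameter: value} for every parameter line found in text."""
    params: Dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or pure comment line
        match = _PARAM_RE.match(stripped)
        if match:
            params[match.group(1)] = match.group(2)
    return params


if __name__ == "__main__":
    sample = '''#
# csv parameters
#
field_separator = ";"
decimal_separator = "."
compress_mail = "yes" #compress automatically output files
'''
    print(parse_genedi_ini(sample))
```

A regular expression is used here rather than Python's configparser module because the file has no section headers, which configparser requires by default.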


Genedi setup ini file

The file “genedisetup.ini” is created during the installation process and contains parameter values relevant to:

- The user-defined input and output folders (used only for 1.6.* and earlier versions, otherwise they should have their default values in the installation directory)
- The network paths (only used for a network installation)

This is an example of genedisetup.ini:

#
# user-defined paths
#
Import_PreIntray = "0_PreIntray" # path where GENEDI must import non-compliant input files
Import_Intray = "1_Intray" # path where GENEDI must import compliant input files
Export_Gesmes = "3_Gesmes" # path where GENEDI must copy output files
#
# paths for network use
#
inputfilepath = "" # Only used for Network installation
absolutepath = "" # Only used for Network installation

Setup ini file

The file “setup.ini” is created during the installation process and contains parameter values relevant to:

- The version
- The STATEL Nick Name
- The country and the Institution

ver = "2.1"
allow_sm = "yes" # Boolean to authorise the sending module
snn = "eurostat-1" # Statel default address
#
# country parameters
#
sender_identifier = "FR1" # code that identifies the sender
country_code = "FR" # country code
country_code2 = "INSEE" # name of the institution
country_name = "France" # full country name
gnupg_location = "" # folder containing the gpg executable


APPENDIX G: Validation file

When validation rules are set or modified through the concept configuration screen, a file called [DATASET_ID]_validation.txt is created in the folder codelists/[STAT_DOMAIN]. The aim of this section is to describe the structure of such a file. In some specific cases (use of generic advanced validation rules), it will be necessary to modify the content of these files with a text editor.

File Structure

The validation rules of the dataset EXTRA-M of Foreign trade are used as an example. When you have set the validation rules using the graphic interface, the following file is created:

##value_conf
belongTo,28,R,O,CL_CONFIDENTIALITY
##END
##container
belongTo,11,R,M,CL_CONTAINER
##END
##delivery
belongTo,15,R,O,CL_DELIVERY
##END
##oth_part_conf
belongTo,20,R,O,CL_PART_CONF
##END
##confidentiality
belongTo,16,R,M,CL_CONFIDENTIALITY
##END
##transaction
belongTo,14,R,O,CL_TRANSACTION
##END
##part_country
belongTo,6,W,M,CL_AREA_GEO
##END
##net_conf
belongTo,30,R,O,CL_CONFIDENTIALITY
##END


##pref
belongTo,9,R,M,CL_PREFERENCE
##END
##invoic_conf
belongTo,29,R,O,CL_CONFIDENTIALITY
##END
##freq
belongTo,1,R,M,CL_FREQ
##END
##stat_proc
belongTo,8,R,M,CL_STAT_PROC
##END
##ref_country
belongTo,2,R,M,CL_AREA_GEO
##END
##front_trans
belongTo,10,R,M,CL_TRANSPORT
##END
##trans_nation
belongTo,12,W,M,CL_AREA_GEO
##END
##flow_code
belongTo,4,R,M,CL_FLOW
##END
##int_trans
belongTo,13,R,O,CL_TRANSPORT
##END
##oth_part_country
belongTo,6,W,M,CL_AREA_GEO
##END
##suppl_conf
belongTo,31,R,O,CL_CONFIDENTIALITY
##END


##part_conf
belongTo,19,R,O,CL_PART_CONF
##END

Each rule is written between the following markers:

##statistical concept name
rules
##END

with the following convention:

- statistical concept name corresponds to the name found in FT_dataset_structure.cod
- rules corresponds to the name of the rule (listed below)

You have to strictly respect the grammar of each rule, which is described below for each possible rule:

belongTo, position of the field in the record, error level (R or W), value presence (Optional or Mandatory), code list name
This rule checks if the value at the given position is in the given code list.

targetValue, position of the statistical concept in the record, error level, value presence, target value
This rule checks if the value at the given position is equal to the given target value.

isAlphabetic, position of the statistical concept in the record, error level, value presence, minimum number of characters, maximum number of characters
This rule checks if the value at the given position is an alphabetic value and optionally checks if the number of characters is in the given range.

isAlphanumeric, position of the statistical concept in the record, error level, value presence, minimum number of characters, maximum number of characters
This rule checks if the value at the given position is an alphabetic or a numeric value and optionally checks if the number of characters is in the given range.

isNumeric, position of the statistical concept in the record, error level, value presence, minimum number of characters, maximum number of characters
This rule checks if the value at the given position is a numeric value, and optionally checks if the number of characters is in the given range and if the value is in a given range.

maxValue, position of the statistical concept in the record, error level, value presence, maximum value
This rule checks if the numeric value at the given position is lower than the given value.


minValue, position of the statistical concept in the record, error level, value presence, minimum value
This rule checks if the numeric value at the given position is greater than the given value.

Compare, position of the statistical concept in the record, error level, value presence, comparison operator, coefficient, second field number, offset, Boolean to generate an error or not if the second field is empty
This rule checks if the numeric value at the first given position is lower than, greater than or equal to the value of the second field multiplied by the given coefficient, plus the offset.
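The block structure and the belongTo grammar described above can be sketched in Python as follows. This is an illustration, not GENEDI's implementation. It assumes that the ##concept marker, each rule and ##END each sit on their own line, that field positions are 1-based, and that an empty value raises an error only when the value presence is M. The names parse_validation_file and check_belong_to are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set


def parse_validation_file(text: str) -> Dict[str, List[List[str]]]:
    """Return {concept name: [rule fields, ...]} from the ##name ... ##END blocks."""
    rules: Dict[str, List[List[str]]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "##END":
            current = None
        elif line.startswith("##"):
            current = line[2:]
            rules.setdefault(current, [])
        elif current is not None:
            rules[current].append([field.strip() for field in line.split(",")])
    return rules


def check_belong_to(
    rule: Sequence[str],
    record: Sequence[str],
    code_lists: Dict[str, Set[str]],
) -> Optional[str]:
    """Apply one belongTo rule; return the error level ('R' or 'W') or None."""
    _name, position, level, presence, code_list = rule
    value = record[int(position) - 1]  # assumption: positions are 1-based
    if value == "":
        # assumption: an empty value is an error only for mandatory fields
        return level if presence == "M" else None
    return None if value in code_lists[code_list] else level


if __name__ == "__main__":
    text = "##freq\nbelongTo,1,R,M,CL_FREQ\n##END\n"
    parsed = parse_validation_file(text)
    lists = {"CL_FREQ": {"M", "Q", "A"}}  # illustrative code list content
    print(check_belong_to(parsed["freq"][0], ["M", "FR"], lists))
    print(check_belong_to(parsed["freq"][0], ["X", "FR"], lists))
```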


Introduction to the generic advanced validation rules

To give more flexibility to the validation process, GENEDI allows generic advanced validation rules to be parameterised. This paragraph describes the procedure to include these rules in the current validation files. Note that the generic advanced validation rules should be inserted AFTER the creation of the validation files through the graphic interface.

Empty fields are handled in the same way in all generic rules. When a field is mandatory (Status = ‘M’), an error is always raised if it has an empty value. If the field is optional (Status = ‘O’), the rule is not applied to the current record, except in the special case where a target value is used for this field, in which case the rule is executed for verification. When a field is compared directly for equality or inequality with a target value, the empty value is allowed for this field, because the rules' target values can take the value “” or “empty”. These checks are made for all the fields that are involved in a generic rule.

GENEDI offers the following generic advanced validation functions, each performing a different type of control:

Generic rule n°1

Syntax:
generic_rule_1,P1,Op1,Tg1,P2,Op2,Tg2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Op1 = comparison operator (==,<,>,>=,<=,!=)
Tg1 = target value to compare with the value of P1 (to test for an empty value, leave Tg1 or Tg2 empty, with the commas following one another)
P2 = position of the second field in the GESMES dataset structure
Op2 = comparison operator (==,<,>,>=,<=,!=)
Tg2 = target value to compare with the value of P2
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_1,9,==,2,14,<=,0,The tonnage of freight and mail should be greater than zero for type of service 'freight and mail',M,W

test performed:
IF (value of field n°9)==2 THEN
  IF (value of field n°14)<=0 THEN
    Display the Warning message:


    The tonnage of freight and mail should be greater than zero for type of service 'freight and mail'
  ENDIF
ENDIF

Generic rule n°2

Syntax:
generic_rule_2,P1,Op,P2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Op = comparison operator (==,<,>,>=,<=,!=)
P2 = position of the second field in the GESMES dataset structure
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_2,17,<,12,The technical capacity of the aircraft should be higher or equal to the number of passengers,O,W

test performed:
IF (value of field n°17) < (value of field n°12) THEN
  Display the Warning message:
  The technical capacity of the aircraft should be higher or equal to the number of passengers
ENDIF
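The tests performed by generic rules n°1 and n°2 can be sketched in Python as shown below. This is a simplified illustration and not GENEDI's code: it assumes 1-based field positions, compares numerically when both sides are numbers and as strings otherwise, and leaves out the Mandatory/Optional handling of empty fields. All function names are hypothetical.

```python
import operator
from typing import List, Optional, Sequence, Tuple

OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def compare(value: str, op: str, target: str) -> bool:
    """Compare numerically when both sides are numbers, else as strings."""
    if target == "empty":
        target = ""  # the guide allows "" or "empty" as the empty target
    try:
        left, right = float(value), float(target)
    except ValueError:
        left, right = value, target
    return OPERATORS[op](left, right)


def split_rule(line: str, n_params: int) -> Tuple[List[str], str, str, str]:
    """Split a rule line into (params, message, status, level).

    The message is rebuilt from the middle fields, so it may contain commas.
    """
    parts = line.split(",")
    params = parts[1:1 + n_params]
    message = ",".join(parts[1 + n_params:-2])
    return params, message, parts[-2], parts[-1]


def generic_rule_1(line: str, record: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return (level, message) when both target conditions hold, else None."""
    (p1, op1, tg1, p2, op2, tg2), message, _status, level = split_rule(line, 6)
    if compare(record[int(p1) - 1], op1, tg1) and compare(record[int(p2) - 1], op2, tg2):
        return level, message
    return None


def generic_rule_2(line: str, record: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return (level, message) when field P1 Op field P2 holds, else None."""
    (p1, op, p2), message, _status, level = split_rule(line, 3)
    if compare(record[int(p1) - 1], op, record[int(p2) - 1]):
        return level, message
    return None
```

With the n°1 example from this appendix, a record whose field 9 is 2 and whose field 14 is 0 yields the warning, while a positive value in field 14 yields nothing.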


Generic rule n°3

Syntax:
generic_rule_3, P1,Op1,Tg1,P2,Op2,Tg2,Op3,Tg3,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Op1 = comparison operator (==,<,>,>=,<=,!=)
Tg1 = target value to compare with the value of P1
P2 = position of the second field in the GESMES dataset structure
Op2 = comparison operator (==,<,>,>=,<=,!=)
Tg2 = target value to compare with the value of P2
Op3 = comparison operator (==,<,>,>=,<=,!=)
Tg3 = second target value to compare with the value of P2
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_3,9,==,2,17,!=,,>,0,The technical capacity of the aircraft should be equal to zero or empty in case of type of service freight and mail,O,W

test performed:
IF (value of field n°9)==2 THEN
  IF ((value of field n°17)!="" OR (value of field n°17)>0) THEN
    Display the Warning message:
    The technical capacity of the aircraft should be equal to zero or empty in case of type of service freight and mail
  ENDIF
ENDIF

Generic rule n°4

Syntax:
generic_rule_4, P1,Op1,Tg1,P2,Tg2,P3,Op3,Tg3,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Op1 = comparison operator (==,<,>,>=,<=,!=)
Tg1 = target value to compare with the value of P1
P2 = position of a second field in the GESMES dataset structure
Tg2 = target value to compare with the value of P2
P3 = position of a third field in the GESMES dataset structure
Op3 = comparison operator (==,<,>,>=,<=,!=)
Tg3 = second target value to compare with the value of P3
MG = text describing the error message


Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_4,15,==,empty,2,A8,11,==,1,TEUITU shall be present in table A8,M,R

test performed:
IF [(value of field n°15) is empty] AND [(value of field n°2)==A8] AND [(value of field n°11)==1] THEN
  Display the Error message:
  TEUITU shall be present in table A8
ENDIF

Generic rule n°5

Syntax:
generic_rule_5, P1, P2, P3,CODELIST,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
P2 = position of a second field in the GESMES dataset structure
P3 = position of a third field in the GESMES dataset structure
CODELIST = name of the code list containing complementary codes for field P3
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_5,17,15,11,aircraft,The number of passenger seats available divided by the number of flights should be lower or equal to the maximum aircraft configuration OR higher or equal to the minimum aircraft configuration,O,W

test performed:
IF [(value of field n°17) / (value of field n°15) > VALUE_1(n°11)] OR [(value of field n°17) / (value of field n°15) < VALUE_2(n°11)] THEN
  Display the Warning message:
  The number of passenger seats available divided by the number of flights should be lower or equal to the maximum aircraft configuration OR higher or equal to the minimum aircraft configuration
ENDIF

VALUE_1(n°i) and VALUE_2(n°i) are taken respectively from the second and the third column of the code list whose name is given by the parameter CODELIST. For example, a line of such a code list could look like:


B747;100;12

In that case VALUE_1 is 100 and VALUE_2 is 12. The position of the third field is used by GENEDI to get the correct code in CODELIST. To use extra values in a code list as in this example, you have to create a file called [STAT_DOMAIN]_assoc_codelist.cod in the folder Codelists\[STAT_DOMAIN], and write the following line:

CODELIST_NAME; complementary_2

where CODELIST_NAME is the name of the code list. For the current example:

aircraft;complementary_2

See Appendix Codelists for a complete description of the different code list formats.

Generic rule n°6

Syntax:
generic_rule_6, P1, P2,CODELIST_1, CODELIST_2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
P2 = position of a second field in the GESMES dataset structure
CODELIST_1 = name of the code list containing the codes for field P1
CODELIST_2 = name of the code list containing associated codes for field P2
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

test performed:
IF (value of field n°P1) is NOT in code list CODELIST_1 THEN
  Return without a message: the rule cannot be applied.
ELSE
  Extract the two first characters of (value of field n°P1)
  and compare them with the associated value linked to the (value of field n°P2) in code list CODELIST_2.
  If they are not equal, display error message MG with level Level.
ENDIF

See Appendix Codelists for a complete description of the different code list formats.

Example:
generic_rule_6,5,2,airport,reporting_country_and_prefix_reporting_airports,The reporting airport code is not consistent with the reporting country code,O,W
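As a rough illustration of generic rule n°6, the prefix check can be written in Python as below. This is only a sketch under stated assumptions: the first code list is modelled as a set of codes, the second as a dictionary mapping a code to its associated value, and field positions are 1-based. The code list contents in the usage example are made up for illustration, and the function name is hypothetical.

```python
from typing import Dict, Optional, Sequence, Set, Tuple


def generic_rule_6(
    record: Sequence[str],
    p1: int,
    p2: int,
    codelist_1: Set[str],
    codelist_2: Dict[str, str],
    message: str,
    level: str,
) -> Optional[Tuple[str, str]]:
    """Return (level, message) when the prefix of field P1 does not match
    the value associated with field P2, else None."""
    value_1 = record[p1 - 1]  # assumption: positions are 1-based
    value_2 = record[p2 - 1]
    if value_1 not in codelist_1:
        return None  # the rule cannot be applied, no message
    if value_1[:2] != codelist_2.get(value_2):
        return level, message
    return None


if __name__ == "__main__":
    airports = {"LFPG", "EDDF"}            # illustrative content only
    prefixes = {"FR": "LF", "DE": "ED"}    # illustrative content only
    msg = "The reporting airport code is not consistent with the reporting country code"
    print(generic_rule_6(["", "FR", "", "", "LFPG"], 5, 2, airports, prefixes, msg, "W"))
    print(generic_rule_6(["", "FR", "", "", "EDDF"], 5, 2, airports, prefixes, msg, "W"))
```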


Generic rule n°7

Syntax:
generic_rule_7, P1, Op1,Tg1,P2,Tg2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Op1 = comparison operator (==,<,>,>=,<=,!=)
Tg1 = target value to compare with the value of P1
P2 = position of a second field in the GESMES dataset structure
Tg2 = target value to compare with the value of P2
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_7,13,!=,empty,2,A7|A8|A9,TKM shall be empty in all tables but A7.A8.A9,M,R

test performed:
IF [(value of field n°13) is not empty] AND [(value of field n°2) is in (A7,A8,A9)] THEN
  Display Error Message:
  TKM shall be empty in all tables but A7.A8.A9
ENDIF

Generic rule n°8

Syntax:
generic_rule_8, P1, CODELIST,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
CODELIST = name of the code list containing codes for field P1
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_8,14,CL_COUNTRY,J0334: Country of unloading is not recognized,M,R

test performed:
IF [Two first characters of (value of field n°P1) is NOT in CODELIST] THEN
  Display Error Message MG with level Level
ENDIF

Generic rule n°9

Syntax:


generic_rule_9, P1, P2,P3,CODELIST_1,CODELIST_2,MG_1,MG_2,Status,Level_1, Level _2

P1 = position of the reference field in the GESMES dataset structure
P2 = position of a second field in the GESMES dataset structure
P3 = position of a third field in the GESMES dataset structure
CODELIST_1 = name of the code list containing codes for field P1
CODELIST_2 = name of the code list containing codes for field P2
MG_1 = text describing the error message for field P1
MG_2 = text describing the error message for field P2
Status = M or O, for (M)andatory or (O)ptional reference field
Level_1 = R or W, Error Level for (R)ejection or (W)arning for field P1
Level_2 = R or W, Error Level for (R)ejection or (W)arning for field P2

test performed:
IF [Two first characters of (value of field n°P1) == Two first characters of (value of field n°P1) == (value of field n°P3)] THEN
  IF [(value of field n°P1) is NOT in CODELIST_1] THEN
    Display Error Message MG_1 with level Level_1
  ENDIF
ELSE
  IF [Two first characters of (value of field n°P1) is NOT in CODELIST_2] THEN
    Display Error Message MG_2 with level Level_2
  ELSE OK
  IF [(value of field n°P1) is NOT in CODELIST_1] THEN
    Display Error Message MG_2 with level Level_2
  ELSE OK
  ENDIF
ENDIF

Example:
generic_rule_9,14,13,1,CL_NUTS3,CL_COUNTRY,J0337: Region of Unloading must be present for National Transport,J0338: Region of Unloading is not recognized,M,R,W

Generic rule n°10, 10b and 10c

Syntax:
generic_rule_10, P1,Op1,Op2,P2,P3,COEF,P4,Op4,Tg4,MG,Status,Level
generic_rule_10b, P1,Op1,P2,P3,COEF,P4,Op4,Tg4,MG,Status,Level
generic_rule_10c, P1,Op1,P3,COEF,P4,Op4,Tg4,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
P2 = position of a second field in the GESMES dataset structure
P3 = position of a third field in the GESMES dataset structure
P4 = position of a fourth field in the GESMES dataset structure


COEF = numeric value
Op(i) = comparison operator (==,<,>,>=,<=,!=)
Tg4 = target value to compare with the value of P4
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example for n°10:
generic_rule_10,16,<,>,12,15,0.1,11,==,1,J0345: Incorrect Ton Kilometers for single transport operations,M,W

test performed for n°10:
IF [(value of field n°P4) Op4 Tg4] AND
   {[(value of field n°P1) Op1 ((value of field n°P2) - 0.5) x (value of field n°P3) x COEF] OR
    [(value of field n°P1) Op2 ((value of field n°P2) + 0.5) x (value of field n°P3) x COEF]}
THEN
  Display error Message MG with level Level
ENDIF

Example:
IF [(value of field n°11) == 1] AND
   {[(value of field n°16) < ((value of field n°12) - 0.5) x (value of field n°15) x 0.1] OR
    [(value of field n°16) > ((value of field n°12) + 0.5) x (value of field n°15) x 0.1]}
THEN
  Display Warning Message:
  J0345: Incorrect Ton Kilometers for single transport operations
ENDIF

Example for n°10b:
generic_rule_10b,16,<=,12,15,0.1,11,==,2,J0346: Incorrect Ton Kilometers for several transport operations,M,W

test performed for n°10b:
IF [(value of field n°P4) Op4 Tg4] AND
   [(value of field n°P1) Op1 (value of field n°P2) x (value of field n°P3) x COEF]
THEN
  Display error Message MG with level Level
ENDIF
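The arithmetic behind rules n°10 and n°10b can be checked with a short Python sketch. It follows the tests as written in this appendix, with Op1 applied to the lower bound and Op2 to the upper bound, as in the n°10 example. It assumes 1-based positions and purely numeric field values; the function names and the make_record helper are hypothetical and are not part of GENEDI.

```python
import operator
from typing import Dict, List, Sequence

OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def num(record: Sequence[str], position: int) -> float:
    """Numeric value of the field at a 1-based position."""
    return float(record[position - 1])


def generic_rule_10(record, p1, op1, op2, p2, p3, coef, p4, op4, tg4) -> bool:
    """True when the error message of rule n°10 should be displayed."""
    if not OPERATORS[op4](num(record, p4), tg4):
        return False
    lower = (num(record, p2) - 0.5) * num(record, p3) * coef
    upper = (num(record, p2) + 0.5) * num(record, p3) * coef
    value = num(record, p1)
    return OPERATORS[op1](value, lower) or OPERATORS[op2](value, upper)


def generic_rule_10b(record, p1, op1, p2, p3, coef, p4, op4, tg4) -> bool:
    """True when the error message of rule n°10b should be displayed."""
    if not OPERATORS[op4](num(record, p4), tg4):
        return False
    return OPERATORS[op1](num(record, p1), num(record, p2) * num(record, p3) * coef)


def make_record(values: Dict[int, str], size: int = 16) -> List[str]:
    """Build a record of `size` fields from {1-based position: value}."""
    record = ["0"] * size
    for position, value in values.items():
        record[position - 1] = value
    return record


if __name__ == "__main__":
    # generic_rule_10,16,<,>,12,15,0.1,11,==,1,...
    rec = make_record({11: "1", 12: "10", 15: "200", 16: "150"})
    print(generic_rule_10(rec, 16, "<", ">", 12, 15, 0.1, 11, "==", 1))
```

In this usage example the accepted interval is roughly 190 to 210 (10 plus or minus 0.5, times 200, times 0.1), so a value of 150 in field 16 triggers the warning.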


Example:
IF [(value of field n°11) == 2] AND
   [(value of field n°16) <= (value of field n°12) x (value of field n°15) x 0.1]
THEN
  Display Warning Message:
  J0346: Incorrect Ton Kilometers for several transport operations
ENDIF

Example for n°10c:
generic_rule_10c,12,>,9,1.3,11,!=,,J0328: Weight exceeds Journey Load Capacity,M,W

test performed for n°10c:
IF [(value of field n°P4) Op4 Tg4] AND
   [(value of field n°P1) Op1 (value of field n°P3) x COEF]
THEN
  Display error Message MG with level Level
ENDIF

Example:
IF [(value of field n°11) != ""] AND
   [(value of field n°12) > (value of field n°9) x 1.3]
THEN
  Display Warning Message:
  J0328: Weight exceeds Journey Load Capacity
ENDIF

Generic rule n°11

Syntax:
generic_rule_11, P1,CODELIST,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
CODELIST = name of the list of codes with associated codes
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_11,4,country_codes_and_UNlocodes_prefix,the first two positions of the reporting port are not compatible with the reporting country,M,R

test performed:
IF [(the two first characters of value of field n°P1) are not in the list of codes associated to the COUNTRY ISO Code] THEN


  Display error Message MG with level Level
ENDIF

Generic rule n°12

Syntax:
generic_rule_12, P1,Start,End,Op,Tg,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Start = starting position in the string (starting at 0), integer value, can be a negative value if starting from the end of the string
End = ending position in the string (starting at 0), integer value
Op = comparison operator (==,<,>,>=,<=,!=)
Tg = target value to compare with the sub string of P1 between position Start and End
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_12,8,-1,1,==,0,The last position of the type of cargo should be different from zero,M,W

test performed:
IF [(sub string of value of field n°P1 between position Start and End) Op Tg] THEN
  Display error Message MG with level Level
ENDIF

Example:
IF [(the last character of value of field n°8) == 0] THEN
  Display Warning Message:
  The last position of the type of cargo should be different from zero
ENDIF

Generic rule n°13

Syntax:
generic_rule_13,P1,P2,CODELIST_1,CODELIST_2,Op,Tg,MG1,MG2,Status,Level

P1 = position of the reference field in the GESMES dataset structure
P2 = position of a second field in the GESMES dataset structure
CODELIST_1 = name of the first code list
CODELIST_2 = name of the second code list
Op = comparison operator (==,<,>,>=,<=,!=)
Tg = target value to compare with the value of P2
MG1 = text describing the first error message


MG2 = text describing the second error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_13,7,6,mca_eurostat_partners,extended_unlocode,==,ZZ888,The port of loading/unloading doesn't exist in the UNlocode list,W,R

test performed:
IF [(value of field n°P2) is empty] AND [(value of field n°P1) is in CODELIST_1] THEN
  Display error Message MG1 with level Level
ELSE IF [(value of field n°P2) is not equal to the code associated to (value of field n°P1) in CODELIST_2] THEN
  IF [(value of field n°P1) is in CODELIST_1] OR [(value of field n°P2) Op Tg] THEN
    Display error Message MG2 with level Level
  ENDIF
ENDIF

Generic rule n°14

Syntax:
generic_rule_14,P1,CODELIST_1,CODELIST_2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
CODELIST_1 = name of the first code list
CODELIST_2 = name of the second code list
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_14,18,CL_COUNTRY,CL_NUTS3,J0349-J0352: Country or Region of loading is not recognized,O,W

test performed:
IF [(value of field n°P1) is NOT in CODELIST_1] OR [(value of field n°P1) is NOT in CODELIST_2] THEN
  Display error Message MG with level Level
ENDIF

Generic rule n°15

Syntax:


generic_rule_15,P1,CODELIST,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
CODELIST = name of the first code list
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

Example:
generic_rule_15,17,CL_COUNTRY, Transit Country not recognized,O,W

test performed:
FOR EACH two-character code in the (value of field n°P1)
  IF it is NOT in CODELIST THEN
    Display error Message MG with level Level
  ENDIF
END FOR EACH

Example: If P1 is "BENOLU", the list of two-character codes is "BE","NO","LU", and each of these codes should be in the CL_COUNTRY codelist.

Generic rule n°16

Syntax:
generic_rule_16, P1, P2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
P2 = position of the second field in the GESMES dataset structure
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

test performed:
IF NOT ( first character of (value of field n°P1) in (R,D,N)
         AND (“19”/”20”) 2nd 3rd characters of (value of field n°P1) <= (value of field n°P2)
         AND length of (value of field n°P1) == 3 ) THEN
  Display error message MG with level Level
ENDIF

Example:
generic_rule_16,6,2,Wrong code for Population or inconsistent Year,M,R

Note:


The comparison refers to years. The first field contains only the last two digits of the year (the century information is missing), so the digits “19” or “20” are added before the comparison. If the two-digit value is greater than or equal to 70, “19” is used; otherwise “20” is used.

Generic rule n°17

Syntax:
generic_rule_17, P1,Tg1,P2,P3,Tg2,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Tg1 = target value for P1
P2 = position of the second field in the GESMES dataset structure
P3 = position of the third field in the GESMES dataset structure
Tg2 = target value regarding year difference of P3,P2
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

test performed:
IF ( (value of field n°P1) == Tg1 ) THEN
  IF NOT ( (value of field n°P3) – (“19”/”20”) 2nd 3rd characters of (value of field n°P2) == Tg2
           AND length of (value of field n°P1) == 3 ) THEN
    Display error message MG with level Level
  ENDIF
ENDIF

Example:
generic_rule_17,7,16197,6,2,1,Reference year does not conform with Population year,M,R

Note:
The comparison refers to years. The second field contains only the last two digits of the year (the century information is missing), so the digits “19” or “20” are added before the comparison. If the two-digit value is greater than or equal to 70, “19” is used; otherwise “20” is used.

Generic rule n°18

Syntax:
generic_rule_18, P1,Start,Length,Inc,List,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Start = the index in the string at which the substring starts (negative values count from the end)
Length = the number of characters to keep (if omitted, the string is taken until its end)


Inc = 0 or 1. If 1, the message is displayed when the value is included in the list; if 0, it is displayed when the value is not included
List = list of values separated by '|'
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

test performed:
IF ( substr((value of field n°P1),Start,Length) included in List) THEN
  Display error message MG with level Level
ENDIF

This test applies when Inc is 1; otherwise the error is displayed if the value is not included in the list.

Example:
generic_rule_18,8,0,1,0,1|2|9,The type of cargo should be codes starting only with 1 2 9,M,R

Generic rule n°19

Syntax:
generic_rule_19, P1,Op,Tg,P2,Inc,List,MG,Status,Level

P1 = position of the reference field in the GESMES dataset structure
Op = comparison operator (==,<,>,>=,<=,!=)
Tg = target value
P2 = position of the second field in the GESMES dataset structure
Inc = 0 or 1. If 1, the message is displayed when the value is included in the list; if 0, it is displayed when the value is not included
List = list of values separated by '|'. It accepts the * wildcard and also accepts exceptions at the end, introduced by the ‘|!’ separator. Example: a|b|c*|d\*|sp*li*|!c1|c2 is equivalent to a list with: a, b, all strings starting with c, d*, sp(anything)li(anything), except c1 and c2
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning

test performed:
IF ( ((value of field n°P1) Op Tg) AND (value of field n°P2) included in List) THEN
  Display error message MG with level Level
ENDIF

This test applies when Inc is 1; otherwise the error is displayed if the value is not included in the list.

Example:
generic_rule_19,14,==,,8,1,3*|5*|6*|!52|53,It is mandatory for all types of unit cargo (3_ to 6_) except 52 and 53,O,R

Generic rule n°20

Syntax:


generic_rule_20, P1, P2,Codelist,MG,Status,Level
P1 = position of the reference field in the GESMES dataset structure
P2 = position of the reference field in the GESMES dataset structure
Codelist = Codelist name
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning
test performed:
IF (value of field n°P1) is not included in the list of codes associated to (value of field n°P2) in Codelist THEN
Display error message MG with level Level
ENDIF
Example:
generic_rule_20,7,6,CL_EXTENDED_UNLOCODE,The Maritime Coastal Area is not consistent with the port of loading/unloading,M,R

Generic Rule°21
Syntax:
generic_rule_21, P1, Codelist1,Codelist2,MG,Status,Level
P1 = position of the reference field in the GESMES dataset structure
Codelist1 = The first codelist name
Codelist2 = The second codelist name
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning
test performed:
IF (value of field n°P1) belongs neither to Codelist1 nor to Codelist2 THEN
Display error message MG with level Level
ENDIF
Example:
generic_rule_21,2,CL_EU_COUNTRIES,CL_ACC_EU_COUNTRIES,The country should belong to EU members or EU acceding countries,M,R

Generic rule A
Syntax:
generic_rule_A, P1, Math1,Math2, MG,Status,Level


P1 = position of the reference field in the GESMES dataset structure
Math1 = first arithmetic expression (e.g. #1 <= (#2 *0.2) / #4)
Math2 = second arithmetic expression
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning
Example:
generic_rule_A,2,#2 <= (#4 + #5) *0.6, #1 == #2, Checking not OK,O,W
test performed:
IF expression Math1 is true THEN
IF expression Math2 is false THEN
Display error message MG with level Level
ENDIF
ENDIF
Remark: The arithmetic expressions can be any sequence of operators such as (<,>,=,!,+,/,*,-,(,)) and field positions. A field is represented by #n, where 'n' is its position number. Perl functions and expressions can also be used.

Generic rule C
Syntax:
generic_rule_C, P1, Math, MG,Status,Level
P1 = position of the reference field in the GESMES dataset structure
Math = arithmetic expression (e.g. #1 <= (#2 *0.2) / #4)
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning
Example:
generic_rule_C,#2 <= (#4 * #6), Checking not OK,O,W
test performed:
IF expression Math is false THEN
Display error message MG with level Level
ENDIF
Remark:


The arithmetic expressions can be any sequence of operators such as (<,>,=,!,+,/,*,-,(,)) and field positions. A field is represented by #n, where 'n' is its position number. Perl functions and expressions can also be used.

Generic rule D
Syntax:
generic_rule_D, P1, Math,Codelist,MG,Status,Level
P1 = position of the reference field in the GESMES dataset structure
Math = arithmetic expression (e.g. #1 <= (#2 *0.2) / #4)
Codelist = Codelist name
MG = text describing the error message
Status = M or O, for (M)andatory or (O)ptional reference field
Level = R or W, Error Level for (R)ejection or (W)arning
Example:
generic_rule_D,5,#2<2001,CL_SBS_ACTIVITY-ANNEX2-2B1,Wrong activity code,M,R
test performed:
IF expression Math is true AND (value of field n°P1) not in Codelist THEN
Display error message MG with level Level
ENDIF
Remark: The arithmetic expressions can be any sequence of operators such as (<,>,=,!,+,/,*,-,(,)) and field positions. A field is represented by #n, where 'n' is its position number. Perl functions and expressions can also be used.
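
Illustration: the test performed by generic rule D can be sketched as follows. This is only a sketch in Python, not GENEDI code: GENEDI evaluates the expressions in Perl, so Python's eval merely stands in for that evaluation and only operators common to both languages are covered. The function names substitute_fields and rule_d_fails, the sample record and the sample code list content are hypothetical.

```python
import re


def substitute_fields(expression, fields):
    """Replace each #n placeholder by the numeric value of field n (1-based)."""
    def field_value(match):
        position = int(match.group(1))
        return repr(float(fields[position - 1]))
    return re.sub(r"#(\d+)", field_value, expression)


def rule_d_fails(p1, math, codelist, fields):
    """Return True when the error message of rule D should be displayed."""
    # Assumption: eval without builtins stands in for GENEDI's Perl evaluation.
    expression = substitute_fields(math, fields)
    math_is_true = bool(eval(expression, {"__builtins__": {}}))
    return math_is_true and fields[p1 - 1] not in codelist


# Hypothetical record (fields 1 to 5) and hypothetical code list content.
record = ["FR", "1999", "A", "B", "1511"]
activity_codes = {"1511", "1512"}

# Field 2 is below 2001 and field 5 is in the code list, so no error is raised.
print(rule_d_fails(5, "#2<2001", activity_codes, record))
```

With the same record, a code list that does not contain the value of field 5 would make the function return True, which corresponds to displaying the message MG.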


APPENDIX H: Codelists
Code lists are provided with the GENEDI package for the domains that have been validated by Eurostat. This appendix describes the different types of code lists and how to change a code list format.

General format
Generally, the code lists in GENEDI are text files that contain one code per line and optional general comments preceded by the symbol "#". Optionally, a comment can be added after each code, separated by a semicolon.
Example:
# Codes for Aircraft in Aviation Domain
B738; Boeing 738
In this example, the first line is a general comment. In the second line, "B738" is the code, ";" is the separator and "Boeing 738" is the comment associated to the code.

Code list with hidden codes
Some domains need to hide some or all of the codes they send to Eurostat. GENEDI can hide codes just before creating the GESMES file, provided that the code list containing these codes has the following format:
# General comments
Code;optional comment;associated hidden code
Example 1: a code list line with a hidden code but without a comment:
B738;;8888
Note that the separators must always be present, even if no comment is given.
Example 2: a code list line with a hidden code and a comment:
B738;Boeing 738;8888

Code list with associated codes or complementary codes
In this case, the code list provides two codes: the reference code and the associated or complementary one. The second code is called "associated" or "complementary" because GENEDI uses it to perform advanced validation rules. The following example illustrates the use of associated codes in the Aviation domain.
Example 3: the code list for reporting country codes uses associated codes, since some countries have airports in overseas territories.
EB;EB; Belgium
EK;EK; Denmark


EF;EF; Finland
LF;LF; France Metropole
LF;SO; France Guiana
LF;TF; France Antilles
Only associated codes for reporting country are used in the data set, so when GENEDI checks a reporting country code, for instance "SO", it uses the code list above to get the main country code (LF). This value is then used for advanced controls. The last example also shows the format of code lists with associated codes:
# General comments
Code;associated code;optional comment
Complementary codes are used to give additional information about a code. For instance, the code list "aircraft" contains the reference code (e.g. B738) and two complementary codes (e.g. 100 and 350), which are in that case the minimum and the maximum capacity of the aircraft. The format is as follows:
# General comment
Code;complementary code 1;complementary code 2;optional comment
Important: To indicate to GENEDI that a code list contains associated or complementary codes, a configuration file called [STAT_DOMAIN]_assoc.cod must be present, where [STAT_DOMAIN] is the short name of the statistical domain (e.g. AV for Aviation). This text file must be located in sub folder GENEDI\codelists\[STAT_DOMAIN] and should contain the list of code lists using associated or complementary codes. The following example gives the format to use:
Code list name;associated
or
Code list name;complementary_1
or
Code list name;complementary_2
The words "associated", "complementary_1" and "complementary_2" are reserved words. complementary_1 means that the code list provides one value in addition to the code, and complementary_2 means that it provides two values in addition to the code.
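
Illustration: the lookup from an associated code to its main code, as described for the reporting country example, can be sketched as follows. This is a minimal sketch in Python and not GENEDI code; the function name load_associated_codes is hypothetical, and the sample lines are taken from the example above.

```python
def load_associated_codes(lines):
    """Map each associated code to its main code, e.g. "SO" -> "LF"."""
    main_by_associated = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and general comments
        # Format: Code;associated code;optional comment
        parts = [part.strip() for part in line.split(";")]
        code, associated = parts[0], parts[1]
        main_by_associated[associated] = code
    return main_by_associated


sample = [
    "# Reporting countries (excerpt)",
    "EB;EB; Belgium",
    "LF;LF; France Metropole",
    "LF;SO; France Guiana",
    "LF;TF; France Antilles",
]
main_codes = load_associated_codes(sample)
print(main_codes["SO"])  # prints LF
```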


APPENDIX I: Transcoding tables and Transcoding rules

Introduction
All the configuration files relevant to the TRANSCODE add-in are placed in the folders “tools\transcoding” and “tools\transcoding\[short name of the statistical domain]”. The latter contains the transcoding tables and transcoding rules. The transcoding process also creates a log file: you can check what happened by analysing the [Dataset ID].log file (located in the ‘1_Intray\archive’ folder). This file is only created if errors occurred during the transcoding.

Transcoding tables
A transcoding table is a structured flat file whose name is free and whose extension must be “.trc” (transcode). A transcoding table must be created for each field to be transcoded. It is recommended to give the table the same name as the field to process. Each line of this file has the following structure:
GESMES codes ; comment on GESMES codes ; User code ; comment on user code
For a given line of the transcoding table, the following forms are equivalent:
A ; annual data ; AN ; annual data (with all comments)
A ; annual data ; AN (with 1 comment)
A ; ; AN ; annual data (with 1 comment)
A;;AN (without comment)
Several user codes can be matched to one GESMES code, for example when two user codes are merged into one GESMES code. In that case, separate the user codes with a “,” in the transcoding line, as in this example:
A ; comment ; AN1,AN2,AN3 ; comment
This means that when the value AN1, AN2 or AN3 is met, it is replaced by A.
Note: The transcoding table must be created by hand, so be careful to use “;” as the separator and not to forget any separator (exactly two “;” between the two families of codes). Comments can contain any character except “;” and “,”. Once the table is created, save it in a file with extension “.trc” in folder “tools\transcoding\[statistical domain]”.
Special functionality: An advanced option in the transcoding table is the possibility to use a Perl function to create a more complex correspondence between user codes and GESMES codes. This option is intended for users with Perl knowledge, as the function must be coded in Perl.


The syntax is the following:
My_function($line) ; comment ; list of user codes separated by “,” ; comment
where My_function is the name of your function and $line must not be modified. You then have to define your function in the Perl package “\tools\transcoding\fonction.pm”.
Here is an example of file fonction.pm:
#!/usr/bin/perl
package tools::transcoding::fonctions;
require Exporter;
use strict;
use tools::portability;
use tools::conf;
use vars qw(@EXPORT @ISA);
@EXPORT = qw(exempleDeFonction);
@ISA = qw(Exporter);
# ----------------------------------------------------------------------
# You shouldn’t modify anything before this comment.
# You can create any function that gets a CSV line as an entry and that returns a code.
# If this function is used in <data set id>_rules.txt then it will replace the current code
# by the one returned by your function.
# ----------------------------------------------------------------------
sub My_function {
    # get the CSV line, this line shouldn’t be modified
    my $line = shift;
    # store each field’s value in array @temp
    my @temp = split(/;/,$line);
    # for example, prefix field number 5 with string “NS” (array @temp starts at index 0!)
    return( "NS".$temp[4]);
}
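
Illustration: the way a transcoding table maps user codes to GESMES codes can be sketched as follows. This is a minimal sketch in Python and not the GENEDI implementation; the function name parse_transcoding_table is hypothetical, and lines that call a Perl function (such as My_function($line)) are not handled.

```python
def parse_transcoding_table(lines):
    """Return a dict mapping each user code to its GESMES code."""
    gesmes_by_user_code = {}
    for raw in lines:
        if not raw.strip():
            continue  # ignore empty lines
        # Format: GESMES code ; comment ; user code(s) ; comment
        parts = [part.strip() for part in raw.split(";")]
        gesmes_code, user_codes = parts[0], parts[2]
        # Several user codes for one GESMES code are separated by ","
        for user_code in user_codes.split(","):
            gesmes_by_user_code[user_code.strip()] = gesmes_code
    return gesmes_by_user_code


table = parse_transcoding_table([
    "A ; annual data ; AN ; annual data",
    "A;;AN1,AN2,AN3",
])
print(table["AN2"])  # prints A
```

The sketch relies on the rule that exactly two “;” separate the GESMES code from the user codes, which is why the third field is always read as the list of user codes.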


Transcoding rules
The transcoding rules are text files that contain the list of transcoding tables to use for a given dataset, each followed by the field number in the dataset structure. These files are named “[dataset identifier]_rules.txt”, using the name of the dataset. Each line of these files should respect the following format:

File name ; field number

The left string is the name of the file containing the transcoding table; the right number is the position of the field in the GESMES Key family structure (see GESMES MIG). The following example illustrates this.

File STS_IND_ORD_Q_rules.txt contains the list of transcoding tables for dataset STS_IND_ORD_Q. The first line is:

OBS_CONF_RULE;11

so file OBS_CONF_RULE.trc contains the transcoding table to apply to field number 11 (here the concept observation confidentiality in the GESMES key family structure). The table for this concept looks like:

[Example table for OBS_CONF_RULE.trc]

The second line of this table means that value F means “public value” and must replace the values 0, 3 and 5.
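
Illustration: reading a rules file and applying the listed tables to one record can be sketched as follows. This is a minimal sketch in Python and not GENEDI code; the function names parse_rules and apply_rules and the sample record are hypothetical, and leaving unknown user codes unchanged is an assumption (GENEDI may report them in the log file instead).

```python
def parse_rules(lines):
    """Return (transcoding table file, field position) pairs from a rules file."""
    rules = []
    for raw in lines:
        if not raw.strip():
            continue  # ignore empty lines
        # Format: File name ; field number
        name, position = (part.strip() for part in raw.split(";"))
        rules.append((name + ".trc", int(position)))
    return rules


def apply_rules(record, rules, tables):
    """Replace the user code at each listed position by its GESMES code."""
    result = list(record)
    for table_file, position in rules:
        mapping = tables[table_file]
        value = result[position - 1]  # positions are 1-based
        # Assumption: a user code missing from the table is left unchanged.
        result[position - 1] = mapping.get(value, value)
    return result


rules = parse_rules(["OBS_CONF_RULE;11"])
# User values 0, 3 and 5 are replaced by F, as in the example above.
tables = {"OBS_CONF_RULE.trc": {"0": "F", "3": "F", "5": "F"}}
record = ["x"] * 10 + ["3"]  # hypothetical record with 11 fields
print(apply_rules(record, rules, tables)[10])  # prints F
```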


APPENDIX J: Hide codes configuration file
The concepts to be hidden in the GESMES file are saved in files of the form ‘codelists\DOMAIN\DATASET.hid’. The position in the GESMES structure of each concept to be hidden is stored in this file, one per line. If you do not want to use the graphical interface, you have to create the configuration file by hand according to the following steps:

• Create a text file called DATASET_ID.hid in which you write the list of the field numbers whose codes must be hidden, using one number per line. For example, to hide the codes of the fields at positions 2, 3 and 8 in the GESMES structure, the file should contain:

2
3
8

• Save this file in subfolder \codelists\STAT_DOMAIN\ of the GENEDI folder.
• Optionally, modify the code lists corresponding to the selected fields so that they contain at least two columns with the hidden codes.

APPENDIX K: Dataset Naming Convention
In order to harmonize the way data files are named among data producers, Eurostat created a set of rules for naming the files to process. This appendix describes these rules, but the user is invited to consult the original document on CIRCA for the latest version. The naming conventions described below will be used in all the Data Transmission Tools and Services.

1. DATASET NAMING CONVENTION:
Definition “Dataset”: concept of one or several statistical tables (having one single data structure), with a specific periodicity and to which a maximum delay for transmission is usually associated.
Field of application: this name will mainly be used internally by the eDAMIS kernel and also by GESMES.
Identification rule:
DATASET ID = (DOMAIN ID) + “_” + (DATASET STRUCTURE ID) + “_” + (PERIODICITY or PERIODICITIES)

Field: DOMAIN ID
Length: 1..8
Description/Remark: Identifies the statistical domain (group of datasets closely linked together)

Field: DATASET STRUCTURE ID
Length: 1..7
Description/Remark: Identifies the Dataset Structure (associated to one or several statistical tables).
• If no Dataset Structure ID is defined, the ‘periodicity id for data’ will be used as default value.
• If 2 positions are used for the ‘Periodicity or periodicities’ field, then the Dataset structure ID should be on 6 positions maximum.

Field: PERIODICITY or PERIODICITIES
Length: 1..2
Description/Remark: The first position identifies the periodicity of the data to be reported:
“A” for Annual
“2” to “9” for every 2 to 9 years
“S” for Semester
“Q” for Quarterly
“M” for Monthly
“W” for Weekly
“D” for Daily
“O” for Other (periodicity)
“N” for Non periodic (e.g. Sequential)
The second position (optional) identifies the periodicity of transmission (only visible if different from the periodicity of the data). It uses the same convention as the first position.

Composition constraints and limitations for fields:

• Only the uppercase characters from A to Z and the figures from 0 to 9 are allowed
• For “DOMAIN ID”, the first character must be alphabetic
• The “_” (underscore) character is only used as a field separator.

Based on the above, the theoretical maximum length for a dataset ID is 8+1+7+1+1 = 18 or 8+1+6+1+2 = 18 (in compliance with the maximum length defined in GESMES).

EXAMPLE1:
• Statistical domain: RAIL TRANSPORT
• Dataset structure: Annex “E” (includes tables E1 and E2)
• Periodicity for data and transmission: Quarterly (Q)
If “RAIL” is chosen as identification for the statistical domain, then the “dataset id” to be used is: RAIL_E_Q.

EXAMPLE2:
• Statistical domain: European System of Accounts
• Dataset structure: 0110
• Periodicity for data: Annual (A)
• Periodicity for transmission: Monthly (M)
If “ESA” is chosen as identification for the statistical domain, then the “dataset id” to be used is: ESA_0110_AM.

2. DATASET OCCURRENCE NAMING CONVENTION:
Definition “Dataset occurrence”: An occurrence of a dataset for 1 country for 1 period (or time series or sequence).
Field of application: This name will mainly be used by Eurostat data providers (end users or applications) to send data in “full EDI mode” to Eurostat (the file name will be sufficient for eDAMIS tools to identify the data to be transmitted without asking the user to build an envelope manually).
Identification rule:
DATASET OCCURRENCE ID = (DATASET ID) + “_” + (COUNTRY CODE) + “_” + (YEAR) + “_” + (PERIOD)


Field: DATASET ID
Length: See above
Description/Remark: See above

Field: COUNTRY CODE
Length: 2
Description/Remark: ISO country code should be used except for:
• GB replaced by UK,
• GR replaced by EL
• For international organisations (see Annex-1).

Field: YEAR
Length: 4
Description/Remark: Format: YYYY, identifies the reporting year of the data. If the dataset is non periodic, then “0000” or the reporting year.

Field: PERIOD
Length: 4
Description/Remark: Identifies the precise reporting period of the data:
• “0000” for Annual or for every 2 to 9 years,
• “0001” to “9999” for others.

Composition constraints and limitations for fields: If several years/periods are covered by a dataset occurrence (case of the transmission of time series), then specify only the last year/period.

EXAMPLE1: For Dataset “RAIL_E_Q” (see above)
• Country: United Kingdom (UK)
• Year: 2003
• Quarter: 2.
The “Dataset occurrence id” to be used is: RAIL_E_Q_UK_2003_0002

EXAMPLE2: For dataset “STSCONS_EARN_M”
• Country: Greece (EL)
• Year: 2002
• Month: 12.
The “Dataset occurrence id” to be used is: STSCONS_EARN_M_EL_2002_0012

EXAMPLE3: For Dataset “ESA_0109_A”
• Country: France (FR)
• Year: 2001.
The “Dataset occurrence id” to be used is: ESA_0109_A_FR_2001_0000

EXAMPLE4: Name with transmission frequency indication. For Dataset “ESA_0110_AM”
• Country: European Central Bank (4F)
• Year: 2001
• Month: 1.
The “Dataset occurrence id” to be used is: ESA_0110_AM_4F_2001_0001
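
Illustration: the two identification rules of this appendix can be sketched as follows. This is a minimal sketch in Python and not part of GENEDI or eDAMIS; the function names dataset_id and occurrence_id are hypothetical, the country code is not checked, and the default that applies when no Dataset Structure ID is defined is not covered.

```python
import re


def dataset_id(domain, structure, periodicity):
    """Build a dataset ID after checking the length and character constraints."""
    if not re.fullmatch(r"[A-Z][A-Z0-9]{0,7}", domain):
        raise ValueError("DOMAIN ID: 1 to 8 characters, the first one alphabetic")
    if not re.fullmatch(r"[A-Z0-9]{1,7}", structure):
        raise ValueError("DATASET STRUCTURE ID: 1 to 7 characters")
    if not re.fullmatch(r"[A2-9SQMWDON]{1,2}", periodicity):
        raise ValueError("PERIODICITY: 1 or 2 periodicity codes")
    if len(periodicity) == 2 and len(structure) > 6:
        raise ValueError("with 2 periodicity positions, the structure ID is 6 at most")
    return "_".join((domain, structure, periodicity))


def occurrence_id(ds_id, country, year, period):
    """Append country code, 4-digit year and 4-digit period to a dataset ID."""
    return f"{ds_id}_{country}_{year:04d}_{period:04d}"


print(occurrence_id(dataset_id("RAIL", "E", "Q"), "UK", 2003, 2))
# prints RAIL_E_Q_UK_2003_0002
```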

END OF DOCUMENT

