What are data sets?

  • Data sets contain data to be used in an ETLUnit test.  Like data files, they hold data, but the two differ in several ways.
    • A data file and a data set each live in a text file in the ETLUnit project.
    • A data file has a file extension of ".delimited" or ".fixed".
    • A data set has a file extension of ".dataset".
    • A data file resides in the folder "data/", a sibling of the ETLUnit test class.
    • A data set resides in the folder "dataset/", also a sibling of the ETLUnit test class.
    • One data set file (one "data set") can contain multiple data set members.
    • One data file stands alone.  It cannot contain multiple sets of data.
  • Each data set member is equivalent to a data file, and could be replaced by one in the ETLUnit test.
  • Properties that are normally written in a test method operation may instead be written as properties of a data set member.
    • When a test uses a data set member, ETLUnit applies that member's properties unless the operation explicitly disables this with the ignore-data-set-properties property.
  • TODO: Resurrect the links below (next two points) when they become available.
  • SEE stage() for Data Sets, on the stage page, for syntax.
  • SEE assert() for Data Sets, on the assert page, for syntax.

Location of data set files

  • Relative to the ETLUnit test class, data sets are placed in a sibling folder called "dataset/".
  • Contrast this location with that of data files, which are placed in a sibling folder called "data/".
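
The location and naming rules above can be sketched in a few lines of Python. This is an illustration only, not part of ETLUnit: the helper names `sibling_folder` and `classify` and the test class file name `my_test.etlunit` are hypothetical, and only the folder names and file extensions come from this page.

```python
from pathlib import Path

# Extensions and folder names as stated on this page.
DATA_FILE_EXTENSIONS = {".delimited", ".fixed"}
DATA_SET_EXTENSION = ".dataset"


def sibling_folder(test_class: Path, kind: str) -> Path:
    """Return the "dataset" or "data" folder that sits beside a test class file."""
    if kind not in ("dataset", "data"):
        raise ValueError(f"unknown folder kind: {kind!r}")
    return test_class.parent / kind


def classify(path: Path) -> str:
    """Classify a file as 'data set', 'data file', or 'other' by its extension."""
    if path.suffix == DATA_SET_EXTENSION:
        return "data set"
    if path.suffix in DATA_FILE_EXTENSIONS:
        return "data file"
    return "other"


# Hypothetical test class location, used only for illustration.
test_class = Path("project/tests/my_test.etlunit")
dataset_dir = sibling_folder(test_class, "dataset")
data_dir = sibling_folder(test_class, "data")
```

Here `dataset_dir` and `data_dir` both resolve to folders in the same directory as the test class, which is what "sibling" means in the bullets above.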

Dataset Folder Location Sample

In the sample below, the folder "dataset" is a sibling of the ETLUnit tests.

[Image: project folder tree showing the "dataset" folder as a sibling of the ETLUnit tests]

Format of data set files

  • Data set files may contain one data set member, or several.
  • Data set members are delimited within the text file by certain characters.
    • A data set member starts with a properties section, delimited by an open and close curly brace.
      • Properties may be written between these curly braces.
      • If multiple properties are present, they must be separated by commas.
      • Multiple properties may exist on the same line or on separate lines.
      • If no properties are present, empty curly braces are acceptable ("{}").
      • A data set member is not legal if it lacks a properties section.
      • The id property is the only way to assign an ID to a data set member.  An ETLUnit operation refers to that ID through its data-set-id property.
      • Every other property in a data set member's properties section must be a legal property of the assert or the stage operation.
      • Any property that is not recognized by those operations is illegal.
    • The data section of a data set member follows the properties section.
    • A data set member ends with a data section, delimited by this series of angle brackets:  "<><><><><><>".
      • The data itself is placed between two delimiter lines, each consisting of six pairs of angle brackets.
  • There is no special delimiter, like a comma, between data set members.
  • Line endings should be LF only (x'000A').
  • The last line in the file should be a data section delimiter, terminated by an LF line-ending character.
  • TODO: Restore the link for sample code on the next line.
  • See the sample code below.
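
As a rough illustration of the format rules above (separate from the page's own sample code), the following Python sketch splits data set text into its members. It is not ETLUnit's parser. It assumes that the properties section contains no nested or quoted braces, and that only whitespace separates the closing brace from the opening delimiter line. The property syntax in the sample text (`id: "orders"`) is hypothetical; the sketch keeps the properties as raw text and does not interpret them.

```python
DELIMITER = "<><><><><><>"  # six pairs of angle brackets, per the rules above


def parse_data_set(text: str) -> list[tuple[str, list[str]]]:
    """Split data set text into (properties_text, data_lines) members."""
    if "\r" in text:
        raise ValueError("line endings must be LF only")
    if not text.endswith(DELIMITER + "\n"):
        raise ValueError("file must end with a data section delimiter and LF")

    members = []
    pos = 0
    while text[pos:].strip():
        # Properties section: required, delimited by curly braces.
        start = text.find("{", pos)
        if start == -1 or text[pos:start].strip():
            raise ValueError("member must start with a properties section")
        end = text.find("}", start)
        if end == -1:
            raise ValueError("unterminated properties section")
        properties = text[start + 1:end].strip()

        # Opening delimiter line of the data section.
        first = text.find(DELIMITER + "\n", end)
        if first == -1 or text[end + 1:first].strip():
            raise ValueError("expected data section delimiter after properties")
        data_start = first + len(DELIMITER) + 1

        # Closing delimiter must sit on a line of its own.
        second = text.find("\n" + DELIMITER + "\n", data_start - 1)
        if second == -1:
            raise ValueError("unterminated data section")
        data_lines = text[data_start:second + 1].splitlines()

        members.append((properties, data_lines))
        pos = second + 1 + len(DELIMITER) + 1
    return members


# Hypothetical file content with two members; the second has empty
# properties and an empty data section.
sample = (
    '{id: "orders"}\n'
    "<><><><><><>\n"
    "1,2\n"
    "3,4\n"
    "<><><><><><>\n"
    "{}\n"
    "<><><><><><>\n"
    "<><><><><><>\n"
)
members = parse_data_set(sample)
```

The sketch rejects CR characters and a missing final delimiter up front, mirroring the two line-ending rules, and relies on position rather than a separator character to find the next member, since no delimiter exists between members.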

Use of data sets

  • Data sets may be used in the stage operation or the assert operation.

Note: Data sets do not work in the assert operation prior to ETLUnit version 3.9.6.


Usage Example

The data sets