Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Usage Example
Anchor
Usage Example
Usage Example

The data sets

Section
  • The first data set, extract_data.dataset, is to be used in the stage operation. 
  • The data set member's properties section includes a property, 'target-table.' 
  • When the ETLUnit test runs, it will use this property.
  • The data will be placed in a table named PLANT_BUCKET_COMP.
  • This data is tab delimited.
Code Block
languagetext
titleextract_data.dataset
{
    id: 'PLANT_BUCKET_COMP_EDW_TBL',
    target-table: 'PLANT_BUCKET_COMP'
}
<><><><><><>
/*-- PLANT_ID    SALES_DATE    PLANT_SK    THIS_YEAR_COMP    THIS_YEAR_PRIOR_YEAR_COMP    THIS_YEAR_WEEKLY_COMP    THIS_YEAR_PRIOR_WEEKLY_COMP    THIS_YEAR_COMP_WITH_ESTIMATES    THIS_YEAR_PRIOR_YEAR_COMP_WITH_ESTIMATES    THIS_YEAR_WEEKLY_COMP_WITH_ESTIMATES    THIS_YEAR_PRIOR_YEAR_WEEKLY_COMP_WITH_ESTIMATES    POPULATION_DATE  --*/
/*-- INTEGER    TIMESTAMP    BIGINT    CHAR    CHAR    CHAR    CHAR    CHAR    CHAR    CHAR    CHAR    TIMESTAMP  --*/
1003    20080202    557701000    Y    N    Y    N    Y    N    Y    N    20111122
1003    20090831    557701000    Y    N    Y    N    Y    N    Y    N    20111122
1003    20090901    557701000    Y    N    Y    N    Y    N    Y    N    20111122
1003    20110501    557701000    Y    N    Y    N    Y    N    Y    N    20111122
1003    99990909    557701000    Y    N    Y    N    Y    N    Y    N    20111122
<><><><><><>
  • The second data set, assert.dataset, will be used in the assert operation. 
  • Note that the table headings are commented out, and are optional. 
  • Their presence or absence does not affect the test at runtime.
  • This data is comma delimited.
Code Block
languagetext
titleassert.dataset
{
    id: 'PLANT_BUCKET_COMP_EDW'
}
<><><><><><>
/*-- PLANT_ID,SALES_DATE,PLANT_SK,THIS_YEAR_COMP,THIS_YEAR_PRIOR_YEAR_COMP,THIS_YEAR_WEEKLY_COMP,THIS_YEAR_PRIOR_WEEKLY_COMP,THIS_YEAR_COMP_WITH_ESTIMATES,THIS_YEAR_PRIOR_YEAR_COMP_WITH_ESTIMATES,THIS_YEAR_WEEKLY_COMP_WITH_ESTIMATES,THIS_YEAR_PRIOR_YEAR_WEEKLY_COMP_WITH_ESTIMATES,POPULATION_DATE  --*/
/*-- INTEGER,TIMESTAMP,BIGINT,CHAR,CHAR,CHAR,CHAR,CHAR,CHAR,CHAR,CHAR,TIMESTAMP  --*/
1003,2008-02-02 00:00:00,557701000,Y,N,Y,N,Y,N,Y,N,2011-11-22 00:00:00
1003,2009-08-31 00:00:00,557701000,Y,N,Y,N,Y,N,Y,N,2011-11-22 00:00:00
1003,2009-09-01 00:00:00,557701000,Y,N,Y,N,Y,N,Y,N,2011-11-22 00:00:00
1003,2011-05-01 00:00:00,557701000,Y,N,Y,N,Y,N,Y,N,2011-11-22 00:00:00
1003,9999-09-09 00:00:00,557701000,Y,N,Y,N,Y,N,Y,N,2011-11-22 00:00:00
<><><><><><>

Both of the above data set members could reside in a single data set file.  In this example, they are each in their own file.

The ETLUnit test

Note that the stage operation below has no target table.  The target table was specified in the first data set, above.

Section
Column
width25px

 

Column
width700px
Panel
bgColor#E0E6F8
titleColor#0B0B61
titleBGColor#CED8F6
titleplant_bucket_comp.etlunit

class plant_bucket_comp {

@Database(id: 'edw')

@Test

extractFromEdw(){

// ================================================================================================

// Stage data from a data set. 

// To what table I am staging?  I will have to look at the data set's properties to find out. 

// I will find a target-table property there.

// Alternatively, I could put the target-table property here, in this stage operation.

// If a target-table property exists in both places, the one in the data set member will be used.

// ===============================================================================================

stage(

data-set-name: 'extract_data',

data-set-id: 'PLANT_BUCKET_COMP_EDW_TBL'

);

set(variable: 'population_date', value: '20090901');    

execute(

workflow: 'wkf_EXTRACT_PLANT_BUCKET_COMP',

folder: 'EDW_EXTRACTS',

context:

{

edw-connection-src: 'informatica.connections.edw.default'

}

);

// ============================================================================================

// My "expected" data may be found in the file called "assert.dataset" in the "dataset" folder.

// It will reside in the data set member "PLANT_BUCKET_COMP_EDW."

// ============================================================================================

assert(

data-set-name: 'assert',

data-set-id: 'PLANT_BUCKET_COMP_EDW',

source-file: 'PLANT_BUCKET_COMP.txt',

reference-file-type: 'FF_PLANT_BUCKET_COMP'

);

}

}

Column

 

Rules, Behaviors, Caveats

Data Sets and the assert Operation

  • Given only a data set name without an ID, the assert is over each member in the set.
    • If a source-table property is assigned within the assert operation, the assert operation will compare each member of the data set with the contents of the same source-table.
    • If the source-table property is instead assigned in the properties section of each data set member, the operation will compare each member of the data set with the contents of the source-table specified there.
  • Given only a data-set name without an ID, if the data set file is not empty and no source or source-table is provided, ERR_RECURSIVE_ASSERT occurs.
  • When asserting over a data set without using an ID, the data set file need have no ID's on any of its members.
  • In order to assert a source-file against a target data set, you must first use the register operation for the source-file.  Available only in ETLUnit-3.9.6 and later.

  • Specifying property ignore-data-set-properties is useful if you want to use a data set member, without its hard-coded properties.

    • ignore-data-set-properties does NOT ignore the id.

    • If the data set has only one member:
      • The data set member does not have to have an id for ignore-data-set-properties to work.
      • The assert operation does not have to specify a data-set-id.
    • If the data set has multiple members:
      • The assert operation must specify a data-set-id.
      • The data set member whose properties are being ignored must have an id matching the data-set-id specified in the assert operation.
  • When specifying a source-table value in the assert operation different from the source-table value in the data set, the value in the data set will prevail.

    • This applies to any valid property.

Data Sets and the stage Operation

  • ignore-data-set-properties requires that data-set-id is present or that data set has only one member.

Practices

  • You may stage to multiple tables in one stage operation.
    • Include a multi-member data set by data-set-name, but without specifying any data-set-id.
    • Every member in the data set will invoke a separate stage operation.
    • Each member of the data set should set all the properties for its stage operation.
  • You may run multiple asserts in one assert operation.
    • Include a multi-member data set by data-set-name, but without specifying any data-set-id.
    • Every member in the data set will invoke a separate assert operation.
    • Each member of the data set should set all the properties for its assert operation.
  • You should not write a meaningless property in the properties section of the data set.
    • Select from valid properties:
      • 'id'
      • Properties valid for the stage operation
      • Properties valid for the assert operation

Caveats

  • @OperationDefault annotation does not currently work with data set properties in the stage operation.