stage

ETLUnit 3.9.6

 

Draft in Progress

This document is a draft and is under development.

 

Description

Set up test environment by putting data into tables or files required by the ETL test.

Attributes

List of stage() Attributes

stage() for Database

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

- >> REF:

 Database Feature Module - Configuration

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

default-implementation-id:  - >> TYPE:  string

implementation-defaults- >> TYPE:  object (key-value pairs, comma-separated)

{

<implementation-default-name>:  - >> TYPE:  string

{

server-name- >> TYPE:  string

server-port- >> TYPE:  integer

implementation-properties:  - >> TYPE:  object (key-value pairs, comma-separated)

user-name- >> TYPE:  string

password- >> TYPE:  string or null

}, ... multiple implementation defaults possible

}

default-connection-id- >> TYPE:  string

data-file-null-delimiter- >> TYPE:  string

data-file-column-delimiter- >> TYPE:  string

data-file-row-delimiter- >> TYPE:  string

database-definitions:   - >> TYPE:  object (key-value pairs, comma-separated)

{

<database-name>  - >> TYPE:  string

{

implementation-id:  - >> TYPE:  string

server-name- >> TYPE:  string

server-port- >> TYPE:  integer

implementation-properties- >> TYPE:  object (key-value pairs, comma-separated)

user-name- >> TYPE:  string

password- >> TYPE:  string or null

schema-scripts:    - >> TYPE:  list of strings

}, ...  multiple database definitions possible

}


 

source:  - >> TYPE:  string

source-file:  - >> TYPE:  string

target-table:  - >> TYPE:  string  - - REQUIRED

connection-id:  - >> TYPE:  string

target-schema:  - >> TYPE:  string

reference-file-type:  - >> REF:

 File Feature Module - Reference File Type

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Either:

- >> TYPE:  string

Or:

- >> TYPE:  object (key-value pairs)

{

id- >> TYPE:  string

classifier- >> TYPE:  string

version- >> TYPE:  string

column-list- >> TYPE:  list of unique strings

column-list-mode:  - >>

POSSIBLE VALUES:

include

exclude

}

 



column-list:  - >> TYPE:  list of unique strings

column-list-mode- >>

POSSIBLE VALUES: 

include

exclude

mode:  - >> TYPE:  string

modes:  - >> TYPE:  list of strings 

connections:  - >> TYPE:  object (key-value pairs, comma-separated)

{

<connection-name>: - >> TYPE:  string

{

mode:  - >> TYPE:  string

modes:  - >> TYPE:  list of strings

}, ... multiple connections possible

}

Go to top

stage() for File

 

 

 

 

 

 

- >> REF:

 File Feature Module - Configuration

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

reference-file-types:  - >> TYPE:  object (key-value pairs)

{

<reference-file-type-name>:  - >> TYPE:  string

{

default-version:  - >> TYPE:  string

default-remote-version:  - >> TYPE:  string

default-classifier:  - >> TYPE:  string

default-remote-classifier:  - >> TYPE:  string

classifiers:  - >> TYPE:  object (key-value pairs)

{

<classifier-name>:  - >> TYPE:  string

{

default-version:  - >> TYPE:  string

default-remote-version:  - >> TYPE:  string

}, ... multiple classifiers possible

}

}, ... multiple reference file types possible

} 


 

file:  - >> TYPE:  string  - - REQUIRED

variable-name:  - >> TYPE:  string

classifier:  - >> TYPE:  string

destination-name:  - >> TYPE:  string

context-name:  - >> TYPE:  string

Go to top

stage() for List-File

 

 

 

 

 

 

- >> REF:

 File Feature Module - Configuration

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

reference-file-types:  - >> TYPE:  object (key-value pairs)

{

<reference-file-type-name>:  - >> TYPE:  string

{

default-version:  - >> TYPE:  string

default-remote-version:  - >> TYPE:  string

default-classifier:  - >> TYPE:  string

default-remote-classifier:  - >> TYPE:  string

classifiers:  - >> TYPE:  object (key-value pairs)

{

<classifier-name>:  - >> TYPE:  string

{

default-version:  - >> TYPE:  string

default-remote-version:  - >> TYPE:  string

}, ... multiple classifiers possible

}

}, ... multiple reference file types possible

} 


 

list-file:  - >> TYPE:  string  - - REQUIRED

variable-name:  - >> TYPE:  string

classifier:  - >> TYPE:  string

destination-name:  - >> TYPE:  string

context-name:  - >> TYPE:  string

Go to top

stage() for Data Sets

 

 

data-set-name:  - >> TYPE:  string REQUIRED

data-set-id:  - >> TYPE:  string

ignore-data-set-properties:  - >> TYPE:  boolean

Go to top

Individual stage() Attributes

stage() for Database Context

[Database Feature Module - Configuration properties not listed here.]

source:

  • A source data file in the standard location.
  • This file will be found in folder data/, a sibling of the ETLUnit test class file.
  • Typically these files on the file system should be named with .fixed or .delimited file extension.
  • The source property value, however, should not include the file extension (.delimited or .fixed).
  • Typically used with target-table.
  • e.g.:
    • source: DATA_FILE,
    • The file named by source would be placed in folder data/, a sibling to the ETLUnit test class file.
    • The file would there be named DATA_FILE.delimited or DATA_FILE.fixed.

 

 

Example of source property in database context

stage(

source: 'DATA_FILE',

target-table: 'TEST_TABLE'

);         

 

 

source-file:

  • A source data file in a custom location.
  • Include the path with the file name.  The path will start at the project root. 
  • Include the file name extension.  It may be anything.
  • If you do not provide a reference-file-type for the source-file, ETLUnit will apply the FML it finds for the target table.

 

 

Example of source-file property in database context

stage(

source-file: 'src/test/etlunit/data2/BUCKET_GAP_FILLER_FF_COLM_ORD_src_file.abc',

connection-id:'pos_ods',

target-table: 'TICKET'

);         

 

 

target-table:

  • Database table name.

connection-id:

  • Corresponds to id property in the @Database annotation.
  • Any value other than one that corresponds to the id property in the @Database annotation is not allowed.

target-schema:

  • Name of target table's schema, if different from default.

reference-file-type:

  • Applied to source, not target
  • If you want to supply FML for the target, just name the FML so that it will be found by ETLUnit when it is looking for FML for the target.
    • TODO: Fix link when available: SEE FML File Naming Options.

column-list:

  • List of source columns to include or exclude when staging data.
  • Only the column list that results from the combined filters of column-list and column-list-mode will be chosen from the source file and staged to the target table.
  • Q: What if FML is provided to give the source data file columns with different names than those in the target table?
  • A: FAIL!  The purpose of FML is to match files.  Without matching field names, ETLUnit cannot match the file layouts.
  • Q: What if the count of the net number of columns in the column list (after the include or exclude filter is applied) does not match the number of columns in the source data file?
  • A: 
    • If column-list is found inside reference-file-type structure, this will fail.
    • If column-list is found outside reference-file-type structure, and it has fewer columns than the source data file, only the enumerated columns will be applied.

column-list-mode:

  • Valid values:
    • include
      • Include only the columns in the column-list.
    • exclude  - - DEFAULT, if column-list-mode is omitted.
      • Exclude the columns in the column-list.

mode:

  • Represents a database connection.
  • Corresponds to a mode declared in the @Database annotation.
  • You cannot use a mode that has not been declared in the @Database annotation.

modes:

  • A list of modes.

connections:

<connection-name>:

mode:

modes:

  • Use the connections structure if the database connection you need is out of scope (overridden by method-level @Database annotation).
  • Example:

 

Example of connections property in stage operation

@Database(id: 'qpt-bmw-mart', modes: ['src', 'tgt'])

@Database(id: 'edw', modes: ['src'])

@Database(id: 'qpt_pet', modes: ['src'])

class load_pfix_agg_v2

{

@BeforeTest

loadReferenceData()

{

stage(

source: 'PRODUCT_ALL_GM',

target-table: 'PRODUCT',

connections:

{

qpt_pet: {mode: 'src'},

qpt-bmw-mart: {modes: ['src', 'tgt']}

}

);

stage(

source: 'REPORTING_ACTIVE_MODIFIER_GM',

target-table: 'REPORTING_ACTIVE_MODIFIER',

target-schema: 'CBE',

connections:

{

qpt_pet: {mode: 'src'},

qpt-bmw-mart: {mode: 'src'}

}

);

stage(

source: 'PLANT_SMALL_GM',

target-table: 'PLANT',

connections:

{

qpt_pet: {mode: 'src'},

qpt-bmw-mart: {modes: ['src', 'tgt']}

}

);

stage(

source: 'VW_QPT_SYSTEM_EFFECTIVE_DATE_20150509',

target-table: 'VW_QPT_SYSTEM_EFFECTIVE_DATE',

// When you only need to specify one connection...

connection-id: 'edw'

);

set(variable: 'population_date', value: '20150302');

}

...

}

 

Go to top

stage() for File Context

[File Feature Module - Configuration properties not listed here.]

file:  

  • Required
  • Name of the file being staged.
  • Include file name extension, if any.
  • File will be locating in folder files/, a sibling to the ETLUnit test class.

variable-name:  

  • The variable into which is placed the value of the file property (above).
  • The value provided to this property will accomplish two things: 
    • At runtime, a variable named by this variable-name property will be populated with the name of the file specified in the file property of this stage operation.
    • This variable will be made available to the execute() operation.
  • In the case of testing with Informatica workflows, look for the variable in the Informatica parameter file used by the workflow being tested.  The parameter file may be found in src/main/informatica/<InformaticaWorkflowFolderName>.  The parm file name should be identical to the workflow name, with an ".xml" file name extension.
  • In the example below:
    1. A variable "inFile" is populated with the value "bool_conversion_src.txt,"  because that is the value of the file property of the same stage operation.
    2. Because of the configuration of the parm file "wkf_BOOL_CONVERSION.TEST.xml,"  the $InputFile1" parameter used by the workflow wkf_BOOL_CONVERSION_TEST will resolve to "bool_conversion_src.txt," the value implicitly placed in variable "inFile."  In the parm file this variable is represented as "$inFile."
    3. The Informatica workflow will pull its data at the beginning of its run from file "bool_conversion_src.txt."  You can see confirmation of this in the excerpt from the workflow log.

 

bool_conversion_src.txt

O|Y|Y|Y|Y|Y

O|N|N|N|N|N

O|||||

O|x|x|x|x|x

... [ more data ]

wkf_BOOL_CONVERSION_TEST.PRM

[Global]

 

$BadFileName1=nogood

$InputFile1=${inFile}

... [ more parameters ]

Example of variable-name property in file context

class bool_conversion_test {

@Test

testBounds() {

stage(file: 'bool_conversion_src.txt', variable-name: 'inFile');

 

//Run workflow

execute(workflow: 'wkf_BOOL_CONVERSION_TEST', folder: 'SHARED_EDW');

 

//asssertions

assert(

target: 'BOOL_CONVERSION_TGT',

reference-file-type: 'bool',

source-file: 'output'

);

}

}

 

 

classifier: 

  • When the execute operation that follows this stage operation involves Informatica workflow:
    • Valid values:
      • source   - - relates internally to variable "sourceFiles"
      • target  - - relates internally to variable "targetFiles"
      • lookup  - - relates internally to variable "lookupFiles"
    • In the Informatica PowerCenter Workflow Manager, the Edit Tasks window, the Mapping tab, the "Files, Directories and Commands" panel, the attribute Value column, you may see file directory variables like $PMSourceFileDir, $PMTargetFileDir or $PMLookupFileDir.  When you follow where the file is being staged to, and what the file name is, you should be able to see what the classifier should be by looking on this panel.
    • File copy destination:
      • source = "SourceFiles" - - If global variable $PMSourceFileDir is then set to ${sourceFiles}, the staged file will end up in subfolder SourceFiles/.
      • target = "TargetFiles" - - If global variable $PMTargetFileDir is then set to ${targetFiles}, the staged file will end up in subfolder TargetFiles/.
      • lookup = "LookupFiles" - - If global variable $PMLookupFileDir is then set to ${lookupFiles}, the staged file will end up in subfolder LookupFiles/.
    • When only file and classifier are specified:
      • File is copied to a location used by Informatica workflow, using the name found in the file property.
    • When file, classifier and variable-name are specified:
      • File is copied to a location used by Informatica workflow, using the name found in the file property, as processed through the variable-name property.
    • When file, classifier and destination-name are specified:
      • File is copied to a location used by Informatica workflow, using the name found in the destination-name property.

destination-name: 

  • The destination name becomes available as a file to, for example, Informatica workflows in the execute operation.
  • The generated file may be found locally in a location like this:
    • target/generated-sources/log/<testClassName>/<testMethodName>/<operationOrdinal>_execute
  • The generated file may be found in the Informatica test environment in the location configured through the workflow's parameter files.

context-name:


Go to top

stage() for List-File Context

[File Feature Module - Configuration properties not listed here.]

list-file:   

  • Required
  •  Name of the list file being staged. 
  • Include file name extension, if any.
  • File will be locating in folder files/, a sibling to the ETLUnit test class.

variable-name:

classifier:  

  • When the execute operation that follows this stage operation involves Informatica workflow:
    • Valid values:
      • source   - - relates internally to variable "sourceFiles"
      • target  - - relates internally to variable "targetFiles"
      • lookup  - - relates internally to variable "lookupFiles"
  • Similar behavior as that for file context classifier property.

destination-name:  

context-name:  

Go to top

stage() for Data Sets Context

For example of the stage operation using a data set, see the Data Sets page.

data-set-name:

  • Required
  • Matches the name of a text file in folder 'dataset,' a sibling of the ETLUnit test class.
  • The data set text file has file extension '.dataset.'

data-set-id:

  • The data-set-id's value matches the value of the id property in the data set member's properties section.

failure-id:

  • The failure to be generated by ETLUnit if the operation using the data set fails.
  • A failure elsewhere in the test method will not invoke this failure ID.

ignore-data-set-properties:

  • Causes the ETLUnit test to ignore properties included in the data set member's properties section.
  • This does not include the id property in the data set member's properties section.

Go to top

Practices

stage() for File Context

Stage a flat file without a target

  • When working with Informatica tests, for instance, you may want to stage a flat file without specifying a target.  The flat file itself is the target.
  • In this case, you may want to specify a classifier, so Informatica will find the file in the right context (source, target, lookup).

 

gap_ticket.etlunit

@Database(id: 'edw', modes: ['src'])

class gap_bucket

{

    @Test

    columnsloadinorder_bucket()

    {

        stage(

            file: 'FF_BUCKET_V2.txt',

            classifier: 'source'

        );

        

        stage(

            source: 'BUCKET_V1_GAP',

            connection-id: 'edw',

            target-table: 'BUCKET'

            );        

            

        execute(

            workflow: 'wkf_GAP_FILLER_CREATE_V1_BUCKET_DELTA',

            folder: 'EDW_QPT_TICKET',

            context:

            {

                edw-connection-src: 'informatica.connections.edw.src',

                edw-connection-tgt: 'informatica.connections.edw.tgt'

            }

        );

        

        assert(

            source-file: 'FF_BUCKET_V1_DELTA_20150512',

            reference-file-type: 'gap_bucket'

        );

    }

 }

In Informatica, the workflow's source file name is based on a variable.

 

That variable is set in the ETLUnit project's parameter file for the workflow being tested.

wkf_GAP_FILLER_CREATE_V1_TICKET_DELTA.PRM

...

$InputFile_FF_BUCKET_V2=FF_BUCKET_V2.txt

...

And we know from the parameter file above that our staged file, FF_BUCKET_V2.txt is expected by the workflow.  This stage operation instance makes that flat file available to the workflow.

 

Go to top