A common enterprise requirement is to generate data files for downstream consumption on a continuous basis. An example would be generating a file of customers as and when each customer is created. This scenario is similar to an enterprise message bus such as JBoss, where incremental changes are communicated across systems. In an EIM scenario, however, it usually involves loops in the data flow that keep checking whether changes have occurred, and in most cases this means repeated table lookups, which turn out to be expensive.
To achieve this, we can use either a while loop or a continuous workflow, which was introduced in SAP BODS 4.1.
The key difference between the two is that the continuous workflow runs all of its data flows in a loop but keeps them in memory for the next iteration. This avoids repeating some of the common execution steps (such as connecting to the repository, parsing/optimizing/compiling ATL, and opening database connections). The feature comes in handy when, for example, you use the flow to generate flat files and these steps would otherwise be repeated on every iteration.
You can define a continuous workflow by changing the execution type to Continuous. By default, the execution type is set to Regular.
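The saving can be pictured with a small Python model. This is only a sketch of the idea, not BODS code: the class `SetupCounter` and the functions `run_while_loop` and `run_continuous` are hypothetical names, and `setup()` merely stands in for the repository connection, ATL compilation, and database connection steps.

```python
class SetupCounter:
    """Counts how often the expensive setup steps run (hypothetical model)."""

    def __init__(self):
        self.setups = 0

    def setup(self):
        # Stands in for: connect to repository, compile ATL, open DB connections.
        self.setups += 1


def run_while_loop(iterations, counter):
    """Model of a while loop: the setup steps are repeated on every iteration."""
    for _ in range(iterations):
        counter.setup()
        # ... the data flow work would happen here ...


def run_continuous(iterations, counter):
    """Model of a continuous workflow: set up once, then reuse from memory."""
    counter.setup()
    for _ in range(iterations):
        pass  # ... the data flow work reuses the initialized resources ...


while_cost = SetupCounter()
run_while_loop(5, while_cost)

continuous_cost = SetupCounter()
run_continuous(5, continuous_cost)

print(while_cost.setups, continuous_cost.setups)
```

In this model, five iterations cost five setups in the while loop and one setup in the continuous workflow.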
Creating a Continuous Workflow
Before creating a continuous workflow, let’s look at the limitations it has:
- A Continuous workflow can neither be nested inside another workflow object nor can it contain sub data flows.
- A Continuous workflow can be used only in a batch job.
So, to leverage the continuous workflow feature, let’s create a workflow and set the execution type to continuous, as shown above.
In this example, we are simulating a situation where files have to be generated for all departments and the number of departments can change intermittently. This requires that the workflow runs through all the departments and stops after all the department files have been generated. Since the number of departments is variable, we have chosen to use a continuous workflow.
Inside the workflow, place a data flow that extracts data from a table and dynamically generates flat files. Then place one script object before the data flow to generate the filename and another after it to increment the counter. The script may also be used to append a timestamp to the filename.
Create a custom function like the one shown below, which checks the loop condition so that the target files are generated dynamically.
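The logic of such a check can be sketched in Python. This is not BODS script syntax, and the function name `departments_remaining` is hypothetical. The sketch assumes the function returns 0 once every department has been processed and 1 otherwise, so check the stop condition wording in your Designer version before adopting this convention.

```python
def departments_remaining(counter: int, count: int) -> int:
    """Return 1 while departments are left to process, 0 when finished.

    counter: index of the next department to process (starts at 1)
    count:   total number of departments
    The 1/0 convention is an assumption made for this sketch.
    """
    return 1 if counter <= count else 0


# With three departments, the check stays at 1 for counters 1 to 3
# and drops to 0 once the counter moves past the last department.
results = [departments_remaining(counter, 3) for counter in range(1, 5)]
print(results)
```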
Configuring the Continuous Workflow Properties
The release resource option controls how often the resources used by the underlying objects are released and re-initialized. You can control this with the following options:
- To release resources after a set number of runs, select Number of runs and enter the number of runs. The default is 100.
- To release resources after a set number of hours, select the After checkbox, select Number of hours, and enter the number of hours.
- To release resources after a set number of days, select the After checkbox, select Number of days, and enter the number of days.
- To release resources when the result of a function is not equal to zero, select the After checkbox, select Result of the function is not equal to zero, and enter the function you want to use.
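The options above can be modeled as a small decision rule. The Python sketch below is illustrative only: `ReleasePolicy` and its field names are hypothetical, and treating the options as "release when any configured condition is met" is an assumption about how they combine, not documented engine behavior.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReleasePolicy:
    """Hypothetical model of the release resource settings."""

    number_of_runs: Optional[int] = 100  # default taken from the text above
    after_hours: Optional[float] = None
    after_days: Optional[float] = None
    release_function: Optional[Callable[[], int]] = None

    def should_release(self, runs: int, hours_elapsed: float) -> bool:
        """Return True if any configured condition is met (assumed semantics)."""
        if self.number_of_runs is not None and runs >= self.number_of_runs:
            return True
        if self.after_hours is not None and hours_elapsed >= self.after_hours:
            return True
        if self.after_days is not None and hours_elapsed >= self.after_days * 24:
            return True
        # Release when the function result is not equal to zero.
        if self.release_function is not None and self.release_function() != 0:
            return True
        return False


default_policy = ReleasePolicy()
hourly_policy = ReleasePolicy(number_of_runs=None, after_hours=2)

print(default_policy.should_release(runs=100, hours_elapsed=0.5))
print(hourly_policy.should_release(runs=500, hours_elapsed=1.0))
```

Here the default policy releases after the 100th run, while the hourly policy ignores the run count and waits for two hours to pass.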
You cannot directly specify the number of cycles after which the workflow should stop. Instead, you can specify a custom function whose return value determines whether the workflow continues to execute.
To leverage this, edit the workflow properties as shown below. Here, our continuous workflow stops based on the return value of our custom function.
Extracting data from the source table and generating flat files dynamically using Continuous Workflow
When the job is executed, it first sets the counter variable to 1, then determines the number of departments and assigns it to the count variable.
We then increment the counter in the end script. The job runs continuously until all the department files have been generated, as shown below.
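The sequence of one job run can be traced with a short Python simulation. It is a sketch of the control flow only: `run_department_job` and the `department_<n>.txt` naming pattern are hypothetical, and the stop check is assumed to run at the end of each cycle.

```python
def run_department_job(department_count: int) -> list:
    """Simulate the counter-driven cycles and return the generated file names."""
    if department_count < 1:
        raise ValueError("this sketch assumes at least one department")

    counter = 1               # set once when the job starts
    count = department_count  # number of departments determined at start
    generated = []

    while True:
        filename = f"department_{counter}.txt"  # start script: build the filename
        generated.append(filename)              # data flow: write the flat file
        counter += 1                            # end script: increment the counter
        if counter > count:                     # stop check after each cycle
            break

    return generated


files = run_department_job(3)
print(files)
```

With three departments, the simulation runs three cycles and stops once the counter moves past the count.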
The continuous workflow can be seen as a performance-optimized version of the while loop, and it helps accelerate your data flows.
Typical candidates are jobs that contain while loops with recurring initialization work, such as initializing datastores, and that can be reconfigured to run as continuous workflows.
As the above shows, a continuous workflow lets you run a workflow object an indefinite number of times in a loop. Its main purpose, however, is not simply to substitute for the while loop. The benefit is that all the underlying structures, such as datastores and data flows, are initialized and optimized the first time the continuous workflow is executed, and this does not have to be done again until the job completes.