In most ETL tools, we are limited by what the IDE offers by way of functions and transforms available. As a workaround for more complicated logic, Data Services allows you to define your own transforms by way of Python code which will then be executed by BODS. This gives the flexibility to achieve the transformation of data at any complex level and create new records, datasets out of it based on our requirements.  Here, we will discuss one such use case of user defined transform.

We need to retrieve the employee name from employee information table via API based on the employee id in the order table which reside in our on-premise database.

 

Employee Table (this table will be accessed via API):

Employee IDEmployee NameCountryMail id
100JohnUSJohn@xxx.com
101JackUKJack@xxx.com
102SunilIndiaSunil@xxx.com
103SarojIndiaSaroj@xxx.com

 

Order Table (source table we have in DS):

Order IDEmployee IDProduct IDPayment Method
10001100201Paypal
10002101202Visa Card
10003102201Paypal
10004103203Cash

 

Target table required:

Order IDEmployee IDEmployee NameProduct IDPayment Method
10001100John201Paypal
10002101Jack202Visa Card
10003102Sunil201Paypal
10004103Saroj203Cash

 

Here, it might look like a simple case of two tables and looking them up. However, the requirement is that the Employee table contains sensitive data and cannot be exposed directly to any tool, instead controlled APIs are available which control the data that can be accessed.

User defined transform generally works based on per record or per collection.

Per record – It will take each record from source and retrieve the output.

Per Collection – Use the Group Forming operations (Break Group and Candidate Selection) to group records for Python processing.

 

In our scenario, we need to retrieve the employee name that belongs to employee id based on each record. Hence, we are going with per record option. Below given block diagram will explain the transformation of data with user defined transform.

 

Sample Code Snippet

user-defined-transform-data-services

 

High Level Block diagram to illustrate the user defined transform

user-defined-transform-data-services

 

There are many more use cases and in some cases the user defined transform can be used to replace existing logic and help streamline your data flows and in other cases, establish a single version of the transform by reusing the transform in multiple transformations.

Subscribe to our Newsletter

5920 Windhaven Pkwy
Plano TX 75093.

+1 888-227-2794

+1 972-232-2233

+1 888-227-7192

solutions@visualbi.com