Azure Data Lake Analytics uses the U-SQL language to process data at any scale. U-SQL combines declarative SQL with imperative C# expressions. Through U-SQL's distributed compute capability, you can analyse massive amounts of data across sources such as Data Lake Store, Blob Storage, and Azure SQL Database.
In this blog we will see how to use a custom user-defined function (UDF) in U-SQL. Because U-SQL is built on C# expressions and the .NET Framework, it is easy to use C# functionality in U-SQL.
You can try the following steps in Visual Studio with Azure Data Lake Tools installed. In this example we create a simple UDF that returns a random number and call it from a U-SQL script.
Creating a U-SQL project
Create a new U-SQL class library project, as highlighted in the snapshot, and give the library a suitable name.
Creating the C# function
In the .cs file, write the function you want to use in U-SQL and declare its return type. The method must be public and static, inside a public class, so that U-SQL can call it. In our example we create a randNumber() function that returns an int. You can compile and test the code in this window.
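As a minimal sketch of what such a .cs file might look like: only the function name randNumber() comes from this post; the namespace RandomUdf, the class name Helpers, and the lock around the generator are assumptions made for illustration.

```csharp
using System;

// Hypothetical namespace; in a new project it defaults to the project name.
namespace RandomUdf
{
    // Hypothetical class name. The class and method are public and static
    // so that a U-SQL script can call the method directly.
    public static class Helpers
    {
        // One shared generator. System.Random is not thread-safe,
        // so access to it is serialised with a lock.
        private static readonly Random Rng = new Random();
        private static readonly object Gate = new object();

        // Returns a non-negative random integer.
        public static int randNumber()
        {
            lock (Gate)
            {
                return Rng.Next();
            }
        }
    }
}
```

With these assumed names, the function would be called by its fully qualified name, RandomUdf.Helpers.randNumber(). Note that a random function returns a different value on every call, so rerunning the same script will not reproduce the same output.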
Registering the assembly to Azure Data Lake Analytics
After you compile the .cs file, right-click the project and click Register Assembly.
In the dialog that opens, enter the details of your Azure Data Lake Analytics account and the database in which to register the assembly.
The path is the location of the compiled assembly (the .dll built from your .cs file). Give the assembly a suitable name. Once you press Submit, the assembly is registered in your Azure Data Lake Analytics database.
You can find the assembly in the data explorer pane, under your database in the Azure Data Lake Analytics catalog.
Calling the function in U-SQL
Once the assembly is registered in your Azure Data Lake Analytics (ADLA) database, you can use the function in any U-SQL script.
First reference the assembly, as highlighted in the snapshot.
REFERENCE ASSEMBLY {Assembly Name};
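After that you can call the function by its fully qualified C# name: {Namespace}.{Class name}.{Function name}(parameters). In a new class library project the namespace defaults to the project name, so it often matches the assembly name.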
The output of the above code is shown below.
With this we are through with the steps to create a UDF in Azure Data Lake Analytics. There is another way (code-behind), where you have to copy the C# code along with every U-SQL script. Personally, I feel that registering the assembly makes it easier to share your U-SQL scripts with others. I will try to explain the code-behind method in a future blog post.