SAP Data Hub is a data management and orchestration tool for data integration, data processing, and data governance. Data orchestration is built from reusable pipelines and configurable operations that process data pulled from a variety of sources, including CSV files, XML, web service APIs, hybrid cloud services, and SAP data stores like HANA, BW ABAP data flows, etc. This blog gives a detailed introduction to SAP Data Hub and the installation of the Developer Edition.
Advanced operations can be achieved using analytics or machine learning libraries such as TensorFlow, or custom-coded tasks in Data Hub.
SAP Data Hub is available in two different editions:
- SAP Data Hub – Developer Edition
- SAP Data Hub – Trial Edition
SAP Data Hub, Developer Edition
SAP Data Hub Developer Edition was first delivered at the end of 2017. Its latest version, 2.4, is now available for download.
SAP Data Hub can be installed on any platform that supports Kubernetes.
This includes:
- Managed cloud services: AWS (EKS), GCP (GKE), Azure (AKS)
- Private cloud
- On-premise installations like SUSE CaaS Platform
SAP Data Hub Developer Edition can be installed on your local computer with the help of a Docker container. SAP Data Hub is packaged together with Hadoop Distributed File System (HDFS), Spark, and Livy into a single Docker container image. This container image can be used to start components such as SAP Vora Database, SAP Vora Tools, SAP Data Hub Modeler, or HDFS and Spark.
Limitations of installing SAP Data Hub Developer Edition on your local computer are:
- Data governance and workflow features are not available
- We are currently facing an issue with using operators in SAP Data Hub Developer Edition
- Operators related to machine learning, such as TensorFlow, and image processing operators, such as OpenCV, currently cannot be used in SAP Data Hub Developer Edition
Pre-requisites and hardware requirements
Before getting started with the SAP Data Hub Developer Edition installation, please ensure that the following prerequisites and hardware requirements are met on your local computer.
- 64-Bit Processor with Intel/AMD instruction set “X86_64”
- At least 2 CPU cores (better: 4 cores) for the purpose of the Developer Edition
- At least 8 GB of RAM for the purpose of the Developer Edition
- At least 10 GB of disk space for running the Docker image
- Internet connectivity (temporary)
- The operating system must support the installation of Docker (https://www.docker.com)
- Docker is available for Windows, MacOS, and Linux
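The hardware minimums above can be checked with a short script. The following is a minimal sketch for Linux only; `meets_minimum` and `report` are hypothetical helper names, and the probes (`nproc`, `/proc/meminfo`, `df`) would need to be replaced on Windows or macOS.

```shell
#!/usr/bin/env bash
# Sketch: compare measured values against the minimums listed above.
# meets_minimum and report are hypothetical helper names.

# Succeeds when the integer $1 is at least the integer $2.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Print one requirement with an OK / TOO LOW verdict.
report() {
  if meets_minimum "$2" "$3"; then
    echo "$1: $2 (OK, minimum $3)"
  else
    echo "$1: $2 (TOO LOW, minimum $3)"
  fi
}

# Linux-only probes; skipped on other systems.
if [ "$(uname -s)" = "Linux" ]; then
  report "CPU cores" "$(nproc)" 2
  # MemTotal is in kB; round to whole GB because an "8 GB" machine reports slightly less.
  report "RAM (GB)" "$(awk '/MemTotal/ {printf "%.0f", $2/1024/1024}' /proc/meminfo)" 8
  # Available space (kB) on the current file system, truncated to whole GB.
  report "Free disk (GB)" "$(df -Pk . | awk 'NR==2 {printf "%d", $4/1024/1024}')" 10
  echo "Architecture: $(uname -m)"
fi
```

The disk check looks at the current directory, so run the script from the location where Docker stores its images, or adjust the path passed to `df`.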
Docker is a computer program that performs operating-system-level virtualization. Docker is used to run software packages called containers. Docker provides seamless integration with the Windows operating system.
Please download Docker for Windows from the link below:
Docker Desktop for Windows is a Docker distribution designed to run both Windows and Linux Docker containers. However, Data Hub Developer Edition requires Docker to be switched to “Linux Containers” mode.
Test whether Docker is working properly by running the command below:
docker run hello-world
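Whether Docker is in “Linux Containers” mode can also be checked from the command line. This is a minimal sketch: it reads the `OSType` field from `docker info`, and `containers_mode_ok` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: verify that the Docker daemon runs Linux containers.
# containers_mode_ok is a hypothetical helper; it succeeds only for "linux".
containers_mode_ok() {
  [ "$1" = "linux" ]
}

if command -v docker >/dev/null 2>&1; then
  # docker info reports "linux" or "windows" in its OSType field.
  os_type="$(docker info --format '{{.OSType}}' 2>/dev/null)"
  if containers_mode_ok "$os_type"; then
    echo "Docker is running Linux containers"
  else
    echo "Docker reports '$os_type'; switch to Linux containers mode"
  fi
else
  echo "docker not found on PATH"
fi
```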
Obtaining SAP Data Hub Developer Edition
Download the Developer Edition from the link below and unpack the archive to your local disk:
Building Container Image
Steps to build a container image:
- Open a terminal window and switch to the directory where you have unpacked the Developer Edition
- Issue the command to create the base image: docker build --tag sapdatahub/dev-edition-base:15.0-01 -f dev-edition-base.Dockerfile .
- Issue the command to create the final image: docker build --tag sapdatahub/dev-edition:2.3 .
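The two build steps can be wrapped in a small script that runs them in order and stops if the first one fails. This is a sketch under assumptions: the trailing `.` (the current directory as build context) and a file named `Dockerfile` for the final image are assumed, and `build_commands` and `run_builds` are hypothetical helper names. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch: build the base image first, then the final image.
BASE_TAG="sapdatahub/dev-edition-base:15.0-01"
FINAL_TAG="sapdatahub/dev-edition:2.3"

# Print the two build commands in the order they must run.
build_commands() {
  echo "docker build --tag $BASE_TAG -f dev-edition-base.Dockerfile ."
  echo "docker build --tag $FINAL_TAG ."
}

# Echo each command; execute it only when DRY_RUN=0, stopping at the first failure.
run_builds() {
  build_commands | while IFS= read -r cmd; do
    echo "+ $cmd"
    if [ "${DRY_RUN:-1}" = "0" ]; then
      sh -c "$cmd" || exit 1
    fi
  done
}

run_builds
```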
Running SAP Data Hub Developer Edition
Run the command below to get more information about the usage of the Developer Edition:
docker run -ti sapdatahub/dev-edition:2.3
|Command|Description|
|---|---|
|run|starts the SAP Data Hub processes in the container|
|run-hdfs|starts the processes related to HDFS and Spark/Livy in the container|
|prompt|starts a bash shell so that further processes can be started manually|
|network|performs a network check for accessing public internet sites|
- The minimal set of parameters to spin up the Developer Edition as a container is: docker run sapdatahub/dev-edition:2.3 run --agree-to-sap-license
- To run the container on its own Docker network, first create the network:
docker network create dev-net
- Followed by (for Linux, Mac)
or for Windows
Launch SAP Data Hub Modeler by opening this URL in a browser: https://localhost:8090
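As an illustration of how the pieces above fit together, the sketch below assembles a `docker run` command from values that appear in this blog: the image tag, the `dev-net` network, the name “Devedition”, and port 8090 from the Modeler URL. It is an assumption, not the exact command used for this installation: the published port mapping and the use of the same value for container name and hostname are guesses, and further ports or environment variables may be required. `run_command` is a hypothetical helper that only prints the command.

```shell
#!/usr/bin/env bash
# Sketch: assemble (but do not execute) a docker run command for the Developer Edition.
IMAGE="sapdatahub/dev-edition:2.3"   # image tag from the build step
NETWORK="dev-net"                    # network created with docker network create
NAME="Devedition"                    # container name used when starting and stopping
MODELER_PORT=8090                    # port taken from the Modeler URL (assumed mapping)

# Print the full command on one line so it can be reviewed before running it.
run_command() {
  echo "docker run -ti" \
    "--publish 127.0.0.1:${MODELER_PORT}:${MODELER_PORT}" \
    "--name ${NAME} --hostname ${NAME} --network ${NETWORK}" \
    "${IMAGE} run --agree-to-sap-license"
}

run_command
```

Printing the command first makes it easy to adjust the values for your environment before executing it.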
Start and Stop Data Hub
To start and stop Data Hub, use the commands below. The container name “Devedition” is not mandatory; you can choose a name based on your needs.
docker start Devedition
docker stop Devedition
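After starting or stopping the container, its state can be confirmed with `docker inspect`. This is a minimal sketch; `describe_state` is a hypothetical helper, and the container name defaults to “Devedition” unless another name is passed as the first argument.

```shell
#!/usr/bin/env bash
# Sketch: report whether the Data Hub container is currently running.
CONTAINER="${1:-Devedition}"

# Map the value of docker inspect's .State.Running field to a readable status.
describe_state() {
  case "$1" in
    true)  echo "running" ;;
    false) echo "stopped" ;;
    *)     echo "unknown" ;;   # e.g. the container does not exist
  esac
}

if command -v docker >/dev/null 2>&1; then
  state="$(docker inspect --format '{{.State.Running}}' "$CONTAINER" 2>/dev/null)"
  echo "$CONTAINER is $(describe_state "$state")"
fi
```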
You can launch HDFS by running the commands below, which give access to the Apache Hadoop user interface:
or for Windows
Open the HDFS user interface at the URL below:
Quick cockpit view of SAP Data Hub Developer Edition
Data Hub provides the user interface below for navigating and for creating pipelines and workflows: