Sap Datasphere Data Flow Series – Operators (Joins...

Kunal_Mohanty · ‎11-17-2023

Hey Champs,

Hope you enjoyed the last blog on Sap Datasphere Data Flow Series – Introduction and sample example, well its just the trailer. Today lets jump into different types of operator in Datasphere in details.

Operators in SAP Data Sphere are used to manipulate and transform data. They can be used to perform a wide variety of tasks, such as filtering, sorting, aggregating, and joining data. Operators can be used in both data flows and data jobs.

Here are some of the most common operators in SAP DataSphere:

Projection Operator

Join Operator

Union Operator

Aggregation Operator

Script Operator (We will cover in a separate blog)

Join Operator & Projection Operator in Data Flow:

In SAP Data Sphere, the Join Operator is used to combine data from two or more sources based on a common column between them

There are several types of joins available in SAP Data Sphere, each with its own purpose:

Inner Join: This is the most common type of join, and it combines data from two sources where the values in a related column match. Only rows with matching values are included in the result dataset.

Left Join: This join includes all rows from the left source, regardless of whether there is a matching row in the right source. If there is no matching row, the values for the right source's columns are displayed as null.

Right Join: This join is similar to the left join, but it includes all rows from the right source instead of the left source. If there is no matching row, the values for the left source's columns are displayed as null.

To use the Join Operator in SAP Data Sphere, you can drag and drop it onto the data flow canvas, connect it to the desired data sources, and configure the join type and join condition. The join condition specifies the related column between the sources and the values that must match for rows to be included in the result dataset.

Projection Operator:

It allows us select the required fields into the output. For example we have 100 columns in a table and we want to see only 50 columns so in that case we can use a projection operator and select our required fields.

Lets see one example where we are join two dataset named Order Table and Order Item table.
We can combine Order table and Order Item table using Order id as join condition between two dataset.

Thoda Gyan:
The size limit for files being processed by the join operator is 10 GB.

Steps :

From the left side repository panel drag the Order header table and Order Item table to the play area as source.

Drag and drop the join operator from operator panel to the play area.

Now drag the projection operator in front of the Order header table, now connect the Order header table and projection.

Now connect a line from projection and Order item table to the join operator.

Click on projection and change the Name to Customer Name. Now click on the join operator and select inner join as shown below.

Adding a projection

Now click on add target table, Add in the right side details pane click on “Create and deploy target table” . And from the general tab select the mode as append and save and activate. Now in the top left corner click run as show in the below image.

Running the Data Flow

.Aggregation Operator:

Aggregation refers to the function whereby key figure values on detail level are automatically summed up at runtime and shown or planned on aggregated level.

SUM: Calculates the total sum of values for a particular column.

AVG: Calculates the average value for a particular column.

MIN: Determines the minimum value for a particular column.

MAX: Determines the maximum value for a particular column.

COUNT: Counts the total number of non-null values for a particular column.

These aggregate functions can be applied in various scenarios within Data Sphere, including:

Analyzing sales data: By aggregating sales figures across different time periods, product categories, or customer segments, you can identify trends, patterns, and outliers in sales performance.

Calculating performance metrics: Aggregated functions can be used to assess employee performance, track website traffic, or evaluate the impact of marketing campaigns.

Identifying trends in sensor data: By aggregating sensor readings over time, you can detect patterns and anomalies in environmental conditions, equipment performance, or user behavior.

Generating summary reports for companies: Aggregated data can be used to create informative reports that highlight key findings and trends.

Steps:

Using the previous flow after Order item table, drag a projection to the play area and remove the item column from the projection. Keep those that you want to aggregate and those that you want to group the aggregations by.

Click the Aggregation tool, drag it onto the diagram canvas, and release it where you want to create the aggregation.

Click on aggregation and click the column named Item and then select the aggregation type as Sum.

Join the flow between projection, aggregation and join.

Adding the Aggregation Node

5.Now save and deploy and run the dataflow. Then click on the target table and do data preview.

Data Preview of Aggregation Node

Union in Data Flow:

Union in Sap DataSphere Data Flow is used to combine two datasets. Lets take one example to combine two dataset. In my example I am taking Order Header Table and Sales Order Table. Just to show the union functionality. I will only take order number from both the dataset.

Steps:

Drag and drop Order Table Header into the play area as source and add a projection and connect the source table and projection. Now enable the Order Id column.

Drag and drop Sales Order Table into the play area as source and add a projection and connect the source table and projection. Now enable the Order Id column.

Drag and drop the union node into the play area.

Now connect the flow of the projection to the union node

Click on the union node and map the order id of both the table. And go to the unmapped section and delete the columns which are not mapped.

Now save and deploy it and click on run to see the data preview..

Before lets look into the ingredients for the dish :

Order Header Table

Order Item Table

Now we can data preview and see the complete data.

Adding the Union Operator

Conclusion:

This blog introduced the different types of operator available in datasphere data model that we’ll use along the entire blog series, I have explained each and every topic in detail. We can do a lot more using all these operators. I have skipped the script Operator and will cover it soon with another blog.

Thanks for reading! I hope you find this post helpful. For any questions or feedback just leave a comment below this post. Feel free to also check out the other blog posts in the series and follow me to learn as well as master sap analytics. Let me know if you find something can be improved or added.

Best wishes,

Kunal Mohanty

Sap Datasphere Data Flow Series – Operators (Joins, Projection, Aggregation, Union)

SAP PI for Beginners

ABAP 7.40 Quick Reference

Fiori: technical installation and configuration of one app from A - Z