Solved: k-means multiple categorical columns for similarity analysis

former_member186543 · 12-21-2016

Hi,

I am trying to implement a machine learning requirement: we want to find similar incidents / support tickets in our database based on attributes such as product category, priority, impact, code group, and other categorical and numerical attributes.

Based on the SAP HANA PAL documentation and my own ML knowledge, I believe we can try the K-means clustering algorithm. However, we have many categorical columns. The SAP documentation says that a weight can be assigned to categorical columns, but I would like to know whether HANA PAL allows assigning a different weight to each categorical column in the K-means input.

If that is not possible, which other clustering algorithm in PAL would meet this requirement?

Thanks,

Hasan

Former Member · 01-23-2017

Hi Hasan, currently PAL does not support setting different weights for different categorical columns. For the second question you raised in your reply, PAL has a function called cluster assignment, which labels new data according to the results of a previously run clustering function.

Best regards,

Xingtian
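
For readers who still need per-column weights, here is a minimal sketch of one possible workaround done outside PAL, in plain Python. It is not the PAL K-means or cluster assignment API. It swaps in a weighted Gower-style dissimilarity (categorical mismatch counts as 1, numeric gaps are scaled by the column range) and then labels a new ticket by its nearest cluster prototype. All column names, weights, ranges, and sample tickets below are hypothetical and only there for illustration.

```python
from typing import Dict, List, Tuple

Ticket = Dict[str, object]

# Hypothetical columns and weights (assumptions, not taken from PAL).
CATEGORICAL_WEIGHTS = {"product_category": 3.0, "priority": 1.0, "code_group": 2.0}
NUMERIC_WEIGHTS = {"impact_score": 1.0}
# Assumed value range (max - min) per numeric column, used to scale gaps to [0, 1].
NUMERIC_RANGES = {"impact_score": 10.0}


def dissimilarity(a: Ticket, b: Ticket) -> float:
    """Weighted Gower-style dissimilarity in [0, 1]; 0 means identical."""
    total = 0.0
    for col, weight in CATEGORICAL_WEIGHTS.items():
        # A categorical mismatch costs the full weight of its column.
        total += weight * (0.0 if a[col] == b[col] else 1.0)
    for col, weight in NUMERIC_WEIGHTS.items():
        # Numeric differences are divided by the column range before weighting.
        total += weight * abs(float(a[col]) - float(b[col])) / NUMERIC_RANGES[col]
    weight_sum = sum(CATEGORICAL_WEIGHTS.values()) + sum(NUMERIC_WEIGHTS.values())
    return total / weight_sum


def most_similar(query: Ticket, tickets: List[Ticket], top_n: int = 3) -> List[Tuple[float, object]]:
    """Return (dissimilarity, ticket id) pairs, most similar first."""
    scored = [(dissimilarity(query, t), t["id"]) for t in tickets]
    return sorted(scored)[:top_n]


def assign_cluster(ticket: Ticket, prototypes: List[Ticket]) -> int:
    """Label a ticket with the index of its nearest cluster prototype."""
    return min(range(len(prototypes)), key=lambda i: dissimilarity(ticket, prototypes[i]))


# Made-up sample data.
tickets = [
    {"id": 1, "product_category": "CRM", "priority": "high", "code_group": "A", "impact_score": 8},
    {"id": 2, "product_category": "CRM", "priority": "low", "code_group": "A", "impact_score": 7},
    {"id": 3, "product_category": "ERP", "priority": "high", "code_group": "B", "impact_score": 2},
]
query = {"id": 0, "product_category": "CRM", "priority": "high", "code_group": "A", "impact_score": 8}

# Prototypes would normally come from an earlier clustering run; these are invented.
prototypes = [
    {"product_category": "CRM", "priority": "high", "code_group": "A", "impact_score": 7.5},
    {"product_category": "ERP", "priority": "high", "code_group": "B", "impact_score": 2.0},
]

if __name__ == "__main__":
    print(most_similar(query, tickets))
    print(assign_cluster(query, prototypes))
```

Raising the weight of a column such as `product_category` makes a mismatch in that column push tickets further apart than a mismatch in `priority`, which is the per-column control the question asks about. Whether this fits a given landscape depends on data volume, since the computation here runs outside the database.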
