Technology Blogs by SAP
DKK
Product and Topic Expert

Introduction

As Generative AI continues to evolve, the complexities surrounding the management of Personally Identifiable Information (PII) also increase. AI systems can sometimes deduce PII from seemingly harmless data or unintentionally reveal PII in their results. It is therefore essential for AI developers and users to understand these risks and take proactive measures to address them.

Personally identifiable information (PII) refers to any information that can be used to identify an individual. This might include, but is not limited to, name, social security number, date and place of birth, mother’s maiden name, or biometric records. In the digital age, it could also include digital identity data such as IP addresses or mobile device IDs.

Handling PII responsibly is critical for many reasons:

  • Respecting Privacy
  • Ensuring Security
  • Building Trust

On the regulatory front, laws such as the European Union's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) in the U.S. offer a framework for PII management. These laws give individuals rights over their personal data. However, these regulations are relatively recent and still evolving, and there is ongoing debate about how best to regulate AI systems so that privacy protection and technological innovation stay in balance.

SAP HANA Data Anonymization provides methods that help mitigate the risks described above. The methods supported by SAP HANA are:

  • k-anonymity
  • l-diversity
  • Differential privacy
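
To make the three terms more concrete, here is a minimal Python sketch of what each one measures. It is not SAP HANA's implementation; the toy table, the column names and the helper functions `k_anonymity`, `l_diversity` and `laplace_noise` are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence

Row = Dict[str, str]


def _groups(rows: List[Row], quasi_ids: Sequence[str]) -> Dict[tuple, List[Row]]:
    """Group rows that share the same quasi-identifier values."""
    grouped: Dict[tuple, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[tuple(row[col] for col in quasi_ids)].append(row)
    return grouped


def k_anonymity(rows: List[Row], quasi_ids: Sequence[str]) -> int:
    """Size of the smallest group: each person hides among at least k rows."""
    return min(len(group) for group in _groups(rows, quasi_ids).values())


def l_diversity(rows: List[Row], quasi_ids: Sequence[str], sensitive: str) -> int:
    """Smallest number of distinct sensitive values found in any group."""
    return min(
        len({row[sensitive] for row in group})
        for group in _groups(rows, quasi_ids).values()
    )


def laplace_noise(value: float, sensitivity: float, epsilon: float,
                  rng: random.Random) -> float:
    """Add Laplace noise with scale sensitivity/epsilon (basic differential privacy)."""
    scale = sensitivity / epsilon
    # The difference of two exponential samples follows a Laplace distribution.
    return value + rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


# Hypothetical, already generalized data (age bands instead of exact ages).
rows = [
    {"age_band": "30-39", "region": "North", "illness": "Flu"},
    {"age_band": "30-39", "region": "North", "illness": "Asthma"},
    {"age_band": "40-49", "region": "South", "illness": "Flu"},
    {"age_band": "40-49", "region": "South", "illness": "Flu"},
]

k = k_anonymity(rows, ["age_band", "region"])
l = l_diversity(rows, ["age_band", "region"], "illness")
noisy_count = laplace_noise(len(rows), sensitivity=1.0, epsilon=0.5,
                            rng=random.Random(42))
print(k, l, noisy_count)
```

In this toy table every quasi-identifier combination appears twice, so the data is 2-anonymous, yet the "South" group contains only one illness value. That gap is what l-diversity is meant to catch, while differential privacy takes a different route and perturbs the values or results themselves.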

More information is available in the SAP HANA documentation:
https://help.sap.com/docs/SAP_HANA_PLATFORM/f88e51df089949b2af06ac891c77abf8/ee693d6584d243e1a0daf7c...

Architecture

In this blog post we explore options for tackling PII-related challenges with the SAP Business Technology Platform. We focus on SAP Datasphere, which not only lets us consolidate all necessary data sources, SAP or non-SAP, but also lets us use SAP HANA's data anonymization methods. This combination helps keep your data secure while still making full use of it.

Considering the different capabilities and skills of the personas who work with SAP Datasphere, we arrived at the following options.

We hope that the first option, direct use of Data Anonymization annotations from the SAP Datasphere interface, will appear on the roadmap soon.

anon1.png

 

A generic architecture of a Generative AI project, in which SAP Datasphere takes center stage, is shown below:

anon2.png

For more information on the CAP LLM Plugin, see this link:

https://community.sap.com/t5/technology-blogs-by-sap/elevating-data-privacy-in-sap-cap-based-genai-a...

Hybrid Approach, aimed mainly at Business personas

The starting point for business users is a local table accessible through the SAP Datasphere (DSP) interface (INPUT), shown in the images that follow. The rest of the process runs behind the scenes and is set up only once on the SAP HANA database underlying SAP Datasphere (PROCESS). A scheduled procedure runs every minute but acts only when the user has activated the "enable" flag in the initial step; otherwise it remains inactive. When the flag is set, the procedure generates an anonymization view, which appears in the SAP Datasphere interface as a view tailored to the business user's requirements (OUTPUT).

The Data Controller can then share the anonymized view with the necessary DSP spaces in place of the original, non-anonymized dataset. This step gives additional control over the access permissions of the Data Consumer.
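
The flag-gated flow described above can be sketched in a few lines of Python. This is only a simulation of the control logic, not the actual SAP HANA procedure: the `AnonymizationConfig` fields, the table and view names, and the rule that an existing view is not regenerated on later runs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AnonymizationConfig:
    """One row of the (hypothetical) configuration table filled in by the business user."""
    source_table: str  # table holding the original data
    target_view: str   # name of the anonymized view to expose
    method: str        # for example "k-anonymity"
    enabled: bool      # the "enable" flag set in the initial step


def run_scheduled_procedure(
    configs: List[AnonymizationConfig],
    existing_views: Dict[str, str],
    create_view: Callable[[AnonymizationConfig], str],
) -> List[str]:
    """Simulate one scheduler run and return the names of the views it created."""
    created = []
    for cfg in configs:
        if not cfg.enabled:
            continue  # flag not set: the procedure stays inactive for this row
        if cfg.target_view in existing_views:
            continue  # assumption: a view generated earlier is left untouched
        existing_views[cfg.target_view] = create_view(cfg)
        created.append(cfg.target_view)
    return created


def describe_view(cfg: AnonymizationConfig) -> str:
    # Placeholder for the real view definition on the underlying SAP HANA database.
    return f"{cfg.method} view over {cfg.source_table}"


configs = [
    AnonymizationConfig("CUSTOMERS", "CUSTOMERS_ANON", "k-anonymity", enabled=True),
    AnonymizationConfig("PATIENTS", "PATIENTS_ANON", "l-diversity", enabled=False),
]
views: Dict[str, str] = {}

first_run = run_scheduled_procedure(configs, views, describe_view)
second_run = run_scheduled_procedure(configs, views, describe_view)
print(first_run)
print(second_run)
```

Only the enabled row produces a view, and the second run finds nothing left to do, which mirrors the idea that the procedure can run every minute while staying idle until a user switches the flag on.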

Hybrid approach

anon3.png

Details regarding the Configuration (SAP Datasphere Local Table)

anon4.png

Information regarding SAP HANA Procedure and Scheduler

anon5.png

Conclusion

In conclusion, using SAP Datasphere in a generative AI scenario offers several advantages. It improves data handling and security by protecting personally identifiable information through SAP HANA's anonymization methods. It caters to both IT and Business personas, with a range of implementation methods: direct, SQL, Python, or a CAP application. The hybrid approach in particular is flexible and can be extended and adapted to specific requirements.

This makes SAP Datasphere a highly efficient and user-friendly tool for handling generative AI scenarios.
