SAP Cloud Integration: Understanding the XML Encry...

CarlosRoggan · ‎03-07-2024

SAP Cloud Integration doesn’t offer an encryptor step for encrypting XML content according to the "XML Encryption" standard. That standard provides some benefits and flexibility specifically for XML content.
This article introduces the "XML Encryption" standard, as preparation for a future hands-on tutorial.
I try to explain everything simply, with my simple understanding and my simple words; this is not a professional article.
In this blog post, I will try to answer many questions and show examples.
The next blog post shows how we can encrypt / decrypt XML payloads, according to the XML-Enc spec, manually in a Groovy script.

Overview

Historical Intro
Theoretical Intro
XML Sample Intro
Optional Outro

History

How I imagine that it started:
Timmy from Texas wanted to share some secret info with his friend Taku in Tokyo.
So he encrypted a message and sent it to Taku.
Taku was unable to decrypt and read the message.
So Timmy travelled to Tokyo to enjoy some food and to explain how he encrypts and packages his messages.
Afterwards, Taku in Tokyo was able to decrypt and read all messages (even before breakfast).
Some time later, the same situation happened with his friend Toto in Togo.
Although the food is said to be great, Timmy decided not to travel, but to invite his friends for a conference at home.
They had international food, late-night discussions and at the end, they agreed on a common way of sending secure messages.
As a consequence, everybody in the world can send secure messages and the recipients can understand the message, as long as they follow that agreement.

Does that make sense?
Really makes sense, especially the section about the international food (which didn’t make it into the specification).

What do we learn from this story?
People communicating with each other need to agree on some basic principles:
- how encryption is done, which steps in which order
- what exactly is encrypted
- which algorithms are used
- certificate information
- where is that information stored

This intro was copied from my cms-post.

Introduction

We’re talking about sending data from somewhere to anywhere over the internet.
Instead of writing a letter, we use XML to structure the data which we send.
As we know that the internet is dangerous, we want to encrypt the data.

There are blogs out there?
Sure, we already have such fantastic blog posts as this one, together with the intro blog.
They explain how to use the CMS standard for encrypting a message.

So why do we need this blog?
Actually, the CMS standard is not specific to any kind of payload, so it could be used for XML as well, why not.
But...
But we need this blog post because it is specific to XML payload.
As the message is written in XML, we can take advantage of the fact that the content is structured already.
This is a benefit.
So we have an extra standard.

OK. What is the benefit?
As we’re dealing with XML, which is structured content, we have the advantage of choosing which content, or which part of the content, we want to encrypt.

Cool. Which content can we choose?
There are 3 possibilities:

1. Encrypt the whole document, i.e. the whole file or the entire message.
2. Encrypt part of the document: choose one node of the XML document.
   In this case, the node itself is not encrypted, but only the content below the node.
   This means the text content of the node is sensitive, but the node name is left as plain text.
   The content can be a subtree of child nodes as well.
3. Encrypt part of the document: again, choose one node of the XML document.
   But in this case, the node itself is encrypted as well, along with all of its content.

Variant 1…?
Ummmmm - yes, it is similar to CMS....
AHA
Ehm, yes, here the benefit is less obvious, but nevertheless, the result is an XML with a specific structure, which can be understood by XML-Enc-aware tools.

Don’t understand.
Remember the funny history story?
At the end, a standard is an “agreement” between sender and receiver.
If they both adhere to the agreement, they can send and receive, encrypt and decrypt without trouble.
So even in case of variant 1, the receiver can find the info about how to decrypt, by reading XML.

What is the XML-agreement?
Basically, in case of "XML Encryption" agreement, the receiver knows where to find the information that he needs for decrypting:

The incoming XML contains a node <EncryptedData> which contains everything: the encrypted content and metadata.
- The info about which variant (see above) was used
- The subtree of this node contains info about the algorithm used to encrypt the content
- The subtree contains info about the key that was used to encrypt the content
- The encrypted key itself
- The encrypted content itself
- . . .

Note that the standard is flexible and there are multiple ways to apply it.
In this blog post we’re sticking to one variant which is common and safe and makes sense.

How is encryption done?
During encryption, the sensitive content is replaced by an <EncryptedData> node.
The subtree of <EncryptedData> contains the sensitive content that has to be secured, in an unreadable form, i.e. encrypted.
After encryption, the result is encoded with Base64 (this is common practice when sending binary data over the internet).
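
To make the Base64 step tangible, here is a minimal sketch in Java. It assumes the built-in `java.util.Base64` class; the class name `Base64Demo` and the three sample bytes are made up for illustration and only stand in for real ciphertext.

```java
import java.util.Base64;

class Base64Demo {

    // Turns raw (e.g. encrypted) bytes into text that can be embedded in XML.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Restores the original bytes on the receiver side.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, (byte) 0xFF, 0x10}; // stand-in for ciphertext bytes
        String text = encode(raw);
        System.out.println(text);                // prints AP8Q
        System.out.println(decode(text).length); // prints 3
    }
}
```

Encrypted bytes are arbitrary binary values, and many of them are not valid characters in an XML document, which is why the text form is needed.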

How is it encrypted?
We have to understand the 2 basic ways of encrypting:
Symmetric and asymmetric encryption

What is symmetric encryption?
Sounds normal: some content is encrypted with a key.
For decryption, the SAME key is used.
This means the key must be handed over to the recipient in a safe way.
This is a disadvantage.
The advantage: fast and can handle big-sized content.
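
A small sketch of symmetric encryption in Java, using the standard `javax.crypto` API. The choices here are my assumptions for illustration, not something prescribed at this point: AES with 256 bits, the GCM operation mode (operation modes are explained further down), a 12-byte random IV, and the made-up class name `SymmetricDemo` and sample value.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class SymmetricDemo {

    // Creates a fresh random AES key of 256 bits.
    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // GCM needs a fresh IV for every encryption; it is not secret.
    static byte[] newIv() {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE.
    static byte[] aesGcm(int mode, SecretKey key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, key, new GCMParameterSpec(128, iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newAesKey();
        byte[] iv = newIv();
        byte[] secret = "4111 1111 1111 1111".getBytes(StandardCharsets.UTF_8); // sample value

        byte[] encrypted = aesGcm(Cipher.ENCRYPT_MODE, key, iv, secret);
        byte[] decrypted = aesGcm(Cipher.DECRYPT_MODE, key, iv, encrypted); // the SAME key

        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```

Both calls receive the same `key` object, which is exactly the disadvantage described above: the receiver needs that key too.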

And asymmetric?
To avoid the problem of having to transmit the secret key:
Here we have 2 keys, which belong together: private and public keys.
This is called a key pair.
The public key is not secret, it can be sent to the encryptor.
The content is encrypted with the public key.
ONLY the private key can then decrypt the content.
Advantage: no secret key has to be exchanged, which makes the key handling more secure.
Disadvantage: slow and not applicable to big payloads.
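
The same idea as a hedged Java sketch. It assumes RSA with 2048 bits and OAEP padding (the JDK transformation name `RSA/ECB/OAEPWithSHA-1AndMGF1Padding`); the class name `AsymmetricDemo` and the sample text are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

class AsymmetricDemo {

    // RSA with OAEP padding; this transformation is a choice made for this sketch.
    static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";

    // Generates a key pair: the public and the private key belong together.
    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anybody who has the public key can encrypt.
    static byte[] encrypt(PublicKey publicKey, byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(Cipher.ENCRYPT_MODE, publicKey);
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the owner of the private key can decrypt.
    static byte[] decrypt(PrivateKey privateKey, byte[] encrypted) {
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(Cipher.DECRYPT_MODE, privateKey);
            return cipher.doFinal(encrypted);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newRsaKeyPair();
        byte[] encrypted = encrypt(pair.getPublic(), "hello".getBytes(StandardCharsets.UTF_8));
        byte[] decrypted = decrypt(pair.getPrivate(), encrypted);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```

With these settings, a single RSA operation can take at most 214 bytes of input (256 bytes of key size minus the OAEP overhead), which illustrates why this approach is not meant for big payloads.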

So both are unusable?
There's a solution: use both in a hybrid mode.
Use a symmetric key to encrypt the (big) content.
Use an asymmetric key to encrypt the (small) symmetric key.
That’s it.
The symmetric key can be safely sent together with the encrypted content.
Because the symmetric key is securely encrypted.
The receiver can decrypt the symmetric key, (because he has the private asymmetric key).
Then use the symmetric key to decrypt the content.
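
The hybrid flow can be sketched in Java as follows. This is only an illustration under my own assumptions: AES-256 in GCM mode for the content, RSA-2048 with OAEP for the key, and a hypothetical `Envelope` holder class for the parts that travel together. It is not the XML packaging itself, only the cryptographic steps.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class HybridDemo {

    static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";

    // Hypothetical holder for everything the sender transmits.
    static final class Envelope {
        final byte[] iv;
        final byte[] encryptedContent;
        final byte[] encryptedKey;

        Envelope(byte[] iv, byte[] encryptedContent, byte[] encryptedKey) {
            this.iv = iv;
            this.encryptedContent = encryptedContent;
            this.encryptedKey = encryptedKey;
        }
    }

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    static Envelope encrypt(PublicKey recipientKey, byte[] content) {
        try {
            // 1. Generate a symmetric key on the fly.
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            SecretKey contentKey = generator.generateKey();

            // 2. Encrypt the (big) content with the symmetric key.
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, contentKey, new GCMParameterSpec(128, iv));
            byte[] encryptedContent = aes.doFinal(content);

            // 3. Encrypt the (small) symmetric key with the public key.
            Cipher rsa = Cipher.getInstance(RSA_OAEP);
            rsa.init(Cipher.WRAP_MODE, recipientKey);
            return new Envelope(iv, encryptedContent, rsa.wrap(contentKey));
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    static byte[] decrypt(PrivateKey privateKey, Envelope envelope) {
        try {
            // First recover the symmetric key, then use it to decrypt the content.
            Cipher rsa = Cipher.getInstance(RSA_OAEP);
            rsa.init(Cipher.UNWRAP_MODE, privateKey);
            SecretKey contentKey =
                    (SecretKey) rsa.unwrap(envelope.encryptedKey, "AES", Cipher.SECRET_KEY);

            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.DECRYPT_MODE, contentKey, new GCMParameterSpec(128, envelope.iv));
            return aes.doFinal(envelope.encryptedContent);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        KeyPair pair = newRsaKeyPair();
        Envelope envelope = encrypt(pair.getPublic(), "a big payload".getBytes(StandardCharsets.UTF_8));
        byte[] content = decrypt(pair.getPrivate(), envelope);
        System.out.println(new String(content, StandardCharsets.UTF_8));
    }
}
```

Note that the sender only needs the public key of the receiver; the symmetric key never leaves the sender in readable form.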

Confusing...
Let’s repeat:
We want to encrypt sensitive content
-> we use a “Content Encryption Key” == CEK
-> also called “Data Encryption Key” == DEK
This key has to be encrypted with another key.
-> We use a “Key Encryption Key” == KEK

Why can't we just use the KEK to encrypt the message?
As mentioned, because the KEK is asymmetric and thus not suitable for big content.

Ah, already forgot
No prob.

What is a key?
What we want to achieve is to hide secret content from someone but reveal it to us.
We want to make it look random, but be able to revert.
Thus we need to use a key, so we are able to revert.
Note:
A key is just a sequence of bits; a longer key (larger key size) is generally safer.

What is a DEK or CEK?
Data Encryption Key or Content Encryption Key.
This is a symmetric key for encrypting the payload content.

What is a KEK?
Key Encryption Key, this is usually an asymmetric key.
Encrypting a key with such a public key is also referred to as “Key Transport”.

How is encryption done?
Think about a rule, e.g. replace every ‘a’ with a ‘b’
Such a rule is called “algorithm” or “cipher”.
To make the process reversible, a key is applied.
This makes it reversible only for the key owner.

Examples for symmetric algorithms?
AES, DES (not safe!), TDES (== Triple DES == 3DES == DESede), RC4 (not safe), etc.

Examples for asymmetric algorithms?
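RSA, ECC (elliptic-curve cryptography)
DSA is asymmetric as well, but it is used for signatures, not for encryption.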

What is AES?
It stands for Advanced Encryption Standard.
It is a symmetric-key algorithm.
It works on blocks with size 128 bits.
It supports keys with sizes 128, 192 and 256 bits.
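
These numbers can be checked with a few lines of Java. A minimal sketch, assuming the standard `javax.crypto` API; the class name `AesFactsDemo` and the helper names are made up for illustration.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;

class AesFactsDemo {

    // Generates an AES key of the given size and returns its length in bytes.
    static int keyLengthInBytes(int keySizeInBits) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(keySizeInBits);
            return generator.generateKey().getEncoded().length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Asks the AES cipher for its block size in bytes.
    static int blockSizeInBytes() {
        try {
            return Cipher.getInstance("AES/CBC/PKCS5Padding").getBlockSize();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int bits : new int[] {128, 192, 256}) {
            // prints 16, 24 and 32 bytes
            System.out.println(bits + " bits -> " + keyLengthInBytes(bits) + " bytes");
        }
        // prints 16, i.e. 128 bits, independent of the key size
        System.out.println("block size: " + blockSizeInBytes() + " bytes");
    }
}
```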

What is a Block Cipher?
In symmetric cryptography, 2 ways are used: block and stream ciphers.
In case of stream, the input is encrypted byte by byte.
In case of block, the content is cut into blocks, which are then encrypted.

What is block size?
The size of such blocks.
AES always operates on blocks of 128 bits.

What is padding?
Assume we have some content which has to be encrypted with AES.
Usually, its size is not an exact multiple of 128 bits, which is the size of a block.
After cutting the content into blocks of 128 bits, a remainder is left over.
This remainder has to be filled up until 128 bits are reached.
That’s what we call padding.
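
One common way of filling up the last block is the PKCS#7 scheme. The text above does not name a scheme, so this choice is my assumption, and the class `PaddingDemo` is only a hand-written illustration in Java (real ciphers do this internally, and the `unpad` here does no validation).

```java
import java.util.Arrays;

class PaddingDemo {

    static final int BLOCK_SIZE = 16; // AES block size in bytes (128 bits)

    // PKCS#7: append n bytes, each with the value n, where n is the number of
    // missing bytes. If the input already fills whole blocks, a full extra
    // block is appended, so the receiver can always find the padding.
    static byte[] pad(byte[] data) {
        int missing = BLOCK_SIZE - (data.length % BLOCK_SIZE);
        byte[] padded = Arrays.copyOf(data, data.length + missing);
        Arrays.fill(padded, data.length, padded.length, (byte) missing);
        return padded;
    }

    // Reads the last byte to learn how many padding bytes to cut off.
    static byte[] unpad(byte[] padded) {
        int missing = padded[padded.length - 1];
        return Arrays.copyOf(padded, padded.length - missing);
    }

    public static void main(String[] args) {
        byte[] content = new byte[20];            // one full block plus 4 bytes
        byte[] padded = pad(content);
        System.out.println(padded.length);        // prints 32 (two full blocks)
        System.out.println(padded[31]);           // prints 12 (12 bytes were missing)
        System.out.println(unpad(padded).length); // prints 20
    }
}
```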

What is operation mode?
Assuming again, the content which has to be encrypted is larger than 128 bits.
So it is cut into multiple blocks.
Encryption will be applied to many blocks individually.
How this is done has a big influence on how safe the encryption is.
At the end we want a result that looks completely crazy (= random bytes).
Therefore, we can choose an encryption mode (= operation mode).
Examples:
- ECB, Electronic Codebook, unsafe.
  Note that ECB is often used as the default if no operation mode is specified.
  So the recommendation is to always specify a secure operation mode.
- CBC, Cipher Block Chaining, not recommended.
- CTR, Counter.
- GCM, Galois/Counter Mode, recommended.
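
Why ECB is unsafe can be shown with a short Java experiment. A sketch under my own assumptions: two identical plaintext blocks, a random AES key, and the made-up class name `ModeDemo`. With ECB, identical input blocks produce identical output blocks, so patterns in the content remain visible; with GCM they do not.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class ModeDemo {

    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // ECB: every block is encrypted on its own, always in the same way.
    static byte[] encryptEcb(SecretKey key, byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // GCM: a random IV and a counter make every block look different.
    static byte[] encryptGcm(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Compares the first and the second 16-byte block of the ciphertext.
    static boolean firstTwoBlocksEqual(byte[] ciphertext) {
        return Arrays.equals(Arrays.copyOfRange(ciphertext, 0, 16),
                             Arrays.copyOfRange(ciphertext, 16, 32));
    }

    public static void main(String[] args) {
        byte[] plain = new byte[32];     // two identical blocks
        Arrays.fill(plain, (byte) 'A');
        SecretKey key = newAesKey();

        System.out.println(firstTwoBlocksEqual(encryptEcb(key, plain))); // prints true
        System.out.println(firstTwoBlocksEqual(encryptGcm(key, plain))); // prints false
    }
}
```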

Can we find an end?
We’ve talked about the XML structure and the encryption process.
Now we’ve found the end:
->here

Can we look at an example?
The next chapter is full of xml.

Sample XML

Let’s view a simplified example.
We have a Sales Service that sends info about an order:

Order number
Product Identifier
Customer info
Payment: credit card number
. . .

The service sends the payload in XML format.
XML is tedious to read, so trying to simplify:

We can quickly identify a security risk:
Sending credit card number via the internet is not acceptable.

So we could encrypt the number and send the XML as below:

However, it is better to stick to the XML Encryption standard:

The next screenshot shows that the content of a node has been replaced with the <EncryptedData> subtree (simplified).
Remember the 3 variants above? So this is the second:
only the content is encrypted, not the whole element + content.
In other words: the credit card number is unreadable, but the <CreditCard> node is still readable.

Next screenshot shows the final result XML structure:

The last screenshot shows the final result:

What we can see:

The top level <EncryptedData> node has 3 children
- EncryptedData
--- EncryptionMethod
--- KeyInfo
--- CipherData

Explanation

🔸EncryptionMethod
This is the information about how the content was encrypted.
In our example, the symmetric cipher AES was used with a key size of 256 bits and operation mode GCM.

🔸CipherData
The result of encrypting plain text is called “ciphertext” and it is stored below this node.
Note that the cipher text is base64-encoded.

🔸KeyInfo
In our example, we chose to encrypt the symmetric key.
The <KeyInfo> node carries the information about this symmetric key
(Remember, this is the key that was used to encrypt the content).
The <KeyInfo> has the following children:
- KeyInfo
---- EncryptedKey
------- EncryptionMethod
------- CipherData

In our case, it contains the encrypted key itself and the method that was used for encryption.
Example: We use an RSA public key for encrypting the DEK, so the <EncryptionMethod> node will contain something with “…rsa…”

Note:
The algorithms are specified via URI, e.g.

<xenc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-oaep-mgf1p" />

We can see the nice little namespace xenc
I like this one 😁
It is specified at top level node:

<xenc:EncryptedData xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"
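
To see the whole structure in one piece, the following Java sketch builds an empty skeleton of the subtree described above with the JDK's DOM API. It does not encrypt anything: the two Base64 strings are placeholders, the class name `EncryptedDataSkeleton` is made up, and the AES-GCM algorithm URI and the `ds` namespace for `KeyInfo` are taken from my reading of the W3C specifications, so please treat them as assumptions to verify. The spec additionally places a `CipherValue` element inside `CipherData`, which the simplified tree above leaves out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class EncryptedDataSkeleton {

    static final String XENC = "http://www.w3.org/2001/04/xmlenc#";
    static final String DS = "http://www.w3.org/2000/09/xmldsig#";
    static final String XMLNS = "http://www.w3.org/2000/xmlns/";

    // Builds EncryptedData with its 3 children; the arguments are placeholders.
    static Document build(String base64Content, String base64Key) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element data = doc.createElementNS(XENC, "xenc:EncryptedData");
        data.setAttributeNS(XMLNS, "xmlns:xenc", XENC);
        data.setAttributeNS(XMLNS, "xmlns:ds", DS);
        data.setAttribute("Type", XENC + "Content");
        doc.appendChild(data);

        // How the content was encrypted (symmetric).
        addMethod(data, "http://www.w3.org/2009/xmlenc11#aes256-gcm");
        // The symmetric key, itself encrypted (asymmetric).
        Element keyInfo = add(data, DS, "ds:KeyInfo");
        Element encryptedKey = add(keyInfo, XENC, "xenc:EncryptedKey");
        addMethod(encryptedKey, XENC + "rsa-oaep-mgf1p");
        addCipherData(encryptedKey, base64Key);
        // The encrypted content.
        addCipherData(data, base64Content);
        return doc;
    }

    static Element add(Element parent, String namespace, String qualifiedName) {
        Element child = parent.getOwnerDocument().createElementNS(namespace, qualifiedName);
        parent.appendChild(child);
        return child;
    }

    static void addMethod(Element parent, String algorithmUri) {
        add(parent, XENC, "xenc:EncryptionMethod").setAttribute("Algorithm", algorithmUri);
    }

    static void addCipherData(Element parent, String base64) {
        Element cipherData = add(parent, XENC, "xenc:CipherData");
        add(cipherData, XENC, "xenc:CipherValue").setTextContent(base64);
    }

    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(build("BASE64-OF-ENCRYPTED-CONTENT", "BASE64-OF-ENCRYPTED-KEY")));
    }
}
```

In a real scenario this tree is not assembled by hand but produced by an XML-Enc library; the sketch is only meant to show where each piece of information lives.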

OK.
Let’s add one more last screenshot, where we can compare the XML payload before and after encryption:

Note:
The receiver has to know which variant was used:
Whether only the content was encrypted, or the whole element.
This is specified in the “Type” attribute of the top-level element:

<xenc:EncryptedData Type="http://www.w3.org/2001/04/xmlenc#Content"
or
<xenc:EncryptedData Type="http://www.w3.org/2001/04/xmlenc#Element"
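
On the receiver side, reading this attribute could look like the following Java sketch. It uses the JDK's DOM parser; the class name `TypeAttributeDemo`, the method name and the tiny sample document (including the `Order` root element) are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class TypeAttributeDemo {

    static final String XENC = "http://www.w3.org/2001/04/xmlenc#";

    // Returns "Content" or "Element", read from the first EncryptedData node.
    static String encryptionType(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Hardening for XML from outside: refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Element data = (Element) doc.getElementsByTagNameNS(XENC, "EncryptedData").item(0);
            if (data == null) {
                throw new IllegalArgumentException("no EncryptedData element found");
            }
            String type = data.getAttribute("Type");
            if (!type.startsWith(XENC)) {
                throw new IllegalArgumentException("unexpected Type: " + type);
            }
            return type.substring(XENC.length());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Order><CreditCard>"
                + "<xenc:EncryptedData xmlns:xenc=\"" + XENC + "\" Type=\"" + XENC + "Content\"/>"
                + "</CreditCard></Order>";
        System.out.println(encryptionType(xml)); // prints Content
    }
}
```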

And here comes one last (really last) screenshot, showing the result of encrypting with variant 3, which is of Type ...xmlenc#Element:

In above screenshot we can see that the <CreditCard> node has disappeared.
The node itself has been replaced with the <EncryptedData> node.
In the Groovy script of the next blog post, we’ll see the flag that decides upon the type.

Optional Info

The “XML Encryption” is also called “XML-Enc”.
It is a standard that is specified as a W3C Recommendation.
It is owned by the World Wide Web Consortium aka W3C.
The W3C owns most standards related to the World Wide Web.
The current version 1.1 of the specification for XML Encryption Syntax and Processing is from 2013.
It can be found here: https://www.w3.org/TR/xmlenc-core1/

Implementations of the standard are available for C, C++ and Java.
The Java implementation is used in our next blog post.

Summary

The XML Enc specification describes how to flexibly encrypt parts of an XML document.
(Or the whole).
The sensitive xml-section is replaced by a new <EncryptedData> section.
This xml-tree contains the encrypted content and metadata (method, key, etc)
The spec is flexible and open, but the common process of encryption would be:
▶️ Generate a symmetric key on the fly.
▶️ Encrypt the content with it.
▶️ Encrypt the symmetric key with an asymmetric key.

Next Steps

Go through the tutorial in the next blog post to gain hands-on experience.

Links

W3C recommendation XML Encryption Syntax and Processing V 1.1
Apache Santuario
Understanding CMS (PKCS 7) standard.
Security Glossary Blog

🌵