Former Member
Sep 09, 2015 at 06:24 AM

Lack of records integrity in Activity Data Collector data


Dear all,

we would like to do some blurring within the portal and use the ADC mechanism of EP (Creating and Editing Collector Files - Portal - SAP Library). While developing a file-system-to-database collector job as suggested by SAP (parsing the file system is of course way too slow for creating user statistics in real time), I realized that you need a custom table to store the data, containing all fields delivered by the ADC service (plus some custom header fields, since the ADC user agent field doesn't contain any information on the operating system etc.). So I created such a table with a composite primary key made up of the fields: time of request (rfo.t), hashed header of request (rfo.hrh), hashed iView PCD URL (rfo.hrh), logged-on user (rfo.un) and server node ID (rfo.nid). All this rests on the naive assumption that there can't be two identical requests with the same key or, in more natural language: one specific user can't cause more than one EP request at exactly the same point in time, with an identical request header, to the same PCD target and on the same server node.
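To make the table design concrete, here is a minimal sketch of such a composite key. It assumes SQLite purely for illustration (the real target database may differ), and the table name `adc_records` as well as all column names are hypothetical stand-ins for the ADC fields listed above, not names taken from SAP.

```python
import sqlite3

# Hypothetical column names standing in for the five key fields.
KEY_COLUMNS = ("request_time", "header_hash", "pcd_url_hash", "user_name", "node_id")


def create_table(conn):
    """Create the illustrative table with a five-column composite primary key."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS adc_records (
            request_time TEXT NOT NULL,
            header_hash  TEXT NOT NULL,
            pcd_url_hash TEXT NOT NULL,
            user_name    TEXT NOT NULL,
            node_id      TEXT NOT NULL,
            PRIMARY KEY (request_time, header_hash, pcd_url_hash, user_name, node_id)
        )
        """
    )


def insert_record(conn, record):
    """Insert one record; return False if a row with the same key already exists."""
    try:
        conn.execute(
            "INSERT INTO adc_records VALUES (?, ?, ?, ?, ?)",
            tuple(record[column] for column in KEY_COLUMNS),
        )
        return True
    except sqlite3.IntegrityError:
        # The database rejects a second row with an identical composite key.
        return False
```

With this layout, a second record carrying the same five values is rejected by the database, so the design only works if the key really is unique per request.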

To test my collector job, which simply reads and clusters data from the portalActivityTraces folder and puts it into the mentioned DB table, I activated the ADC and tried to insert a bunch of ADC records collected on a production system with the desired field configuration. As I found out while doing this, my assumption regarding the uniqueness of the collected data was sadly wrong: some of the collected records have identical primary key fields, which naturally causes errors on insert.

In the end there can be two reasons, AFAIK: either ADC simply writes wrong (duplicate?) request data, or I do not understand how it works and my primary key is not chosen properly. Does anybody have an idea how this is supposed to work?
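One way to tell these two cases apart might be to group the parsed records by key and check whether the colliding records are identical in every field (which would point to duplicated writes) or differ in some non-key field (which would point to a key that is too narrow). The following is only a sketch: it assumes the records have already been parsed into dictionaries with string values, and the field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical names for the five key fields of a parsed ADC record.
KEY_FIELDS = ("request_time", "header_hash", "pcd_url_hash", "user_name", "node_id")


def classify_key_collisions(records):
    """Split colliding keys into exact duplicates and keys whose records differ."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in KEY_FIELDS)
        groups[key].append(record)

    exact_duplicates, differing = [], []
    for key, group in groups.items():
        if len(group) < 2:
            continue  # key is unique, nothing to classify
        # Dicts are not hashable, so compare them as sorted (field, value) tuples.
        distinct = {tuple(sorted(record.items())) for record in group}
        if len(distinct) == 1:
            exact_duplicates.append(key)  # every field identical
        else:
            differing.append(key)  # same key, but some other field differs
    return exact_duplicates, differing
```

If most collisions end up in the second list, adding one of the differing fields to the key could be an option; if they end up in the first, the duplicates would have to be filtered out or counted before inserting.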

The accompanying document from the relevant SAP note (attached) is not really a help; it is incomplete on formal field descriptions, e.g. there are no statements regarding possible field sizes, uniqueness of values, or whether fields are mandatory or optional. This makes such a development job a kind of trial-and-error task. Maybe there is a more reliable and formal description of the collected data somewhere? I wasn't able to find anything else on this topic and would be glad about a hint. Thanks,



1.jpg (128.9 kB)