Hi community,
I'm currently developing with the SAP Leonardo Machine Learning Foundation. Image classification with my retrained model works quite well, which is great! The problem I'm facing at the moment is the similarity scoring.
Situation:
I have a database on the SAP side which stores about 1045 different data records in the form of pictures. For each picture record I have a feature vector (from image feature extraction). I also have an application which allows me to take pictures. What I want to achieve is that, for a newly taken picture, similar pictures are returned from the database.
In my opinion this is exactly the use case the similarity scoring service should cover, but I'm wondering whether the service implementation fits it. Let's say I had only five records plus the picture I took with my mobile phone. The API says that I have to pack all feature vectors as separate files into a zip, which is then sent to the service. If I do so, the similarity is calculated for each file against every other file in the zip. Is that correct? I would expect the similarity to be calculated only between my newly taken picture and the existing pictures in the database.
But if I pack only the feature vector of my new picture and that of one other picture into the zip, the whole thing just takes too long, because I would then need about 1045 requests to the service.
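To make clear what I would expect, here is a minimal sketch in plain Python of the "one against many" comparison. It assumes cosine similarity as the measure (I don't know which measure the service actually uses), and all names (`cosine_similarity`, `rank_similar`, the record IDs, the toy vectors) are made up for illustration. This is not the service's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if one has zero norm)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query, stored, top_k=3):
    """Compare the query vector with each stored vector only (no stored-vs-stored pairs).

    Returns the top_k (record_id, score) pairs, highest score first.
    """
    scores = [(record_id, cosine_similarity(query, vec))
              for record_id, vec in stored.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy data standing in for the feature vectors in the database.
stored = {
    "record_1": [1.0, 0.0, 0.0],
    "record_2": [0.0, 1.0, 0.0],
    "record_3": [0.9, 0.1, 0.0],
}
query = [1.0, 0.05, 0.0]  # feature vector of the newly taken picture

print(rank_similar(query, stored, top_k=2))
```

With 1045 stored records this is 1045 comparisons. If the service really compares every file in the zip with every other file, a zip with 1046 files would mean 1046 * 1045 / 2 = 546,535 pairs, almost all of which I don't need.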
Does anybody have a solution for this problem? I'm really looking forward to hearing from you!
Greetings
Stefan