Determine Availability of a RAS Service

Former Member
0 Kudos

Good morning all,

We use an asp.net crystal viewer to open reports, and run in a multi-ras configuration.

We know that a RAS Service will go into a "hung" state if it doesn't have the contiguous memory it requires and that the RAS services should be recycled every so often to prevent it.

Because of this, and because RAS works in a purely round-robin manner, we need to determine the status of a RAS Service before sending it a request. Otherwise, eventually all requests will get sent to the bad service and no users will be able to open any reports.

Do you have any suggestions on how to determine if a RAS Service is hung? For example, perhaps send the RAS Service a simple request and attempt to catch a particular error.

Any assistance you can offer will be greatly appreciated.

Thanks,

Mike

Accepted Solutions (1)

atul_chowdhury2
Active Participant
0 Kudos

You'll probably want to implement a watch dog thread (a service from a separate process might actually be better) that force kills the caller of your RAS API program for whatever behaviours you need it to observe.

This answer on SO might put you on the right track:

http://stackoverflow.com/questions/7046574/watch-dog-for-blocking-function-call

Hope this helps

Former Member
0 Kudos

Atul,

Thanks for the suggestion...

It's a good thought you had but we need a solution that will run in-line with any crystal request. We hope to check each RAS Service just before we give it a request and act accordingly.

thanks,

Mike

Answers (8)

Former Member
0 Kudos

Don,

Thank you for the attention you have been paying to this situation. It's very much appreciated.

Knowing now that .NET does have the memory issue, did you investigate any further why one RAS Service being in a hung state, or close to it, would affect the other RAS services?

"... It's Windows OS thing and nothing CR can do about it..."

I'm a little concerned about this statement. Business Objects offers the Embedded option to their clients, boasting the round-robin capability, yet if we send a Crystal request to a "hung" service then it just disappears with no real response at all.

This is an issue that is greatly affecting our customer base, and based on your forum it's also affecting many others.

Is there any thought about possibly getting in between the Crystal viewer's request and the RAS Service so that you can check whether the service is "alive and well" prior to sending the request? If you find a service to be "hung", you could forward the request on to the next service that is "alive and well".

I understand that Business Objects Enterprise has this ability, and that's great for your customers who have BOE. But what about the clientele who have Embedded?

Why is a sanity check included in one round-robin process and not the other?

"...not using Sessions your test app is very slow..."

Several weeks ago we made this change and it has greatly improved the performance of Crystal operations (moving from one page to another).

This solution was a great help.

We're aware of the ProcessAffinity flag and have been looking into it more lately.

"...CPU loading you can get a keycode that has a flag set to throttle each server..."

What did you mean by "throttle each server"?

Thanks,

Mike

0 Kudos

Hi Mike,

As I mentioned, Adam and R&D are putting together a Kbase article to explain how it works and what, if anything, CR can do about it. I don't personally think this is memory fragmentation, but that's just me, and I'm not in that e-mail thread so I don't have all the info. Just ignore what I said; Adam is on top of it... My comments are not the official word, just how I understood it to work...

For throttling there is an option in the RAS Keycode to set the amount of CPU the RAS Server can access. It can be set to 50%. It won't help in this case though and I don't recall now why I made that suggestion.

As for testing, I simply add a try/catch for each report job; if it fails with an exception it should move on. Without your reports and app I can't do much to test whether a try/catch would help at all; there are too many variables. What you could do, if an exception is thrown, is use Windows APIs to kill that RAS Server and restart it, which may help. Sorry, no sample code on how to do that with Windows APIs. There are also options for the Service itself on the Recovery tab. Possibly setting those may help also.

Anyway, Adam has the case and is working with the Product Group and Developers, so he can continue working with you on a solution.

Thanks again

Don

Former Member
0 Kudos

666739

Former Member
0 Kudos

We did open a case several weeks ago.

SAP Support said the contiguous memory issue doesn't exist with JAVA only with .NET...

Can you please check with SAP Support to see which claim is accurate?

Former Member
0 Kudos

Nikhil\Don,

We are actually using Crystal 2008 and Crystal Embedded 2008 not XI...

"Load Report Failure..."

We are not receiving this error... What happens with us is that we send a Crystal request to a "hung" service and the Crystal viewer (in our .NET app) receives no response.

I cannot speak to Java, but contiguous memory is a problem in .NET.

Don, did you mean to say the problem is in .NET and not JAVA?

As for our solution, we have several ideas that we are creating test solutions for. Some involve automated recycling of services; others test whether a RAS Service might be "hung" so that we can then act on that programmatically.

All are being tested, but it's too early to tell.

We'll certainly update you in the future.

As far as this contiguous memory issue being documented... I found out from an SAP support rep that "contiguous memory" is an issue. I haven't seen it documented anywhere.

--Mike

0 Kudos

Hi Guys,

I've never heard of contiguous memory problems in .NET. Typically using GC.Collect will "clean up" MS's memory manager. In Java it is a problem because Java doesn't handle fragmented memory at all.

Windows apps can run into the 1.5 Gig memory limit. It's actually 2 Gig, but 1.5 is about as far as a Windows app can get because of various environment requirements for memory space.

I still highly recommend you both log cases and we can work directly on the issues.

Just make sure you are closing and disposing of your report objects; using GC.Collect helps also.

Don

0 Kudos

Very interesting. What is the case number? As I said, I've never heard of this problem in .NET, only in Java...

Edited by: Don Williams on Sep 27, 2011 7:24 AM

0 Kudos

Hi Mike,

I've done some testing and talked to Adam about this, and I'm sorry to say I was under the impression that we don't have memory fragmentation in a .NET app. After discussing it with Adam and our Platform Developer, who is where Adam is getting his info from, they have convinced me that we do have issues, but it's just how Windows works, as Adam indicated in the case notes.

Adam is going to put a Kbase together, once all the info is collected and R&D OKs it, to describe how MS handles memory and why cycling applications is both required and expected. It's a Windows OS thing and nothing CR can do about it.

And to verify, not using Sessions your test app is very slow, and I see that a new RAS Server handles each page request: because sessions are not used, the app loads the report again each time it pages.

When the report is put in session, the RAS Server that opened the report does the page handling.

For the CPU loading you can get a keycode that has a flag set to throttle each server, talk to your account manager for this option if you want to go that route.

Also, if you have not seen it, ask them for this doc or search for it in SMP:

Crystal Reports Server Embedded 2008

Sizing and Configuration Guide for OEM Partners

Scalability may also be a problem with 4 RAS Servers running on one PC. If they are seeing problems with report requests, I recommend you get another PC to push the Servers onto and dedicate it to just RAS. Check out the Process Affinity Flags also, to allocate one CPU to one RAS Server.

Thanks again and sorry for the mis-info...

Don

Former Member
0 Kudos

Ok, we'll consider the ticket...Thank You..

Ok, one last question and I think I know the answer but here goes:

When RAS sends a Crystal request on to the SQL Server database, it creates a SPID with a program name of Report Application Server. Is there any way that I can retrieve that RAS SPID from the Crystal viewer?

thanks,

Mike

0 Kudos

Hi Mike,

I checked with the SDK PM and he says that CRSE Servers in .NET do NOT talk to each other so a case will have to be created so we can get more info and possibly a sample app to duplicate the issue.

As for the SPID, CR is not assigning the ID. It's the Client engine doing this. Possibly MS SQL Server has an API to get the ID from the client...

Post to MS Forums, possibly someone there can tell you how to...

Don

Former Member
0 Kudos

Don,

It's been a while since we spoke and there has been a lot of work going on here to deal with our RAS hang\crash issues.

Additionally, I thank you for all the assistance you have provided.

I have, yet another question for you...

We have noticed that in a multi-RAS configuration (4 RAS Services, for example), when the default RAS Service gets into a state where its memory is high, one or more of the 3 additional RAS Services can potentially land in a crisis situation. In other words, one of the other RAS Services "hangs".

Do you know if there is some connection between the DEFAULT RAS Service(created by the Crystal Embedded Install) and the additional RAS Services that would explain this?

In the past, we have been told that the RAS Services act independently of one another (in fact they don't even know about the others), but it just seems to be too much of a coincidence to overlook.

Any assistance you can offer will be greatly appreciated.

thank you,

Mike

0 Kudos

Hi Mike,

I suggest you create a case and work with one of us to debug this. We need more info on the configuration, on what is happening in your main app as far as data goes, and on why the first RAS service is taking up so much memory...

There is no communication between CRSE Services that I know of. In BOE, yes, because the CMS manages the jobs, but in CRSE, no. But I've never asked, so I can't say for sure. I don't believe so; the config file simply tells your app which server is available, and where, if the previous one is busy.

Thanks again

Don

Former Member
0 Kudos

Hello Michael,

We are facing the same issue of RAS hanging and our applications not being able to run CR reports. They get a "Load Report Failure..." error.

We are trying to implement an automated RAS restart to get around this. This reduces the chances of the error, but will not fix the issue. Also, we are facing this in R2. We are in the process of upgrading to R3. Does this issue exist in R3.1 as well?

Is there any documentation that explains this issue where RAS needs contiguous memory?

What plan are you following to fix this issue? Any help is greatly appreciated.

Thanks,

Nikhil

0 Kudos

Hi Nikhil,

Same for you, since you are using CRSE please log a case, assuming you have extended support for XI R2?

Contiguous Memory is only an issue in Java, not in .NET.

Don

Former Member
0 Kudos

Don,

Thank you for your quick response.

You use the term "down" when you say you're testing RAS. When you say "down", do you mean stopped or "hung"? I define "hung" to be not functioning but having a status of "Started".

If a RAS Service is "hung" and we attempt to programmatically stop it, then it takes a considerable amount of time for it to stop.

The other option is to "kill" the RAS service via .net. Have you killed a RAS Service programmatically via .NET?
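A minimal sketch of combining the two options above, for illustration only: request a normal stop, wait a bounded time, and fall back to killing the process. It uses the standard `System.ServiceProcess.ServiceController` and `System.Diagnostics.Process` classes. The class and method names, the service name and the PID in the usage note are hypothetical, and finding the right PID for a given RAS Service is a separate problem.

```csharp
using System;
using System.Diagnostics;
using System.ServiceProcess;

static class RasServiceControl
{
    // Returns true if the service stopped cleanly within the timeout,
    // false if we gave up and tried to kill the process instead.
    public static bool StopOrKill(string serviceName, int pid, TimeSpan timeout)
    {
        try
        {
            using (var sc = new ServiceController(serviceName))
            {
                if (sc.Status != ServiceControllerStatus.Stopped)
                {
                    sc.Stop();
                    // Throws System.ServiceProcess.TimeoutException if still not stopped.
                    sc.WaitForStatus(ServiceControllerStatus.Stopped, timeout);
                }
                return true;
            }
        }
        catch (System.ServiceProcess.TimeoutException)
        {
            // The stop request was accepted but never completed: treat as hung.
        }
        catch (InvalidOperationException)
        {
            // The service was not found or refused the stop request.
        }

        try
        {
            Process.GetProcessById(pid).Kill();
        }
        catch (ArgumentException)
        {
            // No process with that PID is running any more.
        }
        return false;
    }
}
```

Usage, with made-up values: `RasServiceControl.StopOrKill("MyRasService", 4321, TimeSpan.FromSeconds(30));`. The timeout is the design choice here: it bounds how long a hung service can hold up the caller before the kill path is taken.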

Thanks,

Mike

0 Kudos

Hi Mike,

Ah, right, different scenario... A stopped RAS Server will generate the exception, whereas a hung service won't respond at all...

I've asked for this set of APIs for CRSE for a few years now and nothing new has been added. We don't have any way of detecting whether a RAS service has hung. Possibly in the Services applet itself: there are 3 options for each service, but I'm not sure whether Windows can detect that a service has stopped responding either.

I've never tried to kill a Service in code but I'm sure it's possible, try googling it...

I found this one:

http://www.dotnetspark.com/kb/1051-kill-windows-service-programmatically.aspx

The problem is how to detect that it has stopped responding...
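One generic way to approximate this, offered as a sketch rather than anything the CRSE SDK provides: run a small test request on a worker thread and treat "no answer within N seconds" as hung. `RasProbe`, `ProbeResult` and the probe body are hypothetical names. The caller supplies the request itself, for example opening and closing a small report. `Task.Run` needs .NET 4.5 or later.

```csharp
using System;
using System.Threading.Tasks;

static class RasProbe
{
    public enum ProbeResult { Ok, Failed, TimedOut }

    // Runs the probe on a thread-pool thread and waits at most 'timeout'.
    // Ok       = the probe returned in time
    // Failed   = the probe threw (e.g. the server is stopped)
    // TimedOut = no answer in time (a candidate "hung" server)
    public static ProbeResult Run(Action probe, TimeSpan timeout)
    {
        Task task = Task.Run(probe);
        try
        {
            return task.Wait(timeout) ? ProbeResult.Ok : ProbeResult.TimedOut;
        }
        catch (AggregateException)
        {
            // Task.Wait wraps the probe's exception in an AggregateException.
            return ProbeResult.Failed;
        }
    }
}
```

Note that a timed-out probe is abandoned, not cancelled: the worker thread stays blocked until the underlying call returns, so probing a hung server repeatedly will tie up threads.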

What you may want to do is capture why the Server goes into a Not Responding state. Typically it can be due to really long SQL queries where the session times out. Or it may be that a "something was missed" prompt popped up but there was no window to set focus to...

In a BOE system the CMS service is capable of handling hung services, but I'm not sure what happens in CRSE without a CMS service running... I don't believe it has anything available. The CMS will ping the service, and the default timeout is 10 minutes; if it doesn't get a response it restarts the service.

Thanks again

Don

Former Member
0 Kudos

Thanks for the link on how to kill the PID. The next challenge is to match up the PID to kill with a specific RAS Service.

I was able to write code to find all services with an image name of crystalras and their corresponding PIDs; however, the next hurdle was to identify a specific RAS Service in a multi-RAS environment.

The best we could do, at this point, was to output the result of NETSTAT to a file. We now have the ability to link a PID to a specific RAS Service. This can be done because the NETSTAT output has the RAS Service port number and its corresponding PID.

Luckily, I know which port number I'm working with at the moment I'm trying to link a PID to a RAS Service.
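For anyone wanting to do the same, here is a rough sketch of the port-to-PID lookup. It assumes the English-locale column layout of `netstat -ano` on Windows (Proto, Local Address, Foreign Address, State, PID); the `-ano` flags, the class name and the port number in the usage note are my assumptions, not taken from our actual code.

```csharp
using System;
using System.Diagnostics;

static class RasPortLookup
{
    // Returns the PID of the process listening on the given TCP port,
    // based on 'netstat -ano' text, or -1 if no matching line is found.
    public static int FindListeningPid(string netstatOutput, int port)
    {
        string suffix = ":" + port;
        string[] lines = netstatOutput.Split(
            new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            string[] cols = line.Split(
                new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            // Expected: TCP  0.0.0.0:1566  0.0.0.0:0  LISTENING  4321
            if (cols.Length == 5 && cols[0] == "TCP" && cols[3] == "LISTENING"
                && cols[1].EndsWith(suffix, StringComparison.Ordinal))
            {
                int pid;
                if (int.TryParse(cols[4], out pid)) return pid;
            }
        }
        return -1;
    }

    // Runs netstat and captures its output in memory instead of a file (Windows only).
    public static string RunNetstat()
    {
        var psi = new ProcessStartInfo("netstat", "-ano")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process p = Process.Start(psi))
        {
            string output = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
            return output;
        }
    }
}
```

Usage, with a made-up port: `int pid = RasPortLookup.FindListeningPid(RasPortLookup.RunNetstat(), 1566);`. Matching on the `:port` suffix of the local address keeps port 1566 from matching 11566.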

Too bad that the .NET CRSE version of RAS isn't as up to speed as the other technologies. Oh well, such are the joys of app development.

As for catching the cause of the RAS hangs, we have been attempting to do this. The problem is that we never know when the RAS service will get pushed to the brink. Our customers can run some pretty intense and long-running SQL, which can put a heavy burden on RAS... Sometimes we can track the exact code that caused the problems (a RAS SPID with high CPU, disk I/O and spawned threads); other times the best we can do is narrow it down to a handful of RAS SPIDs and hope to find some inefficient code that can be corrected. This is what I refer to as a "shotgun blast" way of finding the troubling SQL, as sometimes the bad SQL might run a couple of hours before the RAS Service hangs, and it will be some lesser SQL that pushes the service to the point of failure.

Once again, thanks for all your assistance...

0 Kudos

Hi Mike,

I've used this in testing to see if a RAS service is down:


// Requires a reference to the RAS .NET SDK and:
// using CrystalDecisions.ReportAppServer.ClientDoc;
const int jobCount = 5;
const int maxFailures = 10;   // give up eventually so we can't loop forever if every server is down
int failures = 0;

//string reportPath = "rassdk://c:\\reports\\TestReport.rpt";
string reportPath = "ras://c:\\reports\\World Sales Report.rpt";
object reportPathAsObject = (object)reportPath;

for (int i = 0; i < jobCount; i++)
{
   ReportClientDocument rcd = new ReportClientDocumentClass();
   try
   {
       rcd.Open(ref reportPathAsObject, 1);
       System.Console.WriteLine("The report application used is:" + rcd.ReportAppServer + " - JobID: " + i);
       rcd.Close();
   }
   catch
   {
       // if the RAS server is down, Open throws and we land here
       System.Console.WriteLine("The RAS Server:" + rcd.ReportAppServer + " is down. - JobID: " + i);
       // the report never opened, so Close may itself throw; ignore that
       try { rcd.Close(); } catch { }
       if (++failures >= maxFailures) break;
       // reset the report counter back one so the failed job is retried on the next server
       i = i - 1;
   }
}

If the Job server isn't taking requests the open should throw an exception...

Thanks

Don