Solved: Web Repository Manager and robots.txt

Former Member · ‎11-15-2006

Hello,

I would like to search an intranet site and therefore set up a crawler according to the guide "How to set up a Web Repository and Crawl It for Indexing".

Everything works fine.

Now this web site uses a robots.txt as follows:

User-agent: googlebot
Disallow: /folder_a/folder_b/

User-agent: *
Disallow: /

So obviously, only Google is allowed to crawl (parts of) that web site.
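
For illustration, here is a minimal Java sketch of how a crawler might evaluate these rules. The names (RobotsSketch, parse, isAllowed) and the agent string "SomeOtherBot/1.0" are made up for this example. It only handles User-agent and Disallow lines with plain prefix matching, and it is not meant to show how TREX or any particular crawler actually implements this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class RobotsSketch {

    // Parses robots.txt text into: lower-cased user-agent token -> Disallow prefixes.
    static Map<String, List<String>> parse(String robotsTxt) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        List<String> currentAgents = new ArrayList<>();
        boolean inRules = false;
        for (String raw : robotsTxt.split("\\R")) {
            String line = raw.trim();
            int colon = line.indexOf(':');
            if (line.isEmpty() || line.startsWith("#") || colon < 0) {
                continue; // skip blank lines, comments and anything without "key: value"
            }
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (key.equals("user-agent")) {
                if (inRules) { // a User-agent line after rules starts a new group
                    currentAgents = new ArrayList<>();
                    inRules = false;
                }
                String agent = value.toLowerCase(Locale.ROOT);
                currentAgents.add(agent);
                groups.putIfAbsent(agent, new ArrayList<>());
            } else if (key.equals("disallow")) {
                inRules = true;
                if (!value.isEmpty()) { // an empty Disallow blocks nothing
                    for (String agent : currentAgents) {
                        groups.get(agent).add(value);
                    }
                }
            }
        }
        return groups;
    }

    // True if the given user agent may fetch the path under these rules.
    static boolean isAllowed(String robotsTxt, String userAgent, String path) {
        Map<String, List<String>> groups = parse(robotsTxt);
        String ua = userAgent.toLowerCase(Locale.ROOT);
        List<String> rules = null;
        for (Map.Entry<String, List<String>> e : groups.entrySet()) {
            String token = e.getKey();
            if (!token.equals("*") && !token.isEmpty() && ua.contains(token)) {
                rules = e.getValue(); // a specific group wins over "*"
                break;
            }
        }
        if (rules == null) {
            rules = groups.get("*"); // fall back to the catch-all group
        }
        if (rules == null) {
            return true; // no applicable group: nothing is disallowed
        }
        for (String prefix : rules) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robots = "User-agent: googlebot\nDisallow: /folder_a/folder_b/\n\n"
                + "User-agent: *\nDisallow: /\n";
        System.out.println(isAllowed(robots, "Googlebot/2.1", "/index.html"));
        System.out.println(isAllowed(robots, "SomeOtherBot/1.0", "/index.html"));
    }
}
```

With the file above, any agent that does not match the googlebot group falls through to the "*" group and is blocked everywhere.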

My question: if I'd like to add the TREX crawler to the robots.txt, what's the name of the "User-agent" I have to specify there?

Maybe the name I defined under System Configuration > ... > Global Services > Crawler Parameters > Index Management Crawler?

Thanks in advance,

Stefan

Former Member · ‎11-15-2006

Hi Stefan,

I'm sorry, but this is hard-coded. I found it in the class com.sapportals.wcm.repository.manager.web.cache.WebCache:

    private HttpRequest createRequest(IResourceContext context, IUriReference ref)
    {
        HttpRequest request = new HttpRequest(ref);
        String userAgent = "SAP-KM/WebRepository 1.2";
        if (sessionWatcher != null)
        {
            String ua = sessionWatcher.getUserAgent();
            if (ua != null)
                userAgent = ua;
        }
        request.setHeader("User-Agent", userAgent);
        Locale locale = context.getLocale();
        if (locale != null)
            request.setHeader("Accept-Language", locale.getLanguage());
        return request;
    }

So you would either have to recompile the component or change the filter... I would prefer to change the robots.txt.
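
Reduced to its core, the fallback in the method above looks roughly like the sketch below. UserAgentFallback and resolve are names invented for this example, the parameter only stands in for whatever sessionWatcher.getUserAgent() returns, and "MyCrawler/1.0" is a hypothetical override value.

```java
final class UserAgentFallback {

    // Default value as it appears in the decompiled method quoted above.
    static final String DEFAULT_USER_AGENT = "SAP-KM/WebRepository 1.2";

    // sessionUserAgent stands in for sessionWatcher.getUserAgent();
    // null means no override is available, so the default is used.
    static String resolve(String sessionUserAgent) {
        return sessionUserAgent != null ? sessionUserAgent : DEFAULT_USER_AGENT;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(resolve("MyCrawler/1.0")); // hypothetical override
    }
}
```

So the default string is only sent when no session watcher supplies a different one.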

Hope this helps,

Axel

Former Member · ‎11-15-2006

Hello Stefan,

I don't know the name offhand, but you can easily find it out. Go to the log files of the web server of your third-party site and check the entries. If the user agent is not logged, temporarily switch the logging to the default (this works for Apache, for example).
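
If you'd rather not read the log by eye, a small Java sketch like the following could pull the user agent out of a log line. It assumes Apache's "combined" log format, where the User-Agent is the last quoted field. AccessLogUserAgent and userAgentOf are made-up names, and the sample log line is invented for illustration, not taken from a real log.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AccessLogUserAgent {

    // In the combined format a line ends with: "referer" "user-agent"
    // This pattern captures the last quoted field at the end of the line.
    private static final Pattern LAST_QUOTED = Pattern.compile("\"([^\"]*)\"\\s*$");

    // Returns the user agent, or empty if the line does not end with a quoted field
    // (for example, when the log format does not record the User-Agent).
    static Optional<String> userAgentOf(String logLine) {
        Matcher m = LAST_QUOTED.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Invented sample line in combined log format.
        String line = "10.0.0.5 - - [15/Nov/2006:10:12:01 +0100] "
                + "\"GET /robots.txt HTTP/1.1\" 200 68 \"-\" \"ExampleCrawler/1.0\"";
        System.out.println(userAgentOf(line).orElse("(not logged)"));
    }
}
```

Looking for requests to /robots.txt is a quick way to spot the crawler's entries, assuming the crawler fetches that file.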

Hope this helps,

Axel
