MORE INFORMATION
After SharePoint Portal Server crawls a content source, some documents may not be indexed because of errors or rule exclusions. You can use the Gthrlog.vbs utility to extract more information about these documents from the gatherer logs. You may also want to view successful index entries, which are not shown in the Web page viewer by default.
Gatherer log files are created during every index update. They are stored in the following folder, which is typically located under Program Files\SharePoint Portal Server on the drive where you installed the SharePoint Portal Server program files:
Data\FTData\SharePointPortalServer\GatherLogs\workspace_name
All files that have the .gthr extension are log files. You can view these files to identify the documents that were successfully indexed. To find the log that corresponds to the most recent index update, locate the .gthr file with the most recent timestamp. The .gthr files are difficult to read by using a text editor, such as Microsoft Notepad. To parse the logs into a readable format, run the Gthrlog.vbs utility against the log file.
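As an illustration of locating the most recent log, the following Python sketch picks the .gthr file with the newest modification time in a folder. It is not part of SharePoint Portal Server; the function name latest_gather_log is hypothetical, and the sketch assumes that the file system modification time reflects the most recent index update.

```python
from pathlib import Path
from typing import Optional


def latest_gather_log(log_dir: str) -> Optional[Path]:
    """Return the .gthr file with the newest modification time, or None.

    Hypothetical helper: assumes the newest modification time marks the
    log that belongs to the most recent index update.
    """
    # Collect only regular files that carry the .gthr extension.
    candidates = [p for p in Path(log_dir).glob("*.gthr") if p.is_file()]
    if not candidates:
        return None
    # Pick the candidate whose modification time is the latest.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Pass the GatherLogs\workspace_name folder for your workspace as log_dir, and then run the Gthrlog.vbs utility against the file that the function returns.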
Gthrlog.vbs is located in the following folder:
Program Files\Common Files\Microsoft Shared\MSSearch\Bin
To run the utility, type the following command at a command prompt. In this example, "SPS" is the workspace name, the Data folder in the Program Files folder is on drive D, and the log file is named "sps.25.gthr":
cscript gthrlog.vbs "d:\program files\sharepoint portal server\data\ftdata\sharepointportalserver\gatherlogs\sps\sps.25.gthr"
When you run this command, the log file contents are displayed in the command window. To save this information in a text file, redirect the output of the utility to a file, for example:
cscript gthrlog.vbs "d:\program files\sharepoint portal server\data\ftdata\sharepointportalserver\gatherlogs\sps\sps.25.gthr" > c:\viewlog.txt
After you run the Gthrlog.vbs utility against a .gthr file, you receive output that is similar to the following text:
Microsoft (R) Windows Script Host Version 5.1 for Windows
Copyright (C) Microsoft Corporation 1996-1999. All rights reserved.

2/12/2001 10:26:06 AM Add The gatherer has started
2/12/2001 10:26:20 AM Add The initialization has completed
2/12/2001 10:26:22 AM Add Started Full crawl
2/12/2001 10:27:48 AM Add Completed Full crawl
2/12/2001 10:27:48 AM Add Started Full crawl
2/12/2001 10:27:54 AM Add Completed Full crawl
2/12/2001 10:28:48 AM file://./backofficestorage/localhost/SharePoint
Portal Server/workspaces/class/SPS Help.htm/Add/URL is excluded by the
server (robots.txt, no-index attribute on the URL, encrypted file, or a
search folder), or redirected to an excluded URL
When you read this output, you can determine that the SPS Help.htm file was excluded from the index and see the possible causes for the exclusion.
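If the output was saved to a text file, a short script can help pick out the exclusion entries. The following Python sketch is one possible approach, not a Microsoft tool. It assumes that every entry begins with a date and time stamp in the form shown in the sample output and that an entry may wrap across several lines. The names group_entries and excluded_entries are hypothetical.

```python
import re
from typing import List

# Assumption: each entry starts with a stamp such as "2/12/2001 10:28:48 AM".
ENTRY_START = re.compile(r"^\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (AM|PM)\b")


def group_entries(lines: List[str]) -> List[str]:
    """Join wrapped lines so that each returned string is one log entry."""
    entries: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if ENTRY_START.match(line):
            entries.append(line)  # a new entry begins
        elif entries:
            entries[-1] += " " + line  # continuation of the previous entry
        # Lines before the first stamp (the script host banner) are ignored.
    return entries


def excluded_entries(lines: List[str]) -> List[str]:
    """Return only the entries that report an excluded URL."""
    return [entry for entry in group_entries(lines) if "excluded" in entry.lower()]
```

For example, read the lines of c:\viewlog.txt into a list and pass the list to excluded_entries. Each returned string is a complete entry that contains the document URL and the reason text.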