With due respect to the original blog post, which is in German, this is a rewrite into English. German speakers, please let me know if I have missed something.
We often need to configure Office Server Search while working on MOSS 2007 infrastructure projects. The Microsoft Filter Pack is sufficient as long as you only need to index document metadata, or the metadata and content of the file types it supports.
However, a large volume of documentation often exists in PDF format, so there is always a requirement to index those .pdf files. As with every other file type, indexing requires a so-called iFilter. For .pdf files, you need to acquire a third-party iFilter.
The iFilters available for .pdf files include the Adobe PDF iFilter, the Foxit PDF iFilter, and the PDFlib TET PDF IFilter.
Of these, the Adobe PDF iFilter and the Foxit PDF iFilter are the most widely used. The Adobe iFilter is available free of charge; the other two require license fees.
The Adobe and Foxit iFilters differ in performance. Microsoft bloggers have already tested both, and according to their results the Foxit iFilter (32 bit) is around 4 times as fast as the Adobe iFilter, while the Foxit iFilter (64 bit) is about 5 times as fast.
In addition, we found the PDFlib TET PDF IFilter, for which we could not find any performance figures.
This was reason enough to test all three iFilters for their performance.
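To make the reported speedup factors concrete, here is a minimal sketch that converts "N times as fast" into an estimated crawl duration. The 4x and 5x factors come from the text above; the 60-minute Adobe baseline is a purely hypothetical placeholder, not a measured value, and the function name is made up for illustration.

```python
# Speedup factors relative to the Adobe iFilter, as reported in the text.
SPEEDUP_VS_ADOBE = {
    "Foxit (32 bit)": 4.0,
    "Foxit (64 bit)": 5.0,
}


def estimated_duration(baseline_minutes, speedup):
    """Estimate crawl time for a filter that is `speedup` times as fast."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


# Hypothetical baseline: assume the Adobe iFilter needs 60 minutes.
HYPOTHETICAL_ADOBE_MINUTES = 60.0

for name, factor in SPEEDUP_VS_ADOBE.items():
    minutes = estimated_duration(HYPOTHETICAL_ADOBE_MINUTES, factor)
    print(f"{name}: about {minutes:.0f} min instead of "
          f"{HYPOTHETICAL_ADOBE_MINUTES:.0f} min")
```

Note that "4 times as fast" means one quarter of the time, so the step from 4x to 5x saves far less than the step from 1x to 4x.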
The Test Setup:
All three iFilters were tested on the same hardware environment.
We used a Hyper-V virtual machine with two virtual processors and 4 GB RAM. The OS was Windows Server 2008 (Standard Edition) with Service Pack 1.
The SharePoint version was MOSS 2007 SP1 (cumulative update of February 2009).
We indexed 1022 documents (in both German and English) with a total volume of 1.8 GB.
Each iFilter was used with its default settings. The indexing duration was measured in two runs, and the better of the two results was used.
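The evaluation rule described above (two runs, keep the better one) and the resulting throughput can be sketched as follows. The document count and corpus size are taken from the test setup; the run durations and the name `example_ifilter` are hypothetical placeholders, not measured results, and 1 GB is assumed to be 1024 MB.

```python
DOC_COUNT = 1022          # documents indexed (from the test setup)
VOLUME_MB = 1.8 * 1024    # 1.8 GB corpus, assuming 1 GB = 1024 MB


def best_run(durations_s):
    """Return the shortest (best) duration among the measured runs."""
    if not durations_s:
        raise ValueError("at least one run is required")
    return min(durations_s)


def throughput(duration_s):
    """Return (documents per minute, MB per second) for one crawl."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return DOC_COUNT / duration_s * 60, VOLUME_MB / duration_s


# Hypothetical placeholder durations in seconds, NOT measured results.
runs = {"example_ifilter": [1260.0, 1200.0]}

for name, durations in runs.items():
    best = best_run(durations)
    docs_per_min, mb_per_s = throughput(best)
    print(f"{name}: best run {best:.0f} s, "
          f"{docs_per_min:.1f} docs/min, {mb_per_s:.2f} MB/s")
```

Taking the minimum rather than the mean reduces the influence of one-off disturbances on the virtual machine, at the cost of reporting a slightly optimistic figure.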
The Test Result:
License costs compared
As of 11.05.2009; all figures without guarantee.
Although the Adobe iFilter is free, its indexing performance leaves a lot of room for improvement. The Foxit and PDFlib iFilters perform on par with each other. This initial review cannot comment on differences in indexing quality, so the choice of iFilter comes down to other factors such as document volume, cost, and personal preference.
If a large volume of PDF documents is to be indexed, one of the paid options is recommended.
If the index server is not a dedicated server, this task is preferably performed by a web front-end server. Because indexing requires a lot of CPU time, the available CPU should be used as efficiently as possible.
On multi-core machines, the PDFlib IFilter is more cost-effective than its peers.