Developer forum

RE: Search in files - what does it search?

Hans Ravnsfjall

Hi

What does the classic search module search when searching in files? Does it search the content of e.g. PDF, Word and Excel files, or only file metadata?

Do the files have to be in specific folders?

Is the new search module/functionality better for searching files?


Replies

 
Nicolai Høeg Pedersen

Hi Hans

It searches the content. It is based on the Windows Search service and searches the file types that service supports, including PDFs, which require an iFilter for PDF. It searches an index set up for the solution (see the installation guide). In the search module you can then select which folder to search in (see the manual).
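To check whether the index actually returns content hits for a given folder, the Windows Search index can be queried directly with its SQL dialect. The following is only a sketch: the helper name `build_content_query`, the example folder and the result limit are illustrative assumptions and not part of Dynamicweb, and the function only builds the query text.

```python
def build_content_query(folder: str, phrase: str, limit: int = 50) -> str:
    """Build a Windows Search SQL query for a full-text phrase under a folder.

    Hypothetical helper for diagnostics; it does not run the query.
    """
    # SCOPE expects a file: URL with forward slashes.
    scope = "file:" + folder.replace("\\", "/").rstrip("/")
    # Escape single quotes for the SQL literals; drop double quotes,
    # because the phrase itself is wrapped in double quotes below.
    safe_scope = scope.replace("'", "''")
    safe_phrase = phrase.replace("'", "''").replace('"', "")
    return (
        "SELECT TOP " + str(int(limit)) + " System.ItemPathDisplay "
        "FROM SystemIndex "
        "WHERE SCOPE='" + safe_scope + "' "
        "AND CONTAINS('\"" + safe_phrase + "\"')"
    )


# Running the query needs a Windows machine and an OLE DB client using the
# provider string "Provider=Search.CollatorDSO;Extended Properties='Application=Windows';".
# That step is left out here because it depends on the server setup.
if __name__ == "__main__":
    print(build_content_query(r"C:\Sites\Files", "invoice"))
```

If a phrase that is known to occur inside a PDF returns no rows, a missing or misconfigured PDF iFilter on the server is a likely cause.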

Weighted search is currently the best option for searching the content of files.

BR Nicolai

 
Hans Ravnsfjall

Thank you, Nicolai.

Does it also search metadata for files?

 
Nicolai Høeg Pedersen

Hi Hans

It does not search Dynamicweb metadata, if that is what you mean.

BR Nicolai

 
Hans Ravnsfjall

OK.

Is there any way to extend the module with OCR (optical character recognition), so that the search also finds results in PDFs that consist of images rather than text?

Or does anybody have experience with this, and maybe a tip about an alternative solution?

 
Nicolai Høeg Pedersen

Hi Hans

It is not easy - but possible.

The Windows Search service uses iFilters to index the various document types, including one specifically for PDFs. You can find an iFilter for PDF that supports OCR and have it index PDFs whose text is embedded as images. An example: http://www.abbyyeu.com/rs/en:ifilter

If you get that up and running, you do not need any changes in the DW search module.

That requires the filter to be installed on the relevant server. That can be an issue in a shared environment, so talk to the hosting provider about it.
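Before investing in an OCR iFilter, it may help to estimate how many PDFs in the files folder have no text layer at all. The sketch below is a crude heuristic and an assumption on my part, not a Dynamicweb or Windows Search feature: it treats a PDF as image-only when no `/Font` resource key is visible in the raw bytes. PDFs that store their dictionaries in compressed object streams can hide that key, so treat the result as a rough count rather than a verdict. The function names are hypothetical.

```python
import re
from pathlib import Path

# A PDF that draws real text normally references a font resource.
# \b keeps keys such as /FontDescriptor from counting as a match.
_FONT_KEY = re.compile(rb"/Font\b")


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Return True when no /Font key is visible in the raw PDF bytes."""
    return _FONT_KEY.search(pdf_bytes) is None


def image_only_pdfs(folder: str) -> list[Path]:
    """List PDFs under folder that the heuristic flags as image-only."""
    return sorted(
        path
        for path in Path(folder).rglob("*.pdf")
        if looks_image_only(path.read_bytes())
    )
```

Calling `image_only_pdfs` on the folder selected in the search module gives a list of candidates to spot-check by hand before deciding whether OCR is worth installing.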

BR Nicolai

 
