Developer forum

Problem with indexing

Anders Ebdrup

Hi,

 

I have repeatedly run into problems with the indexing process. The index consists of more than 1.3 million entries.

The problem that occurs is:
Error message: Cannot overwrite: E:\iis\xx.xxxxxxxx.xx\Files\System\_search\Results\index\_2y.cfs
Call stack: at Lucene.Net.Store.FSDirectory.InitOutput(String name)
at Lucene.Net.Store.SimpleFSDirectory.CreateOutput(String name)
at Lucene.Net.Index.CompoundFileWriter.Close()
at Lucene.Net.Index.DocumentsWriter.CreateCompoundFile(String segment)
at Lucene.Net.Index.IndexWriter.DoFlushInternal(Boolean flushDocStores, Boolean flushDeletes)
at Lucene.Net.Index.IndexWriter.DoFlush(Boolean flushDocStores, Boolean flushDeletes)
at Lucene.Net.Index.IndexWriter.PrepareCommit(IDictionary`2 commitUserData, Boolean internal_Renamed)
at Lucene.Net.Index.IndexWriter.Commit(IDictionary`2 commitUserData)
at Dynamicweb.Searching.Management.IndexUpdater.Commit()
at Dynamicweb.Searching.Indexer.PerformIndexModifications(IndexUpdateParameters parameters, IEnumerable`1 data, IndexModificationResult addToResult)
at Dynamicweb.Searching.Indexer.PerformIndexModifications(IEnumerable`1 fromEntries, IndexUpdateParameters parameters)
at Dynamicweb.Searching.Indexer.PerformUpdate(IEnumerable`1 data, IndexerQueueContentType expectedContentType, IndexUpdateParameters parameters)
at Dynamicweb.Searching.IndexManager.UpdateIndexInternal(Indexer i, Boolean isFullUpdate, Boolean updateSpell, IDictionary`2 flags)


Does anyone know how to avoid this?


Replies

 
Mikkel Høst

I'm not sure, but from my experience working with Lucene you need to make sure that the w3wp process has access to the index folder, so make sure that is set up in IIS.

Another problem I have encountered: if you update the index and the code calls index->optimize(), all these .cfs (compound) files are merged into the main index. So if you are working on the index, doing a lot of updates, rebuilds and so on, you may be trying to overwrite a file that was deleted by a previous action.

Open Resource Monitor, look at the disk activity after doing something with the index, and make sure that the index files are not being written to or read from. Even if your operations are "done", the index will still be doing work in the background. This can be quite a challenge when you are developing, because the work is done by the system process, so if you kill your w3wp process it will still continue.

Just my 5 cents from working with Lucene. It's really awesome, but it can be quite hard if you have large indexes under development.
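
For a job that runs unattended, the same "is anything still writing to the index?" check could be scripted instead of watched in Resource Monitor. Below is a minimal Python sketch, not part of Dynamicweb or Lucene.Net. The function names and the 60-second threshold are assumptions for illustration, and file modification times only reveal writes, not reads.

```python
import os
import time


def seconds_since_last_write(index_dir, now=None):
    """Return seconds since the most recently modified file in index_dir changed.

    Returns None when the directory contains no files.
    """
    now = time.time() if now is None else now
    mtimes = [
        entry.stat().st_mtime
        for entry in os.scandir(index_dir)
        if entry.is_file()
    ]
    if not mtimes:
        return None
    return now - max(mtimes)


def index_is_quiet(index_dir, quiet_seconds=60, now=None):
    """True when no file in index_dir was modified within quiet_seconds.

    quiet_seconds is an arbitrary illustrative threshold, not a Lucene setting.
    """
    age = seconds_since_last_write(index_dir, now=now)
    return age is None or age >= quiet_seconds
```

A scheduled job could call index_is_quiet on the index folder before starting the next update and postpone the run when it returns False.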

 

 
Anders Ebdrup

Hi Mikkel,

 

Thanks for the reply. The update is run at 3:00 AM by "scheduled tasks", and the lock only occurs once in a while, so I have no intention of checking the resource monitor while it is running :-) Besides, the indexing process takes approximately 50 minutes.

 

The task is started with these parameters: "Admin/Public/IndexUpdateTask.ashx?Path=Results&FullUpdate=False"

 

Best regards, Anders

 
Mikkel Høst

Okay. In that case, try a FullUpdate and see if the result is the same.
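
As a rough illustration of switching the scheduled task between a partial and a full update, here is a small Python sketch that builds the task URL from the parameters quoted earlier in the thread. The host name is a placeholder, the helper name is hypothetical, and the thread only shows FullUpdate=False, so treating "True" as the accepted value for a full update is an assumption.

```python
from urllib.parse import urlencode

# Handler path as quoted in the thread.
TASK_PATH = "Admin/Public/IndexUpdateTask.ashx"


def build_task_url(base_url, index_path, full_update):
    """Build the index update task URL (hypothetical helper).

    base_url is a placeholder for the site's address. The "True" spelling
    of the FullUpdate flag is an assumption; the thread only shows "False".
    """
    query = urlencode({"Path": index_path, "FullUpdate": str(full_update)})
    return f"{base_url.rstrip('/')}/{TASK_PATH}?{query}"


print(build_task_url("https://example.invalid", "Results", True))
```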

 
Anders Ebdrup

But if I update the index with a "FullUpdate", will the index still be available to the users while it runs?

 
Mikkel Høst

That depends on how DW implemented the functionality. Maybe you can reflect the DLL and see. AFAIK it would require that they "rebuild" the index at another location and then swap the two when done.

 
Nicolai Høeg Pedersen

DW will index to a new temp index, and when done, replace the old one.

 

Nicolai
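
The build-then-swap approach described above can be sketched generically as follows. This is a Python illustration of the pattern only, not Dynamicweb's actual implementation. The function name, the ".tmp" and ".old" suffixes, and the build_index callback are all hypothetical.

```python
import os
import shutil


def rebuild_and_swap(live_dir, build_index):
    """Build a fresh index next to live_dir, then swap it into place.

    build_index is a hypothetical callable that writes a complete index
    into the directory it is given. If it raises, live_dir is left untouched.
    """
    live_dir = os.path.normpath(live_dir)
    temp_dir = live_dir + ".tmp"
    old_dir = live_dir + ".old"

    # Start from a clean slate in case an earlier run was interrupted.
    for leftover in (temp_dir, old_dir):
        shutil.rmtree(leftover, ignore_errors=True)

    os.makedirs(temp_dir)
    try:
        build_index(temp_dir)
    except Exception:
        # A failed build never touches the live index.
        shutil.rmtree(temp_dir, ignore_errors=True)
        raise

    # Swap: move the live index aside, promote the new one, drop the old one.
    if os.path.isdir(live_dir):
        os.rename(live_dir, old_dir)
    os.rename(temp_dir, live_dir)
    shutil.rmtree(old_dir, ignore_errors=True)
```

Note that there is a short window between the two renames in which live_dir does not exist, and that on Windows a rename fails while another process holds files in the directory open, so a reader or writer that is still active can make the swap step fail.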

 
Anders Ebdrup

Strange. What is locking the temporary index, then? And if you are using a temporary index, why does this corrupt the primary index?

 
Anders Ebdrup

Hi Nicolai,

 

Can you please confirm that the following is true? It seems strange that it would then corrupt the index:

"DW will index to a new temp index, and when done, replace the old one."

Best regards, Anders

 
Pavel Volgarev

Hi Anders,

 

The backup update is optional. These are some settings that can help you make the indexing (and searching) process more robust:

 

  • /Globalsettings/System/Searching/IndexingBackupUpdate
    Set this flag to "True" in order to make Dynamicweb update the index outside of the current folder (it will use the "backup" folder instead) and replace the "live" version in case of success.
     
  • /Globalsettings/System/Searching/IndexingRetryCount
    The maximum number of attempts to perform in case of an error during indexing.
     
  • /Globalsettings/System/Searching/IndexingDelay
    The delay (in milliseconds) between attempts (see the previous setting).
     
  • /Globalsettings/System/Searching/TryRecoverCorruptedIndex
    Set this flag to "True" and Dynamicweb will monitor the state of the index during each query operation. If the index is likely to be corrupted, the full update will trigger automatically.

 

-- Pavel
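
To make the interplay of the retry count and the delay concrete, here is a generic retry loop in Python. It only illustrates the idea behind IndexingRetryCount and IndexingDelay. It is not Dynamicweb's code, and the function and parameter names are assumptions.

```python
import time


def run_with_retries(operation, retry_count, delay_ms, sleep=time.sleep):
    """Call operation() up to retry_count times, pausing delay_ms between attempts.

    Returns the operation's result, or re-raises the last error when every
    attempt fails. retry_count is the maximum number of attempts and delay_ms
    is in milliseconds, mirroring the setting descriptions in the thread.
    """
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")
    for attempt in range(1, retry_count + 1):
        try:
            return operation()
        except Exception:
            if attempt == retry_count:
                raise  # out of attempts: surface the last error
            sleep(delay_ms / 1000.0)
```

With a sporadic error such as a lock timeout, a delay long enough for the competing operation to finish matters more than a high attempt count.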

 
Anders Ebdrup

Hi Pavel,

 

Thanks for your reply. Do these settings apply only to the product index, or also to custom indexes?

 

Best regards, Anders

 
Pavel Volgarev

Hi Anders,

 

These settings are global and apply to all the indexes you have.

 

-- Pavel

 
Anders Ebdrup

Hi Pavel,

 

I have tried this setting:

 

/Globalsettings/System/Searching/IndexingBackupUpdate

 

But there seems to be a problem with updates that are not full updates: after a partial update, only the updated entries are in the index.

 

 

In the meantime I have experienced more problems with the index:

 

[507] An error occured while updating the spell corrections source.

 

Index: Results

Date: 4/28/2013 3:00:01 AM

Solution:

Error message: no segments* file found in Lucene.Net.Store.SimpleFSDirectory@E:\iis\xx.xxxxxxxxx.xx\Files\System\_search\Results\index: files: _28n.cfs _33.cfs _3d.cfs _3o.cfs _3z.cfs _4a.cfs _4l.cfs _4m.cfs _4n.cfs _4o.cfs _4p.cfs _4q.cfs _4r.cfs _uy.cfs

Call stack: at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run(IndexCommit commit)
at Dynamicweb.Searching.IndexManager.UpdateSpellDictionary(IndexInfo info, LogHandler log)
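
The file list in this error contains only .cfs segment files and no commit file. A small Python sketch of the same kind of check is shown below. It assumes Lucene's file naming, in which commit files are named segments_N and segments.gen is only a helper file. The function names are hypothetical, and the authoritative logic is the one in SegmentInfos.FindSegmentsFile.

```python
import os


def find_segments_files(file_names):
    """Return the names that look like Lucene commit files (segments_N)."""
    return sorted(
        name for name in file_names
        if name.startswith("segments") and name != "segments.gen"
    )


def has_commit_point(index_dir):
    """True when index_dir holds at least one segments_N file."""
    return bool(find_segments_files(os.listdir(index_dir)))


# File list copied from the error message above.
reported = "_28n.cfs _33.cfs _3d.cfs _3o.cfs _3z.cfs _4a.cfs _4l.cfs " \
           "_4m.cfs _4n.cfs _4o.cfs _4p.cfs _4q.cfs _4r.cfs _uy.cfs".split()
print(find_segments_files(reported))  # prints []
```

A check like this could run before the spell dictionary update, so that a missing commit file triggers a full rebuild rather than a failed partial update.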

 

[501] An error occured while updating an index.

 

Index: Results

Date: 4/28/2013 3:00:02 AM

Solution:

Error message: Object reference not set to an instance of an object.

Call stack: at Dynamicweb.Searching.Indexer.PerformIndexModifications(IndexUpdateParameters parameters, IEnumerable`1 data, IndexModificationResult addToResult)
at Dynamicweb.Searching.Indexer.PerformIndexModifications(IEnumerable`1 fromEntries, IndexUpdateParameters parameters)
at Dynamicweb.Searching.Indexer.PerformUpdate(IEnumerable`1 data, IndexerQueueContentType expectedContentType, IndexUpdateParameters parameters)
at Dynamicweb.Searching.IndexManager.UpdateIndexInternal(Indexer i, Boolean isFullUpdate, Boolean updateSpell, IDictionary`2 flags)

 

[501] An error occured while updating an index.

 

Index: Results

Date: 4/29/2013 3:00:03 AM

Solution:

Error message: Lock obtain timed out: NativeFSLock@E:\iis\xx.xxxxxxxxx.dk\Files\System\_search\Results\index\write.lock

Call stack: at Lucene.Net.Store.Lock.Obtain(Int64 lockWaitTimeout)
at Lucene.Net.Index.IndexWriter.Init(Directory d, Analyzer a, Boolean create, Boolean closeDir, IndexDeletionPolicy deletionPolicy, Boolean autoCommit, Int32 maxFieldLength, IndexingChain indexingChain, IndexCommit commit)
at Lucene.Net.Index.IndexWriter..ctor(Directory d, Analyzer a, Boolean create, MaxFieldLength mfl)
at Dynamicweb.Searching.Management.IndexUpdater.GetIndexWriter(String physicalPath, String path, Boolean clearOnError)
at Dynamicweb.Searching.Management.IndexUpdater.Update(IndexUpdateParameters parameters)
at Dynamicweb.Searching.Indexer.PerformIndexModifications(IndexUpdateParameters parameters, IEnumerable`1 data, IndexModificationResult addToResult)
at Dynamicweb.Searching.Indexer.PerformIndexModifications(IEnumerable`1 fromEntries, IndexUpdateParameters parameters)
at Dynamicweb.Searching.Indexer.PerformUpdate(IEnumerable`1 data, IndexerQueueContentType expectedContentType, IndexUpdateParameters parameters)
at Dynamicweb.Searching.IndexManager.UpdateIndexInternal(Indexer i, Boolean isFullUpdate, Boolean updateSpell, IDictionary`2 flags)

 

Best regards, Anders
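
For readers unfamiliar with the "Lock obtain timed out" error: it means a writer waited for the index's write lock and gave up because another writer still held it. The Python sketch below reproduces that behaviour with a simplified create-exclusive lock file. It is not how NativeFSLock works internally (that class relies on OS-level file locks), and all names here are hypothetical.

```python
import os
import time


class LockTimeout(Exception):
    """Raised when the lock could not be obtained before the deadline."""


def obtain_lock(lock_path, timeout_s, poll_s=0.05):
    """Create lock_path exclusively, retrying until timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise LockTimeout("Lock obtain timed out: " + lock_path)
            time.sleep(poll_s)


def release_lock(lock_path):
    """Release the lock by deleting the lock file."""
    os.remove(lock_path)
```

In this model, a second update that starts while the first one still holds the lock fails in exactly the way the last log entry shows, which is consistent with two index operations overlapping.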

 

 
Anders Ebdrup

Hi Pavel,

 

Any news regarding this subject?

 

Best regards, Anders

 
