Best approach for updating the index for catalogues with lots of products

Adrian Ursu Dynamicweb Employee

Hi guys,

I have a solution that is supposed to list a lot of products. So far we have 200k SKUs and I expect to have another 100k by the end of the year.
The challenge here is updating the index once something happens on any of the workflows.
We have set up PIM workflows on the staging solution (where the catalog is smaller) and we have had no issues making changes and moving over products from one workflow state to the other and from one query to the other based on the Shared queries.

On the Production site, once we make any change to any number of products, the index rebuilds entirely, which takes about 15-20 minutes. And since we have multiple users making changes at the same time, the index build keeps restarting.
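
To put the restart effect in numbers, here is a minimal sketch. It assumes that every change restarts the full build from scratch and that a build takes a fixed 20 minutes; the function name and the sample timestamps are invented for illustration only.

```python
def build_finish_minute(change_minutes, build_minutes=20):
    """Return the minute at which the index reflects all changes.

    Assumes each change restarts the build from scratch, so the index is
    complete only once the build started by the last change has finished.
    """
    if not change_minutes:
        return None  # nothing changed, so no build was triggered
    return max(change_minutes) + build_minutes


# Hypothetical edits by several users at minutes 0, 5, 12 and 30.
changes = [0, 5, 12, 30]
print(build_finish_minute(changes))  # 50
```

Under these assumptions, the more often users save changes, the further the moment of a complete index is pushed out.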

I am pretty sure the way I have set up the index is not the most efficient way.

I have a PIM index with 2 builds:

  • Full (HoursToUpdate=24, BulkSize=500)
  • Partial (HoursToUpdate=0.1, BulkSize=500)

I have tried adding a new build, UpdateWithIds, hoping that this would speed up the update, but I still don't get the desired result.

Assuming what I have done is wrong, what would be the best setup of the Index to allow a fast update of the index once something is updated on the SharedQueries/Workflows?

Thank you,
Adrian

 


Replies

 
Søren Jensen Dynamicweb Employee

Hi Adrian,

Can you please send me the URLs of both solutions you are running (Customer and Staging)?

Please also tell me about the server setup behind them (CPU cores, memory, SQL and IIS on separate servers or on the same server...).

Then I will investigate and get back to you.

/Søren

 
Nuno Aguiar Dynamicweb Employee

Hi Adrian,

 

I am sure Soren can help you, but in any case we've also run into long index rebuilds, and we've had success by doing a few things:

  • Not using the SchemaExtender
    Instead focus on the fields you want
  • Adjusting the build settings
    • Skip Grouping
    • Skip Related Products
    • Skip Prices
    • ...

 

We've been playing around with some of the settings with relative success for large catalogs. We also revisit IndexBuilderExtenders and make sure we cache requests and such.
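
As an illustration of the caching idea (this is not Dynamicweb code), here is a minimal Python sketch in which a per-product extender step needs a lookup that is identical for many products. `load_brand_name`, `extend_document` and the brand data are hypothetical stand-ins for whatever expensive request an IndexBuilderExtender makes.

```python
from functools import lru_cache

backend_calls = 0  # counts how often the slow lookup really runs

# Hypothetical data standing in for a database or web-service lookup.
_BRANDS = {"B1": "Acme", "B2": "Globex"}


@lru_cache(maxsize=None)
def load_brand_name(brand_id):
    """Slow lookup, cached so each brand_id is fetched only once."""
    global backend_calls
    backend_calls += 1
    return _BRANDS.get(brand_id, "")


def extend_document(product):
    """Add a derived field to one product document before indexing."""
    doc = dict(product)
    doc["BrandName"] = load_brand_name(product["BrandId"])
    return doc


products = [{"Id": f"P{i}", "BrandId": "B1" if i % 2 else "B2"} for i in range(1000)]
docs = [extend_document(p) for p in products]
print(backend_calls)  # 2
```

With 1000 products and two distinct brands, the slow lookup runs twice instead of 1000 times; the same pattern applies to any value that is shared across many products.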

 

Hope this helps too.

 

Best Regards,

Nuno Aguiar

 
Adrian Ursu Dynamicweb Employee

Hi Soren,

Yes.

Staging version: http://altexpim.staging.dynamicweb-cms.com/

Production version: http://altexpim.cloud.dynamicweb-cms.com/ 

Thank you,

Adrian

 
Adrian Ursu Dynamicweb Employee

Hi Nuno,

In my case, the index has to be used for PIM queries. If the PIM queries are not limited to Product indexes, your suggestion may work.

I will investigate this path.

I have already removed grouping, prices and related products but I have a lot of fields that I need.

Thank you,
Adrian

 
Adrian Ursu Dynamicweb Employee

Hi Soren,

Any news?

Thank you,
Adrian

 
Steffen Kruse Hansen Dynamicweb Employee

Hi Adrian,

I have just looked a bit at the solution, and it seems that updating the index works correctly. At least, I can trigger an update of only one product if I make the change directly on the ProductEditPage.

If you still experience the problem, you could try enabling Auditing on the Live solution. That way we can track exactly what kind of change triggers the index rebuild. In theory it could be something as small as a database column whose value is null where it should be an empty string (this is just a guess), and auditing should help us debug it.

Best regards,

Steffen
 

 
Adrian Ursu Dynamicweb Employee

Hi Steffen,

Thank you for the feedback.

The update on product save works pretty well.

We have a few other scenarios where we update the products, and that's usually a bulk process.

One process is an import using DataIntegration. At the end of the process, I have set a partial index build to run. Sometimes, though, I get an error like this:
Building Index 'PIM.index' failed. Exception: An active task with the given name ('PIM#PIM.index#A') already exists. Please provide a different name or wait until the existing task completes.

I have defined a separate partial update job that I have set to be run on update and apparently the problem is gone. The major issue though is when making bulk changes in a Query.
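
On the task-name error above: the message suggests that build tasks are tracked by name and that a second task with the same name is rejected while the first is still running. Here is a minimal sketch of that behaviour, assuming this reading is right. `TaskRegistry` is hypothetical and not part of the Dynamicweb API, and the second task name is invented, since the real naming scheme is not known from this thread.

```python
import threading


class TaskRegistry:
    """Tracks running tasks by name and rejects duplicates."""

    def __init__(self):
        self._active = set()
        self._lock = threading.Lock()

    def start(self, name):
        with self._lock:
            if name in self._active:
                raise RuntimeError(
                    f"An active task with the given name ('{name}') already exists."
                )
            self._active.add(name)

    def finish(self, name):
        with self._lock:
            self._active.discard(name)


registry = TaskRegistry()
registry.start("PIM#PIM.index#A")  # a partial build is already running

try:
    registry.start("PIM#PIM.index#A")  # the import tries to start the same task
    collided = False
except RuntimeError:
    collided = True

# Invented name: a task registered under a different name is accepted.
registry.start("PIM#PIM.index#ImportPartial")
print(collided)  # True
```

If the real mechanism works like this, the options are the two named in the error text: run the build under a different name, or wait until the running task has finished.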

We have a few custom bulk activities that change properties on the products. At the end of the process, they call UpdateProductIndexes from ProductService to try to start a partial update. For this purpose, we have also set up an Auto-Build build using the "Update with IDs" option, but even for small updates (1 product) the changes are not visible right away, and I have noticed the Full build starting (which takes about 20 minutes).

I am wondering if this is the right approach for what we are trying to do, or if I am doing something wrong, as I have never used "Update with productIds" before.
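
To make the intended flow concrete, here is a sketch only: the bulk activity collects the IDs it touched and hands them over in batches at the end, instead of triggering one update per product. `PendingIndexUpdates` and the `update_index` callback are hypothetical; the callback stands in for whatever actually starts the ID-based build (for example the UpdateProductIndexes call mentioned above), whose real signature is not shown in this thread. The batch size of 500 is borrowed from the BulkSize setting purely as an example value.

```python
class PendingIndexUpdates:
    """Collects changed product IDs and flushes them in batches."""

    def __init__(self, update_index, batch_size=500):
        self._update_index = update_index  # hypothetical callback
        self._batch_size = batch_size      # example value only
        self._ids = []
        self._seen = set()

    def mark_changed(self, product_id):
        # Deduplicate so a product changed twice is sent only once.
        if product_id not in self._seen:
            self._seen.add(product_id)
            self._ids.append(product_id)

    def flush(self):
        """Send all pending IDs in batches and return the batch count."""
        batches = 0
        for start in range(0, len(self._ids), self._batch_size):
            self._update_index(self._ids[start:start + self._batch_size])
            batches += 1
        self._ids.clear()
        self._seen.clear()
        return batches


sent = []  # records each batch that would be handed to the index update
pending = PendingIndexUpdates(update_index=sent.append, batch_size=500)
for i in range(1200):
    pending.mark_changed(f"PROD{i}")
pending.mark_changed("PROD0")  # duplicate, ignored
print(pending.flush())         # 3
print([len(b) for b in sent])  # [500, 500, 200]
```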

Thank you,
Adrian

 
