Developer forum


FileIndexBuilder taking a long time

Nuno Aguiar Dynamicweb Employee

Hi,

 

We have a customer with almost 200,000 files that we had to index. Initially I was using the schema extender; a build took 14 to 20 minutes and every instance took around 3 GB on disk.

 

So then I removed the schema extender and created just a handful of fields (the ones I absolutely needed), and my instance size on disk dropped to 60 MB. I was very pleased with this, but the build still takes the same amount of time, which makes me wonder whether something in the builder assumes the schema extender and still does that work, hurting performance without the result ever being used.

 

Can anybody confirm this?

 

Best Regards,

Nuno Aguiar


Replies

 
Nicolai Pedersen

Hi Nuno.

Iterating all those files will take time - probably some of the work is still done to get data for fields you do not need. Feel free to look into the code yourself and improve it as needed.

Thanks, Nicolai

 
Nuno Aguiar Dynamicweb Employee

Hi Nicolai,

 

Well, I still don't have direct access to the source code, so it's always a hassle (I'll reach out to someone at the Summit about it).

 

I noticed that I actually needed to set SkipMetaData to True to get that performance increase. Not adding the schema extender was the wrong approach.

 

Thank you,

Nuno Aguiar

 
Nuno Aguiar Dynamicweb Employee

Hi Nicolai,

 

There must be something else at play that I don't know how to debug. Using 9.4.18, just opening the index (with these almost 200,000 files) takes a long time, spikes my CPU and uses a lot of memory.

 

It looks as if, as soon as I open the index to rebuild my instance, DW is doing a lot more under the covers than it needs to. Could you ask QA to investigate it? I see this in http://qa-abbycadabby.dw-demo.com (as I assume you need a proper scenario)

 

Best Regards,

Nuno Aguiar

 
Nicolai Pedersen

Hi Nuno

Unfortunately we cannot do that. You made it, you find the bug and fix it or report it :-).

 

Capture.PNG
 
Nuno Aguiar Dynamicweb Employee

Hi Nicolai,

 

That comes from the Schema Extender. And we have no customizations on it. I am happy to track it down further but it seems to be out of my control for now :(

 

I did notice that I am not getting that in our dev environment, which has far fewer files. And it's coming directly from some JPGs the customer is uploading. So from my point of view:

  • I configured a file schema extender
  • I set SkipMetaData to false in the Build settings
  • I am getting this error because of some JPG files the customer is uploading

 

I tried to remove the schema extender and create the fields that I needed manually (as a workaround) and now I am getting acceptable performance in the backend.

 

This tells me there's something to look at, since the usage of the schema extender is causing issues and we don't control the files the customers put in there.

 

Hope that makes sense to you. And I would guess this is a bug report :P

 

Best Regards,

Nuno Aguiar

 
Nicolai Pedersen
This post has been marked as an answer

Hi Nuno

Sorry - I misunderstood (read and concluded too fast).

In /System/Repositories there is a metadata file which seems to contain incorrect data. We will look into why. You can delete it and build again. I think it might grow with each build, causing the slow performance. Just delete it.
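One way to check whether a file in that folder really grows from build to build is to record file sizes before and after a rebuild. The following is only a minimal Python sketch run against the file system, not part of Dynamicweb; the function names are made up for illustration and the folder path in the usage comment is an assumption about where the solution's Files folder lives.

```python
import os


def snapshot_sizes(folder):
    """Return {relative_path: size_in_bytes} for every file under folder."""
    sizes = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            sizes[os.path.relpath(path, folder)] = os.path.getsize(path)
    return sizes


def grown_files(before, after):
    """Return {path: (old_size, new_size)} for files larger than before.

    Files that did not exist in the first snapshot count as growing from 0 bytes.
    """
    return {
        path: (before.get(path, 0), size)
        for path, size in after.items()
        if size > before.get(path, 0)
    }


# Usage (the path is hypothetical, adjust it to your installation):
#   before = snapshot_sizes(r"C:\Solutions\MySite\Files\System\Repositories")
#   ... rebuild the index in the backend ...
#   after = snapshot_sizes(r"C:\Solutions\MySite\Files\System\Repositories")
#   print(grown_files(before, after))
```

If the same file shows up as larger after every rebuild, that would support the growth theory; if its size stays flat, the cause is elsewhere.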

BR Nicolai

Votes for this answer: 1
 
Nuno Aguiar Dynamicweb Employee

Hi Nicolai,

 

Thank you for the answer. For now I have added the fields manually and that worked, but it's good to know about this workaround as well.

 

Best Regards,

Nuno Aguiar

 
Nicolai Pedersen

Hi Nuno

You can skip metadata or not - that is currently the only workaround. The problem is that they have a million or so different metadata tags on their images. I've added a feature request to make it possible to skip EXIF and XMP metatags explicitly.

We have investigated the issue and the file is not growing; the cause is simply that you have so many meta tags, so deleting it will make no difference.
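To get a feel for why the number of distinct tags matters, here is a rough Python sketch. It assumes that every distinct tag name ends up as its own entry in the metadata file (the thread suggests this but does not confirm it), and it works on already-extracted tags; the sample data and the function name are made up for illustration and have nothing to do with the Dynamicweb API.

```python
from collections import Counter


def distinct_tag_counts(files_metadata):
    """Count in how many files each metadata tag name appears."""
    counts = Counter()
    for tags in files_metadata.values():
        counts.update(set(tags))  # count each tag name once per file
    return counts


# Hypothetical sample: tag names per image, as some extractor might return them.
sample = {
    "a.jpg": {"exif:Make": "X", "xmp:Rating": "5"},
    "b.jpg": {"exif:Make": "Y", "xmp:CustomTag123": "foo"},
    "c.jpg": {"exif:Make": "Z"},
}

counts = distinct_tag_counts(sample)
print(len(counts))  # number of distinct tag names across all files

# Tags that occur in only one file are the ones that inflate the list most.
rare = sorted(tag for tag, n in counts.items() if n == 1)
print(rare)
```

With customer-uploaded images, a handful of files carrying unusual XMP tags can add many one-off names, which is why skipping EXIF and XMP explicitly would help.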

We are still investigating if there is a format issue.

BR Nicolai

 
Nuno Aguiar Dynamicweb Employee

Hi Nicolai,

 

Great, those feature requests are reasonable. Thanks for looking into this.

 

Best Regards,

Nuno

 
