Developer forum

changing Lucene.Net.Search.SimpleFacetedSearch.MAX_FACETS has no effect ?

Martin Jensen

Hi

I would like to configure a facet for an indexed field with more than 2048 distinct values. I tried overriding the SimpleFacetedSearch.MAX_FACETS value, but I still can't create the facet without getting the error message: "Facets cannot have more than 2048 unique terms in a field". How can I override this value? I'm aware that this might hurt performance, and I'm OK with that.


Replies

 
Nicolai Pedersen

Hi Martin

I think that is a UI limitation. So you can define it directly in the XML file.

And it is not a case of "might hurt performance": it is guaranteed to hurt performance and break your solution under load. You might want to look into another way of solving this. It sounds like you are using facets for something that is not a facet. You should consider another approach and maybe have a look at term facets.

BR Nicolai

 
Anders Ebdrup

Hi Nicolai,

 

We will have to handle approx. 2 million equestrian results, where you can filter on riders and horses, which is what gives the many outcomes.
Is there any way we can override this setting? The solution runs on DW 8.

 

As I remember we have another solution which will break due to this line in the LuceneIndexProvider:

                    //SimpleFacetedSearch.MAX_FACETS = int.MaxValue; Uh, dangerous!

(Jeppe made this change for us: SimpleFacetedSearch.MAX_FACETS = int.MaxValue; )

 

Best regards, Anders

 
Nicolai Pedersen

So you want a disaster option? No one can handle that option and I'll have to debug 100s of slow solutions...

I am not a fan of designing for disaster on purpose. Facets are not meant for more than 2048 outcomes (2048 is already 1900 too many :-)). How would you even show that many options in a UI?

I would love to challenge you on this to find a better implementation. Term facets are created for exactly this...

Sorry for being an asshole :-)

 
Anders Ebdrup

I am always ready for a battle! :-) When can I try to reach you?

 
Martin Jensen

Hi Nicolai

We are never going to render the facets before prefiltering for a specific rider or horse. So if we prefilter (using a search field) for rider ID 777 and he has 50 results with 3 different horses, we need to show a list of the 50 results and a horse facet with the 3 horse options. If we prefilter for a rider, we only render the horse facet. Likewise, if we prefilter for a horse ID, we only show the rider facet. (Both options will also render other facets, but these are not a problem.)

This way we never have to render more than maybe 5-6 options for the facets that are a problem.

How can we solve this?

 
Nicolai Pedersen

You can solve that by using term facets.

The problem is that even though your search result is only 50 documents, when you calculate a facet on a field it is calculated on every unique term of that field across the entire index. So if you have, e.g., 5000 unique terms in one field and 5000 unique terms in another field and you create facets for both, you get 5000 × 5000 = 25,000,000 calculations. Now add 2 other facets with 20 unique terms each and you get 25,000,000 × 20 × 20 = 10,000,000,000 calculations. See the problem?
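
To put numbers on that arithmetic, here is a small Python sketch. It assumes, as the post describes, that the work grows as the product of the unique-term counts of all faceted fields. The function name is invented for illustration and is not part of Lucene.Net or Dynamicweb.

```python
from math import prod


def facet_combinations(unique_terms_per_field):
    """Multiply the unique-term counts of all faceted fields.

    Assumption: the number of calculations is the product of the
    counts, as stated in the post above.
    """
    return prod(unique_terms_per_field)


# Two fields with 5000 unique terms each
two_big = facet_combinations([5000, 5000])

# The same two fields plus two small facets with 20 unique terms each
four_facets = facet_combinations([5000, 5000, 20, 20])

print(f"{two_big:,}")      # 25,000,000
print(f"{four_facets:,}")  # 10,000,000,000
```

Note that the size of the search result (50 in the example) does not appear anywhere in the calculation; only the term counts of the whole index do.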

And this is the nature of facets in the facets engine. That is how it works and we cannot change that.

But then you have term facets: they return the unique set of terms for a given field within the search result, limit the list, and do not make any calculations. The only thing you cannot get is how many of the 50 results you have for each option...
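
As a rough illustration of that behaviour (not the actual Dynamicweb or Lucene.Net implementation), the Python sketch below collects the distinct values of one field from a result set and caps the list. The data, the function name and the `limit` parameter are all hypothetical.

```python
def term_facet(results, field, limit=10):
    """Return the distinct values of `field` in `results`.

    Values are kept in first-seen order and the list is capped at
    `limit`. No per-option counts are produced.
    """
    seen = []
    for doc in results:
        if len(seen) >= limit:
            break
        value = doc[field]
        if value not in seen:
            seen.append(value)
    return seen


# Hypothetical search result for a single rider
results = [
    {"RiderId": "777", "HorseId": "h1"},
    {"RiderId": "777", "HorseId": "h2"},
    {"RiderId": "777", "HorseId": "h1"},
    {"RiderId": "777", "HorseId": "h3"},
]

horses = term_facet(results, "HorseId")
print(horses)  # ['h1', 'h2', 'h3']
```

The output is only the list of options; how many results belong to each horse is not part of it.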

BR Nicolai

 
Martin Jensen

OK, so I've tried using term facets. I ran into some problems though.

I created a simple query where the string parameter "rId" should "Equal" the string field in my index called RiderId.

I then created a term facet for this field in the XML to get around the "more than 2048 terms" error.

I use the facet in a QueryPublisher template that I called qptemp.

I send parameters to the template like this: /qptemp?rId=1234

I get this error:

System.NullReferenceException: Object reference not set to an instance of an object.
   at Lucene.Net.Search.BooleanQuery.Rewrite(IndexReader reader) in d:\Lucene.Net\FullRepo\trunk\src\core\Search\BooleanQuery.cs:line 482
   at Lucene.Net.Search.IndexSearcher.Rewrite(Query original) in d:\Lucene.Net\FullRepo\trunk\src\core\Search\IndexSearcher.cs:line 302
   at Lucene.Net.Search.Query.Weight(Searcher searcher) in d:\Lucene.Net\FullRepo\trunk\src\core\Search\Query.cs:line 96
   at Lucene.Net.Search.Searcher.Search(Query query, Filter filter, Int32 n, Sort sort) in d:\Lucene.Net\FullRepo\trunk\src\core\Search\Searcher.cs:line 57
   at Dynamicweb.Indexing.Lucene.LuceneIndexProvider.FillFacetFieldTerms(Facet facet, FieldDefinitionBase field, IndexReader reader, Query searchQuery, QuerySettings settings, FacetGroupResult facetsResult, IList`1 exceptions)
   at Dynamicweb.Indexing.Lucene.LuceneIndexProvider.DoFacetSearch(IFacetGroup facets, FieldDefinitionBase[] fields, IndexReader reader, IQuery query, QuerySettings settings, Query originalQuery, IList`1 exceptions)
   at Dynamicweb.Indexing.Lucene.LuceneIndexProvider.SearchInternal(IQuery query, QuerySettings settings)
   at Dynamicweb.Modules.QueryPublisher.Frontend.GetContent()


If I instead use another facet created for horses, then it works when I query for a rider (it shows the horses related to the rider). If I query for a horse, it breaks again. So it seems I can't query on a field that has a term facet on the page?

---------------------------------------------
In other words, this works:

        /qptemp?rId=1234
        
        &

        <?xml version="1.0" encoding="utf-8"?>
        <Query>
          <Settings />
          <Source Repository="MainRepo" Item="Results.index" Type="Dynamicweb.Indexing.Queries.IndexQueryProvider, Dynamicweb.Indexing" />
          <Parameters>
            <Parameter Name="rId" Type="System.String" DefaultValue="" />
            <Parameter Name="hId" Type="System.String" DefaultValue="" />
          </Parameters>
          <Expressions>
            <BinaryExpression Operator="Equal">
              <Left>
                <FieldExpression Field="RiderId" />
              </Left>
              <Right>
                <ParameterExpression Name="rId" />
              </Right>
            </BinaryExpression>
            <BinaryExpression Operator="Equal">
              <Left>
                <FieldExpression Field="HorseId" />
              </Left>
              <Right>
                <ParameterExpression Name="hId" />
              </Right>
            </BinaryExpression>
          </Expressions>
        </Query>

        &

        <?xml version="1.0" encoding="utf-8"?>
        <Facets>
          <Settings />
          <Source Repository="MainRepo" Item="resq.query" />
          <Facet Name="Horse" Type="Term" Field="HorseId" QueryParameter="hId" />
        </Facets>

        
-------------------------------------------------------------------------------------------------
And this does NOT:
        
        /qptemp?rId=1234
        
        & 
        <same query xml>
        &
        
        <?xml version="1.0" encoding="utf-8"?>
        <Facets>
          <Settings />
          <Source Repository="MainRepo" Item="resq.query" />
          <Facet Name="Rider" Type="Term" Field="RiderId" QueryParameter="rId" />
        </Facets>
        
-------------------------------------------------------------------------------------------------
This also works, but it is fairly useless for our use case:

        /qptemp?rId=1234&hId=5555   <-- NB!
        
        & 
        <same query xml>
        &

        <?xml version="1.0" encoding="utf-8"?>
        <Facets>
          <Settings />
          <Source Repository="MainRepo" Item="resq.query" />
          <Facet Name="Horse" Type="Term" Field="HorseId" QueryParameter="hId" />
          <Facet Name="Rider" Type="Term" Field="RiderId" QueryParameter="rId" />
        </Facets>
        
-------------------------------------------------------------------------------------------------

So how can we have two term facets on the same page and only query one of them? (The one that gets queried would only have a single value in it, so it would effectively be hidden.)

//Martin

 
Martin Jensen

We already have a solution for these requirements running at: http://go.rideforbund.dk/resultater/resultat-hest/rytter.aspx?rId=57855

This is implemented using the old index and some custom modules. We are rewriting it to use a Lucene repository with a QueryPublisher instead, and I'm not sure term facets are the solution.

As you can see in the link above, we have searched for a rider and now see the results related to that specific rider. In the facets we see "horse" options, since this rider has results with 3 different horses. If you click on a horse, the list updates and shows results for that specific horse. Our problem is that if we use a term facet for the horse option, then: 1. we don't have counts on the horses, and more importantly 2. if we click on a horse, then when the facets get updated we only see the horse option that we just clicked and not the other 2 horse options (due to the way term facets work).
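
The second problem can be illustrated with a small Python sketch. The documents, horse IDs and helper names are invented for illustration. This is not how Dynamicweb or Lucene.Net implement term facets, only a model of the behaviour described above, where the facet options are taken from whatever the current result set contains.

```python
def search(docs, **filters):
    """Keep the documents whose fields equal every given filter value."""
    return [d for d in docs if all(d[k] == v for k, v in filters.items())]


def distinct(results, field):
    """Sorted distinct values of `field` in the result set (no counts)."""
    return sorted({d[field] for d in results})


# Hypothetical index content: one rider with three horses, plus another rider
docs = [
    {"RiderId": "57855", "HorseId": "h1"},
    {"RiderId": "57855", "HorseId": "h2"},
    {"RiderId": "57855", "HorseId": "h2"},
    {"RiderId": "57855", "HorseId": "h3"},
    {"RiderId": "99999", "HorseId": "h4"},
]

# Prefiltered on the rider only: all three horses are offered
by_rider = search(docs, RiderId="57855")
print(distinct(by_rider, "HorseId"))  # ['h1', 'h2', 'h3']

# After clicking horse h2 the result set is narrowed further,
# so the horse facet collapses to the clicked option
by_rider_and_horse = search(docs, RiderId="57855", HorseId="h2")
print(distinct(by_rider_and_horse, "HorseId"))  # ['h2']
```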

How could we solve this in a performant manner? Both rider and horse have a lot more than 2048 distinct values. Our goal is to do this without too much custom implementation (if possible).

thanks

/Martin

 
Nicolai Pedersen

Hi Martin

I don't know. The best I can do is to give you the SimpleFacetedSearch.MAX_FACETS = int.MaxValue; setting, but with no guarantees for performance.

With the Lucene and faceted search implementation we have available, I cannot see a solution that will perform well in this case. It is just not the right tool for the job, like hammering in nails with a screwdriver. It is not that I do not want you to have this feature; I just cannot build it decently with what we have available.

It might be better to do this with SQL, I cannot tell - that will be your choice.

BR Nicolai

 

 
