Developer forum

"Did you mean" spell checker for free text searches?

Anders Ebdrup

Dear Dynamicweb,

 

I'm excited to see that there is a note in the 9.8 release about "Did you mean?"! How can we use that in the platform? And how does it calculate the result?

 

Best regards, Anders


Replies

 
Nicolai Pedersen
This post has been marked as an answer

Hi Anders

It runs the Lucene SpellChecker against a field in the index. From the terms available in that field across all documents, it finds the closest matches using several string distance measures. See the code below.

This is an example of how to configure it. It uses a summary field, but any text-based field would work.

  1. Set up a free text field of type summary in your index (Dump#1)
  2. Set up a parameter and a query expression to search the summary (Dump#2)
  3. Set up a paragraph to use the query (Dump#3)
  4. Set up the app to do spell checking on the summary field using the input from the query variable (Dump#4)
  5. Update your product list template and your no result template (mine is the same template) to show the suggestions (Dump#5)
  6. Search for a misspelled term (Dump#6)
  7. Click the suggestion or one of the alternatives (Dump#7)

In your product list template, add the spell checking result along these lines. You might not want the list of alternatives, but it is included here to illustrate.

<div class="row">
    <div class="col-md-12">
        @{
            string pageid = GetString("Ecom:ProductList:Page.ID");
            string firstSuggestion = GetString("QueryResult.SpellCheck");
        }

        @if (!string.IsNullOrWhiteSpace(firstSuggestion))
        {
            <strong>Did you mean: </strong>  <a href="Default.aspx?ID=@pageid&q=@firstSuggestion"><i>@firstSuggestion</i></a><br />
            foreach (var suggestion in GetLoop("SpellCheckerSuggestions"))
            {
                // Each alternative links to a search for its own term.
                var suggestionTerm = suggestion.GetString("Suggestion");
                <a href="Default.aspx?ID=@pageid&q=@suggestionTerm" style="margin-right:5px;color: #AAAAAA !important; font-size: 12px !important">@suggestionTerm</a>
            }
        }
    </div>
</div>
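
If a suggested term can contain spaces or characters such as & or #, placing it directly in the query string could break the link. Below is a minimal sketch of a helper that percent-encodes the values first. The class name SuggestionLinks and the method Build are hypothetical and not part of Dynamicweb; only Uri.EscapeDataString comes from .NET.

```csharp
using System;

public static class SuggestionLinks
{
    // Hypothetical helper (not a Dynamicweb API): builds the search link for one suggested term.
    public static string Build(string pageId, string term)
    {
        // Uri.EscapeDataString percent-encodes spaces, '&', '#' and other reserved characters.
        return "Default.aspx?ID=" + Uri.EscapeDataString(pageId)
             + "&q=" + Uri.EscapeDataString(term);
    }
}
```

In the template, the href could then be produced with something like @SuggestionLinks.Build(pageid, suggestionTerm), assuming the helper is compiled somewhere the template can reach.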

This is how it is implemented technically:

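using System.Collections.Generic;
using System.Linq;
using Lucene.Net.Index;
using Lucene.Net.Store;
using SpellChecker.Net.Search.Spell;

public class LuceneSpellChecker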
    {
        private readonly SpellChecker.Net.Search.Spell.SpellChecker checker;
        private readonly IndexReader indexReader;
        private readonly string indexField;
        private readonly int numberOfSuggestions;
        private bool isIndexed;

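        /// <summary>
        /// Constructs a new spell checker instance.
        /// </summary>
        /// <param name="reader">Reader for the index whose terms serve as the dictionary.</param>
        /// <param name="field">Name of the index field to take terms from.</param>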
        public LuceneSpellChecker(IndexReader reader, string field)
        {
            indexReader = reader;
            indexField = field;
            checker = new SpellChecker.Net.Search.Spell.SpellChecker(new RAMDirectory(), new JaroWinklerDistance());
            numberOfSuggestions = Configuration.SystemConfiguration.Instance.GetInt32("/GlobalSettings/System/Repository/LuceneSpellChecker/NumberOfSuggestions");
            if (numberOfSuggestions <= 0)
                numberOfSuggestions = 10;
        }

        private void EnsureIndexed()
        {
            if (!isIndexed)
            {
                checker.IndexDictionary(new LuceneDictionary(indexReader, indexField));
                isIndexed = true;
            }
        }

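        /// <summary>
        /// Suggests similar words for a term that does not occur in the index field.
        /// </summary>
        /// <param name="word">The search term to check.</param>
        /// <returns>Suggested terms, best match first; empty if the word already exists in the field.</returns>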
        public IEnumerable<string> SuggestSimilar(string word)
        {
            EnsureIndexed();

            // A word that already occurs in the field needs no suggestions.
            var existing = indexReader.DocFreq(new Term(indexField, word));
            if (existing > 0)
                return Enumerable.Empty<string>();

            var suggestions = checker.SuggestSimilar(word, numberOfSuggestions, null, indexField, true);
            var jaro = new JaroWinklerDistance();
            var leven = new LevenshteinDistance();
            var ngram = new NGramDistance();

            var metrics = suggestions.Select(s => new
            {
                word = s,
                freq = indexReader.DocFreq(new Term(indexField, s)),
                jaro = jaro.GetDistance(word, s),
                leven = leven.GetDistance(word, s),
                ngram = ngram.GetDistance(word, s)
            })
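            // Rank by the average of the document frequency (scaled by 1/100) and the three similarity scores.
            .OrderByDescending(metric =>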
                (
                    (metric.freq / 100f) +
                    metric.jaro +
                    metric.leven +
                    metric.ngram
                )
                / 4f
            )
             .ToList();

            return metrics.Select(m => m.word);
        }
    }
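
To make the ordering key in SuggestSimilar concrete, here is a small standalone sketch with no Lucene dependency. The class SuggestionRanking and its methods are hypothetical names used only for illustration. LevenshteinSimilarity assumes that the Lucene distance classes return a similarity between 0 and 1, where 1 means identical, which is how I understand the StringDistance contract. Score mirrors the expression in the OrderByDescending call above.

```csharp
using System;

public static class SuggestionRanking
{
    // Edit distance normalized to a 0..1 similarity (1 means identical).
    // Assumption: this approximates what a Levenshtein-based StringDistance returns.
    public static float LevenshteinSimilarity(string a, string b)
    {
        if (a.Length == 0 && b.Length == 0) return 1f;

        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            var swap = previous; previous = current; current = swap;
        }

        int distance = previous[b.Length];
        return 1f - (float)distance / Math.Max(a.Length, b.Length);
    }

    // Same shape as the ordering key: average of scaled frequency and three similarities.
    public static float Score(int docFreq, float jaro, float leven, float ngram)
    {
        return (docFreq / 100f + jaro + leven + ngram) / 4f;
    }
}
```

Because the frequency term is not capped, a very common term can outrank a closer but rare one: Score(200, 0.5f, 0.5f, 0.5f) is higher than Score(1, 0.9f, 0.9f, 0.9f).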

Attached screenshots: 01DidYouMeanIndexConfig.JPG, 02DidYouMeanQueryConfig.JPG, 03DidYouMeanCatalogConfig1.JPG, 04DidYouMeanCatalogConfig2.JPG, 05DidYouMeanTemplateImplementation.JPG, 06DidYouMeanFrontend.JPG, 07DidYouMeanFrontendResult.JPG
 
Anders Ebdrup

SWEET!
