Tutorial 5: Extending Indexing
The Dynamicweb indexing engine – sometimes referred to as New Indexing – is a powerful and fast generalized search framework based on Lucene 3.0.3.
Broadly speaking, New Indexing consists of the following elements:
- Indexes – which are data structures optimized for data retrieval
- Queries – which are requests for data limited by criteria you define
- Facets – which are used to create filters in frontend by passing parameters to a query
- Tasks – which are used to rebuild indexes at an interval
All of these elements can be heavily configured to suit your particular scenario – and all exist within a so-called repository, which is simply a folder in the file archive containing configuration files.
New Indexing may be extended in a number of ways:
- Indexes may be extended by extending an existing IndexBuilder, creating a custom IndexBuilder, or creating a custom Schema Extender
- Queries may be extended by creating custom macros, code providers, and value mappers
Extending Indexes
An index is a data structure optimized for data retrieval operations – which means that querying it is much faster than querying a database directly. It consists of the following elements:
- An instance is a physical data structure which can be queried for data
- A build configuration is a set of instructions to an IndexBuilder for retrieving data from a data source and building an instance.
- Field mappings are a set of instructions for which data from the source should be added to which fields in the instance, and how the data should be analyzed and stored. A schema extender is a predefined set of field mappings for e.g. a product index or a content index.
Dynamicweb ships with four standard builders: a ProductIndexBuilder, a ContentIndexBuilder, an SQLIndexBuilder, and a UserIndexBuilder.
Likewise, Dynamicweb ships with three schema extenders: a ContentIndexSchemaExtender, a ProductIndexSchemaExtender, and a UserIndexSchemaExtender.
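To make the idea of "a data structure optimized for retrieval" concrete, here is a minimal sketch in Python of a toy inverted index. It is a conceptual model only, not Dynamicweb or Lucene code; the rows, the function names `build_instance` and `lookup`, and the whitespace tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical source rows, standing in for records read from a data source.
ROWS = [
    {"id": "P1", "name": "Red mountain bike"},
    {"id": "P2", "name": "Blue city bike"},
    {"id": "P3", "name": "Red helmet"},
]


def build_instance(rows, field):
    """Build a toy inverted index: term -> set of row ids."""
    instance = defaultdict(set)
    for row in rows:
        # Naive analysis: lowercase and split on whitespace.
        for term in row[field].lower().split():
            instance[term].add(row["id"])
    return dict(instance)


def lookup(instance, term):
    """Answer a term query with one dictionary lookup instead of a scan."""
    return sorted(instance.get(term.lower(), set()))


instance = build_instance(ROWS, "name")
print(lookup(instance, "red"))   # ['P1', 'P3']
print(lookup(instance, "bike"))  # ['P1', 'P2']
```

The work of reading and tokenizing the source happens once, at build time, which is why a query against the instance can be cheaper than scanning the source for every request.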
Both the default builders and the schema extenders are described in the Indexing & Search documentation article.
Of course, you can also extend the default functionality by either:
- Extending an existing IndexBuilder
- Creating a custom IndexBuilder
- Creating a custom SchemaExtender
Extending an existing IndexBuilder
Most of the default IndexBuilders – the Product-, Content- and UserIndexBuilders – support IndexBuilderExtenders, which extend the build process with data from e.g. a remote source.
The process is:
- Make sure the SkipExtenders setting on the IndexBuilder is set to False
- Manually add a field with a custom source to the index field mappings
- Write some awesome code which will populate the custom field with data
Activate IndexBuilderExtenders
Before proceeding, you must make sure that the IndexBuilder setting SkipExtenders is set to False.
- Go to Settings > Repositories > Your Index
- Open the build definition
- Verify that SkipExtenders is set to False (Figure 4.1)
It should be set to False by default – but it never hurts to check. Next you must create a place in the index to store the remote data.
Creating a data destination
In order to use data from a remote source, you must have a place to put it – an index field with a custom source.
To add an index field with a custom source to the index:
- Click Add field
- Select a Field or Grouping type field and enter a name and a system name
- Click the green plus icon and enter a custom source name, then select the custom source using the dropdown
- Select a data type matching the source data, then check the stored and indexed checkboxes (Figure 5.1)
- Click OK
- Save
This leaves only one thing – adding data to the field.
Coding the IndexBuilderExtender
The final step is to write the code which will populate your custom field with values during the indexing process.
Write a class that implements the extender interface for the index builder you want to extend. For the ProductIndexBuilder that is IIndexBuilderExtender<ProductIndexBuilder>; for the other IndexBuilders it is IIndexBuilderExtender<[YourIndexBuilder]>.
Here is sample code that populates the custom field with a string value:
To compile the code you will need to include a NuGet reference to Dynamicweb.Indexing and Dynamicweb.Ecommerce.
Once compiled and uploaded to the bin folder, you can build your index and verify that the new field has been assigned the value from the IndexBuilderExtender.
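The real extender is a .NET class, but the mechanism can be modeled in a few lines. The Python sketch below is a conceptual illustration only: `CustomFieldExtender`, `extend_document`, `build_documents`, and the field name `CustomField` are hypothetical names, not the Dynamicweb API. It shows a build loop that hands each document to the registered extenders unless extenders are skipped.

```python
class CustomFieldExtender:
    """Hypothetical extender: adds one custom field to each document."""

    def extend_document(self, document):
        # Only add the field if the builder has not already set it.
        if "CustomField" not in document:
            document["CustomField"] = "some value"


def build_documents(rows, extenders, skip_extenders=False):
    """Toy build loop that mirrors the role of the SkipExtenders setting."""
    documents = []
    for row in rows:
        document = dict(row)  # copy so the source row stays untouched
        if not skip_extenders:
            for extender in extenders:
                extender.extend_document(document)
        documents.append(document)
    return documents


rows = [{"ProductID": "P1"}]
docs = build_documents(rows, [CustomFieldExtender()])
skipped = build_documents(rows, [CustomFieldExtender()], skip_extenders=True)
print(docs)     # [{'ProductID': 'P1', 'CustomField': 'some value'}]
print(skipped)  # [{'ProductID': 'P1'}]
```

The second call illustrates why the SkipExtenders check matters: with extenders skipped, the custom field is never populated, even though the extender is present.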
Creating a custom IndexBuilder
If the default IndexBuilders supplied by us are inadequate for your project, you can of course create a custom IndexBuilder from scratch.
To do so you must inherit from the IndexBuilderBase class in the Dynamicweb.Indexing package.
In the code sample below a custom FileIndexer is built. It extracts the content of PDF files and makes the textual content searchable.
Notice that the open-source library iTextSharp has been used for parsing and reading the PDF documents. The easiest way to add the assembly is through NuGet: search for itextsharp on nuget.org. At the time of writing, the version is 5.5.13.
Recap: the example above creates a custom FileIndexBuilder that extracts the text content of PDF files and builds an index with the content, making PDF content searchable.
For adding references to your VS project, these are the ones used for this task:
And a brief rundown:
- In the SupportedActions property you can define the builder actions the builder can handle, e.g. Full or Update.
- In the DefaultSettings property you can define builder settings with default values that your builder supports – the user will be able to change them in GUI.
- In the GetFields() method you can define the list of fields that you want saved in the index. Usually it uses an instance of a schema extender class that returns the list of fields.
- In the Build() method you need to handle the actions and build your index data. In this example, the builder processes the files in the start folder and saves them to the index.
Following the example, you can write your own index builder and index any other data you need. Once your custom IndexBuilder has been built and uploaded to the bin folder for the solution, it will be available alongside all the standard IndexBuilders when creating a new build, with the actions and settings you created.
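The four members in the rundown above can be modeled as follows. This is a hedged sketch in Python, not the IndexBuilderBase API: the class `TextFileIndexBuilder`, its setting names, and its field names are assumptions, and it reads plain `.txt` files rather than PDFs so that it runs without third-party libraries.

```python
import tempfile
from pathlib import Path


class TextFileIndexBuilder:
    """Hypothetical builder that indexes .txt files under a start folder."""

    # Actions the builder can handle (both are treated alike in this sketch).
    supported_actions = ("Full", "Update")
    # Settings with default values that a user could override.
    default_settings = {"StartFolder": ".", "Recursive": True}

    def __init__(self, settings=None):
        # User-supplied settings override the defaults.
        self.settings = {**self.default_settings, **(settings or {})}

    def get_fields(self):
        # Field names the builder writes into every document.
        return ["FileName", "FileContent"]

    def build(self, action):
        if action not in self.supported_actions:
            raise ValueError(f"Unsupported action: {action}")
        root = Path(self.settings["StartFolder"])
        pattern = "**/*.txt" if self.settings["Recursive"] else "*.txt"
        documents = []
        for path in sorted(root.glob(pattern)):
            documents.append({
                "FileName": path.name,
                "FileContent": path.read_text(encoding="utf-8"),
            })
        return documents


with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.txt").write_text("hello index", encoding="utf-8")
    Path(folder, "b.pdf").write_text("ignored", encoding="utf-8")
    builder = TextFileIndexBuilder({"StartFolder": folder})
    built = builder.build("Full")
    print(built)  # [{'FileName': 'a.txt', 'FileContent': 'hello index'}]
```

Keeping the supported actions and default settings as data on the class is what would let a UI list them without running a build.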
Please note that a builder retrieves data from a source and also handles the build process – but depends on field mappings to know where in the index the data should be placed. This can be done by manually adding fields to the index definition – or through a schema extender, which is a predefined set of field mappings and storage instructions tailored to a particular IndexBuilder.
If you want to know how index settings are stored, go to the /Files/System/Repositories/[Your repository name] folder and look at the *.index XML files, which hold the settings.
Custom Schema Extenders
You may want to define your own schema extender to allow you to specify a default list of field mappings for a particular IndexBuilder.
The field object contains the following basic properties:
- Name – field name that will be shown in UI
- System Name and Source – name that will be stored in the index configuration/settings and in the index column name
- TypeName – the .NET type name, for example: “System.String”, “System.Int32”, “System.Int64” or “System.String[]” if you want to store an array of values in one field.
The following storage instructions can also be enabled:
- Analyzed – the field is run through an analyzer, and tokens emitted are indexed. This only makes sense as long as the field is also indexed.
- Indexed – the field is made searchable, and stored as a single value. This is appropriate for keyword or single-word fields, and for multi-word fields you want to retrieve and display as single values (e.g. for facets).
- Stored – the field has its value stored as-is in the index. This does not affect indexing or searching – it simply controls whether the index acts as a data store for the value. Since most of your data will be in the Dynamicweb database, you usually don’t need to store a field in the index.
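How the three storage instructions interact is easier to see in a small model. The following Python sketch is an illustration under simplifying assumptions: the function `add_to_index`, the dictionary layout, and the lowercase whitespace tokenizer are invented for this example, and a real analyzer does considerably more.

```python
def add_to_index(index, doc_id, field, value, *, indexed, analyzed, stored):
    """Apply the three storage flags to one field value in a toy index."""
    if indexed:
        # Analyzed fields are split into lowercase tokens; others stay whole.
        terms = value.lower().split() if analyzed else [value]
        for term in terms:
            index["terms"].setdefault((field, term), set()).add(doc_id)
    if stored:
        # Stored values can be read back from the index unchanged.
        index["stored"].setdefault(doc_id, {})[field] = value


index = {"terms": {}, "stored": {}}
add_to_index(index, "P1", "Name", "Red Bike",
             indexed=True, analyzed=True, stored=False)
add_to_index(index, "P1", "Brand", "Acme Cycles",
             indexed=True, analyzed=False, stored=True)

print(("Name", "bike") in index["terms"])          # True
print(("Brand", "Acme Cycles") in index["terms"])  # True
print(("Brand", "acme") in index["terms"])         # False
print(index["stored"])                             # {'P1': {'Brand': 'Acme Cycles'}}
```

In this model the analyzed Name field can be found by any of its words but cannot be read back, while the non-analyzed Brand field matches only as a whole value, which is the behavior you want for facets.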
To create your own Schema extender you need to implement the IIndexSchemaExtender interface – this example returns two fields, matching the example of the custom IndexBuilder from the previous section:
After compiling this code together with the custom index builder class, you will be able to select the schema extender in the default manner (when adding field mappings to an index definition). Save the index to see the list of schema extender fields in the Schema Extender Fields section.
Please note that you need to (re)build the index before you will be able to query the index fields.
For debugging purposes you may find the Query publisher app useful, or any other external tool that can open your index files, e.g. the Luke All tool (provided that you are using the default LuceneIndexProvider). The index files are usually located in the /Files/System/Indexes/[Your repository name]/[Your index name]/[Your index instance name] folder.
Extending Queries
A query is simply a request for data submitted to an index – e.g. ‘Return all active products with a price below 100’.
Queries are created by stringing together a number of expressions (Figure 9.1) which limit the data you receive from a query (an empty query returns all data in the index).
An expression is not particularly complicated – it consists of a field in the index, an operator, and a test value.
Test values can be either constants, parameter values, macros, term values, or dynamic values from a code provider.
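As a rough model of how expressions limit a result set, the Python sketch below evaluates a list of (field, operator, test value) triples against in-memory documents. The operator names, the sample products, and the function `run_query` are assumptions for illustration; they do not reflect how the query engine is implemented.

```python
import operator

# Hypothetical operator names mapped to Python comparison functions.
OPERATORS = {
    "Equal": operator.eq,
    "LessThan": operator.lt,
    "GreaterThan": operator.gt,
}


def run_query(documents, expressions):
    """Return documents matching every (field, operator, value) expression."""
    def matches(document):
        return all(
            field in document and OPERATORS[op](document[field], value)
            for field, op, value in expressions
        )
    return [document for document in documents if matches(document)]


PRODUCTS = [
    {"ID": "P1", "Active": True, "Price": 80},
    {"ID": "P2", "Active": True, "Price": 150},
    {"ID": "P3", "Active": False, "Price": 40},
]

# 'Return all active products with a price below 100'
cheap_active = run_query(PRODUCTS, [("Active", "Equal", True),
                                    ("Price", "LessThan", 100)])
print([p["ID"] for p in cheap_active])  # ['P1']
print(len(run_query(PRODUCTS, [])))     # 3
```

The last line shows the point made above: with no expressions, nothing is filtered out and the query returns everything.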
You can extend the default functionality by:
- Creating custom macros
- Creating custom code providers
- Creating custom value mappers
Macros
A macro is a dynamic value retrieved from the context, e.g. a PageID, WebsiteID, UserID, CartID, etc.
To create a custom macro, you need to inherit from the base class Dynamicweb.Extensibility.Macros.Macro in the Dynamicweb.Extensibility NuGet package.
After that you will need to implement the following abstract members:
- Name – this is your macro name that will appear in the macro list when selecting a test value
- SupportedActions – the list of actions your macro supports
- object Evaluate(string action) – method to handle actions and output the results of them.
See this sample custom Favorites Macro:
This macro returns the IDs of the current user's favorite products.
If you compile the sample code and add the library to the bin folder you will be able to select the custom macro as a test value in an expression (Figure 10.2).
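The shape of a macro (a name, a list of supported actions, and an evaluate method that resolves an action against the current context) can be sketched in Python as follows. This models the concept only: `FavoritesMacro`, the `FAVORITES` lookup table, the action name `ProductIds`, and the `context` dictionary are hypothetical and are not the Dynamicweb.Extensibility API.

```python
# Hypothetical favorites store: user id -> favorite product ids.
FAVORITES = {"user-1": ["P1", "P7"], "user-2": []}


class FavoritesMacro:
    """Toy macro that resolves a value from the request context."""

    name = "Favorites"
    supported_actions = ("ProductIds",)

    def __init__(self, context):
        # The context stands in for the current page, user, cart and so on.
        self.context = context

    def evaluate(self, action):
        if action == "ProductIds":
            user_id = self.context.get("user_id")
            return list(FAVORITES.get(user_id, []))
        # Unknown actions yield no value rather than raising.
        return None


macro = FavoritesMacro({"user_id": "user-1"})
print(macro.evaluate("ProductIds"))  # ['P1', 'P7']
print(macro.evaluate("Unknown"))     # None
```

Because the value is resolved at evaluation time, the same expression yields different test values for different users.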
Code providers
Like macros, a code provider allows you to dynamically construct a test value for an expression. However, a code provider allows you to define extra parameters and use them to calculate the test value whenever a query is being executed.
To create a custom code provider you need to create a class that inherits from the Dynamicweb.Extensibility.CodeProviderBase class. Here is sample code from the already implemented “DateTime” code provider:
This code provider evaluates the expression based on the selected parameters Interval and Number (Figure 11.2).
In the BuildCodeString method, the expression evaluation is carried out. In this case, it returns a DateTime.Now.AddHours(2) DateTime object.
In the BuildDisplayValue method, the display value for the UI is calculated – in this case, the display value is Today + 2 hours (Figure 11.3).
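The two responsibilities described above, computing the test value and producing a label for the UI, can be modeled with a short Python sketch. It is not the DateTime code provider itself: the class `RelativeDateProvider`, its method names, and the supported interval names are assumptions chosen to mirror the Interval and Number parameters.

```python
from datetime import datetime, timedelta

# Hypothetical interval names mapped to timedelta keyword arguments.
INTERVALS = {"Minutes": "minutes", "Hours": "hours", "Days": "days"}


class RelativeDateProvider:
    """Toy code provider: a date relative to now, set by two parameters."""

    def __init__(self, interval, number):
        if interval not in INTERVALS:
            raise ValueError(f"Unknown interval: {interval}")
        self.interval = interval
        self.number = number

    def evaluate(self, now=None):
        # Computed each time the query runs; 'now' is injectable for testing.
        now = now or datetime.now()
        return now + timedelta(**{INTERVALS[self.interval]: self.number})

    def display_value(self):
        # Human-readable label for the UI.
        sign = "+" if self.number >= 0 else "-"
        return f"Today {sign} {abs(self.number)} {self.interval.lower()}"


provider = RelativeDateProvider("Hours", 2)
print(provider.evaluate(datetime(2024, 1, 1, 10, 0)))  # 2024-01-01 12:00:00
print(provider.display_value())                        # Today + 2 hours
```

Separating the evaluated value from the display value keeps the stored query stable: the label stays the same, while the date is recalculated on each execution.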
Value mapper
A value mapper allows you to map a list of terms from your already built index to the appropriate object.
For example, the DateTimeValueMapper converts the index terms to a date time object, and the GroupIDsValueMapper looks up the available Ecommerce product groups by term values and returns a list of group ID and group name pairs. The LanguageIDValueMapper, the ManufacturerIDValueMapper, and the VariantGroupValueMapper work in the same way.
A value mapper must inherit from the abstract ValueMapperBase class – here is an example of a custom ProductType value mapper:
This mapper converts the term values to the “Dynamicweb.Ecommerce.Products.ProductType” enum.
In the AddInGroup attribute you need to define the type name of the index builder which the mapper will be used for.
In the AddInName attribute you must specify the column names for which this mapper will be applied.
After building this mapper and uploading the library to the bin folder, you will be able to view the mapper results in the UI when using the term selector on an expression querying the product Type column (Figure 12.2).
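Conceptually, a value mapper is a function from raw index terms to something a person can read. The Python sketch below models that idea; the `ProductType` members and their numeric values are assumptions for illustration and should not be taken as the actual Dynamicweb.Ecommerce.Products.ProductType definition, and `map_terms` is a hypothetical helper.

```python
from enum import Enum


class ProductType(Enum):
    # Hypothetical members and values, for illustration only.
    STOCK = 0
    SERVICE = 1
    BOM = 2


def map_terms(terms):
    """Map raw index terms to (term, label) pairs for display."""
    mapped = []
    for term in terms:
        try:
            label = ProductType(int(term)).name
        except ValueError:
            # Non-numeric or unknown terms keep their raw value as the label.
            label = term
        mapped.append((term, label))
    return mapped


print(map_terms(["0", "2", "9"]))
# [('0', 'STOCK'), ('2', 'BOM'), ('9', '9')]
```

Falling back to the raw term for unknown values keeps the term selector usable even when the index contains values the mapper does not recognize.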
What you’ve learned – and what’s next
In this tutorial, you’ve learned how to:
- Extend an existing IndexBuilder with custom data using an IndexBuilderExtender
- Create an IndexBuilder from scratch
- Create a custom SchemaExtender for automatically mapping incoming data to fields in the index
In the next tutorial you will learn about extending the integration area by creating integration providers, using table script, and creating a custom scheduled task add-in.