Data Integration Module
The Data Integration module is a tool for importing and exporting data to and from a Dynamicweb solution.
Using the module you create integration activities which accomplish a particular task, e.g.:
- Exporting users to XML
- Importing user from CRM
- Exporting orders to CSV
An integration activity consists of:
- A source provider matching the data source - either the Dynamicweb solution (export) or a file (import)
- A destination provider matching the data destination - either the Dynamicweb solution (import) or a file (export)
Integration activities are created and managed from the Data Integration node (Figure 2.1) – which can be accessed by navigating to Settings > Integration > Data Integration.
From the Data Integration area you can:
- Create new activities
- Create new activities from a template
- Open the URL builder
- Delete old logs
- View and edit existing activities – and run them!
The Data Integration module requires an underlying database running on Microsoft SQL Server 2008 or later.
Additionally, your webserver must be able to resolve its own name from DNS; after calling jobrunner.aspx a new thread is created to another .aspx, where the job is actually executed. So if you have http://website.com, the IIS-server should be able to resolve website.com, as it will be calling ‘http://website.com/JobRunnerBackground.aspx?’.
Creating an integration activity
To create a new integration activity, simply click New activity in the toolbar (Figure 3.1).
This will open a wizard which will take you through all the steps involved:
- Selecting the source provider
- Selecting the destination provider
- Selecting tables to import or export data from
- Naming the activity
- Checking the column mappings
- Saving and running the activity
Read more about providers, tables and column mappings below.
“New activity from template”
If you use the Integration Framework, some of the common import/export jobs have already been defined for you in (customizable) templates. To learn more, get acquainted with the Integration Framework.
When you create an integration activity, you need to define what you want to accomplish - and then select the integration providers best suited to that task.
Here's an overview of the built-in providers supplied by Dynamicweb:
Can access the full Dynamicweb database. High performance, but complex. Supports SQL Views.
Import and export data directly to and from a Microsoft SQL server database. Supports SQL Views.
Import and export user and user group data in a more straightforward manner than the Dynamicweb provider.
Import and export data to and from Microsoft Excel.
Import and export data from and to CSV files.
Import and export data from and to XML files.
Import and export ‘incomplete’ data to and from the Dynamicweb database. Able to e.g. auto-assign IDs to imported products.
Import and export items (content).
Import and export order and orderline data.
If you're in doubt about which provider is most suitable, here are a couple tips and tricks to help you get started:
- 9 out of 10 times your import data source will be XML, but occasionally end-users prefer Excel or CSV. XML is also a very popular destination format.
- When importing data to Dynamicweb, it usually makes sense to use the Dynamicweb Provider as the destination provider – but in some cases the specialized providers may be useful:
- The Ecom Provider allows you to import incomplete product data and have the provider auto-assign the missing values. Comes with significant memory overhead.
- The User Provider contains options for e.g. removing inactive users, generating passwords for new users, and notifying new users by email after import.
- The XML Provider lets you use XSL transformation files to modify either your source or destination XML – this can be very useful. The downside is that you have to work with XSLT.
Click the links in the table to read in-depth information about each provider – or read about creating custom providers.
The entire import/export tool is built around the concept of mapping tables to tables, and between those tables, mapping columns to columns. This approach is modeled closely on the approach taken by the Microsoft SQL Server Import and Export Data Tool.
After selecting a source and destination provider you will be asked to select source tables (Figure 5.1) and map them to destination tables.
In this case, I export data to an XML file, and so I cannot select a target table (as the destination file does not exist yet). However, if your destination is a database you will be able to select a target table here. If your destination table has the same name as the source table, it will be automatically selected in the dropdown.
Once you’re happy with your table selection and mapping click finish – you will be prompted to name the activity (Figure 5.2). Do so.
Automatic mapping on run will be explained in the next section.
It is VERY IMPORTANT that a table definition is only defined ONCE per activity. So if your source is a folder with multiple XML files for importing Ecommerce data, the table EcomProducts must only be referenced once. If you need to import data to the same table destination more than once, you should split your import into multiple activities.
After naming the activity, you can create or review the column mappings for each table in your import/export (Figure 6.1).
As with source and destination tables, if the column names in the destination table matches those in the source table the mappings will be created automatically.
From the column mapping overview, you have access to a number of advanced options for handling and modifying your import/export activity.
For instance, you can:
- Remove columns (e.g. AccessUserPassword and other sensitive data) which should not be imported or exported
- Add a conditional filter to a table mapping to e.g. export only products with a stock value below 10
- Customize key settings
- Use scripting to prepend or append something to column data, or replace the source data with a constant
Read more about scripting and conditionals below.
Create mappings at runtime/Perform mapping automatically on run
During the creation of a new activity you have the option (during the naming-step) to check a perform mapping automatically on run checkbox, and from the ribbon bar of the activity details view you have access to a create mappings at runtime button (Figure 6.3).
If either of these are active, Dynamicweb will automatically generate mappings between source and destination columns with the same name every time an activity is run.
This makes it less cumbersome to add custom fields and import data to them – as long as the names match, it will be done automatically.
Running an activity
Activities may be run by clicking the icon in the run-column of the data integration overview (Figure 7.1).
You can also run an activity by clicking the Save & Run button from activity details view (Figure 7.2), either during creation or when viewing or editing the activity details.
Scheduling an activity
In some cases, it makes sense to schedule an activity to be run at a predefined interval. This is done using the standard Scheduled task functionality in Dynamicweb.
To set up a schedule for an activity, right click the activity in the Data Integration area view and click Schedule task (Figure 8.1).
Create a scheduled task by selecting a start date and interval in the standard manner.
Scheduled integration activities are run by accessing an auto-generated activation URL – you can access the URL when editing a scheduled task for an integration activity.
The same type of URL can be generated manually and used to run an activity remotely – read more below.
On scheduled tasks
The Scheduled Task function in Dynamicweb relies on Windows scheduled task, which should be configured on the server by your hosting partner. If you are hosting the solution, make sure you have Windows scheduled task configured. Use Curl.exe to trigger http://yoursite.com/Admin/Public/TaskExecute.aspx every 5 minutes.
Once an activity has been run you will be presented with a log of how everything went (Figure 9.1).
You can access both the latest log and historical logs from the activity details view by using the log button to bring up the log overview (Figure 9.2).
Physically, logs can be found in the /System/Log/DataIntegration folder.
Often, it may be useful to email key people the result of an activity after it’s been run – especially if it fails!
To set up email notifications right-click on the activity in the Data Integration area overview and click Notification setting (Figure 10.1).
This opens the notification settings for the activity (Figure 10.2).
To set up email notifications for the activity:
- Specify a sender name and sender email
- Add the emails of the intended recipients – separate emails by semicolon if more than one
- Select (or edit/create) a notification email template
If you want to limit email notification to only failed activities, check the only on failure checkbox.
Running an activity remotely (URL Builder)
In some cases, it may be necessary to run an import/export job remotely. This happens by calling a job runner via an URL.
The job runner takes two parameters:
- Tasks to run
The Task parameter is a comma-separated list of jobs and the token is a unique checksum to prevent unwanted execution.
To generate URLs for remotely running an activity, use the URL Builder tool which is accessible from the Data Integration area view (Figure 11.1).
Using the URL Builder tool you can:
- Select one or more tasks (activities) to run – move them to the right column
- Click generate URL to generate an URL for running the currently selected activities. The installation checksum (see System Information) will be added to the token parameter.
- Click Create as a scheduled task to do so – this lets you create a single scheduled task which runs more than one activity at the same time
Each individual activity is run sequentially, as a single transaction.
If the ‘Stop on failed jobs’ checkbox has been checked, the activity will stop if an activity is not completed correctly. Already completed activities will not be reversed – i.e. the data from activity 1 is still imported/exported, even if job 2 fails.
You may have noticed the checkbox for enabling synchronous execution in the screenshot above.
Normally, the URL Builder generates an “ordinary” execution based on an asynchronous call that passes the activity to a new thread and proceeds in the background.
However, this means that you will not know the actual success/failure of the activity (without using email notifications or consulting the log).
For automated activities, this is not always good enough – and so you can call the alternate JobRunnerSynchronized.aspx instead, which awaits the response of the actual activity execution.
The up-side is that you will know the real success/failure status of the executed job straight away. The down-side is that you should be very aware of the amount of time you have to wait for the response, in order to avoid a potential HTTP Request timeout.
If executed successfully in browser you'll get a content of "See log file: yourjob[date]-[timestamp].log.".
If job succeeds you get a HTTP 200 in return. If job fails a HTTP 500 is returned.
Editing & modifying an activity
By clicking an existing activity in the Data Integration area, you can open (and edit) the activity details and settings.
For that purpose, you have access to a ribbon bar (1) and table/column mappings area (2) (Figure 12.1)
Using the ribbon bar you can:
- Save and run and activity
- Add table and column mappings, remove inactive mappings – or activate the create mappings at runtime option
- Add a new destination table and add columns to a destination table
- Edit the source provider and destination provider settings
- Edit or configure email notification settings
- Access the most recent log file – or review old log files
From the table & column mappings view, you can:
- Use conditionals to filter rows during import/export
- Use scripting to append or prepend something to an input value, or to replace input with a constant
- Customize key values
Read more below.
You can use values from the context when using conditionals and scripting on an integration activity.
To do so, use these context objects:
@Request() – get value from System.Web.HttpContext.Current.Request
@Session() – get value from System.Web.HttpContext.Current.Session
@Page() – get value from current Page object, if any. (Current page exists if a valid PageID is in the query string).
@User() – get value from the current User object.
@Code() – evaluate C# code
@Paragraph() – get value from current paragraph object, if any. (Current paragraph exists if a valid ParagraphID is in the query string).
@ProductID() – get value from current product object, if any. (Current product exists if a valid ProductID is in the query string).
@Page(MenuText) – Get title of page
@User(Name) – get name of user
@User(Email) – get email of user
@Code(System.DateTime.Now.AddDays(1)) – get tomorrow’s date
@Code(1+2) – get 3 (possible, but stupid, granted)
When mapping tables, you can use conditional to filter which rows should be affected during import/export – e.g. if you want to export all products with a stock value lower than 10.
To set conditionals, click the conditional icon next to a table mapping in your activity overview (Figure 13.1).
Please note that conditionals only work as filters when the column is part of the import.
In addition to comparing simple values, you use @NULL to match on true DbNull (isNull = “true” in the XML) and @NULLOREMPTY to match null and empty strings (e.g. “” or <! [CDATA]> in the XML).
Applying conditions dynamically
In some cases it may be useful to run an activity with dynamic parameters. For instance, you may want to retrieve orders within a particular date interval, a particular order state, or with a specific order total.
To do so add a conditional using the following syntax:
Where ParameterName is the name you want to give the parameter which is passed to URL and “value” is the default value you want to pass on (Figure 13.3).
During column mapping, you can use scripting to append, prepend or set constants when the activity is run.
Click the scripting button to the right of a column mapping and define your script (Figure 14.1).
Simply select append, prepend or set constant as appropriate – and enter the value you want to append/prepend/create.
Set Constant is mainly used if you want to replace all source values with the constant, or (more likely) to create a destination column with no source column available, e.g. if your ERP system has no language context column and you want to import these products into a LANG1 or LANG2 context in Dynamicweb Ecommerce. Use the Apply on create only to only apply the constant to newly created rows.
Keys in data integration are similar to Primary Keys in databases. These are used to determine when a new row should be added, and when an existing row should be updated.
Clicking the "Set keys" icon (Figure 15.1) will pop up a window where you can choose the key settings that should be considered upon import
Default will be the actual key(s) of the column. For instance, the default key settings for table EcomProducts is the composite key ProductID + ProductLanguageID + ProductVariantID. However, in some cases you may want to set the unique identifier to ProductNumber.
Moving activities to other solutions
Data Integration activities are store as XML files on a solution, and can be accessed through the file manager.
To move an activity to a different solution, simply move the activity from /Integration/Jobs on one solution to /Integration/Jobs on the target solution (Figure 16.1).
Please note that all paths to source files (CSV or XML) are static in the activity file and must be manually adjusted if they differ on the destination solution.
Similarly any custom fields / tables mapped in the job file will need to be recreated. However as the database schema is validated on job load you will be informed of any potential errors.
Data Integration notifications contains only one JobFinished notification which is raised after the Data Integration job execution is finished.
The JobFinished notification allows access to information about the job execution, log file and the tables present in the job, and is located in the Dynamicweb.DataIntegration.Integration.Notifications namespace.
The following example shows how to create a JobFinished notification subscriber: