Forever running TaskExecute runner

Scott Forsyth Dynamicweb Employee

Hello,

We're starting to get a lot of sites running into some issues with the scheduled task runner. We're up to 5 sites now impacted by this, which is becoming critical.

The thing that is happening is that the Windows scheduled task, which runs every 5 minutes, is calling TaskExecute.aspx, but that page remains running forever, which prevents the next 5 minute run from running.

This happens if one of the scheduled task jobs takes 5 minutes or more. We've been able to narrow it down to 5 minutes being the magic time. We have a mocking framework with a job that takes exactly 5 minutes, and the issue occurs. If we set the job to anything less than 5 minutes, everything works correctly: the TaskExecute.aspx page finishes running and life continues happily.
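
A mock of this kind can be approximated locally. The following is a minimal Python sketch, not the mocking framework used here: `make_slow_server` and `timed_get` are hypothetical names, and the delay would normally be shortened to a few seconds while experimenting. The handler sends nothing until the simulated job has finished, and only then replies.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_slow_server(delay_seconds, port=0):
    """Return an HTTP server whose GET handler is silent for delay_seconds, then replies."""

    class SlowHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            time.sleep(delay_seconds)  # simulated job: no bytes are sent while it runs
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the console quiet

    # port=0 lets the OS pick a free port; read it from server.server_address[1]
    return ThreadingHTTPServer(("127.0.0.1", port), SlowHandler)


def timed_get(url, timeout=None):
    """Fetch url and return (body, elapsed_seconds)."""
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
    return body, time.monotonic() - started
```

Starting the server with `threading.Thread(target=server.serve_forever, daemon=True).start()` and pointing different clients at it gives a way to compare how each one behaves as the delay approaches 5 minutes.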

We believe that this started with 9.9.x, although we're not certain.

While troubleshooting today, we ran a debug build with a December 2019 version of TaskExecute.aspx.vb but the 9.12 debug version of the platform and the issue still occurs. So, it appears that the issue isn't with TaskExecute.aspx.vb (although at first I was thinking that it would be the cause).

Does anyone on the platform team know of anything at all that may have changed in the last year that introduced a 5 minute timeout somewhere? It would be preventing the aspx page from ending when the work is done if the page runs for longer than 5 minutes.

Thanks!

Scott


Replies

 
Dmitriy Benyuk Dynamicweb Employee

Hi Scott,

If you want the page to stop after a definite timeout, you can set this value in the web.config file:
<httpRuntime executionTimeout="600" />
The default is 600 seconds, which is 10 min. To make it less than 5 min, set it to 300 or less.
Then, if the tasks run for more than 5 min in total during a single TaskExecute page load from the Windows scheduled task, they will be aborted at your timeout.
You will get an error on the task that was executing when the timeout expired, and the error will say:
System.Threading.ThreadAbortException: Thread was being aborted
But that will allow your tasks to run every 5 min as scheduled in the Windows Scheduler.

Maybe you can set up the long-running tasks that can take more than 5 min so that they run at intervals greater than 5 min, e.g. 10 min or 1 hr instead?
Then, after the long-running task finishes, the Windows Scheduler will continue its work and run at the 5 min interval.

Kind regards, Dmitrij


 

 
Scott Forsyth Dynamicweb Employee

Hi Dmitrij,

That's very helpful. You're right that the executionTimeout is exactly 5 minutes, so I experimented with that. I thought it would be it.

However, I found that the issue for this situation is something else, and it's surprising. It turns out that Internet Explorer is the cause. The 5 minute issue occurs just in Internet Explorer, and it doesn't occur in other browsers. The Windows task scheduler runner uses PowerShell's Invoke-WebRequest which happens to use the same IE engine for the execution. 

I haven't tracked down which setting in IE it is or why it recently changed, but I wanted to update you on what we've found so far. For a solution, we'll either need to determine which setting changed, or we'll use another method of calling the site rather than Invoke-WebRequest.

Scott

 

 
Dmitriy Benyuk Dynamicweb Employee

Hi Scott,
I was using a somewhat older approach. Since I have a local solution, I manually created a task with Curl.exe, as described in this post:
https://doc.dynamicweb.com/forum/development/development/scheduled-tasks-on-custom-solution
So maybe this is the difference: by default, the Windows task is now created with powershell.exe running the Invoke-WebRequest command, whereas before it ran the Curl.exe utility to visit the TaskExecute.aspx page.
So maybe you can compare the new and old approaches and find some differences.
Kind regards, Dmitrij
 

 
Scott Forsyth Dynamicweb Employee

Hi Dmitriy,

Good point, the curl.exe method doesn't have that issue. I've looked further into it today. I found that the underlying ReadWriteTimeout property has a 300,000 (5 minute) timeout so it seems like it may be the necessary setting. However, Invoke-WebRequest doesn't let you change that. You can change TimeoutSec but that's a hard timeout and doesn't help with keeping the request alive until the page completes.

As an aside, -UseBasicParsing is supposed to stop it from using Internet Explorer (PowerShell 6.0 and greater makes that the default). However, you already set that and it doesn't help in this situation: the 5 minute issue still occurs. It also occurs on PowerShell 5.x and 7.x.

Here's a bit more info: https://stackoverflow.com/questions/7250983/httpwebrequests-timeout-and-readwritetimeout-what-do-these-mean-for-the-unde
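
The difference between an idle (read/write) timeout and an overall deadline can be illustrated outside .NET. This is a minimal Python sketch offered as an analogy only: it does not reproduce HttpWebRequest or Invoke-WebRequest behaviour, and the helper names are made up. A socket timeout applies to each blocking read, so a server that stays silent for longer than that window trips it even though it would eventually have answered.

```python
import socket
import threading
import time


def open_listener():
    """Open a listening TCP socket on a free local port."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    return listener


def silent_then_reply(listener, silence_seconds):
    """Accept one connection, send nothing for silence_seconds, then reply."""
    conn, _ = listener.accept()
    with conn:
        time.sleep(silence_seconds)
        try:
            conn.sendall(b"done")
        except OSError:
            pass  # the client may already have given up and closed


def read_with_idle_timeout(port, idle_timeout):
    """Return the reply, or None if no bytes arrive within idle_timeout seconds."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.settimeout(idle_timeout)  # limit for each blocking recv, not the whole exchange
        try:
            return client.recv(1024)
        except socket.timeout:
            return None
```

With a listener served by `silent_then_reply(listener, 0.5)` in a background thread, `read_with_idle_timeout(port, 0.1)` gives up and returns `None`, while a longer idle timeout returns the reply.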

So, for solutions, what do you think about setting TimeoutSec to 290 seconds (or 250 seconds as a rounder number)? That's just short of 5 minutes, so it will time out before the 5 minute mark. Here's what I'm seeing from my testing:

  • When having a job that runs for 6 minutes, and -TimeoutSec set to 10 seconds (for testing), the calling script will end at 10 seconds (as expected)
  • During the remaining nearly 6 minutes, if we hit the TaskExecute.aspx page, it correctly determines that there is a job running and it will say that there are 0 jobs to run.
  • Even after 5 minutes, but before 6 minutes, it will correctly say that 0 jobs need to run.
  • After 6 minutes, the logs on the website show that the job completed successfully.
  • Running the job after 10 minutes (the next 5 minute interval) will correctly run the job again.
  • This means that:
    • The caller (PowerShell or a browser) doesn't need to run the whole time to keep things alive.
    • It's ok to have the Windows scheduled task re-run every 5 minutes, even if there are jobs still running. It seems to handle it gracefully.
    • The Windows scheduled task doesn't do anything with the response anyway, so ending early with a timeout doesn't seem to have any negative impact.
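
The behaviour observed in these tests can be modelled with a simple guard. This is a toy Python model, not Dynamicweb's implementation; `TaskRunner`, `execute` and `fire_and_forget` are hypothetical names. The job runs to completion regardless of whether the caller waits, and a second call made while it is running reports 0 jobs to run.

```python
import threading
import time


class TaskRunner:
    """Toy stand-in for a task-execute endpoint (an assumption, not the real code)."""

    def __init__(self, job_seconds):
        self.job_seconds = job_seconds
        self._running = threading.Lock()
        self.completed = 0

    def execute(self):
        """Run the job unless one is in progress; return the number of jobs started."""
        if not self._running.acquire(blocking=False):
            return 0  # a job is still running, so there is nothing to start
        try:
            time.sleep(self.job_seconds)  # the long-running job
            self.completed += 1
            return 1
        finally:
            self._running.release()


def fire_and_forget(runner):
    """Start execute() without waiting for it, like a caller that times out early."""
    worker = threading.Thread(target=runner.execute)
    worker.start()
    return worker
```

Calling `fire_and_forget(runner)` and then `runner.execute()` shortly afterwards returns 0; once the first job has finished, the next `execute()` starts a new one.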

What do you think?

Scott

 
Dmitriy Benyuk Dynamicweb Employee

Hi Scott,
I still cannot reproduce it. Here is my flow:
I've created a task with the following parameters:
Trigger: every 5 min
Action: Start a program: powershell.exe with parameters: Invoke-WebRequest -UseBasicParsing -Uri http://dw9.local.dynamicweb.dk/Admin/Public/TaskExecute.aspx -Method GET

I also have a long-running Export data add-in batch task that runs for 5 min 10 sec.
So when the scheduler is triggered (e.g. at 10:00:00), it waits the 5 min 10 sec it takes for the task to finish (10:05:10). The run at 10:05:00 is skipped because the previous run had not completed by then, so the next Windows task run is at 10:10:00.
So it looks like it works the same way as it did with Curl.exe.
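
The timing in this flow is simple arithmetic. Here is a small Python sketch, assuming the scheduler skips any trigger that falls while the previous run is still in progress, as described in this flow; `next_start` is a hypothetical helper and the date is arbitrary.

```python
from datetime import datetime, timedelta


def next_start(first_trigger, run_duration, interval):
    """Return the first trigger time at or after the moment the previous run finished."""
    finished = first_trigger + run_duration
    candidate = first_trigger + interval
    while candidate < finished:
        candidate += interval  # this trigger is skipped: the previous run is still going
    return candidate


start = datetime(2021, 4, 19, 10, 0, 0)  # arbitrary date, trigger at 10:00:00
run = timedelta(minutes=5, seconds=10)   # the 5 min 10 sec task
print(next_start(start, run, timedelta(minutes=5)).time())  # 10:10:00
```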
Is this flow not the same on your server? If not, what is the difference? Do you have an error message from anywhere (Data integration logs, scheduled task logs, Windows Event Viewer logs) from the time when it fails?
I need some more information, since I cannot pin down where the problem is.
Maybe it is a problem with some specific server setting.

Kind regards, Dmitrij



 

 
Scott Forsyth Dynamicweb Employee

Hi Dmitrij,

That's helpful to know that it may be possible to get it working with Invoke-WebRequest, if we can find out the setting differences between you and me.

Here's the endpoint that we're testing. You can use it. It's set to a 5 minute delay, so it should be the same as yours. But maybe ours doesn't give any HTTP response until the end of the time and yours somehow keeps it alive, or something like that.

This is a mocking test endpoint so I can share the info here:

http://mock.mydwsite1.com:8104/DynamicwebService

Secret: RunUntilComplete

Request:

<GetEcomData Qty="10" ><tables><Products type="all" /></tables></GetEcomData>

That will take exactly 5 minutes to reply, which reproduces in our environment. 

The other settings that we have are:

  • The scheduled task settings are exactly the same as yours (created from the website)
  • Windows Server 2016, although I can also reproduce it on a Windows 10 machine
  • If you test with Internet Explorer it should have the same issue
  • The issue is that the request never completes, so it doesn't try again after the 5 / 10 minutes
  • You can test it here too if you would like: http://qa2-drteeth.mydwsite1.com/admin. The scheduled task is: Import Products Mocking Test
  • It is our custom scheduled task that we've been doing most of our testing on, but we tried with a generic scheduled task and that did the same.
  • I did leave web.config back at the default timeout, although that didn't seem to make a difference <httpRuntime executionTimeout="300" />

Hope that helps. Thanks for looking!

Scott

 
Dmitriy Benyuk Dynamicweb Employee

Hi Scott,
I've reproduced it using your service, but it looks like the request to it never ends (it does not finish at 5 min). I ran the query from the Live integration Test connection page with the endpoint you provided; the request ran for more than 5 minutes and then failed with an HTTP web request timeout exception,
because the site's web.config timeout is 10 min.

Then I ran the task from the Windows scheduler and it ran until a timeout at exactly 30 min. Here is the log file:
2021-04-19 22:46:32.916: Starting scheduled task.
2021-04-19 22:46:33.075: Request: <GetEcomData Qty="10" ><tables><Products type="all" /></tables></GetEcomData>.
2021-04-19 23:16:33.119: Error occured when sending request to remote system: Thread was being aborted.
2021-04-19 23:16:33.119: File: 'C:\Dynamicweb\Dynamicweb9\Files\System\Log\DataIntegration\0 266 imp prods20210419-2246329263768.log' doesn't exists
2021-04-19 23:16:33.119: Thread was being aborted.

Why did it take 30 min? You are probably using the NAV connector, whose code sets a 30 min timeout; see the Navconnector.cs file:
            navWSBinding.SendTimeout = TimeSpan.FromMinutes(30);
            navWSBinding.OpenTimeout = TimeSpan.FromMinutes(30);
            navWSBinding.CloseTimeout = TimeSpan.FromMinutes(30);
            navWSBinding.ReceiveTimeout = TimeSpan.FromMinutes(30);
So maybe you can change this code to your own timeout?
What is the reason to have an endpoint that has no response?

Kind regards, Dmitrij

 
Scott Forsyth Dynamicweb Employee

Hi Dmitrij,

Interesting. That sounds like some progress on tracking it down. I didn't fully understand a couple of your tests. Let me explain what I see on my end, and what I understand from your testing, and you can let me know what I'm missing.

First, the various timeouts that I'm aware of:

  • Scheduled task - the one that we're using is set to time out at 6 minutes. If I reduce the timeout on the scheduled task so that it's the first to time out, the error message is: "Log: Timeout error communicating with server. System.TimeoutException: …The request channel timed out while waiting for a reply after 00:00:05.9979925. "
  • ScriptTimeout setting. By default, the scheduled task runner itself has a 30 minute timeout. I have a customized build that lets me extend it, and it's set to 60 minutes right now. Here's the branch which I'm waiting for approval and I'll create a pull request for it. Here's the link to it. If this timeout is reached first, the error message is: "Unknown error communicating with server. System.Threading.ThreadAbortException: Thread was being aborted."
  • The Mocking endpoint. It's currently set with a timeout of 5 minutes (300000 milliseconds).
  • Web.config's executionTimeout. Currently it's set to 5 minutes (300 seconds), but I haven't found that this impacts my testing.
  • I'm not using NAV, but I'm using a custom connector that we call MockingConnector. The endpoint is set to 5 minutes, and I'm not aware of a timeout setting on the connector itself. You mentioned changing the timeout in the NAV settings, but I don't believe that it would come into play since it's our custom mocking connector with a 5 minute timeout. 

When testing, here's what I find:

  • When visiting https://qa2-drteeth.mydwsite1.com/Admin/Public/TaskExecute.aspx from Chrome, Firefox, or Edge, this will complete in 5 minutes, which is the time it takes to run the long running task. It will show a message that it has completed, and how long it took.
  • If you visit the same link in Internet Explorer, it doesn't time out at all. It seems to run forever.
  • If you visit the same link from PowerShell, it also doesn't complete in the 5 or 6 minutes. I believe that it will wait for the 3 days in the scheduled task's settings.
  • In the case of Internet Explorer or PowerShell, the log for the scheduled task shows that it completed at 5 minutes, even though IE and PowerShell appear to never end. 

If the Mocking endpoint is set to 4.5 minutes, then, even in Internet Explorer or the PowerShell Invoke-WebRequest, it will complete at 4.5 minutes. So, my conclusion is that both Internet Explorer and Invoke-WebRequest have a 5 minute timeout somewhere such that if the 5 minutes are reached, it essentially hangs, and never completes.

I just created a new mocking endpoint for you to set for 4.5 minutes: <GetEcomData Qty="10" ><tables><Products45 type="all" /></tables></GetEcomData>. You can use that to see how it does end at 4.5 minutes correctly when using that endpoint.

What you mentioned above is very interesting. You mentioned the 30 minute timeout. That sounds like the 30 minutes within the scheduled task runner. I've extended mine to 60 minutes, but if you are testing on your own site, then it would be 30 minutes.

So, my question is what you did to achieve the 30 minute timeout. Is that on your own site with your own scheduled task? I would have expected that to complete at the 5 minute mark, unless that scheduled task has a similar result as Internet Explorer or PowerShell. How can I reproduce the 30 minute timeout?

> You said: I've got it reproduced using your service, but it looks like the request to it never ends (not finishes at 5 min)

That's my exact problem. I have an endpoint that will time out at 5 minutes, but using IE or PowerShell, it doesn't end at 5 minutes. What I would like to achieve is for IE or PowerShell to end correctly at the 5 minute timeout of the Mocking endpoint.

> You said: What is the reason to have an endpoint that has no response?

Good question. The whole reason for this exercise is that we're seeing sites with longer running pages having issues with the scheduled task runner. It never ends, so it never runs the regular 5 minute run, causing all of the scheduled tasks to appear as if they are stuck. So, with the mocking endpoint, I'm able to simulate the problem situation.

I realize this is a lot of info, but hopefully it gives some clues.

Thanks again for looking into this!

Scott

 
Dmitriy Benyuk Dynamicweb Employee

Hi Scott,
I have reproduced it using https://qa2-drteeth.mydwsite1.com/Admin/Public/TaskExecute.aspx in IE.
But I cannot reproduce it locally, since I do not have your custom add-in that uses your Mock endpoint.
Could you send its code to my email?
I tried to set up the endpoint using those settings, but it fails to get any data.

It looks like the problem is somewhere in the endpoint management/endpoint execution.
I've checked my long-running Export data add-in batch task that runs for 5 min 10 sec, and it executed just fine in IE.
It doesn't use any endpoint execution, just a hardcoded Sleep interval inside the Run() method.

Regards, Dmitrij

 
Scott Forsyth Dynamicweb Employee

Hi Dmitrij,

I emailed you the mocking connector service with configuration, and also the RunUntilComplete scheduled task that should make it easier to set up a repro that is the same as ours.

Thanks again!

Scott

 
Dan Kristensen Hørlyck

Hi Scott,

For future reference, this is tracked as #1826.

We will reproduce and implement a fix as needed in the next sprint scheduled to start May 5th.

Sincerely,
Dan

 

 

 
Scott Forsyth Dynamicweb Employee

Thanks Dan (and Dmitrij),

That sounds good. Thanks!

Scott

 
Søren Jakobsen

Any news regarding this issue?

Thanks

Søren

 
