Unable to Halt a database connector that hangs in 'Polling' state. #3304

Open
rbeckman-nextgen opened this issue May 11, 2020 · 18 comments
Labels
bug, channel, connector, Internal-Issue-Created, RS-876, triaged

Comments

@rbeckman-nextgen
Collaborator

During network and/or firewall problems, something went wrong with a database connection in our test environment, and a Mirth channel (a Database Reader with the keep-connection-open option set to true) got stuck in the 'Polling' state (the reason the DB query hangs isn't a Mirth problem).

But the Mirth problem was: stopping the channel didn't work (this I expected), and halting the channel didn't do anything either (it stayed in the halting state, probably unable to terminate the DB connection, and never stopped).

Also, due to this problem, stopping/restarting Mirth Connect wasn't possible anymore.

This problem may be related to: MIRTH-3366

Imported Issue. Original Details:
Jira Issue Key: MIRTH-3403
Reporter: amc_cru
Created: 2014-08-13T02:13:50.000-0700

@rbeckman-nextgen
Collaborator Author

Driver used by the database-reader channel: jdbc:jtds:sqlserver://.....

Imported Comment. Original Details:
Author: amc_cru
Created: 2014-08-13T02:30:13.000-0700

@rbeckman-nextgen
Collaborator Author

Others are running into this as well, also using jTDS. This is due to a couple of things. First, the database reader doesn't support halting at all right now. The delegate interface doesn't even have a halt method. Second, both the reader and writer query delegates just close the Connection object. However, that could block, depending on the driver. To do a proper halt, we should be calling the abort method, and possibly also the setNetworkTimeout method when the connection first gets created.
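
For illustration, a rough sketch of what a halt-capable delegate could look like, assuming direct access to the live Connection (the class and method names here are hypothetical, not Mirth's actual delegate API):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of a halt that force-terminates a hung JDBC connection.
public class HaltableJdbcDelegate {

    private final ExecutorService abortExecutor = Executors.newSingleThreadExecutor();
    private volatile Connection connection;

    void onConnect(Connection connection) throws SQLException {
        this.connection = connection;
        // Fail network reads after 30 seconds instead of blocking forever (JDBC 4.1+, driver-dependent).
        connection.setNetworkTimeout(abortExecutor, 30_000);
    }

    void halt() {
        Connection c = connection;
        if (c != null) {
            try {
                // Unlike close(), abort() is designed to be callable from another thread
                // and to return without blocking on a dead socket.
                c.abort(abortExecutor);
            } catch (SQLException e) {
                // Log and move on; the channel should still be allowed to stop.
            }
        }
    }
}
```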

However, jTDS is dumb and doesn't support either of those methods. In the JtdsConnection class it literally just throws an AbstractMethodError, nothing else. So as far as I can tell, it's impossible to force-halt a hanging jTDS connection, unless we override the class and implement those methods ourselves. PostgreSQL's driver does implement them, so it should work fine there. Not sure about other drivers. Maybe Microsoft's JDBC driver does.

This is very easy to reproduce. I just use a VM with SQL Server on it, and a Database Writer channel that invokes WAITFOR. While a message is processing I suspend the VM. After that, the channel cannot be stopped or halted, and requires the entire server to be restarted.
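
A standalone approximation of that repro as plain JDBC, outside of Mirth (host, database, and credentials in the jTDS URL are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// If the SQL Server VM is suspended while execute() is running, the call never
// returns, and close() on the connection can block as well.
public class WaitForRepro {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:jtds:sqlserver://sqlserver-vm:1433/testdb"; // placeholder host/db
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.execute("WAITFOR DELAY '00:10:00'"); // blocks server-side for 10 minutes
        }
    }
}
```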

Imported Comment. Original Details:
Author: narupley
Created: 2015-04-30T12:07:41.000-0700

@rbeckman-nextgen
Collaborator Author

Doesn't look like Microsoft's JDBC driver supports those methods either. So as far as SQL Server goes there's nothing that can be done, unless as I said we alter jTDS to support it.

It sucks that in cases like this the channel basically can't be used at all, but the only alternative is to spawn the possibly-forever-blocking operations in a separate thread, and when something like this happens we just try our best and then forget about the thread. Then the channel will be able to stop, and can be used, redeployed, etc. It's just that in the JVM you'll still have lingering threads that could stick around forever. What's worse? A possible thread leak, or forcing the user to restart the entire server? In the thread leak case we would obviously send some error to the server log letting the user know that it's happening, and that they should restart the server when it's convenient. I think that's better than the channel being in a perpetually unusable state, wherein the user is forced to either restart the server immediately, or abandon/clone the channel.
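
A rough sketch of that best-effort-then-abandon idea (illustrative names only, not Mirth's actual code):

```java
import java.sql.Connection;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

// Illustrative only: run the possibly-blocking close on a separate thread,
// wait briefly, then give up and let the channel stop anyway.
public class BestEffortCloser {
    private static final Logger LOG = Logger.getLogger(BestEffortCloser.class.getName());

    public static void closeOrAbandon(Connection connection) {
        ExecutorService closer = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "abandoned-jdbc-close");
            t.setDaemon(true); // don't keep the JVM alive just for this
            return t;
        });
        closer.submit(() -> {
            try {
                connection.close();
            } catch (Exception e) {
                LOG.warning("Error closing connection: " + e.getMessage());
            }
        });
        closer.shutdown();
        try {
            if (!closer.awaitTermination(10, TimeUnit.SECONDS)) {
                LOG.severe("JDBC close is hanging; abandoning the thread. "
                        + "A thread may leak until the server is restarted.");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```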

Imported Comment. Original Details:
Author: narupley
Created: 2015-04-30T12:29:09.000-0700

@rbeckman-nextgen
Collaborator Author

http://www.mirthcorp.com/community/forums/showthread.php?t=14253

Imported Comment. Original Details:
Author: narupley
Created: 2015-05-14T07:47:43.000-0700

@rbeckman-nextgen
Collaborator Author

Is there a workaround for this? We are encountering this in our production (supported) environment.

Imported Comment. Original Details:
Author: justinsk
Created: 2019-07-25T17:26:34.000-0700

@rbeckman-nextgen
Collaborator Author

We are currently having a similar problem with a Connector Type of File Server and a Method of FTP. Just last week the same problem occurred with a Connector Type of File Server, a Method of File, and a Directory pointing to a networked server. When the networked server has problems, Mirth Connect didn't handle it well and started using so much CPU that other channels couldn't get their work done, and we had to reboot the server. We are using Mirth version 3.5.2. Justin Kaltenbach, what version are you using? We, too, are using a supported version.

Imported Comment. Original Details:
Author: mulleg
Created: 2019-08-02T11:55:14.000-0700

@rbeckman-nextgen
Collaborator Author

Would like to know a workaround as well. This is locking up our production workflow about once every two weeks. It occurs with the following settings, connecting to SQL Azure using Mirth Connect Server 3.5.2, Java version 1.8.0_121:

Database Reader

Use Javascript: No
Keep Connection Open: No
Aggregate Results: No
Cache Results: Yes
Retries on Error: 3
Retry Interval: 10000

sqljdbc42.jar

Imported Comment. Original Details:
Author: sesq
Created: 2019-08-28T12:32:07.000-0700

@cturczynskyj cturczynskyj added the closed-due-to-inactivity label Mar 1, 2021
@pladesma pladesma closed this as completed Mar 1, 2021
@brendanhwell

Running into the same issue here. Any workaround? This happens weekly for us: the channel just sits there querying indefinitely and requires a complete kill and restart of Mirth Connect. I was going to try Microsoft's driver, but it sounds like you tried that, @rbeckman-nextgen?

@aemerytruven

This happens to us so often that I have a Windows Scheduled Task to kill Mirth and start it again a few times a day. I have not found another workaround. We house our Mirth instances in the same datacenter as our MS SQL Servers, so they are direct server-to-server connections, no cloud.

@RaulDeLaMantua

This happens to us all the time.

@michaelmarcuccio

michaelmarcuccio commented Jun 25, 2021

Can this be reopened? If I am not mistaken, this was never addressed and has been an issue for years with no workaround.
Was this resolved by newer Mirth versions or by the new integrated SQL driver they use?

@rivforthesesh

Please can this be reopened? I've been running into this issue a lot with one particular database reader (even though the query I've used runs just fine in SSMS), and having to force kill the service every time this happens is really slowing down work.

@michaelmarcuccio

@rivforthesesh what Mirth version are you using, and what sqljdbc.jar driver version are you using? Also, are you using timeout parameters in your connection string? I think the solution might be something around using the new timeout params, but you have to use an up-to-date version of the driver (6.2). I have not tested this and probably don't have time to at the moment. e.g. https://github.com/Microsoft/mssql-jdbc/wiki/QueryTimeout
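
For anyone trying this, a hedged example of the kind of connection string that wiki page describes. queryTimeout and socketTimeout are documented mssql-jdbc properties, but the values, server name, and credentials below are placeholders, and exact behavior depends on your driver version:

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class TimeoutUrlExample {
    public static void main(String[] args) throws Exception {
        // Placeholders throughout; check your mssql-jdbc version for property support.
        String url = "jdbc:sqlserver://myserver.database.windows.net:1433;"
                + "databaseName=mydb;"
                + "user=myuser;password=secret;"
                + "queryTimeout=300;"      // cancel statements running longer than 5 minutes (seconds)
                + "socketTimeout=600000;"; // fail network reads that stall for 10 minutes (milliseconds)
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}
```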

@rivforthesesh

@michaelmarcuccio currently using Mirth 3.11.0, and I believe the driver is mssql-jdbc-8.4.1.jre8.jar - that's my best guess based on the config files we have. I think I've managed a workaround now but thank you for the query timeout suggestion, I'll make a note of it for later!

In case this helps anyone else already stuck in the loop: stopping the channel and then killing the process in SSMS (sp_who2 to find the SPID, KILL to kill it) will end the polling loop, but that requires having the right permissions in SSMS to kill processes.
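
If it helps, the same workaround can also be scripted from a second admin connection instead of doing it by hand in SSMS. This is only a rough sketch: the host-name filter is a stand-in for however you identify the hung session, the URL is a placeholder, and it needs the same KILL permissions as the manual approach:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: find the hung session(s) the way sp_who2 shows them in SSMS, then KILL them.
public class KillHungSession {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://myserver:1433;databaseName=master;user=admin;password=secret";
        List<Integer> spids = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(url)) {
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("EXEC sp_who2")) {
                while (rs.next()) {
                    String host = rs.getString("HostName");
                    if (host != null && host.trim().equalsIgnoreCase("MIRTH-HOST")) { // hypothetical Mirth server host
                        spids.add(rs.getInt("SPID"));
                    }
                }
            }
            for (int spid : spids) {
                try (Statement kill = conn.createStatement()) {
                    kill.execute("KILL " + spid); // KILL requires a literal SPID, so build the statement text
                }
            }
        }
    }
}
```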

(For actually fixing it, changing the source query to add the option MAXDOP=2 got me past the polling stage and onto reading. I was digging through sys tables in SSMS to try to find anything odd about the query, and it had a DOP of 8 which, while it should've worked on this server, was way higher than any other running process. That's more likely to be a quirk of the server I'm working on than something with Mirth, but I'll drop it here anyway.)

@michaelmarcuccio

michaelmarcuccio commented Aug 4, 2021

@rivforthesesh I am using SQL Azure with sqljdbc42.jar, just adjusted from maxdop=0 to maxdop=1 as our instance has less than one vCore. Thanks for the info here. If the issue still occurs I will probably try upgrading drivers and setting timeouts in connection string.

Edit: Issue still occurs.

@pladesma pladesma reopened this Sep 3, 2021
@pladesma pladesma added the bug, Internal-Issue-Created, triaged, and RS-876 labels and removed the closed-due-to-inactivity label Sep 3, 2021
@twest-mirthconnect
Contributor

@rbeckman-nextgen - Will you please let me know if this is still a problem in your production? Let me know what version you are using when you respond. :) Thanks.

@michaelmarcuccio

Using v3.5.2, the issue hasn't reoccurred in a few months. Maybe the Azure SQL DB side got friendlier and now prevents this scenario, or maybe I have just been getting lucky. It is really painful when the issue does occur, as it causes about a 30-minute downtime to restart everything.

@twest-mirthconnect
Contributor

For those on this thread, if you can or want to develop a PR on your end, we will pull it into our sprint schedule and complete a code review and quality check. Once we complete our validation, the commit will be available for use ahead of the release.

We hope this new process will both improve the response to problems engineers are facing in the field and build a stronger, more connected community.
