
ActiveMQ 5.9 - Threads blocked on the broker because a lock is never released

  •  2
  • Giulio Pulina  · Tech Community  · 8 years ago

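    I am using ActiveMQ 5.9 together with Camel 2.10.3. Under load (during performance tests) I run into a problem: the broker seems to get stuck while trying to close a connection, and I cannot work out why.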

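    The JMS system is set up as follows: there are two brokers (configured in failover mode) and many client nodes, each of which is both a consumer and a producer on some queues (for example, the "customer_update" queue).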

    I am using a PooledConnectionFactory with its default configuration, the "CACHE_CONSUMER" cache level, and at most 10 concurrent consumers per client node.

    The broker transport connector is configured as follows: tcp://0.0.0.0:61616?maximumConnections=1000&wireFormat.maxFrameSize=104857600

    Below is a thread on the broker that holds a lock and never releases it:

    "ActiveMQ Transport: tcp:///10.128.43.206:38694@61616" #5774 daemon prio=5 os_prio=0 tid=0x00007f2a4424e800 nid=0xaba4 waiting on condition [0x00007f29fe397000]
       java.lang.Thread.State: TIMED_WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for  <0x00000004fd008fb0> (a java.util.concurrent.CountDownLatch$Sync)
            at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
            at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
            at org.apache.activemq.broker.TransportConnection.stop(TransportConnection.java:983)
            at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:699)
            - locked <0x000000050401eed0> (a java.lang.Object)
            at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:79)
            at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
            at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
            at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
            at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
            at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
            at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
            at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
            at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:214)
            at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
            at java.lang.Thread.run(Thread.java:748)
    

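    There are also more than 500 other threads on the broker that look like this, all waiting for the same monitor (0x000000050401eed0):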

    "ActiveMQ Transport: tcp:///10.128.43.206:52074@61616" #2962 daemon prio=5 os_prio=0 tid=0x00007f2a440c9000 nid=0xa01f waiting for monitor entry [0x00007f29fc768000]
       java.lang.Thread.State: BLOCKED (on object monitor)
            at org.apache.activemq.broker.TransportConnection.processAddConnection(TransportConnection.java:696)
            - waiting to lock <0x000000050401eed0> (a java.lang.Object)
            at org.apache.activemq.broker.jmx.ManagedTransportConnection.processAddConnection(ManagedTransportConnection.java:79)
            at org.apache.activemq.command.ConnectionInfo.visit(ConnectionInfo.java:139)
            at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292)
            at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:149)
            at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
            at org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
            at org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:270)
            at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
            at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:214)
            at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
            at java.lang.Thread.run(Thread.java:748)
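
    The pattern in the two dumps above (one thread in TIMED_WAITING on a CountDownLatch while it still holds a monitor, many threads BLOCKED on that same monitor) can be reproduced with plain java.util.concurrent. The following is only a minimal sketch of that pattern, not ActiveMQ's code: the class name LockHeldWhileWaitingDemo, the method countBlockedWaiters and its parameters are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only mimics the locking pattern seen in the dumps.
class LockHeldWhileWaitingDemo {

    /**
     * Starts one "holder" thread that takes a shared monitor and then does a
     * timed wait on a latch that nobody counts down, plus `waiters` threads
     * that need the same monitor. Returns how many waiter threads were seen
     * in state BLOCKED while the holder was still waiting.
     */
    static int countBlockedWaiters(int waiters, long holdMillis) {
        final Object sharedLock = new Object();
        final CountDownLatch neverCountedDown = new CountDownLatch(1);
        final CountDownLatch holderHasLock = new CountDownLatch(1);
        try {
            Thread holder = new Thread(() -> {
                synchronized (sharedLock) {
                    holderHasLock.countDown();
                    try {
                        // Timed wait while still inside the synchronized block:
                        // the monitor stays locked for the whole timeout.
                        neverCountedDown.await(holdMillis, TimeUnit.MILLISECONDS);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, "holder");
            holder.start();
            holderHasLock.await(); // make sure the holder owns the monitor first

            List<Thread> threads = new ArrayList<>();
            for (int i = 0; i < waiters; i++) {
                Thread t = new Thread(() -> {
                    synchronized (sharedLock) {
                        // Nothing to do; entering the monitor is the point.
                    }
                }, "waiter-" + i);
                t.start();
                threads.add(t);
            }

            // Poll thread states for up to half of the hold time.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(holdMillis / 2);
            int blocked = 0;
            while (System.nanoTime() < deadline) {
                blocked = 0;
                for (Thread t : threads) {
                    if (t.getState() == Thread.State.BLOCKED) {
                        blocked++;
                    }
                }
                if (blocked == waiters) {
                    break;
                }
                Thread.sleep(5);
            }

            holder.join(); // the holder releases the monitor after the timeout
            for (Thread t : threads) {
                t.join();
            }
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the demo", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocked waiters: " + countBlockedWaiters(5, 1000));
    }
}
```

    In this sketch the waiters make progress again once the timed wait expires. In the broker dump the holder is likewise in a timed wait, so the other transport threads can only proceed after each such wait ends, which is consistent with connections piling up under load.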
    

    The first error I see on the broker is:

    2018-05-16 16:36:59,336 [org.apache.activemq.broker.TransportConnection.Transport:856] 
    WARN  - Transport Connection to: tcp://10.128.43.206:48747 failed: java.io.EOFException
    

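    On the client node referenced in the broker log (10.128.43.206) I see the logs below. The node appears to reconnect, but it is disconnected again right afterwards, and this happens over and over.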

    2018-05-16 16:36:59,322 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN  - Transport (tcp://10.128.43.169:61616) failed, reason:  java.io.IOException, attempting to automatically reconnect
    2018-05-16 16:36:59,322 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN  - Transport (tcp://10.128.43.169:61616) failed, reason:  java.io.IOException, attempting to automatically reconnect
    2018-05-16 16:36:59,375 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
    2018-05-16 16:36:59,375 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
    2018-05-16 16:36:59,375 [org.apache.activemq.TransactionContext:856] INFO  - commit failed for transaction TX:ID:52374-1526300283331-1:1:898
    javax.jms.TransactionRolledBackException: Transaction completion in doubt due to failover. Forcing rollback of TX:ID:52374-1526300283331-1:1:898
            at org.apache.activemq.state.ConnectionStateTracker.restoreTransactions(ConnectionStateTracker.java:231)
            at org.apache.activemq.state.ConnectionStateTracker.restore(ConnectionStateTracker.java:169)
            at org.apache.activemq.transport.failover.FailoverTransport.restoreTransport(FailoverTransport.java:827)
            at org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1005)
            at org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:136)
            at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
            at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    2018-05-16 16:37:00,091 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
    2018-05-16 16:37:00,091 [org.apache.activemq.transport.failover.FailoverTransport:856] INFO  - Successfully reconnected to tcp://10.128.43.169:61616
    2018-05-16 16:37:00,112 [org.apache.activemq.transport.failover.FailoverTransport:856] WARN  - Transport (tcp://10.128.43.169:61616) failed, reason:  java.io.IOException, attempting to automatically reconnect
    

    Eventually the broker reaches the configured maximumConnections limit (1000) and has to be restarted.

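    Could this be caused by a client node acting as both consumer and producer while using the same connection pool, which leads to some kind of deadlock?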

    Do you have any suggestions?

    Thanks

    Giulio

    1 reply  |  as of 8 years ago
        1
  •  -1
  •   Giulio Pulina    8 years ago

    I was most likely affected by this issue:

    https://issues.apache.org/jira/browse/AMQ-5090

    Updating to ActiveMQ 5.10.0 and Camel 2.13.1 resolved the problem (the system is also more stable, even during performance tests).

    Thanks, Giulio