celery .delay hangs (recent, not an auth problem)
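I'm running Celery 2.2.4 / djCelery 2.2.4 with RabbitMQ 2.1.1 as the backend. I recently brought two new celery servers online. I had been running 2 workers across two machines with about 18 threads in total, and on my new boxes (36g RAM + dual hyper-threaded quad-core) I'm running 10 workers with 8 threads each, for a total of 180 threads. My tasks are all very small, so this should be fine.
The nodes have been running fine for the last few days, but today I noticed that .delay() is hanging. When I interrupt it, I see a traceback that points here: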
  File "/home/django/deployed/releases/20110608183345/virtual-env/lib/python2.5/site-packages/celery/task/base.py", line 324, in delay
    return self.apply_async(args, kwargs)
  File "/home/django/deployed/releases/20110608183345/virtual-env/lib/python2.5/site-packages/celery/task/base.py", line 449, in apply_async
    publish.close()
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/kombu/compat.py", line 108, in close
    self.backend.close()
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/channel.py", line 194, in close
    (20, 41), # Channel.close_ok
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/abstract_channel.py", line 89, in wait
    self.channel_id, allowed_methods)
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/connection.py", line 198, in _wait_method
    self.method_reader.read_method()
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/method_framing.py", line 212, in read_method
    self._next_method()
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/method_framing.py", line 127, in _next_method
    frame_type, channel, payload = self.source.read_frame()
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/transport.py", line 109, in read_frame
    frame_type, channel, size = unpack('>BHI', self._read(7))
  File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/transport.py", line 200, in _read
    s = self.sock.recv(65536)
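The last frame is a plain blocking sock.recv waiting for the broker's Channel.close_ok reply, so if that reply never arrives the call waits forever. As a rough illustration of that behaviour, and of how a socket timeout bounds the wait, here is a minimal sketch using only the standard library. It does not use amqplib at all: recv_with_timeout is a hypothetical helper, and the local socket pair merely stands in for a client and a broker that stays silent.

```python
import socket


def recv_with_timeout(sock, nbytes, timeout_seconds):
    """Read up to nbytes, or return None if nothing arrives in time.

    Hypothetical helper for illustration; not part of amqplib or Celery.
    """
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # Without a timeout, recv() would block here indefinitely.
        return None
    finally:
        sock.settimeout(None)  # restore blocking mode


# A connected pair of local sockets stands in for client and broker.
client, silent_peer = socket.socketpair()

# The peer never sends anything, so the read gives up after 0.2 seconds.
reply = recv_with_timeout(client, 7, 0.2)
print(reply)

client.close()
silent_peer.close()
```

Whether a timeout like this can be set on the connection Celery opens depends on the library versions in use; the sketch only shows why the interrupt lands on recv.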
I've checked the Rabbit logs, and I see the process trying to connect as:
=INFO REPORT==== 12-Jun-2011::22:58:12 ===
accepted TCP connection on 0.0.0.0:5672 from x.x.x.x:48569
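I've set the Celery log level to INFO, but I don't see anything particularly interesting in the Celery logs, except that two of the workers can't connect to the broker: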
[2011-06-12 22:41:08,033: ERROR/MainProcess] Consumer: Connection to broker lost. Trying to re-establish connection...
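All of the other nodes are able to connect without issue.
I know there was a posting of a similar nature last year (RabbitMQ / Celery with Django hangs on delay/ready/etc - No useful log info), but I'm fairly certain this is different. Could it be that the sheer number of workers is creating some sort of race condition in amqplib? I found this thread, which seems to indicate that amqplib is not thread-safe; I'm not sure whether this matters for Celery.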
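For context on the thread-safety concern: the usual way to use a client library that is not thread-safe is to make sure no connection object is ever shared between threads. The sketch below shows that pattern with threading.local. It is only an illustration under stated assumptions: FakeConnection is a hypothetical stand-in rather than an amqplib class, and nothing here claims that Celery does or does not work this way internally.

```python
import threading


class FakeConnection(object):
    """Hypothetical stand-in for a broker connection; not an amqplib class."""

    def __init__(self):
        self.owner = threading.current_thread().name


_local = threading.local()


def get_connection():
    """Return the calling thread's own connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = FakeConnection()
        _local.conn = conn
    return conn


def run_threads(count):
    """Start `count` threads and collect (connection, reused_in_thread) pairs."""
    results = []
    lock = threading.Lock()

    def worker():
        conn = get_connection()
        reused = conn is get_connection()  # a second call returns the same object
        with lock:
            results.append((conn, reused))

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_threads(4)
distinct = len(set(id(conn) for conn, _ in results))
print("threads: %d, distinct connections: %d" % (len(results), distinct))
```

Each thread ends up with its own connection object, so two threads never read from or write to the same socket.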
EDIT: I've tried celeryctl purge on both nodes. On one node it succeeds, but on the other it fails with the following AMQP error:
AMQPConnectionException(reply_code, reply_text, (class_id, method_id))
amqplib.client_0_8.exceptions.AMQPConnectionException:
(530, u"NOT_ALLOWED - cannot redeclare exchange 'XXXXX' in vhost 'XXXXX'
with different type, durable or autodelete value", (40, 10), 'Channel.exchange_declare')
On both nodes, […] hangs with the "can't close connection" traceback above. I'm at a loss here.
EDIT2: I was able to delete the offending exchange using exchange.delete from camqadm, and now the second node hangs too :(.
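To make the NOT_ALLOWED error and the exchange.delete step concrete: an AMQP broker remembers the attributes an exchange was first declared with, and refuses a later declare of the same name with different ones. The sketch below is a toy in-memory model of that rule only. ToyBroker, NotAllowed and the exchange name "tasks" are hypothetical and are not RabbitMQ, amqplib or camqadm code.

```python
class NotAllowed(Exception):
    """Raised by the toy broker; loosely mirrors AMQP reply code 530."""


class ToyBroker(object):
    """Hypothetical in-memory model of exchange declaration."""

    def __init__(self):
        self.exchanges = {}  # name -> (exchange_type, durable, auto_delete)

    def exchange_declare(self, name, exchange_type, durable, auto_delete):
        requested = (exchange_type, durable, auto_delete)
        existing = self.exchanges.get(name)
        if existing is None:
            self.exchanges[name] = requested  # first declaration wins
        elif existing != requested:
            raise NotAllowed(
                "cannot redeclare exchange %r with different type, "
                "durable or autodelete value" % name
            )
        # Redeclaring with identical attributes is a harmless no-op.

    def exchange_delete(self, name):
        self.exchanges.pop(name, None)


broker = ToyBroker()
broker.exchange_declare("tasks", "direct", durable=True, auto_delete=False)

try:
    # Same name, different type: rejected, as in the error above.
    broker.exchange_declare("tasks", "topic", durable=True, auto_delete=False)
    conflict = False
except NotAllowed:
    conflict = True

# After deleting the exchange, the new declaration goes through.
broker.exchange_delete("tasks")
broker.exchange_declare("tasks", "topic", durable=True, auto_delete=False)
print(conflict, broker.exchanges["tasks"])
```

In this model the conflict disappears once the old exchange is deleted, which matches the purge error going away but says nothing about the hang itself.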
EDIT3: One other thing that changed recently is that I added an additional vhost to the RabbitMQ instance that my staging node connects to.