Work Queues (Task Queues)

(using php-amqplib)


In the first tutorial we wrote programs to send and receive messages from a named queue. In this one we'll create a Work Queue that will be used to distribute time-consuming tasks among multiple workers.


The main idea behind Work Queues (aka: Task Queues) is to avoid doing a resource-intensive task immediately and having to wait for it to complete. Instead we schedule the task to be done later. We encapsulate a task as a message and send it to a queue. A worker process running in the background will pop the tasks and eventually execute the job. When you run many workers the tasks will be shared between them.


This concept is especially useful in web applications where it's impossible to handle a complex task during a short HTTP request window.



In the previous part of this tutorial we sent a message containing "Hello World!". Now we'll be sending strings that stand for complex tasks. We don't have a real-world task, like images to be resized or PDF files to be rendered, so let's fake it by pretending we're busy, using the sleep() function. We'll take the number of dots in the string as its complexity; every dot will account for one second of "work". For example, a fake task described by Hello... will take three seconds.
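The dot-counting rule can be sketched as a small function. The name task_seconds is a hypothetical helper chosen for illustration; it is not part of php-amqplib.

```php
<?php
// Hypothetical helper (not part of php-amqplib):
// one second of fake work for every dot in the message body.
function task_seconds(string $body): int
{
    return substr_count($body, '.');
}

echo task_seconds('Hello...'), "\n";   // prints 3
```

A worker would pass this value to sleep() to simulate the task.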

We will slightly modify the send.php code from our previous example to allow arbitrary messages to be sent from the command line. This program will schedule tasks to our work queue, so let's name it new_task.php:


$data = implode(' ', array_slice($argv, 1));
if(empty($data)) $data = "Hello World!";
$msg = new AMQPMessage($data,
                        array('delivery_mode' => 2) # make message persistent
                      );

$channel->basic_publish($msg, '', 'task_queue');

echo " [x] Sent ", $data, "\n";
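The way the message text is assembled from the command line can be tried without a broker. The sketch below uses a hypothetical helper, build_task_body, which is not part of the tutorial code. It also compares against the empty string instead of calling empty(), because in PHP empty("0") is true, so a task consisting only of "0" would otherwise be replaced by the default text.

```php
<?php
// Hypothetical helper: build the task text from command-line arguments.
function build_task_body(array $argv, string $default = 'Hello World!'): string
{
    // Skip $argv[0] (the script name) and join the remaining words with spaces.
    $data = implode(' ', array_slice($argv, 1));

    // Strict comparison, so that a task consisting of "0" is kept as-is.
    return $data === '' ? $default : $data;
}

echo build_task_body(['new_task.php', 'First', 'message.']), "\n"; // First message.
echo build_task_body(['new_task.php']), "\n";                      // Hello World!
```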

Our old receive.php script also requires some changes: it needs to fake a second of work for every dot in the message body. It will pop messages from the queue and perform the task, so let's call it worker.php:


$callback = function($msg){
  echo " [x] Received ", $msg->body, "\n";
  sleep(substr_count($msg->body, '.'));
  echo " [x] Done", "\n";
};

# fourth parameter (no_ack) is true: acknowledgments are not used yet
$channel->basic_consume('task_queue', '', false, true, false, false, $callback);

Note that our fake task simulates execution time.


Run them as in tutorial one:

shell1$ php new_task.php "A very hard task which takes two seconds.."
shell2$ php worker.php

Round-robin dispatching

One of the advantages of using a Task Queue is the ability to easily parallelise work. If we are building up a backlog of work, we can just add more workers and that way, scale easily.


First, let's try to run two worker.php scripts at the same time. They will both get messages from the queue, but how exactly? Let's see.

You need three consoles open. Two will run the worker.php script. These consoles will be our two consumers - C1 and C2.


shell1$ php worker.php
 [*] Waiting for messages. To exit press CTRL+C
shell2$ php worker.php
 [*] Waiting for messages. To exit press CTRL+C

In the third one we'll publish new tasks. Once you've started the consumers you can publish a few messages:


shell3$ php new_task.php First message.
shell3$ php new_task.php Second message..
shell3$ php new_task.php Third message...
shell3$ php new_task.php Fourth message....
shell3$ php new_task.php Fifth message.....

Let's see what is delivered to our workers:


shell1$ php worker.php
 [*] Waiting for messages. To exit press CTRL+C
 [x] Received 'First message.'
 [x] Received 'Third message...'
 [x] Received 'Fifth message.....'
shell2$ php worker.php
 [*] Waiting for messages. To exit press CTRL+C
 [x] Received 'Second message..'
 [x] Received 'Fourth message....'

By default, RabbitMQ will send each message to the next consumer, in sequence. On average every consumer will get the same number of messages. This way of distributing messages is called round-robin. Try this out with three or more workers.
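The distribution seen in the two consoles can be reproduced with a toy model. This is only a sketch of the round-robin idea in plain PHP; the function round_robin is hypothetical and does not talk to RabbitMQ.

```php
<?php
// Toy model of round-robin dispatch; it does not talk to RabbitMQ.
// Returns one list of messages per consumer (index 0 is C1, index 1 is C2, ...).
function round_robin(array $messages, int $consumers): array
{
    $assigned = array_fill(0, $consumers, []);
    foreach (array_values($messages) as $i => $message) {
        // The i-th message goes to consumer number (i mod N).
        $assigned[$i % $consumers][] = $message;
    }
    return $assigned;
}

$messages = [
    'First message.', 'Second message..', 'Third message...',
    'Fourth message....', 'Fifth message.....',
];
print_r(round_robin($messages, 2));
```

With two consumers, the first list holds the first, third and fifth messages and the second list holds the second and fourth, matching the console output above.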

Message acknowledgment

Doing a task can take a few seconds. You may wonder what happens if one of the consumers starts a long task and dies with it only partly done. With our current code, once RabbitMQ delivers a message to the consumer it immediately removes it from memory. In this case, if you kill a worker we will lose the message it was just processing. We'll also lose all the messages that were dispatched to this particular worker but were not yet handled.




But we don't want to lose any tasks. If a worker dies, we'd like the task to be delivered to another worker.


In order to make sure a message is never lost, RabbitMQ supports message acknowledgments. An ack(nowledgement) is sent back from the consumer to tell RabbitMQ that a particular message has been received, processed and that RabbitMQ is free to delete it.



If a consumer dies without sending an ack, RabbitMQ will understand that a message wasn't processed fully and will redeliver it to another consumer. That way you can be sure that no message is lost, even if the workers occasionally die.



There aren't any message timeouts; RabbitMQ will redeliver the message only when the worker connection dies. It's fine even if processing a message takes a very, very long time.



Message acknowledgments have been turned off in our examples so far. It's time to turn them on by setting the fourth parameter of basic_consume to false (true means no ack) and sending a proper acknowledgment from the worker once we're done with a task.

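$callback = function($msg){
  echo " [x] Received ", $msg->body, "\n";
  sleep(substr_count($msg->body, '.'));
  echo " [x] Done", "\n";
  # acknowledge the message only after the task is finished
  $msg->delivery_info['channel']->basic_ack($msg->delivery_info['delivery_tag']);
};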

$channel->basic_consume('task_queue', '', false, false, false, false, $callback);

Using this code we can be sure that even if you kill a worker using CTRL+C while it was processing a message, nothing will be lost. Soon after the worker dies all unacknowledged messages will be redelivered.
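To make the redelivery rule concrete, here is a toy in-memory model of it. ToyQueue and its methods are hypothetical names invented for this sketch; the class only imitates the bookkeeping described above and is not how RabbitMQ is implemented.

```php
<?php
// Toy in-memory model of acknowledgments; hypothetical class, not RabbitMQ.
final class ToyQueue
{
    /** @var string[] messages waiting for delivery */
    private array $ready = [];
    /** @var array<string, string[]> delivered but unacknowledged, per consumer */
    private array $unacked = [];

    public function publish(string $message): void
    {
        $this->ready[] = $message;
    }

    // Hand the next ready message to a consumer and remember it as unacked.
    public function deliver(string $consumer): ?string
    {
        $message = array_shift($this->ready);
        if ($message !== null) {
            $this->unacked[$consumer][] = $message;
        }
        return $message;
    }

    // The consumer finished the task: the message can be forgotten.
    public function ack(string $consumer, string $message): void
    {
        $index = array_search($message, $this->unacked[$consumer] ?? [], true);
        if ($index !== false) {
            array_splice($this->unacked[$consumer], $index, 1);
        }
    }

    // The consumer's connection died: requeue everything it had not acked.
    public function consumerDied(string $consumer): void
    {
        $this->ready = array_merge($this->unacked[$consumer] ?? [], $this->ready);
        unset($this->unacked[$consumer]);
    }

    public function readyCount(): int
    {
        return count($this->ready);
    }
}

$queue = new ToyQueue();
$queue->publish('Long task.....');
$queue->deliver('C1');            // C1 starts working, then crashes before acking
$queue->consumerDied('C1');
echo $queue->deliver('C2'), "\n"; // prints: Long task.....
```

A message that was acknowledged before the crash is not requeued, which is why the ack should be sent only after the work is done.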



Forgotten acknowledgment

It's a common mistake to miss the basic_ack. It's an easy error, but the consequences are serious. Messages will be redelivered when your client quits (which may look like random redelivery), but RabbitMQ will eat more and more memory as it won't be able to release any unacked messages.




To debug this kind of mistake you can use rabbitmqctl to print the messages_unacknowledged field:

$ sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged
Listing queues ...
hello    0       0
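If you want to check these numbers from a script, the text output can be parsed. The sketch below assumes whitespace-separated columns in the order requested above (name, messages_ready, messages_unacknowledged) and queue names without spaces; the exact output format may differ between RabbitMQ versions. The function parse_list_queues is a hypothetical helper.

```php
<?php
// Hypothetical helper: parse the text printed by
// `rabbitmqctl list_queues name messages_ready messages_unacknowledged`.
function parse_list_queues(string $output): array
{
    $queues = [];
    foreach (preg_split('/\R/', trim($output)) as $line) {
        $fields = preg_split('/\s+/', trim($line));
        // Keep only rows of the form: <name> <ready> <unacked>.
        // Banner and header lines fail the numeric checks and are skipped.
        if (count($fields) === 3 && ctype_digit($fields[1]) && ctype_digit($fields[2])) {
            $queues[$fields[0]] = [
                'ready'   => (int) $fields[1],
                'unacked' => (int) $fields[2],
            ];
        }
    }
    return $queues;
}

$sample = "Listing queues ...\nhello\t0\t0\n";
print_r(parse_list_queues($sample));
```

A steadily growing 'unacked' value for a queue is the symptom of a missing basic_ack.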

Message durability

We have learned how to make sure that even if the consumer dies, the task isn't lost. But our tasks will still be lost if the RabbitMQ server stops.


When RabbitMQ quits or crashes it will forget the queues and messages unless you tell it not to. Two things are required to make sure that messages aren't lost: we need to mark both the queue and messages as durable.



First, we need to make sure that RabbitMQ will never lose our queue. To do so, we declare it as durable by passing true as the third parameter to queue_declare:


$channel->queue_declare('hello', false, true, false, false);

Although this command is correct by itself, it won't work in our present setup. That's because we've already defined a queue called hello which is not durable. RabbitMQ doesn't allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do that. But there is a quick workaround: let's declare a queue with a different name, for example task_queue:


$channel->queue_declare('task_queue', false, true, false, false);

The durable flag must be set to true in both the producer and the consumer code.


At this point we're sure that the task_queue queue won't be lost even if RabbitMQ restarts. Now we need to mark our messages as persistent by setting the delivery_mode = 2 message property, which AMQPMessage takes as part of the property array (its second argument).

$msg = new AMQPMessage($data,
       array('delivery_mode' => 2) # make message persistent
       );

Note on message persistence

Marking messages as persistent doesn't fully guarantee that a message won't be lost. Although it tells RabbitMQ to save the message to disk, there is still a short time window when RabbitMQ has accepted a message and hasn't saved it yet. Also, RabbitMQ doesn't do fsync(2) for every message -- it may be just saved to cache and not really written to the disk. The persistence guarantees aren't strong, but it's more than enough for our simple task queue. If you need a stronger guarantee you can wrap the publishing code in a transaction.




Fair dispatch

You might have noticed that the dispatching still doesn't work exactly as we want. For example in a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. Well, RabbitMQ doesn't know anything about that and will still dispatch messages evenly.


This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.



To defeat that we can use the basic_qos method with the prefetch_count = 1 setting. This tells RabbitMQ not to give more than one message to a worker at a time. In other words, it won't dispatch a new message to a worker until that worker has processed and acknowledged the previous one. Instead, it will dispatch it to the next worker that is not busy.



$channel->basic_qos(null, 1, null);
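The effect of prefetch_count = 1 can be illustrated with a toy scheduling model. It is a simplified sketch, not RabbitMQ's dispatcher: it assumes all tasks are already in the queue when the workers start, and the function names are hypothetical.

```php
<?php
// Toy scheduling model; it does not talk to RabbitMQ.

// Round-robin: the i-th task always goes to worker (i mod N).
function busy_time_round_robin(array $durations, int $workers): array
{
    $busy = array_fill(0, $workers, 0);
    foreach (array_values($durations) as $i => $seconds) {
        $busy[$i % $workers] += $seconds;
    }
    return $busy;
}

// prefetch_count = 1: the next task goes to whichever worker is free first.
function busy_time_prefetch_one(array $durations, int $workers): array
{
    $busy = array_fill(0, $workers, 0);
    foreach ($durations as $seconds) {
        // The worker with the least accumulated work finishes (and acks) first.
        $next = array_search(min($busy), $busy, true);
        $busy[$next] += $seconds;
    }
    return $busy;
}

// Odd-numbered tasks are heavy (5 s), even-numbered tasks are light (1 s).
$durations = [5, 1, 5, 1, 5, 1];
print_r(busy_time_round_robin($durations, 2));  // worker totals: 15 and 3 seconds
print_r(busy_time_prefetch_one($durations, 2)); // worker totals: 11 and 7 seconds
```

In this model the whole batch finishes after 11 seconds with prefetch_count = 1, compared with 15 seconds under plain round-robin.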

Note about queue size

If all the workers are busy, your queue can fill up. You will want to keep an eye on that, and maybe add more workers, or have some other strategy.


Putting it all together

Final code of our new_task.php file:


$channel->queue_declare('task_queue', false, true, false, false);

$data = implode(' ', array_slice($argv, 1));
if(empty($data)) $data = "Hello World!";
$msg = new AMQPMessage($data,
                        array('delivery_mode' => 2) # make message persistent
                      );

$channel->basic_publish($msg, '', 'task_queue');

echo " [x] Sent ", $data, "\n";



(new_task.php source)

And our worker.php:


$channel->queue_declare('task_queue', false, true, false, false);

echo ' [*] Waiting for messages. To exit press CTRL+C', "\n";

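$callback = function($msg){
  echo " [x] Received ", $msg->body, "\n";
  sleep(substr_count($msg->body, '.'));
  echo " [x] Done", "\n";
  # acknowledge the message only after the task is finished
  $msg->delivery_info['channel']->basic_ack($msg->delivery_info['delivery_tag']);
};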

$channel->basic_qos(null, 1, null);
$channel->basic_consume('task_queue', '', false, false, false, false, $callback);

while(count($channel->callbacks)) {
    $channel->wait();
}



(worker.php source)

Using message acknowledgments and prefetch you can set up a work queue. The durability options let the tasks survive even if RabbitMQ is restarted.


Now we can move on to tutorial 3 and learn how to deliver the same message to many consumers.


