Today I want to talk about daemonizing PHP scripts on Linux servers -- creating scripts that are meant to run in the background for long periods of time. After we've covered that, I'll explain how I like to use supervisord to manage these long-running PHP processes.

Creating long-running scripts

A daemon is nothing but a script that goes on forever. When you get right down to it, a daemon is simply an infinite loop.

    class MyDaemon
    {
        public function run()
        {
            while (1) {
                // Like the energizer bunny
            }
        }
    }

    $daemon = new MyDaemon();
    $daemon->run();

The work you do within that infinite loop is the whole purpose of your daemon. Maybe it's listening for a network connection, or maybe it's waiting for an item to enter a queue. For example, take a hypothetical website that processes new users from a database queue. You might end up with something like this:

(Note: using a database for a queue isn't very efficient, hence the use of sleep() here to prevent hundreds of count queries per second being run in the loop. In practice you'd want to use a real queue, like beanstalkd.)

    class MyDaemon
    {
        public function run()
        {
            while (1) {
                if ($this->isNewUsers()) {
                    foreach ($this->getNewUsers() as $user) {
                        $this->setupNewUser($user);
                    }
                } else {
                    // If there's no work to do, let's just
                    // wait 10 seconds before hitting the DB to
                    // count again
                    sleep(10);
                }
            }
        }

        // ...
        // Imagine we defined isNewUsers, getNewUsers, setupNewUser
    }
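
To make the elided methods concrete, here is a minimal sketch that swaps the database queue for an in-memory array. The class name NewUserWorker, the tick() method, and the $processed property are all hypothetical, invented purely for illustration; a real daemon would query its database or queue instead.

```php
<?php
// Hypothetical stand-in for the database-backed daemon above.
// The "queue" is just an array handed to the constructor.
class NewUserWorker
{
    private $queue;
    private $idleSeconds;
    public $processed = array();

    public function __construct(array $queue, $idleSeconds = 10)
    {
        $this->queue = $queue;
        $this->idleSeconds = $idleSeconds;
    }

    public function isNewUsers()
    {
        return count($this->queue) > 0;
    }

    public function getNewUsers()
    {
        // Hand back everything that is waiting and empty the queue
        $batch = $this->queue;
        $this->queue = array();
        return $batch;
    }

    public function setupNewUser($user)
    {
        // Real setup work would go here; we only record the user
        $this->processed[] = $user;
    }

    // One pass of the daemon loop. Returns the number of users handled.
    public function tick()
    {
        if ($this->isNewUsers()) {
            $handled = 0;
            foreach ($this->getNewUsers() as $user) {
                $this->setupNewUser($user);
                $handled++;
            }
            return $handled;
        }

        // Nothing to do, so wait before checking again
        sleep($this->idleSeconds);
        return 0;
    }
}
```

Pulling one pass of the loop into its own method keeps run() down to a bare `while (1) { $this->tick(); }` and makes the logic easy to exercise without waiting on a real queue.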

Note: In this post I'm creating simple daemons that generally perform one task at a time. "True" daemons, like a web server for example, fork off children so multiple tasks can be run in parallel.

Shutting down safely

So we have a long-running script now. But you might have noticed there's no way to actually stop it. The only way to stop this script is to issue a kill command against the process, and that is dangerous: what if the script was in the middle of processing something? Using our example above, what if we were only half-finished processing a user?

You can sometimes mitigate problems like this, for example by using database transactions to ensure an all-or-nothing commit. But what if you need to interact with systems that don't offer such transactional safety?

So the only way a PHP daemon is really worth it is if we can shut it down safely. That means we need some way to tell the script it's time to go; a way to tell it to break the infinite loop. Something like this:

    class MyDaemon
    {
        public function run()
        {
            while (1) {
                // Check to see if we should break out
                if ($this->shouldStop()) break;

                if ($this->isNewUsers()) {
                    foreach ($this->getNewUsers() as $user) {

                        // Check inside the loop too, to make sure we
                        // don't continue processing a big batch of users
                        if ($this->shouldStop()) break;

                        $this->setupNewUser($user);
                    }
                } else {
                    sleep(10);
                }
            }
        }

        /**
         * Returns true when the main loop should finally stop
         */
        public function shouldStop()
        {
            return false;
        }
    }

But what can we put inside that shouldStop() method?

Checking an external value

Perhaps the simplest thing to do is to check to see if there is some external value. Maybe a database record that says "should_stop=1" or the presence of a file called "should_stop.txt".

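    // Hypothetical property: remembers that a stop was requested
    private $stopRequested = false;

    /**
     * Returns true when the main loop should finally stop
     */
    public function shouldStop()
    {
        // The file is deleted the first time we see it, and run() calls
        // shouldStop() again after breaking out of the foreach, so the
        // answer has to be remembered
        if ($this->stopRequested) {
            return true;
        }

        clearstatcache();
        if (file_exists('should_stop.txt')) {
            unlink('should_stop.txt'); // delete the file for next time
            $this->stopRequested = true;
            return true;
        }

        return false;
    }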

But these approaches aren't very efficient (every check hits the filesystem or the database), and they create a dependence on some other system.

Handling POSIX signals

The best way for us to handle clean shutdowns is to handle standard POSIX signals like TERM and INT. When you're using a shell and press CTRL+C, Linux sends the application an INT signal. When you run the kill command on a process, it sends the TERM signal by default. So intercepting these two signals is perfect for us: if we can intercept them, we can make sure we shut down gracefully.

Enabling Process Control in PHP

PHP has process control features (PCNTL) that we need in order to handle POSIX signals. But they are not enabled by default. You must compile PHP with the --enable-pcntl configuration option, unless the PHP CLI package you installed was already built with it.
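
Since a missing extension only shows up as a fatal "undefined function" error later on, it can help to check for it when the daemon starts. The sketch below uses PHP's built-in extension_loaded(); the function name missing_extensions() is hypothetical and exists only for this example.

```php
<?php
/**
 * Return the names of required extensions that are not loaded.
 * (Hypothetical helper; extension_loaded() is the built-in doing the work.)
 */
function missing_extensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Usage at the top of a daemon script (sketch):
//
// $missing = missing_extensions(array('pcntl'));
// if ($missing) {
//     fwrite(STDERR, 'Missing PHP extensions: ' . implode(', ', $missing) . "\n");
//     exit(1);
// }
```

Running `php -m` from the shell lists the loaded modules as well, which is a quick way to see whether pcntl is there before writing any code.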

Listening for signals

There are two steps we need to do:

  1. Declare ticks. Ticks are a little-known feature of PHP that lets you run a callback function every X statements. This can be useful for memory profiling or debugging, but it's not very useful otherwise. We aren't actually going to use ticks ourselves -- the PHP PCNTL features just use them internally.
  2. Assign callbacks to listen for signals using pcntl_signal().

Here's what we end up with:

    declare(ticks = 1);

    function sig_handler($signo)
    {
        switch ($signo) {
            case SIGTERM:
            case SIGINT:
                global $_MYDAEMON_SHOULD_STOP;
                $_MYDAEMON_SHOULD_STOP = true;
                break;
        }
    }
    pcntl_signal(SIGTERM, 'sig_handler');
    pcntl_signal(SIGINT, 'sig_handler');

    class MyDaemon
    {
        // ...

        public function shouldStop()
        {
            global $_MYDAEMON_SHOULD_STOP;
            if (isset($_MYDAEMON_SHOULD_STOP) AND $_MYDAEMON_SHOULD_STOP) {
                return true;
            }

            return false;
        }
    }

So in this example we've created a callback function that listens for TERM and INT signals, and assigns a value of 'true' to a variable. Then within the actual daemon class code, we check this variable each time to determine when it's time to call it quits.

Note that it's important to use a basic function callback with pcntl_signal and not an object method. There is some weirdness when using $this in callbacks. That's why in the example above I've opted to use a simple function and a global variable.
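
If the global variable bothers you, one alternative (a sketch, not the approach used in this post) is to keep the flag on the object and process pending signals by hand. It assumes PHP 5.4 or later, where a closure created inside a method can use $this, and it calls pcntl_signal_dispatch() explicitly instead of relying on ticks. The class name SignalAwareDaemon and the $stopRequested property are hypothetical.

```php
<?php
class SignalAwareDaemon
{
    // Set to true once TERM or INT has been received
    private $stopRequested = false;

    public function __construct()
    {
        // The closure is bound to this object, so it can set the flag
        $handler = function ($signo) {
            $this->stopRequested = true;
        };

        pcntl_signal(SIGTERM, $handler);
        pcntl_signal(SIGINT, $handler);
    }

    /**
     * Returns true when the main loop should finally stop
     */
    public function shouldStop()
    {
        // Run the handlers for any signals that arrived since the
        // last call; no declare(ticks) needed with this approach
        pcntl_signal_dispatch();

        return $this->stopRequested;
    }
}
```

The trade-off is that signals are only noticed when shouldStop() is called, which is fine here because the main loop already calls it at every point where stopping is safe.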

Problems with long-running PHP scripts

The biggest problem with long-running PHP scripts is memory usage. PHP isn't exactly designed for long-running programs. There are memory leaks that don't usually matter in websites because requests start and finish so quickly, but matter a lot in long-running processes. These leaks usually have to do with circular object references and PHP's garbage collection. Since PHP 5.3 added a garbage collector that can clean up circular references, this is much less of a problem than it used to be.

Even so, you have to be careful about memory usage in your actual scripts as well. Not many libraries are designed for long-running processes. So maybe your emailing library saves some information in an array after you send each email. In a web script, it doesn't matter -- the array is lost after the request finishes. But now you might have a process that runs for hours on end. This little array of log messages can become huge!

Problems like that are what you must be wary of. You need to consciously think about them while designing your PHP daemons. What I would recommend is to plan for your PHP daemons, if possible, to shut down and restart periodically. I'll explain the best way to do this in the next section.
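
As a rough sketch of the "shut down periodically" half of that advice, the daemon can count its own iterations and watch its memory, then leave the loop once either crosses a limit. The function name should_recycle() and both default limits are assumptions made up for this example, not recommendations.

```php
<?php
/**
 * Decide whether the daemon should exit so a fresh process can replace it.
 * (Hypothetical helper; the default limits are illustrative only.)
 */
function should_recycle($iterations, $memoryBytes, $maxIterations = 1000, $maxBytes = 52428800)
{
    return $iterations >= $maxIterations || $memoryBytes >= $maxBytes;
}

// Usage inside the main loop (sketch):
//
// $iterations = 0;
// while (!$daemon->shouldStop()) {
//     $daemon->doWork();   // hypothetical unit of work
//     $iterations++;
//     if (should_recycle($iterations, memory_get_usage(true))) {
//         break; // fall out of the loop and exit cleanly
//     }
// }
```

Checking between units of work, rather than in the middle of one, keeps the exit just as safe as the signal-driven shutdown described earlier.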

Handling processes (startup, monitoring, detaching etc)

So we've got our long-running PHP scripts now, but we still have a few more things to tackle. The first is how to detach a script so it runs in the background. If you run one of these scripts from your shell, you'll just get an empty screen that does nothing until you quit the process. You want the script to detach and run in the background so that you get your command prompt back. You can do this in straight PHP -- it involves forking so the parent process can die while the child lives on.

But there are other things to consider. How do we handle output? How do we handle fatal errors and exceptions (like a database disconnection) -- how do we get the process to restart? How do we make sure the correct user owns the process?

The easiest way to do all of this is to use a process manager like supervisord. This software runs on your server and is responsible for starting your long-running PHP scripts in the background (so you don't have to worry about detaching them yourself). It also sports features like monitoring, so you can automatically restart processes that have quit unexpectedly or that are using too much memory.

With supervisord, you add some configuration for each script you want to run. Here's an example that runs a script, logs output to a file, and will TERM the script when it uses over 50MB of memory, after which it is started again. (The memmon listener comes from the separate superlance package.)

    [program:mydaemon]
    command=/usr/bin/php /home/john/mydaemon.php
    directory=/home/john
    startsecs=5
    exitcodes=2
    user=john
    stdout_logfile=/home/john/logs/mydaemon.out
    stderr_logfile=/home/john/logs/mydaemon.err

    [eventlistener:memmon]
    command=memmon -p mydaemon=50MB
    events=TICK_60

There are other good features of supervisord and other configuration options -- be sure to check it out, it's insanely useful.

That's all, folks

So that's how I write long-running PHP scripts. Don't fuss around with minutely cron jobs to run PHP scripts anymore. Do it properly. Create a daemon.

7 Responses to “Creating daemons in PHP”

  1. droope Says:

    Cool! thanks

  2. Kevin van Zonneveld Says:

    Nice, I've written something similar and submitted it to PEAR a while ago. Have a look:
    http://kevin.vanzonneveld.net/techblog/article/create_daemons_in_php/

  3. Roman Says:

    Thanks for supervisord. That was exactly what I was looking for.

  4. Shane Harter Says:

    I recently open-sourced a Daemon platform I wrote that we've used for a long time in Production.

    https://github.com/shaneharter/PHP-Daemon

    There are many similar libraries out there, no doubt. But a feature this has that readers here might find useful is a built-in timer. Suppose you wanted to run some code every second. Or maybe 5 times a second.

    Also, I built a pretty nifty auto-restart feature to side-step the (very correct) memory issues you wrote about.

    Shane

  5. Mike Munger Says:

    great tutorial, thanks!

    A few notes for anyone like me in CentOS 5.5 x64 (with root access)

    I use the webtatic repo for the latest PHP release (as of now it has 5.3.6) http://www.webtatic.com/packages/php53/

    The repo has php-cli available which includes php-pcntl so you can get that all through yum without having to compile this into PHP yourself. You may also need php-mbstring for Pheanstalk installation.

    beanstalk you can install through yum from the epel repo.

    Pheanstalk is a good PHP beanstalk lib https://github.com/pda/pheanstalk

    python (2.4.3) and supervisor you can install with yum also (either the base or epel repo, don't remember)

    and don't forget to make sure these services start when the machine is rebooted (if you need to) by adding to chkconfig:
    sudo chkconfig --level 345 beanstalkd on
    sudo chkconfig --level 345 supervisord on

    I was able to get it all working with beanstalk as the queue and supervisor as the process manager.

    Garbage collection is definitely an issue for me as we run a custom MVC framework, and my daemon uses some of the framework, which does not explicitly clean everything up as it was assumed to be a web-only framework, so I am working on getting the daemon to restart itself every 5 large tasks, or when the queue reaches empty.

    Good luck!

  6. Aaron Says:

    Thanks for this! I actually found a lot of other uses for supervisord than just managing our php workers; including running httpd itself. I had this goal of having a single amazon machine image for all parts of our app; web server/rest service, media processors, etc. I created a script that starts supervisord with a config file named after the amazon ec2 instance's "Type" tag value that we set. Very painless way to get one amazon machine image running in multiple modes using something like the Strategy Pattern. To launch a new media processor, we just launch an amazon ec2 instance and tag it with Type=MediaProcessor. Supervisord handles the rest.

  7. Brandon Says:

    I am new to daemonizing my PHP code. If I understand the article correctly, do you still need to compile PHP with pcntl if you use supervisor?

    Any chance you could publish a zip with the files used for this demo?
