
Run PHP Task Asynchronously

I work on a somewhat large web application, and the backend is mostly in PHP. There are several places in the code where I need to complete some task, but I don't want to make the user wait for the result. For example, when creating a new account, I need to send them a welcome email. But when they hit the 'Finish Registration' button, I don't want to make them wait until the email is actually sent, I just want to start the process, and return a message to the user right away.

Up until now, in some places I've been using what feels like a hack with exec(). Basically doing things like:

exec("doTask.php $arg1 $arg2 $arg3 >/dev/null 2>&1 &");

Which appears to work, but I'm wondering if there's a better way. I'm considering writing a system which queues up tasks in a MySQL table, and a separate long-running PHP script that queries that table once a second, and executes any new tasks it finds. This would also have the advantage of letting me split the tasks among several worker machines in the future if I needed to.
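A minimal sketch of the table-backed queue described above, assuming a PDO connection and a hypothetical `tasks` table with `id` (auto-increment), `type`, `payload` and `status` columns. The function names are illustrative and not taken from any library.

```php
/** Add a task to the queue and return its id. */
function enqueueTask(PDO $db, string $type, array $payload): int
{
    $stmt = $db->prepare(
        "INSERT INTO tasks (type, payload, status) VALUES (?, ?, 'pending')"
    );
    $stmt->execute([$type, json_encode($payload)]);
    return (int) $db->lastInsertId();
}

/**
 * Claim the oldest pending task, or return null if there is nothing to claim.
 * The conditional UPDATE ensures that only one worker wins a given task.
 */
function claimNextTask(PDO $db): ?array
{
    $row = $db->query(
        "SELECT id, type, payload FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
    )->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return null; // queue is empty
    }

    $claim = $db->prepare(
        "UPDATE tasks SET status = 'running' WHERE id = ? AND status = 'pending'"
    );
    $claim->execute([$row['id']]);
    if ($claim->rowCount() !== 1) {
        return null; // another worker claimed it first
    }

    $row['payload'] = json_decode($row['payload'], true);
    return $row;
}

/** Mark a claimed task as finished. */
function completeTask(PDO $db, int $id): void
{
    $db->prepare("UPDATE tasks SET status = 'done' WHERE id = ?")->execute([$id]);
}
```

The web request would only call enqueueTask(). The long-running worker script would call claimNextTask() in a loop, sleep for a second whenever it returns null, and call completeTask() once the work is done. Because the claim is a conditional UPDATE, several workers on different machines can poll the same table.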

Am I re-inventing the wheel? Is there a better solution than the exec() hack or the MySQL queue?


Paul Dixon

I've used the queuing approach, and it works well as you can defer that processing until your server load is idle, letting you manage your load quite effectively if you can partition off "tasks which aren't urgent" easily.

Rolling your own isn't too tricky, here's a few other options to check out:

Gearman - this answer was written in 2009, and since then Gearman looks like a popular option; see comments below.

ActiveMQ if you want a full blown open source message queue.

ZeroMQ - this is a pretty cool socket library which makes it easy to write distributed code without having to worry too much about the socket programming itself. You could use it for message queuing on a single host - you would simply have your webapp push something to a queue that a continuously running console app would consume at the next suitable opportunity

beanstalkd - only found this one while writing this answer, but looks interesting

dropr is a PHP based message queue project, but hasn't been actively maintained since Sep 2010

php-enqueue is a recently (2017) maintained wrapper around a variety of queue systems

Finally, a blog post about using memcached for message queuing

Another, perhaps simpler, approach is to use ignore_user_abort - once you've sent the page to the user, you can do your final processing without fear of premature termination, though this does have the effect of appearing to prolong the page load from the user perspective.


Thanks for all the tips. The specific one about ignore_user_abort doesn't really help in my case, my whole goal is to avoid unnecessary delays for the user.
If you set the Content-Length HTTP header in your "Thank You For Registering" response, then the browser should close the connection after the specified number of bytes are received. This leaves the server side process running (assuming that ignore_user_abort is set) without making the end user wait. Of course you will need to calculate the size of your response content before rendering the headers, but that's pretty easy for short responses.
Gearman (gearman.org) is a great open source message queue that is cross platform. You can write workers in C, PHP, Perl or just about any other language. There are Gearman UDF plugins for MySQL and you can also use Net_Gearman from PHP or the gearman pear client.
Gearman would be what I would recommend today (in 2015) over any custom work queueing system.
Another option is to set up a node js server to handle a request and return a fast response with a task in between. Many things inside a node js script are executed asynchronously such as a http request.
Timm

When you just want to execute one or several HTTP requests without having to wait for the response, there is a simple PHP solution, as well.

In the calling script:

$socketcon = fsockopen($host, 80, $errno, $errstr, 10);
if($socketcon) {   
   $socketdata = "GET $remote_house/script.php?parameters=... HTTP/1.1\r\nHost: $host\r\nConnection: Close\r\n\r\n";
   fwrite($socketcon, $socketdata); 
   fclose($socketcon);
}
// repeat this with different parameters as often as you like

On the called script.php, you can invoke these PHP functions in the first lines:

ignore_user_abort(true);
set_time_limit(0);

This causes the script to continue running without time limit when the HTTP connection is closed.


set_time_limit() has no effect if PHP runs in safe mode.
rojoca

Another way to fork processes is via curl. You can set up your internal tasks as a webservice. For example:

http://domain/tasks/t1

http://domain/tasks/t2

Then in your user accessed scripts make calls to the service:

$service->addTask('t1', $data); // post data to URL via curl

Your service can keep track of the queue of tasks with MySQL or whatever you like. The point is that it's all wrapped up within the service, and your script is just consuming URLs. This frees you up to move the service to another machine/server if necessary (i.e. it is easily scalable).
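As a rough sketch of what the `$service` object above might look like, assuming the cURL extension is installed: the class name `TaskService`, the `/tasks/<name>` URL layout and the 2 second timeout are all illustrative assumptions, not part of any existing library.

```php
/** Hypothetical client for an internal task web service. */
class TaskService
{
    private $baseUrl;

    public function __construct(string $baseUrl)
    {
        $this->baseUrl = rtrim($baseUrl, '/');
    }

    /** Build the URL a task is posted to, e.g. http://domain/tasks/t1. */
    public function taskUrl(string $task): string
    {
        return $this->baseUrl . '/tasks/' . rawurlencode($task);
    }

    /** POST the task data; returns true if the service answered with a 2xx status. */
    public function addTask(string $task, array $data): bool
    {
        $ch = curl_init($this->taskUrl($task));
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 2); // do not let a slow service stall the page
        curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        return $status >= 200 && $status < 300;
    }
}
```

For this to feel asynchronous, the service itself has to answer quickly, for example by only recording the task in its queue and doing the real work later.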

Adding http authorization or a custom authorization scheme (like Amazon's web services) lets you open up your tasks to be consumed by other people/services (if you want) and you could take it further and add a monitoring service on top to keep track of queue and task status.

http://domain/queue?task=t1

http://domain/queue?task=t2

http://domain/queue/t1/100931

It does take a bit of set-up work but there are a lot of benefits.


I do not like this approach because it overloads the web server
I don't see how you get around that if you use one server. And how would you get around that if you had more than one? So really, this answer is the only way to not load this work on the webserver.
Nisse Engström

If it is just a question of running expensive tasks, and php-fpm is available, why not use the fastcgi_finish_request() function?

This function flushes all response data to the client and finishes the request. This allows for time consuming tasks to be performed without leaving the connection to the client open.

This isn't really asynchronous; you simply order the work like this:

Run all your main code first. Call fastcgi_finish_request(). Then do all the heavy work.

Once again php-fpm is needed.
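A small sketch of that ordering. The helper name `respondThenRun` is made up for illustration, and the function_exists() check is only there so the same code still runs (without the early finish) under SAPIs that lack fastcgi_finish_request().

```php
/**
 * Send the response, finish the request if running under php-fpm,
 * then run the slow work while the client is no longer waiting.
 */
function respondThenRun(string $response, callable $heavyWork): void
{
    echo $response;

    if (function_exists('fastcgi_finish_request')) {
        // Flushes all output to the client and closes the connection.
        fastcgi_finish_request();
    }

    // Anything from here on no longer delays the user (under php-fpm).
    $heavyWork();
}
```

Usage could look like respondThenRun('Thank you for registering', function () { /* send the welcome email */ });. Note that the php-fpm worker process stays busy until the heavy work returns, so this does not free up server capacity.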


Alister Bulman

I've used Beanstalkd for one project, and planned to again. I've found it to be an excellent way to run asynchronous processes.

A couple of things I've done with it are:

Image resizing - and with a lightly loaded queue passing off to a CLI-based PHP script, resizing large (2mb+) images worked just fine, but trying to resize the same images within a mod_php instance was regularly running into memory-space issues (I limited the PHP process to 32MB, and the resizing took more than that)

near-future checks - beanstalkd has delays available to it (make this job available to run only after X seconds) - so I can fire off 5 or 10 checks for an event, a little later in time

I wrote a Zend-Framework based system to decode a 'nice' url, so for example, to resize an image it would call QueueTask('/image/resize/filename/example.jpg'). The URL was first decoded to an array(module,controller,action,parameters), and then converted to JSON for injection to the queue itself.

A long running cli script then picked up the job from the queue, ran it (via Zend_Router_Simple), and if required, put information into memcached for the website PHP to pick up as required when it was done.

One wrinkle I did also put in was that the cli-script only ran for 50 loops before restarting, but if it did want to restart as planned, it would do so immediately (being run via a bash-script). If there was a problem and I did exit(0) (the default value for exit; or die();) it would first pause for a couple of seconds.
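To make the shape of such a long-running CLI worker more concrete, here is a hedged sketch. It is not the Zend-based code described above: `$reserve` stands in for whatever queue client is used (it should return the next job as a JSON string, or null if something went wrong), and the exit code 97 for a planned restart is an arbitrary assumption that a wrapper shell script would have to agree on.

```php
/**
 * Process at most $maxJobs jobs and return an exit code for the wrapper script.
 *
 * $reserve: callable returning the next job as a JSON string, or null on a problem.
 * $handle:  callable receiving the decoded job array.
 */
function runWorker(callable $reserve, callable $handle, int $maxJobs = 50, int $restartCode = 97): int
{
    for ($done = 0; $done < $maxJobs; $done++) {
        $json = $reserve();
        if ($json === null) {
            return 0; // unplanned stop: the wrapper pauses before restarting
        }
        $handle(json_decode($json, true));
    }

    // Planned restart after $maxJobs jobs, which keeps memory use in check.
    return $restartCode;
}
```

The CLI script would end with exit(runWorker(...)), and the bash wrapper would restart it immediately on the planned-restart code and sleep for a couple of seconds on any other code.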


I like the look of beanstalkd, once they add persistence I think it will be perfect.
That's already in the codebase and being stabilised. I'm also looking forward to 'named jobs', so I can throw things in there, but know it won't be added if there's already one there. Good for regular events.
@AlisterBulman could you give more information or examples for "A long running cli script then picked up the job from the queue". I'm trying to build such a CLI script for my application.
Andrew Moore

Here is a simple class I coded for my web application. It allows for forking PHP scripts and other scripts. Works on UNIX and Windows.

class BackgroundProcess {
    static function open($exec, $cwd = null) {
        if (!is_string($cwd)) {
            $cwd = @getcwd();
        }

        @chdir($cwd);

        if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
            $WshShell = new COM("WScript.Shell");
            $WshShell->CurrentDirectory = str_replace('/', '\\', $cwd);
            $WshShell->Run($exec, 0, false);
        } else {
            exec($exec . " > /dev/null 2>&1 &");
        }
    }

    static function fork($phpScript, $phpExec = null) {
        $cwd = dirname($phpScript);

        @putenv("PHP_FORCECLI=true");

        if (!is_string($phpExec) || !file_exists($phpExec)) {
            if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
                $phpExec = str_replace('/', '\\', dirname(ini_get('extension_dir'))) . '\php.exe';

                if (@file_exists($phpExec)) {
                    BackgroundProcess::open(escapeshellarg($phpExec) . " " . escapeshellarg($phpScript), $cwd);
                }
            } else {
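                $phpExec = exec("which php-cli");

                // strncmp() avoids an "uninitialized string offset" error when `which` prints nothing
                if (strncmp($phpExec, '/', 1) !== 0) {
                    $phpExec = exec("which php");
                }

                if (strncmp($phpExec, '/', 1) === 0) {
                    BackgroundProcess::open(escapeshellarg($phpExec) . " " . escapeshellarg($phpScript), $cwd);
                }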
            }
        } else {
            if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
                $phpExec = str_replace('/', '\\', $phpExec);
            }

            BackgroundProcess::open(escapeshellarg($phpExec) . " " . escapeshellarg($phpScript), $cwd);
        }
    }
}

Omar Aziz

PHP HAS multithreading, it's just not enabled by default. There is an extension called pthreads which does exactly that. You'll need PHP compiled with ZTS (Zend Thread Safety) though. Links:

Examples

Another tutorial

pthreads PECL Extension

UPDATE: since PHP 7.2, the parallel extension comes into play

Tutorial/Example

reference manual


obsolete now, replaced by parallel.
@T.Todua, thank you. Updated the answer to stay relevant!
Darryl Hein

This is the same method I have been using for a couple of years now and I haven't seen or found anything better. As people have said, PHP is single threaded, so there isn't much else you can do.

I have actually added one extra level to this and that's getting and storing the process id. This allows me to redirect to another page and have the user sit on that page, using AJAX to check if the process is complete (process id no longer exists). This is useful for cases where the length of the script would cause the browser to timeout, but the user needs to wait for that script to complete before the next step. (In my case it was processing large ZIP files with CSV like files that add up to 30 000 records to the database after which the user needs to confirm some information.)
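A hedged sketch of the process-id idea on a Unix-like system. The function names are illustrative, the command is assumed to come from trusted code (never from user input), and the /proc fallback only works on Linux.

```php
/** Start a shell command in the background and return its process id (Unix only). */
function startBackground(string $command): int
{
    // "echo $!" prints the PID of the job the shell just put in the background.
    $output = [];
    exec($command . ' > /dev/null 2>&1 & echo $!', $output);
    return (int) ($output[0] ?? 0);
}

/** Check whether a process with this PID still exists. */
function isRunning(int $pid): bool
{
    if ($pid <= 0) {
        return false;
    }
    if (function_exists('posix_kill')) {
        // Signal 0 sends nothing; it only tests whether the process can be signalled.
        return posix_kill($pid, 0);
    }
    return file_exists('/proc/' . $pid); // Linux-only fallback
}
```

The page that starts the job would store the PID (for example in the session), and the AJAX endpoint would simply report isRunning($pid). Since PIDs can be reused by the operating system, a completion flag written by the job itself is a more reliable signal for anything long-lived.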

I have also used a similar process for report generation. I'm not sure I'd use "background processing" for something such as an email, unless there is a real problem with a slow SMTP. Instead I might use a table as a queue and then have a process that runs every minute to send the emails within the queue. You would need to be wary of sending emails twice or other similar problems. I would consider a similar queueing process for other tasks as well.


Which method are you referring to in your first sentence?
Kjeld

It's a great idea to use cURL as suggested by rojoca.

Here is an example. You can monitor text.txt while the script is running in background:

<?php

function doCurl($begin)
{
    echo "Do curl<br />\n";
    $url = 'http://'.$_SERVER['SERVER_NAME'].$_SERVER['REQUEST_URI'];
    $url = preg_replace('/\?.*/', '', $url);
    $url .= '?begin='.$begin;
    echo 'URL: '.$url.'<br>';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch);
    echo 'Result: '.$result.'<br>';
    curl_close($ch);
}


if (empty($_GET['begin'])) {
    doCurl(1);
}
else {
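    // Discard any output buffers that are already open
    while (ob_get_level())
        ob_end_clean();
    header('Connection: close');
    ignore_user_abort(true); // keep running after the client disconnects
    ob_start();
    echo 'Connection Closed';
    $size = ob_get_length();
    header("Content-Length: $size"); // tells the client when the response is complete
    ob_end_flush();
    flush();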

    $begin = $_GET['begin'];
    $fp = fopen("text.txt", "w");
    fprintf($fp, "begin: %d\n", $begin);
    for ($i = 0; $i < 15; $i++) {
        sleep(1);
        fprintf($fp, "i: %d\n", $i);
    }
    fclose($fp);
    if ($begin < 10)
        doCurl($begin + 1);
}

?>

It would really help if the source code would be commented. I have no idea what's going on in there and which parts are example and which parts are re-usable for my own purpose.
yogibear

There is a PHP extension, called Swoole.

Although it might not be enabled by default, my hosting lets me enable it at the click of a button.

Worth checking it out. I haven't had time to use it yet, as I was searching here for info, when I stumbled across it and thought it worth sharing.


Peter D

Unfortunately PHP does not have any kind of native threading capabilities. So I think in this case you have no choice but to use some kind of custom code to do what you want to do.

If you search around the net for PHP threading stuff, some people have come up with ways to simulate threads on PHP.


Peter

If you set the Content-Length HTTP header in your "Thank You For Registering" response, then the browser should close the connection after the specified number of bytes are received. This leaves the server side process running (assuming that ignore_user_abort is set) so it can finish working without making the end user wait.

Of course you will need to calculate the size of your response content before rendering the headers, but that's pretty easy for short responses (write output to a string, call strlen(), call header(), render string).
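The steps in parentheses could be sketched roughly as follows. The helper names are made up for illustration, and whether the browser really stops waiting depends on the web server not buffering or compressing the output, which this code cannot control.

```php
/** Headers that tell the client exactly how many bytes to expect. */
function responseHeaders(string $body): array
{
    return [
        'Connection: close',
        'Content-Length: ' . strlen($body), // strlen() counts bytes, which is what HTTP needs
    ];
}

/** Send the complete response now and keep the script running afterwards. */
function sendAndContinue(string $body): void
{
    ignore_user_abort(true); // do not stop when the client disconnects

    foreach (responseHeaders($body) as $line) {
        header($line);
    }
    echo $body;

    // Push the output through PHP's buffers to the web server.
    while (ob_get_level() > 0) {
        ob_end_flush();
    }
    flush();
}
```

After sendAndContinue('Thank you for registering') returns, the script can carry on with the slow work, such as sending the email.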

This approach has the advantage of not forcing you to manage a "front end" queue, and although you may need to do some work on the back end to prevent racing HTTP child processes from stepping on each other, that's something you needed to do already, anyway.


This doesn't seem to work. When I use header('Content-Length: 3'); echo '1234'; sleep(5); then even though the browser takes only 3 chars, it still waits for 5 seconds before showing the response. What am I missing?
@ThomasTempelmann - You probably need to call flush() to force the output to actually be rendered immediately, otherwise the output will be buffered until your script exits or enough data is sent to STDOUT to flush the buffer.
I already tried many ways to flush, found here on SO. None help. And the data appears to be sent non-gzipped, too, as one can tell from phpinfo(). The only other thing I could imagine is that I need to reach a minimum buffer size first, e.g. 256 or so bytes.
@ThomasTempelmann - I don't see anything in your question or my answer about gzip (it usually makes sense to get the simplest scenario working first before adding layers of complexity). In order to establish when the server is actually sending data you can use a packet sniffer or browser plugin (like fiddler, tamperdata, etc.). Then, if you find that the webserver is really holding all script output until exit regardless of flushing, then you need to modify your webserver configuration (there's nothing that your PHP script can do in that case).
I use a virtual web service, so I have little control over its configuration. I was hoping to find other suggestions on what could be the culprit, but it seems that your answer simply isn't as universally applicable as it appears. Too many things can go wrong, obviously. Your solution surely is much easier to implement than all the other answers given here. Too bad it doesn't work for me.
phpPhil

If you don't want the full-blown ActiveMQ, I recommend considering RabbitMQ. RabbitMQ is a lightweight message broker that uses the AMQP standard.

I also recommend looking into php-amqplib, a popular AMQP client library for accessing AMQP-based message brokers.


Community

I think you should try this technique. It lets you call as many pages as you like; all of them run at once, independently, without waiting for each page's response.

cornjobpage.php //mainpage

<?php

post_async("http://localhost/projectname/testpage.php", "Keywordname=testValue");
//post_async("http://localhost/projectname/testpage.php", "Keywordname=testValue2");
//post_async("http://localhost/projectname/otherpage.php", "Keywordname=anyValue");
// Call as many pages as you like; each runs independently, and this page does not wait for any response.

/*
 * Requests a PHP page asynchronously so the current page does not have to wait for it to finish running.
 */
function post_async($url, $params)
{
    $parts = parse_url($url);

    $fp = fsockopen($parts['host'],
        isset($parts['port']) ? $parts['port'] : 80,
        $errno, $errstr, 30);
    if (!$fp) {
        return false; // could not connect; $errno and $errstr hold the reason
    }

    // The parameters travel in the query string, so this GET request has no body
    // and needs no Content-Type or Content-Length header. To use POST instead,
    // send the parameters as the request body and add both headers.
    $out  = "GET " . $parts['path'] . "?" . $params . " HTTP/1.1\r\n";
    $out .= "Host: " . $parts['host'] . "\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    fclose($fp);
    return true;
}
?>

testpage.php

<?php
echo $_REQUEST["Keywordname"]; // case 1 output: testValue
?>

PS: if you want to send URL parameters in a loop, follow this answer: https://stackoverflow.com/a/41225209/6295712


Greg Glockner

Spawning new processes on the server using exec(), or directly on another server using curl, doesn't scale all that well. With exec() you are basically filling your server with long-running processes that could be handled by other, non-web-facing servers, and using curl ties up another server unless you build in some sort of load balancing.

I have used Gearman in a few situations and I find it better for this sort of use case. I can use a single job queue server to basically handle queuing of all the jobs needing to be done by the server and spin up worker servers, each of which can run as many instances of the worker process as needed, and scale up the number of worker servers as needed and spin them down when not needed. It also lets me shut down the worker processes entirely when needed and queues the jobs up until the workers come back online.
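For illustration, a minimal sketch of the two sides, assuming the PECL gearman extension. The wrapper functions `queueJob` and `serveJobs`, the job name and the JSON payload format are my own assumptions; only doBackground(), addFunction(), work() and workload() are calls from the extension.

```php
/** Queue a job without waiting for the result; $client is expected to be a GearmanClient. */
function queueJob($client, string $function, array $payload): string
{
    // doBackground() returns a job handle as soon as the job server has accepted the job.
    return $client->doBackground($function, json_encode($payload));
}

/** Register a handler and process jobs until work() fails; $worker is expected to be a GearmanWorker. */
function serveJobs($worker, string $function, callable $handler): void
{
    $worker->addFunction($function, function ($job) use ($handler) {
        $handler(json_decode($job->workload(), true));
    });

    while ($worker->work()) {
        // Each iteration waits for one job and runs its handler.
    }
}
```

On the web side this might be used as $client = new GearmanClient(); $client->addServer(); queueJob($client, 'send_welcome_email', ['user_id' => 42]);. The worker script, running on any machine that can reach the job server, would create a GearmanWorker, call addServer() and then serveJobs().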


Marc W

PHP is a single-threaded language, so there is no official way to start an asynchronous process with it other than using exec or popen. There is a blog post about that here. Your idea for a queue in MySQL is a good idea as well.

Your specific requirement here is for sending an email to the user. I'm curious as to why you are trying to do that asynchronously since sending an email is a pretty trivial and quick task to perform. I suppose if you are sending tons of email and your ISP is blocking you on suspicion of spamming, that might be one reason to queue, but other than that I can't think of any reason to do it this way.


The email was just an example, since the other tasks are more complex to explain, and it's not really the point of the question. The way we used to send email, the email command wouldn't return until the remote server accepted the mail. We found that some mail servers were configured to add long delays (like 10-20 second delays) before accepting mail (probably to fight spambots), and these delays would then be passed onto our users. Now, we are using a local mailserver to queue up the mails to be sent, so this particular one doesn't apply, but we have other tasks of similar nature.
For example: sending emails through Google Apps Smtp with ssl and port 465 takes longer than usual.