ChatGPT解决这个技术问题 Extra ChatGPT

How do I implement basic "Long Polling"?

I can find lots of information on how Long Polling works (For example, this, and this), but no simple examples of how to implement this in code.

All I can find is cometd, which relies on the Dojo JS framework, and a fairly complex server system..

Basically, how would I use Apache to serve the requests, and how would I write a simple script (say, in PHP) which would "long-poll" the server for new messages?

The example doesn't have to be scaleable, secure or complete, it just needs to work!


M
Minko Gechev

It's simpler than I initially thought.. Basically you have a page that does nothing, until the data you want to send is available (say, a new message arrives).

Here is a really basic example, which sends a simple string after 2-10 seconds. 1 in 3 chance of returning an error 404 (to show error handling in the coming Javascript example)

msgsrv.php

<?php
if(rand(1,3) == 1){
    /* Fake an error */
    header("HTTP/1.0 404 Not Found");
    die();
}

/* Send a string after a random number of seconds (2-10) */
sleep(rand(2,10));
echo("Hi! Have a random number: " . rand(1,10));
?>

Note: With a real site, running this on a regular web-server like Apache will quickly tie up all the "worker threads" and leave it unable to respond to other requests.. There are ways around this, but it is recommended to write a "long-poll server" in something like Python's twisted, which does not rely on one thread per request. cometD is an popular one (which is available in several languages), and Tornado is a new framework made specifically for such tasks (it was built for FriendFeed's long-polling code)... but as a simple example, Apache is more than adequate! This script could easily be written in any language (I chose Apache/PHP as they are very common, and I happened to be running them locally)

Then, in Javascript, you request the above file (msg_srv.php), and wait for a response. When you get one, you act upon the data. Then you request the file and wait again, act upon the data (and repeat)

What follows is an example of such a page.. When the page is loaded, it sends the initial request for the msgsrv.php file.. If it succeeds, we append the message to the #messages div, then after 1 second we call the waitForMsg function again, which triggers the wait.

The 1 second setTimeout() is a really basic rate-limiter, it works fine without this, but if msgsrv.php always returns instantly (with a syntax error, for example) - you flood the browser and it can quickly freeze up. This would better be done checking if the file contains a valid JSON response, and/or keeping a running total of requests-per-minute/second, and pausing appropriately.

If the page errors, it appends the error to the #messages div, waits 15 seconds and then tries again (identical to how we wait 1 second after each message)

The nice thing about this approach is it is very resilient. If the clients internet connection dies, it will timeout, then try and reconnect - this is inherent in how long polling works, no complicated error-handling is required

Anyway, the long_poller.htm code, using the jQuery framework:

<html>
<head>
    <title>BargePoller</title>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.2.6/jquery.min.js" type="text/javascript" charset="utf-8"></script>

    <style type="text/css" media="screen">
      body{ background:#000;color:#fff;font-size:.9em; }
      .msg{ background:#aaa;padding:.2em; border-bottom:1px #000 solid}
      .old{ background-color:#246499;}
      .new{ background-color:#3B9957;}
    .error{ background-color:#992E36;}
    </style>

    <script type="text/javascript" charset="utf-8">
    function addmsg(type, msg){
        /* Simple helper to add a div.
        type is the name of a CSS class (old/new/error).
        msg is the contents of the div */
        $("#messages").append(
            "<div class='msg "+ type +"'>"+ msg +"</div>"
        );
    }

    function waitForMsg(){
        /* This requests the url "msgsrv.php"
        When it complete (or errors)*/
        $.ajax({
            type: "GET",
            url: "msgsrv.php",

            async: true, /* If set to non-async, browser shows page as "Loading.."*/
            cache: false,
            timeout:50000, /* Timeout in ms */

            success: function(data){ /* called when request to barge.php completes */
                addmsg("new", data); /* Add response to a .msg div (with the "new" class)*/
                setTimeout(
                    waitForMsg, /* Request next message */
                    1000 /* ..after 1 seconds */
                );
            },
            error: function(XMLHttpRequest, textStatus, errorThrown){
                addmsg("error", textStatus + " (" + errorThrown + ")");
                setTimeout(
                    waitForMsg, /* Try again after.. */
                    15000); /* milliseconds (15seconds) */
            }
        });
    };

    $(document).ready(function(){
        waitForMsg(); /* Start the inital request */
    });
    </script>
</head>
<body>
    <div id="messages">
        <div class="msg old">
            BargePoll message requester!
        </div>
    </div>
</body>
</html>

Couldn't some messages slip through using this idea? In that 1 second time out, say 1000 chat messages were sent, how would the server know to send the 1000 messsages specifically to that client?
Probably. This is a very simplified example, to demonstrate the concept.. To do this better you'd need more elaborate server-side code, where it would store up those 1000 messages for that specific client, and send them in one chunk. You could also safely reduce the waitForMsg timeout
nodejs is another excellent server side solution for long polling requests, with the additional advantage (over Twisted) that you can write server code in Javascript too.
This is just a plain recurrent AJAX connections to server with 1 second interval. This has nothing to do with "long polling". Long polling should keep connection alive, as long as client timeout accours.
the question is what does a real PHP script do instead of sleep(rand(2,10)); ? in order to do nothing, poll the database each 100 milisecs? when does it decide to die?
j
j0k

I've got a really simple chat example as part of slosh.

Edit: (since everyone's pasting their code in here)

This is the complete JSON-based multi-user chat using long-polling and slosh. This is a demo of how to do the calls, so please ignore the XSS problems. Nobody should deploy this without sanitizing it first.

Notice that the client always has a connection to the server, and as soon as anyone sends a message, everyone should see it roughly instantly.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Copyright (c) 2008 Dustin Sallings <dustin+html@spy.net> -->
<html lang="en">
  <head>
    <title>slosh chat</title>
    <script type="text/javascript"
      src="http://code.jquery.com/jquery-latest.js"></script>
    <link title="Default" rel="stylesheet" media="screen" href="style.css" />
  </head>

  <body>
    <h1>Welcome to Slosh Chat</h1>

    <div id="messages">
      <div>
        <span class="from">First!:</span>
        <span class="msg">Welcome to chat. Please don't hurt each other.</span>
      </div>
    </div>

    <form method="post" action="#">
      <div>Nick: <input id='from' type="text" name="from"/></div>
      <div>Message:</div>
      <div><textarea id='msg' name="msg"></textarea></div>
      <div><input type="submit" value="Say it" id="submit"/></div>
    </form>

    <script type="text/javascript">
      function gotData(json, st) {
        var msgs=$('#messages');
        $.each(json.res, function(idx, p) {
          var from = p.from[0]
          var msg = p.msg[0]
          msgs.append("<div><span class='from'>" + from + ":</span>" +
            " <span class='msg'>" + msg + "</span></div>");
        });
        // The jQuery wrapped msgs above does not work here.
        var msgs=document.getElementById("messages");
        msgs.scrollTop = msgs.scrollHeight;
      }

      function getNewComments() {
        $.getJSON('/topics/chat.json', gotData);
      }

      $(document).ready(function() {
        $(document).ajaxStop(getNewComments);
        $("form").submit(function() {
          $.post('/topics/chat', $('form').serialize());
          return false;
        });
        getNewComments();
      });
    </script>
  </body>
</html>

May I know how this is always connected ? Sorry if I am asking something silly, but I want to know that.
It does an HTTP GET and the server blocks the GET until there's data available. When data arrives to the server, the server returns the data to the client, queues whatever else might be coming in and then the client reconnects and picks up the missing messages if any, otherwise it blocks again.
May be not obvious at first, but thing is that responsible for 'always connected state' is ajaxStop with getNewComments callback there, so it just fires it at the end of every ajax request endlessly
j
johndodo

Tornado is designed for long-polling, and includes a very minimal (few hundred lines of Python) chat app in /examples/chatdemo , including server code and JS client code. It works like this:

Clients use JS to ask for an updates since (number of last message), server URLHandler receives these and adds a callback to respond to the client to a queue.

When the server gets a new message, the onmessage event fires, loops through the callbacks, and sends the messages.

The client-side JS receives the message, adds it to the page, then asks for updates since this new message ID.


G
Greg

I think the client looks like a normal asynchronous AJAX request, but you expect it to take a "long time" to come back.

The server then looks like this.

while (!hasNewData())
    usleep(50);

outputNewData();

So, the AJAX request goes to the server, probably including a timestamp of when it was last update so that your hasNewData() knows what data you have already got. The server then sits in a loop sleeping until new data is available. All the while, your AJAX request is still connected, just hanging there waiting for data. Finally, when new data is available, the server gives it to your AJAX request and closes the connection.


This is a busy wait that blocks your current thread. That doesn't scale at all.
No, usleep is not a busy wait. And the whole point of "waiting" is to block your thread for a while. Probably he meant 50 milliseconds (usleep(50000)) though, not 50 microseconds! But anyway, with a typical Apache/PHP setup, is there any other way to do this?
Well, from the priciple, you cannot make a blocking function for chat message without wait.
Great really! I built a recursive function in server to check for new data. But what is the best product to use the long polling efficiently? I use the normal Apache and the server doesn't respond when I open more than 4/5 browser tabs :( Looking for something to be used with the PHP
C
Community

Here are some classes I use for long-polling in C#. There are basically 6 classes (see below).

Controller: Processes actions required to create a valid response (db operations etc.) Processor: Manages asynch communication with the web page (itself) IAsynchProcessor: The service processes instances that implement this interface Sevice: Processes request objects that implement IAsynchProcessor Request: The IAsynchProcessor wrapper containing your response (object) Response: Contains custom objects or fields


Okay...so WHY was this voted down? These classes are indeed valid examples of long-polling.
Real long-polling is not (simply) the practice of increasing the interval wherin you conduct a normal-poll (on a resource). It is part of a larger pattern...which is "somewhat" subject to interpretation...but only in certain areas of the overall implementation. That said...these classes follow said pattern! So if you have a reason for voting this down...I truly would be interested in the reason.
Perhaps it was voted down as it does not directly address the question of a simple code example. Of course I did not vote it down so I can only guess.
S
Sean O

This is a nice 5-minute screencast on how to do long polling using PHP & jQuery: http://screenr.com/SNH

Code is quite similar to dbr's example above.


I think you should only see this as an introduction to long-polling because this implementation will for sure kill your server with many concurrent users.
im just learning about all this...how dependable, or not, is it with a few users...say 10 chatting back n forth?
d
dbr

Here is a simple long-polling example in PHP by Erik Dubbelboer using the Content-type: multipart/x-mixed-replace header:

<?

header('Content-type: multipart/x-mixed-replace; boundary=endofsection');

// Keep in mind that the empty line is important to separate the headers
// from the content.
echo 'Content-type: text/plain

After 5 seconds this will go away and a cat will appear...
--endofsection
';
flush(); // Don't forget to flush the content to the browser.


sleep(5);


echo 'Content-type: image/jpg

';

$stream = fopen('cat.jpg', 'rb');
fpassthru($stream);
fclose($stream);

echo '
--endofsection
';

And here is a demo:

http://dubbelboer.com/multipart.php


a
adam

I used this to get to grips with Comet, I have also set up Comet using the Java Glassfish server and found lots of other examples by subscribing to cometdaily.com


D
Denis

Take a look at this blog post which has code for a simple chat app in Python/Django/gevent.


R
Ryan Henderson

Below is a long polling solution I have developed for Inform8 Web. Basically you override the class and implement the loadData method. When the loadData returns a value or the operation times out it will print the result and return.

If the processing of your script may take longer than 30 seconds you may need to alter the set_time_limit() call to something longer.

Apache 2.0 license. Latest version on github https://github.com/ryanhend/Inform8/blob/master/Inform8-web/src/config/lib/Inform8/longpoll/LongPoller.php

Ryan

abstract class LongPoller {

  protected $sleepTime = 5;
  protected $timeoutTime = 30;

  function __construct() {
  }


  function setTimeout($timeout) {
    $this->timeoutTime = $timeout;
  }

  function setSleep($sleep) {
    $this->sleepTime = $sleepTime;
  }


  public function run() {
    $data = NULL;
    $timeout = 0;

    set_time_limit($this->timeoutTime + $this->sleepTime + 15);

    //Query database for data
    while($data == NULL && $timeout < $this->timeoutTime) {
      $data = $this->loadData();
      if($data == NULL){

        //No new orders, flush to notify php still alive
        flush();

        //Wait for new Messages
        sleep($this->sleepTime);
        $timeout += $this->sleepTime;
      }else{
        echo $data;
        flush();
      }
    }

  }


  protected abstract function loadData();

}

b
brightball

This is one of the scenarios that PHP is a very bad choice for. As previously mentioned, you can tie up all of your Apache workers very quickly doing something like this. PHP is built for start, execute, stop. It's not built for start, wait...execute, stop. You'll bog down your server very quickly and find that you have incredible scaling problems.

That said, you can still do this with PHP and have it not kill your server using the nginx HttpPushStreamModule: http://wiki.nginx.org/HttpPushStreamModule

You setup nginx in front of Apache (or whatever else) and it will take care of holding open the concurrent connections. You just respond with payload by sending data to an internal address which you could do with a background job or just have the messages fired off to people that were waiting whenever the new requests come in. This keeps PHP processes from sitting open during long polling.

This is not exclusive to PHP and can be done using nginx with any backend language. The concurrent open connections load is equal to Node.js so the biggest perk is that it gets you out of NEEDING Node for something like this.

You see a lot of other people mentioning other language libraries for accomplishing long polling and that's with good reason. PHP is just not well built for this type of behavior naturally.


Is this an Apache problem or a PHP problem? Would I have issues with long polling if my PHP code ran directly on nginx or lighttpd?
It's less a PHP problem and more a PHP misuse. On every request PHP runs the script from scratch, loading libraries as needed, executing its code and then shutting down while garbage collecting everything started in the request. A lot of modifications have been made to PHP over the years to minimize the impact like late static bindings, lazy loading, in memory bytecode caches to remove disk I/O, etc. The problem remains that PHP is intended to start and stop as quickly as possible. Languages that will load once/boot and open a thread for the request are much better suited for long polling.
But to answer the question, yes you would experience the issue regardless of whether you were using Apache or something else. It's just how PHP works. I should amend this to say that, if you are going to have a known max traffic load PHP will be fine. I've seen embedded systems using PHP that have no issues because there's only a couple of connections. Potentially on a company intranet this could also be passable. For public facing applications though, you'll absolutely kill your servers as traffic grows.
x
xoblau

Thanks for the code, dbr. Just a small typo in long_poller.htm around the line

1000 /* ..after 1 seconds */

I think it should be

"1000"); /* ..after 1 seconds */

for it to work.

For those interested, I tried a Django equivalent. Start a new Django project, say lp for long polling:

django-admin.py startproject lp

Call the app msgsrv for message server:

python manage.py startapp msgsrv

Add the following lines to settings.py to have a templates directory:

import os.path
PROJECT_DIR = os.path.dirname(__file__)
TEMPLATE_DIRS = (
    os.path.join(PROJECT_DIR, 'templates'),
)

Define your URL patterns in urls.py as such:

from django.views.generic.simple import direct_to_template
from lp.msgsrv.views import retmsg

urlpatterns = patterns('',
    (r'^msgsrv\.php$', retmsg),
    (r'^long_poller\.htm$', direct_to_template, {'template': 'long_poller.htm'}),
)

And msgsrv/views.py should look like:

from random import randint
from time import sleep
from django.http import HttpResponse, HttpResponseNotFound

def retmsg(request):
    if randint(1,3) == 1:
        return HttpResponseNotFound('<h1>Page not found</h1>')
    else:
        sleep(randint(2,10))
        return HttpResponse('Hi! Have a random number: %s' % str(randint(1,10)))

Lastly, templates/long_poller.htm should be the same as above with typo corrected. Hope this helps.


Actually, "15000" is the syntax error. setTimeout takes an integer as its 2nd parameter.
This answer needs work. It is the culmination of one or more comments and a separate answer or answers.
C
Community

Why not consider the web sockets instead of long polling? They are much efficient and easy to setup. However they are supported only in modern browsers. Here is a quick reference.


I think once websockets are implemented everywhere (probably not for years to come) they will be the standard for this sort of application. Unfortunately for now, we can't rely on them for production apps.
@Richard You can however use something like Socket.IO which provides automatic fallback transports, providing web-socket-like functionality all the way down to IE 6.
C
Community

The WS-I group published something called "Reliable Secure Profile" that has a Glass Fish and .NET implementation that apparently inter-operate well.

With any luck there is a Javascript implementation out there as well.

There is also a Silverlight implementation that uses HTTP Duplex. You can connect javascript to the Silverlight object to get callbacks when a push occurs.

There are also commercial paid versions as well.


m
makerofthings7

For a ASP.NET MVC implementation, look at SignalR which is available on NuGet.. note that the NuGet is often out of date from the Git source which gets very frequent commits.

Read more about SignalR on a blog on by Scott Hanselman


i
ideawu

You can try icomet(https://github.com/ideawu/icomet), a C1000K C++ comet server built with libevent. icomet also provides a JavaScript library, it is easy to use as simple as

var comet = new iComet({
    sign_url: 'http://' + app_host + '/sign?obj=' + obj,
    sub_url: 'http://' + icomet_host + '/sub',
    callback: function(msg){
        // on server push
        alert(msg.content);
    }
});

icomet supports a wide range of Browsers and OSes, including Safari(iOS, Mac), IEs(Windows), Firefox, Chrome, etc.


s
sp3c1

Simplest NodeJS

const http = require('http');

const server = http.createServer((req, res) => {
  SomeVeryLongAction(res);
});

server.on('clientError', (err, socket) => {
  socket.end('HTTP/1.1 400 Bad Request\r\n\r\n');
});

server.listen(8000);

// the long running task - simplified to setTimeout here
// but can be async, wait from websocket service - whatever really
function SomeVeryLongAction(response) {
  setTimeout(response.end, 10000);
}

Production wise scenario in Express for exmaple you would get response in the middleware. Do you what you need to do, can scope out all of the long polled methods to Map or something (that is visible to other flows), and invoke <Response> response.end() whenever you are ready. There is nothing special about long polled connections. Rest is just how you normally structure your application.

If you dont know what i mean by scoping out, this should give you idea

const http = require('http');
var responsesArray = [];

const server = http.createServer((req, res) => {
  // not dealing with connection
  // put it on stack (array in this case)
  responsesArray.push(res);
  // end this is where normal api flow ends
});

server.on('clientError', (err, socket) => {
  socket.end('HTTP/1.1 400 Bad Request\r\n\r\n');
});

// and eventually when we are ready to resolve
// that if is there just to ensure you actually 
// called endpoint before the timeout kicks in
function SomeVeryLongAction() {
  if ( responsesArray.length ) {
    let localResponse = responsesArray.shift();
    localResponse.end();
  }
}

// simulate some action out of endpoint flow
setTimeout(SomeVeryLongAction, 10000);
server.listen(8000);

As you see, you could really respond to all connections, one, do whatever you want. There is id for every request so you should be able to use map and access specific out of api call.