
Multiple cURL Requests with PHP

PHP has a set of cURL functions that let your script download other webpages. If you use cURL to scrape data or build mashups, you may need to fetch more than one page. Fetching them one after another can be slow: each curl_exec call blocks until its response arrives, so the waits add up and can add seconds to your own script's runtime.

Enter curl_multi_init. This family of functions allows you to combine cURL handles and execute them simultaneously.

  // this example does NOT use simultaneous requests, it must wait for each response
  
  // request 1
  $ch = curl_init('http://webservice.one.com/');  // initialize the request
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // store the page contents
  $response_1 = curl_exec($ch);                   // actually make the request
  
  // request 2
  $ch = curl_init('http://webservice.two.com/');
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  $response_2 = curl_exec($ch);
  
  // normally you would process your results here
  echo "$response_1 $response_2";

  // with curl_multi, you only have to wait for the longest-running request
  
  // build the individual requests as above, but do not execute them
  $ch_1 = curl_init('http://webservice.one.com/');
  $ch_2 = curl_init('http://webservice.two.com/');
  curl_setopt($ch_1, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch_2, CURLOPT_RETURNTRANSFER, true);
  
  // build the multi-curl handle, adding both $ch
  $mh = curl_multi_init();
  curl_multi_add_handle($mh, $ch_1);
  curl_multi_add_handle($mh, $ch_2);
  
  // execute all queries simultaneously, and continue when all are complete
  $running = null;
  do {
    curl_multi_exec($mh, $running);
    if ($running) {
      curl_multi_select($mh); // wait for network activity instead of spinning the CPU
    }
  } while ($running);
  
  // all of our requests are done, we can now access the results
  $response_1 = curl_multi_getcontent($ch_1);
  $response_2 = curl_multi_getcontent($ch_2);
  echo "$response_1 $response_2"; // same output as first example
  

If both websites take one second to return, the first example waits about two seconds in total while the second waits about one, so we cut our waiting time roughly in half. Sweet!
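The same pattern can be wrapped in a reusable function that also reports which transfers failed. This is only a sketch: `fetch_all()` is a hypothetical helper name (not part of PHP), the 10-second timeout is an arbitrary assumption, and the error handling is deliberately minimal. It uses the standard `curl_multi_select()`, `curl_multi_info_read()`, `curl_multi_remove_handle()` and `curl_multi_close()` functions.

```php
<?php
// Hypothetical helper: fetch several URLs in parallel.
// Returns an array with the same keys as $urls; failed transfers map to false.
function fetch_all(array $urls, $timeout = 10) {
    $mh = curl_multi_init();
    $handles = array();
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // assumed limit, tune to taste
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // drive the transfers; sleep inside curl_multi_select() instead of spinning
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select can fail on some builds; back off briefly
        }
    } while ($running > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    // collect the handles whose transfer ended with an error code
    $failed = array();
    while (($info = curl_multi_info_read($mh)) !== false) {
        if ($info['result'] !== CURLE_OK) {
            $failed[] = $info['handle'];
        }
    }

    $responses = array();
    foreach ($handles as $key => $ch) {
        $responses[$key] = in_array($ch, $failed, true)
            ? false
            : curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch); // detach before closing the multi handle
    }
    curl_multi_close($mh);
    return $responses;
}
```

With the two services from the example, a call might look like `$pages = fetch_all(array('one' => 'http://webservice.one.com/', 'two' => 'http://webservice.two.com/'));`, after which `$pages['one']` holds either the page body or `false`.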

In Action: Twitter

Here's an example where we run multiple Twitter searches and combine the results to display them on our own site.

As a bonus, it also caches the results for 1 minute so we avoid hitting Twitter's rate limit if we get a ton of visitors at the same time. You can change $minutes to any number you feel comfortable with, but keep some caching in place: without it, a burst of traffic can exhaust the rate limit and leave you with a completely blank list, which is precisely the worst time to kill your content.

function tweets() {
    
    // check cache
    $cache = 'twitter-search.txt';
    if (file_exists($cache)) {
        clearstatcache();
        $minutes = 1; // how long to wait before refreshing the cache
        if (filemtime($cache) > (time() - (60 * $minutes))) {
            return file_get_contents($cache);
        }
    }
    
    // we are going to search for tweets mentioning these keywords
    $keywords = array(
        'javascript',
        'html5',
        'css3'
    );
    
    // build the requests
    $ch = array();
    $mh = curl_multi_init();
    for ($i = 0; $i < count($keywords); $i++) {
        $keyword = $keywords[$i];
        $ch[$i] = curl_init();
        curl_setopt($ch[$i], CURLOPT_URL, 
                'http://search.twitter.com/search.json?rpp=3&q=' . urlencode($keyword));
        curl_setopt($ch[$i], CURLOPT_USERAGENT, 
                'Twitter requires you to set a user agent, any value works here.');
        curl_setopt($ch[$i], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch[$i], CURLOPT_HEADER, false);
        curl_multi_add_handle($mh, $ch[$i]);
    }
    
    // execute the requests simultaneously
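    $running = 0;
    do {
        curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh); // wait for network activity instead of spinning the CPU
        }
    } while ($running > 0);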
    
    // display the results
    $output = '';
    for ($i = 0; $i < count($keywords); $i++) {
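        // $results contains this keyword's tweets as an associative array;
        // reset() takes its argument by reference, so decode into a variable first
        $decoded = json_decode(curl_multi_getcontent($ch[$i]), true);
        $results = is_array($decoded) ? reset($decoded) : array();
        $resultCount = is_array($results) ? count($results) : 0; // 0 if the request failed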
        
        // link to our keyword
        $output .= '<dl><dt><a href="http://search.twitter.com/search?q=' . urlencode($keywords[$i]) . '">'
                 . htmlspecialchars($keywords[$i]) . '</a></dt><dd>';
        
        // dump the search results
        for ($j = 0; $j < $resultCount; $j++) {
            $id = $results[$j]['id'];                          // tweet ID
            $user = $results[$j]['from_user'];                 // twitter user name
            $tweet = $results[$j]['text'];                     // tweet text
            $url = "http://www.twitter.com/$user/status/$id/"; // link to the tweet
            
            $output .= '<a href="' . $url . '">' . $tweet . ' &mdash; ' . $user . '</a>';
        }
        $output .= '</dd></dl>';
    }
    file_put_contents($cache, $output); // store in local cache for performance boost
    return $output;
}
echo tweets();
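
The cache check at the top of tweets() can also be pulled out into its own function so other pages can reuse it. The sketch below is one possible shape, not the only one: `cached()` and its parameters are hypothetical names, it skips caching empty output on the assumption that an empty string means the requests failed, and it writes through a temporary file so a visitor never reads a half-written cache.

```php
<?php
// Hypothetical helper: return the cached string if it is fresh enough,
// otherwise call $producer, store its result, and return it.
function cached($file, $minutes, $producer) {
    clearstatcache();
    if (file_exists($file) && filemtime($file) > time() - 60 * $minutes) {
        return file_get_contents($file);
    }

    $output = call_user_func($producer);

    // assumption: empty output means the requests failed, so do not cache it
    if ($output !== '') {
        // write to a temp file in the same directory, then swap it into place
        $tmp = $file . '.' . uniqid('', true) . '.tmp';
        file_put_contents($tmp, $output);
        rename($tmp, $file);
    }
    return $output;
}
```

For example, `echo cached('twitter-search.txt', 1, 'build_tweets');` would work if build_tweets() were a (hypothetical) function holding only the request-and-render part of tweets().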
