PHP cURL
cURL is a command-line tool and library (libcurl) written in C that transfers data over many different protocols, including FTP, FTPS, HTTP, TELNET and LDAP. cURL does more than simply download a file: you can store cookies, upload files, use various types of authentication and tunnel all requests through a proxy. PHP has supported libcurl since version 4.0.2, but the cURL extension is not enabled on every installation; depending on your setup you may need to install it or enable it in php.ini. In this tutorial we will look at the uses of cURL and how it compares to file_get_contents(). We will then write a cURL class that we can use to download the source code of a webpage and to download a binary file.
cURL vs file_get_contents()
Many PHP scripts need to download and then parse the HTML source of a page. When people ask how to do this, the most common reply is to use the file_get_contents() function, as it's a simple one-line solution: you set the first argument to the URL of the webpage you want to download and it returns the page's source code. While it may seem very easy to use, it has several pitfalls compared with cURL.
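For reference, the one-line approach can be sketched as follows. The helper name fetchSource() is hypothetical, and a data: URI stands in for a real web address so that the example runs without network access.

```php
<?php
// Hypothetical helper: fetch a resource in one call and fail loudly on error.
function fetchSource(string $url): string
{
    // file_get_contents() returns false on failure, so compare strictly.
    $source = @file_get_contents($url);
    if ($source === false) {
        throw new RuntimeException("Could not read $url");
    }
    return $source;
}

// A data: URI stands in for a real web address so the sketch runs offline.
$html = fetchSource('data://text/plain;base64,' . base64_encode('<h1>Hello</h1>'));
echo $html, PHP_EOL;
```

With a real page you would pass its http:// address instead, which requires allow_url_fopen to be enabled.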
- cURL is often faster than file_get_contents() at retrieving the source of a webpage. In one test it took around 2.3 seconds for file_get_contents() to retrieve the source of google.com, while cURL took only 0.65 seconds. (source)
- Familiarity of code. libcurl has bindings for Python, Ruby, Java and many other languages, so your cURL code in PHP will be easy to translate to other languages, saving you time and money.
- cURL allows you to do more. If the webpage is password protected or requires a form to be submitted before viewing, cURL handles this with a few options. With file_get_contents() this is only possible by building a stream context by hand, and it gives you far less control.
- cURL makes it easy to hide where a request comes from. You can funnel all cURL requests through one or more proxies, whereas a plain file_get_contents() call gives away the server's IP address.
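To make the last two points concrete, here is a minimal sketch of the options involved in submitting a form to a password-protected page through a proxy. The helper name buildProtectedFormOptions(), the credentials, the form field and the proxy address are all placeholders invented for illustration; only the CURLOPT_* constants come from PHP's cURL extension.

```php
<?php
// Hypothetical helper: build cURL options for a password-protected form
// that is requested through a proxy. All argument values are placeholders.
function buildProtectedFormOptions(string $user, string $pass, array $fields, string $proxy): array
{
    return array(
        CURLOPT_RETURNTRANSFER => true,                      // return the body instead of printing it
        CURLOPT_HTTPAUTH       => CURLAUTH_BASIC,            // HTTP Basic authentication
        CURLOPT_USERPWD        => $user . ':' . $pass,       // credentials as "user:password"
        CURLOPT_POST           => true,                      // submit as a POST request
        CURLOPT_POSTFIELDS     => http_build_query($fields), // URL-encoded form body
        CURLOPT_PROXY          => $proxy,                    // route the request through this proxy
    );
}

$options = buildProtectedFormOptions('alice', 'secret', array('q' => 'php curl'), 'proxy.example.com:8080');

$handle = curl_init('http://www.example.com/members');
$ok = curl_setopt_array($handle, $options);
// curl_exec($handle) would send the request; it is left out so the sketch runs offline.
```

Each concern (authentication, form data, proxy) maps to one or two options, which is why this kind of request is usually easier to express with cURL.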
cURL Class
PHP exposes cURL only through procedural functions; there is no built-in class that wraps a request. Hence scripts that use cURL can be messy. By writing a simple class wrapper, we can execute a cURL request in one line, yet still have the power and configurability of cURL. The class has two public methods. The first, getPage(), accepts one argument: the URL of the page. It returns an array of information about the request, including the HTML source, content type, the time it took to request the page, average download speed and other useful stats. The second, downloadFile(), accepts two arguments: the URL of the file being downloaded and the name of the file it should be saved as. The $options attribute holds the cURL options, which we eventually set using curl_setopt_array(). A full list of options and their descriptions is available in the PHP manual's curl_setopt() documentation.
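class curl
{
    // Default options applied to every request.
    public $options = array(
        CURLOPT_RETURNTRANSFER => true, // return the web page as a string
        CURLOPT_HEADER => false, // don't return the headers
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_USERAGENT => "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media Center PC 4.0)", // set a normal looking useragent
        CURLOPT_AUTOREFERER => true, // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120, // seconds to wait while connecting
        CURLOPT_TIMEOUT => 120, // seconds before the whole request times out
        CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
    );

    public function __construct()
    {
    }

    public function getPage($url)
    {
        $p = curl_init($url);
        curl_setopt_array($p, $this->options);
        $content = curl_exec($p);
        // Compare strictly: an empty page is a valid response, false is an error.
        if ($content === false) {
            // Read the error details before closing the handle.
            $message = curl_error($p) . ' ERROR CODE ' . curl_errno($p);
            curl_close($p);
            throw new \Exception($message);
        }
        $header = curl_getinfo($p);
        curl_close($p);
        $header['content'] = $content;
        return $header;
    }

    public function downloadFile($url, $filePath)
    {
        $fp = fopen($filePath, 'wb'); // binary-safe write mode
        if ($fp === false) {
            throw new \Exception('Could not open ' . $filePath . ' for writing');
        }
        // Work on a copy so the file handle does not leak into later requests.
        $options = $this->options;
        $options[CURLOPT_URL] = $url;
        // Added last, so it takes precedence over CURLOPT_RETURNTRANSFER.
        $options[CURLOPT_FILE] = $fp;
        $p = curl_init();
        curl_setopt_array($p, $options);
        $ok = curl_exec($p);
        $message = curl_error($p) . ' ERROR CODE ' . curl_errno($p);
        curl_close($p);
        fclose($fp);
        if ($ok === false) {
            throw new \Exception($message);
        }
    }
}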
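Because $options is a public array keyed by CURLOPT_* constants, per-request overrides are safest when applied to a copy rather than to the defaults themselves. The sketch below shows one possible way to do that; mergeCurlOptions() is a hypothetical helper and is not part of the class above.

```php
<?php
// Hypothetical helper: combine default cURL options with per-request overrides
// without modifying the defaults.
function mergeCurlOptions(array $defaults, array $overrides): array
{
    // array_replace() keeps the integer CURLOPT_* keys intact;
    // array_merge() would renumber them and lose the option identifiers.
    return array_replace($defaults, $overrides);
}

$defaults = array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_TIMEOUT        => 120,
);

$perRequest = mergeCurlOptions($defaults, array(
    CURLOPT_TIMEOUT => 30,                         // shorter timeout for this request only
    CURLOPT_URL     => 'http://www.example.com/',  // placeholder address
));

print_r($perRequest);
```

The resulting array can be passed straight to curl_setopt_array(), while the defaults stay unchanged for the next request.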
To use the class just instantiate it and call the appropriate methods.
$c = new curl();
$c->downloadFile('http://static.reddit.com/reddit.com.header.png', 'logo.png');
//Will download the reddit header image into a file called logo.png
$foo = $c->getPage('http://www.query7.com');
print_r($foo);
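This prints the array returned by curl_getinfo() (URL, content type, HTTP status code, timings, download speed and so on), with the page source added under the content key.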
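Individual fields of the array returned by getPage() can also be read directly. The following sketch assumes the key names used by curl_getinfo() plus the content key added by the class; summarizeResponse() is a hypothetical helper, and the sample values are made up for illustration.

```php
<?php
// Hypothetical helper: pick a few commonly used fields out of the array
// returned by getPage() (curl_getinfo() fields plus 'content').
function summarizeResponse(array $response): string
{
    return sprintf(
        '%d %s, %d bytes in %.2fs',
        $response['http_code'],
        $response['content_type'],
        strlen($response['content']),
        $response['total_time']
    );
}

// Sample values made up for illustration; a real call returns many more keys.
$sample = array(
    'http_code'    => 200,
    'content_type' => 'text/html; charset=UTF-8',
    'total_time'   => 0.5,
    'content'      => '<html></html>',
);

echo summarizeResponse($sample), PHP_EOL;
// prints "200 text/html; charset=UTF-8, 13 bytes in 0.50s"
```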


