Adding DNS Caching to Zend_Http_Client
Submitted by Matthew Turland on Mon, 12/01/2008 - 19:42
In a recent post on my blog, I benchmarked various mainstream PHP HTTP client extensions and libraries. In doing so, I noticed that extensions based on libcurl (specifically cURL and pecl_http) use an internal DNS cache and gain a significant performance boost from it.
While it is possible for PHP-based libraries to implement their own cache, none that I examined actually do. With a little curiosity egging me on, I decided to spend a bit of time digging into their code to see how difficult it would be to modify them to include a cache so I could compare the results for performance.
PEAR::HTTP_Client depends on PEAR::Net_Socket for its socket operations. Net_Socket handles DNS lookups by calling gethostbyname() within its connect() method, but does not cache the result. Changing this behavior would unfortunately require reimplementing a large portion of that method in a subclass and, in the interest of time, I decided not to pursue it further.
Zend_Http_Client, on the other hand, made this relatively simple: it only requires extending its Socket adapter class and overriding two methods. See below for the subclass source code.
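<?php
require_once 'Zend/Http/Client/Adapter/Socket.php';

class Custom_Http_Client_Adapter_Socket extends Zend_Http_Client_Adapter_Socket
{
    // IP address resolved for the most recent connect() call
    protected $_ip;

    // Cache of resolved IP addresses, keyed by hostname
    protected $_ips = array();

    public function connect($host, $port = 80, $secure = false)
    {
        // Resolve each hostname only once per adapter instance
        if (!isset($this->_ips[$host])) {
            $this->_ips[$host] = gethostbyname($host);
        }
        $this->_ip = $this->_ips[$host];
        return parent::connect($this->_ip, $port, $secure);
    }

    public function write($method, $uri, $http_ver = '1.1',
                          $headers = array(), $body = '')
    {
        // The request headers were prepared from the original hostname
        // before this call, so the URI host can be swapped for the cached IP
        $uri->setHost($this->_ip);
        return parent::write($method, $uri, $http_ver, $headers, $body);
    }
}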
This performs a DNS lookup the first time a host is requested and caches the result for subsequent requests to that host. When a host that has not been seen before is requested, the if block in the connect() method performs the lookup for it and adds the result to the cache. This works by having the socket connect directly to the IP address. The Host header of the request is how the server at that IP address knows the virtual host for which the request is intended.
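To make the role of the Host header concrete, here is a minimal sketch of the raw request that ends up on the wire when connecting by IP address. The buildGetRequest() function is a hypothetical helper written for this illustration, not part of Zend Framework, and the socket calls are left commented out so the sketch runs without network access.

```php
<?php
// Hypothetical helper (not part of Zend Framework): builds the raw text of
// a minimal HTTP/1.1 GET request for the given virtual host and path.
function buildGetRequest($hostname, $path = '/')
{
    return "GET {$path} HTTP/1.1\r\n"
         . "Host: {$hostname}\r\n"   // names the virtual host, not the IP
         . "Connection: close\r\n"
         . "\r\n";
}

$hostname = 'ishouldbecoding.com';
$request  = buildGetRequest($hostname, '/code');

// Sending it would look roughly like this: the socket is opened against the
// resolved IP address, while the Host header still carries the hostname.
// $ip     = gethostbyname($hostname);
// $socket = fsockopen($ip, 80, $errno, $errstr, 10);
// fwrite($socket, $request);
// echo stream_get_contents($socket);
// fclose($socket);

echo $request;
```

The server only ever sees the hostname through that header, which is why the adapter can connect to the cached IP address without changing which site answers.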
When Zend_Http_Client::request() is called, it calls the _prepareHeaders() method of that class, which uses the host value associated with the internal $uri property at that time, so the headers are built from the original hostname. The write() method of the adapter is called afterward; the class above overrides it to replace that host value with the cached IP address.
The test code used to compare this against the native Zend socket adapter is shown below. Running on the same machine and internet connection as cited in my original post, control.php had a runtime of 17.314s and trial.php had a runtime of 9.219s.
<?php
// control.php
require_once 'Zend/Http/Client.php';

$client1 = new Zend_Http_Client();
$client1->setUri('http://ishouldbecoding.com');
$client1->request();
$client1->setUri('http://ishouldbecoding.com/code');
$client1->request();

<?php
// trial.php
require_once 'Zend/Http/Client.php';
require_once 'Custom/Http/Client/Adapter/Socket.php';

$client2 = new Zend_Http_Client();
$client2->setAdapter(new Custom_Http_Client_Adapter_Socket());
$client2->setUri('http://ishouldbecoding.com');
$client2->request();
$client2->setUri('http://ishouldbecoding.com/code');
$client2->request();
Note that this approach is intended for specific circumstances. One of these is that the server running your PHP code has no DNS caching available, either natively in the OS or through installed software, and you do not have sufficient access to install any. Trying to handle DNS caching in a PHP script when the same functionality is available at a much lower (and faster) software level (i.e. within a C extension or the OS) can actually hurt performance, comparatively speaking. Examples of software that can provide DNS caching include bind, nscd, and dnsmasq (the one I'm using). As my original blog post states, the performance of the various extensions and libraries I reviewed is roughly consistent when this type of software is running.
The other circumstance is that the script either targets multiple hosts and alternates between them more often than not, or has some other requirement that forces it to establish multiple connections to the same host or hosts over the course of its runtime. If consecutive requests go to the same host without much alternation, the recommended practice is to enable the keepalive configuration flag, which retains the established connection for multiple requests. This removes the need to go through DNS lookup and connection establishment multiple times per host. I made this modification to control.php, reran it through Xdebug, and got a runtime of 9.164s (beating the custom socket adapter runtime by 55ms).
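The exact change to control.php isn't reproduced here, but it was presumably something along these lines. This is a sketch that assumes the keepalive option is set through the client's setConfig() method; passing the same array as the second constructor argument should be equivalent.

```php
<?php
// control.php with keepalive enabled (a reconstruction, not the original file)
require_once 'Zend/Http/Client.php';

$client = new Zend_Http_Client();
// Ask the client to keep the connection open between requests
$client->setConfig(array('keepalive' => true));

$client->setUri('http://ishouldbecoding.com');
$client->request();

// Same host as above, so the existing connection can be reused
$client->setUri('http://ishouldbecoding.com/code');
$client->request();
```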
One caveat is that it is possible (though not very likely) for DNS changes to take effect while your script is running. The custom socket adapter class shown above assumes that this will not happen. Particularly for scripts with a longer runtime, additional logic should be added to invalidate the cache automatically: on a timed basis, when a request to the cached IP address fails, or both.
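As a rough sketch of that invalidation logic, the lookup could be moved into a small cache class with a time-to-live. Everything here is hypothetical and untested against Zend Framework: the Custom_Dns_Cache name, the 300-second default, and the injectable resolver (which exists only so the example can run without real DNS).

```php
<?php
// Hypothetical helper class: a DNS cache whose entries expire after $ttl seconds.
class Custom_Dns_Cache
{
    protected $_ttl;
    protected $_resolver;
    protected $_entries = array();

    public function __construct($ttl = 300, $resolver = 'gethostbyname')
    {
        $this->_ttl      = $ttl;
        $this->_resolver = $resolver;
    }

    public function resolve($host, $now = null)
    {
        $now = ($now === null) ? time() : $now;

        // Serve from the cache while the entry is still fresh
        if (isset($this->_entries[$host])
            && $this->_entries[$host]['expires'] > $now) {
            return $this->_entries[$host]['ip'];
        }

        unset($this->_entries[$host]);
        $ip = call_user_func($this->_resolver, $host);

        // gethostbyname() returns the hostname unchanged on failure,
        // so only cache results that differ from the input
        if ($ip !== $host) {
            $this->_entries[$host] = array(
                'ip'      => $ip,
                'expires' => $now + $this->_ttl,
            );
        }
        return $ip;
    }

    // Drop an entry early, e.g. after a failed request to the cached IP
    public function invalidate($host)
    {
        unset($this->_entries[$host]);
    }
}

// Usage with a stub resolver and explicit timestamps
$lookups = 0;
$cache = new Custom_Dns_Cache(60, function ($host) use (&$lookups) {
    $lookups++;
    return '192.0.2.10'; // address reserved for documentation
});

$cache->resolve('example.com', 1000); // miss: performs a lookup
$cache->resolve('example.com', 1030); // hit: still within 60 seconds
$cache->resolve('example.com', 1061); // expired: performs a lookup
$cache->invalidate('example.com');
$cache->resolve('example.com', 1062); // invalidated: performs a lookup

echo $lookups, "\n"; // prints 3
```

In the adapter, connect() could call resolve() in place of the isset() check, and a failed request could call invalidate() for that host before retrying.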
So, that's a wrap. Thanks for reading, hope you enjoyed it.

