Making the Call with Functions

Something I've struggled with for a number of years, both professionally and personally, is communicating with people who aren't developers and know little to nothing about the nature of my job outside of its necessity. They often don't know that writing software is a juggling act of considerations and compromises. Sometimes you have to sacrifice flexibility for performance, sometimes you give up performance for flexibility. Logistics rarely allow for a silver bullet to any given situation and you frequently have to decide which is more important.

As Maggie Nelson points out, there's a difference between designing for a particular purpose and premature optimization, and sometimes good design doesn't mean designing for performance. Terry Chay has pointed out a similar notion in the form of the YAGNI principle. When optimizing, it's important to know where to optimize first. Triage. Patch up the puncture wounds that are gushing cycles and memory before trying to put a band-aid on the scraped knees.

It also helps to know more about what's going on under the hood. Being aware of things like copy-on-write, opcodes, and compiled variables gives you a more informed perspective when the odd issue makes you think you've been pulled into the Twilight Zone.

Another of these topics is how PHP handles memory. PHP is written in the C programming language. One of the C-based components of PHP is the Zend Memory Manager, which handles the somewhat tedious task of allocating and deallocating memory for storing the data that we assign to variables, pull out of database servers, and so forth. As someone who's experienced his share of segfaults and the occasional bus error, I can say that this is one of the niceties that PHP provides and is often underappreciated by people who have no experience programming in lower-level languages.

Let's say you have an application that's pulling a rather hefty result set from your database server. Your script terminates and you see an error that looks something like this: "Fatal error: Allowed memory size of X bytes exhausted (tried to allocate Y bytes) [...] on line Z." You furrow your brow, do some searching, find out about the memory_limit directive, increase it to some insane amount or disable it entirely, then run your script again. It runs. And runs. And runs some more. You go to the restroom, refill your coffee cup, come back, and the darned thing is still loading. Eventually it finishes, but the amount of time required leaves you scratching your head.

All potential obscure causes aside, this is a situation where being aware of lower-level details is helpful. Your server obviously has a limited amount of memory available. It's running your OS and any number of other things besides your web server and PHP. Even if you don't use the memory_limit directive, PHP only has so much memory available to it. Once all system memory is consumed, the system starts hitting what's called swap space. This is space on the hard disk, generally on its own partition, that the system uses to swap out currently unused data in memory so it has room to allocate new data. Hard disk reads and writes are slower than memory access by several orders of magnitude, so this often becomes a bottleneck. Being aware of it, you can build a solution that only uses so much memory at one time in order to work around the restriction.
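One way to keep memory use bounded is to work through the rows in fixed-size batches instead of loading the whole result set at once. The following is only a minimal sketch, assuming PHP 7.1 or later: fetchRows() is a hypothetical stand-in for a database cursor, and sumInBatches() and the 'amount' column are made up for illustration.

```php
<?php
// Hypothetical stand-in for a database cursor: yields rows one at a time
// instead of building the whole result set in memory.
function fetchRows(int $total): Generator
{
    for ($id = 1; $id <= $total; $id++) {
        yield ['id' => $id, 'amount' => $id * 2];
    }
}

// Consumes any iterable of rows in fixed-size batches, so at most
// $batchSize rows are held in memory at once.
function sumInBatches(iterable $rows, int $batchSize): int
{
    $sum = 0;
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $batchSize) {
            $sum += array_sum(array_column($batch, 'amount'));
            $batch = []; // release the processed rows
        }
    }
    // Handle the final, possibly partial batch.
    return $sum + array_sum(array_column($batch, 'amount'));
}

$total = sumInBatches(fetchRows(1000), 100);
```

With a real database you would typically get the same effect from an unbuffered query or from paging through the results, but the shape of the loop stays the same: fetch a little, process it, let it go.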

Database result sets aren't the only cause of this sort of occurrence. Functions can be, too. I ran into an issue with Zend_Form not too long ago. Profiling a form with a lot of elements in Xdebug revealed the form to be a bottleneck because it was adding one (particularly slow) function call for every element it contained. While what goes on within a function obviously contributes to how it performs, the fact that it is a function also contributes to a degree.

Whenever a function is called, a new scope is pushed onto a stack in memory and is removed when the function terminates. You can think of the latter action as an implicit unset() call of all variables local to the function's scope when its execution concludes. PHP generally allocates memory in chunks so that it doesn't have to do so every time a function call is made, but regardless, the operation of calling a function isn't cheap.
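If you want a rough feel for that overhead on your own machine, a crude micro-benchmark along these lines can help. It's a sketch, not a rigorous measurement: addOne() is a hypothetical helper, the iteration count is arbitrary, and the timings will vary with your PHP version, opcode cache, and hardware.

```php
<?php
// Hypothetical helper used only for this comparison.
function addOne(int $n): int
{
    return $n + 1;
}

$iterations = 1000000;

// The arithmetic performed inline.
$start = microtime(true);
$inline = 0;
for ($i = 0; $i < $iterations; $i++) {
    $inline = $inline + 1;
}
$inlineSeconds = microtime(true) - $start;

// The same arithmetic, with one function call per iteration.
$start = microtime(true);
$called = 0;
for ($i = 0; $i < $iterations; $i++) {
    $called = addOne($called);
}
$calledSeconds = microtime(true) - $start;

printf("inline: %.4fs, function: %.4fs\n", $inlineSeconds, $calledSeconds);
```

Both loops arrive at the same result, so any difference in the two timings comes from the calls themselves. Treat the numbers as a hint about where to look, not as a reason to inline everything.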

Now, I'm not saying run off and run all your code in the global scope. We've got plenty of legacy PHP 4 code to show why that's a bad idea. It's a frequent part of our processes as developers, whether we're coming up with project estimates or designing a set of classes, to subdivide larger tasks and problems into smaller ones that are easier to tackle. Part of the latter is dividing tasks into multiple functions or methods. In doing so, we have to make decisions based on a number of factors. These include how much memory is required for a single function call, how likely it is that the part of the process encapsulated by a method will need to be overridden in a subclass, and how clear the code will be to other developers who will have to read and maintain it in the future.

Consider these factors carefully when architecting your code, and be conscious of the decision you make when you give up performance in exchange for flexibility or accept added complexity in exchange for performance. In short, while the DRY principle is certainly a good reason to put code into a function, the other reasons are equally important.