Artificial Sorting and Closures in PHP

Sorting is a common, necessary task in programming. The best way to sort is almost always to use a built-in sorting function in whatever language or framework you're using; you should rarely want (or need) to roll your own sorting algorithm. Sorting functions are usually paired with a comparator function that specifies the ordering for the sorting routine.

To review: a comparator function is a function that takes two (generic) objects a and b and is responsible for comparing them. The function is expected to return an integer: a negative integer if the objects are in order (a < b), zero if they are equal, and a positive integer if they are out of order (a > b). This allows the sorting function to sort without having to worry about what it is sorting.

As a simple example, consider sorting an array of integers in non-increasing order (the reverse of the default ordering of the sort() function). One could do this as follows.
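A minimal sketch of such a comparator, assuming PHP 7 or later for the <=> operator. The function name cmpIntDesc and the sample values are hypothetical.

```php
<?php
// Comparator for non-increasing order: swapping the operands of <=>
// makes larger values sort before smaller ones.
function cmpIntDesc($a, $b)
{
    return $b <=> $a;
}

$numbers = [5, 3, 9, 1, 7];
usort($numbers, 'cmpIntDesc');

echo implode(', ', $numbers), PHP_EOL; // 9, 7, 5, 3, 1
```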

Now suppose we want to sort an array of objects whose ordering is more complex. We don't have much freedom in how to design the comparator function: it must take two (mixed) arguments and return an integer. Any deviation from this design may have undefined consequences. However, suppose that the comparison relies on some other piece of data. As an example, consider the problem of sorting an array of Student objects based on the student's year (Freshman, Sophomore, Junior, Senior). Suppose that we define the ordering by setting up an array:
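One plausible form of that array is sketched here. Only the name $classYear comes from the text; the exact layout is an assumption.

```php
<?php
// The position of each value defines its rank: a lower index sorts first.
$classYear = ['Freshman', 'Sophomore', 'Junior', 'Senior'];

// array_search() gives the rank (index) of a class year.
var_dump(array_search('Junior', $classYear)); // int(2)
```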

The ordering of these elements is "unnatural" (a "natural" ordering would put the elements in lexicographic order: Freshman, Junior, Senior, Sophomore). But it's the ordering that we would expect if we sorted an array of students by their class year. Of course we could always define a comparator with lots of checks as follows.
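A sketch of what such a hardcoded comparator might look like. The Student class here (with public name and year properties) and the function name cmpStudentByYearHardcoded are hypothetical stand-ins, since the text does not show them.

```php
<?php
// Hypothetical Student class; only the class year matters for sorting.
class Student
{
    public $name;
    public $year;

    public function __construct($name, $year)
    {
        $this->name = $name;
        $this->year = $year;
    }
}

// Every class year needs its own pair of branches.
function cmpStudentByYearHardcoded($a, $b)
{
    if ($a->year === $b->year) {
        return 0;
    }
    if ($a->year === 'Freshman') {
        return -1; // Freshman precedes every other year
    }
    if ($b->year === 'Freshman') {
        return 1;
    }
    if ($a->year === 'Sophomore') {
        return -1; // Sophomore precedes Junior and Senior
    }
    if ($b->year === 'Sophomore') {
        return 1;
    }
    // Only Junior and Senior remain, and they differ.
    return ($a->year === 'Junior') ? -1 : 1;
}
```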

As you can see, the logic gets pretty complex if we're going to hardcode everything. Moreover, suppose that we expand the list of possible values for class year to allow "Graduate" students, "Pre-Med" students, "Pre-Enrollment" students, etc. Each new possibility leads to a complete redesign of the comparator function, increasing its complexity and its potential for bugs. It's not maintainable at all.

Instead, what we want is to use the array defined above in our comparator function. We could instead do the following.
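A sketch of the array-driven comparator, assuming each Student exposes its class year as a public year property (hypothetical). The function is only defined here, not called.

```php
<?php
// Compare two students by the position of their class year in $classYear.
// Caution: as written, $classYear is not visible inside this function;
// the discussion that follows deals with exactly this problem.
function cmpStudentByYear($a, $b)
{
    $i = array_search($a->year, $classYear);
    $j = array_search($b->year, $classYear);
    return $i <=> $j; // lower index sorts first (PHP 7+)
}
```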

This is much cleaner and much easier to extend: all we need to do to add values or change the ordering is modify the $classYear array. Simple, right?

The problem is that the cmpStudentByYear() function doesn't know anything about the $classYear array. It's not a local variable (indeed, we wouldn't want to recreate it every time we call the function), nor is it a parameter (passing it would mean we violate the comparator function signature). Any attempt to access $classYear inside the function raises an undefined-variable notice or warning and yields null, and handing that null to array_search() produces a further warning or, in PHP 8, a fatal TypeError.

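Note: there is another problem here if the student's class year is invalid and not in the $classYear array. In that case, array_search() returns false (a boolean) rather than, say, an invalid index such as -1. Because false compares equal to 0 under loose comparison, an unrecognized year is treated like the first element of the array unless the result is checked with ===.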
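One way to guard against this, sketched with a hypothetical helper yearRank() that sorts unknown years last (that policy is an assumption, not something the text prescribes):

```php
<?php
$classYear = ['Freshman', 'Sophomore', 'Junior', 'Senior'];

// Hypothetical helper: rank of a class year, with unknown values ranked last.
function yearRank($year, array $classYear)
{
    $index = array_search($year, $classYear, true); // strict matching
    // false means "not found"; test with === because false == 0 is true.
    return ($index === false) ? count($classYear) : $index;
}

var_dump(array_search('Alumnus', $classYear)); // bool(false)
var_dump(yearRank('Alumnus', $classYear));     // int(4)
var_dump(yearRank('Freshman', $classYear));    // int(0)
```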

One solution would be to refer to $classYear as a global variable.
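A sketch of the global-variable version, assuming $classYear is defined at the top level of the script and Student is a hypothetical class with a public year property:

```php
<?php
// Hypothetical Student class.
class Student
{
    public $name;
    public $year;

    public function __construct($name, $year)
    {
        $this->name = $name;
        $this->year = $year;
    }
}

$classYear = ['Freshman', 'Sophomore', 'Junior', 'Senior'];

function cmpStudentByYear($a, $b)
{
    global $classYear; // bring the global array into this function's scope
    $i = array_search($a->year, $classYear);
    $j = array_search($b->year, $classYear);
    return $i <=> $j;
}
```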

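Adding the global declaration as the first line of the function places the array within the scope of the comparator. This is not ideal since, in general, globals are evil: any code anywhere can change the array out from under the comparator.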

Instead, let's define a closure. A closure is a function that has its own "enclosure" (that is, its very own scope). A function always knows about its parameters and local variables. However, a function sometimes needs to know about other, non-local variables (also called "free" variables). We can create a closure to "close" the function and these variables into one package that is (mostly) protected from outside modifications (and protects the non-local variables from the function itself).

Sometimes we need to do this (as in the example we've looked at). Basically, the benefit is that a function may have access to variables that are not in the same scope as the scope in which the function is invoked. In our particular example, the $classYear is accessed by the comparator function, but the comparator function is invoked within the usort() function, which has a completely different scope!

In PHP (starting with version 5.3.0), closures can be achieved using the keyword use. However, closures seem to only be definable with respect to anonymous functions (indeed, the official PHP documentation seems to conflate closures and anonymous functions, http://php.net/manual/en/functions.anonymous.php).
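A sketch of the comparator written as a closure. The names $myComparator and $classYear follow the text; the Student class with a public year property is a hypothetical stand-in, and the <=> operator assumes PHP 7 or later.

```php
<?php
// Hypothetical Student class.
class Student
{
    public $name;
    public $year;

    public function __construct($name, $year)
    {
        $this->name = $name;
        $this->year = $year;
    }
}

$classYear = ['Freshman', 'Sophomore', 'Junior', 'Senior'];

// Anonymous function; "use" copies $classYear into the closure
// at the moment the function is defined.
$myComparator = function ($a, $b) use ($classYear) {
    $i = array_search($a->year, $classYear);
    $j = array_search($b->year, $classYear);
    return $i <=> $j;
};
```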

This example creates an anonymous function and stores a reference to it in the variable $myComparator. Moreover, the use keyword "closes" the function with a completely new copy of $classYear that is only accessible by the function itself (this is called "early binding": the copy is created upon definition of the function). The function's copy of $classYear is protected from outside changes and any changes to its copy have no effect on the original $classYear. Neat.
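The early-binding behavior can be checked with a small experiment. The closure $rankOf is a hypothetical helper used only for this demonstration.

```php
<?php
$classYear = ['Freshman', 'Sophomore', 'Junior', 'Senior'];

// The closure receives its own copy of $classYear right here.
$rankOf = function ($year) use ($classYear) {
    return array_search($year, $classYear);
};

// Changing the outer array afterwards does not touch the closure's copy.
$classYear = array_reverse($classYear);

var_dump($rankOf('Freshman'));                  // int(0), from the closure's copy
var_dump(array_search('Freshman', $classYear)); // int(3), from the reversed original
```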

Now we can finally call usort() and pass our closure comparator.
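Putting the pieces together, a complete sketch might look like this. The Student class and the sample students are hypothetical; $classYear and $myComparator follow the names used in the text.

```php
<?php
// Hypothetical Student class.
class Student
{
    public $name;
    public $year;

    public function __construct($name, $year)
    {
        $this->name = $name;
        $this->year = $year;
    }
}

$classYear = ['Freshman', 'Sophomore', 'Junior', 'Senior'];

// Closure comparator: ranks students by the index of their class year.
$myComparator = function ($a, $b) use ($classYear) {
    return array_search($a->year, $classYear) <=> array_search($b->year, $classYear);
};

$students = [
    new Student('Alice', 'Junior'),
    new Student('Bob', 'Freshman'),
    new Student('Carol', 'Senior'),
    new Student('Dave', 'Sophomore'),
];

usort($students, $myComparator);

foreach ($students as $s) {
    echo $s->name, ': ', $s->year, PHP_EOL;
}
// Order after sorting: Bob (Freshman), Dave (Sophomore), Alice (Junior), Carol (Senior)
```

Adding a new class year then only requires inserting it at the right position in $classYear; the comparator itself stays unchanged.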