David Preston




Tidying & Formatting Postal Addresses in PHP

A couple of times in the course of my job, I've needed to tidy up postal addresses entered by users, usually when importing from an external database into a web application or taking input from a web form. It looks horrible when everything is in upper or lower case. PHP has strtoupper(), strtolower() and ucwords() functions, but none of them does quite what I needed, hence the function below. Googling found a few attempts to solve the same problem, and I've probably pinched bits of code from various places. If I've pinched some of yours, let me know and I'll acknowledge your contribution.

<?php
function address_tidy($text) {
    $text = trim($text);
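    $text = str_replace("\r", "", $text); // "\r" is a real carriage return; "\\r" would only match a literal backslash followed by r
    $text = str_replace("\n", "", $text); // likewise for newlines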
    $words = explode(" ", $text);
    // Loop by reference (&) so that changes to $word are written back into $words
    foreach ($words as &$word) {
        $word = ucfirst(strtolower($word));
        // Specials like Mac, Mc etc.
        $specials = array("Mac", "Mc", "O'");
        foreach ($specials as $special) {
            // === 0 means the prefix was found at the very start of the word
            if (stripos($word, $special) === 0) {
                // Keep the prefix and capitalise whatever follows it
                $word = $special . ucfirst((string) substr($word, strlen($special)));
            }
        }
        // ...but not for some words that begin with "Mac"
        // (make your own mind up about Macintosh, Maclure & Maclaren)
        $specials = array("macken", "macclesfield", "machynlleth");
        if (in_array(strtolower($word), $specials)) {
            $word = ucfirst(strtolower($word));
        }
        // Let's go lower case on some words
        $specials = array("de", "la", "le", "on", "of", "and", "under", "upon");
        if (in_array(strtolower($word), $specials)) {
            $word = strtolower($word);
        }
    }
    unset($word); // break the reference left over from the loop
    return implode(" ", $words);
}
?>

Here’s a quick run-down of what it does when you feed it a string…

The first 3 lines just strip white-space from either end of the string and remove any carriage returns and newlines.
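
As a rough illustration of that clean-up step on its own, here is a minimal sketch. The helper name strip_line_breaks() is made up for this example and is not part of the function above. Note that "\r" and "\n" in double quotes are real carriage-return and newline characters.

```php
<?php
// Hypothetical helper, for illustration only: mirrors the clean-up step.
function strip_line_breaks($text) {
    $text = trim($text);                  // drop white-space at either end
    $text = str_replace("\r", "", $text); // remove carriage returns
    $text = str_replace("\n", "", $text); // remove newlines
    return $text;
}

echo strip_line_breaks("  12 high street,\r\nmacclesfield \n");
// prints: 12 high street,macclesfield
```

As the output shows, removing the breaks outright runs the two lines together, so if your input has one address element per line you may prefer to replace the breaks with a space instead.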

We then use explode() to break the string into words on space characters and the foreach() loop deals with each word, one at a time.
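
Here is a small stand-alone sketch of that split, loop and re-join pattern, using strtoupper() as a stand-in for the real processing. One thing worth knowing: a plain foreach works on a copy of each element, so the loop variable has to be taken by reference (the & below) for changes to end up back in the array.

```php
<?php
$words = explode(" ", "stratford upon avon");
// $words is now array("stratford", "upon", "avon")

// The & makes $word a reference, so assignments change the array itself.
foreach ($words as &$word) {
    $word = strtoupper($word);
}
unset($word); // drop the lingering reference once the loop is done

echo implode(" ", $words); // prints: STRATFORD UPON AVON
```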

The first line inside the loop does most of the work, converting the word first to lower case and then capitalising it, which is great for the vast majority of words, but what about Scottish and Irish names? That’s what the next few lines deal with. Essentially, presented with a list of prefixes declared as $specials (which you can add to if needed), we then find any words that begin with those prefixes and capitalise the fragment of word following. Example – the word being processed arrives as ‘Macdonald’, and the code goes through each of the ‘specials’ to see if our word starts with the prefix. It does in the case of ‘Mac’, so we capitalise ‘donald’ and put the two fragments back together as ‘MacDonald’.
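
The prefix step can be sketched in isolation like this. The helper fix_prefix() is a hypothetical name used only for this example, and unlike the loop in the main function it returns as soon as one prefix matches.

```php
<?php
// Hypothetical helper: capitalise the letter that follows a known name prefix.
function fix_prefix($word, array $prefixes = array("Mac", "Mc", "O'")) {
    $word = ucfirst(strtolower($word));
    foreach ($prefixes as $prefix) {
        // stripos() is case-insensitive; === 0 means "found at the very start"
        if (stripos($word, $prefix) === 0) {
            $rest = (string) substr($word, strlen($prefix));
            return $prefix . ucfirst($rest);
        }
    }
    return $word;
}

echo fix_prefix("MACDONALD"), "\n"; // prints: MacDonald
echo fix_prefix("mcgregor"), "\n";  // prints: McGregor
echo fix_prefix("o'neill"), "\n";   // prints: O'Neill
echo fix_prefix("smith"), "\n";     // prints: Smith
```

The check is deliberately crude: any word starting with those letters is treated as a name, so fix_prefix("machine") gives 'MacHine'.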

But that messes up ‘Macclesfield’, making it ‘MacClesfield’, so the next few lines undo what we just did for another list of ‘specials’, which you can also add to if you wish.
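
A slightly different way to get the same result, shown here only as a sketch, is to check the exception list first and skip the prefix rule altogether for those words. The function name tidy_mac_word() is invented for this example and it only handles the 'Mac' prefix.

```php
<?php
// Hypothetical helper: check the exception list first, then apply the "Mac" rule.
function tidy_mac_word($word) {
    $exceptions = array("macken", "macclesfield", "machynlleth");
    $word = ucfirst(strtolower($word));
    if (in_array(strtolower($word), $exceptions)) {
        return $word; // e.g. "Macclesfield" is left as it is
    }
    if (strpos($word, "Mac") === 0) {
        // capitalise whatever follows the prefix
        $word = "Mac" . ucfirst((string) substr($word, 3));
    }
    return $word;
}

echo tidy_mac_word("MACCLESFIELD"), "\n"; // prints: Macclesfield
echo tidy_mac_word("macdonald"), "\n";    // prints: MacDonald
```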

Then we just make some short linking words (mostly prepositions and articles, plus ‘and’) lower case, so you end up with ‘Walton le Dale’ or ‘Stratford upon Avon’.
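
To see that last rule working by itself, here is a minimal sketch. The helper lower_small_words() is a made-up name for this example; it uses the same word list as the function above.

```php
<?php
// Hypothetical helper: capitalise each word, but keep short linking words lower case.
function lower_small_words($text) {
    $small = array("de", "la", "le", "on", "of", "and", "under", "upon");
    $words = explode(" ", $text);
    foreach ($words as $i => $word) {
        $word = ucfirst(strtolower($word));
        if (in_array(strtolower($word), $small)) {
            $word = strtolower($word);
        }
        $words[$i] = $word; // write the result back into the array
    }
    return implode(" ", $words);
}

echo lower_small_words("WALTON LE DALE"), "\n";      // prints: Walton le Dale
echo lower_small_words("stratford UPON avon"), "\n"; // prints: Stratford upon Avon
```

One thing to watch: the rule applies to every word, including the first, so 'De Montfort Street' comes out as 'de Montfort Street'. If that matters for your data, you could skip the check when $i is 0.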

Then we put the text back together and return it. Easy. Help yourself if you find this function useful. Any links back or other acknowledgements would be more than welcome, but aren't necessary.

Leave a comment if you find any bugs, or you think you can improve my code.
