Contact_Vcard_Parse

  1. Quick Instructions
  2. Introduction to vCard
  3. How Contact_Vcard_Parse Returns Data
  4. Known Issues and Bug Reports
  5. Other PHP Code For The vCard Format

Quick Instructions

For the impatient. :-)

  1. Download and un-compress Contact_Vcard_Parse from the PEAR archive.
  2. Include Contact_Vcard_Parse.php in your PHP script.
  3. Instantiate a new Contact_Vcard_Parse object.
  4. Use the fromFile() method to parse any file that contains one or more vCards; try the sample.vcf file for a start. Contact_Vcard_Parse should work with both 2.1 and 3.0 vCard files.
  5. Use print_r() to view the resulting array of data from the parsed file.
  6. Do what you want with the data, such as inserting it into a database table.

Example code:

<?php
    
    // include the class file
    require_once 'Contact_Vcard_Parse.php';
    
    // instantiate a parser object
    $parse = new Contact_Vcard_Parse();
    
    // parse a vCard file and store the data
    // in $cardinfo
    $cardinfo = $parse->fromFile('sample.vcf');
    
    // view the card info array
    echo '<pre>';
    print_r($cardinfo);
    echo '</pre>';
    
?>

Introduction to vCard

For a full overview of the vCard format, refer to Internet Engineering Task Force RFC 2426.

Basically, a vCard is a plain text file that contains an "electronic business card" of contact information suitable for including in an address book. The vCard format is a standardized way of trading personal and organizational contact information. A vCard file can have one or more vCard entries in it.

What's In A vCard Entry?

These are some, but not all, of the data elements of a vCard:

How Contact_Vcard_Parse Returns Data

Contact_Vcard_Parse reads a file or block of text for vCard data, then converts that data into a series of nested arrays. I used to present a detailed prose explanation of the array, but I think it's easier to just give a generic outline of the array:

$parse_result = array (
    [int_cardnumber] => array (
        [string_datatype] => array (
            [int_occurrencenumber] => array (
                ["param"] => array (
                    [string_paramname] => array (
                        [int_repetitionnumber] => string_paramtext
                    )
                )
                ["value"] => array (
                    [int_partnumber] => array (
                        [int_repetitionnumber] => string_valuetext
                    )
                )
            )
        )
    )
)
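
Reading one value out of that structure takes a long chain of indexes, so a small accessor can help. The following is a minimal sketch, not part of Contact_Vcard_Parse: the function name vcard_value() is hypothetical, and it assumes the index order card, data type, occurrence, "value", part, repetition, as seen in the parsed output for Bolivar's card. The array is built by hand here so the snippet runs without the parser.

```php
<?php
// Hypothetical helper (not part of Contact_Vcard_Parse): fetch one text
// value from the parsed array, or return $default when any level is missing.
function vcard_value(array $cards, int $card, string $type,
    int $occurrence = 0, int $part = 0, int $repetition = 0, $default = null)
{
    return $cards[$card][$type][$occurrence]['value'][$part][$repetition]
        ?? $default;
}

// A hand-built array in the parser's shape, standing in for fromFile() output.
$cards = array(
    0 => array(
        'FN' => array(
            0 => array(
                'param' => array(),
                'value' => array(0 => array(0 => 'Bolivar Shagnasty')),
            ),
        ),
    ),
);

echo vcard_value($cards, 0, 'FN'), "\n";
echo vcard_value($cards, 0, 'TEL', 0, 0, 0, '(none)'), "\n";
```

The null-coalescing operator (PHP 7 or later) keeps a missing element, such as a card with no TEL line, from raising an undefined-index notice.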

By way of example, let's take a look at the vCard of my friend Bolivar Shagnasty.

BEGIN:VCARD
VERSION:3.0
N:Shagnasty;Bolivar;Odysseus;Mr.;III,B.S.
FN:Bolivar Shagnasty
ADR;TYPE=HOME,WORK:;;123 Main,Apartment 101;Beverly Hills;CA;90210
EMAIL;TYPE=HOME;TYPE=WORK:boshag@example.com
EMAIL;TYPE=PREF:boshag@ciaweb.net
END:VCARD

This is a pretty simple vCard: my buddy Bolivar's name, one address (looks like Bolivar works from home), two email addresses (one for work and home, and one as his "preferred" address). This simple vCard, when it gets parsed, looks like this:

Array
(
    [0] => Array
        (
            [VERSION] => Array
                (
                    [0] => Array
                        (
                            [param] => Array
                                (
                                )

                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [0] => 3.0
                                        )

                                )

                        )

                )

            [N] => Array
                (
                    [0] => Array
                        (
                            [param] => Array
                                (
                                )

                            [value] => Array
                                (
                                    [0] => Array // family
                                        (
                                            [0] => Shagnasty
                                        )

                                    [1] => Array // given (first)
                                        (
                                            [0] => Bolivar
                                        )

                                    [2] => Array // additional or middle
                                        (
                                            [0] => Odysseus
                                        )

                                    [3] => Array // honorific prefix
                                        (
                                            [0] => Mr.
                                        )

                                    [4] => Array // honorific suffix
                                        (
                                            [0] => III
                                            [1] => B.S.
                                        )

                                )

                        )

                )

            [FN] => Array
                (
                    [0] => Array
                        (
                            [param] => Array
                                (
                                )

                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [0] => Bolivar Shagnasty
                                        )

                                )

                        )

                )

            [ADR] => Array
                (
                    [0] => Array
                        (
                            [param] => Array
                                (
                                    [TYPE] => Array
                                        (
                                            [0] => HOME
                                            [1] => WORK
                                        )

                                )

                            [value] => Array
                                (
                                    [0] => Array // p.o. box
                                        (
                                            [0] => 
                                        )

                                    [1] => Array // extended
                                        (
                                            [0] => 
                                        )

                                    [2] => Array // street
                                        (
                                            [0] => 123 Main
                                            [1] => Apartment 101
                                        )

                                    [3] => Array // locality or city
                                        (
                                            [0] => Beverly Hills
                                        )

                                    [4] => Array // region, state, or province
                                        (
                                            [0] => CA
                                        )

                                    [5] => Array // postal code
                                        (
                                            [0] => 90210
                                        )

                                    [6] => Array // country
                                        (
                                            [0] => 
                                        )

                                )

                        )

                )

            [EMAIL] => Array
                (
                    [0] => Array
                        (
                            [param] => Array
                                (
                                    [TYPE] => Array
                                        (
                                            [0] => HOME
                                            [1] => WORK
                                        )

                                )

                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [0] => boshag@example.com
                                        )

                                )

                        )

                    [1] => Array
                        (
                            [param] => Array
                                (
                                    [TYPE] => Array
                                        (
                                            [0] => PREF
                                        )

                                )

                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [0] => boshag@ciaweb.net
                                        )

                                )

                        )

                )

        )

)

Sweet Jebus! That's an ugly mess. But it retains every bit of info about the vCard so you can do what you like with it. It keeps (separately) every element and component so you can see the underlying structure of the vCard.
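
As one illustration of doing what you like with it, here is a rough sketch that pulls every email address and its TYPE parameters out of a single parsed card. The helper name vcard_emails() is made up for this example, and the input array is typed in by hand to mirror the EMAIL portion of the output above, so nothing here depends on the parser being installed.

```php
<?php
// Hypothetical helper: collect every EMAIL element of one parsed card as
// a list of array('address' => string, 'types' => array of strings).
function vcard_emails(array $card): array
{
    $emails = array();
    foreach ($card['EMAIL'] ?? array() as $element) {
        $emails[] = array(
            // EMAIL has a single part with a single repetition.
            'address' => $element['value'][0][0] ?? '',
            // TYPE may be absent, in which case the list is empty.
            'types'   => $element['param']['TYPE'] ?? array(),
        );
    }
    return $emails;
}

// Hand-built stand-in for the EMAIL part of one parsed card.
$card = array(
    'EMAIL' => array(
        array(
            'param' => array('TYPE' => array('HOME', 'WORK')),
            'value' => array(array('boshag@example.com')),
        ),
        array(
            'param' => array('TYPE' => array('PREF')),
            'value' => array(array('boshag@ciaweb.net')),
        ),
    ),
);

foreach (vcard_emails($card) as $email) {
    echo $email['address'], ' [', implode(', ', $email['types']), "]\n";
}
```

With real parser output you would pass one card at a time, for example $cardinfo[0].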

Yes, I know it's a deeply-nested array set, and is ugly and probably inefficient. The problem (or genius?) of the vCard format is that just about every part of a vCard element can have multiple values. While this makes the vCard format very flexible, it makes it a little difficult to parse and interpret in a simple fashion. The easiest way I could think of was a series of nested arrays. An object-oriented approach might be better, but even then you're going to have nested objects or nested arrays within the vCard object to represent multiple values of a vCard data element.
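
If a flatter view is enough for your application, the nesting can be collapsed after parsing. The sketch below is only one possible approach, with a hypothetical function name: it joins the repetitions of each part with commas and the parts with semicolons, which are the same delimiters the vCard text uses. It does not escape commas or semicolons that occur inside the values themselves.

```php
<?php
// Hypothetical helper: flatten one element's 'value' array into a string.
// Repetitions are joined with ',' and parts with ';', as in vCard text.
function vcard_join_value(array $value): string
{
    $parts = array();
    foreach ($value as $repetitions) {
        $parts[] = implode(',', $repetitions);
    }
    return implode(';', $parts);
}

// The 'value' array of the N element from the parsed example.
$n = array(
    array('Shagnasty'),
    array('Bolivar'),
    array('Odysseus'),
    array('Mr.'),
    array('III', 'B.S.'),
);

echo vcard_join_value($n), "\n";
```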

Known Issues and Bug Reports

When I wrote this parser, my primary goal was to be able to read vCard files produced by the Mac OS X Address Book application. However, it looks like Address Book puts some weird character after every single text character in the output, in addition to some weird line endings. If you want to use .vcf files generated by the Mac OS X Address Book, you might need to massage the file in BBEdit or TextWrangler first; turn on "show invisibles" to see the offending characters, then do a search-and-replace to delete them all at once (or perhaps use "Zap Gremlins").

UPDATE: David Weingart writes, "That's probably Unicode. In my extremely limited testing, it looks like in some cases you get plain vanilla ISO Latin 1, but if there are any high ascii characters in the entry, they export UTF 16 (double-byte) Unicode." Thanks, David. (Contact_Vcard_Parse does not do Unicode at this time.)
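
If the stray characters really are UTF-16, converting the text before parsing may be an alternative to cleaning it by hand. This is an untested-against-Address-Book sketch with a hypothetical function name: it assumes the export starts with a byte-order mark and that PHP's mbstring extension is available. Non-ASCII characters will come out as multi-byte UTF-8, and how well the parser copes with those is not something this example establishes.

```php
<?php
// Hypothetical helper: if the text starts with a UTF-16 byte-order mark,
// strip the mark and convert the rest to UTF-8; otherwise return it as-is.
function vcard_text_to_utf8(string $raw): string
{
    $bom = substr($raw, 0, 2);
    if ($bom === "\xFF\xFE") {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16LE');
    }
    if ($bom === "\xFE\xFF") {
        return mb_convert_encoding(substr($raw, 2), 'UTF-8', 'UTF-16BE');
    }
    return $raw;
}

// Simulate a little-endian UTF-16 export of one vCard line.
$utf16 = "\xFF\xFE" . mb_convert_encoding('FN:Bolivar Shagnasty', 'UTF-16LE', 'UTF-8');

echo vcard_text_to_utf8($utf16), "\n";
```

In a real script, $raw would come from file_get_contents() on the exported .vcf file.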

Contact_Vcard_Parse does not validate the information or formatting in the vCard (although it does decode quoted-printable text). In the spirit of "be lenient in what you accept and strict in what you produce", Contact_Vcard_Parse should be able to read just about anything from a vCard file, but it's up to you as the programmer to make sense of the data.
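
Because validation is left to you, a simple check after parsing can catch obviously incomplete cards. As a sketch only (the function name is hypothetical): RFC 2426 requires the VERSION, FN, and N types in a 3.0 vCard, so the code below reports which of those a parsed card lacks. It treats a type as present when the first repetition of its first part is non-empty, which is a simplifying assumption.

```php
<?php
// Hypothetical helper: list the required types that a parsed card lacks.
function vcard_missing_types(array $card,
    array $required = array('VERSION', 'FN', 'N')): array
{
    $missing = array();
    foreach ($required as $type) {
        // Look at the first occurrence, first part, first repetition.
        $text = $card[$type][0]['value'][0][0] ?? '';
        if (trim($text) === '') {
            $missing[] = $type;
        }
    }
    return $missing;
}

// Hand-built card in the parser's shape, deliberately missing N.
$card = array(
    'VERSION' => array(array('param' => array(), 'value' => array(array('3.0')))),
    'FN'      => array(array('param' => array(), 'value' => array(array('Bolivar Shagnasty')))),
);

echo implode(', ', vcard_missing_types($card)), "\n";
```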

Contact_Vcard_Parse should work on files with any kind of line endings (Mac \r, Unix \n, and DOS \r\n) automatically. It also unfolds lines automatically, so data elements spread across multiple lines should come through OK.
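
To make "unfolding" concrete, here is an illustrative sketch. It is not the library's own code, and the function name is hypothetical. It follows the RFC 2426 rule: a line break followed by a single space or tab marks a continuation, and both the break and that one whitespace character are removed.

```php
<?php
// Sketch only, not the library's code: normalise line endings to "\n",
// then unfold by dropping each newline that is followed by a space or tab.
function vcard_unfold(string $text): string
{
    // Replace DOS "\r\n" first so the lone "\r" pass cannot split it.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);
    return preg_replace('/\n[ \t]/', '', $text);
}

// One folded NOTE line with DOS endings, then Mac and Unix endings.
$folded = "NOTE:This is a long note that has\r\n  been folded.\rFN:Bolivar Shagnasty\n";

echo vcard_unfold($folded);
```

Only one whitespace character is removed per fold, so the second space in the sample survives and the note reads "has been folded."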

If you discover a new bug or want to contribute code to Contact_Vcard_Parse, contact Paul M. Jones at pjones at ciaweb dot net; the subject line should start with [VCARD].

Other PHP Code For The vCard Format

Frank Hellwig has a 2.1/3.0 parser and address-book page generator. See his site at http://vcardphp.sourceforge.net/. My parser includes one of his class methods.

Kai Blankenhorn has a 2.1 card generator (not a parser). See his work at http://www.bitfolge.de/?s=phpvcard.

Flaimo has a vCard generator, too. See http://flaimo.com/php_scripts.php (scroll down past the iCalendar stuff).

HORDE has a vCard data element in their application framework, but I don't quite see how to use it outside that framework. The doc pages for it are at http://dev.horde.org/api/horde/dev-doxygen/html/classData__vcard.html.

Of course, you can always Google for more vCard stuff under PHP, too: http://www.google.com/search?q=vcard+php.