PHP Snippet: xml2array

4 June 2004 • code | PHP • PermaLink

I’m digging up some of my old code snippets from various places and putting them here for posterity (and possible use by you, the viewer).

Here’s one that takes a simple XML file and converts it into an array. In this case, “simple” means “no attributes”. It may barf on other XML files too, so I make no guarantees.

However, as an example, it will turn this XML file:

<people>
  <person>
    <name>
      <first>Colin</first>
      <middle>M</middle>
      <last>Viebrock</last>
    </name>
    <birthday>
      <year>1970</year>
      <month>9</month>
      <day>26</day>
    </birthday>
  </person>
  <person>
    <name>
      <first>Joe</first>
      <last>Smith</last>
    </name>
    <birthday>
      <year>1965</year>
      <month>1</month>
      <day>12</day>
    </birthday>
  </person>
</people>

into this PHP array:

Array (
  [people] => Array (
    [0] => Array (
      [person] => Array (
        [0] => Array (
          [name] => Array (
            [first] => Colin
            [middle] => M
            [last] => Viebrock
          )
          [birthday] => Array (
            [year] => 1970
            [month] => 9
            [day] => 26
          )
        )
        [1] => Array (
          [name] => Array (
            [first] => Joe
            [last] => Smith
          )
          [birthday] => Array (
            [year] => 1965
            [month] => 1
            [day] => 12
          )
        )
      )
    )
  )
)

Scanning the PHP manual, I see that this is pretty much the same functionality provided by the SimpleXML extension, which wasn't around at the time I wrote this.
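For comparison, here is a minimal sketch of the SimpleXML route applied to the same sample file. It assumes the SimpleXML and JSON extensions are enabled; the JSON round trip is a common shortcut for getting plain arrays, not something taken from the original snippet, and the variable names are only illustrative.

```php
<?php
// Sketch only: assumes the SimpleXML and JSON extensions are available.
$xml = <<<XML
<people>
  <person>
    <name><first>Colin</first><middle>M</middle><last>Viebrock</last></name>
    <birthday><year>1970</year><month>9</month><day>26</day></birthday>
  </person>
  <person>
    <name><first>Joe</first><last>Smith</last></name>
    <birthday><year>1965</year><month>1</month><day>12</day></birthday>
  </person>
</people>
XML;

$doc = simplexml_load_string($xml);
if ($doc === false) {
    throw new RuntimeException('Could not parse XML');
}

// Encoding the SimpleXMLElement as JSON and decoding it again
// yields nested PHP arrays; all leaf values come back as strings.
$people = json_decode(json_encode($doc), true);

echo $people['person'][0]['name']['first'], "\n";    // Colin
echo $people['person'][1]['birthday']['year'], "\n"; // 1965
```

Note that the root element name is not a key in the result, and that a tag appearing only once becomes a plain value while a repeated tag becomes a numerically indexed list, so the shape of the array depends on the data.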

Anyway, my source code is here.

Comments

  1. Hi Colin, you may be interested in my XML library. It has a similar purpose and handles all data-oriented XML (including attributes).

    Also, your code collapses all elements in repeated tags (XML “arrays”) to the last instance of that tag. For example, your code parses:

    <?xml version="1.0" ?>
    <foo>
    <bar>baz</bar>
    <bar>turtle</bar>
    </foo>

    into: array('foo'=>array('bar'=>'turtle')).

    If you’d like, you can test out the library by using my XML to PHP translator. The library also includes a “serializer” that generates XML from PHP data structures. I generate my RSS feed using the serializer, for instance.
    Keith
    4 June 2004, 16:23 • PermaLink
  2. You might also want to check out PEAR’s XML_Unserializer – it can mimic SimpleXML’s behaviour.
    – Davey
    Davey
    4 June 2004, 17:36 • PermaLink
  3. How do you cope with mixed content models?
    ben
    6 June 2004, 08:31 • PermaLink
  4. Ben, if you’re asking me, my library doesn’t. That’s why I was specific about “data-oriented” XML above.
    Keith
    6 June 2004, 12:46 • PermaLink
  5. Keith, actually, I wanted to ask Colin, but it’s good to know about your scripts too, thanks!
    ben
    7 June 2004, 10:56 • PermaLink
  6. Hi Ben,

    I’m not actually sure what you mean by “mixed content models”. However, like I said, this is a quick little PHP hack. As Keith points out, it will fail if there are repeated tags inside one element. Heck, even I said that I made no guarantees.

    I wrote this a while ago and had it on an old site of mine. Someone found it via the Internet Archive, so I resurrected it.

    I’d probably use something like SimpleXML, or the PEAR XML_Unserializer, if I needed the same functionality today.
    Colin
    7 June 2004, 12:02 • PermaLink
  7. Pretty cool so far, but didn’t you know of PHP’s native XML function xml_parse_into_struct?
    Christian Machmeier
    8 June 2004, 13:53 • PermaLink
  8. xml_parse_into_struct isn’t actually very helpful. It basically gives you a tokenization of the XML file that you then have to parse yourself.

    What Colin’s code and my code try to do is give you a “ready to use” PHP data structure that represents the XML you’re working with. In one line of code :)
    Keith
    8 June 2004, 18:01 • PermaLink
  9. Thanks Colin,

    today I would use different techniques, too, but I’m always interested in how different people solve problems in different ways…

    By “mixed content model” I mean the content model of, for example, a p element in HTML: you have text, then an em element with text inside, and then text again.

    Mixed content models are the reason XML exists at all: in the history of text markup, you need this mixing and nesting. You can’t easily use a database solution for this kind of information.

    Thanks anyway! ben_
    ben
    10 June 2004, 03:38 • PermaLink
  10. How about this?
    http://univerlife.com/sml.php

    IMHO it’s simpler and faster
    Foshvad
    1 July 2004, 03:06 • PermaLink
  11. Foshvad, very clever code! I’m impressed! I timed it and it is about twice as fast as my code, which uses the event-based parser.

    I have two main issues with it though, the most important being that it doesn’t handle attributes. I also prefer not having a dummy array at every level (in other words, you have to do things like $arr['people'][0]['person'][0]['name'][0]['first'] instead of $arr['people']['person'][0]['name']['first'] which mine lets you do), but that’s largely personal preference.

    A while ago, I used to use a parser based on xml_parse_into_struct but I switched to the event-based parser from it because the event-based parser was better for large documents and IIRC was faster as well. Things may have changed with PHP since I wrote my code.
    Keith
    1 July 2004, 14:24 • PermaLink
  12. I ended up using the PEAR XML_Unserializer for the job of grabbing map data from a local street mapping organisation, linking the image URLs, and getting route map data back as well (directions on how to drive from a to b using either the shortest or fastest route).

    There is an article on Sitepoint which was a bit of a ‘heads up’ (http://www.sitepoint.com/article/xml-php-pear-xml_serializer/1) on how XML_Unserializer etc. works.

    Colin’s xml_array works quite nicely with configuration files, which I use with my personal website.
    Jacques
    21 July 2004, 02:26 • PermaLink
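
To make the xml_parse_into_struct discussion in the comments concrete, here is a rough sketch of folding its flat token list into a nested array while keeping repeated tags as a list instead of collapsing them to the last one. The function names fold_tokens() and xml_tokens_to_array() are hypothetical; this is not Colin's, Keith's, or Foshvad's code, and it ignores attributes and mixed content just as the thread describes.

```php
<?php
// Illustrative sketch only; both function names are made up for this example.

// Recursively fold tokens, starting at index $i, into a nested array.
function fold_tokens(array $tokens, int &$i): array
{
    $node = [];
    $seen = []; // how many times each tag has appeared at this level
    while ($i < count($tokens)) {
        $token = $tokens[$i++];
        if ($token['type'] === 'close') {
            return $node; // end of the element we were called for
        }
        if ($token['type'] === 'cdata') {
            continue; // text between child elements (mixed content) is dropped
        }
        // 'open' tokens have children to recurse into; 'complete' ones are leaves.
        $value = $token['type'] === 'open'
            ? fold_tokens($tokens, $i)
            : ($token['value'] ?? '');

        $tag = $token['tag'];
        $seen[$tag] = ($seen[$tag] ?? 0) + 1;
        if ($seen[$tag] === 1) {
            $node[$tag] = $value;                // first occurrence: plain value
        } elseif ($seen[$tag] === 2) {
            $node[$tag] = [$node[$tag], $value]; // second: promote to a list
        } else {
            $node[$tag][] = $value;              // later ones: append
        }
    }
    return $node;
}

function xml_tokens_to_array(string $xml): array
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag case
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);   // skip whitespace-only text
    if (!xml_parse_into_struct($parser, $xml, $tokens)) {
        throw new RuntimeException('Could not parse XML');
    }
    $i = 0;
    return fold_tokens($tokens, $i);
}

$xml = <<<XML
<?xml version="1.0" ?>
<foo>
<bar>baz</bar>
<bar>turtle</bar>
</foo>
XML;

$result = xml_tokens_to_array($xml);
echo $result['foo']['bar'][0], "\n"; // baz
echo $result['foo']['bar'][1], "\n"; // turtle
```

A tag that appears once stays a plain value and only becomes a list on its second occurrence, which avoids the dummy array at every level but means callers need to check which shape they got.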