XPath = the smart way to XML in ActionScript

Posted 2008-02-15 at 11:20 by Holaso
Getting data from XML
To gain access to the data that we want to use we need to traverse the XML DOM (Document Object Model). To traverse the XML DOM we use the XML object properties like firstChild and childNodes. If you have used the XML object before then I’m sure you have created statements that looked something like this:
myXML.childNodes[1].childNodes[3].firstChild.nodeValue;
Can you tell what data this statement refers to without dissecting the XML structure first? Every node is assumed to sit at a hard-coded position. Imagine you have the following XML data structure loaded in an XML object called xmlBooks:
<books>
<book id='0'>
<author>J.R.R. Tolkien</author>
<title>The Lord of the Rings</title>
<isbn>0618346252</isbn>
<price>34.95</price>
</book>
<book id='1'>
<author>Dan Brown</author>
<title>The Da Vinci Code</title>
<isbn>1400079179</isbn>
<price>17.95</price>
</book>
</books>
If we want to retrieve the author of the second book, The Da Vinci Code, we could use the following statement to access that data using the XML object:
xmlBooks.firstChild.childNodes[1].childNodes[0].firstChild.nodeValue;
What we’re really doing here is telling the XML object to get a reference to its firstChild node, which is the <books> tag in our data example. Then, from that node, we get the second <book> node by asking for index 1 of its childNodes array (remember that array indexes start at zero, so index 1 is the second entry). Then we target the first node in the <book> tag (index 0), which is the <author> tag, and read the nodeValue of the firstChild node of that tag to get the actual data we want. All of this assumes ignoreWhite was set to true before the data was parsed; otherwise the whitespace between tags shows up as extra text nodes and shifts every index.
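To make the walk through the DOM easier to follow, here is a minimal sketch that stores each step in its own variable. The XML is parsed from an inline string (a copy of the books data, with the title tag properly closed) so it runs without an external file; the variable names are only illustrative. Note that <author> is the first child of <book>, so it sits at index 0.

```actionscript
// Inline copy of the books data, so the sketch needs no external file.
var source:String = "<books>"
    + "<book id='0'><author>J.R.R. Tolkien</author><title>The Lord of the Rings</title>"
    + "<isbn>0618346252</isbn><price>34.95</price></book>"
    + "<book id='1'><author>Dan Brown</author><title>The Da Vinci Code</title>"
    + "<isbn>1400079179</isbn><price>17.95</price></book>"
    + "</books>";

var xmlBooks:XML = new XML();
xmlBooks.ignoreWhite = true;   // must be set before the data is parsed
xmlBooks.parseXML(source);

// One hop per line instead of one long chain.
var booksNode:XMLNode  = xmlBooks.firstChild;       // <books>
var secondBook:XMLNode = booksNode.childNodes[1];   // second <book>, index 1
var authorNode:XMLNode = secondBook.childNodes[0];  // <author>, index 0
var authorText:XMLNode = authorNode.firstChild;     // the text node inside <author>

trace(authorText.nodeValue); // Dan Brown
```

Even spelled out like this, every hop still depends on a hard-coded position.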
Did you actually follow all of that? There must be a better way of doing this!
Introducing XPath
I’m sure you’ve worked with files and folders on a computer drive before. Files and folders are usually stored in other folders; this gives structure to your drive and prevents it from turning into a big mess. If I were to show you the path to a file, you’d probably know exactly which file I was talking about:
C:\Program Files\Macromedia\Flash 8\Readme.txt
Here we’re talking about a file called Readme.txt that is located on the C: drive in the Flash 8 program folder. Imagine if we couldn’t specify folders this way but had to use some obscure syntax like this:
myDrive.folders[4].folders[6].folders[2].files[4].name;
What would happen if the file at index 2 got deleted? Would that mean that our file would now be at index 3? If that were true, all our shell scripts, batch files, program code or whatever else used that file would have to be adjusted just because a file got deleted!
With XPath, on the other hand, you specify a query to retrieve data. This query is really just a special kind of path, much like the one you would use to access the files on your drive (hence the name XPath, which is derived from XML Path). Taking another look at our books example from the previous section: if we want to retrieve the name of the author of the second book using an XPath query, we could specify a path, or query, that looks something like this:
"/books/book[2]/author"
Now, that looks a lot more civilized. If you have never seen an XPath query before, the number 2 between the brackets might look odd at first, since you would expect a 1 to target the second book element. This is because positions in XPath are one-based, not zero-based like array indexes in ActionScript.
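The difference in indexing is easiest to see side by side. The sketch below is hedged: it assumes the XFactorstudio XPath class introduced later in this post is on the classpath, that selectNodesAsString behaves as described there, and that the books data lives in a file called books.xml.

```actionscript
import com.xfactorstudio.xml.xpath.XPath;

var xmlBooks:XML = new XML();
xmlBooks.ignoreWhite = true;
xmlBooks.onLoad = function(success:Boolean):Void {
    if (!success) {
        trace("could not load books.xml");
        return;
    }
    // XML object: zero-based, so the second book is childNodes[1]
    // and its first child, <author>, is childNodes[0].
    var viaDom:String = xmlBooks.firstChild.childNodes[1].childNodes[0].firstChild.nodeValue;

    // XPath: one-based, so the second book is book[2].
    var viaXPath:Array = XPath.selectNodesAsString(xmlBooks, "/books/book[2]/author");

    trace(viaDom);
    trace(viaXPath);
};
xmlBooks.load("books.xml"); // assumed file name
```

Both statements target the same author element; only the XPath version says so in its own text.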
Once you are comfortable with XPath, you will find that a query you wrote a couple of months earlier reads far more pleasantly than those cryptic statements you had to write with the XML object: XPath gives much more context to what you are trying to accomplish. Furthermore, if the layout of your XML data changes, your code is much less likely to break, because an XPath query addresses nodes by name and nesting rather than by their raw position among all child nodes. Adding an element to each book, or reordering the elements inside it, leaves /books/book[2]/author intact (the [2] still depends on the order of the book elements themselves). With the XML object you had to target a specific node at a specific location, like index 1 of the childNodes array, so a change in the data structure would probably, though not necessarily, force a change in your code as well.
Adobe’s XPathAPI
The Adobe XPathAPI is not a fully compliant implementation of the XPath specification. It is, however, good enough for most applications and definitely a better choice than the plain old vanilla XML object.
One of its major shortfalls is that it doesn’t support iterators for retrieving child nodes. Rather than waste your time with Adobe’s XPathAPI, I want to go straight for the useful stuff and refer you to the XPath implementation created by XFactorstudio…
XFactorstudio’s XPath
XFactorstudio’s implementation of XPath seems to be the most complete and robust one out there for ActionScript 2.0. On top of that, it is available not only for Flash but also for the excellent open source ActionScript 2.0 compiler, MTASC.
All the things that are missing in Adobe’s implementation of the XPath specification are implemented in XFactorstudio’s XPath API. With it you can use iterators, attributes, functions and far more advanced XPath expressions than the Adobe XPathAPI is capable of.
import com.xfactorstudio.xml.xpath.XPath;

var books : XML;
books = new XML();
books.ignoreWhite = true; // skip whitespace-only text nodes while parsing
books.onLoad = function(success : Boolean) : Void
{
    var titles : Array;
    // select the text of every title element as an array of strings
    titles = XPath.selectNodesAsString(books, "/books/book/title");
    trace(titles);

    // alternatively, if we wanted to focus on a specific node…
    // (selectNodes returns XMLNode objects rather than strings)
    titles = XPath.selectNodes(books, "/books/book[2]/title");
    trace(titles);
};
books.load("books.xml");
The above example uses the selectNodesAsString method to retrieve the titles of all the books. We are going to take a look at this method in more detail in a later section. The Adobe XPathAPI and the XFactorstudio XPath API take different approaches: not only is the actual method call different, the query syntax used to get the result differs as well. You can rely on XFactorstudio’s query syntax to be true to the actual XPath specification, whereas Adobe’s XPathAPI class needs fiddling from time to time to get a satisfying result.
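For comparison, here is a rough sketch of the same title lookup using Adobe’s class. It assumes the mx.xpath.XPathAPI class that ships with Flash 8 and its selectNodeList method, called the way it is commonly used: with the root element as the context node rather than the XML object itself. Treat it as an illustration, not a reference for that API.

```actionscript
import mx.xpath.XPathAPI;

var books:XML = new XML();
books.ignoreWhite = true;
books.onLoad = function(success:Boolean):Void {
    if (!success) {
        trace("could not load books.xml");
        return;
    }
    // selectNodeList returns an array of XMLNode objects,
    // so the text still has to be read from each node.
    var titleNodes:Array = XPathAPI.selectNodeList(books.firstChild, "/books/book/title");
    for (var i:Number = 0; i < titleNodes.length; i++) {
        trace(titleNodes[i].firstChild.nodeValue);
    }
};
books.load("books.xml"); // assumed file name
```

Note the two differences from the XFactorstudio example: the context node passed in, and the extra loop needed to get from nodes to strings.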
selectNodes
Syntax:
XPath.selectNodes(node : XMLNode, query : String) : Array
Description:
The selectNodes method is one you will probably use a lot with the XFactorstudio XPath API. With it you can select all the nodes that match a given query and receive an array of XMLNode objects for further parsing. Best of all, as with all the other XFactorstudio XPath API methods, you can use advanced XPath queries when making your selection; you’re not limited to the basic stuff that the Adobe XPathAPI class has to offer.
So, if we want to select all the book elements from the XML data as XMLNode objects, we could use a query like this:
XPath.selectNodes(books, "/books/book");
The above example returns an array with all the book elements found under the books node. With more advanced queries you can get at almost any data stored in the XML using a single expression. For example, if we want to select the values of the id attributes of all the book nodes, we could use a query like this:
XPath.selectNodes(books, "/books/book/@id");
As you can see, the selectNodes method is a very powerful method to use. It gives you direct access to a complete dataset stored in the XML.
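A short sketch of what "further parsing" might look like in practice: loop over the array that selectNodes returns and read something from each XMLNode. The loop uses only standard XMLNode members (nodeName and attributes); the file name books.xml and the variable names are assumptions.

```actionscript
import com.xfactorstudio.xml.xpath.XPath;

var books:XML = new XML();
books.ignoreWhite = true;
books.onLoad = function(success:Boolean):Void {
    if (!success) {
        trace("could not load books.xml");
        return;
    }
    // Every <book> element under <books>, as XMLNode objects.
    var bookNodes:Array = XPath.selectNodes(books, "/books/book");
    for (var i:Number = 0; i < bookNodes.length; i++) {
        var book:XMLNode = bookNodes[i];
        // attributes is a plain object keyed by attribute name
        trace(book.nodeName + " with id " + book.attributes.id);
    }
};
books.load("books.xml"); // assumed file name
```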
selectNodesAsString, selectNodesAsNumber, selectNodesAsBoolean
Syntax:
XPath.selectNodesAsString(node : XMLNode, query : String) : Array

XPath.selectNodesAsNumber(node : XMLNode, query : String) : Array

XPath.selectNodesAsBoolean(node : XMLNode, query : String) : Array
Description:
In this section we’ll discuss three methods that are very similar to selectNodes. Where selectNodes returns an array of XMLNode objects, the selectNodesAsString, selectNodesAsNumber and selectNodesAsBoolean methods all return an array with the actual node values. The XMLNode objects returned by selectNodes still need further processing before the data can be used; with any of these three methods you get direct access to the XML data.
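As an illustration of why the typed variants are convenient, the sketch below adds up the prices of all the books. It assumes selectNodesAsNumber returns an array of numeric values as described above; the Number() conversion is a defensive addition in case the entries come back as strings, and books.xml is an assumed file name.

```actionscript
import com.xfactorstudio.xml.xpath.XPath;

var books:XML = new XML();
books.ignoreWhite = true;
books.onLoad = function(success:Boolean):Void {
    if (!success) {
        trace("could not load books.xml");
        return;
    }
    // One entry per <price> element, already converted from text.
    var prices:Array = XPath.selectNodesAsNumber(books, "/books/book/price");
    var total:Number = 0;
    for (var i:Number = 0; i < prices.length; i++) {
        total += Number(prices[i]); // defensive conversion
    }
    trace("number of prices: " + prices.length);
    trace("total price: " + total);
};
books.load("books.xml"); // assumed file name
```

With selectNodes you would first have to dig the text node out of every price element and convert it yourself.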
selectSingleNode
Syntax:
XPath.selectSingleNode(node : XMLNode, path : String) : XMLNode
Description:
With the selectSingleNode method you can easily retrieve data stored in a single XML node. Its purpose is to target one specific node so you can read its value. You use this method to gain access to the actual data stored in the XML data structure, rather than to make a selection and get the result in the form of a record or array.
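A minimal sketch of selectSingleNode in use, reading the title of the second book. It assumes the method is called on the XPath class like the other methods in this post, and it guards against a missing result because the behaviour for a query that matches nothing isn’t covered here.

```actionscript
import com.xfactorstudio.xml.xpath.XPath;

var books:XML = new XML();
books.ignoreWhite = true;
books.onLoad = function(success:Boolean):Void {
    if (!success) {
        trace("could not load books.xml");
        return;
    }
    // A single XMLNode instead of an array of matches.
    var titleNode:XMLNode = XPath.selectSingleNode(books, "/books/book[2]/title");
    if (titleNode != null) {
        // The text lives in the element's first child, a text node.
        trace(titleNode.firstChild.nodeValue);
    } else {
        trace("no matching node");
    }
};
books.load("books.xml"); // assumed file name
```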
Conclusion
Having discussed both APIs, we can conclude that each has its pros and cons. A definite downside of the Adobe XPathAPI is its incomplete implementation of the XPath specification. The XFactorstudio XPath API really shines when it comes to the correctness and robustness of the XPath expressions you can use. On the other hand, when all you need is a simple XPath parsing package, or when you create components to be used within the Adobe Flash environment, the Adobe XPathAPI is definitely the better choice because you eliminate dependencies on external sources. The only real downside of the XFactorstudio XPath API is its sheer size. It adds a whopping 14 KB to the compressed .SWF instead of the 4 KB the Adobe XPathAPI requires.
Using XPath when working with XML data in Flash is a definite plus.
 