Apache XML Beans is a set of tools and class libraries used to generate a JAR library specifically for an XML schema as defined by an XSD file.
The generated library can be used for XML parsing or XML generation conforming to a specified XSD schema.
JAXB and XML Beans are two prominent Java frameworks for XML parsing and generation.
- XML Beans: supports the full XML Schema (XSD) standard. It loads the entire document, but also lets you query only the elements and attributes you are looking for.
- JAXB: loads XML file data into a Java class generated by JAXB as defined by the XSD schema. If the data is ultimately destined for a class with a different definition, it must first pass through this intermediate JAXB class.
This tutorial will cover the use of XML Beans for parsing an XML file.
Download the latest binaries from http://xmlbeans.apache.org/
Install:

    cd /opt
    tar xzf ~/Downloads/xmlbeans-2.5.0.tgz

Add to PATH and CLASSPATH:
File: ~/.bashrc

    #
    # Apache XMLBeans
    #
    if [ -d /opt/xmlbeans-2.5.0 ]
    then
        PATH=$PATH:/opt/xmlbeans-2.5.0/bin
        export CLASSPATH=$CLASSPATH:/opt/xmlbeans-2.5.0/lib/xbean.jar
    fi

XML Beans and Java programming require the installation of Java. See our YoLinux Java Tutorial.
Define an XML schema file (XSD) using your favorite XSD/XML editor:
XML Schema File: corporation.xsd

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
      <xs:element name="root">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="Corporation">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="Name" type="xs:string"/>
                  <xs:element name="Phone" type="xs:string"/>
                  <xs:element name="Fax" type="xs:string"/>
                  <xs:element name="Address" type="xs:string"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <xs:element name="People">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="Employee" maxOccurs="unbounded">
                    <xs:complexType>
                      <xs:sequence>
                        <xs:element name="Name" type="xs:string"/>
                        <xs:choice>
                          <xs:element name="US-W2">
                            <xs:complexType mixed="false">
                              <xs:sequence>
                                <xs:element name="EmpNumber" type="xs:int"/>
                                <xs:element name="Manager" type="xs:string"/>
                                <xs:element name="YearStart" type="xs:string"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                          <xs:element name="US-1099">
                            <xs:complexType mixed="false">
                              <xs:sequence>
                                <xs:element name="SsnNumber" type="xs:string"/>
                                <xs:element name="Phone" type="xs:string" default="1-800-555-1212"/>
                                <xs:element name="CorpName" type="xs:string"/>
                                <xs:element name="CorpAddress" type="xs:string"/>
                                <xs:element name="Relationship" type="ERelation"/>
                              </xs:sequence>
                            </xs:complexType>
                          </xs:element>
                        </xs:choice>
                        <xs:element name="Data">
                          <xs:annotation>
                            <xs:documentation>Data: personnel data.</xs:documentation>
                          </xs:annotation>
                          <xs:complexType mixed="false">
                            <xs:sequence>
                              <xs:element name="WorkPhone" type="xs:string" default="1-800-555-1212"/>
                              <xs:element name="CellPhone" type="xs:string" default="1-800-555-1212"/>
                              <xs:element name="Address" type="xs:string" default="NULL"/>
                            </xs:sequence>
                          </xs:complexType>
                        </xs:element>
                      </xs:sequence>
                      <xs:attribute name="TaxStatus" type="ETaxStatus" use="required"/>
                      <xs:attribute name="Gender" type="EGender" use="required"/>
                      <xs:attribute name="Desc" type="ELocation" use="required"/>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>

      <xs:simpleType name="EGender">
        <xs:restriction base="xs:string">
          <xs:enumeration value="Male"/>
          <xs:enumeration value="Female"/>
        </xs:restriction>
      </xs:simpleType>

      <xs:simpleType name="ELocation">
        <xs:restriction base="xs:string">
          <xs:enumeration value="OnSite"/>
          <xs:enumeration value="OffSite"/>
        </xs:restriction>
      </xs:simpleType>

      <xs:simpleType name="ETaxStatus">
        <xs:restriction base="xs:string">
          <xs:enumeration value="US-W2"/>
          <xs:enumeration value="US-1099"/>
        </xs:restriction>
      </xs:simpleType>

      <xs:simpleType name="ERelation">
        <xs:restriction base="xs:string">
          <xs:enumeration value="CorpToCorp"/>
          <xs:enumeration value="CorpToIndividual"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:schema>

XSD Notes:
- Within a complexType, XSD requires that attribute declarations (xs:attribute) come after the content model (xs:sequence, xs:choice), at the end of the type definition.
- Complex types allow elements in their content and may carry attributes. The "complexType" section describes how the content (defined by elements) is formed. Simple types have no element content and cannot carry attributes. A simpleType is used to define a new type such as an enum, or to set restrictions such as maxLength/minLength, maximum/minimum values, number of characters (length), totalDigits, etc.
- Multiple occurrences of an element are permitted when either of the following is specified:
  - the element attribute maxOccurs="unbounded", or a specific value greater than one such as maxOccurs="10".
  - the element attribute minOccurs="3", a value greater than one. Because maxOccurs defaults to 1, it must then also be set to a value at least as large as minOccurs.
- To indicate that exactly one of the listed elements may be specified: <xs:choice>
- To indicate that the elements must be specified in the order given: <xs:sequence>
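
The rules in these notes can be checked without XML Beans by validating small documents against a schema. The following is a minimal sketch using the JDK's own javax.xml.validation API. The class name XsdNotesDemo and the cut-down schema (a simplified People/Employee fragment, not the full corporation.xsd) are assumptions made for illustration only.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

class XsdNotesDemo {
    // Hypothetical cut-down schema: a repeatable Employee element holding a
    // sequence (Name, then a choice), with the attribute declared last.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='People'><xs:complexType><xs:sequence>"
      + "<xs:element name='Employee' maxOccurs='unbounded'><xs:complexType>"
      + "<xs:sequence>"
      + "<xs:element name='Name' type='xs:string'/>"
      + "<xs:choice>"
      + "<xs:element name='US-W2' type='xs:string'/>"
      + "<xs:element name='US-1099' type='xs:string'/>"
      + "</xs:choice>"
      + "</xs:sequence>"
      + "<xs:attribute name='Gender' type='xs:string' use='required'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML text validates against the schema above. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no error handler set, validation errors are thrown as exceptions.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String oneChoice = "<People><Employee Gender='Male'>"
            + "<Name>A</Name><US-W2>1</US-W2></Employee></People>";
        String bothChoices = "<People><Employee Gender='Male'>"
            + "<Name>A</Name><US-W2>1</US-W2><US-1099>2</US-1099></Employee></People>";
        System.out.println("one choice element:   " + isValid(oneChoice));
        System.out.println("both choice elements: " + isValid(bothChoices));
    }
}
```

A document with several Employee elements passes because of maxOccurs="unbounded". One that supplies both choice elements, puts Name after the choice, or omits the required attribute is rejected.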
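Compile the schema with the XML Beans schema compiler, for example: scomp -out corporation.jar corporation.xsd
This will generate the file corporation.jar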
Most Java IDEs support browsing and using the classes and methods within the JAR file. You can also use the following script to list the available classes and their methods:

File: listmeth

    #!/bin/bash
    echo "List classes and methods available in the JAR file: $1"
    # Loop through the class files in the JAR
    for class in $(jar -tf "$1" | grep '\.class$'); do
        # Replace "/" with "." to derive the class name from the path
        class=${class//\//.}
        # javap - class file disassembler (strip the ".class" suffix first)
        javap -classpath "$1" "${class%.class}"
    done

Use: listmeth corporation.jar

    List classes and methods available in the JAR file: corporation.jar
    public final class noNamespace.ETaxStatus$Factory {
      public static noNamespace.ETaxStatus newValue(java.lang.Object);
      public static noNamespace.ETaxStatus newInstance();
      public static noNamespace.ETaxStatus newInstance(org.apache.xmlbeans.XmlOptions);
      public static noNamespace.ETaxStatus parse(java.lang.String) throws org.apache.xmlbeans.XmlException;
      public static noNamespace.ETaxStatus parse(java.lang.String, org.apache.xmlbeans.XmlOptions)
    ...
    ...
    ...

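
The class-listing half of the script can also be done from Java. The following is a rough sketch using the JDK's java.util.jar API. The class name ListJarClasses and its helper methods are hypothetical, and unlike listmeth it prints only the class names, not the method signatures that javap provides.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class ListJarClasses {
    /** Converts a JAR entry path such as "noNamespace/RootDocument.class" to a class name. */
    static String toClassName(String entryPath) {
        String withoutSuffix =
            entryPath.substring(0, entryPath.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    /** Returns the names of all classes stored in the given JAR file. */
    static List<String> classNames(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String path = entries.nextElement().getName();
                // Skip directories, manifests and other non-class entries
                if (path.endsWith(".class")) {
                    names.add(toClassName(path));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java ListJarClasses <file.jar>");
            return;
        }
        for (String name : classNames(args[0])) {
            System.out.println(name);
        }
    }
}
```

Use: java ListJarClasses corporation.jar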
File: TestXmlBeans.java

    import java.io.File;
    import java.util.ArrayList;

    import noNamespace.RootDocument;
    import noNamespace.RootDocument.Root;
    import noNamespace.RootDocument.Root.People.Employee;
    import noNamespace.RootDocument.Root.People.Employee.Data;

    import org.apache.xmlbeans.XmlException;
    import org.apache.xmlbeans.XmlObject;
    import org.apache.xmlbeans.XmlOptions;

    public class TestXmlBeans
    {
        static public void main (String[] args) throws XmlException
        {
            String fileName = "CorporationMegaX.xml";
            try
            {
                readFile(fileName);
            }
            catch (Exception e)
            {
                System.out.println("Error! Exception caught");
                e.printStackTrace();
            }
        }

        public static void readFile(String fileName) throws XmlException
        {
            try
            {
                java.io.File inputXMLFile = new java.io.File(fileName);
                RootDocument rootDocument = RootDocument.Factory.parse(inputXMLFile);
                Root root = rootDocument.getRoot();

                System.out.println("Corporation: " + root.getCorporation().getName());

                Employee[] employee = root.getPeople().getEmployeeArray();
                System.out.println("There are " + employee.length + " employees.");

                for (int pp = 0; pp < employee.length; pp++)
                {
                    System.out.println("Employee name: " + employee[pp].getName());
                    System.out.println(" Gender: " + employee[pp].getGender().toString());
                    System.out.println(" Location: " + employee[pp].getDesc().toString());
                    System.out.println(" Cell phone: " + employee[pp].getData().getCellPhone());

                    String employeeTaxStatus = employee[pp].getTaxStatus().toString();
                    if(employeeTaxStatus.equalsIgnoreCase("US-W2"))
                    {
                        System.out.println(" Tax status: W2");
                        System.out.println(" Employee number: " + employee[pp].getUSW2().getEmpNumber());
                        System.out.println(" Employee manager: " + employee[pp].getUSW2().getManager());
                        System.out.println(" Employee start year: " + employee[pp].getUSW2().getYearStart());
                    }
                    else if(employeeTaxStatus.equalsIgnoreCase("US-1099"))
                    {
                        System.out.println(" Tax status: 1099");
                        System.out.println(" SSN number: " + employee[pp].getUS1099().getSsnNumber());
                        System.out.println(" Phone number: " + employee[pp].getUS1099().getPhone());
                        System.out.println(" Corp name: " + employee[pp].getUS1099().getCorpName());
                        System.out.println(" Corp address: " + employee[pp].getUS1099().getCorpAddress());
                        System.out.println(" Corp relationship: " + employee[pp].getUS1099().getRelationship().toString());
                    }
                    else
                    {
                        System.out.println("No tax status specified");
                    }
                }
            }
            catch (Exception e)
            {
                System.out.println("Error! Exception caught");
                e.printStackTrace();
            }
        }
    }

XML data file: CorporationMegaX.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="corporation.xsd">
      <Corporation>
        <Name>Corporation MegaX</Name>
        <Phone>1-800-555-1212</Phone>
        <Fax>1-877-555-1212</Fax>
        <Address>512 Megacorp Way, Gotham City</Address>
      </Corporation>
      <People>
        <!-- The Boss -->
        <Employee TaxStatus="US-W2" Gender="Male" Desc="OnSite">
          <Name>Mr Grand Kahuna</Name>
          <US-W2>
            <EmpNumber>1</EmpNumber>
            <Manager>None</Manager>
            <YearStart>1998</YearStart>
          </US-W2>
          <Data>
            <WorkPhone>1-800-555-1213</WorkPhone>
            <CellPhone>1-800-555-1214</CellPhone>
            <Address>100 Cherry Hill Lane, Gotham City</Address>
          </Data>
        </Employee>
        <!-- The Consultant -->
        <Employee TaxStatus="US-1099" Gender="Male" Desc="OnSite">
          <Name>Mr Special Tee</Name>
          <US-1099>
            <SsnNumber>123-45-6788</SsnNumber>
            <Phone>1-817-555-1212</Phone>
            <CorpName>ABC Consulting</CorpName>
            <CorpAddress>3 Mockingbird Lane, Smallville AK</CorpAddress>
            <Relationship>CorpToIndividual</Relationship>
          </US-1099>
          <Data>
            <WorkPhone>1-800-555-1215</WorkPhone>
            <CellPhone>1-800-555-1216</CellPhone>
            <Address>200 Lookout Hill, Gotham City</Address>
          </Data>
        </Employee>
        <!-- The Secretary -->
        <Employee TaxStatus="US-W2" Gender="Female" Desc="OnSite">
          <Name>Mrs Jenny Reliable</Name>
          <US-W2>
            <EmpNumber>2</EmpNumber>
            <Manager>Mr Grand Kahuna</Manager>
            <YearStart>1999</YearStart>
          </US-W2>
          <Data>
            <WorkPhone>1-800-555-1217</WorkPhone>
            <CellPhone>1-800-555-1218</CellPhone>
            <Address>300 Riverside View, Gotham City</Address>
          </Data>
        </Employee>
      </People>
    </root>

Ant build script: build.xml

    <?xml version="1.0" encoding="utf-8"?>
    <project name="IO" default="compile" basedir=".">
      <description>Builds, tests, and runs the project Test.</description>
      <property name="build.dir" value="./" />

      <taskdef name="xmlbean"
               classname="org.apache.xmlbeans.impl.tool.XMLBean"
               classpath="/opt/xmlbeans-2.5.0/lib/xbean.jar" />

      <path id="classpath">
        <pathelement location="/usr/java/latest/lib/tools.jar" />
        <pathelement location="/opt/xmlbeans-2.5.0/lib/xbean.jar" />
        <pathelement location="./corporation.jar" />
      </path>

      <target name="clean" description="Remove .class files">
        <delete includeEmptyDirs="true" failonerror="false">
          <fileset dir="${build.dir}">
            <include name="**/*.class" />
          </fileset>
        </delete>
      </target>

      <target name="compile">
        <javac srcdir="./" destdir="./" debug="true" includeAntRuntime="false">
          <classpath refid="classpath" />
          <include name="**/*.java" />
        </javac>
      </target>

      <target name="run" depends="compile">
        <java classname="TestXmlBeans" failonerror="true" fork="true">
          <classpath>
            <path refid="classpath" />
            <path location="./"/>
          </classpath>
        </java>
      </target>
    </project>

Compile: ant compile

    [user1@tux XMLBeans]$ ant compile
    Buildfile: /home/user1/Desktop/XMLBeans/build.xml

    compile:
        [javac] Compiling 1 source file to /home/user1/Desktop/XMLBeans

    BUILD SUCCESSFUL
    Total time: 0 seconds

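[Potential Pitfall]: If you get the following error:

    [javac] Compiling 1 source file to /home/greg/src/Test/JavaXmlBeans
    [javac] /home/greg/src/Test/JavaXmlBeans/TestXmlBeans.java:4: error: package noNamespace does not exist
    [javac] import noNamespace.RootDocument;
    ...
    ...
    ...

the compiler could not find the generated classes. Check that corporation.jar has been generated and is present at the location named in the build.xml classpath (./corporation.jar).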
Run: ant run

    [user1@tux XMLBeans]$ ant run
    Buildfile: /home/user1/XMLBeans/build.xml

    compile:

    run:
         [java] Corporation: Corporation MegaX
         [java] There are 3 employees.
         [java] Employee name: Mr Grand Kahuna
         [java]  Gender: Male
         [java]  Location: OnSite
         [java]  Cell phone: 1-800-555-1214
         [java]  Tax status: W2
         [java]  Employee number: 1
         [java]  Employee manager: None
         [java]  Employee start year: 1998
         [java] Employee name: Mr Special Tee
         [java]  Gender: Male
         [java]  Location: OnSite
         [java]  Cell phone: 1-800-555-1216
         [java]  Tax status: 1099
         [java]  SSN number: 123-45-6788
         [java]  Phone number: 1-817-555-1212
         [java]  Corp name: ABC Consulting
         [java]  Corp address: 3 Mockingbird Lane, Smallville AK
         [java]  Corp relationship: CorpToIndividual
         [java] Employee name: Mrs Jenny Reliable
         [java]  Gender: Female
         [java]  Location: OnSite
         [java]  Cell phone: 1-800-555-1218
         [java]  Tax status: W2
         [java]  Employee number: 2
         [java]  Employee manager: Mr Grand Kahuna
         [java]  Employee start year: 1999

    BUILD SUCCESSFUL
    Total time: 1 second

Our example uses a standard XML file containing only element content, with no free text between the elements. This is common for configuration files. The XSD schema in our example specifies this with mixed="false":

    <xs:element name="US-1099">
      <xs:complexType mixed="false">
        ...
        ...
      </xs:complexType>

One can also have mixed content, where plain text surrounds the XML elements. The XSD would then be stated as:

    <xs:element name="US-1099">
      <xs:complexType mixed="true">
        ...
        ...
      </xs:complexType>

and the XML would have plain text mixed with the XML elements, as in this example snippet:

    <US-1099>
      SSN Number: <SsnNumber>123-45-6788</SsnNumber>
      Phone Number: <Phone>1-817-555-1212</Phone>
      Corporation: <CorpName>ABC Consulting</CorpName>
      Corp Address: <CorpAddress>3 Mockingbird Lane, Smallville AK</CorpAddress>
      Relationship: <Relationship>CorpToIndividual</Relationship>
    </US-1099>

XML Beans can handle a complexType with "mixed" set to either "true" or "false" with no change to the code. The same cannot be said for JAXB, whose code gets significantly more complex when "mixed" is set to "true".
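
To see what mixed content looks like to a parser, the following sketch walks the children of an element and reports text and element nodes separately. It uses the JDK's DOM API rather than XML Beans, so it shows only the document structure, not the XML Beans API. The class name MixedContentDemo and the method describeChildren are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MixedContentDemo {
    /** Describes each child of the root element as "text:..." or "element:name=value". */
    static List<String> describeChildren(String xml) {
        List<String> out = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE) {
                    // Whitespace-only text between elements is not reported
                    String text = child.getNodeValue().trim();
                    if (!text.isEmpty()) {
                        out.add("text:" + text);
                    }
                } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                    out.add("element:" + child.getNodeName() + "=" + child.getTextContent());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String mixed = "<US-1099>SSN Number: <SsnNumber>123-45-6788</SsnNumber>"
            + " Phone Number: <Phone>1-817-555-1212</Phone></US-1099>";
        for (String line : describeChildren(mixed)) {
            System.out.println(line);
        }
    }
}
```

With element-only content the list contains only "element:" entries. With mixed content the text runs appear as separate "text:" entries between them.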