refer to http://www.saxproject.org/
Quickstart
This document provides a quick-start tutorial for Java programmers who wish to use SAX2 in their programs.
Requirements
SAX is a common interface implemented for many different XML parsers (and things that pose as XML parsers), just as the JDBC is a common interface implemented for many different relational databases (and things that pose as relational databases). If you want to use SAX, you'll need all of the following:
- Java 1.1 or higher.
- A SAX2-compatible XML parser installed on your Java classpath. (If you need such a parser, see the "links" page on the SAX project site.)
- The SAX2 distribution installed on your Java classpath. (This probably came with your parser.)
Most Java/XML tool distributions include SAX2 and a parser that uses it. Most web application servers use it for their core XML support. In particular, environments with JAXP 1.1 support include SAX2.
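Where JAXP is available, an XMLReader can also be obtained through javax.xml.parsers.SAXParserFactory rather than by naming a driver class. The following is only a minimal sketch, assuming a JAXP-capable runtime such as a current JDK; the class name JaxpReaderDemo and the helper newReader are hypothetical names chosen for illustration.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

class JaxpReaderDemo {
    // Hypothetical helper: asks JAXP for whichever SAX2 parser the platform provides.
    static XMLReader newReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // JAXP factories are not namespace-aware unless asked to be.
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        XMLReader xr = newReader();
        // Print the implementation class that JAXP selected.
        System.out.println(xr.getClass().getName());
    }
}
```

The reader returned this way can be used exactly like one created by XMLReaderFactory: register handlers on it and call parse.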
Parsing a document
Start by creating a class that extends DefaultHandler :
import org.xml.sax.helpers.DefaultHandler;
public class MySAXApp extends DefaultHandler
{
public MySAXApp ()
{
super();
}
}
Since this is a Java application, we'll create a static main method that uses the createXMLReader method from the XMLReaderFactory class to choose a SAX driver dynamically. Note the "throws Exception" wimp-out; real applications would need real error handling:
public static void main (String args[]) throws Exception
{
XMLReader xr = XMLReaderFactory.createXMLReader();
}
If your Java environment did not arrange for a compiled-in default (or use the META-INF/services/org.xml.sax.driver system resource), you'll probably need to set the org.xml.sax.driver Java system property to the full classname of the SAX driver, as in
java -Dorg.xml.sax.driver=com.example.xml.SAXDriver MySAXApp sample.xml
Several of the SAX2 drivers currently in widespread use are listed on the "links" page. Class names you might use include:
gnu.xml.aelfred2.SAXDriver | Lightweight non-validating parser; Free Software |
gnu.xml.aelfred2.XmlReader | Optionally validates; Free Software |
oracle.xml.parser.v2.SAXParser | Optionally validates; proprietary |
org.apache.crimson.parser.XMLReaderImpl | Optionally validates; used in JDK 1.4; Open Source |
org.apache.xerces.parsers.SAXParser | Optionally validates; Open Source |
Alternatively, if you don't mind coupling your application to a specific SAX driver, you can use its constructor directly. We assume that the SAX driver for your XML parser is named com.example.xml.SAXDriver , but this does not really exist. You must know the name of the real driver for your parser to use this approach.
public static void main (String args[]) throws Exception
{
XMLReader xr = new com.example.xml.SAXDriver();
}
We can use this object to parse XML documents, but first, we have to register event handlers that the parser can use for reporting information, using the setContentHandler and setErrorHandler methods from the XMLReader interface. In a real-world application, the handlers will usually be separate objects, but for this simple demo, we've bundled the handlers into the top-level class, so we just have to instantiate the class and register it with the XML reader:
public static void main (String args[]) throws Exception
{
XMLReader xr = XMLReaderFactory.createXMLReader();
MySAXApp handler = new MySAXApp();
xr.setContentHandler(handler);
xr.setErrorHandler(handler);
}
This code creates an instance of MySAXApp to receive XML parsing events, and registers it with the XML reader for regular content events and error events (there are other kinds, but they're rarely used). Now, let's assume that all of the command-line args are file names, and we'll try to parse them one-by-one using the parse method from the XMLReader interface:
public static void main (String args[]) throws Exception
{
XMLReader xr = XMLReaderFactory.createXMLReader();
MySAXApp handler = new MySAXApp();
xr.setContentHandler(handler);
xr.setErrorHandler(handler);
// Parse each file provided on the
// command line.
for (int i = 0; i < args.length; i++) {
FileReader r = new FileReader(args[i]);
xr.parse(new InputSource(r));
}
}
Note that each reader must be wrapped in an InputSource object to be parsed. Here's the whole demo class together (so far):
import java.io.FileReader;
import org.xml.sax.XMLReader;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
public class MySAXApp extends DefaultHandler
{
public static void main (String args[])
throws Exception
{
XMLReader xr = XMLReaderFactory.createXMLReader();
MySAXApp handler = new MySAXApp();
xr.setContentHandler(handler);
xr.setErrorHandler(handler);
// Parse each file provided on the
// command line.
for (int i = 0; i < args.length; i++) {
FileReader r = new FileReader(args[i]);
xr.parse(new InputSource(r));
}
}
public MySAXApp ()
{
super();
}
}
You can compile this code and run it (make sure you specify the SAX driver class in the org.xml.sax.driver property), but nothing much will happen unless the document contains malformed XML, because you have not yet set up your application to handle SAX events.
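As a rough illustration of what "real error handling" might look like in place of the blanket "throws Exception", the sketch below overrides the warning and error callbacks inherited from DefaultHandler and catches the SAXParseException raised for malformed input. It assumes a current JDK and parses from in-memory strings so that it is self-contained; the class name ErrorReportingDemo and the helpers describe and parses are hypothetical.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class ErrorReportingDemo extends DefaultHandler {
    // DefaultHandler silently ignores recoverable errors and warnings;
    // here they are reported together with their position.
    @Override
    public void error(SAXParseException e) {
        System.err.println("Error at " + describe(e));
    }

    @Override
    public void warning(SAXParseException e) {
        System.err.println("Warning at " + describe(e));
    }

    // Hypothetical helper that formats the location carried by the exception.
    static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                + ": " + e.getMessage();
    }

    // Returns true if the text parsed without a fatal error.
    static boolean parses(String xml) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        ErrorReportingDemo handler = new ErrorReportingDemo();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        try {
            xr.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            // Malformed XML: DefaultHandler.fatalError rethrows the exception.
            System.err.println("Fatal error at " + describe(e));
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parses("<a><b/></a>")); // well-formed
        System.out.println(parses("<a><b></a>"));  // mismatched end tag
    }
}
```

A larger application would probably keep the error handler in its own class and decide per error whether to continue or abort.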
Handling events
Things get interesting when you start implementing methods to respond to XML parsing events (remember that we registered our class to receive XML parsing events in the previous section). The most important events are the start and end of the document, the start and end of elements, and character data.
To find out about the start and end of the document, the client application implements the startDocument and endDocument methods:
public void startDocument ()
{
System.out.println("Start document");
}
public void endDocument ()
{
System.out.println("End document");
}
The start/endDocument event handlers take no arguments. When the SAX driver finds the beginning of the document, it will invoke the startDocument method once; when it finds the end, it will invoke the endDocument method once (even if there have been errors).
These examples simply print a message to standard output, but your application can contain any arbitrary code in these handlers: most commonly, the code will build some kind of an in-memory tree, produce output, populate a database, or extract information from the XML stream.
The SAX driver will signal the start and end of elements in much the same way, except that it will also pass some parameters to the startElement and endElement methods:
public void startElement (String uri, String name, String qName, Attributes atts)
{
if ("".equals (uri))
System.out.println("Start element: " + qName);
else
System.out.println("Start element: {" + uri + "}" + name);
}
public void endElement (String uri, String name, String qName)
{
if ("".equals (uri))
System.out.println("End element: " + qName);
else
System.out.println("End element: {" + uri + "}" + name);
}
These methods print a message every time an element starts or ends, with any Namespace URI in braces before the element's local name. The qName contains the raw XML 1.0 name, which you must use for all elements that don't have a namespace URI. In this quick introduction, we won't look at how attributes are accessed; you can access them by name, or by iterating through them much as if they were a vector.
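For readers who do want a quick look at attribute access, here is a minimal sketch of both styles mentioned above: walking the Attributes object by index and looking a value up by name. It assumes a current JDK whose parser reports qualified names for attributes; the class AttributeDemo, its collect helper, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class AttributeDemo extends DefaultHandler {
    // Collected "element/@attribute=value" strings, in the order reported.
    final List<String> seen = new ArrayList<String>();

    @Override
    public void startElement(String uri, String name, String qName, Attributes atts) {
        // Access by index: walk the attributes much like a list.
        for (int i = 0; i < atts.getLength(); i++) {
            seen.add(qName + "/@" + atts.getQName(i) + "=" + atts.getValue(i));
        }
        // Access by qualified name: getValue returns null if there is no such attribute.
        String id = atts.getValue("id");
        if (id != null) {
            System.out.println(qName + " has id " + id);
        }
    }

    // Hypothetical helper: parses the given text and returns what was collected.
    static List<String> collect(String xml) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        AttributeDemo handler = new AttributeDemo();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        xr.parse(new InputSource(new StringReader(xml)));
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        for (String s : collect("<poem id=\"p1\" lang=\"en\"><l n=\"1\"/></poem>")) {
            System.out.println(s);
        }
    }
}
```

SAX does not promise any particular attribute order, so code should not depend on the index at which a given attribute appears.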
Finally, SAX2 reports regular character data through the characters method; the following implementation will print all character data to the screen; it is a little longer because it pretty-prints the output by escaping special characters:
public void characters (char ch[], int start, int length)
{
System.out.print("Characters: \"");
for (int i = start; i < start + length; i++) {
switch (ch[i]) {
case '\\':
System.out.print("\\\\");
break;
case '"':
System.out.print("\\\"");
break;
case '\n':
System.out.print("\\n");
break;
case '\r':
System.out.print("\\r");
break;
case '\t':
System.out.print("\\t");
break;
default:
System.out.print(ch[i]);
break;
}
}
System.out.print("\"\n");
}
Note that a SAX driver is free to chunk the character data any way it wants, so you cannot count on all of the character data content of an element arriving in a single characters event.
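One common way to cope with chunking is to append every characters event to a buffer and only look at the buffer when the element ends. The sketch below shows that idea under simplifying assumptions: elements contain either text or child elements, not both, and whitespace-only text is discarded. The class TextCollector, its collect helper, and the sample markup (including the entity reference, which many parsers report as a separate chunk) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class TextCollector extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    // One "element=text" entry per element that had non-blank text.
    final List<String> texts = new ArrayList<String>();

    @Override
    public void startElement(String uri, String name, String qName, Attributes atts) {
        buffer.setLength(0); // start collecting afresh for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text; just accumulate.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String name, String qName) {
        String text = buffer.toString().trim();
        if (text.length() > 0) {
            texts.add(qName + "=" + text);
        }
        buffer.setLength(0);
    }

    // Hypothetical helper: parses the given text and returns the collected entries.
    static List<String> collect(String xml) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        TextCollector handler = new TextCollector();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        xr.parse(new InputSource(new StringReader(xml)));
        return handler.texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<poem>\n<l>Roses are red,</l>\n<l>Sugar &amp; spice</l>\n</poem>";
        System.out.println(collect(xml));
    }
}
```

Because the text is only read in endElement, the result is the same however the driver splits the character data.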
Sample SAX2 application
Here is the complete sample application (again, in a serious app the event handlers would probably be implemented in a separate class):
import java.io.FileReader;
import org.xml.sax.XMLReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
public class MySAXApp extends DefaultHandler
{
public static void main (String args[])
throws Exception
{
XMLReader xr = XMLReaderFactory.createXMLReader();
MySAXApp handler = new MySAXApp();
xr.setContentHandler(handler);
xr.setErrorHandler(handler);
// Parse each file provided on the
// command line.
for (int i = 0; i < args.length; i++) {
FileReader r = new FileReader(args[i]);
xr.parse(new InputSource(r));
}
}
public MySAXApp ()
{
super();
}
////////////////////////////////////////////////////////////////////
// Event handlers.
////////////////////////////////////////////////////////////////////
public void startDocument ()
{
System.out.println("Start document");
}
public void endDocument ()
{
System.out.println("End document");
}
public void startElement (String uri, String name,
String qName, Attributes atts)
{
if ("".equals (uri))
System.out.println("Start element: " + qName);
else
System.out.println("Start element: {" + uri + "}" + name);
}
public void endElement (String uri, String name, String qName)
{
if ("".equals (uri))
System.out.println("End element: " + qName);
else
System.out.println("End element: {" + uri + "}" + name);
}
public void characters (char ch[], int start, int length)
{
System.out.print("Characters: \"");
for (int i = start; i < start + length; i++) {
switch (ch[i]) {
case '\\':
System.out.print("\\\\");
break;
case '"':
System.out.print("\\\"");
break;
case '\n':
System.out.print("\\n");
break;
case '\r':
System.out.print("\\r");
break;
case '\t':
System.out.print("\\t");
break;
default:
System.out.print(ch[i]);
break;
}
}
System.out.print("\"\n");
}
}
Sample Output
Consider the following XML document:
<?xml version="1.0"?>
<poem xmlns="http://www.megginson.com/ns/exp/poetry">
<title>Roses are Red</title>
<l>Roses are red,</l>
<l>Violets are blue;</l>
<l>Sugar is sweet,</l>
<l>And I love you.</l>
</poem>
If this document is named roses.xml and there is a SAX2 driver on your classpath named com.example.xml.SAXDriver (this driver does not actually exist), you can invoke the sample application like this:
java -Dorg.xml.sax.driver=com.example.xml.SAXDriver MySAXApp roses.xml
When you run this, you'll get output something like this:
Start document
Start element: {http://www.megginson.com/ns/exp/poetry}poem
Characters: "\n"
Start element: {http://www.megginson.com/ns/exp/poetry}title
Characters: "Roses are Red"
End element: {http://www.megginson.com/ns/exp/poetry}title
Characters: "\n"
Start element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "Roses are red,"
End element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "\n"
Start element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "Violets are blue;"
End element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "\n"
Start element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "Sugar is sweet,"
End element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "\n"
Start element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "And I love you."
End element: {http://www.megginson.com/ns/exp/poetry}l
Characters: "\n"
End element: {http://www.megginson.com/ns/exp/poetry}poem
End document
Note that even this short document generates (at least) 25 events: two for each of the six elements used (one for the start tag and one for the end tag), one for each of the eleven chunks of character data (including whitespace between elements), one for the start of the document, and one for the end.
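The tally can be checked with a handler that does nothing but count callbacks. This is only a sketch, assuming a current JDK; the class EventCounter, its fields, and the ROSES constant (the poem above embedded as a string) are hypothetical names. The element and document counts are fixed by the document, while the character count is a lower bound because a driver may split text into more chunks.

```java
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class EventCounter extends DefaultHandler {
    int documentEvents;   // startDocument + endDocument
    int elementEvents;    // startElement + endElement
    int characterEvents;  // characters callbacks (driver-dependent)

    // The sample poem, with the same line breaks between elements.
    static final String ROSES =
          "<?xml version=\"1.0\"?>\n"
        + "<poem xmlns=\"http://www.megginson.com/ns/exp/poetry\">\n"
        + "<title>Roses are Red</title>\n"
        + "<l>Roses are red,</l>\n"
        + "<l>Violets are blue;</l>\n"
        + "<l>Sugar is sweet,</l>\n"
        + "<l>And I love you.</l>\n"
        + "</poem>\n";

    @Override public void startDocument() { documentEvents++; }
    @Override public void endDocument() { documentEvents++; }

    @Override
    public void startElement(String uri, String name, String qName, Attributes atts) {
        elementEvents++;
    }

    @Override
    public void endElement(String uri, String name, String qName) {
        elementEvents++;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        characterEvents++;
    }

    int total() {
        return documentEvents + elementEvents + characterEvents;
    }

    // Hypothetical helper: parses the given text and returns the filled-in counter.
    static EventCounter count(String xml) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        EventCounter counter = new EventCounter();
        xr.setContentHandler(counter);
        xr.setErrorHandler(counter);
        xr.parse(new InputSource(new StringReader(xml)));
        return counter;
    }

    public static void main(String[] args) throws Exception {
        EventCounter c = count(ROSES);
        System.out.println("document=" + c.documentEvents
                + " element=" + c.elementEvents
                + " characters=" + c.characterEvents
                + " total=" + c.total());
    }
}
```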
If the input document did not include the xmlns="http://www.megginson.com/ns/exp/poetry" attribute to declare that all the elements are in that namespace, the output would instead be like:
Start document
Start element: poem
Characters: "\n"
Start element: title
Characters: "Roses are Red"
End element: title
Characters: "\n"
Start element: l
Characters: "Roses are red,"
End element: l
Characters: "\n"
Start element: l
Characters: "Violets are blue;"
End element: l
Characters: "\n"
Start element: l
Characters: "Sugar is sweet,"
End element: l
Characters: "\n"
Start element: l
Characters: "And I love you."
End element: l
Characters: "\n"
End element: poem
End document
You will most likely work with both types of documents: ones using XML namespaces, and ones not using them. You may also work with documents that have some elements (and attributes) with namespaces, and some without. Make sure that your code actually tests for namespace URIs, rather than assuming they are always present (or always missing).
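One way to keep that test in a single place is a small helper that turns the three name arguments into one display name, used by every handler method. The sketch below assumes a current JDK; the class NamespaceNames, the helper displayName, and the mixed sample document (one prefixed element inside otherwise unqualified markup) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class NamespaceNames extends DefaultHandler {
    // Display names of all elements, in document order.
    final List<String> names = new ArrayList<String>();

    // Hypothetical helper: "{uri}local" for namespaced names, the raw
    // qualified name when there is no namespace URI.
    static String displayName(String uri, String name, String qName) {
        if (uri == null || uri.length() == 0) {
            return qName;
        }
        return "{" + uri + "}" + name;
    }

    @Override
    public void startElement(String uri, String name, String qName, Attributes atts) {
        names.add(displayName(uri, name, qName));
    }

    // Hypothetical helper: parses the given text and returns the element names.
    static List<String> collect(String xml) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        NamespaceNames handler = new NamespaceNames();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        xr.parse(new InputSource(new StringReader(xml)));
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<poem xmlns:p=\"http://www.megginson.com/ns/exp/poetry\">"
                + "<p:title>Roses are Red</p:title><note>unqualified</note></poem>";
        for (String n : collect(xml)) {
            System.out.println(n);
        }
    }
}
```

Here only the title element carries a namespace URI, so it is the only one reported in the braces form; poem and note fall back to their qualified names.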