
SAX Parser truncation problems

Have you ever run into a strange SAX parser problem where textual content seems to be truncated, as if the parser delivered only a fragment of the text inside an XML element? Maybe not. Or maybe you have, but haven't noticed.

The parser reads the stream in blocks, so it may call the characters method more than once for a single run of text. Well, it's written in the Javadoc, and it's quite logical: a very long text content can't be processed in one go, so it has to be split into parts.

So, rather than assigning the content to a simple string in characters (which keeps only the last fragment delivered), append each fragment to a buffer and evaluate the accumulated content in endElement.
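Here is a minimal sketch of that pattern in Java. The class name ElementTextCollector, the targetName parameter and the textOf helper are my own illustrative names, not part of any library. It also assumes the target element contains only text, with no nested elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of every element named targetName.
class ElementTextCollector extends DefaultHandler {
    private final String targetName;
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> texts = new ArrayList<>();
    private boolean collecting;

    ElementTextCollector(String targetName) {
        this.targetName = targetName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals(targetName)) {
            buffer.setLength(0); // start with an empty buffer for each target element
            collecting = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append instead of assigning.
        if (collecting) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(targetName)) {
            texts.add(buffer.toString()); // the content is complete only here
            collecting = false;
        }
    }

    List<String> getTexts() {
        return texts;
    }

    // Convenience helper (an assumption for this example): parse an XML string.
    static List<String> textOf(String xml, String targetName) throws Exception {
        ElementTextCollector handler = new ElementTextCollector(targetName);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getTexts();
    }
}
```

Calling ElementTextCollector.textOf(xml, "item") should return the complete text of each item element, no matter how many fragments the parser delivered it in.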

By the way, the magic number is 2048: the parser implementation typically uses this block size. Unfortunately, it's the kind of thing that easily slips under the radar of tests, because nobody writes tests for long data.
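A test for long data could start from something like the sketch below. It only counts how often characters is called and how many characters arrive in total. ChunkCounter, count and longDocument are hypothetical names I made up for the example, and the number of calls you actually see depends on the parser implementation and its buffer size.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records how the parser splits character data.
class ChunkCounter extends DefaultHandler {
    int calls;      // number of characters() invocations
    int totalChars; // sum of the lengths of all delivered fragments

    @Override
    public void characters(char[] ch, int start, int length) {
        calls++;
        totalChars += length;
    }

    // Parses the given XML string and returns the filled counter.
    static ChunkCounter count(String xml) throws Exception {
        ChunkCounter counter = new ChunkCounter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), counter);
        return counter;
    }

    // Builds a document with a single element holding textLength characters.
    static String longDocument(int textLength) {
        StringBuilder sb = new StringBuilder("<doc>");
        for (int i = 0; i < textLength; i++) {
            sb.append('x');
        }
        return sb.append("</doc>").toString();
    }
}
```

Running ChunkCounter.count(ChunkCounter.longDocument(100000)) and looking at calls shows whether your parser splits the text. A handler that keeps only the last fragment would return truncated content whenever calls is greater than one.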

See also this on Stack Overflow.
