How to use Saxon XSLT "identity" transform so that default namespaces get set up and used, properly

Marshall Schor
Hi.

We're using the JAXP API in Java to get a content handler:

transformerFactory = (SAXTransformerFactory) SAXTransformerFactory.newInstance();

// Get a TransformerHandler object that can process SAX ContentHandler events
// into a Result.
// The transformation is defined as an identity (or copy) transformation,
// for example to copy a series of SAX parse events into a DOM tree.

mHandler = transformerFactory.newTransformerHandler();

mTransformer = mHandler.getTransformer();

etc.

We write content by calling the contentHandler methods, such as

contentHandler.startElement(aNamespace, localname, qname, attributes)

This code has been working fine for years, using the standard XML SAX support in
Java.

Recently, Saxon 8 or 9 was introduced into the class path, and the Saxon
versions of the SAXTransformerFactory and TransformerHandler
(net.sf.saxon.jaxp.IdentityTransformerHandler, here from Saxon-HE 9.7.0-14)
were used instead. This has changed the output: our method of getting a
default namespace written has stopped working with the Saxon version.

We are trying to serialize a form containing start elements, let's say a, b,
and c,
**all of which belong to some namespace**,
let's call it "a-url-to-use-as-default-name-space".

Our calls to the startElement API pass the namespace string, the element name,
the element name (again, without any namespace prefix) and attributes (if any).

As a special case, the topmost startElement API call has an extra attribute
added to its set of attributes, with the attribute name "xmlns" and a value of
"a-url-to-use-as-default-name-space".  This previously produced XML something
like this:

<a xmlns="a-url-to-use-as-default-name-space">
    <b ... />    
    <c ... />
</a>
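
For illustration, the special-case attribute set described above could be built roughly as follows. This is only a sketch: the class and method names are hypothetical, and the empty attribute namespace URI and the "CDATA" type are assumptions, since the post only gives the attribute name and value.

```java
import org.xml.sax.helpers.AttributesImpl;

final class XmlnsAttributeSketch {
    public static final String NS = "a-url-to-use-as-default-name-space";

    // Builds the attribute set for the topmost element, with the namespace
    // declaration passed along as if it were an ordinary attribute.
    public static AttributesImpl topLevelAttributes() {
        AttributesImpl atts = new AttributesImpl();
        // Argument order: uri, localName, qName, type, value.
        // The empty uri and the "CDATA" type are assumptions.
        atts.addAttribute("", "xmlns", "xmlns", "CDATA", NS);
        return atts;
    }
}
```

The result would then be passed as the last argument of the topmost startElement call. Whether the declaration survives into the output depends on how the receiving handler treats attributes named "xmlns".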

When Saxon is used, we've found that it strips off the attribute
'xmlns="a-url-to-use-as-default-name-space"'.  I found the code that does
this by single-stepping through it; it's this bit, in
"ReceivingContentHandler":

public void startElement(String uri, String localname, String rawname, Attributes atts)
            throws SAXException {
        try {
            flush(true);

            NodeName elementName = getNodeName(uri, localname, rawname);
            receiver.startElement(elementName, Untyped.getInstance(), localLocator, ReceiverOptions.NAMESPACE_OK);

            for (int n = 0; n < namespacesUsed; n++) {
                receiver.namespace(namespaces[n], 0);
            }

            for (int a = 0; a < atts.getLength(); a++) {
                int properties = ReceiverOptions.NAMESPACE_OK;
                String qname = atts.getQName(a);
                if (qname.startsWith("xmlns") && (qname.length()==5 || qname.charAt(5)==':')) {
                    // We normally configure the parser so that it doesn't notify namespaces as attributes.
                    // But when running as a TransformerHandler, we have no control over the feature settings
                    // of the sender of the events. So we filter them out, just in case. There might be cases
                    // where we ought not just to ignore them, but to handle them as namespace events, but
                    // we'll cross that bridge when we come to it.
                    continue;
                }

It's the part just above, ending with continue, which discovers we've attached
an attribute "xmlns", and it skips it.
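
To make the test in that snippet concrete, here is the same condition pulled out into a standalone predicate. The class and method names are hypothetical and are not part of Saxon; only the boolean expression is taken from the code quoted above.

```java
final class NamespaceAttributeFilter {
    // Mirrors the condition in the quoted snippet: matches "xmlns" itself
    // and any "xmlns:prefix" name, but not names that merely begin with
    // the letters "xmlns", such as "xmlnsfoo".
    public static boolean isNamespaceDeclaration(String qname) {
        return qname.startsWith("xmlns")
                && (qname.length() == 5 || qname.charAt(5) == ':');
    }

    public static void main(String[] args) {
        for (String name : new String[] {"xmlns", "xmlns:x", "xmlnsfoo", "id"}) {
            System.out.println(name + " -> " + isNamespaceDeclaration(name));
        }
    }
}
```

So an attribute whose qname is exactly "xmlns" always matches, which is why the declaration on the topmost element is skipped.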

If this is correct behavior, then I'm guessing there's another way one is
supposed to instruct the Saxon "Identity" XSLT transformer to use a default
namespace.  Can you please say what that method is?

Thanks.  -Marshall Schor


saxon-help mailing list archived at http://saxon.markmail.org/
https://lists.sourceforge.net/lists/listinfo/saxon-help
Re: How to use Saxon XSLT "identity" transform so that default namespaces get set up and used, properly

Marshall Schor
I figured this out, I think.

To add a default namespace, call the content handler's startPrefixMapping
method before the first startElement call, passing as arguments:

  the prefix: ""  -- this makes it the "default" prefix
  the uri: whatever URI you're using for the default namespace in subsequent
  calls to startElement.

The good news is that this approach also works for non-Saxon cases.
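
A minimal sketch of that approach, assuming whatever SAXTransformerFactory is first on the class path (the JDK's built-in one if Saxon is absent). The class name, the StringWriter result, and the omit-xml-declaration setting are illustrative choices rather than part of the original code. The matching endPrefixMapping call after the last endElement follows the SAX ContentHandler contract.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

final class DefaultNamespaceDemo {
    public static final String NS = "a-url-to-use-as-default-name-space";

    public static String serialize() throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) SAXTransformerFactory.newInstance();
        // No stylesheet argument: this handler performs an identity transform.
        TransformerHandler handler = factory.newTransformerHandler();
        // Assumption for brevity: leave out the <?xml ...?> declaration.
        handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        AttributesImpl noAtts = new AttributesImpl();
        handler.startDocument();
        // Declare the default namespace as a prefix mapping, not as an attribute.
        handler.startPrefixMapping("", NS);
        handler.startElement(NS, "a", "a", noAtts);
        handler.startElement(NS, "b", "b", noAtts);
        handler.endElement(NS, "b", "b");
        handler.startElement(NS, "c", "c", noAtts);
        handler.endElement(NS, "c", "c");
        handler.endElement(NS, "a", "a");
        handler.endPrefixMapping("");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize());
    }
}
```

With this in place, the "xmlns" attribute on the topmost element is no longer needed, because the declaration arrives as a namespace event instead.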

-Marshall


On 12/12/2016 10:12 PM, Marshall Schor wrote:

> [...]
>
> If this is correct behavior, then I'm guessing there's another way one is
> supposed to instruct the Saxon "Identity" XSLT transformer to use a default
> namespace.  Can you please say what that method is?
>
> Thanks.  -Marshall Schor
>

