OutputURIResolver: secondary trees lose base URI

Florent Georges-3
Hi,

Using s9api, I am trying to capture secondary output trees, written
using xsl:result-document.  So I created my own OutputURIResolver,
which uses XdmDestination to capture them in memory.  In order to make
sure to preserve the output base URI, I use Result.setSystemId on the
receiver of the XDM destination (it is properly resolved against the
base output URI).

But when the OutputURIResolver.close() method is called, the result's
system ID has been removed.  Printing the result object itself shows
that it is the very same object returned by resolve(), though.

Am I doing something wrong here?

The following is a self-contained, minimal example (I don't think
it is possible to reduce it any further).  The only dependency is to
have Saxon on the classpath.  Tested against Saxon 9.7.0.10.

package saxon;

import java.io.StringReader;
import java.net.URI;
import java.net.URISyntaxException;
import javax.xml.transform.Result;
import javax.xml.transform.TransformerException;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.Configuration;
import net.sf.saxon.lib.OutputURIResolver;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.QName;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XdmDestination;
import net.sf.saxon.s9api.XsltTransformer;

public class BaseUriSecondaryDoc
{
    public static void main(String[] args)
            throws SaxonApiException
    {
        // new processor, custom output resolver
        Processor proc = new Processor(false);
        Configuration config = proc.getUnderlyingConfiguration();
        config.setOutputURIResolver(new Output(config));

        // compile the stylesheet
        XsltTransformer trans = proc
            .newXsltCompiler()
            .compile(new StreamSource(new StringReader(STYLE)))
            .load();

        // run template "main", ignore the main result
        trans.setInitialTemplate(new QName("main"));
        trans.setBaseOutputURI("http://www.example.com/");
        trans.setDestination(new XdmDestination());
        trans.transform();
    }

    private static class Output implements OutputURIResolver
    {
        public Output(Configuration config)
        {
            myConfig = config;
        }

        @Override
        public OutputURIResolver newInstance()
        {
            return new Output(myConfig);
        }

        private static String resolveUri(String href, String base)
                throws TransformerException
        {
            try {
                return new URI(base).resolve(href).toASCIIString();
            }
            catch (URISyntaxException ex) {
                throw new TransformerException(ex);
            }
        }

        @Override
        public Result resolve(String href, String base)
                throws TransformerException
        {
            String uri = resolveUri(href, base);
            XdmDestination dest = new XdmDestination();
            try {
                Result res = dest.getReceiver(myConfig);
                System.err.println("New result: " + res);
                System.err.println("  sysid: " + uri);
                res.setSystemId(uri);
                return res;
            }
            catch ( SaxonApiException ex ) {
                throw new TransformerException(ex);
            }
        }

        @Override
        public void close(Result result)
        {
            System.err.println("CLOSE result: " + result);
            System.err.println("  sysid: " + result.getSystemId());
        }

        private final Configuration myConfig;
    }

    private static final String STYLE =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'\n" +
        "                version='2.0'>\n" +
        "   <xsl:template name='main'>\n" +
        "      <xsl:result-document href='foo.xml'>\n" +
        "         <doc>Hello Foo!</doc>\n" +
        "      </xsl:result-document>\n" +
        "      <xsl:result-document href='bar.xml'>\n" +
        "         <doc>Hello Bar!</doc>\n" +
        "      </xsl:result-document>\n" +
        "      <ignored/>\n" +
        "   </xsl:template>\n" +
        "</xsl:stylesheet>";
}

When I run that class, it outputs the following (showing that 2
complex content outputter objects have been created, that the system
ID is properly set in resolve(), but that it is removed before close()
is called):

New result: net.sf.saxon.event.ComplexContentOutputter@55040f2f
  sysid: http://www.example.com/foo.xml
CLOSE result: net.sf.saxon.event.ComplexContentOutputter@55040f2f
  sysid:
New result: net.sf.saxon.event.ComplexContentOutputter@275710fc
  sysid: http://www.example.com/bar.xml
CLOSE result: net.sf.saxon.event.ComplexContentOutputter@275710fc
  sysid:
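
For readers who want to check the URI arithmetic in isolation: the resolveUri() helper above relies on java.net.URI.resolve().  The sketch below uses only the JDK (no Saxon); the class name ResolveHrefDemo and the extra base URIs are made up for illustration, and URI.create() is used instead of new URI() merely to avoid the checked exception.

```java
import java.net.URI;

class ResolveHrefDemo {
    // Resolve an xsl:result-document href against a base output URI,
    // using the same URI.resolve() call as resolveUri() in the example.
    static String resolve(String href, String base) {
        return URI.create(base).resolve(href).toASCIIString();
    }

    public static void main(String[] args) {
        // base ends with "/": href is appended to the base path
        System.out.println(resolve("foo.xml", "http://www.example.com/"));
        // base names a file: the last path segment is replaced
        System.out.println(resolve("foo.xml", "http://www.example.com/out/main.xml"));
        // ".." segments are normalized away during resolution
        System.out.println(resolve("../bar.xml", "http://www.example.com/out/main.xml"));
    }
}
```

So the values printed by resolve() in the log above are what one would expect for the base output URI http://www.example.com/; the empty value only appears later, at close().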

Did I miss anything?

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2oconsulting.be/

_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
https://lists.sourceforge.net/lists/listinfo/saxon-help 

Re: OutputURIResolver: secondary trees lose base URI

Florent Georges-3
Hi,

Just in case it can help, if I create a wrapper around the receiver of
the XDM destination (as a SequenceReceiver), I see the following
sequence of calls:

getSystemId() - http://www.example.com/bar.xml
getSystemId() - http://www.example.com/bar.xml
setSystemId() - http://www.example.com/bar.xml
setPipelineConfiguration()
getSystemId() - http://www.example.com/bar.xml
open()
startDocument()
setSystemId() -
startElement()
startContent()
characters()
endElement()
endDocument()
close()

Interestingly, right after startDocument(), setSystemId() is called
with the empty string as its parameter.  I am not sure why.  Just in
case, here is the self-contained test (a bit long, but it is mostly
boilerplate code for the wrapping class):

package org.fgeorges.scrapbook.saxon;

import java.io.StringReader;
import java.net.URI;
import java.net.URISyntaxException;
import javax.xml.transform.Result;
import javax.xml.transform.TransformerException;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.Configuration;
import net.sf.saxon.event.PipelineConfiguration;
import net.sf.saxon.event.SequenceReceiver;
import net.sf.saxon.expr.parser.Location;
import net.sf.saxon.lib.OutputURIResolver;
import net.sf.saxon.om.Item;
import net.sf.saxon.om.NamePool;
import net.sf.saxon.om.NamespaceBinding;
import net.sf.saxon.om.NodeName;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.QName;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.XdmDestination;
import net.sf.saxon.s9api.XsltTransformer;
import net.sf.saxon.trans.XPathException;
import net.sf.saxon.type.SchemaType;
import net.sf.saxon.type.SimpleType;

/**
 * On 2016-11-06.
 *
 * @author Florent Georges
 */
public class BaseUriSecondaryDoc
{
    public static void main(String[] args)
            throws SaxonApiException
    {
        // new processor, custom output resolver
        Processor proc = new Processor(false);
        Configuration config = proc.getUnderlyingConfiguration();
        config.setOutputURIResolver(new Output(config));

        // compile the stylesheet
        XsltTransformer trans = proc
            .newXsltCompiler()
            .compile(new StreamSource(new StringReader(STYLE)))
            .load();

        // run template "main", ignore the main result
        trans.setInitialTemplate(new QName("main"));
        trans.setBaseOutputURI("http://www.example.com/");
        trans.setDestination(new XdmDestination());
        trans.transform();
    }

    private static class Output implements OutputURIResolver
    {
        public Output(Configuration config) {
            myConfig = config;
        }

        @Override
        public OutputURIResolver newInstance() {
            return new Output(myConfig);
        }

        private static String resolveUri(String href, String base)
                throws TransformerException {
            try {
                return new URI(base).resolve(href).toASCIIString();
            }
            catch (URISyntaxException ex) {
                throw new TransformerException(ex);
            }
        }

        @Override
        public Result resolve(String href, String base)
                throws TransformerException {
            String uri = resolveUri(href, base);
            XdmDestination dest = new XdmDestination();
            try {
                Result res = dest.getReceiver(myConfig);
                System.err.println("New result: " + res);
                System.err.println("  at: " + uri);
                res.setSystemId(uri);
                return new Outputter((SequenceReceiver) res);
            }
            catch ( SaxonApiException ex ) {
                throw new TransformerException(ex);
            }
        }

        @Override
        public void close(Result result) {
            System.err.println("CLOSE result: " + result);
            System.err.println("  at: " + result.getSystemId());
        }

        private final Configuration myConfig;
    }

    private static class Outputter
            extends SequenceReceiver
    {
        private SequenceReceiver myWrapped;

        public Outputter(SequenceReceiver wrapped) {
            super(null);
            myWrapped = wrapped;
        }

        @Override
        public NamePool getNamePool() {
            System.err.println("getNamePool()");
            return myWrapped.getNamePool();
        }

        @Override
        public void append(Item item) throws XPathException {
            System.err.println("append()");
            myWrapped.append(item);
        }

        @Override
        public void open() throws XPathException {
            System.err.println("open()");
            myWrapped.open();
        }

        @Override
        public void setUnparsedEntity(String name, String systemID, String publicID)
                throws XPathException {
            System.err.println("setUnparsedEntity()");
            myWrapped.setUnparsedEntity(name, systemID, publicID);
        }

        @Override
        public String getSystemId() {
            System.err.println("getSystemId() - " + myWrapped.getSystemId());
            return myWrapped.getSystemId();
        }

        @Override
        public void setSystemId(String systemId) {
            System.err.println("setSystemId() - " + systemId);
            myWrapped.setSystemId(systemId);
        }

        @Override
        public void setPipelineConfiguration(PipelineConfiguration pipelineConfiguration) {
            System.err.println("setPipelineConfiguration()");
            myWrapped.setPipelineConfiguration(pipelineConfiguration);
        }

        @Override
        public void append(Item item, Location loc, int i)
                throws XPathException {
            System.err.println("append()");
            myWrapped.append(item, loc, i);
        }

        @Override
        public void startDocument(int i) throws XPathException {
            System.err.println("startDocument()");
            myWrapped.startDocument(i);
        }

        @Override
        public void endDocument() throws XPathException {
            System.err.println("endDocument()");
            myWrapped.endDocument();
        }

        @Override
        public void startElement(NodeName nn, SchemaType st, Location lctn, int i)
                throws XPathException {
            System.err.println("startElement()");
            myWrapped.startElement(nn, st, lctn, i);
        }

        @Override
        public void namespace(NamespaceBinding nb, int i)
                throws XPathException {
            System.err.println("namespace()");
            myWrapped.namespace(nb, i);
        }

        @Override
        public void attribute(NodeName nn, SimpleType st, CharSequence cs, Location lctn, int i)
                throws XPathException {
            System.err.println("attribute()");
            myWrapped.attribute(nn, st, cs, lctn, i);
        }

        @Override
        public void startContent() throws XPathException {
            System.err.println("startContent()");
            myWrapped.startContent();
        }

        @Override
        public void endElement() throws XPathException {
            System.err.println("endElement()");
            myWrapped.endElement();
        }

        @Override
        public void characters(CharSequence cs, Location lctn, int i)
                throws XPathException {
            System.err.println("characters()");
            myWrapped.characters(cs, lctn, i);
        }

        @Override
        public void processingInstruction(String string, CharSequence cs, Location lctn, int i)
                throws XPathException {
            System.err.println("processingInstruction()");
            myWrapped.processingInstruction(string, cs, lctn, i);
        }

        @Override
        public void comment(CharSequence cs, Location lctn, int i)
                throws XPathException {
            System.err.println("comment()");
            myWrapped.comment(cs, lctn, i);
        }

        @Override
        public void close() throws XPathException {
            System.err.println("close()");
            myWrapped.close();
        }

        @Override
        public boolean usesTypeAnnotations() {
            System.err.println("usesTypeAnnotations()");
            return myWrapped.usesTypeAnnotations();
        }
    }

    private static final String STYLE =
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'\n" +
            "                version='2.0'>\n" +
            "   <xsl:template name='main'>\n" +
            "      <xsl:result-document href='foo.xml'>\n" +
            "         <doc>Hello Foo!</doc>\n" +
            "      </xsl:result-document>\n" +
            "      <xsl:result-document href='bar.xml'>\n" +
            "         <doc>Hello Bar!</doc>\n" +
            "      </xsl:result-document>\n" +
            "      <ignored/>\n" +
            "   </xsl:template>\n" +
            "</xsl:stylesheet>";
}
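
As a side note on the boilerplate: when the type to be traced is an interface, the same kind of call log can be produced with java.lang.reflect.Proxy instead of a hand-written wrapper.  The sketch below is JDK-only and traces a plain javax.xml.transform.Result; the names TracingProxyDemo, trace() and LOG are hypothetical.  It is an assumption, not something tested here, that Saxon would accept such a proxy in place of a real receiver; SequenceReceiver is extended as a class in the test above, so it still needs the hand-written wrapper.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.Result;
import javax.xml.transform.stream.StreamResult;

class TracingProxyDemo {
    /** Calls seen so far, in order, e.g. "setSystemId() - http://...". */
    static final List<String> LOG = new ArrayList<>();

    /** Wraps target in a proxy that records each call, then delegates. */
    @SuppressWarnings("unchecked")
    static <T> T trace(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            String entry = method.getName() + "()";
            if (args != null && args.length > 0) {
                entry += " - " + args[0];   // show the first argument only
            }
            LOG.add(entry);
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException ex) {
                throw ex.getCause();        // rethrow the target's own exception
            }
        };
        return (T) Proxy.newProxyInstance(
                TracingProxyDemo.class.getClassLoader(),
                new Class<?>[] { iface },
                handler);
    }

    public static void main(String[] args) {
        Result traced = trace(Result.class, new StreamResult());
        traced.setSystemId("http://www.example.com/bar.xml");
        traced.setSystemId("");             // mimics the later overwrite
        traced.getSystemId();
        LOG.forEach(System.err::println);
    }
}
```

The proxy only sees calls made through the interface it implements, so it cannot replace a wrapper for an abstract class.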

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2o.consulting/


On 6 November 2016 at 18:55, Florent Georges wrote:

> [...]

Re: OutputURIResolver: secondary trees lose base URI

Michael Kay
This one is tricky, because I can see changes that might be desirable, but I'm worried about whether they will break something. The whole handling of systemId down a Receiver pipeline is pretty delicate code, because the systemId is used in a number of subtly different ways on different paths. In particular it is often used to represent the base URI of the nodes represented by the Receiver pipeline events (*), and the base URI of newly constructed nodes is dictated by the XSLT spec. Because the ComplexContentOutputter (CCO) returned by Destination.getReceiver() is a Result object, its systemId has to behave as expected by the Result class, but it is also a Receiver, which means that its systemId is used to represent the base URI of nodes.

The call to setSystemId() with the empty string reflects the fact that the stylesheet in this case has no static base URI (it is compiled from a StringReader with no system ID), and therefore the nodes created by the <doc> literal result element have no base URI.

XSLT 2.0 contained a statement, under xsl:result-document, that:

The base URI of the document node at the root of the final result tree is based on the effective value of the href attribute [resolved against the base output URI]

This statement is no longer present in 3.0, largely because we now invoke Sequence Normalization from the serialization spec to construct the final result tree, rather than invoking Constructing Complex Content from the XSLT spec.

In fact the serialization spec says that sequence normalization is equivalent to an xsl:document instruction, which at face value means that the base URI should be the stylesheet base URI.

Either way, the systemId property of the ComplexContentOutputter is being overwritten to hold the base URI of the document node determined (more or less) according to the rules in the W3C specs, which might be quite different from any value placed in there by the OutputURIResolver.
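
Given that the systemId of the returned object may be overwritten, one possible workaround (a sketch, not something proposed in this thread) is for the resolver to remember the output URI itself, keyed by the identity of the Result it returned, rather than reading it back with getSystemId() in close().  The sketch below is JDK-only: ResultUriTracker is a hypothetical class whose resolve() and close() merely mirror the OutputURIResolver methods, and DOMResult stands in for the Saxon receiver.  It assumes that resolve() and close() are called on the same resolver instance with the same Result object, as the output earlier in the thread suggests.

```java
import java.net.URI;
import java.util.IdentityHashMap;
import java.util.Map;
import javax.xml.transform.Result;
import javax.xml.transform.dom.DOMResult;

class ResultUriTracker {
    // Keyed by object identity, so the lookup does not depend on the
    // Result's equals()/hashCode() or on its (mutable) system ID.
    private final Map<Result, String> uris = new IdentityHashMap<>();

    /** Mirrors OutputURIResolver.resolve(href, base). */
    Result resolve(String href, String base) {
        String uri = URI.create(base).resolve(href).toASCIIString();
        Result result = new DOMResult();   // stand-in for the Saxon receiver
        result.setSystemId(uri);
        uris.put(result, uri);             // remember the URI on the side
        return result;
    }

    /** Mirrors OutputURIResolver.close(result); returns the remembered URI. */
    String close(Result result) {
        return uris.remove(result);
    }

    public static void main(String[] args) {
        ResultUriTracker tracker = new ResultUriTracker();
        Result result = tracker.resolve("foo.xml", "http://www.example.com/");
        result.setSystemId("");            // simulate the pipeline overwrite
        System.err.println("  sysid: " + result.getSystemId());
        System.err.println("  remembered: " + tracker.close(result));
    }
}
```

The map is not synchronized; if one resolver instance were shared between concurrent transformations, it would need a thread-safe equivalent.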

I hope you see why this is an area where angels fear to tread!

This example also demonstrates an inefficiency, namely that the output events are passing through two ComplexContentOutputters and two NamespaceReducers. This is almost certainly unnecessary in nearly all cases, but I can't be 100% sure that it's not necessary in some unusual cases, for example when the serialization parameter item-separator is used, or when the output is not a well-formed document stream.

Michael Kay
Saxonica

(*) More strictly, the URI that would be the base URI in the absence of any xml:base attribute, and the URI that is used to resolve xml:base if xml:base is present and is relative.



> On 8 Nov 2016, at 09:59, Florent Georges <[hidden email]> wrote:
>
> [...]
>

Re: OutputURIResolver: secondary trees loose base URI

Norman Walsh
Michael Kay <[hidden email]> writes:
> I hope you see why this is an area where angels fear to tread!

Indeed. But at the same time, having no way of getting the base URI
of secondary result documents is…a problem.

                                        Be seeing you,
                                          norm

--
Norman Walsh <[hidden email]> | Criminal: A person with predatory
http://nwalsh.com/            | instincts who has not sufficient
                              | capital to form a corporation.--Howard
                              | Scott


Re: OutputURIResolver: secondary trees loose base URI

Florent Georges-3
In reply to this post by Michael Kay
On 9 November 2016 at 13:46, Michael Kay wrote:

Hi Mike,

Thank you for this precise analysis!

> [...] in 3.0 [...]  In fact the serialization spec says that
> sequence normalization is equivalent to an xsl:document instruction,
> which at face value means that the base URI should be the stylesheet
> base URI.

> [...] I hope you see why this is an area where angels fear to tread!

Yes, absolutely!  Actually, for my precise problem, I do have a fix.
But first I wanted to be sure I was not doing anything silly and I was
not failing to report something in Saxon.

So if I understand correctly, a secondary result tree will always
have, according to the 3.0 specs, the base URI from the stylesheet
(depending on the base URI of the lexical scope it is part of in the
stylesheet).

Which sounds like something of an issue to me.  In 2.0, I used to see the result
of a transformation as a primary result tree + a set of named result
trees (that is, a dictionary, base URIs being the keys, and trees the
values).  In 3.0, the set of secondary trees is essentially anonymous
by default then (except for dealing with xml:base in the stylesheet).
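
To make the "dictionary" view above concrete, here is a minimal sketch in
plain Java.  It is only an illustration: the class ResultSetSketch and its
methods are hypothetical names, not part of Saxon or any XSLT API, and the
trees are represented as plain strings.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: one primary tree plus secondary trees keyed by URI.
public class ResultSetSketch {
    private final String primary;
    private final Map<URI, String> secondary = new LinkedHashMap<>();

    ResultSetSketch(String primary) {
        this.primary = primary;
    }

    // The key is the href resolved against the base output URI.
    void putSecondary(URI baseOutputUri, String href, String tree) {
        secondary.put(baseOutputUri.resolve(href), tree);
    }

    String primary() {
        return primary;
    }

    String secondary(URI absoluteUri) {
        return secondary.get(absoluteUri);
    }

    public static void main(String[] args) {
        URI base = URI.create("http://www.example.com/");
        ResultSetSketch results = new ResultSetSketch("<ignored/>");
        results.putSecondary(base, "foo.xml", "<doc>Hello Foo!</doc>");
        results.putSecondary(base, "bar.xml", "<doc>Hello Bar!</doc>");
        System.out.println(results.secondary(URI.create("http://www.example.com/bar.xml")));
    }
}
```

If the secondary trees lose their URI, the map above degrades to a plain
list, which is the "anonymous" situation described here.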

If I am right, this is also how XProc defined p:xslt and its
"secondary" output port (the primary output port is the primary result
tree, this one is a sequence with the secondary output trees).

I know timing is bad, but it sounds to me like xsl:result-document
should be defined to ensure the base URI of the result tree is the
(resolved) base URI from its @href.  Or maybe I misunderstood
something, which would not surprise me here...

Regards,

--
Florent Georges
http://fgeorges.org/
http://h2o.consulting/


Re: OutputURIResolver: secondary trees loose base URI

Michael Kay
In reply to this post by Norman Walsh

> On 9 Nov 2016, at 14:35, Norman Walsh <[hidden email]> wrote:
>
> Michael Kay <[hidden email]> writes:
>> I hope you see why this is an area where angels fear to tread!
>
> Indeed. But at the same time, having no way of getting the base URI
> of secondary result documents is…a problem.
>
>

Just to be clear, the problem here is that there are two conflicting settings: the application has set a system ID using result.setSystemId(), and the stylesheet processor has then overwritten it with a different one, based on the rules in the spec (or one of the specs...). The question is whether this is the right thing to do.
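
Since the system ID stored on the Result may be overwritten between resolve() and close(), one possible workaround is to remember the resolved URI separately, keyed by the Result object itself (the original report notes that close() receives the very same object). The sketch below uses only JDK classes so that it runs without Saxon; SystemIdSideTable is a hypothetical name, its resolve() and close() merely mirror the shape of the OutputURIResolver methods, and the setSystemId("") call in main() is a stand-in for whatever the processor does.

```java
import java.io.StringWriter;
import java.net.URI;
import java.util.IdentityHashMap;
import java.util.Map;
import javax.xml.transform.Result;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper: keeps the resolved URI outside the Result object.
public class SystemIdSideTable {
    // Identity-based, because the same Result instance comes back in close().
    // Not thread-safe; a real resolver would need to guard this map.
    private final Map<Result, String> uris = new IdentityHashMap<>();

    Result resolve(String href, String base) {
        String uri = URI.create(base).resolve(href).toASCIIString();
        Result res = new StreamResult(new StringWriter());
        res.setSystemId(uri);
        uris.put(res, uri);   // remember it independently of the Result
        return res;
    }

    // Returns the URI recorded at resolve() time, whatever the Result says now.
    String close(Result result) {
        return uris.remove(result);
    }

    public static void main(String[] args) {
        SystemIdSideTable table = new SystemIdSideTable();
        Result res = table.resolve("foo.xml", "http://www.example.com/");
        res.setSystemId("");  // stand-in for the processor overwriting the value
        System.out.println("sysid on result: '" + res.getSystemId() + "'");
        System.out.println("remembered URI:  " + table.close(res));
    }
}
```

This sidesteps the question of which setting should win; it only ensures the application can still recover the URI it computed.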

Michael Kay
Saxonica

Re: OutputURIResolver: secondary trees loose base URI

Michael Kay
In reply to this post by Florent Georges-3
I've found the relevant text in the 3.0 spec: it says (under xsl:result-document):

The href attribute is optional. The default value is the zero-length string. The effective value of the attribute must be a URI Reference, which may be absolute or relative. If it is relative, then it is resolved against the base output URI.

If the implementation provides an API to access secondary results, then it must allow a secondary result to be identified by means of the absolutized value of the href attribute. In addition, if a final result tree is constructed (that is, if the effective value of build-tree is yes), then this value is used as the base URI of the document node at the root of the final result tree.

So I think Saxon is correct to overwrite Result.systemId with its own value for the base URI, but what's happening here is that for some reason the fact that the stylesheet has no static base URI is causing it to behave as if there is no base output URI. I'll raise a bug on this.

https://saxonica.plan.io/issues/3024
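
As an illustration of the resolution rule quoted above, the following sketch absolutizes an href against a base output URI using java.net.URI, the same approach as the resolveUri() helper in the original example. The class and method names (ResolveHref, absolutize) are made up for this example, and the code only shows the URI arithmetic, not what Saxon does internally.

```java
import java.net.URI;

// Hypothetical helper showing how a relative href becomes an absolute URI.
public class ResolveHref {
    // Resolve the (possibly relative) href against the base output URI.
    static String absolutize(String href, String baseOutputUri) {
        return URI.create(baseOutputUri).resolve(href).toASCIIString();
    }

    public static void main(String[] args) {
        // Values taken from the example in this thread.
        System.out.println(absolutize("foo.xml", "http://www.example.com/"));
        // A base that names a document: the last path segment is replaced.
        System.out.println(absolutize("bar.xml", "http://www.example.com/out/main.xml"));
        // An href that is already absolute is returned unchanged.
        System.out.println(absolutize("http://other.example.org/x.xml", "http://www.example.com/"));
    }
}
```

With the base output URI from the test case, foo.xml and bar.xml would be expected to end up as http://www.example.com/foo.xml and http://www.example.com/bar.xml.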

Michael Kay
Saxonica



Learning more about result trees (was Re: OutputURIResolver: secondary trees loose base URI)

Norman Walsh
Michael Kay <[hidden email]> writes:
> So I think Saxon is correct to overwrite Result.systemId with its own
> value for the base URI, but what's happening here is that for some
> reason the fact that the stylesheet has no static base URI is causing
> it to behave as if there is no base output URI. I'll raise a bug on
> this.

On the off chance that this bug has you digging around in a relevant
bit of the code base, I’ll just say again that it would be very nice
if it was possible to find out what serialization parameters Saxon
would have used to serialize a secondary result tree, had it not
been intercepted by the output resolver. :-)

                                        Be seeing you,
                                          norm

--
Norman Walsh <[hidden email]> | Criminal: A person with predatory
http://nwalsh.com/            | instincts who has not sufficient
                              | capital to form a corporation.--Howard
                              | Scott
