document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
Using Saxon-HE 9.6.0.6J from Saxonica (and also SaxonPE as installed with
oXygenXML), I have the following code:

    <xsl:variable name="keydefsURI" as="xs:string"
      select="relpath:newFile(relpath:toUrl($tempdir), 'keydef.xml')"
    />
    <xsl:message> + [DEBUG] local:getURIForKeyref(): keydefsURI="<xsl:value-of select="$keydefsURI"/>"</xsl:message>
    <xsl:variable name="keydefDoc" as="document-node()?"
      select="document($keydefsURI, root($keyref))"
    />
    <xsl:message> + [DEBUG] local:getURIForKeyref(): document-uri($keydefDoc)="<xsl:value-of select="document-uri($keydefDoc)"/>"</xsl:message>
   

That is: construct a URL, get the document at that URL, use document-uri()
to get the document's URI.


Under OS X I get the expected result:

     [xslt]  + [DEBUG] local:getURIForKeyref(): keydefsURI="/Users/ekimber/workspace-dita-community/dita13-dita-ot-1.x-support/test/dita/temp/pdf-ah/keydef.xml"
     [xslt]  + [DEBUG] local:getURIForKeyref(): document-uri($keydefDoc)="file:/Users/ekimber/workspace-dita-community/dita13-dita-ot-1.x-support/test/dita/temp/pdf-ah/keydef.xml"
 


That is, the value returned by document-uri() matches the value passed to
the document() function (apart from the added file: scheme).

However, under Windows, using the same test document, I get this:

     [xslt]  + [DEBUG] local:getURIForKeyref(): keydefsURI="file:/C:/Users/Eliot Kimber/workspace/dita-community/dita13-dita-ot-1.x-support/test/dita/temp/xhtml/oxygen_dita_temp/keydef.xml"
     [xslt]  + [DEBUG] local:getURIForKeyref(): document-uri($keydefDoc)="file:/C:/Users/Eliot%20Kimber/workspace/dita-community/dita13-dita-ot-1.x-support/test/dita/temp/xhtml/oxygen_dita_temp/domains/topics/mathml-tests.dita/../file:/C:/Users/Eliot Kimber/workspace/dita-community/dita13-dita-ot-1.x-support/test/dita/temp/xhtml/oxygen_dita_temp/keydef.xml"

Note the added
"file:/C:/Users/Eliot%20Kimber/workspace/dita-community/dita13-dita-ot-1.x-support/test/dita/temp/xhtml/oxygen_dita_temp/domains/topics/mathml-tests.dita/.."


(If I use the one-argument form of document() then I get the URL of the
XSLT module instead of the input source document.)

This resulting URL is obviously invalid and subsequent attempts to use it
as the base for other URLs fail under Windows.

I'm using the document-uri() approach here because I could have either of
two possible context nodes to use for relative URL resolution: the
keydef.xml file (a temp file generated by the Open Toolkit and always in
the same place for a given run) or one of the input files, so my current
logic is to determine the appropriate context document then get its URL in
order to then construct relative URLs.

Is this my user error or a bug in Saxon?

I can work around this in my code but I'd like to know what the root cause
of this issue is, especially if this is my user error.

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




------------------------------------------------------------------------------
_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
[hidden email]
https://lists.sourceforge.net/lists/listinfo/saxon-help 

Re: document-uri() returns bad URL on Windows but Not Mac

Michael Kay
Could you supply a runnable repro please?

Apart from anything else, I think the email sender/transmitter/reader might have mangled some of your strings.

Michael Kay
Saxonica



Re: document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
I'll try to get a runnable test case together. The full code is in GitHub
but it currently depends on a specific Open Toolkit configuration to run.

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





Re: document-uri() returns bad URL on Windows but Not Mac

Jirka Kosek
In reply to this post by Eliot Kimber-2
On 10.6.2015 6:00, Eliot Kimber wrote:
> Is this my user error or a bug in Saxon?

And what are your functions in the relpath namespace doing?

 relpath:newFile(relpath:toUrl($tempdir), 'keydef.xml')

Couldn't the problem be there?

Also, you might find the resolve-uri() function useful in your case
(http://www.w3.org/TR/xpath-functions/#func-resolve-uri).

Personally, I switched to using doc() and resolve-uri()/base-uri() rather
than relying on a little bit of magic and the overloaded document() function.

                                Jirka

--
------------------------------------------------------------------
  Jirka Kosek      e-mail: [hidden email]      http://xmlguru.cz
------------------------------------------------------------------
     Professional XML and Web consulting and training services
DocBook/DITA customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
------------------------------------------------------------------
    Bringing you XML Prague conference    http://xmlprague.cz
------------------------------------------------------------------



Re: document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
The functions in the relpath namespace should be doing the right thing
:-)--they've been around for quite a long time now.

The failure I'm getting is actually in resolve-uri() where the base I'm
passing in is the bad URL coming back from document-uri().

I'm building a focused test case so I can isolate the problem.

But the weird part for me is that I'm passing a URL into document() that
results in the document being found and parsed, but then calling
document-uri() on the resulting document node returns a bad URL. So it
looks like a bug in the Java underpinnings here--that is, my code (the
relpath functions and the local code that uses them) is providing a
resolvable URI to document() but the URI associated with the resulting
document node is bad. Because it happens on Windows and not OS X I'm
guessing it's an issue with how the Windows-specific file:/ URLs are being
handled.

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





Re: document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
My initial efforts at a focused test are not able to reproduce the
behavior I'm getting with my production code, which strongly suggests
user error on my part.

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





Re: document-uri() returns bad URL on Windows but Not Mac

Michael Kay
In reply to this post by Eliot Kimber-2
>
> But the weird part for me is that I'm passing a URL into document() that
> results in the document being found and parsed, but then calling
> document-uri() on the resulting document node returns a bad URL. So it
> looks like a bug in the Java underpinnings here--that is, my code (the
> relpath functions and the local code that uses them) is providing a
> resolvable URI to document() but the URI associated with the resulting
> document node is bad. Because it happens on Windows and not OS X I'm
> guessing it's an issue with how the Windows-specific file:/ URLs are being
> handled.
>

There are various things that can happen to the string S passed to the document() function before it comes back as the result T of document-uri().

Firstly, S gets resolved against the appropriate base URI. Saxon is a little bit tolerant here and allows you to pass certain things that aren’t actually URIs: things I sometimes call “wannabe URIs”, e.g. strings that become URIs if you escape them properly. For some reason lost in the mists of time, we actually escape spaces in the wannabe URI as %20 “by hand”, leaving other characters that are invalid in a URI unescaped.

Secondly, it gets handed to the URIResolver (which may, for example, access OASIS or other catalogs). The URIResolver returns a Source, and it is actually the value of the systemId property in this Source object that ends up being used as the value of the document-uri() property. If the URIResolver is feeling mischievous, this may be quite unrelated to the string that you passed to the document() function.

Saxon checks what comes back from the URIResolver to make sure it is a viable URI (e.g. that %HH escape sequences are valid) because dereferencing a badly-formed URI (as a URL) has been known to crash things very unhealthily.
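
To illustrate the second step, here is a minimal Java sketch of a URIResolver whose returned Source carries a systemId unrelated to the href it was given. The class name RedirectingResolver and the target location are hypothetical, chosen only for illustration; this uses the standard javax.xml.transform interfaces and is not Saxon's own resolver.

```java
import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical resolver, for illustration only: it redirects every request
// to one fixed location, much as a catalog-based resolver might.
class RedirectingResolver implements URIResolver {
    // Assumed target location; not taken from the thread.
    static final String TARGET = "file:/C:/catalog/mirror/keydef.xml";

    @Override
    public Source resolve(String href, String base) {
        // The systemId of the returned Source need not resemble href or base.
        return new StreamSource(TARGET);
    }

    public static void main(String[] args) {
        Source s = new RedirectingResolver()
                .resolve("keydef.xml", "file:/C:/temp/topics/mathml-tests.dita");
        // Prints the resolver's systemId, not the href that was requested.
        System.out.println(s.getSystemId());
    }
}
```

If a resolver like this were installed, the description above implies that document-uri() would report the resolver's systemId rather than the string passed to document().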

Michael Kay
Saxonica



Re: document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
OK, I've created a gist here:

https://gist.github.com/drmacro/6f99df5330ff618d662d

It contains my focused test and an explanation of how to run it.

The issue appears to be assigning a document to a variable within a
function: in that case a subsequent document-uri() returns the bad value.

Doing exactly the same assignment with the same documents not in a
function does not result in a problem.

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





Re: document-uri() returns bad URL on Windows but Not Mac

Jirka Kosek
On 11.6.2015 0:21, Eliot Kimber wrote:
> The issue appears to be assigning a document to a variable within a
> function: in that case a subsequent document-uri() returns the bad value.
>
> Doing exactly the same assignment with the same documents not in a
> function does not result in a problem.

Note that a function doesn't receive any context other than the parameters
passed to it, so calling the document() function without a second argument
will give a different result than it would if some context node were set
(as inside a template body).


Re: document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
I'm passing in an absolute file:/ URL as the first parameter to document()
so the base shouldn't matter. In my tests, it didn't matter whether I
specified the second parameter or not: I got the same behavior, the only
difference being the URL of the context node that gets added to the
document's URL.
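
The expectation that the base should not matter for an absolute URL can be checked against java.net.URI, which Saxon's Java code builds on. The following is a small sketch with shortened, hypothetical paths (the class and method names are also illustrative): resolving an absolute href returns it unchanged, while a relative href is combined with the base.

```java
import java.net.URI;

class ResolveAgainstBase {
    // Resolve href against base using java.net.URI's standard resolution.
    static String resolve(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "file:/C:/temp/topics/mathml-tests.dita"; // hypothetical base

        // An absolute href comes back unchanged, whatever the base is.
        System.out.println(resolve(base, "file:/C:/temp/keydef.xml"));
        // prints file:/C:/temp/keydef.xml

        // A relative href is resolved against the base.
        System.out.println(resolve(base, "../keydef.xml"));
        // prints file:/C:/temp/keydef.xml
    }
}
```

Note that this only holds when the href is itself a legal URI; an href containing an unescaped space makes URI.resolve() throw instead.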

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





Re: document-uri() returns bad URL on Windows but Not Mac

Tony Graham
On 11/06/2015 16:07, Eliot Kimber wrote:
> I'm passing in an absolute file:/ URL as the first parameter to
> document()

Have you ever tried it with URLs starting with 'file:///'?  Sometimes
Java on Windows is sensitive about such things.  Lower-case drive
letters can be another problem with Java, but you are using 'C', IIRC.
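
For what it is worth, java.net.URI itself parses the two spellings to the same path, as this small sketch suggests (the file name is hypothetical). Other components, such as java.io.File, URL handlers, or XML parsers, may still treat them differently, so this does not rule out the sensitivity described above.

```java
import java.net.URI;

class FileUrlForms {
    // Return the path component that java.net.URI extracts from a file URL.
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // Both forms yield the same path component.
        System.out.println(pathOf("file:/C:/temp/keydef.xml"));   // prints /C:/temp/keydef.xml
        System.out.println(pathOf("file:///C:/temp/keydef.xml")); // prints /C:/temp/keydef.xml
    }
}
```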

Regards,


Tony Graham.
--
Senior Architect
XML Division
Antenna House, Inc.
----
Skerries, Ireland
[hidden email]


Re: document-uri() returns bad URL on Windows but Not Mac

Michael Kay
In reply to this post by Eliot Kimber-2
Here is what is happening.

The document() function (and doc() for that matter) do two things:

(a) they check in a document pool to see if the URI is already known

(b) if not, they call the URIResolver to dereference the URI and fetch the resource

The URIResolver is defined to take the href and baseURI as separate parameters, so the URIResolver has its own logic to combine them into an absolute URI. But before calling it, we need to see if the absolute URI is present in the document pool, so there’s separate logic to construct an absolute URI for that purpose. It turns out that the two bits of code to combine href and baseURI are doing it slightly differently: the URIResolver logic escapes any spaces in the href to %20, but the document pool logic does not; instead it gets an exception from URI.resolve() and uses a fallback algorithm to concatenate the base URI and relative URI with “/../” as a separator. It’s this fallback URI that you are seeing in the result from the document-uri() function.
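
The effect described above can be reproduced outside Saxon with a few lines of Java. This is only a sketch of the described behaviour, not Saxon's actual code: the class FallbackDemo, the method makeAbsolute, and the shortened paths are all hypothetical.

```java
import java.net.URI;

class FallbackDemo {
    // Hypothetical stand-in for the document-pool logic described above.
    static String makeAbsolute(String href, String base) {
        try {
            return URI.create(base).resolve(href).toString();
        } catch (IllegalArgumentException e) {
            // href is not a legal URI (for example, it contains a space),
            // so fall back to plain concatenation with "/../".
            return base + "/../" + href;
        }
    }

    public static void main(String[] args) {
        String base = "file:/C:/Users/Eliot%20Kimber/temp/topics/mathml-tests.dita";

        // Unescaped space: URI.resolve() throws and the fallback is used.
        System.out.println(makeAbsolute("file:/C:/Users/Eliot Kimber/temp/keydef.xml", base));
        // prints file:/C:/Users/Eliot%20Kimber/temp/topics/mathml-tests.dita/../file:/C:/Users/Eliot Kimber/temp/keydef.xml

        // Escaped space: the absolute href is returned unchanged.
        System.out.println(makeAbsolute("file:/C:/Users/Eliot%20Kimber/temp/keydef.xml", base));
        // prints file:/C:/Users/Eliot%20Kimber/temp/keydef.xml
    }
}
```

The first result has the same shape as the bad URL in the original report: the context document's URI, then "/../", then the unescaped href.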

According to the spec, the argument to document() should be a URI, which means it can’t contain unescaped spaces. We’re trying to be a bit friendlier than that, but we’re not entirely succeeding. We’ll fix this by doing the “escape spaces” logic on the document pool path as well as the URIResolver path. Meanwhile, please ensure that the value you pass to document() is a valid URI.
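
As a sketch of what "a valid URI" means here, the following Java shows two ways to turn a Windows path containing a space into a legal file: URI. The class and method names and the shortened path are hypothetical; on the XSLT side the same escaping would need to happen before the string reaches document().

```java
import java.net.URI;
import java.net.URISyntaxException;

class ValidFileUri {
    // Escape only spaces, as the thread says Saxon does "by hand".
    static String escapeSpaces(String wannabeUri) {
        return wannabeUri.replace(" ", "%20");
    }

    // Let java.net.URI quote illegal characters in an absolute path.
    // The path must start with "/" and must not already be percent-encoded,
    // because this constructor also quotes any "%" it finds.
    static String fromPath(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escapeSpaces("file:/C:/Users/Eliot Kimber/temp/keydef.xml"));
        // prints file:/C:/Users/Eliot%20Kimber/temp/keydef.xml
        System.out.println(fromPath("/C:/Users/Eliot Kimber/temp/keydef.xml"));
        // prints file:/C:/Users/Eliot%20Kimber/temp/keydef.xml
    }
}
```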

We haven’t explored all the variations on why this bug is occurring on some paths but not others, and on some operating systems and not others.

Michael Kay
Saxonica

Re: document-uri() returns bad URL on Windows but Not Mac

debbie
In reply to this post by Eliot Kimber-2
See the bug and resolution information at https://saxonica.plan.io/issues/2397
 
A 9.6 patch will be available in the next maintenance release.
 
Debbie Lockett,
Saxonica

> Message: 3
> Date: Fri, 12 Jun 2015 11:12:47 +0100
> From: Michael Kay <[hidden email]>
> Subject: Re: [saxon] document-uri() returns bad URL on Windows but Not
> Mac
> To: Mailing list for the SAXON XSLT and XQuery processor
> <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset=utf-8
>

Re: document-uri() returns bad URL on Windows but Not Mac

Eliot Kimber-2
In reply to this post by Michael Kay
OK, the explanation makes sense.

In my code I did add the logic (or thought I did) to escape the URL (in
particular, the spaces), but I'll check my test case again.

One challenge is that the built-in escape-uri() functions do things like
escape the ":" in "/c:/Users/" which we don't want, so I've had to
implement my own URL escaping function but it may not be exactly right
either.
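
For illustration only, here is the kind of escaping I mean, sketched in Java rather than XSLT. The class and method names are made up, and the rule (keep printable ASCII, including ":" and "/", except for a short list of characters that are not allowed in a URI; percent-encode everything else as UTF-8) is an assumption about what is wanted here, not a description of any built-in function.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a URL escaper that leaves ":" and "/" alone.
final class FileUrlEscaper {

    // Printable ASCII characters that may not appear in a URI.
    private static final String INVALID_ASCII = " <>\"{}|\\^`";

    static String escape(String url) {
        StringBuilder out = new StringBuilder();
        for (byte b : url.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            // Keep printable ASCII unless it is in the invalid list.
            // "%" is kept, so existing escapes are not escaped again.
            boolean keep = c > 0x20 && c < 0x7F && INVALID_ASCII.indexOf(c) < 0;
            if (keep) {
                out.append((char) c);
            } else {
                out.append(String.format("%%%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints file:/C:/Users/Eliot%20Kimber/temp/keydef.xml
        System.out.println(escape("file:/C:/Users/Eliot Kimber/temp/keydef.xml"));
    }
}
```

Because "%" passes through unchanged, a literal percent sign in a file name would not be escaped, so this is only suitable for input that is either unescaped or already correctly escaped.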

The whole mess with URL handling in Java coupled with inconsistent
behavior on Windows for file:/ URLs is a real pain.

On the bright side, I've started setting up a hosted Windows test server
for this code, so yay.

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com





Re: document-uri() returns bad URL on Windows but Not Mac

Michael Kay

>
> The whole mess with URL handling in Java coupled with inconsistent
> behavior on Windows for file:/ URLs is a real pain.
>

Indeed. It starts with a series of very badly written RFCs (for which timbl must take a good share of the responsibility), including the absence of any definitive specification for how URIs in the “file:” scheme should map to the file naming conventions of particular operating systems.

Michael Kay
Saxonica

