saxon:line-number is sometimes incorrect

saxon:line-number is sometimes incorrect

Stefan Wachter
Hi all,

I have a small stylesheet that outputs the first character of text nodes
together with their line number:

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:saxon="http://saxon.sf.net/">
  <xsl:template match="/">
    <xsl:for-each select=".//text()">
      <xsl:message select="substring(., 1, 1), saxon:line-number(.)"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>


When this stylesheet is applied to the input document:

<?xml version="1.0" encoding="ISO-8859-1"?>
<root>a
<b>b
</b>c
</root>

then the following result is returned:

a 2
b 3
c 3

Clearly, the text node starting with the character "c" begins one line
later than the text node starting with "b".

--Stefan




-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
_______________________________________________
saxon-help mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/saxon-help

RE: saxon:line-number is sometimes incorrect

Michael Kay
There are two issues here: the granularity of the line numbering reported by
the SAX parser, and the granularity maintained within the Saxon tree.

SAX reports the end position of each event. Typically for startElement this
is the position of the ">" at the end of the start tag. For character
content, there may be a line number for each fragment of text, but more
typically it will be the line number of the end of a chunk of contiguous
text.

Because the line number for text nodes is "fuzzy" (and because Saxon doesn't
need it for reporting errors in schemas and stylesheets), the Saxon tree
model maintains line numbers only for element nodes (and then only if you
request it, e.g. using the -l option). If you ask for the line number of any
other node, you get the line number held for the nearest
ancestor-or-preceding element: that is, the line number reported for the
start tag of that element.
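
The rule above can be modelled outside Saxon. The following is only a sketch, assuming Python's standard xml.sax module as a stand-in parser; the class name ElementLineOnly and its fields are made up for illustration and are not Saxon's implementation. It stores a line number for elements only and gives every text node the line held for the most recently started element, which is the nearest ancestor-or-preceding element. Every start tag in the sample sits on a single line, so the result does not depend on whether the parser reports the start or the end of a tag.

```python
import xml.sax

SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<root>a
<b>b
</b>c
</root>
"""


class ElementLineOnly(xml.sax.ContentHandler):
    """Hypothetical model: line numbers are kept for elements only."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.last_element_line = None  # line held for the latest start tag
        self.in_text = False           # True inside one contiguous text node
        self.report = []               # (first character, reported line)

    def setDocumentLocator(self, locator):
        self.locator = locator

    def startElement(self, name, attrs):
        self.last_element_line = self.locator.getLineNumber()
        self.in_text = False

    def endElement(self, name):
        self.in_text = False

    def characters(self, content):
        # The parser may deliver one text node in several chunks;
        # only the first chunk marks the start of a new text node.
        if not self.in_text and content:
            self.report.append((content[0], self.last_element_line))
            self.in_text = True


handler = ElementLineOnly()
xml.sax.parseString(SAMPLE, handler)
for first_char, line in handler.report:
    print(first_char, line)
```

Run on the sample document, this prints "a 2", "b 3", "c 3", the same figures as in the original report: the text node starting with "c" inherits line 3 from the start tag of the preceding b element.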

Michael Kay
http://www.saxonica.com/


Re: saxon:line-number is sometimes incorrect

Stefan Wachter
Thank you for the information. I see.

The problem I am working on is to output XML documents as line-numbered
HTML documents. I think that a depth-first traversal that counts the
line breaks will do the job. I will simply start with the line number
of the document root element and increment the line counter for each
newline character in text nodes. As long as there are no line breaks in
element start tags, this will work.

Thanks again.
--Stefan



RE: saxon:line-number is sometimes incorrect

Michael Kay
I think line breaks in start tags are quite common, especially when
documents declare several namespaces. There's also the document prolog to
worry about. What you could do is to use Saxon's line-number for the
elements, and then add the number of line breaks encountered in text nodes.

Michael Kay
http://www.saxonica.com/


Re: saxon:line-number is sometimes incorrect

Stefan Wachter
Thanks for the tip. Now I am doing a depth-first traversal, counting
line breaks in text nodes and synchronizing the current line number
whenever an element is entered. This works fine. The only problem would
be the root element. Fortunately, the root elements are unimportant
containers in my application.
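
The traversal described here can be sketched roughly as follows. This is an illustration only, not the code used in the application: the Element class and the function text_start_lines are hypothetical names, and the line field is assumed to hold the value that saxon:line-number() would return for the element, i.e. the line on which its start tag ends.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    name: str
    line: int  # assumed: the value saxon:line-number() reports for this element
    children: List[Union["Element", str]] = field(default_factory=list)


def text_start_lines(root):
    """Depth-first walk returning (first character, start line) per text node."""
    result = []
    line = root.line

    def visit(element):
        nonlocal line
        line = element.line  # synchronize whenever an element is entered
        for child in element.children:
            if isinstance(child, str):
                result.append((child[:1], line))
                line += child.count("\n")  # advance past line breaks in the text
            else:
                visit(child)

    visit(root)
    return result


# The sample document from the first message.
sample = Element("root", 2, ["a\n", Element("b", 3, ["b\n"]), "c\n"])
print(text_start_lines(sample))

# Same content, but the start tag of <b> is spread over lines 3 to 5.
wrapped = Element("root", 2, ["a\n", Element("b", 5, ["b\n"]), "c\n"])
print(text_start_lines(wrapped))
```

For the sample this gives lines 2, 3 and 4 for the text nodes starting with "a", "b" and "c". In the second tree the counter jumps to 5 on entering b, so a start tag that spans several lines does not throw off the numbers that follow.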

--Stefan
