Saxon HE in .NET speed

Meindert Oldenburger
We use Saxon as our XQuery processor to execute some XQuery statements. Money is limited, so we use the HE version :)

Requirements:
- Query 3000 objects quickly with a statement like "/Objects/Cables/Cable/Properties/ReelAD";
- Update text nodes without reloading the whole XML data.

The current implementation is:
public XPathEngine(string aXML)
{
    NameTable nt = new NameTable();
    mDoc = new XmlDocument(nt);
    byte[] byteArray = Encoding.UTF8.GetBytes(aXML);
    mDoc.Load(new MemoryStream(byteArray));
    // Wrap the XmlDocument for Saxon
    mProcessor = new Processor();
    mRoot = mProcessor.NewDocumentBuilder().Wrap(mDoc);
    mXPathCompiler = mProcessor.NewXPathCompiler();
    mXQueryCompiler = mProcessor.NewXQueryCompiler();
}
This means we use XmlDocument as the document model.
The disadvantage is that some queries, such as the one above, take 14 seconds or more.

The other implementation is:
public XPathEngine(string aXML)
{
    byte[] byteArray = Encoding.UTF8.GetBytes(aXML);
    mProcessor = new Processor();
    // Load the source document into a DocumentBuilder
    DocumentBuilder builder = mProcessor.NewDocumentBuilder();
    builder.BaseUri = new Uri("file:///dummy/base/uri");
    mRoot = builder.Build(new MemoryStream(byteArray));
    mXPathCompiler = mProcessor.NewXPathCompiler();
    mXQueryCompiler = mProcessor.NewXQueryCompiler();
}
This is faster than the previous implementation, but it seems I cannot make minor updates to text nodes.

Could someone provide me with a possible solution?

_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
[hidden email]
https://lists.sourceforge.net/lists/listinfo/saxon-help 

Re: Saxon HE in .NET speed

Michael Kay
How big is the difference between the two?

Generally our own experience is that using the Saxon tree model is 5-10 times faster than using a third-party DOM implementation. The Saxon tree model uses integer fingerprints to make name matching very fast, it holds namespace information in a much more effective way than DOM implementations do, and it makes it very easy to sort nodes into document order. A lot of this is only possible because the tree model (unlike DOM) is immutable.

A third option that is sometimes useful is to build the input as a DOM, and then copy it to a Saxon tree rather than wrapping it. Copying takes longer, but it makes Saxon's XPath access faster. Since your files are not that big but the query execution is slow, this might well be the best approach for you.
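
A minimal sketch of this third option, assuming the Saxon.Api assembly is referenced and that its DocumentBuilder.Build(XmlNode) overload copies the DOM into a Saxon tree (as opposed to Wrap, which only wraps it). The class and method names are hypothetical, and the dummy base URI is the one used in the original post:

```csharp
using System;
using System.Xml;
using Saxon.Api;

public static class SaxonCopyExample
{
    // Hypothetical helper: copies an existing XmlDocument into Saxon's own tree model.
    public static XdmNode CopyToSaxonTree(Processor processor, XmlDocument doc)
    {
        DocumentBuilder builder = processor.NewDocumentBuilder();
        // Dummy base URI, taken from the original post.
        builder.BaseUri = new Uri("file:///dummy/base/uri");
        // Build(XmlNode) copies the DOM; Wrap(doc) would leave Saxon navigating the DOM itself.
        return builder.Build(doc);
    }
}
```

The XmlDocument stays available for updates, but the copy is a snapshot: changes made to the DOM afterwards are not visible in the Saxon tree until it is rebuilt.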

However, if the query execution is taking 14 seconds, then there is a very good chance that it would benefit from the kind of optimizations that Saxon-EE performs.

Michael Kay
Saxonica

Re: Saxon HE in .NET speed

Meindert Oldenburger
Thanks for your answer. Overall it seems to be twice as fast if I use the Saxon tree model, but that measurement also includes some GUI processing, so the XQuery itself must be more than twice as fast.

Because of our implementation, we need to keep two document models alive:
- XmlDocument: needed to make minor changes to text nodes;
- Saxon tree: needed to query particular nodes quickly.

At some point we need to translate between XmlElement and XdmNode. For now I solve that by generating a path for the XdmNode, such as "/node()[3]/node()[2]", and evaluating it against the XmlDocument to find the corresponding XmlElement.
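
A rough sketch of the positional-path idea on the XmlDocument side, using only System.Xml. The class and method names are hypothetical. It counts siblings through an XPathNavigator so that positions follow the same XPath view of the DOM that SelectSingleNode uses; it is meant for element, text, comment and processing-instruction nodes, not attributes. Generating the same kind of path from an XdmNode would need the equivalent sibling counting on the Saxon side, which is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class NodePaths
{
    // Builds a positional path such as "/node()[1]/node()[2]" for a node in an XmlDocument.
    public static string PositionalPath(XmlNode node)
    {
        XPathNavigator nav = node.CreateNavigator();
        var steps = new List<string>();
        while (nav.NodeType != XPathNodeType.Root)
        {
            // Position = 1 + number of preceding siblings as seen by XPath.
            int position = 1;
            XPathNavigator sibling = nav.Clone();
            while (sibling.MoveToPrevious()) position++;
            steps.Add("node()[" + position + "]");
            if (!nav.MoveToParent())
                throw new ArgumentException("Node is not attached to a document.");
        }
        steps.Reverse();
        return "/" + string.Join("/", steps);
    }

    // Resolves a positional path back to the node in the XmlDocument.
    public static XmlNode Resolve(XmlDocument doc, string path)
    {
        return doc.SelectSingleNode(path);
    }
}
```

Such paths only stay valid while the structure is unchanged: inserting or removing a sibling shifts the positions of the nodes after it.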

Some questions about this:
- Is the node order the same in the XmlDocument as in the Saxon tree? (The Saxon tree is generated by serializing the XmlDocument and parsing the result with the Saxon DocumentBuilder.)
- Is there a smarter way to carry XmlElement information in an XdmNode? One solution could be to add an ID attribute to each XmlElement before passing the document to the builder.
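
The ID-attribute idea could be sketched as follows, using only System.Xml. The attribute name "nodeId" and the class name are hypothetical; pick an attribute name that cannot clash with the real data. Each element is stamped once and indexed in a dictionary, so an ID read from a query result leads straight back to its XmlElement.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

public static class ElementIds
{
    // Hypothetical attribute name used to tag every element.
    public const string IdAttribute = "nodeId";

    // Stamps every element with a unique ID and returns an ID -> XmlElement index.
    public static Dictionary<string, XmlElement> Stamp(XmlDocument doc)
    {
        // Collect the elements first so the document is not modified while it is being iterated.
        var elements = new List<XmlElement>();
        foreach (XmlNode node in doc.SelectNodes("//*"))
            elements.Add((XmlElement)node);

        var index = new Dictionary<string, XmlElement>();
        for (int i = 0; i < elements.Count; i++)
        {
            string id = i.ToString(CultureInfo.InvariantCulture);
            elements[i].SetAttribute(IdAttribute, id);
            index[id] = elements[i];
        }
        return index;
    }
}
```

After stamping, the document is serialized and built into the Saxon tree as before. On the Saxon side the ID should then be readable from a result node with something like GetAttributeValue(new QName("nodeId")) and looked up in the dictionary. The cost is an extra attribute on every element, which has to be removed again if the XmlDocument is saved.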
 
