Building documents with different schemas


Building documents with different schemas

Patrik.Stellmann

Hi,

 

I’m using DocumentBuilder.build to load multiple documents: a DITA map with all referenced maps and topics. When loading the first topic I get an exception reporting a validation error. I get the same errors when validating the topic file against the map schema, so it appears that the first schema used by the document builder is also used for all following files.

I managed to get around the problem by calling configuration.clearSchemaCache() for each file. But since the schema is usually much more complex than the actual XML file, this makes the solution quite slow.

Is there any better solution to use a schema cache and still load files using different schemas?

 

If it should matter: I’m using Saxon EE 9.6.0.7 (embedded into oXygen 17.1/18.0)

 

Thanks and regards,

Patrik


------------------------------------------------------------------
System Architecture & IT Projects
Tel: +49 40 33449-1142
Fax: +49 40 33449-1400
E-Mail: [hidden email]


GDV Dienstleistungs-GmbH & Co. KG
Glockengießerwall 1
D-20095 Hamburg
www.gdv-dl.de

Registered office and court of registration: Hamburg
HRA 93 894
VAT ID no.: DE 205183123

General partner:
GDV Beteiligungsgesellschaft mbH
Registered office and court of registration: Hamburg
HRB 71 153

Managing directors:
Dr. Jens Bartenwerfer
Michael Bathke

------------------------------------------------------------------
This e-mail and any attached files may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is strictly forbidden.


------------------------------------------------------------------------------

_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
[hidden email]
https://lists.sourceforge.net/lists/listinfo/saxon-help 

Re: Building documents with different schemas

Radu Coravu
Hi Patrik,

This looks like what I described in this issue:

https://saxonica.plan.io/issues/2716

Regards,
Radu

Radu Coravu
<oXygen/>  XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 9/26/2016 12:28 PM, Dr. Patrik Stellmann wrote:

> Hi,
>
>
>
> I’m using DocumentBuilder.build to load multiple documents: a dita map
> with all referenced maps and topics. When loading the first topic I get
> an exception that there’s a validation error. I get the same errors when
> validating the topic file against the map schema. So it appears that the
> first schema that was used by the document builder will be used for all
> following files as well.

Re: Building documents with different schemas

Patrik.Stellmann
Thanks Radu, I'm afraid it is exactly the same problem.

Since the same schema is very likely to occur multiple times when processing a DITA map, some kind of schema caching would be really useful. Maybe there is a better workaround using multiple schema caches, keyed on the value of @xsi:noNamespaceSchemaLocation?

Patrik



-----Original Message-----
From: Radu Coravu [mailto:[hidden email]]
Sent: Monday, 26 September 2016 12:19
To: [hidden email]
Subject: Re: [saxon] Building documents with different schemas

Hi Patrik,

This looks like what I described in this issue:

https://saxonica.plan.io/issues/2716


Re: Building documents with different schemas

Radu Coravu
Hi Patrik,

Right, this is what I proposed on the issue: if the XML is in no
namespace, use the "noNamespaceSchemaLocation" value as the key for
caching access to the schemas.

Regards,
Radu

Radu Coravu
<oXygen/>  XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 9/26/2016 2:44 PM, Dr. Patrik Stellmann wrote:

> Thanks Radu, I'm afraid it is exactly the same problem.
>
> Since it is very likely to have the same schema multiple times when processing a DITA map some kind of schema caching would be really useful. Maybe there is a better workaround by using multiple schema caches depending on the value of @xsi:noNamespaceSchemaLocation?

Re: Building documents with different schemas

Patrik.Stellmann
Sorry, should have read through to the end...

Patrik



-----Original Message-----
From: Radu Coravu [mailto:[hidden email]]
Sent: Monday, 26 September 2016 13:51
To: [hidden email]
Subject: Re: [saxon] Building documents with different schemas

Hi Patrik,

Right, this is what I proposed on the issue, if the XML is in no namespace, to use the "noNamespaceSchemaLocation" for caching access to the schemas.


Re: Building documents with different schemas

Michael Kay
In reply to this post by Patrik.Stellmann
As explained, schema components are held by the Configuration object (in s9api, that means by the Processor), and you can only have one component with a given name, and you can't unload components selectively. So if you have to replace any part of the schema, then you have to replace all of it.

You could reduce the cost of reloading a schema by exporting it to an SCM file and loading it from that, instead of from XSD source.

Re-reading the thread that Radu pointed to, I'm wondering whether there might be some kind of solution that's not too disruptive whereby a Configuration can only have one active schema at a time, but a compiled schema can be swapped in and out: almost like the export/import process, but without serializing the schema components to XML.

Michael Kay
Saxonica



Re: Building documents with different schemas

Patrik.Stellmann

I ran some experiments with different workaround strategies that are easy to implement, comparing them by recursively reading the same DITA map (119 files using 4 different schemas: bookmap, map, topic and task):

1. Call clearSchemaCache() for each file: 66 seconds

2. Find the noNamespaceSchemaLocation value by reading the file as plain text, and call clearSchemaCache() only if the value differs from that of the previous file: 20 seconds

3. Hold a map of multiple configurations with the noNamespaceSchemaLocation value as key: ~3 seconds when reading for the first time (building up the cache), ~1 second when reparsing the map once all schemas are cached.
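
The second strategy could look roughly like the following. This is only a minimal sketch, assuming the schema location appears as a noNamespaceSchemaLocation attribute near the start of each UTF-8 encoded file. The names SchemaLocationSniffer, sniff and schemaChanged are made up for illustration and are not part of Saxon; the caller would call configuration.clearSchemaCache() whenever schemaChanged returns true.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Saxon. */
final class SchemaLocationSniffer {
    // Matches noNamespaceSchemaLocation="..." or '...' regardless of the prefix used.
    private static final Pattern ATTR =
        Pattern.compile("noNamespaceSchemaLocation\\s*=\\s*([\"'])(.*?)\\1");

    // Schema location seen on the previous call to schemaChanged, or null.
    private String lastLocation;

    /** Extracts the attribute value from a chunk of XML text, if present. */
    static Optional<String> sniff(CharSequence xmlText) {
        Matcher m = ATTR.matcher(xmlText);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    /** Reads at most maxChars characters from the start of the file and sniffs them. */
    static Optional<String> sniff(Path file, int maxChars) throws IOException {
        char[] buf = new char[maxChars];
        int n = 0;
        try (Reader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int k;
            // A single read may return fewer characters than requested, so loop.
            while (n < maxChars && (k = r.read(buf, n, maxChars - n)) > 0) {
                n += k;
            }
        }
        return sniff(new String(buf, 0, n));
    }

    /**
     * Returns true when the location differs from the one passed on the previous
     * call. The first call with a non-null location also reports a change.
     */
    boolean schemaChanged(String location) {
        boolean changed = !Objects.equals(location, lastLocation);
        lastLocation = location;
        return changed;
    }
}
```

The text scan is deliberately crude: it would also match the attribute name inside a comment, which seems acceptable for a workaround of this kind.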

 

The third solution will work for me, but of course it would be nice if Saxon supported this directly, so I’d appreciate an implementation of your idea of swapping the schema internally.
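
For reference, the third strategy amounts to something like the sketch below. It is a generic illustration rather than the actual code: PerSchemaConfigurations and forSchema are made-up names, and the type parameter C stands for whatever object holds the compiled schema (for example a Saxon Processor created by the factory function).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper, not part of Saxon: keeps one configuration per schema
 * location so that each schema is compiled only once.
 */
final class PerSchemaConfigurations<C> {
    private final Map<String, C> bySchemaLocation = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    /** The factory creates a fresh configuration for a schema location. */
    PerSchemaConfigurations(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the configuration for this schema location, creating it on first use. */
    C forSchema(String schemaLocation) {
        return bySchemaLocation.computeIfAbsent(schemaLocation, factory);
    }

    /** Number of distinct schema locations seen so far. */
    int size() {
        return bySchemaLocation.size();
    }
}
```

ConcurrentHashMap does not accept null keys, so files without a noNamespaceSchemaLocation attribute would need a fixed placeholder key.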

 

BTW: My first solution consumes a large amount of memory. After parsing the document, oXygen has a total of 1.16 GB of used memory (checked with JProfiler), compared to 0.32 GB with my third solution. So it appears to me that clearSchemaCache() does not really free the memory?

 

Patrik

 




From: Michael Kay [mailto:[hidden email]]
Sent: Monday, 26 September 2016 19:14
To: Mailing list for the SAXON XSLT and XQuery processor <[hidden email]>
Subject: Re: [saxon] Building documents with different schemas

 


Re-reading the thread that Radu pointed to, I'm wondering whether there might be some kind of solution that's not too disruptive whereby a Configuration can only have one active schema at a time, but a compiled schema can be swapped in and out: almost like the export/import process, but without serializing the schema components to XML.

 


Re: Building documents with different schemas

Michael Kay
Thanks for the feedback.

clearSchemaCache() should make the data eligible for garbage collection, but of course it won't trigger an immediate garbage collection. There may also be other things holding references to schema components, e.g. XSLT or XQuery compilers. If you want to investigate further, take a heap dump and look (for example) for instances of com.saxonica.schema.ElementDecl, and examine whether there are references to these objects that lock them down.

Although it's not something I've tested, I suspect that the approach of swapping schemas in and out of a Configuration could be made to work today. Use

EnterpriseConfiguration config = ...
PreparedSchema savedSchema = config.getSuperSchema();

to get a handle on the schema currently loaded in the configuration, and use

config.clearSchemaCache();
config.addSchema(savedSchema);

to reinstate a previously-used schema. It won't be quite as efficient as just setting a pointer to the new schema, because the schema components will all be indexed and checked for uniqueness, but it will be a lot more efficient than compiling them from a source schema document.

Michael Kay
Saxonica

On 5 Oct 2016, at 06:03, Dr. Patrik Stellmann <[hidden email]> wrote:

I experimented with several workaround strategies that are easy to implement. I compared them by recursively reading the same DITA map (119 files with 4 different schemas: bookmap, map, topic and task):

1. Call clearSchemaCache() for each file:
66 seconds

2. Find the noNamespaceSchemaLocation value by reading the file as plain text, and call clearSchemaCache() only if the value differs from that of the previous file:
20 seconds

3. Hold a map of multiple configurations with the noNamespaceSchemaLocation value as key:
~3 seconds when reading the first time (building up the cache), ~1 second when reparsing the map once all schemas are cached.
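
Strategies 2 and 3 above can be sketched roughly as follows. This is only an illustration using JDK classes, not the code that was measured: the class PerSchemaCache and its methods are made-up names, the root element is read with StAX instead of plain text, and the cached value is a generic type standing in for whatever is expensive to build per schema (for example a Saxon Configuration). The factory function passed in is an assumption as well.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: keeps one cached value (e.g. a Configuration) per schema location. */
final class PerSchemaCache<C> {
    private static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    private final Map<String, C> bySchema = new HashMap<>();
    private final Function<String, C> factory;
    int created = 0; // counts how often the (expensive) factory actually ran

    PerSchemaCache(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns xsi:noNamespaceSchemaLocation of the root element, or "" if it is absent. */
    static String schemaLocationOf(Reader xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                while (r.hasNext()) {
                    // The first START_ELEMENT event is the root element; stop there.
                    if (r.next() == XMLStreamConstants.START_ELEMENT) {
                        String loc = r.getAttributeValue(XSI, "noNamespaceSchemaLocation");
                        return loc == null ? "" : loc;
                    }
                }
                return "";
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("cannot read root element", e);
        }
    }

    /** Returns the cached value for this schema location, building it on first use. */
    C forSchema(String schemaLocation) {
        return bySchema.computeIfAbsent(schemaLocation, key -> {
            created++;
            return factory.apply(key);
        });
    }

    public static void main(String[] args) {
        String xsi = " xmlns:xsi=\"" + XSI + "\" xsi:noNamespaceSchemaLocation=\"";
        String[] docs = {
            "<map" + xsi + "map.xsd\"/>",
            "<topic" + xsi + "topic.xsd\"/>",
            "<topic" + xsi + "topic.xsd\"/>",
        };
        // The String value is a placeholder for a real per-schema configuration.
        PerSchemaCache<String> cache = new PerSchemaCache<>(key -> "configuration for " + key);
        for (String doc : docs) {
            String loc = schemaLocationOf(new StringReader(doc));
            System.out.println(loc + " -> " + cache.forSchema(loc));
        }
        System.out.println("configurations created: " + cache.created);
    }
}
```

Keying on the schema location means the expensive per-schema setup happens once per distinct schema rather than once per file, which would be consistent with the gap between the first and the third timing.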

 

The third solution will work for me, but of course it would be nice if Saxon supported this directly. So I’d appreciate an implementation of your idea of swapping the schema internally.

BTW: My first solution consumes a large amount of memory. After parsing the document, oXygen has a total of 1.16 GB of used memory (checked with JProfiler), compared to 0.32 GB with my third solution. So it appears to me that clearSchemaCache() does not really free the memory!?

 

Patrik

 




From: Michael Kay [[hidden email]]
Sent: Monday, 26 September 2016 19:14
To: Mailing list for the SAXON XSLT and XQuery processor <[hidden email]>
Subject: Re: [saxon] Building documents with different schemas

 

As explained, schema components are held by the Configuration object (in s9api, that means by the Processor), and you can only have one component with a given name, and you can't unload components selectively. So if you have to replace any part of the schema, then you have to replace all of it.

 

You could reduce the cost of reloading a schema by exporting it to an SCM file and loading it from that, instead of from XSD source.

 

Re-reading the thread that Radu pointed to, I'm wondering whether there might be some kind of solution that's not too disruptive whereby a Configuration can only have one active schema at a time, but a compiled schema can be swapped in and out: almost like the export/import process, but without serializing the schema components to XML.

 

Michael Kay

Saxonica

 

 




Re: Building documents with different schemas

Patrik.Stellmann

Thanks for the hints. I got the approach of caching only the schema to work as well, with pretty much the same runtime as caching multiple configurations. But since my code needs to be compatible with Saxon HE so that it can be integrated into DITA-OT, I will stick to my previous approach.

BTW: Is there any way to have the attribute defaults from the XSD expanded when building a document with Saxon HE? It works with Saxon EE, but when switching to HE they disappear. I understand that this requires schema-awareness, but then why is the method setExpandAttributeDefaults available in the Saxon HE class Configuration and not just in EnterpriseConfiguration?

 

Patrik

 







Re: Building documents with different schemas

Michael Kay


 



The answer to the last question: the method exists in Configuration because it also influences DTD-based attribute defaults (though in that case it only works to the extent that Saxon can influence what the XML parser does).

I suspect that if you run a Xerces parse with schema validation and pipe the output into Saxon, schema-based attribute defaults will be expanded. I haven't tested it, though.

Michael Kay
Saxonica
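
The validating-parse idea can be tried with the JAXP classes in the JDK (whose built-in parser is derived from Xerces). The sketch below is only an illustration: the class name, the tiny schema and the "status" attribute are made up, and it stops at the SAX handler, showing what a downstream consumer would receive. Feeding the same parser's events into Saxon (for example through a SAXSource) is not shown and not tested here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class AttributeDefaultsDemo {
    // Made-up schema: <topic> has an optional "status" attribute with default "draft".
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='topic'><xs:complexType>"
        + "<xs:attribute name='status' type='xs:string' default='draft'/>"
        + "</xs:complexType></xs:element></xs:schema>";

    /** Parses xml with schema validation; returns the "status" value the handler saw on <topic>. */
    static String statusSeenByHandler(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setSchema(schema); // validate while parsing
            String[] seen = new String[1];
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts) {
                            if ("topic".equals(localName)) {
                                seen[0] = atts.getValue("", "status");
                            }
                        }

                        @Override
                        public void error(SAXParseException e) throws SAXParseException {
                            throw e; // treat validation errors as fatal
                        }
                    });
            return seen[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The first document omits the attribute, the second sets it explicitly.
        System.out.println(statusSeenByHandler("<topic/>"));
        System.out.println(statusSeenByHandler("<topic status='final'/>"));
    }
}
```

Because the defaults are supplied by the validating parser before any consumer sees the events, this route does not depend on schema-awareness in the consumer itself.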

