collections?


Dave Pawson-2
Seeing the earlier use of collections: my requirement is to process
n files (call it '*.xml') in the current directory, apply a single XSLT
stylesheet (x.xsl) to each one, and replace the input file with the
transformed file (I know from experimentation that this works).

Can I use a collection for that with Saxon 8 please?
Is that what collections are for?

regards
--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk


_______________________________________________
saxon-help mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/saxon-help

Re: collections?

David Carlisle

> Can I use a collection for that with Saxon 8 please?
> Is that what collections are for?

Yes, except it's much more flexible: it doesn't have to be all the
files, and the result files don't have to have the same names as the
inputs.

You often need to use saxon:discard-document though, otherwise all the
files are held in memory (until you run out of same).
See the examples posted earlier today.

Assuming by "replace" you mean make a new file of the same name in a
different place: I wouldn't try to overwrite the input file with the output.

The ?recurse=yes options don't appear to be documented at
http://www.saxonica.com/documentation/functions/intro/fn_collection.html
(I thought I saw them somewhere in an earlier release's documentation?)

The doc() page documents some of them, but not, of course, the options
pertaining to multiple files.

Before posting I wanted to check I had the syntax right, and only found
some descriptions on mailing lists via google, eg
http://www.xslt.com/html/xsl-list/2005-08/msg00045.html

David

Re: collections?

Andrew Welch
> > Can I use a collection for that with Saxon 8 please?
> > Is that what collections are for?
>
> yes except it's much more flexible, as it doesn't have to be all the
> files, and the result file names don't have to have the same name.
>
> You often need to use saxon:discard-document though otherwise all the
> files are held in memory (until you run out of same).
> see examples posted earlier today.

Using collection() and saxon:discard-document() together is really
powerful - I've been using it to generate reports on thousands of xml
files held on disk.

For example: "I've got a directory of 10,000 xml files, tell me how
many don't contain a <title> element" - discovering which don't have
title is easy, but doing it across the whole directory quickly and
using little memory wouldn't have been feasible with xslt before now.
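
A minimal sketch of that kind of report, not the stylesheet actually used: it assumes the files sit in a hypothetical directory /some/path/to/input, that <title> is in no namespace, and that the transformation is started at the named template "main" (the name is an arbitrary choice). Only the base URIs of the offending files are kept, so each discarded document can be released once it has been tested.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/">

  <xsl:output method="text"/>

  <!-- Hypothetical input directory; adjust to your own layout -->
  <xsl:param name="dir" select="'/some/path/to/input'"/>

  <xsl:template name="main">
    <!-- Keep only the URIs of documents that contain no title element;
         saxon:discard-document() stops Saxon caching each document -->
    <xsl:variable name="missing" as="xs:string*" select="
        for $d in collection(concat($dir, '?select=*.xml'))
        return saxon:discard-document($d)[not(//title)]/string(base-uri(.))"/>
    <xsl:value-of select="count($missing)"/>
    <xsl:text> file(s) without a title element&#10;</xsl:text>
    <xsl:value-of select="$missing" separator="&#10;"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

How much memory this saves in practice depends on how the processor evaluates the collection, so it is worth measuring on a real directory.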

I'm sure this is just the tip of the iceberg for this kind of processing.


Re: collections?

Dave Pawson-2
Thanks Andrew, David...

What you omit is the syntax!

Is it in a for-each loop?
Then use xsl:result-document for the output?

David, what is the caution on using the same name for the output file?
(I'm not even sure it's available from collection(), is it?)
I'm basically trying to avoid writing more Linux scripts to rename!
I agree the stylesheet needs testing prior to use, but having done that,
it seems to be what I want?

regards DaveP


Re: collections?

David Carlisle

  Is it in a for-each loop?
  then use xsl:result-document for the output?

yes

  David, what is the caution on using the same name for the output file?
  (I'm not even sure it's available from collection() is it?)

Yes, see the code I posted: you use base-uri(). I removed the
result-document as it wasn't relevant to my question about
discard-document, but something like

<xsl:for-each select="
   for $f in collection('/some/path/to/input?select=*.xml;recurse=yes')
     return saxon:discard-document($f)">
  <xsl:result-document href="{replace(base-uri(),'/some/path/to/input','/output/path')}">
....


My warning was not to do

<xsl:result-document href="{base-uri()}">

i.e. write the result directly over the input. It clashes with XSLT's
read-only tree view of the world, and even on systems where it is
allowed it's inherently unsafe: if the process crashes for any reason,
you have trashed your input and haven't got your required output.
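
Put together as a complete stylesheet, the pattern might look like the sketch below. It is an illustration under assumptions, not tested against any particular Saxon 8 release: the $in and $out parameters, the template name "main", and the identity template standing in for the real rules of x.xsl are all hypothetical. The output keeps each file's name but goes to a different directory, which avoids writing over the input.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Hypothetical locations; adjust to your own layout -->
  <xsl:param name="in"  select="'/some/path/to/input'"/>
  <xsl:param name="out" select="'/output/path'"/>

  <!-- Entry point: start the transformation at this named template -->
  <xsl:template name="main">
    <xsl:for-each select="
        for $f in collection(concat($in, '?select=*.xml;recurse=yes'))
        return saxon:discard-document($f)">
      <!-- Same file name, different directory. replace() treats $in as a
           regular expression, so a path containing regex metacharacters
           would need escaping first. -->
      <xsl:result-document href="{replace(base-uri(), $in, $out)}">
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <!-- Placeholder transformation: an identity copy.
       The real rules from x.xsl would replace this. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Once the output has been checked, moving the output directory over the input one is a single step outside XSLT, and it leaves the original files intact if the run fails part-way through.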


I replied as you addressed me directly but I'm wary of turning
saxon-help into xsl-list, I suspect that you should move the thread
there. Michael, not sure if you want saxon-help to be a discussion list
or whether you want to keep it more just for people to make comments/bug
reports and (just) you respond as necessary?


David

Re: collections?

Dave Pawson-2
On 25/11/05, David Carlisle <[hidden email]> wrote:
<snip/> Thanks David.

> I replied as you addressed me directly but I'm wary of turning
> saxon-help into xsl-list, I suspect that you should move the thread
> there. Michael, not sure if you want saxon-help to be a discussion list
> or whether you want to keep it more just for people to make comments/bug
> reports and (just) you respond as necessary?

Sorry Mike, it's just that XPath says collection() is implementation
dependent, hence it seemed more appropriate here.

regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk

