in place file splitter

Igor Gueths igueths at attbi.com
Mon Nov 11 10:05:44 EST 2002


Hi Ralph. Very interesting technique. Two things. First, don't you have to
specify the chunk size which will be read? Second, if you have a very large
file (1 MB), and the chunk size is approximately 5 KB (roughly 5000 bytes),
this could take a while, since the rest of the file is moved again after
every chunk.
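
To put a rough number on "a while", here is a small Python sketch that
counts how many bytes get moved toward the front of the file over the whole
run. It assumes 1 MB means 1,000,000 bytes and a chunk of exactly 5,000
bytes; the function name bytes_moved is just something made up for this
illustration.

```python
def bytes_moved(file_size, chunk_size):
    """Total bytes copied toward the front of the file across all passes."""
    total = 0
    remaining = file_size
    # After each chunk is written out, everything still left is shifted down.
    while remaining > chunk_size:
        remaining -= chunk_size
        total += remaining
    return total


print(bytes_moved(1_000_000, 5_000))  # 99500000
```

So under those assumptions, roughly 99.5 MB gets copied to split a 1 MB
file, about a hundred times the size of the file itself.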

May you code in the power of the source,
may the kernel, libraries, and utilities be with you,
throughout all distributions until the end of the epoch.

On Mon, 11 Nov 2002, Ralph W. Reid wrote:

> Igor Gueths staggered into view and mumbled:
> >
> >Hi Chuck. I think you're probably right, as the file contents will have to
> >be stored in RAM until written to outfiles.
>
>
> I am about to take this a bit off topic for speakup, so if you are
> not interested in a programming technique, you might want to delete
> this article now--sorry if I stepped on anyone's toes with this
> discussion.
>
> Actually, a technique can be used to read chunks of the input file,
> truncating it as you go.  This technique will require more disk I/O,
> but will not require storing massive files in memory.  This technique
> is not as necessary nowadays as it used to be given the low cost and
> massive size of RAM available, but it might be of some use somewhere.
> Here is an algorithm which describes the basics of how this technique
> works:
>
> Open the input file.
> Open an output file.
> Read a chunk of the input file.
> while the end of the file has not been reached, do:
>   Write the chunk to the output file.
>   Close the output file.
>   Move all of the remaining input file data to the beginning of the
>     input file.
>   Get the current position in the input file.
>   Close the input file.
>   Truncate the input file at the current position.
>   Open the input file.
>   Open an output file.
>   Read a chunk of data from the input file.
> End of while loop.
> If a chunk of data has been read which has not been written do:
>   Write the chunk of data to the output file.
>   Close the output file.
> Else
>   Close the empty output file.
>   Delete the empty output file.
> End of if-else statement.
> Close the input file.
> Delete the remainder of the input file.
>
> See the man page for the C function `truncate'.  Once working
> properly, this technique will chop the input file down in chunks equal
> to the amount of data written to the output files.  Because the input
> file overwrites itself over and over again in ever shrinking amounts,
> lots of disk I/O will be necessary, especially for large files which
> are to be split into many smaller ones.  All of this disk I/O will of
> course require much more time than loading the entire input file into
> memory and writing the output files from there, but any size input
> files can be handled this way even if memory size is limited.  You
> may or may not find this technique useful.  I am not too sure what
> this technique has to do with speakup though;);).
>
> Have a _great_ day!
>
> --
> Ralph.  N6BNO.  Wisdom comes from central processing, not from I/O.
> rreid at sunset.net  http://personalweb.sunset.net/~rreid
> Opinions herein are either mine or they are flame bait.
> SEC (x) / COSEC (x) = (TAN (x) / COTAN (x)) ^ 2
>
> _______________________________________________
> Speakup mailing list
> Speakup at braille.uwo.ca
> http://speech.braille.uwo.ca/mailman/listinfo/speakup
>
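
For anyone who wants to experiment, the algorithm quoted above might be
sketched in Python roughly as follows. This is only one possible reading of
it, not Ralph's code. The chunk size is passed in explicitly, the name
split_in_place, the move_buf parameter, and the numbered ".0000" style
output names are all assumptions made for the example, and it truncates
through the open file object rather than closing the file and calling
truncate on the path. It also skips creating an output file when nothing is
left to read, so there is no empty file to delete at the end.

```python
import os


def split_in_place(path, chunk_size, move_buf=64 * 1024):
    """Split `path` into numbered part files, shrinking `path` as it goes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    parts = []
    while os.path.getsize(path) > 0:
        part = f"{path}.{len(parts):04d}"  # hypothetical naming scheme
        with open(path, "r+b") as src:
            # Read one chunk and write it to its own output file.
            chunk = src.read(chunk_size)
            with open(part, "wb") as out:
                out.write(chunk)
            # Move the remaining data to the beginning of the input file,
            # one small buffer at a time so memory use stays bounded.
            read_pos, write_pos = len(chunk), 0
            while True:
                src.seek(read_pos)
                buf = src.read(move_buf)
                if not buf:
                    break
                src.seek(write_pos)
                src.write(buf)
                read_pos += len(buf)
                write_pos += len(buf)
            # Cut the file off just past the data that was moved.
            src.truncate(write_pos)
        parts.append(part)
    # The input file is empty now, so delete what is left of it.
    os.remove(path)
    return parts
```

Calling split_in_place("big.dat", 5000) on a hypothetical file big.dat would
leave big.dat.0000, big.dat.0001, and so on, and remove big.dat itself. Try
it on a copy first, since the input file is destroyed as it runs.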
