Mandalika's scratchpad


Thursday, June 30, 2005
 
Solaris: Tip - Splitting and Merging Files

Sometimes we may have to split a large file into several small files before transferring them to another host. Later we can assemble these small files to re-create the original large file at the destination.

Splitting can be done on any file type, including binary files, with the help of the split utility of Solaris. Read the man page of split for all the available options. In the example, we will be using the following options to split one 11M file into several 3M files.

Syntax:
split [-b nm] [file [name]]

where:
-b nm splits the file into pieces of n megabytes each, and name is the prefix to be used for each of the files resulting from the split operation.

split will append something like "aa", "ab", etc., to make the resulting file names unique and in ascending sort order.

eg., 1. Splitting a large file
% ls -lh *.zip
-rw-r--r-- 1 giri other 11M Jun 30 12:10 src.zip

% split -b 3m src.zip source

% ls -lh source*
-rw-rw-r-- 1 giri other 3.0M Jun 30 17:18 sourceaa
-rw-rw-r-- 1 giri other 3.0M Jun 30 17:18 sourceab
-rw-rw-r-- 1 giri other 3.0M Jun 30 17:18 sourceac
-rw-rw-r-- 1 giri other 2.3M Jun 30 17:18 sourcead
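
To try the split step without a real archive at hand, here is a minimal sketch. It assumes a Bourne-compatible shell (sh, ksh or bash rather than csh), and the names sample.bin and piece are made up for illustration. It builds an 11 MB test file and splits it into 3 MB pieces, which should leave three full pieces and one 2 MB remainder.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create an 11 MB test file (11 blocks of 1048576 bytes).
dd if=/dev/zero of=sample.bin bs=1048576 count=11 2>/dev/null

# Split it into 3 MB pieces named pieceaa, pieceab, ...
split -b 3m sample.bin piece

# List the pieces and count them.
ls -l piece*
piece_count=$(ls piece* | wc -l | tr -d ' ')
echo "pieces: $piece_count"
```

The last piece is smaller than the others because split simply stops when it runs out of input.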

2. Re-creating the original file (assembling) from small files

Since the file has been split into small files sequentially, we can use cat to merge the small files. If the number of files to merge is small, it is easy to use cat and redirect the output to a file.
eg.,
  1. One file at a time
    % cat sourceaa > CopyOfsrc.zip
    % cat sourceab >> CopyOfsrc.zip
    % cat sourceac >> CopyOfsrc.zip
    % cat sourcead >> CopyOfsrc.zip
  2. All files at once
    % cat source* > CopyOfsrc.zip
  3. Some problems with above mentioned approaches:
    1. it is inconvenient to type one cat command per file if there are too many files (approach #1)
    2. it may fail with an Arguments too long error, depending on the shell, if there are too many files (approach #2)

    • Workaround:
      Use the combination of cat and xargs
        xargs reads a group of arguments from its standard input, then runs a UNIX command with that group of arguments. It keeps reading arguments and running the command until it runs out of arguments. So it is very unlikely to hit the argument-length limit and the Arguments too long error.

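      eg.,
      % ls | grep '^source' | xargs cat > CopyOfsrc.zip
      Since ls lists the small files in sorted order, we don't have to worry about merging the files in the correct order. The wildcard is avoided on purpose: with ls source*, the shell would still expand source* into one long argument list for ls, which could fail with the same error.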
% ls -lh CopyOfsrc.zip
-rw-rw-r-- 1 giri other 11M Jun 30 17:57 CopyOfsrc.zip

% file CopyOfsrc.zip
CopyOfsrc.zip: ZIP archive
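
The xargs-based merge can be tried end to end with the sketch below. It assumes a Bourne-compatible shell, and the names original.bin, chunk and rebuilt.bin are hypothetical. It lists the directory with plain ls and filters by prefix with grep, so the shell never has to expand a wildcard into a long argument list. Even if xargs ends up running cat several times, every run writes to the same redirected output, so the pieces still land in order.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 200 KB test file and split it into 200 pieces of 1 KB each.
dd if=/dev/urandom of=original.bin bs=1024 count=200 2>/dev/null
split -b 1k original.bin chunk

# Merge: ls prints one sorted name per line when piped, grep keeps
# only the pieces, and xargs feeds them to cat in batches.
ls | grep '^chunk' | xargs cat > rebuilt.bin

# cmp -s is silent and only sets the exit status.
if cmp -s original.bin rebuilt.bin; then
    merge_status=ok
else
    merge_status=mismatch
fi
echo "merge: $merge_status"
```

This relies on the piece names containing no spaces or quotes, which holds for the names split generates as long as the chosen prefix has none.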
Verify the checksum of both the files:
% cksum CopyOfsrc.zip
1904556195 11875601 CopyOfsrc.zip

% cksum src.zip
1904556195 11875601 src.zip
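
Comparing the two cksum lines by eye works, but the check can also be scripted. The following is a small sketch assuming a Bourne-compatible shell; the files src.bin and copy.bin are stand-ins created just for the demonstration. Reading each file through standard input keeps the file name out of the cksum output, so the two results can be compared as plain strings.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in files: a source and a copy that should be identical.
printf 'example payload\n' > src.bin
cp src.bin copy.bin

# cksum prints "<crc> <size>" only when it reads from standard input.
sum_src=$(cksum < src.bin)
sum_copy=$(cksum < copy.bin)

if [ "$sum_src" = "$sum_copy" ]; then
    result=match
else
    result=differ
fi
echo "checksums: $result"
```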
Note:
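split does not know anything about the format of the file; it simply cuts the file into consecutive runs of bytes. In our example, file reports sourceaa as a "ZIP archive" only because that piece holds the beginning of src.zip, including the ZIP signature; the later pieces will usually be reported as plain data. Since none of the pieces is a complete archive, trying to extract files from one of them results in an error.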
% file sourceaa
sourceaa: ZIP archive

% unzip sourceaa
Archive: sourceaa
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of sourceaa or
sourceaa.zip, and cannot find sourceaa.ZIP, period.
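
Each piece is nothing more than a contiguous byte range of the original, which is why it cannot be extracted on its own. The sketch below illustrates this, assuming a Bourne-compatible shell; the names whole.bin and part are hypothetical. It checks that the first two pieces are byte-for-byte identical to the matching ranges of the source file.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 10 KB test file and split it into 4 KB pieces.
dd if=/dev/urandom of=whole.bin bs=1024 count=10 2>/dev/null
split -b 4k whole.bin part

# partaa should equal the first 4 KB of whole.bin.
if dd if=whole.bin bs=1024 count=4 2>/dev/null | cmp -s - partaa; then
    first=same
else
    first=different
fi

# partab should equal the next 4 KB (skip the first four 1 KB blocks).
if dd if=whole.bin bs=1024 skip=4 count=4 2>/dev/null | cmp -s - partab; then
    second=same
else
    second=different
fi
echo "partaa: $first, partab: $second"
```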



Comments:
Thanks a lot its really worth having a small session of split it worked for me really well. Keep the good work.
 