Sometimes we may have to split a large file into several small files before transferring them to another host. Later we can assemble these small files to re-create the original large file at the destination.
Splitting can be done on any file type, including binary files, with the help of the split utility of Solaris. Read the man page of split for all the available options. In the example, we will be using the following options to split one 11M file into several 3M files.
Syntax:
split [-b nm] [file [name]]
where:
-b nm splits file into pieces of n megabytes each, and name is the prefix used for each of the files resulting from the split operation (the default prefix is x). split appends a suffix such as "aa", "ab", etc., to the prefix so that the resulting file names are unique and sort in ascending order.
e.g.,
1. Splitting a large file
% ls -lh *.zip
-rw-r--r-- 1 giri other 11M Jun 30 12:10 src.zip
% split -b 3m src.zip source
% ls -lh source*
-rw-rw-r-- 1 giri other 3.0M Jun 30 17:18 sourceaa
-rw-rw-r-- 1 giri other 3.0M Jun 30 17:18 sourceab
-rw-rw-r-- 1 giri other 3.0M Jun 30 17:18 sourceac
-rw-rw-r-- 1 giri other 2.3M Jun 30 17:18 sourcead
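The split step can be tried safely on a throwaway file first. The following is a minimal sketch, assuming a POSIX-style shell and that mktemp and /dev/zero are available; the names demo.bin and piece are made up for illustration and are not part of the example above.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build an 11 MB file of zero bytes (11264 blocks of 1024 bytes).
dd if=/dev/zero of=demo.bin bs=1024 count=11264 2>/dev/null

# Split it into 3 MB pieces named pieceaa, pieceab, ...
split -b 3m demo.bin piece

# Three full 3 MB pieces plus one 2 MB remainder are expected.
ls -l piece*
```

Only the last piece is shorter than the requested size, which matches the listing above where sourcead is smaller than the other three files.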
2. Re-creating the original file (assembling) from small files
Since the file has been split into small files sequentially, we can use cat to merge the small files. If the number of files to merge is small, it is easy to use cat and redirect the output to a file.
e.g.,
- One file at a time
% cat sourceaa > CopyOfsrc.zip
% cat sourceab >> CopyOfsrc.zip
% cat sourceac >> CopyOfsrc.zip
% cat sourcead >> CopyOfsrc.zip
- All files at once
% cat source* > CopyOfsrc.zip
- Some problems with the above-mentioned approaches:
- it is inconvenient to append each file by hand if there are many files (approach #1)
- it may fail with an "Arguments too long" error, depending on the shell, if there are too many files (approach #2)
- Workaround:
Use the combination of cat and xargs. xargs reads a group of arguments from its standard input, then runs a UNIX command with that group of arguments. It keeps reading arguments and running the command until it runs out of arguments. So it is very unlikely to hit the limitations of a shell and the "Arguments too long" error.
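e.g.,
% ls | grep '^source' | xargs cat > CopyOfsrc.zip
The file names are produced by ls and filtered by grep, rather than expanded from source* by the shell, because a pattern that matches too many files would trigger the same error when the shell tries to run ls. Since ls lists all the small files in sorted order, we don't have to worry about merging the files in the correct order.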
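The way xargs works through its input in groups can be seen on a small scale. This is only an illustrative sketch: the -n 2 option forces groups of two arguments so the batching becomes visible (without it, xargs packs as many arguments as the system allows), and the letters a to e simply stand in for file names.

```shell
# Upper bound, in bytes, on the arguments passed to a single command.
# The value differs from system to system.
getconf ARG_MAX

# Feed five names to xargs and let it call echo with at most two at a time.
batches=$(printf '%s\n' a b c d e | xargs -n 2 echo)

# Each output line corresponds to one separate invocation of echo.
printf '%s\n' "$batches"
```

When the command is cat and the output of the whole pipeline is redirected to one file, the separate invocations write to that file one after another, so the pieces still end up in input order.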
% ls -lh CopyOfsrc.zip
-rw-rw-r-- 1 giri other 11M Jun 30 17:57 CopyOfsrc.zip
% file CopyOfsrc.zip
CopyOfsrc.zip: ZIP archive
Verify that the checksums of both files match:
% cksum CopyOfsrc.zip
1904556195 11875601 CopyOfsrc.zip
% cksum src.zip
1904556195 11875601 src.zip
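The whole round trip (split, reassemble, verify) can also be scripted. The sketch below assumes a POSIX-style shell with mktemp and /dev/urandom available; demo.bin, piece and rebuilt.bin are hypothetical names. It lists the pieces with ls and grep instead of a shell pattern so that no long argument list is ever built, and it uses cmp in addition to cksum for a byte-for-byte comparison.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 2.5 MB of random data, so that pieces merged in the wrong order would be detected.
dd if=/dev/urandom of=demo.bin bs=1024 count=2560 2>/dev/null

# Split into 1 MB pieces: pieceaa, pieceab, pieceac.
split -b 1m demo.bin piece

# Reassemble: ls prints the names sorted, xargs hands them to cat in that order.
ls | grep '^piece' | xargs cat > rebuilt.bin

# Compare checksums, then compare the files byte for byte.
cksum demo.bin rebuilt.bin
if cmp -s demo.bin rebuilt.bin; then
    echo "rebuilt.bin is identical to demo.bin"
else
    echo "rebuilt.bin differs from demo.bin" >&2
fi
```

The output file name deliberately does not start with the prefix used for the pieces; otherwise it would be picked up by the grep filter and fed back into itself.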
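Note:
split does not interpret the file format; each piece is simply a consecutive range of bytes from the original file. In our example the first piece, sourceaa, begins with the ZIP header, so file reports it as "ZIP archive"; the later pieces do not begin with that header and will generally not be identified as ZIP archives. None of the pieces is a complete archive, so trying to extract files from one results in an error.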
% file sourceaa
sourceaa: ZIP archive
% unzip sourceaa
Archive: sourceaa
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of sourceaa or
sourceaa.zip, and cannot find sourceaa.ZIP, period.
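That each piece is just a slice of the original bytes can be checked with a tiny stand-in file. This is a sketch under stated assumptions: fake.zip is not a real archive, only ten bytes that start with the letters PK (the first two bytes of a ZIP header), and the names fake.zip and part are made up for illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Ten bytes: a two-byte "header" (PK) followed by eight payload bytes.
printf 'PK12345678' > fake.zip

# Split into 4-byte pieces: partaa, partab, partac.
split -b 4 fake.zip part

# Print the content of each piece; only partaa starts with the header bytes.
for f in part*; do
    printf '%s: ' "$f"
    cat "$f"
    echo
done
```

Tools that recognize a format by its leading bytes see the header only in the first piece, and no single piece contains everything needed to extract the archive.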