diff options
Diffstat (limited to 'usr.bin/split/split.1')
| -rw-r--r-- | usr.bin/split/split.1 | 232 |
1 files changed, 232 insertions, 0 deletions
diff --git a/usr.bin/split/split.1 b/usr.bin/split/split.1 new file mode 100644 index 000000000000..bd837f3e9c71 --- /dev/null +++ b/usr.bin/split/split.1 @@ -0,0 +1,232 @@ +.\" Copyright (c) 1990, 1991, 1993, 1994 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 3. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.Dd May 26, 2023 +.Dt SPLIT 1 +.Os +.Sh NAME +.Nm split +.Nd split a file into pieces +.Sh SYNOPSIS +.Nm +.Op Fl cd +.Op Fl l Ar line_count +.Op Fl a Ar suffix_length +.Op Ar file Op Ar prefix +.Nm +.Op Fl cd +.Fl b Ar byte_count Ns +.Oo +.Sm off +.Cm K | k | M | m | G | g +.Sm on +.Oc +.Op Fl a Ar suffix_length +.Op Ar file Op Ar prefix +.Nm +.Op Fl cd +.Fl n Ar chunk_count +.Op Fl a Ar suffix_length +.Op Ar file Op Ar prefix +.Nm +.Op Fl cd +.Fl p Ar pattern +.Op Fl a Ar suffix_length +.Op Ar file Op Ar prefix +.Sh DESCRIPTION +The +.Nm +utility reads the given +.Ar file +and breaks it up into files of 1000 lines each +(if no options are specified), leaving the +.Ar file +unchanged. +If +.Ar file +is a single dash +.Pq Sq Fl +or absent, +.Nm +reads from the standard input. +.Pp +The options are as follows: +.Bl -tag -width indent +.It Fl a Ar suffix_length +Use +.Ar suffix_length +letters to form the suffix of the file name. +.It Fl b Ar byte_count Ns Oo +.Sm off +.Cm K | k | M | m | G | g +.Sm on +.Oc +Create split files +.Ar byte_count +bytes in length. +If +.Cm k +or +.Cm K +is appended to the number, the file is split into +.Ar byte_count +kilobyte pieces. +If +.Cm m +or +.Cm M +is appended to the number, the file is split into +.Ar byte_count +megabyte pieces. +If +.Cm g +or +.Cm G +is appended to the number, the file is split into +.Ar byte_count +gigabyte pieces. +.It Fl c +Continue creating files and do not overwrite existing +output files. +.It Fl d +Use a numeric suffix instead of a alphabetic suffix. +.It Fl l Ar line_count +Create split files +.Ar line_count +lines in length. +.It Fl n Ar chunk_count +Split file into +.Ar chunk_count +smaller files. +The first n - 1 files will be of size (size of +.Ar file +/ +.Ar chunk_count +) +and the last file will contain the remaining bytes. +.It Fl p Ar pattern +The file is split whenever an input line matches +.Ar pattern , +which is interpreted as an extended regular expression. +The matching line will be the first line of the next output file. +This option is incompatible with the +.Fl b +and +.Fl l +options. +.El +.Pp +If additional arguments are specified, the first is used as the name +of the input file which is to be split. +If a second additional argument is specified, it is used as a prefix +for the names of the files into which the file is split. +In this case, each file into which the file is split is named by the +prefix followed by a lexically ordered suffix using +.Ar suffix_length +characters in the range +.Dq Li a Ns - Ns Li z . +If +.Fl a +is not specified, two letters are used as the initial suffix. +If the output does not fit into the resulting number of files and the +.Fl d +flag is not specified, then the suffix length is automatically extended as +needed such that all output files continue to sort in lexical order. +.Pp +If the +.Ar prefix +argument is not specified, the file is split into lexically ordered +files named with the prefix +.Dq Li x +and with suffixes as above. +.Pp +By default, +.Nm +will overwrite any existing output files. +If the +.Fl c +flag is specified, +.Nm +will instead create files with names that do not already exist. +.Sh ENVIRONMENT +The +.Ev LANG , LC_ALL , LC_CTYPE +and +.Ev LC_COLLATE +environment variables affect the execution of +.Nm +as described in +.Xr environ 7 . +.Sh EXIT STATUS +.Ex -std +.Sh EXAMPLES +Split input into as many files as needed, so that each file contains at most 2 +lines: +.Bd -literal -offset indent +$ echo -e "first line\\nsecond line\\nthird line\\nforth line" | split -l2 +.Ed +.Pp +Split input in chunks of 10 bytes using numeric prefixes for file names. +This generates two files of 10 bytes (x00 and x01) and a third file (x02) with the +remaining 2 bytes: +.Bd -literal -offset indent +$ echo -e "This is 22 bytes long" | split -d -b10 +.Ed +.Pp +Split input generating 6 files: +.Bd -literal -offset indent +$ echo -e "This is 22 bytes long" | split -n 6 +.Ed +.Pp +Split input creating a new file every time a line matches the regular expression +for a +.Dq t +followed by either +.Dq a +or +.Dq u +thus creating two files: +.Bd -literal -offset indent +$ echo -e "stack\\nstock\\nstuck\\nanother line" | split -p 't[au]' +.Ed +.Sh SEE ALSO +.Xr csplit 1 , +.Xr re_format 7 +.Sh STANDARDS +The +.Nm +utility conforms to +.St -p1003.1-2001 . +.Sh HISTORY +A +.Nm +command appeared in +.At v3 . +.Pp +Before +.Fx 14 , +pattern and line matching only operated on lines shorter than 65,536 bytes. |
