aboutsummaryrefslogtreecommitdiff
path: root/usr.bin/split/split.1
diff options
context:
space:
mode:
Diffstat (limited to 'usr.bin/split/split.1')
-rw-r--r--usr.bin/split/split.1232
1 files changed, 232 insertions, 0 deletions
diff --git a/usr.bin/split/split.1 b/usr.bin/split/split.1
new file mode 100644
index 000000000000..bd837f3e9c71
--- /dev/null
+++ b/usr.bin/split/split.1
@@ -0,0 +1,232 @@
+.\" Copyright (c) 1990, 1991, 1993, 1994
+.\" The Regents of the University of California. All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\" 3. Neither the name of the University nor the names of its contributors
+.\" may be used to endorse or promote products derived from this software
+.\" without specific prior written permission.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.Dd May 26, 2023
+.Dt SPLIT 1
+.Os
+.Sh NAME
+.Nm split
+.Nd split a file into pieces
+.Sh SYNOPSIS
+.Nm
+.Op Fl cd
+.Op Fl l Ar line_count
+.Op Fl a Ar suffix_length
+.Op Ar file Op Ar prefix
+.Nm
+.Op Fl cd
+.Fl b Ar byte_count Ns
+.Oo
+.Sm off
+.Cm K | k | M | m | G | g
+.Sm on
+.Oc
+.Op Fl a Ar suffix_length
+.Op Ar file Op Ar prefix
+.Nm
+.Op Fl cd
+.Fl n Ar chunk_count
+.Op Fl a Ar suffix_length
+.Op Ar file Op Ar prefix
+.Nm
+.Op Fl cd
+.Fl p Ar pattern
+.Op Fl a Ar suffix_length
+.Op Ar file Op Ar prefix
+.Sh DESCRIPTION
+The
+.Nm
+utility reads the given
+.Ar file
+and breaks it up into files of 1000 lines each
+(if no options are specified), leaving the
+.Ar file
+unchanged.
+If
+.Ar file
+is a single dash
+.Pq Sq Fl
+or absent,
+.Nm
+reads from the standard input.
+.Pp
+The options are as follows:
+.Bl -tag -width indent
+.It Fl a Ar suffix_length
+Use
+.Ar suffix_length
+letters to form the suffix of the file name.
+.It Fl b Ar byte_count Ns Oo
+.Sm off
+.Cm K | k | M | m | G | g
+.Sm on
+.Oc
+Create split files
+.Ar byte_count
+bytes in length.
+If
+.Cm k
+or
+.Cm K
+is appended to the number, the file is split into
+.Ar byte_count
+kilobyte pieces.
+If
+.Cm m
+or
+.Cm M
+is appended to the number, the file is split into
+.Ar byte_count
+megabyte pieces.
+If
+.Cm g
+or
+.Cm G
+is appended to the number, the file is split into
+.Ar byte_count
+gigabyte pieces.
+.It Fl c
+Continue creating files and do not overwrite existing
+output files.
+.It Fl d
+Use a numeric suffix instead of a alphabetic suffix.
+.It Fl l Ar line_count
+Create split files
+.Ar line_count
+lines in length.
+.It Fl n Ar chunk_count
+Split file into
+.Ar chunk_count
+smaller files.
+The first n - 1 files will be of size (size of
+.Ar file
+/
+.Ar chunk_count
+)
+and the last file will contain the remaining bytes.
+.It Fl p Ar pattern
+The file is split whenever an input line matches
+.Ar pattern ,
+which is interpreted as an extended regular expression.
+The matching line will be the first line of the next output file.
+This option is incompatible with the
+.Fl b
+and
+.Fl l
+options.
+.El
+.Pp
+If additional arguments are specified, the first is used as the name
+of the input file which is to be split.
+If a second additional argument is specified, it is used as a prefix
+for the names of the files into which the file is split.
+In this case, each file into which the file is split is named by the
+prefix followed by a lexically ordered suffix using
+.Ar suffix_length
+characters in the range
+.Dq Li a Ns - Ns Li z .
+If
+.Fl a
+is not specified, two letters are used as the initial suffix.
+If the output does not fit into the resulting number of files and the
+.Fl d
+flag is not specified, then the suffix length is automatically extended as
+needed such that all output files continue to sort in lexical order.
+.Pp
+If the
+.Ar prefix
+argument is not specified, the file is split into lexically ordered
+files named with the prefix
+.Dq Li x
+and with suffixes as above.
+.Pp
+By default,
+.Nm
+will overwrite any existing output files.
+If the
+.Fl c
+flag is specified,
+.Nm
+will instead create files with names that do not already exist.
+.Sh ENVIRONMENT
+The
+.Ev LANG , LC_ALL , LC_CTYPE
+and
+.Ev LC_COLLATE
+environment variables affect the execution of
+.Nm
+as described in
+.Xr environ 7 .
+.Sh EXIT STATUS
+.Ex -std
+.Sh EXAMPLES
+Split input into as many files as needed, so that each file contains at most 2
+lines:
+.Bd -literal -offset indent
+$ echo -e "first line\\nsecond line\\nthird line\\nforth line" | split -l2
+.Ed
+.Pp
+Split input in chunks of 10 bytes using numeric prefixes for file names.
+This generates two files of 10 bytes (x00 and x01) and a third file (x02) with the
+remaining 2 bytes:
+.Bd -literal -offset indent
+$ echo -e "This is 22 bytes long" | split -d -b10
+.Ed
+.Pp
+Split input generating 6 files:
+.Bd -literal -offset indent
+$ echo -e "This is 22 bytes long" | split -n 6
+.Ed
+.Pp
+Split input creating a new file every time a line matches the regular expression
+for a
+.Dq t
+followed by either
+.Dq a
+or
+.Dq u
+thus creating two files:
+.Bd -literal -offset indent
+$ echo -e "stack\\nstock\\nstuck\\nanother line" | split -p 't[au]'
+.Ed
+.Sh SEE ALSO
+.Xr csplit 1 ,
+.Xr re_format 7
+.Sh STANDARDS
+The
+.Nm
+utility conforms to
+.St -p1003.1-2001 .
+.Sh HISTORY
+A
+.Nm
+command appeared in
+.At v3 .
+.Pp
+Before
+.Fx 14 ,
+pattern and line matching only operated on lines shorter than 65,536 bytes.