Diffstat (limited to 'doc')
-rw-r--r-- | doc/Makefile.in | 1 |
-rw-r--r-- | doc/file.man | 6 |
-rw-r--r-- | doc/libmagic.man | 29 |
-rw-r--r-- | doc/magic.man | 112 |
4 files changed, 103 insertions, 45 deletions
diff --git a/doc/Makefile.in b/doc/Makefile.in
index 08b71aaa2ff6..19cf44bcafc5 100644
--- a/doc/Makefile.in
+++ b/doc/Makefile.in
@@ -186,6 +186,7 @@ EGREP = @EGREP@
 ETAGS = @ETAGS@
 EXEEXT = @EXEEXT@
 FGREP = @FGREP@
+FILECMD = @FILECMD@
 GREP = @GREP@
 HAVE_VISIBILITY = @HAVE_VISIBILITY@
 INSTALL = @INSTALL@
diff --git a/doc/file.man b/doc/file.man
index bf78c0c707f0..366e4c3ce847 100644
--- a/doc/file.man
+++ b/doc/file.man
@@ -1,5 +1,5 @@
-.\" $File: file.man,v 1.150 2023/05/21 17:08:34 christos Exp $
-.Dd May 21, 2023
+.\" $File: file.man,v 1.151 2024/04/07 21:27:35 christos Exp $
+.Dd April 7, 2024
 .Dt FILE __CSECTION__
 .Os
 .Sh NAME
@@ -348,7 +348,7 @@ Set various parameter limits.
 .It Li elf_shsize Ta 128MB Ta max ELF section size processed
 .It Li encoding Ta 65K Ta max number of bytes to determine encoding
 .It Li indir Ta 50 Ta recursion limit for indirect magic
-.It Li name Ta 50 Ta use count limit for name/use magic
+.It Li name Ta 100 Ta use count limit for name/use magic
 .It Li regex Ta 8K Ta length limit for regex searches
 .El
 .It Fl r , Fl Fl raw
diff --git a/doc/libmagic.man b/doc/libmagic.man
index e89c6ee0bfac..d7571ad1aa4f 100644
--- a/doc/libmagic.man
+++ b/doc/libmagic.man
@@ -1,4 +1,4 @@
-.\" $File: libmagic.man,v 1.49 2023/07/20 14:32:07 christos Exp $
+.\" $File: libmagic.man,v 1.50 2023/12/29 18:04:47 christos Exp $
 .\"
 .\" Copyright (c) Christos Zoulas 2003, 2018, 2022
 .\" All Rights Reserved.
@@ -25,7 +25,7 @@
 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 .\" SUCH DAMAGE.
 .\"
-.Dd June 16, 2023
+.Dd December 29, 2023
 .Dt LIBMAGIC 3
 .Os
 .Sh NAME
@@ -311,7 +311,10 @@ library.
 .It Li MAGIC_PARAM_ELF_PHNUM_MAX Ta size_t Ta 128
 .It Li MAGIC_PARAM_ELF_SHNUM_MAX Ta size_t Ta 32768
 .It Li MAGIC_PARAM_REGEX_MAX Ta size_t Ta 8192
-.It Li MAGIC_PARAM_BYTES_MAX Ta size_t Ta 1048576
+.It Li MAGIC_PARAM_BYTES_MAX Ta size_t Ta 7340032
+.It Li MAGIC_PARAM_ENCODING_MAX Ta size_t Ta 1048576
+.It Li MAGIC_PARAM_ELF_SHSIZE_MAX Ta size_t Ta 134217728
+.It Li MAGIC_PARAM_MAGWARN_MAX Ta size_t Ta 64
 .El
 .Pp
 The
@@ -341,6 +344,26 @@ The
 parameter controls how many ELF sections will be processed.
 .Pp
 The
+.Dv MAGIC_PARAM_REGEX_MAX
+parameter controls the maximum length for regex searches.
+.Pp
+The
+.Dv MAGIC_PARAM_BYTES_MAX
+parameter controls the maximum number of bytes to look inside a file.
+.Pp
+The
+.Dv MAGIC_PARAM_ENCODING_MAX
+parameter controls the maximum number of bytes to scan for encoding detection.
+.Pp
+The
+.Dv MAGIC_PARAM_ELF_SHSIZE_MAX
+parameter controls the maximum number of bytes in an elf section.
+.Pp
+The
+.Dv MAGIC_PARAM_MAGWARN_MAX
+parameter controls the maximum number of warnings to tolerate in a magic file.
+.Pp
+The
 .Fn magic_version
 command returns the version number of this library which is compiled into
 the shared library using the constant
diff --git a/doc/magic.man b/doc/magic.man
index af4bfa89c6bd..6916b7b211d7 100644
--- a/doc/magic.man
+++ b/doc/magic.man
@@ -1,5 +1,5 @@
-.\" $File: magic.man,v 1.103 2023/07/20 14:32:07 christos Exp $
-.Dd Arpil 18, 2023
+.\" $File: magic.man,v 1.110 2024/11/27 15:37:00 christos Exp $
+.Dd November 27, 2024
 .Dt MAGIC __FSECTION__
 .Os
 .\" install as magic.4 on USG, magic.5 on V7, Berkeley and Linux systems.
@@ -50,6 +50,10 @@ is a regular file.
 A continuation offset relative to the end of the last up-level field
 .Dv ( \*[Am] ) .
 .El
+If the offset starts with the symbol
+.Dq + ,
+then all offsets are interpreted as from the beginning of the file (the
+default).
 .It Dv type
 The type of the data to be tested.
 The possible values are:
@@ -146,6 +150,10 @@ An eight-byte value interpreted as a UNIX-style date, but interpreted as
 local time rather than UTC.
 .It Dv qwdate
 An eight-byte value interpreted as a Windows-style date.
+.It Dv msdosdate
+A two-byte value interpreted as FAT/DOS-style date.
+.It Dv msdostime
+A two-byte value interpreted as FAT/DOS-style time.
 .It Dv beid3
 A 32-bit ID3 length in big-endian byte order.
 .It Dv beshort
@@ -175,6 +183,12 @@ than UTC.
 .It Dv beqwdate
 An eight-byte value in big-endian byte order,
 interpreted as a Windows-style date.
+.It Dv bemsdosdate
+A two-byte value in big-endian byte order,
+interpreted as FAT/DOS-style date.
+.It Dv bemsdostime
+A two-byte value in big-endian byte order,
+interpreted as FAT/DOS-style time.
 .It Dv bestring16
 A two-byte unicode (UCS16) string in big-endian byte order.
 .It Dv leid3
@@ -206,6 +220,12 @@ than UTC.
 .It Dv leqwdate
 An eight-byte value in little-endian byte order,
 interpreted as a Windows-style date.
+.It Dv lemsdosdate
+A two-byte value in big-endian byte order,
+interpreted as FAT/DOS-style date.
+.It Dv lemsdostime
+A two-byte value in big-endian byte order,
+interpreted as FAT/DOS-style time.
 .It Dv lestring16
 A two-byte unicode (UCS16) string in little-endian byte order.
 .It Dv melong
@@ -360,7 +380,6 @@ For example the magic entries:
 .It Dv octal
 A string representing an octal number.
 .El
-.El
 .Pp
 For compatibility with the Single
 .Ux
@@ -610,9 +629,9 @@ with level
 For more complex files, one can use empty messages to get just the
 "if/then" effect, in the following way:
 .Bd -literal -offset indent
-0 string MZ
-\*[Gt]0x18 leshort \*[Lt]0x40 MS-DOS executable
-\*[Gt]0x18 leshort \*[Gt]0x3f extended PC executable (e.g., MS Windows)
+0 string MZ
+\*[Gt]0x18 uleshort \*[Lt]0x40 MS-DOS executable
+\*[Gt]0x18 uleshort \*[Gt]0x3f extended PC executable (e.g., MS Windows)
 .Ed
 .Pp
 Offsets do not need to be constant, but can also be read from the file
@@ -627,17 +646,17 @@ the file.
 The value at that offset is read, and is used again as an offset
 in the file.
 Indirect offsets are of the form:
-.Em (( x [[.,][bBcCeEfFgGhHiIlmsSqQ]][+\-][ y ]) .
+.Em ( x [[.,][bBcCeEfFgGhHiIlmosSqQ]][+\-][ y ]) .
 The value of
 .Em x
 is used as an offset in the file.
 A byte, id3 length, short or long is read at that offset depending on the
-.Em [bBcCeEfFgGhHiIlmsSqQ]
+.Em [bBcCeEfFgGhHiIlLmsSqQ]
 type specifier.
 The value is treated as signed if
-.Dq ,
+.Dq \&,
 is specified or unsigned if
-.Dq .
+.Dq \&.
 is specified.
 The capitalized types interpret the number as a big endian
 value, whereas the small letter versions interpret the number as a little
@@ -652,13 +671,15 @@ The default type if one is not specified is long.
 The following types are recognized:
 .Bl -column -offset indent "Type" "Half/Short" "Little" "Size"
 .It Sy Type Sy Mnemonic Sy Endian Sy Size
-.It bcBc Byte/Char N/A 1
+.It bcBC Byte/Char N/A 1
 .It efg Double Little 8
 .It EFG Double Big 8
 .It hs Half/Short Little 2
 .It HS Half/Short Big 2
 .It i ID3 Little 4
 .It I ID3 Big 4
+.It l Long Little 4
+.It L Long Big 4
 .It m Middle Middle 4
 .It o Octal Textual Variable
 .It q Quad Little 8
@@ -668,12 +689,12 @@ The following types are recognized:
 That way variable length structures can be examined:
 .Bd -literal -offset indent
 # MS Windows executables are also valid MS-DOS executables
-0 string MZ
-\*[Gt]0x18 leshort \*[Lt]0x40 MZ executable (MS-DOS)
+0 string MZ
+\*[Gt]0x18 uleshort \*[Lt]0x40 MZ executable (MS-DOS)
 # skip the whole block below if it is not an extended executable
-\*[Gt]0x18 leshort \*[Gt]0x3f
-\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
-\*[Gt]\*[Gt](0x3c.l) string LX\e0\e0 LX executable (OS/2)
+\*[Gt]0x18 uleshort \*[Gt]0x3f
+\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
+\*[Gt]\*[Gt](0x3c.l) string LX\e0\e0 LX executable (OS/2)
 .Ed
 .Pp
 This strategy of examining has a drawback: you must make sure that you
@@ -687,12 +708,12 @@ inside parentheses allows
 one to modify the value read from the file before it is used as an offset:
 .Bd -literal -offset indent
 # MS Windows executables are also valid MS-DOS executables
-0 string MZ
+0 string MZ
 # sometimes, the value at 0x18 is less that 0x40 but there's still an
 # extended executable, simply appended to the file
-\*[Gt]0x18 leshort \*[Lt]0x40
-\*[Gt]\*[Gt](4.s*512) leshort 0x014c COFF executable (MS-DOS, DJGPP)
-\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
+\*[Gt]0x18 uleshort \*[Lt]0x40
+\*[Gt]\*[Gt](4.s*512) leshort 0x014c COFF executable (MS-DOS, DJGPP)
+\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
 .Ed
 .Pp
 Sometimes you do not know the exact offset as this depends on the length or
@@ -702,44 +723,45 @@ field using
 .Sq \*[Am]
 as a prefix to the offset:
 .Bd -literal -offset indent
-0 string MZ
-\*[Gt]0x18 leshort \*[Gt]0x3f
-\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
+0 string MZ
+\*[Gt]0x18 uleshort \*[Gt]0x3f
+\*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
 # immediately following the PE signature is the CPU type
-\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x14c for Intel 80386
-\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x184 for DEC Alpha
+\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x14c for Intel 80386
+\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x8664 for x86-64
+\*[Gt]\*[Gt]\*[Gt]\*[Am]0 leshort 0x184 for DEC Alpha
 .Ed
 .Pp
 Indirect and relative offsets can be combined:
 .Bd -literal -offset indent
-0 string MZ
-\*[Gt]0x18 leshort \*[Lt]0x40
-\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
+0 string MZ
+\*[Gt]0x18 uleshort \*[Lt]0x40
+\*[Gt]\*[Gt](4.s*512) leshort !0x014c MZ executable (MS-DOS)
 # if it's not COFF, go back 512 bytes and add the offset taken
 # from byte 2/3, which is yet another way of finding the start
 # of the extended executable
-\*[Gt]\*[Gt]\*[Gt]\*[Am](2.s-514) string LE LE executable (MS Windows VxD driver)
+\*[Gt]\*[Gt]\*[Gt]\*[Am](2.s-514) string LE LE executable (MS Windows VxD driver)
 .Ed
 .Pp
 Or the other way around:
 .Bd -literal -offset indent
-0 string MZ
-\*[Gt]0x18 leshort \*[Gt]0x3f
-\*[Gt]\*[Gt](0x3c.l) string LE\e0\e0 LE executable (MS-Windows)
+0 string MZ
+\*[Gt]0x18 uleshort \*[Gt]0x3f
+\*[Gt]\*[Gt](0x3c.l) string LE\e0\e0 LE executable (MS-Windows)
 # at offset 0x80 (-4, since relative offsets start at the end
 # of the up-level match) inside the LE header, we find the absolute
 # offset to the code area, where we look for a specific signature
-\*[Gt]\*[Gt]\*[Gt](\*[Am]0x7c.l+0x26) string UPX \eb, UPX compressed
+\*[Gt]\*[Gt]\*[Gt](\*[Am]0x7c.l+0x26) string UPX \eb, UPX compressed
 .Ed
 .Pp
 Or even both!
 .Bd -literal -offset indent
-0 string MZ
-\*[Gt]0x18 leshort \*[Gt]0x3f
-\*[Gt]\*[Gt](0x3c.l) string LE\e0\e0 LE executable (MS-Windows)
+0 string MZ
+\*[Gt]0x18 uleshort \*[Gt]0x3f
+\*[Gt]\*[Gt](0x3c.l) string LE\e0\e0 LE executable (MS-Windows)
 # at offset 0x58 inside the LE header, we find the relative offset
 # to a data area where we look for a specific signature
-\*[Gt]\*[Gt]\*[Gt]\*[Am](\*[Am]0x54.l-3) string UNACE \eb, ACE self-extracting archive
+\*[Gt]\*[Gt]\*[Gt]\*[Am](\*[Am]0x54.l-3) string UNACE \eb, ACE self-extracting archive
 .Ed
 .Pp
 If you have to deal with offset/length pairs in your file, even the
@@ -749,7 +771,7 @@ Note that this additional indirect offset is always relative to the
 start of the main indirect offset.
 .Bd -literal -offset indent
 0 string MZ
-\*[Gt]0x18 leshort \*[Gt]0x3f
+\*[Gt]0x18 uleshort \*[Gt]0x3f
 \*[Gt]\*[Gt](0x3c.l) string PE\e0\e0 PE executable (MS-Windows)
 # search for the PE section called ".idata"...
 \*[Gt]\*[Gt]\*[Gt]\*[Am]0xf4 search/0x140 .idata
@@ -762,7 +784,7 @@ If you have a list of known values at a particular continuation level,
 and you want to provide a switch-like default case:
 .Bd -literal -offset indent
 # clear that continuation level match
-\*[Gt]18 clear
+\*[Gt]18 clear x
 \*[Gt]18 lelong 1 one
 \*[Gt]18 lelong 2 two
 \*[Gt]18 default x
@@ -828,3 +850,15 @@ to make it clearer that those types have specified widths.
 .\" the changes I posted to the S5R2 version.
 .\"
 .\" Modified for Ian Darwin's version of the file command.
+.\"
+.\" For emacs editor
+.\" Local Variables:
+.\" eval: (add-hook 'before-save-hook 'time-stamp)
+.\" time-stamp-start: ".Dd "
+.\" time-stamp-end: "$"
+.\" time-stamp-format: "%:B %02d, %:Y"
+.\" time-stamp-time-zone: "UTC0"
+.\" system-time-locale: "C"
+.\" eval:(setq compile-command (concat "groff -Tlatin1 -m man " (buffer-file-name)) )
+.\" End:
+.\"
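
Reviewer note on the new msdosdate/msdostime types: the patch only says these are two-byte FAT/DOS-style values, so the sketch below illustrates what such a value usually encodes. It assumes the conventional FAT bit layout (date: 7 bits year since 1980, 4 bits month, 5 bits day; time: 5 bits hour, 6 bits minute, 5 bits seconds divided by two). The function names are hypothetical and are not part of file(1) or libmagic; the big_endian flag only mirrors the be/le prefixes of the magic types.

```python
import struct
from datetime import date, time


def decode_dos_date(raw: int) -> date:
    """Decode a 16-bit FAT/DOS date (assumed conventional layout)."""
    year = 1980 + ((raw >> 9) & 0x7F)   # bits 15-9: years since 1980
    month = (raw >> 5) & 0x0F           # bits 8-5: month, 1-12
    day = raw & 0x1F                    # bits 4-0: day, 1-31
    # date() raises ValueError for out-of-range fields such as month 0.
    return date(year, month, day)


def decode_dos_time(raw: int) -> time:
    """Decode a 16-bit FAT/DOS time (two-second resolution)."""
    hour = (raw >> 11) & 0x1F           # bits 15-11: hour, 0-23
    minute = (raw >> 5) & 0x3F          # bits 10-5: minute, 0-59
    second = (raw & 0x1F) * 2           # bits 4-0: seconds / 2
    return time(hour, minute, second)


def read_dos_date(buf: bytes, offset: int = 0, big_endian: bool = False) -> date:
    """Read two bytes at offset in the chosen byte order and decode them."""
    fmt = ">H" if big_endian else "<H"
    (raw,) = struct.unpack_from(fmt, buf, offset)
    return decode_dos_date(raw)


if __name__ == "__main__":
    print(decode_dos_date(0x597B))   # 2024-11-27
    print(decode_dos_time(0x7CA0))   # 15:37:00
```

The same raw value decodes identically in either byte order once the two bytes are swapped, which is the only difference between the be and le variants of these types.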