Diffstat (limited to 'doc/zstd_manual.html')
| -rw-r--r-- | doc/zstd_manual.html | 1366 |
1 files changed, 764 insertions, 602 deletions
diff --git a/doc/zstd_manual.html b/doc/zstd_manual.html index f9b1daa8a28c..6dfa6d997cb9 100644 --- a/doc/zstd_manual.html +++ b/doc/zstd_manual.html @@ -1,10 +1,10 @@ <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> -<title>zstd 1.3.7 Manual</title> +<title>zstd 1.3.8 Manual</title> </head> <body> -<h1>zstd 1.3.7 Manual</h1> +<h1>zstd 1.3.8 Manual</h1> <hr> <a name="Contents"></a><h2>Contents</h2> <ol> @@ -19,16 +19,19 @@ <li><a href="#Chapter9">Streaming compression - HowTo</a></li> <li><a href="#Chapter10">Streaming decompression - HowTo</a></li> <li><a href="#Chapter11">ADVANCED AND EXPERIMENTAL FUNCTIONS</a></li> -<li><a href="#Chapter12">Frame size functions</a></li> -<li><a href="#Chapter13">Memory management</a></li> -<li><a href="#Chapter14">Advanced compression functions</a></li> -<li><a href="#Chapter15">Advanced decompression functions</a></li> -<li><a href="#Chapter16">Advanced streaming functions</a></li> -<li><a href="#Chapter17">Buffer-less and synchronous inner streaming functions</a></li> -<li><a href="#Chapter18">Buffer-less streaming compression (synchronous mode)</a></li> -<li><a href="#Chapter19">Buffer-less streaming decompression (synchronous mode)</a></li> -<li><a href="#Chapter20">New advanced API (experimental)</a></li> -<li><a href="#Chapter21">Block level API</a></li> +<li><a href="#Chapter12">Candidate API for promotion to stable status</a></li> +<li><a href="#Chapter13">Advanced compression API</a></li> +<li><a href="#Chapter14">experimental API (static linking only)</a></li> +<li><a href="#Chapter15">Frame size functions</a></li> +<li><a href="#Chapter16">Memory management</a></li> +<li><a href="#Chapter17">Advanced compression functions</a></li> +<li><a href="#Chapter18">Advanced decompression functions</a></li> +<li><a href="#Chapter19">Advanced streaming functions</a></li> +<li><a href="#Chapter20">Buffer-less and synchronous inner streaming functions</a></li> +<li><a 
href="#Chapter21">Buffer-less streaming compression (synchronous mode)</a></li> +<li><a href="#Chapter22">Buffer-less streaming decompression (synchronous mode)</a></li> +<li><a href="#Chapter23">ZSTD_getFrameHeader() :</a></li> +<li><a href="#Chapter24">Block level API</a></li> </ol> <hr> <a name="Chapter1"></a><h2>Introduction</h2><pre> @@ -64,7 +67,7 @@ <a name="Chapter2"></a><h2>Version</h2><pre></pre> -<pre><b>unsigned ZSTD_versionNumber(void); </b>/**< useful to check dll version */<b> +<pre><b>unsigned ZSTD_versionNumber(void); </b>/**< to check runtime library version */<b> </b></pre><BR> <a name="Chapter3"></a><h2>Default constant</h2><pre></pre> @@ -139,11 +142,13 @@ int ZSTD_maxCLevel(void); </b>/*!< maximum compression lev ZSTD_CCtx* ZSTD_createCCtx(void); size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx); </pre></b><BR> -<pre><b>size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, +<pre><b>size_t ZSTD_compressCCtx(ZSTD_CCtx* cctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, int compressionLevel); -</b><p> Same as ZSTD_compress(), requires an allocated ZSTD_CCtx (see ZSTD_createCCtx()). +</b><p> Same as ZSTD_compress(), using an explicit ZSTD_CCtx + The function will compress at requested compression level, + ignoring any other parameter </p></pre><BR> <h3>Decompression context</h3><pre> When decompressing many times, @@ -155,10 +160,13 @@ size_t ZSTD_freeCCtx(ZSTD_CCtx* cctx); ZSTD_DCtx* ZSTD_createDCtx(void); size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); </pre></b><BR> -<pre><b>size_t ZSTD_decompressDCtx(ZSTD_DCtx* ctx, +<pre><b>size_t ZSTD_decompressDCtx(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize); -</b><p> Same as ZSTD_decompress(), requires an allocated ZSTD_DCtx (see ZSTD_createDCtx()) +</b><p> Same as ZSTD_decompress(), + requires an allocated ZSTD_DCtx. + Compatible with sticky parameters. 
+ </p></pre><BR> <a name="Chapter6"></a><h2>Simple dictionary API</h2><pre></pre> @@ -168,18 +176,22 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); const void* src, size_t srcSize, const void* dict,size_t dictSize, int compressionLevel); -</b><p> Compression using a predefined Dictionary (see dictBuilder/zdict.h). +</b><p> Compression at an explicit compression level using a Dictionary. + A dictionary can be any arbitrary data segment (also called a prefix), + or a buffer with specified information (see dictBuilder/zdict.h). Note : This function loads the dictionary, resulting in significant startup delay. - Note : When `dict == NULL || dictSize < 8` no dictionary is used. + It's intended for a dictionary used only once. + Note 2 : When `dict == NULL || dictSize < 8` no dictionary is used. </p></pre><BR> <pre><b>size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx, void* dst, size_t dstCapacity, const void* src, size_t srcSize, const void* dict,size_t dictSize); -</b><p> Decompression using a predefined Dictionary (see dictBuilder/zdict.h). +</b><p> Decompression using a known Dictionary. Dictionary must be identical to the one used during compression. Note : This function loads the dictionary, resulting in significant startup delay. + It's intended for a dictionary used only once. Note : When `dict == NULL || dictSize < 8` no dictionary is used. </p></pre><BR> @@ -187,11 +199,12 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); <pre><b>ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize, int compressionLevel); -</b><p> When compressing multiple messages / blocks with the same dictionary, it's recommended to load it just once. - ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup delay. +</b><p> When compressing multiple messages / blocks using the same dictionary, it's recommended to load it only once. 
+ ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup cost. ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only. - `dictBuffer` can be released after ZSTD_CDict creation, since its content is copied within CDict - Note : A ZSTD_CDict can be created with an empty dictionary, but it is inefficient for small data. + `dictBuffer` can be released after ZSTD_CDict creation, because its content is copied within CDict. + Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate `dictBuffer` content. + Note : A ZSTD_CDict can be created from an empty dictBuffer, but it is inefficient when used to compress small data. </p></pre><BR> <pre><b>size_t ZSTD_freeCDict(ZSTD_CDict* CDict); @@ -203,16 +216,14 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); const void* src, size_t srcSize, const ZSTD_CDict* cdict); </b><p> Compression using a digested Dictionary. - Faster startup than ZSTD_compress_usingDict(), recommended when same dictionary is used multiple times. - Note that compression level is decided during dictionary creation. - Frame parameters are hardcoded (dictID=yes, contentSize=yes, checksum=no) - Note : ZSTD_compress_usingCDict() can be used with a ZSTD_CDict created from an empty dictionary. - But it is inefficient for small data, and it is recommended to use ZSTD_compressCCtx(). + Recommended when same dictionary is used multiple times. + Note : compression level is _decided at dictionary creation time_, + and frame parameters are hardcoded (dictID=yes, contentSize=yes, checksum=no) </p></pre><BR> <pre><b>ZSTD_DDict* ZSTD_createDDict(const void* dictBuffer, size_t dictSize); </b><p> Create a digested dictionary, ready to start decompression operation without startup delay. 
- dictBuffer can be released after DDict creation, as its content is copied inside DDict + dictBuffer can be released after DDict creation, as its content is copied inside DDict. </p></pre><BR> <pre><b>size_t ZSTD_freeDDict(ZSTD_DDict* ddict); @@ -224,7 +235,7 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); const void* src, size_t srcSize, const ZSTD_DDict* ddict); </b><p> Decompression using a digested Dictionary. - Faster startup than ZSTD_decompress_usingDict(), recommended when same dictionary is used multiple times. + Recommended when same dictionary is used multiple times. </p></pre><BR> <a name="Chapter8"></a><h2>Streaming</h2><pre></pre> @@ -245,13 +256,17 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); A ZSTD_CStream object is required to track streaming operation. Use ZSTD_createCStream() and ZSTD_freeCStream() to create/release resources. ZSTD_CStream objects can be reused multiple times on consecutive compression operations. - It is recommended to re-use ZSTD_CStream in situations where many streaming operations will be achieved consecutively, - since it will play nicer with system's memory, by re-using already allocated memory. - Use one separate ZSTD_CStream per thread for parallel execution. + It is recommended to re-use ZSTD_CStream since it will play nicer with system's memory, by re-using already allocated memory. + + For parallel execution, use one separate ZSTD_CStream per thread. - Start a new compression by initializing ZSTD_CStream context. - Use ZSTD_initCStream() to start a new compression operation. - Use variants ZSTD_initCStream_usingDict() or ZSTD_initCStream_usingCDict() for streaming with dictionary (experimental section) + note : since v1.3.0, ZSTD_CStream and ZSTD_CCtx are the same thing. + + Parameters are sticky : when starting a new compression on the same context, + it will re-use the same sticky parameters as previous compression session. + When in doubt, it's recommended to fully initialize the context before usage. 
+ Use ZSTD_initCStream() to set the parameter to a selected compression level. + Use advanced API (ZSTD_CCtx_setParameter(), etc.) to set more specific parameters. Use ZSTD_compressStream() as many times as necessary to consume input stream. The function will automatically update both `pos` fields within `input` and `output`. @@ -260,12 +275,11 @@ size_t ZSTD_freeDCtx(ZSTD_DCtx* dctx); in which case `input.pos < input.size`. The caller must check if input has been entirely consumed. If not, the caller must make some room to receive more compressed data, - typically by emptying output buffer, or allocating a new output buffer, and then present again remaining input data. - @return : a size hint, preferred nb of bytes to use as input for next function call - or an error code, which can be tested using ZSTD_isError(). - Note 1 : it's just a hint, to help latency a little, any other value will work fine. - Note 2 : size hint is guaranteed to be <= ZSTD_CStreamInSize() + @return : a size hint, preferred nb of bytes to use as input for next function call + or an error code, which can be tested using ZSTD_isError(). + Note 1 : it's just a hint, to help latency a little, any value will work fine. + Note 2 : size hint is guaranteed to be <= ZSTD_CStreamInSize() At any moment, it's possible to flush whatever data might remain stuck within internal buffer, using ZSTD_flushStream(). `output->pos` will be updated. @@ -305,25 +319,24 @@ size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output); Use ZSTD_createDStream() and ZSTD_freeDStream() to create/release resources. ZSTD_DStream objects can be re-used multiple times. - Use ZSTD_initDStream() to start a new decompression operation, - or ZSTD_initDStream_usingDict() if decompression requires a dictionary. - @return : recommended first input size + Use ZSTD_initDStream() to start a new decompression operation. + @return : recommended first input size + Alternatively, use advanced API to set specific properties. 
Use ZSTD_decompressStream() repetitively to consume your input. The function will update both `pos` fields. If `input.pos < input.size`, some input has not been consumed. It's up to the caller to present again remaining data. - The function tries to flush all data decoded immediately, repecting buffer sizes. + The function tries to flush all data decoded immediately, respecting output buffer size. If `output.pos < output.size`, decoder has flushed everything it could. - But if `output.pos == output.size`, there is no such guarantee, - it's likely that some decoded data was not flushed and still remains within internal buffers. + But if `output.pos == output.size`, there might be some data left within internal buffers., In which case, call ZSTD_decompressStream() again to flush whatever remains in the buffer. - When no additional input is provided, amount of data flushed is necessarily <= ZSTD_BLOCKSIZE_MAX. + Note : with no additional input provided, amount of data flushed is necessarily <= ZSTD_BLOCKSIZE_MAX. @return : 0 when a frame is completely decoded and fully flushed, or an error code, which can be tested using ZSTD_isError(), or any other value > 0, which means there is still some decoding or flushing to do to complete current frame : - the return value is a suggested next input size (a hint for better latency) - that will never load more than the current frame. + the return value is a suggested next input size (just a hint for better latency) + that will never request more than the remaining frame size. <BR></pre> @@ -340,32 +353,477 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB <pre><b>size_t ZSTD_DStreamOutSize(void); </b>/*!< recommended size for output buffer. Guarantee to successfully flush at least one complete block in all circumstances. */<b> </b></pre><BR> <a name="Chapter11"></a><h2>ADVANCED AND EXPERIMENTAL FUNCTIONS</h2><pre> - The definitions in this section are considered experimental. 
- They should never be used with a dynamic library, as prototypes may change in the future. + The definitions in the following section are considered experimental. They are provided for advanced scenarios. + They should never be used with a dynamic library, as prototypes may change in the future. Use them only in association with static linking. <BR></pre> +<a name="Chapter12"></a><h2>Candidate API for promotion to stable status</h2><pre> + The following symbols and constants form the "staging area" : + they are considered to join "stable API" by v1.4.0. + The proposal is written so that it can be made stable "as is", + though it's still possible to suggest improvements. + Staging is in fact last chance for changes, + the API is locked once reaching "stable" status. + +<BR></pre> + <pre><b>int ZSTD_minCLevel(void); </b>/*!< minimum negative compression level allowed */<b> </b></pre><BR> -<pre><b>typedef enum { ZSTD_fast=1, ZSTD_dfast, ZSTD_greedy, ZSTD_lazy, ZSTD_lazy2, - ZSTD_btlazy2, ZSTD_btopt, ZSTD_btultra } ZSTD_strategy; </b>/* from faster to stronger */<b> +<pre><b>size_t ZSTD_findFrameCompressedSize(const void* src, size_t srcSize); +</b><p> `src` should point to the start of a ZSTD frame or skippable frame. + `srcSize` must be >= first frame size + @return : the compressed size of the first frame starting at `src`, + suitable to pass as `srcSize` to `ZSTD_decompress` or similar, + or an error code if input is invalid +</p></pre><BR> + +<pre><b>size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); +size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); +size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs); +size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds); +size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict); +size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); +</b><p> These functions give the _current_ memory usage of selected object. + Note that object memory usage can evolve (increase or decrease) over time. 
+</p></pre><BR> + +<a name="Chapter13"></a><h2>Advanced compression API</h2><pre></pre> + +<pre><b>typedef enum { ZSTD_fast=1, + ZSTD_dfast=2, + ZSTD_greedy=3, + ZSTD_lazy=4, + ZSTD_lazy2=5, + ZSTD_btlazy2=6, + ZSTD_btopt=7, + ZSTD_btultra=8, + ZSTD_btultra2=9 + </b>/* note : new strategies _might_ be added in the future.<b> + Only the order (from fast to strong) is guaranteed */ +} ZSTD_strategy; +</b></pre><BR> +<pre><b>typedef enum { + + </b>/* compression parameters */<b> + ZSTD_c_compressionLevel=100, </b>/* Update all compression parameters according to pre-defined cLevel table<b> + * Default level is ZSTD_CLEVEL_DEFAULT==3. + * Special: value 0 means default, which is controlled by ZSTD_CLEVEL_DEFAULT. + * Note 1 : it's possible to pass a negative compression level. + * Note 2 : setting a level sets all default values of other compression parameters */ + ZSTD_c_windowLog=101, </b>/* Maximum allowed back-reference distance, expressed as power of 2.<b> + * Must be clamped between ZSTD_WINDOWLOG_MIN and ZSTD_WINDOWLOG_MAX. + * Special: value 0 means "use default windowLog". + * Note: Using a windowLog greater than ZSTD_WINDOWLOG_LIMIT_DEFAULT + * requires explicitly allowing such window size at decompression stage if using streaming. */ + ZSTD_c_hashLog=102, </b>/* Size of the initial probe table, as a power of 2.<b> + * Resulting memory usage is (1 << (hashLog+2)). + * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX. + * Larger tables improve compression ratio of strategies <= dFast, + * and improve speed of strategies > dFast. + * Special: value 0 means "use default hashLog". */ + ZSTD_c_chainLog=103, </b>/* Size of the multi-probe search table, as a power of 2.<b> + * Resulting memory usage is (1 << (chainLog+2)). + * Must be clamped between ZSTD_CHAINLOG_MIN and ZSTD_CHAINLOG_MAX. + * Larger tables result in better and slower compression. + * This parameter is useless when using "fast" strategy. 
+ * It's still useful when using "dfast" strategy, + * in which case it defines a secondary probe table. + * Special: value 0 means "use default chainLog". */ + ZSTD_c_searchLog=104, </b>/* Number of search attempts, as a power of 2.<b> + * More attempts result in better and slower compression. + * This parameter is useless when using "fast" and "dFast" strategies. + * Special: value 0 means "use default searchLog". */ + ZSTD_c_minMatch=105, </b>/* Minimum size of searched matches.<b> + * Note that Zstandard can still find matches of smaller size, + * it just tweaks its search algorithm to look for this size and larger. + * Larger values increase compression and decompression speed, but decrease ratio. + * Must be clamped between ZSTD_MINMATCH_MIN and ZSTD_MINMATCH_MAX. + * Note that currently, for all strategies < btopt, effective minimum is 4. + * , for all strategies > fast, effective maximum is 6. + * Special: value 0 means "use default minMatchLength". */ + ZSTD_c_targetLength=106, </b>/* Impact of this field depends on strategy.<b> + * For strategies btopt, btultra & btultra2: + * Length of Match considered "good enough" to stop search. + * Larger values make compression stronger, and slower. + * For strategy fast: + * Distance between match sampling. + * Larger values make compression faster, and weaker. + * Special: value 0 means "use default targetLength". */ + ZSTD_c_strategy=107, </b>/* See ZSTD_strategy enum definition.<b> + * The higher the value of selected strategy, the more complex it is, + * resulting in stronger and slower compression. + * Special: value 0 means "use default strategy". */ + + </b>/* LDM mode parameters */<b> + ZSTD_c_enableLongDistanceMatching=160, </b>/* Enable long distance matching.<b> + * This parameter is designed to improve compression ratio + * for large inputs, by finding large matches at long distance. + * It increases memory usage and window size. 
+ * Note: enabling this parameter increases default ZSTD_c_windowLog to 128 MB + * except when expressly set to a different value. */ + ZSTD_c_ldmHashLog=161, </b>/* Size of the table for long distance matching, as a power of 2.<b> + * Larger values increase memory usage and compression ratio, + * but decrease compression speed. + * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX + * default: windowlog - 7. + * Special: value 0 means "automatically determine hashlog". */ + ZSTD_c_ldmMinMatch=162, </b>/* Minimum match size for long distance matcher.<b> + * Larger/too small values usually decrease compression ratio. + * Must be clamped between ZSTD_LDM_MINMATCH_MIN and ZSTD_LDM_MINMATCH_MAX. + * Special: value 0 means "use default value" (default: 64). */ + ZSTD_c_ldmBucketSizeLog=163, </b>/* Log size of each bucket in the LDM hash table for collision resolution.<b> + * Larger values improve collision resolution but decrease compression speed. + * The maximum value is ZSTD_LDM_BUCKETSIZELOG_MAX. + * Special: value 0 means "use default value" (default: 3). */ + ZSTD_c_ldmHashRateLog=164, </b>/* Frequency of inserting/looking up entries into the LDM hash table.<b> + * Must be clamped between 0 and (ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN). + * Default is MAX(0, (windowLog - ldmHashLog)), optimizing hash table usage. + * Larger values improve compression speed. + * Deviating far from default value will likely result in a compression ratio decrease. + * Special: value 0 means "automatically determine hashRateLog". */ + + </b>/* frame parameters */<b> + ZSTD_c_contentSizeFlag=200, </b>/* Content size will be written into frame header _whenever known_ (default:1)<b> + * Content size must be known at the beginning of compression. 
+ * This is automatically the case when using ZSTD_compress2(), + * For streaming variants, content size must be provided with ZSTD_CCtx_setPledgedSrcSize() */ + ZSTD_c_checksumFlag=201, </b>/* A 32-bits checksum of content is written at end of frame (default:0) */<b> + ZSTD_c_dictIDFlag=202, </b>/* When applicable, dictionary's ID is written into frame header (default:1) */<b> + + </b>/* multi-threading parameters */<b> + </b>/* These parameters are only useful if multi-threading is enabled (compiled with build macro ZSTD_MULTITHREAD).<b> + * They return an error otherwise. */ + ZSTD_c_nbWorkers=400, </b>/* Select how many threads will be spawned to compress in parallel.<b> + * When nbWorkers >= 1, triggers asynchronous mode when used with ZSTD_compressStream*() : + * ZSTD_compressStream*() consumes input and flush output if possible, but immediately gives back control to caller, + * while compression work is performed in parallel, within worker threads. + * (note : a strong exception to this rule is when first invocation of ZSTD_compressStream2() sets ZSTD_e_end : + * in which case, ZSTD_compressStream2() delegates to ZSTD_compress2(), which is always a blocking call). + * More workers improve speed, but also increase memory usage. + * Default value is `0`, aka "single-threaded mode" : no worker is spawned, compression is performed inside Caller's thread, all invocations are blocking */ + ZSTD_c_jobSize=401, </b>/* Size of a compression job. This value is enforced only when nbWorkers >= 1.<b> + * Each compression job is completed in parallel, so this value can indirectly impact the nb of active threads. + * 0 means default, which is dynamically determined based on compression parameters. + * Job size must be a minimum of overlap size, or 1 MB, whichever is largest. 
+ * The minimum size is automatically and transparently enforced */ + ZSTD_c_overlapLog=402, </b>/* Control the overlap size, as a fraction of window size.<b> + * The overlap size is an amount of data reloaded from previous job at the beginning of a new job. + * It helps preserve compression ratio, while each job is compressed in parallel. + * This value is enforced only when nbWorkers >= 1. + * Larger values increase compression ratio, but decrease speed. + * Possible values range from 0 to 9 : + * - 0 means "default" : value will be determined by the library, depending on strategy + * - 1 means "no overlap" + * - 9 means "full overlap", using a full window size. + * Each intermediate rank increases/decreases load size by a factor 2 : + * 9: full window; 8: w/2; 7: w/4; 6: w/8; 5:w/16; 4: w/32; 3:w/64; 2:w/128; 1:no overlap; 0:default + * default value varies between 6 and 9, depending on strategy */ + + </b>/* note : additional experimental parameters are also available<b> + * within the experimental section of the API. + * At the time of this writing, they include : + * ZSTD_c_rsyncable + * ZSTD_c_format + * ZSTD_c_forceMaxWindow + * ZSTD_c_forceAttachDict + * Because they are not stable, it's necessary to define ZSTD_STATIC_LINKING_ONLY to access them. + * note : never ever use experimentalParam? names directly; + * also, the enums values themselves are unstable and can still change. 
+ */ + ZSTD_c_experimentalParam1=500, + ZSTD_c_experimentalParam2=10, + ZSTD_c_experimentalParam3=1000, + ZSTD_c_experimentalParam4=1001 +} ZSTD_cParameter; </b></pre><BR> <pre><b>typedef struct { - unsigned windowLog; </b>/**< largest match distance : larger == more compression, more memory needed during decompression */<b> - unsigned chainLog; </b>/**< fully searched segment : larger == more compression, slower, more memory (useless for fast) */<b> - unsigned hashLog; </b>/**< dispatch table : larger == faster, more memory */<b> - unsigned searchLog; </b>/**< nb of searches : larger == more compression, slower */<b> - unsigned searchLength; </b>/**< match length searched : larger == faster decompression, sometimes less compression */<b> - unsigned targetLength; </b>/**< acceptable match size for optimal parser (only) : larger == more compression, slower */<b> - ZSTD_strategy strategy; + size_t error; + int lowerBound; + int upperBound; +} ZSTD_bounds; +</b></pre><BR> +<pre><b>ZSTD_bounds ZSTD_cParam_getBounds(ZSTD_cParameter cParam); +</b><p> All parameters must belong to an interval with lower and upper bounds, + otherwise they will either trigger an error or be automatically clamped. + @return : a structure, ZSTD_bounds, which contains + - an error status field, which must be tested using ZSTD_isError() + - lower and upper bounds, both inclusive + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int value); +</b><p> Set one compression parameter, selected by enum ZSTD_cParameter. + All parameters have valid bounds. Bounds can be queried using ZSTD_cParam_getBounds(). + Providing a value beyond bound will either clamp it, or trigger an error (depending on parameter). + Setting a parameter is generally only possible during frame initialization (before starting compression). 
+ Exception : when using multi-threading mode (nbWorkers >= 1), + the following parameters can be updated _during_ compression (within same frame): + => compressionLevel, hashLog, chainLog, searchLog, minMatch, targetLength and strategy. + new parameters will be active for next job only (after a flush()). + @return : an error code (which can be tested using ZSTD_isError()). + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_setPledgedSrcSize(ZSTD_CCtx* cctx, unsigned long long pledgedSrcSize); +</b><p> Total input data size to be compressed as a single frame. + Value will be written in frame header, unless if explicitly forbidden using ZSTD_c_contentSizeFlag. + This value will also be controlled at end of frame, and trigger an error if not respected. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Note 1 : pledgedSrcSize==0 actually means zero, aka an empty frame. + In order to mean "unknown content size", pass constant ZSTD_CONTENTSIZE_UNKNOWN. + ZSTD_CONTENTSIZE_UNKNOWN is default value for any new frame. + Note 2 : pledgedSrcSize is only valid once, for the next frame. + It's discarded at the end of the frame, and replaced by ZSTD_CONTENTSIZE_UNKNOWN. + Note 3 : Whenever all input data is provided and consumed in a single round, + for example with ZSTD_compress2(), + or invoking immediately ZSTD_compressStream2(,,,ZSTD_e_end), + this value is automatically overriden by srcSize instead. + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_loadDictionary(ZSTD_CCtx* cctx, const void* dict, size_t dictSize); +</b><p> Create an internal CDict from `dict` buffer. + Decompression will have to use same dictionary. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special: Loading a NULL (or 0-size) dictionary invalidates previous dictionary, + meaning "return to no-dictionary mode". + Note 1 : Dictionary is sticky, it will be used for all future compressed frames. 
+ To return to "no-dictionary" situation, load a NULL dictionary (or reset parameters). + Note 2 : Loading a dictionary involves building tables. + It's also a CPU consuming operation, with non-negligible impact on latency. + Tables are dependent on compression parameters, and for this reason, + compression parameters can no longer be changed after loading a dictionary. + Note 3 :`dict` content will be copied internally. + Use experimental ZSTD_CCtx_loadDictionary_byReference() to reference content instead. + In such a case, dictionary buffer must outlive its users. + Note 4 : Use ZSTD_CCtx_loadDictionary_advanced() + to precisely select how dictionary content must be interpreted. +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_refCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict); +</b><p> Reference a prepared dictionary, to be used for all next compressed frames. + Note that compression parameters are enforced from within CDict, + and supercede any compression parameter previously set within CCtx. + The dictionary will remain valid for future compressed frames using same CCtx. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special : Referencing a NULL CDict means "return to no-dictionary mode". + Note 1 : Currently, only one dictionary can be managed. + Referencing a new dictionary effectively "discards" any previous one. + Note 2 : CDict is just referenced, its lifetime must outlive its usage within CCtx. +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_refPrefix(ZSTD_CCtx* cctx, + const void* prefix, size_t prefixSize); +</b><p> Reference a prefix (single-usage dictionary) for next compressed frame. + A prefix is **only used once**. Tables are discarded at end of frame (ZSTD_e_end). + Decompression will need same prefix to properly regenerate data. 
+ Compressing with a prefix is similar in outcome as performing a diff and compressing it, + but performs much faster, especially during decompression (compression speed is tunable with compression level). + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special: Adding any prefix (including NULL) invalidates any previous prefix or dictionary + Note 1 : Prefix buffer is referenced. It **must** outlive compression. + Its content must remain unmodified during compression. + Note 2 : If the intention is to diff some large src data blob with some prior version of itself, + ensure that the window size is large enough to contain the entire source. + See ZSTD_c_windowLog. + Note 3 : Referencing a prefix involves building tables, which are dependent on compression parameters. + It's a CPU consuming operation, with non-negligible impact on latency. + If there is a need to use the same prefix multiple times, consider loadDictionary instead. + Note 4 : By default, the prefix is interpreted as raw content (ZSTD_dm_rawContent). + Use experimental ZSTD_CCtx_refPrefix_advanced() to alter dictionary interpretation. +</p></pre><BR> + +<pre><b>typedef enum { + ZSTD_reset_session_only = 1, + ZSTD_reset_parameters = 2, + ZSTD_reset_session_and_parameters = 3 +} ZSTD_ResetDirective; +</b></pre><BR> +<pre><b>size_t ZSTD_CCtx_reset(ZSTD_CCtx* cctx, ZSTD_ResetDirective reset); +</b><p> There are 2 different things that can be reset, independently or jointly : + - The session : will stop compressing current frame, and make CCtx ready to start a new one. + Useful after an error, or to interrupt any ongoing compression. + Any internal data not yet flushed is cancelled. + Compression parameters and dictionary remain unchanged. + They will be used to compress next frame. + Resetting session never fails. + - The parameters : changes all parameters back to "default". + This removes any reference to any dictionary too. 
+ Parameters can only be changed between 2 sessions (i.e. no compression is currently ongoing) + otherwise the reset fails, and function returns an error value (which can be tested using ZSTD_isError()) + - Both : similar to resetting the session, followed by resetting parameters. + +</p></pre><BR> + +<pre><b>size_t ZSTD_compress2( ZSTD_CCtx* cctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize); +</b><p> Behaves the same as ZSTD_compressCCtx(), but compression parameters are set using the advanced API. + ZSTD_compress2() always starts a new frame. + Should cctx hold data from a previously unfinished frame, everything about it is forgotten. + - Compression parameters are pushed into CCtx before starting compression, using ZSTD_CCtx_set*() + - The function is always blocking, returns when compression is completed. + Hint : compression runs faster if `dstCapacity` >= `ZSTD_compressBound(srcSize)`. + @return : compressed size written into `dst` (<= `dstCapacity`), + or an error code if it fails (which can be tested using ZSTD_isError()). + +</p></pre><BR> + +<pre><b>typedef enum { + ZSTD_e_continue=0, </b>/* collect more data, encoder decides when to output compressed result, for optimal compression ratio */<b> + ZSTD_e_flush=1, </b>/* flush any data provided so far,<b> + * it creates (at least) one new block, that can be decoded immediately on reception; + * frame will continue: any future data can still reference previously compressed data, improving compression. */ + ZSTD_e_end=2 </b>/* flush any remaining data _and_ close current frame.<b> + * note that frame is only closed after compressed data is fully flushed (return value == 0). + * After that point, any additional data starts a new frame. + * note : each frame is independent (does not reference any content from previous frame).
*/ +} ZSTD_EndDirective; +</b></pre><BR> +<pre><b>size_t ZSTD_compressStream2( ZSTD_CCtx* cctx, + ZSTD_outBuffer* output, + ZSTD_inBuffer* input, + ZSTD_EndDirective endOp); +</b><p> Behaves about the same as ZSTD_compressStream, with additional control on end directive. + - Compression parameters are pushed into CCtx before starting compression, using ZSTD_CCtx_set*() + - Compression parameters cannot be changed once compression is started (save a list of exceptions in multi-threading mode) + - output->pos must be <= dstCapacity, input->pos must be <= srcSize + - output->pos and input->pos will be updated. They are guaranteed to remain below their respective limit. + - When nbWorkers==0 (default), function is blocking : it completes its job before returning to caller. + - When nbWorkers>=1, function is non-blocking : it just acquires a copy of input, and distributes jobs to internal worker threads, flushes whatever is available, + and then immediately returns, just indicating that there is some data remaining to be flushed. + The function nonetheless guarantees forward progress : it will return only after it reads or writes at least 1 byte. + - Exception : if the first call requests a ZSTD_e_end directive and provides enough dstCapacity, the function delegates to ZSTD_compress2() which is always blocking. + - @return provides a minimum amount of data remaining to be flushed from internal buffers + or an error code, which can be tested using ZSTD_isError(). + if @return != 0, flush is not fully completed, there is still some data left within internal buffers. + This is useful for ZSTD_e_flush, since in this case more flushes are necessary to empty all buffers. + For ZSTD_e_end, @return == 0 when internal buffers are fully flushed and frame is completed. + - after a ZSTD_e_end directive, if internal buffer is not fully flushed (@return != 0), + only ZSTD_e_end or ZSTD_e_flush operations are allowed.
+ Before starting a new compression job, or changing compression parameters, + it is required to fully flush internal buffers. + +</p></pre><BR> + +<pre><b>typedef enum { + + ZSTD_d_windowLogMax=100, </b>/* Select a size limit (in power of 2) beyond which<b> + * the streaming API will refuse to allocate memory buffer + * in order to protect the host from unreasonable memory requirements. + * This parameter is only useful in streaming mode, since no internal buffer is allocated in single-pass mode. + * By default, a decompression context accepts window sizes <= (1 << ZSTD_WINDOWLOG_LIMIT_DEFAULT) */ + + </b>/* note : additional experimental parameters are also available<b> + * within the experimental section of the API. + * At the time of this writing, they include : + * ZSTD_c_format + * Because they are not stable, it's necessary to define ZSTD_STATIC_LINKING_ONLY to access them. + * note : never ever use experimentalParam? names directly + */ + ZSTD_d_experimentalParam1=1000 + +} ZSTD_dParameter; +</b></pre><BR> +<pre><b>ZSTD_bounds ZSTD_dParam_getBounds(ZSTD_dParameter dParam); +</b><p> All parameters must belong to an interval with lower and upper bounds, + otherwise they will either trigger an error or be automatically clamped. + @return : a structure, ZSTD_bounds, which contains + - an error status field, which must be tested using ZSTD_isError() + - both lower and upper bounds, inclusive + +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_setParameter(ZSTD_DCtx* dctx, ZSTD_dParameter param, int value); +</b><p> Set one decompression parameter, selected by enum ZSTD_dParameter. + All parameters have valid bounds. Bounds can be queried using ZSTD_dParam_getBounds(). + Providing a value beyond bound will either clamp it, or trigger an error (depending on parameter). + Setting a parameter is only possible during frame initialization (before starting decompression). + @return : 0, or an error code (which can be tested using ZSTD_isError()).
+ +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_loadDictionary(ZSTD_DCtx* dctx, const void* dict, size_t dictSize); +</b><p> Create an internal DDict from dict buffer, + to be used to decompress next frames. + The dictionary remains valid for all future frames, until explicitly invalidated. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Special : Adding a NULL (or 0-size) dictionary invalidates any previous dictionary, + meaning "return to no-dictionary mode". + Note 1 : Loading a dictionary involves building tables, + which has a non-negligible impact on CPU usage and latency. + It's recommended to "load once, use many times", to amortize the cost + Note 2 :`dict` content will be copied internally, so `dict` can be released after loading. + Use ZSTD_DCtx_loadDictionary_byReference() to reference dictionary content instead. + Note 3 : Use ZSTD_DCtx_loadDictionary_advanced() to take control of + how dictionary content is loaded and interpreted. + +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_refDDict(ZSTD_DCtx* dctx, const ZSTD_DDict* ddict); +</b><p> Reference a prepared dictionary, to be used to decompress next frames. + The dictionary remains active for decompression of future frames using same DCtx. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Note 1 : Currently, only one dictionary can be managed. + Referencing a new dictionary effectively "discards" any previous one. + Special: referencing a NULL DDict means "return to no-dictionary mode". + Note 2 : DDict is just referenced, its lifetime must outlive its usage from DCtx. + +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_refPrefix(ZSTD_DCtx* dctx, + const void* prefix, size_t prefixSize); +</b><p> Reference a prefix (single-usage dictionary) to decompress next frame. + This is the reverse operation of ZSTD_CCtx_refPrefix(), + and must use the same prefix as the one used during compression. + Prefix is **only used once**. 
Reference is discarded at end of frame. + End of frame is reached when ZSTD_decompressStream() returns 0. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + Note 1 : Adding any prefix (including NULL) invalidates any previously set prefix or dictionary + Note 2 : Prefix buffer is referenced. It **must** outlive decompression. + Prefix buffer must remain unmodified up to the end of frame, + reached when ZSTD_decompressStream() returns 0. + Note 3 : By default, the prefix is treated as raw content (ZSTD_dm_rawContent). + Use ZSTD_DCtx_refPrefix_advanced() to alter dictMode (Experimental section) + Note 4 : Referencing a raw content prefix has almost no cpu nor memory cost. + A full dictionary is more costly, as it requires building tables. + +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_reset(ZSTD_DCtx* dctx, ZSTD_ResetDirective reset); +</b><p> Return a DCtx to clean state. + Session and parameters can be reset jointly or separately. + Parameters can only be reset when no active frame is being decompressed. + @return : 0, or an error code, which can be tested with ZSTD_isError() + +</p></pre><BR> + +<a name="Chapter14"></a><h2>experimental API (static linking only)</h2><pre> + The following symbols and constants + are not planned to join "stable API" status in the near future. + They can still change in future versions. + Some of them are planned to remain in the static_only section indefinitely.
+ Some of them might be removed in the future (especially when redundant with existing stable functions) + +<BR></pre> + +<pre><b>typedef struct { + unsigned windowLog; </b>/**< largest match distance : larger == more compression, more memory needed during decompression */<b> + unsigned chainLog; </b>/**< fully searched segment : larger == more compression, slower, more memory (useless for fast) */<b> + unsigned hashLog; </b>/**< dispatch table : larger == faster, more memory */<b> + unsigned searchLog; </b>/**< nb of searches : larger == more compression, slower */<b> + unsigned minMatch; </b>/**< match length searched : larger == faster decompression, sometimes less compression */<b> + unsigned targetLength; </b>/**< acceptable match size for optimal parser (only) : larger == more compression, slower */<b> + ZSTD_strategy strategy; </b>/**< see ZSTD_strategy definition above */<b> } ZSTD_compressionParameters; </b></pre><BR> <pre><b>typedef struct { - unsigned contentSizeFlag; </b>/**< 1: content size will be in frame header (when known) */<b> - unsigned checksumFlag; </b>/**< 1: generate a 32-bits checksum at end of frame, for error detection */<b> - unsigned noDictIDFlag; </b>/**< 1: no dictID will be saved into frame header (if dictionary compression) */<b> + int contentSizeFlag; </b>/**< 1: content size will be in frame header (when known) */<b> + int checksumFlag; </b>/**< 1: generate a 32-bits checksum using XXH64 algorithm at end of frame, for error detection */<b> + int noDictIDFlag; </b>/**< 1: no dictID will be saved into frame header (dictID is only useful for dictionary compression) */<b> } ZSTD_frameParameters; </b></pre><BR> <pre><b>typedef struct { @@ -374,25 +832,65 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB } ZSTD_parameters; </b></pre><BR> <pre><b>typedef enum { - ZSTD_dct_auto=0, </b>/* dictionary is "full" when starting with ZSTD_MAGIC_DICTIONARY, otherwise it is "rawContent" */<b> - 
ZSTD_dct_rawContent, </b>/* ensures dictionary is always loaded as rawContent, even if it starts with ZSTD_MAGIC_DICTIONARY */<b> - ZSTD_dct_fullDict </b>/* refuses to load a dictionary if it does not respect Zstandard's specification */<b> + ZSTD_dct_auto = 0, </b>/* dictionary is "full" when starting with ZSTD_MAGIC_DICTIONARY, otherwise it is "rawContent" */<b> + ZSTD_dct_rawContent = 1, </b>/* ensures dictionary is always loaded as rawContent, even if it starts with ZSTD_MAGIC_DICTIONARY */<b> + ZSTD_dct_fullDict = 2 </b>/* refuses to load a dictionary if it does not respect Zstandard's specification, starting with ZSTD_MAGIC_DICTIONARY */<b> } ZSTD_dictContentType_e; </b></pre><BR> <pre><b>typedef enum { - ZSTD_dlm_byCopy = 0, </b>/**< Copy dictionary content internally */<b> - ZSTD_dlm_byRef, </b>/**< Reference dictionary content -- the dictionary buffer must outlive its users. */<b> + ZSTD_dlm_byCopy = 0, </b>/**< Copy dictionary content internally */<b> + ZSTD_dlm_byRef = 1, </b>/**< Reference dictionary content -- the dictionary buffer must outlive its users. */<b> } ZSTD_dictLoadMethod_e; </b></pre><BR> -<a name="Chapter12"></a><h2>Frame size functions</h2><pre></pre> - -<pre><b>size_t ZSTD_findFrameCompressedSize(const void* src, size_t srcSize); -</b><p> `src` should point to the start of a ZSTD encoded frame or skippable frame - `srcSize` must be >= first frame size - @return : the compressed size of the first frame starting at `src`, - suitable to pass to `ZSTD_decompress` or similar, - or an error code if input is invalid -</p></pre><BR> +<pre><b>typedef enum { + </b>/* Opened question : should we have a format ZSTD_f_auto ?<b> + * Today, it would mean exactly the same as ZSTD_f_zstd1. + * But, in the future, should several formats become supported, + * on the compression side, it would mean "default format". + * On the decompression side, it would mean "automatic format detection", + * so that ZSTD_f_zstd1 would mean "accept *only* zstd frames". 
+ * Since meaning is a little different, another option could be to define different enums for compression and decompression. + * This question could be kept for later, when there are actually multiple formats to support, + * but there is also the question of pinning enum values, and pinning value `0` is especially important */ + ZSTD_f_zstd1 = 0, </b>/* zstd frame format, specified in zstd_compression_format.md (default) */<b> + ZSTD_f_zstd1_magicless = 1, </b>/* Variant of zstd frame format, without initial 4-bytes magic number.<b> + * Useful to save 4 bytes per generated frame. + * Decoder cannot recognise automatically this format, requiring this instruction. */ +} ZSTD_format_e; +</b></pre><BR> +<pre><b>typedef enum { + </b>/* Note: this enum and the behavior it controls are effectively internal<b> + * implementation details of the compressor. They are expected to continue + * to evolve and should be considered only in the context of extremely + * advanced performance tuning. + * + * Zstd currently supports the use of a CDict in two ways: + * + * - The contents of the CDict can be copied into the working context. This + * means that the compression can search both the dictionary and input + * while operating on a single set of internal tables. This makes + * the compression faster per-byte of input. However, the initial copy of + * the CDict's tables incurs a fixed cost at the beginning of the + * compression. For small compressions (< 8 KB), that copy can dominate + * the cost of the compression. + * + * - The CDict's tables can be used in-place. In this model, compression is + * slower per input byte, because the compressor has to search two sets of + * tables. However, this model incurs no start-up cost (as long as the + * working context's tables can be reused). For small inputs, this can be + * faster than copying the CDict's tables. + * + * Zstd has a simple internal heuristic that selects which strategy to use + * at the beginning of a compression. 
However, if experimentation shows that + * Zstd is making poor choices, it is possible to override that choice with + * this enum. + */ + ZSTD_dictDefaultAttach = 0, </b>/* Use the default heuristic. */<b> + ZSTD_dictForceAttach = 1, </b>/* Never copy the dictionary. */<b> + ZSTD_dictForceCopy = 2, </b>/* Always copy the dictionary. */<b> +} ZSTD_dictAttachPref_e; +</b></pre><BR> +<a name="Chapter15"></a><h2>Frame size functions</h2><pre></pre> <pre><b>unsigned long long ZSTD_findDecompressedSize(const void* src, size_t srcSize); </b><p> `src` should point the start of a series of ZSTD encoded and/or skippable frames @@ -418,22 +916,12 @@ size_t ZSTD_decompressStream(ZSTD_DStream* zds, ZSTD_outBuffer* output, ZSTD_inB </p></pre><BR> <pre><b>size_t ZSTD_frameHeaderSize(const void* src, size_t srcSize); -</b><p> srcSize must be >= ZSTD_frameHeaderSize_prefix. +</b><p> srcSize must be >= ZSTD_FRAMEHEADERSIZE_PREFIX. @return : size of the Frame Header, or an error code (if srcSize is too small) </p></pre><BR> -<a name="Chapter13"></a><h2>Memory management</h2><pre></pre> - -<pre><b>size_t ZSTD_sizeof_CCtx(const ZSTD_CCtx* cctx); -size_t ZSTD_sizeof_DCtx(const ZSTD_DCtx* dctx); -size_t ZSTD_sizeof_CStream(const ZSTD_CStream* zcs); -size_t ZSTD_sizeof_DStream(const ZSTD_DStream* zds); -size_t ZSTD_sizeof_CDict(const ZSTD_CDict* cdict); -size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict); -</b><p> These functions give the current memory usage of selected object. - Object memory usage can evolve when re-used. -</p></pre><BR> +<a name="Chapter16"></a><h2>Memory management</h2><pre></pre> <pre><b>size_t ZSTD_estimateCCtxSize(int compressionLevel); size_t ZSTD_estimateCCtxSize_usingCParams(ZSTD_compressionParameters cParams); @@ -445,7 +933,7 @@ size_t ZSTD_estimateDCtxSize(void); It will also consider src size to be arbitrarily "large", which is worst case. If srcSize is known to always be small, ZSTD_estimateCCtxSize_usingCParams() can provide a tighter estimation. 
ZSTD_estimateCCtxSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. - ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbWorkers is >= 1. + ZSTD_estimateCCtxSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_c_nbWorkers is >= 1. Note : CCtx size estimation is only correct for single-threaded compression. </p></pre><BR> @@ -458,7 +946,7 @@ size_t ZSTD_estimateDStreamSize_fromFrame(const void* src, size_t srcSize); It will also consider src size to be arbitrarily "large", which is worst case. If srcSize is known to always be small, ZSTD_estimateCStreamSize_usingCParams() can provide a tighter estimation. ZSTD_estimateCStreamSize_usingCParams() can be used in tandem with ZSTD_getCParams() to create cParams from compressionLevel. - ZSTD_estimateCStreamSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_p_nbWorkers is >= 1. + ZSTD_estimateCStreamSize_usingCCtxParams() can be used in tandem with ZSTD_CCtxParam_setParameter(). Only single-threaded compression is supported. This function will return an error code if ZSTD_c_nbWorkers is >= 1. Note : CStream size estimation is only correct for single-threaded compression. ZSTD_DStream memory budget depends on window Size. 
This information can be passed manually, using ZSTD_estimateDStreamSize, @@ -513,12 +1001,13 @@ static ZSTD_customMem const ZSTD_defaultCMem = { NULL, NULL, NULL }; </b>/**< t </p></pre><BR> -<a name="Chapter14"></a><h2>Advanced compression functions</h2><pre></pre> +<a name="Chapter17"></a><h2>Advanced compression functions</h2><pre></pre> <pre><b>ZSTD_CDict* ZSTD_createCDict_byReference(const void* dictBuffer, size_t dictSize, int compressionLevel); </b><p> Create a digested dictionary for compression - Dictionary content is simply referenced, and therefore stays in dictBuffer. - It is important that dictBuffer outlives CDict, it must remain read accessible throughout the lifetime of CDict + Dictionary content is just referenced, not duplicated. + As a consequence, `dictBuffer` **must** outlive CDict, + and its content must remain unmodified throughout the lifetime of CDict. </p></pre><BR> <pre><b>ZSTD_compressionParameters ZSTD_getCParams(int compressionLevel, unsigned long long estimatedSrcSize, size_t dictSize); @@ -540,22 +1029,120 @@ static ZSTD_customMem const ZSTD_defaultCMem = { NULL, NULL, NULL }; </b>/**< t both values are optional, select `0` if unknown. 
</p></pre><BR> -<pre><b>size_t ZSTD_compress_advanced (ZSTD_CCtx* cctx, - void* dst, size_t dstCapacity, - const void* src, size_t srcSize, - const void* dict,size_t dictSize, - ZSTD_parameters params); -</b><p> Same as ZSTD_compress_usingDict(), with fine-tune control over each compression parameter +<pre><b>size_t ZSTD_compress_advanced(ZSTD_CCtx* cctx, + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + const void* dict,size_t dictSize, + ZSTD_parameters params); +</b><p> Same as ZSTD_compress_usingDict(), with fine-tune control over compression parameters (by structure) </p></pre><BR> <pre><b>size_t ZSTD_compress_usingCDict_advanced(ZSTD_CCtx* cctx, - void* dst, size_t dstCapacity, - const void* src, size_t srcSize, - const ZSTD_CDict* cdict, ZSTD_frameParameters fParams); -</b><p> Same as ZSTD_compress_usingCDict(), with fine-tune control over frame parameters + void* dst, size_t dstCapacity, + const void* src, size_t srcSize, + const ZSTD_CDict* cdict, + ZSTD_frameParameters fParams); +</b><p> Same as ZSTD_compress_usingCDict(), with fine-tune control over frame parameters +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_loadDictionary_byReference(ZSTD_CCtx* cctx, const void* dict, size_t dictSize); +</b><p> Same as ZSTD_CCtx_loadDictionary(), but dictionary content is referenced, instead of being copied into CCtx. + It saves some memory, but also requires that `dict` outlives its usage within `cctx` +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_loadDictionary_advanced(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_dictLoadMethod_e dictLoadMethod, ZSTD_dictContentType_e dictContentType); +</b><p> Same as ZSTD_CCtx_loadDictionary(), but gives finer control over + how to load the dictionary (by copy ? by reference ?) + and how to interpret it (automatic ? force raw mode ? full mode only ?) 
</p></pre><BR> -<a name="Chapter15"></a><h2>Advanced decompression functions</h2><pre></pre> +<pre><b>size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, const void* prefix, size_t prefixSize, ZSTD_dictContentType_e dictContentType); +</b><p> Same as ZSTD_CCtx_refPrefix(), but gives finer control over + how to interpret prefix content (automatic ? force raw mode (default) ? full mode only ?) +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, int* value); +</b><p> Get the requested compression parameter value, selected by enum ZSTD_cParameter, + and store it into int* value. + @return : 0, or an error code (which can be tested with ZSTD_isError()). + +</p></pre><BR> + +<pre><b>ZSTD_CCtx_params* ZSTD_createCCtxParams(void); +size_t ZSTD_freeCCtxParams(ZSTD_CCtx_params* params); +</b><p> Quick howto : + - ZSTD_createCCtxParams() : Create a ZSTD_CCtx_params structure + - ZSTD_CCtxParam_setParameter() : Push parameters one by one into + an existing ZSTD_CCtx_params structure. + This is similar to + ZSTD_CCtx_setParameter(). + - ZSTD_CCtx_setParametersUsingCCtxParams() : Apply parameters to + an existing CCtx. + These parameters will be applied to + all subsequent frames. + - ZSTD_compressStream2() : Do compression using the CCtx. + - ZSTD_freeCCtxParams() : Free the memory. + + This can be used with ZSTD_estimateCCtxSize_usingCCtxParams() + for static allocation of CCtx for single-threaded compression. + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtxParams_reset(ZSTD_CCtx_params* params); +</b><p> Reset params to default values. + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtxParams_init(ZSTD_CCtx_params* cctxParams, int compressionLevel); +</b><p> Initializes the compression parameters of cctxParams according to + compression level. All other parameters are reset to their default values.
+ +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtxParams_init_advanced(ZSTD_CCtx_params* cctxParams, ZSTD_parameters params); +</b><p> Initializes the compression and frame parameters of cctxParams according to + params. All other parameters are reset to their default values. + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtxParam_setParameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int value); +</b><p> Similar to ZSTD_CCtx_setParameter. + Set one compression parameter, selected by enum ZSTD_cParameter. + Parameters must be applied to a ZSTD_CCtx using ZSTD_CCtx_setParametersUsingCCtxParams(). + @result : 0, or an error code (which can be tested with ZSTD_isError()). + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtxParam_getParameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, int* value); +</b><p> Similar to ZSTD_CCtx_getParameter. + Get the requested value of one compression parameter, selected by enum ZSTD_cParameter. + @result : 0, or an error code (which can be tested with ZSTD_isError()). + +</p></pre><BR> + +<pre><b>size_t ZSTD_CCtx_setParametersUsingCCtxParams( + ZSTD_CCtx* cctx, const ZSTD_CCtx_params* params); +</b><p> Apply a set of ZSTD_CCtx_params to the compression context. + This can be done even after compression is started, + if nbWorkers==0, this will have no impact until a new compression is started. + if nbWorkers>=1, new parameters will be picked up at next job, + with a few restrictions (windowLog, pledgedSrcSize, nbWorkers, jobSize, and overlapLog are not updated). + +</p></pre><BR> + +<pre><b>size_t ZSTD_compressStream2_simpleArgs ( + ZSTD_CCtx* cctx, + void* dst, size_t dstCapacity, size_t* dstPos, + const void* src, size_t srcSize, size_t* srcPos, + ZSTD_EndDirective endOp); +</b><p> Same as ZSTD_compressStream2(), + but using only integral types as arguments. + This variant might be helpful for binders from dynamic languages + which have troubles handling structures containing memory pointers. 
+ +</p></pre><BR> + +<a name="Chapter18"></a><h2>Advanced decompression functions</h2><pre></pre> <pre><b>unsigned ZSTD_isFrame(const void* buffer, size_t size); </b><p> Tells if the content of `buffer` starts with a valid Frame Identifier. @@ -595,7 +1182,56 @@ static ZSTD_customMem const ZSTD_defaultCMem = { NULL, NULL, NULL }; </b>/**< t When identifying the exact failure cause, it's possible to use ZSTD_getFrameHeader(), which will provide a more precise error code. </p></pre><BR> -<a name="Chapter16"></a><h2>Advanced streaming functions</h2><pre></pre> +<pre><b>size_t ZSTD_DCtx_loadDictionary_byReference(ZSTD_DCtx* dctx, const void* dict, size_t dictSize); +</b><p> Same as ZSTD_DCtx_loadDictionary(), + but references `dict` content instead of copying it into `dctx`. + This saves memory if `dict` remains around. + However, it's imperative that `dict` remains accessible (and unmodified) while being used, so it must outlive decompression. +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_loadDictionary_advanced(ZSTD_DCtx* dctx, const void* dict, size_t dictSize, ZSTD_dictLoadMethod_e dictLoadMethod, ZSTD_dictContentType_e dictContentType); +</b><p> Same as ZSTD_DCtx_loadDictionary(), + but gives direct control over + how to load the dictionary (by copy ? by reference ?) + and how to interpret it (automatic ? force raw mode ? full mode only ?). +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_refPrefix_advanced(ZSTD_DCtx* dctx, const void* prefix, size_t prefixSize, ZSTD_dictContentType_e dictContentType); +</b><p> Same as ZSTD_DCtx_refPrefix(), but gives finer control over + how to interpret prefix content (automatic ? force raw mode (default) ? full mode only ?) +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_setMaxWindowSize(ZSTD_DCtx* dctx, size_t maxWindowSize); +</b><p> Refuses allocating internal buffers for frames requiring a window size larger than provided limit. + This protects a decoder context from reserving too much memory for itself (potential attack scenario).
+ This parameter is only useful in streaming mode, since no internal buffer is allocated in single-pass mode. + By default, a decompression context accepts all window sizes <= (1 << ZSTD_WINDOWLOG_LIMIT_DEFAULT) + @return : 0, or an error code (which can be tested using ZSTD_isError()). + +</p></pre><BR> + +<pre><b>size_t ZSTD_DCtx_setFormat(ZSTD_DCtx* dctx, ZSTD_format_e format); +</b><p> Instruct the decoder context about what kind of data to decode next. + This instruction is mandatory to decode data without a fully-formed header, + such as ZSTD_f_zstd1_magicless for example. + @return : 0, or an error code (which can be tested using ZSTD_isError()). +</p></pre><BR> + +<pre><b>size_t ZSTD_decompressStream_simpleArgs ( + ZSTD_DCtx* dctx, + void* dst, size_t dstCapacity, size_t* dstPos, + const void* src, size_t srcSize, size_t* srcPos); +</b><p> Same as ZSTD_decompressStream(), + but using only integral types as arguments. + This can be helpful for binders from dynamic languages + which have troubles handling structures containing memory pointers. + +</p></pre><BR> + +<a name="Chapter19"></a><h2>Advanced streaming functions</h2><pre> Warning : most of these functions are now redundant with the Advanced API. + Once Advanced API reaches "stable" status, + redundant functions will be deprecated, and then at some point removed. +<BR></pre> <h3>Advanced Streaming compression functions</h3><pre></pre><b><pre>size_t ZSTD_initCStream_srcSize(ZSTD_CStream* zcs, int compressionLevel, unsigned long long pledgedSrcSize); </b>/**< pledgedSrcSize must be correct. If it is not known at init time, use ZSTD_CONTENTSIZE_UNKNOWN. Note that, for compatibility with older programs, "0" also disables frame content size field. It may be enabled in the future.
*/<b> size_t ZSTD_initCStream_usingDict(ZSTD_CStream* zcs, const void* dict, size_t dictSize, int compressionLevel); </b>/**< creates of an internal CDict (incompatible with static CCtx), except if dict == NULL or dictSize < 8, in which case no dict is used. Note: dict is loaded with ZSTD_dm_auto (treated as a full zstd dictionary if it begins with ZSTD_MAGIC_DICTIONARY, else as raw content) and ZSTD_dlm_byCopy.*/<b> @@ -605,7 +1241,7 @@ size_t ZSTD_initCStream_usingCDict(ZSTD_CStream* zcs, const ZSTD_CDict* cdict); size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* cdict, ZSTD_frameParameters fParams, unsigned long long pledgedSrcSize); </b>/**< same as ZSTD_initCStream_usingCDict(), with control over frame parameters. pledgedSrcSize must be correct. If srcSize is not known at init time, use value ZSTD_CONTENTSIZE_UNKNOWN. */<b> </pre></b><BR> <pre><b>size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize); -</b><p> start a new compression job, using same parameters from previous job. +</b><p> start a new frame, using same parameters from previous frame. This is typically useful to skip dictionary loading stage, since it will re-use it in-place. Note that zcs must be init at least once before using ZSTD_resetCStream(). If pledgedSrcSize is not known at reset time, use macro ZSTD_CONTENTSIZE_UNKNOWN. @@ -635,25 +1271,23 @@ size_t ZSTD_initCStream_usingCDict_advanced(ZSTD_CStream* zcs, const ZSTD_CDict* + there is no active job (could be checked with ZSTD_frameProgression()), or + oldest job is still actively compressing data, but everything it has produced has also been flushed so far, - therefore flushing speed is currently limited by production speed of oldest job - irrespective of the speed of concurrent newer jobs. + therefore flush speed is limited by production speed of oldest job + irrespective of the speed of concurrent (and newer) jobs. 
</p></pre><BR> -<h3>Advanced Streaming decompression functions</h3><pre></pre><b><pre>typedef enum { DStream_p_maxWindowSize } ZSTD_DStreamParameter_e; -size_t ZSTD_setDStreamParameter(ZSTD_DStream* zds, ZSTD_DStreamParameter_e paramType, unsigned paramValue); </b>/* obsolete : this API will be removed in a future version */<b> -size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); </b>/**< note: no dictionary will be used if dict == NULL or dictSize < 8 */<b> +<h3>Advanced Streaming decompression functions</h3><pre></pre><b><pre>size_t ZSTD_initDStream_usingDict(ZSTD_DStream* zds, const void* dict, size_t dictSize); </b>/**< note: no dictionary will be used if dict == NULL or dictSize < 8 */<b> size_t ZSTD_initDStream_usingDDict(ZSTD_DStream* zds, const ZSTD_DDict* ddict); </b>/**< note : ddict is referenced, it must outlive decompression session */<b> size_t ZSTD_resetDStream(ZSTD_DStream* zds); </b>/**< re-use decompression parameters from previous init; saves dictionary loading */<b> </pre></b><BR> -<a name="Chapter17"></a><h2>Buffer-less and synchronous inner streaming functions</h2><pre> +<a name="Chapter20"></a><h2>Buffer-less and synchronous inner streaming functions</h2><pre> This is an advanced API, giving full control over buffer management, for users which need direct control over memory. But it's also a complex one, with several restrictions, documented below. Prefer normal streaming API for an easier experience. <BR></pre> -<a name="Chapter18"></a><h2>Buffer-less streaming compression (synchronous mode)</h2><pre> +<a name="Chapter21"></a><h2>Buffer-less streaming compression (synchronous mode)</h2><pre> A ZSTD_CCtx object is required to track streaming operations. Use ZSTD_createCCtx() / ZSTD_freeCCtx() to manage resource. ZSTD_CCtx object can be re-used multiple times within successive compression operations. 
@@ -689,7 +1323,7 @@ size_t ZSTD_compressBegin_usingCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict); size_t ZSTD_compressBegin_usingCDict_advanced(ZSTD_CCtx* const cctx, const ZSTD_CDict* const cdict, ZSTD_frameParameters const fParams, unsigned long long const pledgedSrcSize); </b>/* compression parameters are already set within cdict. pledgedSrcSize must be correct. If srcSize is not known, use macro ZSTD_CONTENTSIZE_UNKNOWN */<b> size_t ZSTD_copyCCtx(ZSTD_CCtx* cctx, const ZSTD_CCtx* preparedCCtx, unsigned long long pledgedSrcSize); </b>/**< note: if pledgedSrcSize is not known, use ZSTD_CONTENTSIZE_UNKNOWN */<b> </pre></b><BR> -<a name="Chapter19"></a><h2>Buffer-less streaming decompression (synchronous mode)</h2><pre> +<a name="Chapter22"></a><h2>Buffer-less streaming decompression (synchronous mode)</h2><pre> A ZSTD_DCtx object is required to track streaming operations. Use ZSTD_createDCtx() / ZSTD_freeDCtx() to manage it. A ZSTD_DCtx object can be re-used multiple times. @@ -770,496 +1404,24 @@ typedef struct { unsigned dictID; unsigned checksumFlag; } ZSTD_frameHeader; -</b>/** ZSTD_getFrameHeader() :<b> - * decode Frame Header, or requires larger `srcSize`. 
- * @return : 0, `zfhPtr` is correctly filled, - * >0, `srcSize` is too small, value is wanted `srcSize` amount, - * or an error code, which can be tested using ZSTD_isError() */ -size_t ZSTD_getFrameHeader(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize); </b>/**< doesn't consume input */<b> -size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long long frameContentSize); </b>/**< when frame content size is not known, pass in frameContentSize == ZSTD_CONTENTSIZE_UNKNOWN */<b> </pre></b><BR> -<pre><b>typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e; -</b></pre><BR> -<a name="Chapter20"></a><h2>New advanced API (experimental)</h2><pre></pre> - -<pre><b>typedef enum { - </b>/* Opened question : should we have a format ZSTD_f_auto ?<b> - * Today, it would mean exactly the same as ZSTD_f_zstd1. - * But, in the future, should several formats become supported, - * on the compression side, it would mean "default format". - * On the decompression side, it would mean "automatic format detection", - * so that ZSTD_f_zstd1 would mean "accept *only* zstd frames". - * Since meaning is a little different, another option could be to define different enums for compression and decompression. - * This question could be kept for later, when there are actually multiple formats to support, - * but there is also the question of pinning enum values, and pinning value `0` is especially important */ - ZSTD_f_zstd1 = 0, </b>/* zstd frame format, specified in zstd_compression_format.md (default) */<b> - ZSTD_f_zstd1_magicless, </b>/* Variant of zstd frame format, without initial 4-bytes magic number.<b> - * Useful to save 4 bytes per generated frame. - * Decoder cannot recognise automatically this format, requiring instructions. 
*/ -} ZSTD_format_e; -</b></pre><BR> -<pre><b>typedef enum { - </b>/* compression format */<b> - ZSTD_p_format = 10, </b>/* See ZSTD_format_e enum definition.<b> - * Cast selected format as unsigned for ZSTD_CCtx_setParameter() compatibility. */ - - </b>/* compression parameters */<b> - ZSTD_p_compressionLevel=100, </b>/* Update all compression parameters according to pre-defined cLevel table<b> - * Default level is ZSTD_CLEVEL_DEFAULT==3. - * Special: value 0 means default, which is controlled by ZSTD_CLEVEL_DEFAULT. - * Note 1 : it's possible to pass a negative compression level by casting it to unsigned type. - * Note 2 : setting a level sets all default values of other compression parameters. - * Note 3 : setting compressionLevel automatically updates ZSTD_p_compressLiterals. */ - ZSTD_p_windowLog, </b>/* Maximum allowed back-reference distance, expressed as power of 2.<b> - * Must be clamped between ZSTD_WINDOWLOG_MIN and ZSTD_WINDOWLOG_MAX. - * Special: value 0 means "use default windowLog". - * Note: Using a window size greater than ZSTD_MAXWINDOWSIZE_DEFAULT (default: 2^27) - * requires explicitly allowing such window size during decompression stage. */ - ZSTD_p_hashLog, </b>/* Size of the initial probe table, as a power of 2.<b> - * Resulting table size is (1 << (hashLog+2)). - * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX. - * Larger tables improve compression ratio of strategies <= dFast, - * and improve speed of strategies > dFast. - * Special: value 0 means "use default hashLog". */ - ZSTD_p_chainLog, </b>/* Size of the multi-probe search table, as a power of 2.<b> - * Resulting table size is (1 << (chainLog+2)). - * Must be clamped between ZSTD_CHAINLOG_MIN and ZSTD_CHAINLOG_MAX. - * Larger tables result in better and slower compression. - * This parameter is useless when using "fast" strategy. - * Note it's still useful when using "dfast" strategy, - * in which case it defines a secondary probe table. 
- * Special: value 0 means "use default chainLog". */ - ZSTD_p_searchLog, </b>/* Number of search attempts, as a power of 2.<b> - * More attempts result in better and slower compression. - * This parameter is useless when using "fast" and "dFast" strategies. - * Special: value 0 means "use default searchLog". */ - ZSTD_p_minMatch, </b>/* Minimum size of searched matches (note : repCode matches can be smaller).<b> - * Larger values make faster compression and decompression, but decrease ratio. - * Must be clamped between ZSTD_SEARCHLENGTH_MIN and ZSTD_SEARCHLENGTH_MAX. - * Note that currently, for all strategies < btopt, effective minimum is 4. - * , for all strategies > fast, effective maximum is 6. - * Special: value 0 means "use default minMatchLength". */ - ZSTD_p_targetLength, </b>/* Impact of this field depends on strategy.<b> - * For strategies btopt & btultra: - * Length of Match considered "good enough" to stop search. - * Larger values make compression stronger, and slower. - * For strategy fast: - * Distance between match sampling. - * Larger values make compression faster, and weaker. - * Special: value 0 means "use default targetLength". */ - ZSTD_p_compressionStrategy, </b>/* See ZSTD_strategy enum definition.<b> - * Cast selected strategy as unsigned for ZSTD_CCtx_setParameter() compatibility. - * The higher the value of selected strategy, the more complex it is, - * resulting in stronger and slower compression. - * Special: value 0 means "use default strategy". */ - - ZSTD_p_enableLongDistanceMatching=160, </b>/* Enable long distance matching.<b> - * This parameter is designed to improve compression ratio - * for large inputs, by finding large matches at long distance. - * It increases memory usage and window size. - * Note: enabling this parameter increases ZSTD_p_windowLog to 128 MB - * except when expressly set to a different value. 
*/ - ZSTD_p_ldmHashLog, </b>/* Size of the table for long distance matching, as a power of 2.<b> - * Larger values increase memory usage and compression ratio, - * but decrease compression speed. - * Must be clamped between ZSTD_HASHLOG_MIN and ZSTD_HASHLOG_MAX - * default: windowlog - 7. - * Special: value 0 means "automatically determine hashlog". */ - ZSTD_p_ldmMinMatch, </b>/* Minimum match size for long distance matcher.<b> - * Larger/too small values usually decrease compression ratio. - * Must be clamped between ZSTD_LDM_MINMATCH_MIN and ZSTD_LDM_MINMATCH_MAX. - * Special: value 0 means "use default value" (default: 64). */ - ZSTD_p_ldmBucketSizeLog, </b>/* Log size of each bucket in the LDM hash table for collision resolution.<b> - * Larger values improve collision resolution but decrease compression speed. - * The maximum value is ZSTD_LDM_BUCKETSIZELOG_MAX . - * Special: value 0 means "use default value" (default: 3). */ - ZSTD_p_ldmHashEveryLog, </b>/* Frequency of inserting/looking up entries in the LDM hash table.<b> - * Must be clamped between 0 and (ZSTD_WINDOWLOG_MAX - ZSTD_HASHLOG_MIN). - * Default is MAX(0, (windowLog - ldmHashLog)), optimizing hash table usage. - * Larger values improve compression speed. - * Deviating far from default value will likely result in a compression ratio decrease. - * Special: value 0 means "automatically determine hashEveryLog". 
*/ - - </b>/* frame parameters */<b> - ZSTD_p_contentSizeFlag=200, </b>/* Content size will be written into frame header _whenever known_ (default:1)<b> - * Content size must be known at the beginning of compression, - * it is provided using ZSTD_CCtx_setPledgedSrcSize() */ - ZSTD_p_checksumFlag, </b>/* A 32-bits checksum of content is written at end of frame (default:0) */<b> - ZSTD_p_dictIDFlag, </b>/* When applicable, dictionary's ID is written into frame header (default:1) */<b> - - </b>/* multi-threading parameters */<b> - </b>/* These parameters are only useful if multi-threading is enabled (ZSTD_MULTITHREAD).<b> - * They return an error otherwise. */ - ZSTD_p_nbWorkers=400, </b>/* Select how many threads will be spawned to compress in parallel.<b> - * When nbWorkers >= 1, triggers asynchronous mode : - * ZSTD_compress_generic() consumes some input, flush some output if possible, and immediately gives back control to caller, - * while compression work is performed in parallel, within worker threads. - * (note : a strong exception to this rule is when first invocation sets ZSTD_e_end : it becomes a blocking call). - * More workers improve speed, but also increase memory usage. - * Default value is `0`, aka "single-threaded mode" : no worker is spawned, compression is performed inside Caller's thread, all invocations are blocking */ - ZSTD_p_jobSize, </b>/* Size of a compression job. This value is enforced only in non-blocking mode.<b> - * Each compression job is completed in parallel, so this value indirectly controls the nb of active threads. - * 0 means default, which is dynamically determined based on compression parameters. - * Job size must be a minimum of overlapSize, or 1 MB, whichever is largest. 
- * The minimum size is automatically and transparently enforced */ - ZSTD_p_overlapSizeLog, </b>/* Size of previous input reloaded at the beginning of each job.<b> - * 0 => no overlap, 6(default) => use 1/8th of windowSize, >=9 => use full windowSize */ - - </b>/* =================================================================== */<b> - </b>/* experimental parameters - no stability guaranteed */<b> - </b>/* =================================================================== */<b> - - ZSTD_p_forceMaxWindow=1100, </b>/* Force back-reference distances to remain < windowSize,<b> - * even when referencing into Dictionary content (default:0) */ - ZSTD_p_forceAttachDict, </b>/* ZSTD supports usage of a CDict in-place<b> - * (avoiding having to copy the compression tables - * from the CDict into the working context). Using - * a CDict in this way saves an initial setup step, - * but comes at the cost of more work per byte of - * input. ZSTD has a simple internal heuristic that - * guesses which strategy will be faster. You can - * use this flag to override that guess. - * - * Note that the by-reference, in-place strategy is - * only used when reusing a compression context - * with compatible compression parameters. (If - * incompatible / uninitialized, the working - * context needs to be cleared anyways, which is - * about as expensive as overwriting it with the - * dictionary context, so there's no savings in - * using the CDict by-ref.) - * - * Values greater than 0 force attaching the dict. - * Values less than 0 force copying the dict. - * 0 selects the default heuristic-guided behavior. - */ - -} ZSTD_cParameter; -</b></pre><BR> -<pre><b>size_t ZSTD_CCtx_setParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned value); -</b><p> Set one compression parameter, selected by enum ZSTD_cParameter. - Setting a parameter is generally only possible during frame initialization (before starting compression). 
- Exception : when using multi-threading mode (nbThreads >= 1), - following parameters can be updated _during_ compression (within same frame): - => compressionLevel, hashLog, chainLog, searchLog, minMatch, targetLength and strategy. - new parameters will be active on next job, or after a flush(). - Note : when `value` type is not unsigned (int, or enum), cast it to unsigned for proper type checking. - @result : informational value (typically, value being set, correctly clamped), - or an error code (which can be tested with ZSTD_isError()). -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_getParameter(ZSTD_CCtx* cctx, ZSTD_cParameter param, unsigned* value); -</b><p> Get the requested value of one compression parameter, selected by enum ZSTD_cParameter. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_setPledgedSrcSize(ZSTD_CCtx* cctx, unsigned long long pledgedSrcSize); -</b><p> Total input data size to be compressed as a single frame. - This value will be controlled at the end, and result in error if not respected. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - Note 1 : 0 means zero, empty. - In order to mean "unknown content size", pass constant ZSTD_CONTENTSIZE_UNKNOWN. - ZSTD_CONTENTSIZE_UNKNOWN is default value for any new compression job. - Note 2 : If all data is provided and consumed in a single round, - this value is overriden by srcSize instead. -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_loadDictionary(ZSTD_CCtx* cctx, const void* dict, size_t dictSize); -size_t ZSTD_CCtx_loadDictionary_byReference(ZSTD_CCtx* cctx, const void* dict, size_t dictSize); -size_t ZSTD_CCtx_loadDictionary_advanced(ZSTD_CCtx* cctx, const void* dict, size_t dictSize, ZSTD_dictLoadMethod_e dictLoadMethod, ZSTD_dictContentType_e dictContentType); -</b><p> Create an internal CDict from `dict` buffer. - Decompression will have to use same dictionary. 
- @result : 0, or an error code (which can be tested with ZSTD_isError()). - Special: Adding a NULL (or 0-size) dictionary invalidates previous dictionary, - meaning "return to no-dictionary mode". - Note 1 : Dictionary will be used for all future compression jobs. - To return to "no-dictionary" situation, load a NULL dictionary - Note 2 : Loading a dictionary involves building tables, which are dependent on compression parameters. - For this reason, compression parameters cannot be changed anymore after loading a dictionary. - It's also a CPU consuming operation, with non-negligible impact on latency. - Note 3 :`dict` content will be copied internally. - Use ZSTD_CCtx_loadDictionary_byReference() to reference dictionary content instead. - In such a case, dictionary buffer must outlive its users. - Note 4 : Use ZSTD_CCtx_loadDictionary_advanced() - to precisely select how dictionary content must be interpreted. -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_refCDict(ZSTD_CCtx* cctx, const ZSTD_CDict* cdict); -</b><p> Reference a prepared dictionary, to be used for all next compression jobs. - Note that compression parameters are enforced from within CDict, - and supercede any compression parameter previously set within CCtx. - The dictionary will remain valid for future compression jobs using same CCtx. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - Special : adding a NULL CDict means "return to no-dictionary mode". - Note 1 : Currently, only one dictionary can be managed. - Adding a new dictionary effectively "discards" any previous one. - Note 2 : CDict is just referenced, its lifetime must outlive CCtx. -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_refPrefix(ZSTD_CCtx* cctx, - const void* prefix, size_t prefixSize); -size_t ZSTD_CCtx_refPrefix_advanced(ZSTD_CCtx* cctx, - const void* prefix, size_t prefixSize, - ZSTD_dictContentType_e dictContentType); -</b><p> Reference a prefix (single-usage dictionary) for next compression job. 
- Decompression will need same prefix to properly regenerate data. - Compressing with a prefix is similar in outcome as performing a diff and compressing it, - but performs much faster, especially during decompression (compression speed is tunable with compression level). - Note that prefix is **only used once**. Tables are discarded at end of compression job (ZSTD_e_end). - @result : 0, or an error code (which can be tested with ZSTD_isError()). - Special: Adding any prefix (including NULL) invalidates any previous prefix or dictionary - Note 1 : Prefix buffer is referenced. It **must** outlive compression job. - Its contain must remain unmodified up to end of compression (ZSTD_e_end). - Note 2 : If the intention is to diff some large src data blob with some prior version of itself, - ensure that the window size is large enough to contain the entire source. - See ZSTD_p_windowLog. - Note 3 : Referencing a prefix involves building tables, which are dependent on compression parameters. - It's a CPU consuming operation, with non-negligible impact on latency. - If there is a need to use same prefix multiple times, consider loadDictionary instead. - Note 4 : By default, the prefix is treated as raw content (ZSTD_dm_rawContent). - Use ZSTD_CCtx_refPrefix_advanced() to alter dictMode. -</p></pre><BR> - -<pre><b>void ZSTD_CCtx_reset(ZSTD_CCtx* cctx); -</b><p> Return a CCtx to clean state. - Useful after an error, or to interrupt an ongoing compression job and start a new one. - Any internal data not yet flushed is cancelled. - The parameters and dictionary are kept unchanged, to reset them use ZSTD_CCtx_resetParameters(). - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_resetParameters(ZSTD_CCtx* cctx); -</b><p> All parameters are back to default values (compression level is ZSTD_CLEVEL_DEFAULT). - Dictionary (if any) is dropped. - Resetting parameters is only possible during frame initialization (before starting compression). - To reset the context use ZSTD_CCtx_reset(). 
- @return 0 or an error code (which can be checked with ZSTD_isError()). - -</p></pre><BR> +<a name="Chapter23"></a><h2>ZSTD_getFrameHeader() :</h2><pre> decode Frame Header, or requires larger `srcSize`. + @return : 0, `zfhPtr` is correctly filled, + >0, `srcSize` is too small, value is wanted `srcSize` amount, + or an error code, which can be tested using ZSTD_isError() +<BR></pre> -<pre><b>typedef enum { - ZSTD_e_continue=0, </b>/* collect more data, encoder decides when to output compressed result, for optimal compression ratio */<b> - ZSTD_e_flush, </b>/* flush any data provided so far,<b> - * it creates (at least) one new block, that can be decoded immediately on reception; - * frame will continue: any future data can still reference previously compressed data, improving compression. */ - ZSTD_e_end </b>/* flush any remaining data and close current frame.<b> - * any additional data starts a new frame. - * each frame is independent (does not reference any content from previous frame). */ -} ZSTD_EndDirective; +<pre><b>size_t ZSTD_getFrameHeader(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize); </b>/**< doesn't consume input */<b> </b></pre><BR> -<pre><b>size_t ZSTD_compress_generic (ZSTD_CCtx* cctx, - ZSTD_outBuffer* output, - ZSTD_inBuffer* input, - ZSTD_EndDirective endOp); -</b><p> Behave about the same as ZSTD_compressStream. To note : - - Compression parameters are pushed into CCtx before starting compression, using ZSTD_CCtx_setParameter() - - Compression parameters cannot be changed once compression is started. - - outpot->pos must be <= dstCapacity, input->pos must be <= srcSize - - outpot->pos and input->pos will be updated. They are guaranteed to remain below their respective limit. - - In single-thread mode (default), function is blocking : it completed its job before returning to caller. 
- - In multi-thread mode, function is non-blocking : it just acquires a copy of input, and distribute job to internal worker threads, - and then immediately returns, just indicating that there is some data remaining to be flushed. - The function nonetheless guarantees forward progress : it will return only after it reads or write at least 1+ byte. - - Exception : in multi-threading mode, if the first call requests a ZSTD_e_end directive, it is blocking : it will complete compression before giving back control to caller. - - @return provides a minimum amount of data remaining to be flushed from internal buffers - or an error code, which can be tested using ZSTD_isError(). - if @return != 0, flush is not fully completed, there is still some data left within internal buffers. - This is useful for ZSTD_e_flush, since in this case more flushes are necessary to empty all buffers. - For ZSTD_e_end, @return == 0 when internal buffers are fully flushed and frame is completed. - - after a ZSTD_e_end directive, if internal buffer is not fully flushed (@return != 0), - only ZSTD_e_end or ZSTD_e_flush operations are allowed. - Before starting a new compression job, or changing compression parameters, - it is required to fully flush internal buffers. - -</p></pre><BR> - -<pre><b>size_t ZSTD_compress_generic_simpleArgs ( - ZSTD_CCtx* cctx, - void* dst, size_t dstCapacity, size_t* dstPos, - const void* src, size_t srcSize, size_t* srcPos, - ZSTD_EndDirective endOp); -</b><p> Same as ZSTD_compress_generic(), - but using only integral types as arguments. - Argument list is larger than ZSTD_{in,out}Buffer, - but can be helpful for binders from dynamic languages - which have troubles handling structures containing memory pointers. 
- -</p></pre><BR> - -<pre><b>ZSTD_CCtx_params* ZSTD_createCCtxParams(void); -size_t ZSTD_freeCCtxParams(ZSTD_CCtx_params* params); -</b><p> Quick howto : - - ZSTD_createCCtxParams() : Create a ZSTD_CCtx_params structure - - ZSTD_CCtxParam_setParameter() : Push parameters one by one into - an existing ZSTD_CCtx_params structure. - This is similar to - ZSTD_CCtx_setParameter(). - - ZSTD_CCtx_setParametersUsingCCtxParams() : Apply parameters to - an existing CCtx. - These parameters will be applied to - all subsequent compression jobs. - - ZSTD_compress_generic() : Do compression using the CCtx. - - ZSTD_freeCCtxParams() : Free the memory. - - This can be used with ZSTD_estimateCCtxSize_advanced_usingCCtxParams() - for static allocation for single-threaded compression. - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtxParams_reset(ZSTD_CCtx_params* params); -</b><p> Reset params to default values. - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtxParams_init(ZSTD_CCtx_params* cctxParams, int compressionLevel); -</b><p> Initializes the compression parameters of cctxParams according to - compression level. All other parameters are reset to their default values. - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtxParams_init_advanced(ZSTD_CCtx_params* cctxParams, ZSTD_parameters params); -</b><p> Initializes the compression and frame parameters of cctxParams according to - params. All other parameters are reset to their default values. - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtxParam_setParameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned value); -</b><p> Similar to ZSTD_CCtx_setParameter. - Set one compression parameter, selected by enum ZSTD_cParameter. - Parameters must be applied to a ZSTD_CCtx using ZSTD_CCtx_setParametersUsingCCtxParams(). - Note : when `value` is an enum, cast it to unsigned for proper type checking. - @result : 0, or an error code (which can be tested with ZSTD_isError()). 
- -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtxParam_getParameter(ZSTD_CCtx_params* params, ZSTD_cParameter param, unsigned* value); -</b><p> Similar to ZSTD_CCtx_getParameter. - Get the requested value of one compression parameter, selected by enum ZSTD_cParameter. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - -</p></pre><BR> - -<pre><b>size_t ZSTD_CCtx_setParametersUsingCCtxParams( - ZSTD_CCtx* cctx, const ZSTD_CCtx_params* params); -</b><p> Apply a set of ZSTD_CCtx_params to the compression context. - This can be done even after compression is started, - if nbWorkers==0, this will have no impact until a new compression is started. - if nbWorkers>=1, new parameters will be picked up at next job, - with a few restrictions (windowLog, pledgedSrcSize, nbWorkers, jobSize, and overlapLog are not updated). - -</p></pre><BR> - -<h3>Advanced decompression API</h3><pre></pre><b><pre></b>/* ==================================== */<b> -</pre></b><BR> -<pre><b>size_t ZSTD_DCtx_loadDictionary(ZSTD_DCtx* dctx, const void* dict, size_t dictSize); -size_t ZSTD_DCtx_loadDictionary_byReference(ZSTD_DCtx* dctx, const void* dict, size_t dictSize); -size_t ZSTD_DCtx_loadDictionary_advanced(ZSTD_DCtx* dctx, const void* dict, size_t dictSize, ZSTD_dictLoadMethod_e dictLoadMethod, ZSTD_dictContentType_e dictContentType); -</b><p> Create an internal DDict from dict buffer, - to be used to decompress next frames. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - Special : Adding a NULL (or 0-size) dictionary invalidates any previous dictionary, - meaning "return to no-dictionary mode". - Note 1 : `dict` content will be copied internally. - Use ZSTD_DCtx_loadDictionary_byReference() - to reference dictionary content instead. - In which case, the dictionary buffer must outlive its users. - Note 2 : Loading a dictionary involves building tables, - which has a non-negligible impact on CPU usage and latency. 
- Note 3 : Use ZSTD_DCtx_loadDictionary_advanced() to select - how dictionary content will be interpreted and loaded. - -</p></pre><BR> - -<pre><b>size_t ZSTD_DCtx_refDDict(ZSTD_DCtx* dctx, const ZSTD_DDict* ddict); -</b><p> Reference a prepared dictionary, to be used to decompress next frames. - The dictionary remains active for decompression of future frames using same DCtx. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - Note 1 : Currently, only one dictionary can be managed. - Referencing a new dictionary effectively "discards" any previous one. - Special : adding a NULL DDict means "return to no-dictionary mode". - Note 2 : DDict is just referenced, its lifetime must outlive its usage from DCtx. - -</p></pre><BR> - -<pre><b>size_t ZSTD_DCtx_refPrefix(ZSTD_DCtx* dctx, - const void* prefix, size_t prefixSize); -size_t ZSTD_DCtx_refPrefix_advanced(ZSTD_DCtx* dctx, - const void* prefix, size_t prefixSize, - ZSTD_dictContentType_e dictContentType); -</b><p> Reference a prefix (single-usage dictionary) for next compression job. - This is the reverse operation of ZSTD_CCtx_refPrefix(), - and must use the same prefix as the one used during compression. - Prefix is **only used once**. Reference is discarded at end of frame. - End of frame is reached when ZSTD_DCtx_decompress_generic() returns 0. - @result : 0, or an error code (which can be tested with ZSTD_isError()). - Note 1 : Adding any prefix (including NULL) invalidates any previously set prefix or dictionary - Note 2 : Prefix buffer is referenced. It **must** outlive decompression job. - Prefix buffer must remain unmodified up to the end of frame, - reached when ZSTD_DCtx_decompress_generic() returns 0. - Note 3 : By default, the prefix is treated as raw content (ZSTD_dm_rawContent). - Use ZSTD_CCtx_refPrefix_advanced() to alter dictMode. - Note 4 : Referencing a raw content prefix has almost no cpu nor memory cost. - A fulldict prefix is more costly though. 
- -</p></pre><BR> - -<pre><b>size_t ZSTD_DCtx_setMaxWindowSize(ZSTD_DCtx* dctx, size_t maxWindowSize); -</b><p> Refuses allocating internal buffers for frames requiring a window size larger than provided limit. - This is useful to prevent a decoder context from reserving too much memory for itself (potential attack scenario). - This parameter is only useful in streaming mode, since no internal buffer is allocated in direct mode. - By default, a decompression context accepts all window sizes <= (1 << ZSTD_WINDOWLOG_MAX) - @return : 0, or an error code (which can be tested using ZSTD_isError()). - -</p></pre><BR> - -<pre><b>size_t ZSTD_DCtx_setFormat(ZSTD_DCtx* dctx, ZSTD_format_e format); -</b><p> Instruct the decoder context about what kind of data to decode next. - This instruction is mandatory to decode data without a fully-formed header, - such ZSTD_f_zstd1_magicless for example. - @return : 0, or an error code (which can be tested using ZSTD_isError()). - -</p></pre><BR> - -<pre><b>size_t ZSTD_getFrameHeader_advanced(ZSTD_frameHeader* zfhPtr, - const void* src, size_t srcSize, ZSTD_format_e format); +<pre><b>size_t ZSTD_getFrameHeader_advanced(ZSTD_frameHeader* zfhPtr, const void* src, size_t srcSize, ZSTD_format_e format); +size_t ZSTD_decodingBufferSize_min(unsigned long long windowSize, unsigned long long frameContentSize); </b>/**< when frame content size is not known, pass in frameContentSize == ZSTD_CONTENTSIZE_UNKNOWN */<b> </b><p> same as ZSTD_getFrameHeader(), with added capability to select a format (like ZSTD_f_zstd1_magicless) </p></pre><BR> -<pre><b>size_t ZSTD_decompress_generic(ZSTD_DCtx* dctx, - ZSTD_outBuffer* output, - ZSTD_inBuffer* input); -</b><p> Behave the same as ZSTD_decompressStream. - Decompression parameters cannot be changed once decompression is started. - @return : an error code, which can be tested using ZSTD_isError() - if >0, a hint, nb of expected input bytes for next invocation. 
- `0` means : a frame has just been fully decoded and flushed. - -</p></pre><BR> - -<pre><b>size_t ZSTD_decompress_generic_simpleArgs ( - ZSTD_DCtx* dctx, - void* dst, size_t dstCapacity, size_t* dstPos, - const void* src, size_t srcSize, size_t* srcPos); -</b><p> Same as ZSTD_decompress_generic(), - but using only integral types as arguments. - Argument list is larger than ZSTD_{in,out}Buffer, - but can be helpful for binders from dynamic languages - which have troubles handling structures containing memory pointers. - -</p></pre><BR> - -<pre><b>void ZSTD_DCtx_reset(ZSTD_DCtx* dctx); -</b><p> Return a DCtx to clean state. - If a decompression was ongoing, any internal data not yet flushed is cancelled. - All parameters are back to default values, including sticky ones. - Dictionary (if any) is dropped. - Parameters can be modified again after a reset. - -</p></pre><BR> - -<a name="Chapter21"></a><h2>Block level API</h2><pre></pre> +<pre><b>typedef enum { ZSTDnit_frameHeader, ZSTDnit_blockHeader, ZSTDnit_block, ZSTDnit_lastBlock, ZSTDnit_checksum, ZSTDnit_skippableFrame } ZSTD_nextInputType_e; +</b></pre><BR> +<a name="Chapter24"></a><h2>Block level API</h2><pre></pre> <pre><b></b><p> Frame metadata cost is typically ~18 bytes, which can be non-negligible for very small blocks (< 100 bytes). User will have to take in charge required information to regenerate data, such as compressed and content sizes. @@ -1273,10 +1435,10 @@ size_t ZSTD_DCtx_refPrefix_advanced(ZSTD_DCtx* dctx, + copyCCtx() and copyDCtx() can be used too - Block size is limited, it must be <= ZSTD_getBlockSize() <= ZSTD_BLOCKSIZE_MAX == 128 KB + If input is larger than a block size, it's necessary to split input data into multiple blocks - + For inputs larger than a single block size, consider using the regular ZSTD_compress() instead. + + For inputs larger than a single block, really consider using regular ZSTD_compress() instead. 
Frame metadata is not that costly, and quickly becomes negligible as source size grows larger. - When a block is considered not compressible enough, ZSTD_compressBlock() result will be zero. - In which case, nothing is produced into `dst`. + In which case, nothing is produced into `dst` ! + User must test for such outcome and deal directly with uncompressed data + ZSTD_decompressBlock() doesn't accept uncompressed data as input !!! + In case of multiple successive blocks, should some of them be uncompressed, |
