Code Examples
package main

import (
	"bytes"
	"fmt"

	"github.com/klauspost/compress/zstd"
)

func main() {
	// "Raw" dictionaries can be used for compressed delta encoding.
	source := []byte(`
This is the source file. Compression of the target file with
the source file as the dictionary will produce a compressed
delta encoding of the target file.`)
	target := []byte(`
This is the target file. Decompression of the delta encoding with
the source file as the dictionary will produce this file.`)

	// The dictionary id is arbitrary. We use zero for compatibility
	// with zstd --patch-from, but applications can use any id
	// not in the range [32768, 1<<31).
	const id = 0

	bestLevel := zstd.WithEncoderLevel(zstd.SpeedBestCompression)
	w, _ := zstd.NewWriter(nil, bestLevel,
		zstd.WithEncoderDictRaw(id, source))
	delta := w.EncodeAll(target, nil)

	r, _ := zstd.NewReader(nil, zstd.WithDecoderDictRaw(id, source))
	out, err := r.DecodeAll(delta, nil)
	if err != nil || !bytes.Equal(out, target) {
		panic("decoding error")
	}

	// Ordinary compression, for reference.
	w, _ = zstd.NewWriter(nil, bestLevel)
	compressed := w.EncodeAll(target, nil)

	// Check that the delta is at most half as big as the compressed file.
	fmt.Println(len(delta) < len(compressed)/2)
}
Package-Level Type Names (total 64, in which 8 are exported)
CompatV155 will make the dictionary compatible with Zstd v1.5.5 and earlier.
See https://github.com/facebook/zstd/issues/3724
Content to use to create dictionary tables.
DebugOut will write stats and other details here if set.
History to use for all blocks.
Dictionary ID.
Use the specified encoder level.
The dictionary will be built using the specified encoder level,
which will reflect speed and make the dictionary tailored for that level.
If not set SpeedBestCompression will be used.
Offsets to use.
func BuildDict(o BuildDictOptions) ([]byte, error)
Decoder provides decoding of zstandard streams.
The decoder has been designed to operate without allocations after a warmup.
This means that you should store the decoder for best performance.
To re-use a stream decoder, use Reset(r io.Reader) error to switch to another stream.
A decoder can safely be re-used even if the previous stream failed.
To release the resources, you must call the Close() function on a decoder.
Current read position used for Reader functionality.
Unreferenced decoders, ready for use.
frame *frameDec
o decoderOptions
streamWg is the waitgroup for all streams.
sync stream decoding
Close will release all resources.
It is NOT possible to reuse the decoder after this.
DecodeAll allows stateless decoding of a blob of bytes.
Output will be appended to dst, so if the destination size is known
you can pre-allocate the destination slice to avoid allocations.
DecodeAll can be used concurrently.
The Decoder concurrency limits will be respected.
IOReadCloser returns the decoder as an io.ReadCloser for convenience.
Any changes to the decoder will be reflected, so the returned ReadCloser
can be reused along with the decoder.
io.WriterTo is also supported by the returned ReadCloser.
Read bytes from the decompressed stream into p.
Returns the number of bytes read and any error that occurred.
When the stream is done, io.EOF will be returned.
Reset will reset the decoder to the supplied stream after the current one has finished processing.
Note that this functionality cannot be used after Close has been called.
Reset can be called with a nil reader to release references to the previous reader.
After being called with a nil reader, no operations other than Reset, DecodeAll or Close
should be used.
ResetWithOptions will reset the decoder and apply the given options
for the next stream or DecodeAll operation.
Options are applied on top of the existing options.
Some options cannot be changed on reset and will return an error.
WriteTo writes data to w until there's no more data to write or when an error occurs.
The return value n is the number of bytes written.
Any error encountered during the write is also returned.
drainOutput will drain the output until errEndOfStream is sent.
nextBlock returns the next block.
If an error occurs d.err will be set.
Optionally the function can block for new output.
If non-blocking mode is used the returned boolean will be false
if no data was available without blocking.
(*Decoder) nextBlockSync() (ok bool)
(*Decoder) setDict(frame *frameDec) (err error)
Create Decoder:
ASYNC:
Spawn 3 goroutines.
0: Read frames and decode block literals.
1: Decode sequences.
2: Execute sequences, send to output.
(*Decoder) startSyncDecoder(r io.Reader) error
(*Decoder) stashDecoder()
*Decoder : io.Reader
*Decoder : io.WriterTo
func NewReader(r io.Reader, opts ...DOption) (*Decoder, error)
Encoder provides encoding to Zstandard.
An Encoder can be used for either compressing a stream via the
io.WriteCloser interface supported by the Encoder or as multiple independent
tasks via the EncodeAll function.
Smaller encodes are encouraged to use the EncodeAll function.
Use NewWriter to create a new instance.
encoders chan encoder
init sync.Once
o encoderOptions
state encoderState
Close will flush the final output and close the stream.
The function will block until everything has been written.
The Encoder can still be re-used after calling this.
EncodeAll will encode all input in src and append it to dst.
This function can be called concurrently, but each call will only run on a single goroutine.
If empty input is given, nothing is returned, unless WithZeroFrames is specified.
Encoded blocks can be concatenated and the result will be the combined input stream.
Data compressed with EncodeAll can be decoded with the Decoder,
using either a stream or DecodeAll.
Flush will send the currently written data to output
and block until everything has been written.
This should only be used on rare occasions where pushing the currently queued data is critical.
MaxEncodedSize returns the expected maximum
size of an encoded block or stream.
ReadFrom reads data from r until EOF or error.
The return value n is the number of bytes read.
Any error except io.EOF encountered during the read is also returned.
The Copy function uses ReaderFrom if available.
Reset will re-initialize the writer and new writes will encode to the supplied writer
as a new, independent stream.
ResetContentSize will reset and set a content size for the next stream.
If the number of bytes written does not match the size given, an error will be returned
when calling Close().
This is removed when Reset is called.
Sizes <= 0 result in no content size being set.
ResetWithOptions will re-initialize the writer and apply the given options
as a new, independent stream.
Options are applied on top of the existing options.
Some options cannot be changed on reset and will return an error.
Write data to the encoder.
Input data will be buffered and as the buffer fills up
content will be compressed and written to the output.
When done writing, use Close to flush the remaining output
and write CRC if requested.
(*Encoder) encodeAll(enc encoder, src, dst []byte) []byte
(*Encoder) initialize()
nextBlock will synchronize and start compressing input in e.state.filling.
If an error has occurred during encoding it will be returned.
*Encoder : internal/bisect.Writer
*Encoder : io.Closer
*Encoder : io.ReaderFrom
*Encoder : io.WriteCloser
*Encoder : io.Writer
*Encoder : github.com/refraction-networking/utls.transcriptHash
*Encoder : crypto/tls.transcriptHash
func NewWriter(w io.Writer, opts ...EOption) (*Encoder, error)
Header contains information about the first frame and block within that.
Dictionary ID.
If 0, no dictionary.
First block information.
FrameContentSize is the expected uncompressed size of the entire frame.
If set there is a checksum present for the block content.
The checksum field at the end is always 4 bytes long.
HasFCS specifies whether FrameContentSize has a valid value.
HeaderSize is the raw size of the frame header.
For normal frames, it includes the size of the magic number and
the size of the header (per section 3.1.1.1).
It does not include the size for any data blocks (section 3.1.1.2) nor
the size for the trailing content checksum.
For skippable frames, this counts the size of the magic number
along with the size of the size field of the payload.
It does not include the size of the skippable payload itself.
The total frame size is the HeaderSize plus the SkippableSize.
SingleSegment specifies whether the data is to be decompressed into a
single contiguous memory segment.
It implies that WindowSize is invalid and that FrameContentSize is valid.
Skippable will be true if the frame is meant to be skipped.
This implies that FirstBlock.OK is false.
SkippableID is the user-specific ID for the skippable frame.
Valid values are between 0 and 15, inclusive.
SkippableSize is the length of the user data to skip following
the header.
WindowSize is the window of data to keep while decoding.
Will only be set if SingleSegment is false.
AppendTo will append the encoded header to the dst slice.
There is no error checking performed on the header values.
Decode the header from the beginning of the stream.
This will decode the frame header and the first block header if enough bytes are provided.
It is recommended to provide at least HeaderMaxSize bytes.
If the frame header cannot be read an error will be returned.
If there isn't enough input, io.ErrUnexpectedEOF is returned.
The FirstBlock.OK will indicate if enough information was available to decode the first block header.
DecodeAndStrip will decode the header from the beginning of the stream
and on success return the remaining bytes.
This will decode the frame header and the first block header if enough bytes are provided.
It is recommended to provide at least HeaderMaxSize bytes.
If the frame header cannot be read an error will be returned.
If there isn't enough input, io.ErrUnexpectedEOF is returned.
The FirstBlock.OK will indicate if enough information was available to decode the first block header.
SnappyConverter can read Snappy-compressed streams and convert them to zstd.
Conversion is done by converting the stream directly from Snappy without intermediate
full decoding.
Therefore the compression ratio is much lower than what can be achieved by a full decompression
and compression, and a faulty Snappy stream may lead to a faulty Zstandard stream without
any errors being generated.
No CRC value is being generated and not all CRC values of the Snappy stream are checked.
However, it provides really fast recompression of Snappy streams.
The converter can be reused to avoid allocations, even after errors.
block *blockEnc
buf []byte
err error
r io.Reader
Convert the Snappy stream supplied in 'in' and write the Zstandard stream to 'w'.
If any error is detected on the Snappy stream it is returned.
The number of bytes written is returned.
(*SnappyConverter) readFull(p []byte, allowEOF bool) (ok bool)
bestFastEncoder uses 2 tables, one for short matches (5 bytes) and one for long matches.
The long match table contains the previous entry with the same hash,
effectively making it a "chain" of length 2.
When we find a long match we choose between the two values and select the longest.
When we find a short match, after checking the long, we check if we can find a long at n+1
and that it is longer (lazy matching).
dictLongTable []prevEntry
dictTable []prevEntry
fastBase fastBase
fastBase.blk *blockEnc
fastBase.bufferReset int32
fastBase.crc *xxhash.Digest
cur is the offset at the start of hist
fastBase.hist []byte
fastBase.lastDict *dict
fastBase.lowMem bool
maximum offset. Should be at least 2x block size.
fastBase.tmp [8]byte
longTable [4194304]prevEntry
table [262144]prevEntry
AppendCRC will append the CRC to the destination slice and return it.
Block returns the current block.
CRC returns the underlying CRC writer.
Encode improves compression...
EncodeNoHist will encode a block with no history and no following blocks.
Most notable difference is that src will not be copied for history and
we do not need to check for max match length.
Reset will reset and set a dictionary if not nil
useBlock will replace the block with the provided one,
but transfer recent offsets from the previous.
WindowSize returns the window size of the encoder,
or a window size small enough to contain the input size, if > 0.
(*bestFastEncoder) addBlock(src []byte) int32
ensureHist will ensure that history can keep at least this many bytes.
(*bestFastEncoder) matchlen(s, t int32, src []byte) int32
Reset the encoding table.
*bestFastEncoder : encoder
betterFastEncoder uses 2 tables, one for short matches (5 bytes) and one for long matches.
The long match table contains the previous entry with the same hash,
effectively making it a "chain" of length 2.
When we find a long match we choose between the two values and select the longest.
When we find a short match, after checking the long, we check if we can find a long at n+1
and that it is longer (lazy matching).
fastBase fastBase
fastBase.blk *blockEnc
fastBase.bufferReset int32
fastBase.crc *xxhash.Digest
cur is the offset at the start of hist
fastBase.hist []byte
fastBase.lastDict *dict
fastBase.lowMem bool
maximum offset. Should be at least 2x block size.
fastBase.tmp [8]byte
longTable [524288]prevEntry
table [8192]tableEntry
AppendCRC will append the CRC to the destination slice and return it.
Block returns the current block.
CRC returns the underlying CRC writer.
Encode improves compression...
EncodeNoHist will encode a block with no history and no following blocks.
Most notable difference is that src will not be copied for history and
we do not need to check for max match length.
ResetDict will reset and set a dictionary if not nil
useBlock will replace the block with the provided one,
but transfer recent offsets from the previous.
WindowSize returns the window size of the encoder,
or a window size small enough to contain the input size, if > 0.
(*betterFastEncoder) addBlock(src []byte) int32
ensureHist will ensure that history can keep at least this many bytes.
(*betterFastEncoder) matchlen(s, t int32, src []byte) int32
Reset the encoding table.
*betterFastEncoder : encoder
bitWriter will write bits.
First bit will be LSB of the first byte of output.
bitContainer uint64
nBits uint8
out []byte
addBits16Clean will add up to 16 bits.
value may not contain more set bits than indicated.
It will not check if there is space for them, so the caller must ensure that it has flushed recently.
addBits16NC will add up to 16 bits.
It will not check if there is space for them,
so the caller must ensure that it has flushed recently.
addBits32Clean will add up to 32 bits.
It will not check if there is space for them.
The input must not contain more bits than specified.
addBits32NC will add up to 31 bits.
It will not check if there is space for them,
so the caller must ensure that it has flushed recently.
addBits64NC will add up to 64 bits.
There must be space for 32 bits.
close will write the alignment bit and write the final byte(s)
to the output.
flush32 will flush out, so there are at least 32 bits available for writing.
flushAlign will flush remaining full bytes and align to next byte boundary.
reset and continue writing by appending to out.
Is this the last block of a frame?
Block is RLE, this is the size.
Type blockType
Window size of the block.
async struct{newHist *history; literals []byte; seqData []byte; seqSize int; fcs uint64}
Check against this crc, if hasCRC is true.
Raw source data of the block.
dataStorage []byte
Destination of the decoded data.
err error
hasCRC bool
Buffer for literals data.
Frame to use for singlethreaded decoding.
Should not be used by the decoder itself since parent may be another frame.
Use less memory
sequence []seqVals
Close will release resources.
Closed blockDec cannot be reset.
(*blockDec) String() string
decodeBuf
decodeCompressed will start decompressing a block.
(*blockDec) decodeLiterals(in []byte, hist *history) (remain []byte, err error)
(*blockDec) decodeSequences(hist *history) error
(*blockDec) executeSequences(hist *history) error
(*blockDec) prepareSequences(in []byte, hist *history) (err error)
reset will reset the block.
Input must be a start of a block and will be at the end of the block when returned.
sendEOF will make the decoder send EOF on this frame.
(*blockDec) updateHistory(hist *history) error
*blockDec : fmt.Stringer
*blockDec : context.stringer
*blockDec : runtime.stringer
func newBlockDec(lowMem bool) *blockDec
coders seqCoders
dictLitEnc *huff0.Scratch
extraLits int
last bool
litEnc *huff0.Scratch
literals []byte
lowMem bool
output []byte
prevRecentOffsets [3]uint32
recentOffsets [3]uint32
sequences []seq
size int
wr bitWriter
encode will encode the block and append the output in b.output.
Previous offset codes must be pushed if more blocks are expected.
encodeLits can be used if the block is only litLen.
encodeRLE will encode an RLE block.
encodeRaw can be used to set the output to a raw representation of supplied bytes.
encodeRaw can be used to set the output to a raw representation of supplied bytes.
(*blockEnc) genCodes()
init should be used once the block has been created.
If called more than once, the effect is the same as calling reset.
initNewEncode can be used to reset offsets and encoders to the initial state.
matchOffset will adjust recent offsets and return the adjusted one,
if it matches a previous offset.
pushOffsets will push the recent offsets to the backup store.
pushOffsets will push the recent offsets to the backup store.
reset will reset the block for a new encode, but in the same stream,
meaning that state will be carried over, but the block content is reset.
If a previous block is provided, the recent offsets are carried over.
reset will reset the block for a new encode, but in the same stream,
meaning that state will be carried over, but the block content is reset.
If a previous block is provided, the recent offsets are carried over.
func decodeSnappy(blk *blockEnc, src []byte) error
blockHeader contains the information for a block header.
String returns a string representation of the block.
appendTo will append the block header to a slice.
setLast sets the 'last' indicator on a block.
setSize will store the compressed size of a block.
setType sets the block type.
blockHeader : fmt.Stringer
blockHeader : context.stringer
blockHeader : runtime.stringer
Read >8 bytes.
MAY use the destination slice.
Read a single byte.
Read up to 8 bytes.
Returns io.ErrUnexpectedEOF if this cannot be satisfied.
Skip n bytes.
*byteBuf
*readerWrapper
byteReader provides a byte reader that reads
little endian values from a byte stream.
The input stream is manually advanced.
The reader performs no bounds checks.
b []byte
off int
Int32 returns a little endian int32 starting at current offset.
Uint32 returns a little endian uint32 starting at current offset.
Uint32NC returns a little endian uint32 starting at current offset.
The caller must be sure that there are at least 4 bytes left.
Uint8 returns the next byte.
advance the stream by n bytes.
overread returns whether we have advanced too far.
remain will return the number of bytes remaining.
unread returns the unread portion of the input.
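The byteReader description above (little endian reads at the current offset, manual advancing, no bounds checks) can be sketched with the standard library alone. The type leReader below is a hypothetical stand-in written for illustration, not the package's internal type. As an assumption taken from the phrase "manually advanced", its read methods do not move the offset themselves.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// leReader is a hypothetical little endian reader over a byte slice.
// Reads do not advance the offset; the caller calls advance explicitly.
type leReader struct {
	b   []byte
	off int
}

// Uint32 returns a little endian uint32 starting at the current offset.
// There is no bounds check: it panics if fewer than 4 bytes remain.
func (r *leReader) Uint32() uint32 {
	return binary.LittleEndian.Uint32(r.b[r.off:])
}

// Uint8 returns the byte at the current offset.
func (r *leReader) Uint8() uint8 { return r.b[r.off] }

// advance moves the offset forward by n bytes.
func (r *leReader) advance(n int) { r.off += n }

// remain returns the number of bytes left after the current offset.
func (r *leReader) remain() int { return len(r.b) - r.off }

// overread reports whether the offset has moved past the end of the input.
func (r *leReader) overread() bool { return r.off > len(r.b) }

func main() {
	r := leReader{b: []byte{0x78, 0x56, 0x34, 0x12, 0xFF}}
	fmt.Printf("%#x\n", r.Uint32())
	r.advance(4)
	fmt.Printf("%#x\n", r.Uint8())
	r.advance(1)
	fmt.Println(r.remain(), r.overread())
}
```

Leaving bounds checks to the caller keeps the hot read path short, at the cost of requiring a remain or overread check wherever the input length is not already known.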
closeWrapper wraps a function call as a closer.
d *Decoder
Close closes the decoder.
Read forwards read calls to the decoder.
WriteTo forwards WriteTo calls to the decoder.
closeWrapper : io.Closer
closeWrapper : io.ReadCloser
closeWrapper : io.Reader
closeWrapper : io.WriterTo
cState contains the compression state of a stream.
bw *bitWriter
state uint16
stateTable []uint16
flush will write the tablelog to the output and flush the remaining full bytes.
init will initialize the compression state to the first symbol of the stream.
decoderState is used for maintaining state when the decoder
is used for streaming.
cancel remaining output.
crc of current frame
current block being written to stream.
decodeOutput.b []byte
decodeOutput.d *blockDec
decodeOutput.err error
flushed bool
output in order to be written to stream.
decSymbol contains information about a state entry,
including the state offset base, the output symbol and
the number of bits to read for the low part of the destination state.
Using a composite uint64 is faster than a struct with separate members.
(decSymbol) addBits() uint8
(decSymbol) baselineInt() int
final returns the current state symbol without decoding the next.
(decSymbol) nbBits() uint8
(decSymbol) newState() uint16
(*decSymbol) setAddBits(addBits uint8)
(*decSymbol) setExt(addBits uint8, baseline uint32)
(*decSymbol) setNBits(nBits uint8)
(*decSymbol) setNewState(state uint16)
func decSymbolValue(symb uint8, t []baseOffset) (decSymbol, error)
func newDecSymbol(nbits, addBits uint8, newState uint16, baseline uint32) decSymbol
blk *blockEnc
bufferReset int32
crc *xxhash.Digest
cur is the offset at the start of hist
hist []byte
lastDict *dict
lowMem bool
maximum offset. Should be at least 2x block size.
tmp [8]byte
AppendCRC will append the CRC to the destination slice and return it.
Block returns the current block.
CRC returns the underlying CRC writer.
useBlock will replace the block with the provided one,
but transfer recent offsets from the previous.
WindowSize returns the window size of the encoder,
or a window size small enough to contain the input size, if > 0.
(*fastBase) addBlock(src []byte) int32
ensureHist will ensure that history can keep at least this many bytes.
(*fastBase) matchlen(s, t int32, src []byte) int32
Reset the encoding table.
fastBase fastBase
fastBase.blk *blockEnc
fastBase.bufferReset int32
fastBase.crc *xxhash.Digest
cur is the offset at the start of hist
fastBase.hist []byte
fastBase.lastDict *dict
fastBase.lowMem bool
maximum offset. Should be at least 2x block size.
fastBase.tmp [8]byte
table [32768]tableEntry
AppendCRC will append the CRC to the destination slice and return it.
Block returns the current block.
CRC returns the underlying CRC writer.
Encode mimics functionality in zstd_fast.c
EncodeNoHist will encode a block with no history and no following blocks.
Most notable difference is that src will not be copied for history and
we do not need to check for max match length.
ResetDict will reset and set a dictionary if not nil
useBlock will replace the block with the provided one,
but transfer recent offsets from the previous.
WindowSize returns the window size of the encoder,
or a window size small enough to contain the input size, if > 0.
(*fastEncoder) addBlock(src []byte) int32
ensureHist will ensure that history can keep at least this many bytes.
(*fastEncoder) matchlen(s, t int32, src []byte) int32
Reset the encoding table.
*fastEncoder : encoder
DictionaryID uint32
FrameContentSize uint64
HasCheckSum bool
SingleSegment bool
WindowSize uint64
Byte buffer that can be reused for small input blocks.
crc *xxhash.Digest
Frame history passed between blocks
o decoderOptions
rawInput byteBuffer
checkCRC will check the checksum, assuming the frame has one.
Will return ErrCRCMismatch if crc check failed, otherwise nil.
consumeCRC skips over the checksum, assuming the frame has one.
next will start decoding the next block from stream.
reset will read the frame header and prepare for block decoding.
If nothing can be read from the input, io.EOF will be returned.
Any other error indicates that the stream contained data, but
there was a problem.
runDecoder will run the decoder for the remainder of the frame.
func newFrameDec(o decoderOptions) *frameDec
func (*Decoder).setDict(frame *frameDec) (err error)
fseDecoder provides temporary storage for compression and decompression.
Selected tablelog.
Decompression table.
Maximum number of additional bits
norm [256]int16
preDefined bool
used for table creation to avoid allocations.
Length of active part of the symbol table.
buildDtable will build the decoding table.
(*fseDecoder) mustReadFrom(r io.Reader)
readNCount will read the symbol distribution so decoding tables can be constructed.
setRLE will set the decoder to RLE mode.
transform will transform the decoder table into a table usable for
decoding without having to apply the transformation while decoding.
The state will contain the base value and the number of bits to read.
func buildDtable_asm(s *fseDecoder, ctx *buildDtableAsmContext) int
Scratch provides temporary storage for compression and decompression.
Selected tablelog.
clear count
TODO: Technically zstd should be fine with 64 bytes.
Compression tables.
Maximum output bits after transform.
count of the most probable symbol
norm [256]int16
This encoder is predefined.
Set to know when the encoder has been reused.
RLE Symbol
Length of active part of the symbol table.
This encoder is for RLE
no bits has prob > 50%.
Histogram allows populating the histogram and skipping that step in the compression.
It otherwise allows inspecting the histogram when compression is done.
To indicate that you have populated the histogram call HistogramFinished
with the value of the highest populated symbol, as well as the number of entries
in the most populated entry. These are accepted at face value.
HistogramFinished can be called to indicate that the histogram has been populated.
maxSymbol is the index of the highest set symbol of the next data segment.
maxCount is the number of entries in the most populated entry.
These are accepted at face value.
allocCtable will allocate tables needed for compression.
If existing tables are big enough, they are simply re-used.
Returns the cost in bits of encoding the distribution in count using ctable.
Histogram should only be up to the last non-zero symbol.
Returns -1 if ctable cannot represent all the symbols in count.
Approximate symbol cost, as fractional value, using fixed-point format (accuracyLog fractional bits)
note 1 : assume symbolValue is valid (<= maxSymbolValue)
note 2 : if freq[symbolValue]==0, @return a fake cost of tableLog+1 bits
buildCTable will populate the compression table so it is ready to be used.
maxHeaderSize returns the maximum header size in bits.
This is not exact size, but we want a penalty for new tables anyway.
normalizeCount will normalize the count of the symbols so
the total is equal to the table size.
If successful, compression tables will also be made ready.
Secondary normalization method.
To be used when primary method fails.
optimalTableLog calculates and sets the optimal tableLog in s.actualTableLog
setBits will set output bits for the transform.
If nil is provided, the number of bits is equal to the index.
(*fseEncoder) setRLE(val byte)
validateNorm validates the normalized histogram table.
writeCount will write the normalized histogram count to header.
This is read back by readNCount.
history contains the information transferred between blocks.
needed?
History buffer...
Sequence decompression
dict *dict
error bool
Literal decompression
ignoreBuffer is meant to ignore a number of bytes
when checking for matches in history
recentOffsets [3]int
windowSize int
append bytes to history.
This function will make sure there is space for it,
if the buffer has been allocated with enough extra space.
append bytes to history without ever discarding anything.
ensureBlock will ensure there is space for at least one block...
(*history) freeHuffDecoder()
reset will reset the history to initial state of a frame.
The history must already have been initialized to the desired size.
(*history) setDict(dict *dict)
literalsHeader contains literals header information.
(literalsHeader) String() string
appendTo will append the literals header to a byte slice.
setSize can be used to set a single size, for uncompressed and RLE content.
setSizes will set the size of a compressed literals section and the input length.
setType can be used to set the type of literal block.
size returns the output size with currently set values.
literalsHeader : fmt.Stringer
literalsHeader : context.stringer
literalsHeader : runtime.stringer
litLen uint32
Codes are stored here for the encoder
so they only have to be looked up once.
matchLen uint32
Codes are stored here for the encoder
so they only have to be looked up once.
Codes are stored here for the encoder
so they only have to be looked up once.
offset uint32
(seq) String() string
seq : fmt.Stringer
seq : context.stringer
seq : runtime.stringer
decoder keeps track of the current state and updates it from the bitstream.
repeat bool
state fseState
init the state of the decoder with input from stream.
DecodeTo appends the decoded data from src to dst.
The maximum decoded size is 1GiB,
not including what may already be in dst.
EncoderLevelFromString will convert a string representation of an encoding level back
to a compression level. The comparison is not case sensitive.
If the string wasn't recognized, (false, SpeedDefault) will be returned.
EncoderLevelFromZstd will return the encoder level that most closely matches the compression
ratio of a specific zstd compression level.
Many input values will provide the same compression level.
EncodeTo appends the encoded data from src to dst.
IgnoreChecksum allows to forcibly ignore checksum checking.
Can be changed with ResetWithOptions.
InspectDictionary loads a zstd dictionary and provides functions to inspect the content.
NewReader creates a new decoder.
A nil Reader can be provided in which case Reset can be used to start a decode.
A Decoder can be used in two modes:
1) As a stream, or
2) For stateless decoding using DecodeAll.
Only a single stream can be decoded concurrently, but the same decoder
can run multiple concurrent stateless decodes. It is even possible to
use stateless decodes while a stream is being decoded.
The Reset function can be used to initiate a new stream, which will considerably
reduce the allocations normally caused by NewReader.
NewWriter will create a new Zstandard encoder.
If the encoder will be used for encoding blocks a nil writer can be used.
WithAllLitEntropyCompression will apply entropy compression if no matches are found.
Disabling this will skip incompressible data faster, but in cases with no matches but
skewed character distribution compression is lost.
Default value depends on the compression level selected.
Can be changed with ResetWithOptions.
WithDecodeAllCapLimit will limit DecodeAll to decoding cap(dst)-len(dst) bytes,
or any size set in WithDecoderMaxMemory.
This can be used to limit decoding to a specific maximum output size.
Disabled by default.
Can be changed with ResetWithOptions.
WithDecodeBuffersBelow will fully decode readers that have a
`Bytes() []byte` and `Len() int` interface similar to bytes.Buffer.
This typically uses fewer allocations but will have the full decompressed object in memory.
Note that DecodeAllCapLimit will disable this, as well as giving a size of 0 or less.
Default is 128KiB.
Cannot be changed with ResetWithOptions.
WithDecoderConcurrency sets the number of created decoders.
When decoding blocks with DecodeAll, this will limit the number
of possible concurrently running decodes.
When decoding streams, this will limit the number of
inflight blocks.
When decoding streams and setting maximum to 1,
no async decoding will be done.
The value supplied must be at least 0.
When a value of 0 is provided GOMAXPROCS will be used.
By default this will be set to 4 or GOMAXPROCS, whichever is lower.
Cannot be changed with ResetWithOptions.
WithDecoderDictDelete removes dictionaries by ID.
If no ids are passed, all dictionaries are deleted.
Should be used with ResetWithOptions.
WithDecoderDictRaw registers a dictionary that may be used by the decoder.
The slice content can be arbitrary data.
Can be changed with ResetWithOptions.
WithDecoderDicts allows to register one or more dictionaries for the decoder.
Each slice in dict must be in the [dictionary format] produced by
"zstd --train" from the Zstandard reference implementation.
If several dictionaries with the same ID are provided, the last one will be used.
Can be changed with ResetWithOptions.
[dictionary format]: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#dictionary-format
WithDecoderLowmem will set whether to use a lower amount of memory,
but possibly have to allocate more while running.
Cannot be changed with ResetWithOptions.
WithDecoderMaxMemory allows setting a maximum decoded size for in-memory
non-streaming operations or maximum window size for streaming operations.
This can be used to control memory usage of potentially hostile content.
Maximum is 1 << 63 bytes. Default is 64GiB.
Can be changed with ResetWithOptions.
WithDecoderMaxWindow allows setting a maximum window size for decodes.
This allows rejecting packets that will cause big memory usage.
The Decoder will likely allocate more memory based on the WithDecoderLowmem setting.
If WithDecoderMaxMemory is set to a lower value, that will be used.
Default is 512MB, Maximum is ~3.75 TB as per zstandard spec.
Can be changed with ResetWithOptions.
WithEncoderConcurrency will set the concurrency,
meaning the maximum number of encoders to run concurrently.
The value supplied must be at least 0.
When a value of 0 is provided GOMAXPROCS will be used.
For streams, setting a value of 1 will disable async compression.
By default this will be set to GOMAXPROCS.
Cannot be changed with ResetWithOptions.
WithEncoderCRC will add a CRC value to the output.
Output will be 4 bytes larger.
Can be changed with ResetWithOptions.
WithEncoderDict allows registering a dictionary that will be used for the encode.
The slice dict must be in the [dictionary format] produced by
"zstd --train" from the Zstandard reference implementation.
The encoder *may* choose to use no dictionary instead for certain payloads.
Can be changed with ResetWithOptions.
[dictionary format]: https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#dictionary-format
WithEncoderDictDelete clears the dictionary, so no dictionary will be used.
Should be used with ResetWithOptions.
WithEncoderDictRaw registers a dictionary that may be used by the encoder.
The slice content may contain arbitrary data. It will be used as an initial
history.
Can be changed with ResetWithOptions.
WithEncoderLevel specifies a predefined compression level.
Cannot be changed with ResetWithOptions.
WithEncoderPadding will add padding to all output so the size will be a multiple of n.
This can be used to obfuscate the exact output size or make blocks of a certain size.
The contents will be a skippable frame, so it will be invisible to the decoder.
n must be > 0 and <= 1GB, 1<<30 bytes.
The padded area will be filled with data from crypto/rand.Reader.
If `EncodeAll` is used with data already in the destination, the total size will be a multiple of n.
Can be changed with ResetWithOptions.
WithLowerEncoderMem will, in some cases, trade lower memory usage for
slower encoding speed.
This will not change the window size which is the primary function for reducing
memory usage. See WithWindowSize.
Cannot be changed with ResetWithOptions.
WithNoEntropyCompression will always skip entropy compression of literals.
This can be useful if content has matches, but is unlikely to benefit from entropy
compression. Usually the slight speed improvement is not worth enabling this.
Can be changed with ResetWithOptions.
WithSingleSegment will set the "single segment" flag when EncodeAll is used.
If this flag is set, data must be regenerated within a single continuous memory segment.
In this case, Window_Descriptor byte is skipped, but Frame_Content_Size is necessarily present.
As a consequence, the decoder must allocate a memory segment of a size equal to or larger than the size of the content.
In order to preserve the decoder from unreasonable memory requirements,
a decoder is allowed to reject a compressed frame which requests a memory size beyond decoder's authorized range.
For broader compatibility, decoders are recommended to support memory sizes of at least 8 MB.
This is only a recommendation; each decoder is free to support higher or lower limits, depending on local limitations.
If this is not specified, block encodes will automatically choose this based on the input size and the window size.
This setting has no effect on streamed encodes.
Can be changed with ResetWithOptions.
WithWindowSize will set the maximum allowed back-reference distance.
The value must be a power of two between MinWindowSize and MaxWindowSize.
A larger value will enable better compression but allocate more memory and,
for above-default values, take considerably longer.
The default value is determined by the compression level and is at most 8MB.
Cannot be changed with ResetWithOptions.
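The "power of two between MinWindowSize and MaxWindowSize" requirement can be checked with a single bit trick. The sketch below uses local constants: 1 KB for the minimum, as stated for MinWindowSize on this page, and 512 MB for the maximum, which is an assumption inferred from the documented default decoder window limit. The function validWindowSize is a hypothetical helper.

```go
package main

import "fmt"

const (
	minWindowSize = 1 << 10 // 1 KB, as documented for MinWindowSize
	maxWindowSize = 1 << 29 // 512 MB, assumed value of MaxWindowSize
)

// validWindowSize reports whether n is a power of two within the
// allowed range. n&(n-1) clears the lowest set bit, so the result is
// zero only when exactly one bit is set.
func validWindowSize(n int) bool {
	return n >= minWindowSize && n <= maxWindowSize && n&(n-1) == 0
}

func main() {
	for _, n := range []int{512, 1 << 10, 3 << 20, 8 << 20, 1 << 30} {
		fmt.Printf("%d: %v\n", n, validWindowSize(n))
	}
}
```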
WithZeroFrames will encode 0 length input as full frames.
This can be needed for compatibility with zstandard usage,
but is not needed for this package.
Can be changed with ResetWithOptions.
ZipCompressor returns a compressor that can be registered with zip libraries.
The provided encoder options will be used on all encodes.
ZipDecompressor returns a decompressor that can be registered with zip libraries.
See ZipCompressor for example.
Options can be specified. WithDecoderConcurrency(1) is forced,
and by default a 128MB maximum decompression window is specified.
The window size can be overridden if required.
buildDtable_asm is an x86 assembly implementation of fseDecoder.buildDtable.
The function returns a non-zero exit code on error.
calcSkippableFrame will return the total size that must be added for written
to become divisible by wantMultiple.
The value will always be > skippableFrameHeader.
The function will panic if written < 0 or wantMultiple <= 0.
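A padding calculation of this kind can be sketched as below. This is an illustration, not the package's code: the name paddingFor is hypothetical, the 8-byte header size is an assumption based on the Zstandard skippable frame layout (4-byte magic number plus 4-byte size field), and returning 0 for an already aligned size is also an assumption.

```go
package main

import "fmt"

// skippableFrameHeader is the assumed minimum size of a skippable
// frame: a 4-byte magic number followed by a 4-byte size field.
const skippableFrameHeader = 8

// paddingFor returns how many bytes to add so that written becomes a
// multiple of wantMultiple. The result is either 0 (already aligned)
// or large enough to hold a skippable frame header.
func paddingFor(written, wantMultiple int64) int {
	if wantMultiple <= 0 {
		panic("wantMultiple <= 0")
	}
	if written < 0 {
		panic("written < 0")
	}
	leftOver := written % wantMultiple
	if leftOver == 0 {
		return 0
	}
	toAdd := wantMultiple - leftOver
	// A gap smaller than the header cannot hold a frame, so extend it
	// by whole multiples until it can.
	for toAdd < skippableFrameHeader {
		toAdd += wantMultiple
	}
	return int(toAdd)
}

func main() {
	fmt.Println(paddingFor(100, 64)) // 28: 100+28 = 128
	fmt.Println(paddingFor(126, 64)) // 66: a 2-byte gap is too small, so 126+66 = 192
	fmt.Println(paddingFor(128, 64)) // 0: already aligned
}
```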
decodeSnappy writes the decoding of src to dst. It assumes that the varint-encoded
length of the decompressed bytes has already been read.
decSymbolValue returns the transformed decSymbol for the given symbol.
fillBase will precalculate base offsets with the given bit distributions.
fuzzFseEncoder can be used to fuzz the FSE encoder.
hashLen returns a hash of the lowest mls bytes of the input, with length output bits.
mls must be >=3 and <=8. Any other value will return hash for 4 bytes.
length should always be < 32.
Preferably length and mls should be a constant for inlining.
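The general technique, hashing only the lowest mls bytes of a 64-bit value into a table index of length bits, can be sketched with a multiplicative hash. The function hashLowest and the multiplier below are assumptions made for this illustration; the package uses its own per-length constants, which are not reproduced here.

```go
package main

import "fmt"

// hashMultiplier is an arbitrary odd 64-bit constant chosen for this
// sketch; it is not taken from the zstd package.
const hashMultiplier = 0x9E3779B97F4A7C15

// hashLowest hashes the lowest mls bytes of u into length bits.
// Values of mls outside 3..8 fall back to 4 bytes. length is expected
// to be below 32 so the result fits a uint32 table index.
func hashLowest(u uint64, length, mls uint8) uint32 {
	if mls < 3 || mls > 8 {
		mls = 4
	}
	// Shift left so the bytes above the lowest mls bytes are discarded.
	u <<= 64 - 8*uint(mls)
	// Multiply and keep the top length bits of the product.
	return uint32((u * hashMultiplier) >> (64 - uint(length)))
}

func main() {
	a := hashLowest(0xAABBCCDD11223344, 16, 4)
	b := hashLowest(0x0000000011223344, 16, 4)
	fmt.Println(a == b) // only the lowest 4 bytes contribute
}
```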
sequenceDecs_decode implements the main loop of sequenceDecs in x86 asm.
Please refer to seqdec_generic.go for the reference implementation.
sequenceDecs_decode implements the main loop of sequenceDecs in x86 asm with BMI2 extensions.
sequenceDecs_decodeSync_amd64 implements the main loop of sequenceDecs.decodeSync in x86 asm.
Please refer to seqdec_generic.go for the reference implementation.
sequenceDecs_decodeSync_bmi2 implements the main loop of sequenceDecs.decodeSync in x86 asm with BMI2 extensions.
sequenceDecs_decodeSync_safe_amd64 does the same as above, but does not write more than output buffer.
sequenceDecs_decodeSync_safe_bmi2 does the same as above, but does not write more than output buffer.
sequenceDecs_executeSimple_amd64 implements the main loop of sequenceDecs.executeSimple in x86 asm.
Returns false if a match offset is too big.
Please refer to seqdec_generic.go for the reference implementation.
Same as above, but with safe memcopies
skippableFrame will add a skippable frame with a total size of bytes.
total should be >= skippableFrameHeader and < math.MaxUint32.
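For orientation, a skippable frame in the Zstandard format is a 4-byte little-endian magic number in the range 0x184D2A50 to 0x184D2A5F, a 4-byte little-endian size of the user data, and then the user data itself. The sketch below builds such a frame with random filler, in the spirit of WithEncoderPadding. The function buildSkippableFrame is a hypothetical helper, not the package's skippableFrame.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"math"
)

const (
	skippableMagic  = 0x184D2A50 // first magic number of the skippable range
	skippableHeader = 8          // 4-byte magic + 4-byte size field
)

// buildSkippableFrame returns a skippable frame that is exactly total
// bytes long, header included, with the payload filled from crypto/rand.
func buildSkippableFrame(total int) ([]byte, error) {
	if total < skippableHeader {
		return nil, fmt.Errorf("total %d is smaller than the %d-byte header", total, skippableHeader)
	}
	if int64(total) >= math.MaxUint32 {
		return nil, fmt.Errorf("total %d does not fit the 32-bit size field", total)
	}
	frame := make([]byte, total)
	binary.LittleEndian.PutUint32(frame[0:4], skippableMagic)
	binary.LittleEndian.PutUint32(frame[4:8], uint32(total-skippableHeader))
	if _, err := rand.Read(frame[skippableHeader:]); err != nil {
		return nil, err
	}
	return frame, nil
}

func main() {
	frame, err := buildSkippableFrame(16)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", frame[:skippableHeader])
}
```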
crc implements the checksum specified in section 3 of
https://github.com/google/snappy/blob/master/framing_format.txt
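Section 3 of the Snappy framing format describes a masked CRC-32C: the checksum is rotated right by 15 bits and a constant is added. A minimal standard-library sketch follows; the function names are chosen for this example, and unmaskCRC is included only to show that the transformation is reversible.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// rawCRC returns the plain CRC-32C (Castagnoli) of b.
func rawCRC(b []byte) uint32 {
	return crc32.Checksum(b, castagnoli)
}

// maskCRC rotates c right by 15 bits and adds the framing constant.
func maskCRC(c uint32) uint32 {
	return (c>>15 | c<<17) + 0xa282ead8
}

// unmaskCRC reverses maskCRC: subtract the constant, rotate left by 15.
func unmaskCRC(m uint32) uint32 {
	r := m - 0xa282ead8
	return r>>17 | r<<15
}

func main() {
	data := []byte("snappy framing")
	masked := maskCRC(rawCRC(data))
	fmt.Printf("raw=%#08x masked=%#08x\n", rawCRC(data), masked)
	fmt.Println(unmaskCRC(masked) == rawCRC(data))
}
```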
snappyDecodedLen returns the length of the decoded block and the number of bytes
that the length header occupied.
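A Snappy block begins with the uncompressed length encoded as a varint. Reading that header can be sketched with encoding/binary as shown below. The function decodedLen and its error values are hypothetical stand-ins, not the package's snappyDecodedLen or its ErrSnappy variables.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

var (
	errCorrupt  = errors.New("corrupt length header")
	errTooLarge = errors.New("decoded block is too large")
)

// decodedLen returns the decoded block length stored at the start of
// src and the number of bytes the varint header occupied.
func decodedLen(src []byte) (blockLen, headerLen int, err error) {
	v, n := binary.Uvarint(src)
	if n <= 0 || v > math.MaxUint32 {
		return 0, 0, errCorrupt
	}
	// Guard against overflow on platforms where int is 32 bits wide.
	if v > uint64(math.MaxInt) {
		return 0, 0, errTooLarge
	}
	return int(v), n, nil
}

func main() {
	// 0x96 0x01 is the varint encoding of 150.
	blockLen, headerLen, err := decodedLen([]byte{0x96, 0x01, 'x'})
	fmt.Println(blockLen, headerLen, err)
}
```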
tableStep returns the next table index.
Package-Level Variables (total 45, in which 18 are exported)
ErrBlockTooSmall is returned when a block is too small to be decoded.
Typically returned on invalid input.
ErrCompressedSizeTooBig is returned when a block is bigger than allowed.
Typically this indicates wrong or corrupted input.
ErrCRCMismatch is returned if CRC mismatches.
ErrDecoderClosed will be returned if the Decoder was used after
Close has been called.
ErrDecoderNilInput is returned when a nil Reader was provided
and an operation other than Reset/DecodeAll/Close was attempted.
ErrDecoderSizeExceeded is returned if decompressed size exceeds the configured limit.
ErrEncoderClosed will be returned if the Encoder was used after
Close has been called.
ErrFrameSizeExceeded is returned if the stated frame size is exceeded.
This is only returned if SingleSegment is specified on the frame.
ErrFrameSizeMismatch is returned if the stated frame size does not match the expected size.
This is only returned if SingleSegment is specified on the frame.
ErrMagicMismatch is returned when a "magic" number isn't what is expected.
Typically this indicates wrong or corrupted input.
ErrReservedBlockType is returned when a reserved block type is found.
Typically this indicates wrong or corrupted input.
ErrSnappyCorrupt reports that the input is invalid.
ErrSnappyTooLarge reports that the uncompressed length is too large.
ErrSnappyUnsupported reports that the input isn't supported.
ErrUnexpectedBlockSize is returned when a block has unexpected size.
Typically returned on invalid input.
ErrUnknownDictionary is returned if the dictionary ID is unknown.
ErrWindowSizeExceeded is returned when a reference exceeds the valid window size.
Typically this indicates wrong or corrupted input.
ErrWindowSizeTooSmall is returned when no window size is specified.
Typically this indicates wrong or corrupted input.
fsePredef are the predefined fse tables as defined here:
https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#default-distributions
These values are already transformed.
fsePredefEnc are the predefined encoder based on fse tables as defined here:
https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#default-distributions
These values are already transformed.
maxTableSymbol is the biggest supported symbol for each table type
https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#the-codes-for-literals-lengths-match-lengths-and-offsets
mlBitsTable translates from ml code to number of bits.
symbolTableX contain the transformations needed for each type as defined in
https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#the-codes-for-literals-lengths-match-lengths-and-offsets
Package-Level Constants (total 132, in which 9 are exported)
HeaderMaxSize is the maximum size of a Frame and Block Header.
If less is sent to Header.Decode it *may* still contain enough information.
MaxWindowSize is the maximum encoder window size
and the default decoder maximum window size.
MinWindowSize is the minimum Window Size, which is 1 KB.
SpeedBestCompression will choose the best available compression option.
This will offer the best compression no matter the CPU cost.
SpeedBetterCompression will yield better compression than the default.
Currently it is about zstd level 7-8 with ~ 2x-3x the default CPU usage.
Note that when using this level, CPU usage may go up in future versions.
SpeedDefault is the default "pretty fast" compression option.
This is roughly equivalent to the default Zstandard mode (level 3).
SpeedFastest will choose the fastest reasonable compression.
This is roughly equivalent to the fastest Zstandard mode.
ZipMethodPKWare is the original method number used by PKWARE to indicate Zstandard compression.
Deprecated: This has been deprecated by PKWARE, use ZipMethodWinZip instead for compression.
See https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.9.TXT
ZipMethodWinZip is the method for Zstandard compressed data inside Zip files for WinZip.
See https://www.winzip.com/win/en/comp_info.html
Note: Increasing the short table bits or making the hash shorter
can actually lead to compression degradation since it will 'steal' more from the
long match table and match offsets are quite big.
This greatly depends on the type of input.
MEMORY_USAGE:
* Memory usage formula: N -> 2^N bytes (examples: 10 -> 1KB; 12 -> 4KB; 16 -> 64KB; 20 -> 1MB; etc.)
* Increasing memory usage improves compression ratio
* Reduced memory usage can improve speed, due to cache effect
* Recommended max value is 14, for 16KB, which nicely fits into Intel x86 L1 cache
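The formula above maps a log2 setting N to a table size of 2^N bytes. A short sketch makes the listed examples easy to reproduce; tableBytes is a hypothetical helper name.

```go
package main

import "fmt"

// tableBytes returns the memory used for a setting of n, that is 2^n bytes.
func tableBytes(n uint) uint64 {
	return 1 << n
}

func main() {
	for _, n := range []uint{10, 12, 14, 16, 20} {
		fmt.Printf("%d -> %d KB\n", n, tableBytes(n)/1024)
	}
}
```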
Up to 6 bits
We support slightly less than the reference decoder to be able to
use ints on 32 bit archs.
snappyMaxBlockSize is the maximum size of the input to encodeBlock. It is not
part of the wire format per se, but some parts of the encoder assume
that an offset fits into a uint16.
Also, for the framing format (Writer type instead of Encode function),
https://github.com/google/snappy/blob/master/framing_format.txt says
that "the uncompressed data in a chunk must be no longer than 65536
bytes".
snappyMaxEncodedLenOfMaxBlockSize equals MaxEncodedLen(snappyMaxBlockSize), but is
hard coded to be a const instead of a variable, so that obufLen can also
be a const. Their equivalence is confirmed by
TestMaxEncodedLenOfMaxBlockSize.
The pages are generated with Golds v0.8.4 (GOOS=linux GOARCH=amd64).