SHA-3 (and SHAKE)
SHA-3
The most commonly used functionality in Keccak.jl is probably the computation of digests according to the Secure Hash Algorithm-3 (SHA-3) as specified in NIST FIPS-202. SHA-3 is a family of hash functions that differ in the length of the generated digest, offering 224, 256, 384, or 512 bits of output. These map to the functions sha3_224, sha3_256, sha3_384, and sha3_512, respectively. The examples below are limited to the 256-bit version; the other lengths work analogously.
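To make the relation between the four variants concrete, the following sketch prints the digest length of each in bytes. It assumes that all four functions return a tuple of UInt8 values in the same way as sha3_256; this page demonstrates that only for the 256-bit version.

```julia
using Keccak: sha3_224, sha3_256, sha3_384, sha3_512

msg = "a message to compute the digest of"

# Each variant should yield (output bits) ÷ 8 bytes.
for (f, bits) in ((sha3_224, 224), (sha3_256, 256), (sha3_384, 384), (sha3_512, 512))
    digest = f(msg)
    println(nameof(f), ": ", length(digest), " bytes (expected ", bits ÷ 8, ")")
end
```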
Basic usage
We start with a very basic example:
julia> using Keccak: sha3_256
julia> sha3_256("a message to compute the digest of")
(0xfa, 0x24, 0x49, 0x91, 0xa1, 0x94, 0x44, 0x1e, 0xd6, 0xa8, 0x34, 0x19, 0xd2, 0x00, 0x44, 0x12, 0x15, 0xa9, 0x7c, 0x49, 0xc6, 0xcf, 0x5e, 0x28, 0x7f, 0x75, 0x75, 0x45, 0x73, 0x6d, 0x25, 0x23)
As can be seen, the output is a tuple of UInt8s, namely 32 of them for a total of the required 32⋅8=256 bits. The interface is very similar to that of the SHA standard library:
julia> using SHA: sha3_256
julia> sha3_256("a message to compute the digest of")
32-element Vector{UInt8}:
0xfa
0x24
0x49
0x91
0xa1
0x94
0x44
0x1e
0xd6
0xa8
⋮
0x28
0x7f
0x75
0x75
0x45
0x73
0x6d
0x25
0x23
The SHA standard library produces the same output, of course, but stores it in a Vector.
But SHA and Keccak export functions of the same name, so care must be taken when importing both of them to prevent name clashes. It is recommended to import only those symbols actually used, e.g. using Keccak: sha3_256 instead of using Keccak.
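Another way to sidestep the name clash is to import both modules without bringing any function names into scope and to qualify each call. The sketch below does this and also checks that both implementations agree; the variable names are illustrative only.

```julia
import Keccak, SHA   # `import` brings only the module names into scope

msg = "a message to compute the digest of"

keccak_digest = Keccak.sha3_256(msg)   # tuple of UInt8 values
sha_digest    = SHA.sha3_256(msg)      # Vector{UInt8}

# A tuple never compares equal to a vector, so convert before comparing.
println(collect(keccak_digest) == sha_digest)

# Hexadecimal representation, e.g. for display or logging.
println(bytes2hex(collect(keccak_digest)))
```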
Apart from the return type, there are two noteworthy differences between the SHA and the Keccak implementations:
- Keccak only allows AbstractVector{UInt8} and Tuple{Vararg{UInt8}} input in addition to the String input demonstrated above, while SHA also supports IO to hash all data coming from an IO object (e.g. a file).
- Performance:
  using Chairmarks: @b
  import Keccak, SHA
  @b rand(UInt8, 1_000_000) Keccak.sha3_256, SHA.sha3_256
  # output
  (1.803 ms, 6.684 ms (9 allocs: 720 bytes))
  The Keccak implementation is faster and avoids allocations. (Comparison was done using Julia v1.12.0-rc1.)
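Because Keccak has no IO method, data from a file has to be turned into bytes first. The following is a minimal sketch, assuming the file fits into memory and that a String is hashed as its UTF-8 bytes (an assumption not stated on this page). The temporary file merely stands in for real data.

```julia
using Keccak: sha3_256

msg = "a message to compute the digest of"

# Write the message to a temporary file, then read it back as raw bytes.
path, io = mktemp()
write(io, msg)
close(io)
bytes = read(path)                     # Vector{UInt8}
rm(path)

from_string = sha3_256(msg)            # String input
from_vector = sha3_256(bytes)          # AbstractVector{UInt8} input
from_tuple  = sha3_256(Tuple(bytes))   # Tuple{Vararg{UInt8}} input

# Under the assumption above, all three digests coincide.
println(from_string == from_vector == from_tuple)
```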
Chunked input (and output)
When e.g. needing to hash a large file, it may be inappropriate to read it into memory as a whole. Rather, one would like to process the data in reasonably-sized chunks. This is possible using the sponge-based interface of Keccak. (The term "sponge" stems from the algorithm family underlying SHA-3 – Keccak – using a so-called cryptographic sponge construction.)
Using the sponge-based interface entails the following steps:
- Obtain a suitable sponge with e.g. sha3_256_sponge.
- Process the input data with zero or more calls to absorb. (Zero calls correspond to an empty input.)
- Call pad exactly once.
- Produce the output by one or more calls to squeeze. (Zero calls are technically permitted, too, but pointless.)
An important feature of the sponges used by Keccak is that they are immutable. Therefore, none of the operations above mutate the given sponge in-place; rather, they return an updated sponge.
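One practical consequence of immutability is that a sponge value can be reused as a starting point for several continuations. The sketch below absorbs a common prefix once and then hashes two different messages that share it. It relies only on the functions named in the steps above; the variable names are illustrative.

```julia
using Keccak: sha3_256, sha3_256_sponge, absorb, pad, squeeze

# Absorb a common prefix once.
prefix = absorb(sha3_256_sponge(), 0x00:0x04)

# Continue from the same prefix state with two different suffixes.
# `prefix` itself is not modified by either call.
sponge_a = absorb(prefix, 0x05:0x09)
sponge_b = absorb(prefix, 0x0a:0x0e)

_, digest_a = squeeze(pad(sponge_a), Val(32))
_, digest_b = squeeze(pad(sponge_b), Val(32))

# Each digest should match the one-shot hash of the concatenated message.
println(digest_a == sha3_256(0x00:0x09))
println(digest_b == sha3_256(vcat(0x00:0x04, 0x0a:0x0e)))
```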
The following example shows both chunked input (multiple calls to absorb) and chunked output (multiple calls to squeeze), although the latter is certainly less useful in this context:
julia> sponge = sha3_256_sponge();
julia> sponge = absorb(sponge, 0x00:0x04); # absorb first data chunk
julia> sponge = absorb(sponge, 0x05:0x09); # absorb another data chunk
julia> sponge = pad(sponge); # absorb appropriate padding
julia> sponge, out1 = squeeze(sponge, Val(16)); # first part of output
julia> sponge, out2 = squeeze(sponge, Val(16)); # second part of output
julia> (out1..., out2...) == sha3_256(0x00:0x09) # same result
true
Note that the desired output length of squeeze has to be given in bytes. If the number is passed directly instead of wrapped in a Val, the output will be a Vector{UInt8} instead of a tuple.
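The difference between the two ways of passing the length can be seen by squeezing the same padded sponge twice. This is a small sketch; since sponges are immutable, both calls start from the same state and should therefore yield the same bytes.

```julia
using Keccak: sha3_256_sponge, absorb, pad, squeeze

sponge = pad(absorb(sha3_256_sponge(), 0x00:0x09))

_, as_tuple  = squeeze(sponge, Val(32))   # length wrapped in a Val
_, as_vector = squeeze(sponge, 32)        # length given as a plain integer

println(typeof(as_tuple))
println(typeof(as_vector))
println(collect(as_tuple) == as_vector)
```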
Squeezing fewer or more bytes from the sponge than the standard demands is technically valid, but not standard-compliant. Since the output length matches the security strength, squeezing more bytes will not produce a more secure digest. One scenario where squeezing more bytes makes sense is using the hashing function as a pseudo-random function generator. If this is your aim, consider SHAKE, cSHAKE, or the extensible output variants of KMAC, TupleHash, or ParallelHash.
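Returning to the motivating case of data too large to hold in memory, the sponge steps can be wrapped in a helper that reads from an IO object in chunks. The function name sha3_256_chunked and the chunk_size keyword are hypothetical and not part of Keccak; this is a sketch, not a tuned implementation.

```julia
using Keccak: sha3_256, sha3_256_sponge, absorb, pad, squeeze

# Hypothetical helper: hash everything readable from `io`,
# holding at most `chunk_size` bytes in memory at a time.
function sha3_256_chunked(io::IO; chunk_size::Integer = 64 * 1024)
    sponge = sha3_256_sponge()
    while !eof(io)
        chunk = read(io, chunk_size)      # Vector{UInt8}, at most chunk_size bytes
        sponge = absorb(sponge, chunk)    # returns an updated sponge
    end
    sponge = pad(sponge)                  # exactly once, after all input
    _, digest = squeeze(sponge, Val(32))  # 32 bytes = 256 bits
    return digest
end

# Usage: an IOBuffer stands in for a file opened with `open(path)`.
data = rand(UInt8, 200_000)
println(sha3_256_chunked(IOBuffer(data); chunk_size = 4096) == sha3_256(data))
```

For a real file, `open(sha3_256_chunked, path)` would pass the opened stream to the helper and close it afterwards.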
SHAKE
The NIST FIPS-202 standard specifies an extensible-output variant of SHA-3, called SHAKE, for security strengths 128 bits and 256 bits. These are available in Keccak as shake_128 and shake_256, respectively.
Similar to the SHA-3 functions, one can directly compute a SHAKE-digest from input data, but has to pass the desired output length:
julia> msg = "a message to compute the digest of";
julia> shake_128(msg, 10) # returns a vector
10-element Vector{UInt8}:
0xd4
0xa7
0x25
0x77
0xde
0x29
0x05
0x20
0x9b
0x35
julia> shake_128(msg, Val(10)) # returns a tuple
(0xd4, 0xa7, 0x25, 0x77, 0xde, 0x29, 0x05, 0x20, 0x9b, 0x35)
julia> shake_128(msg, Val(15)) # longer output, first 10 bytes equal
(0xd4, 0xa7, 0x25, 0x77, 0xde, 0x29, 0x05, 0x20, 0x9b, 0x35, 0x64, 0x68, 0x9a, 0x96, 0xad)
Chunked input and output is possible in the same way as for SHA-3 (see above), replacing sha3_256_sponge with shake_128_sponge (or shake_256_sponge). For convenience, one can also obtain a sponge ready for squeezing by calling shake_128(data) without specifying the output length:
julia> msg = "a message to compute the digest of";
julia> sponge = shake_128(msg);
julia> sponge, out1 = squeeze(sponge, Val(10));
julia> out1 # as above
(0xd4, 0xa7, 0x25, 0x77, 0xde, 0x29, 0x05, 0x20, 0x9b, 0x35)
julia> sponge, out2 = squeeze(sponge, Val(5));
julia> out2 # last five bytes of the length-15 example above
(0x64, 0x68, 0x9a, 0x96, 0xad)
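For completeness, here is a sketch of chunked input with SHAKE, following the same steps as for SHA-3. It assumes that shake_128_sponge takes no arguments, like sha3_256_sponge, and that shake_128 accepts a Vector{UInt8} in the same way as the SHA-3 functions; neither is shown explicitly on this page.

```julia
using Keccak: shake_128, shake_128_sponge, absorb, pad, squeeze

msg = Vector{UInt8}("a message to compute the digest of")

sponge = shake_128_sponge()
sponge = absorb(sponge, msg[1:10])      # first chunk
sponge = absorb(sponge, msg[11:end])    # remaining bytes
sponge = pad(sponge)
sponge, out = squeeze(sponge, Val(15))

# Should agree with the one-shot call on the whole message.
println(out == shake_128(msg, Val(15)))
```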