Provided by: libsession-token-perl_1.503-2build5_amd64 

NAME
Session::Token - Secure, efficient, simple random session token generation
SYNOPSIS
Simple 128-bit session token
my $token = Session::Token->new->get;
## 74da9DABOqgoipxqQDdygw
Keep generator around
my $generator = Session::Token->new;
my $token = $generator->get;
## bu4EXqWt5nEeDjTAZcbTKY
my $token2 = $generator->get;
## 4Vez56Zc7el5Ggx4PoXCNL
Custom minimum entropy in bits
my $token = Session::Token->new(entropy => 256)->get;
## WdLiluxxZVkPUHsoqnfcQ1YpARuj9Z7or3COA4HNNAv
Custom alphabet and length
my $token = Session::Token->new(alphabet => 'ACGT', length => 100_000_000)->get;
## AGTACTTAGCAATCAGCTGGTTCATGGTTGCCCCCATAG...
DESCRIPTION
This module provides a secure, efficient, and simple interface for creating session tokens, password
reset codes, temporary passwords, random identifiers, and anything else you can think of.
When a Session::Token object is created, 1024 bytes are read from "/dev/urandom" (Linux, Solaris, most
BSDs), "/dev/arandom" (some older BSDs), or Crypt::Random::Source::Strong::Win32 (Windows). These bytes
are used to seed the ISAAC-32 <http://www.burtleburtle.net/bob/rand/isaacafa.html> pseudo random number
generator.
Once a generator is created, you can repeatedly call the "get" method on the generator object and it will
return a new token each time.
IMPORTANT: If your application calls "fork", make sure that any generators are re-created in one of the
processes after the fork since forking will duplicate the generator state and both parent and child
processes will go on to produce identical tokens (just like perl's rand after it is seeded).
After the generator context is created, no system calls are used to generate tokens. This is one way that
Session::Token helps with efficiency. However, this is only important for certain use cases (generally
not web sessions).
ISAAC is a cryptographically secure PRNG that improves on the well-known RC4 algorithm in some important
areas. For instance, it doesn't have short cycles or initial bias like RC4 does. A theoretical shortest
possible cycle in ISAAC is "2**40", although no cycles this short have ever been found (and probably
don't exist at all). On average, ISAAC cycles are "2**8295".
GENERATORS AND URANDOM
Developers must choose whether a single token generator will be kept around and used to generate all
tokens, or if a new Session::Token object will be created every time a token is needed. As mentioned
above, this module accesses urandom in its constructor for seeding purposes, but not subsequently while
generating tokens.
Generally speaking the generator should be kept around and re-used. Probably the most important reason
for this is that generating a new token from an existing generator cannot fail due to a full file-
descriptor table. Creating a new Session::Token object for every token can fail because, as described
above, the constructor needs to open "/dev/urandom" and this will not succeed if all allotted descriptors
are in use, or if the read is interrupted by a signal. In these events a perl exception will be thrown.
Programs that re-use a generator are more likely to be portable to "chroot"ed environments where
"/dev/urandom" may not be present. Finally, accessing urandom frequently is inefficient because it
requires making system calls and because (at least on linux) reading from urandom acquires a system-wide
kernel lock.
On the other hand, re-using a generator may be undesirable because servers are typically started
immediately after a system reboot and the kernel's randomness pool might be poorly seeded at that point.
Similarly, when starting a virtual machine a previously used entropy pool state may be restored. In these
cases all subsequently generated tokens will be derived from a weak/predictable seed. For this reason,
you might choose to defer creating the generator until the first request actually comes in, periodically
re-create the generator object, and/or manually handle seeding in some other way.
Programs that assume opening "/dev/urandom" will always succeed can return session tokens based only on
the contents of nulled or uninitialised memory. This is not the case with Session::Token since its
constructor will always throw an exception if it can't seed itself. Some modern systems provide system
calls with fewer failure modes (ie `getentropy(2)` on OpenBSD and `getrandom(2)` on linux). Future
versions of Session::Token will likely use these system calls when available.
CUSTOM ALPHABETS
Being able to choose exactly which characters appear in your token is sometimes useful. This set of
characters is called the alphabet. The default alphabet size is 62 characters: uppercase letters,
lowercase letters, and digits ("a-zA-Z0-9").
For some purposes, base-62 is a sweet spot. It is more compact than hexadecimal encoding which helps with
efficiency because session tokens are usually transferred over the network many times during a session
(often uncompressed in HTTP headers).
Also, base-62 tokens don't use "wacky" characters like base-64 encodings do. These characters sometimes
cause encoding/escaping problems (ie when embedded in URLs) and are annoying because often you can't
select tokens by double-clicking on them.
Although the default is base-62, there are all kinds of reasons for using another alphabet. One example
is if your users are reading tokens from a print-out or SMS or whatever, you may choose to omit
characters like "o", "O", and 0 that can easily be confused.
To set a custom alphabet, just pass in either a string or an array of characters to the "alphabet"
parameter of the constructor:
Session::Token->new(alphabet => '01')->get;
Session::Token->new(alphabet => ['0', '1'])->get; # same thing
Session::Token->new(alphabet => ['a'..'z'])->get; # character range
Constructor args can be a hash-ref too:
Session::Token->new({ alphabet => ['a'..'z'] })->get;
ENTROPY
There are two ways to specify the length of tokens. The most primitive is in terms of characters:
print Session::Token->new(length => 5)->get;
## -> wpLH4
But the primary way is to specify their minimum entropy in terms of bits:
print Session::Token->new(entropy => 24)->get;
## -> Fo5SX
In the above example, the resulting token contains at least 24 bits of entropy. Given the default base-62
alphabet, we can compute the exact entropy of a 5 character token as follows:
$ perl -E 'say 5 * log(62)/log(2)'
29.7709815519344
So these tokens have about 29.8 bits of entropy. Note that if we removed one character from this token,
it would bring it below our desired 24 bits of entropy:
$ perl -E 'say 4 * log(62)/log(2)'
23.8167852415475
The default minimum entropy is 128 bits. Default tokens are 22 characters long and therefore have about
131 bits of entropy:
$ perl -E 'say 22 * log(62)/log(2)'
130.992318828511
An interesting observation is that in base-64 representation, 128-bit minimum tokens also require 22
characters and that these tokens contain only 1 more bit of entropy.
Another Session::Token design criterion is that all tokens should be the same length. The default token
length is 22 characters and the tokens are always exactly 22 characters (no more, no less). Instead of
tokens that are exactly "N" characters, some libraries that use arbitrary precision arithmetic end up
creating tokens of at most "N" characters.
A fixed token length is nice because it makes writing matching regular expressions easier, simplifies
storage (you never have to store length), causes various log files and things to line up neatly on your
screen, and ensures that encrypted tokens won't leak token entropy due to length (see "VARIABLE LENGTH
TOKENS").
In summary, the default token length of exactly 22 characters is a consequence of these decisions:
base-62 representation, 128 bit minimum token entropy, and fixed token length.
MOD BIAS
Some token generation libraries that implement custom alphabets will generate a random value, compute its
modulus over the size of an alphabet, and then use this modulus to index into the alphabet to determine
an output character.
Assume we have a uniform random number source that generates values in the set "[0,1,2,3]" (most PRNGs
provide sequences of bits, in other words power-of-2 size sets) and wish to use the alphabet "abc".
If we use the naïve modulus algorithm described above then 0 maps to "a", 1 maps to "b", 2 maps to "c",
and 3 also maps to "a". This results in the following biased distribution for each character in the
token:
P(a) = 2/4 = 1/2
P(b) = 1/4
P(c) = 1/4
Of course in an unbiased distribution, each character would have the same chance:
P(a) = 1/3
P(b) = 1/3
P(c) = 1/3
Bias is undesirable because certain tokens are obvious starting points when token guessing and certain
other tokens are very unlikely. Tokens that are unbiased are equally likely and therefore there is no
obvious starting point with them.
Session::Token provides unbiased tokens regardless of the size of your alphabet (though see the
"INTRODUCING BIAS" section for a mis-use warning). It does this in the same way that you might simulate
producing unbiased random numbers from 1 to 5 given an unbiased 6-sided die: Re-roll every time a 6 comes
up.
In the above example, Session::Token eliminates bias by only using values of 0, 1, and 2 (the
"t/no-mod-bias.t" test contains some more notes on this topic).
Note that mod bias can be made arbitrarily small by increasing the amount of data consumed from a random
number generator (provided that arbitrary precision modulus is available). Because this module
fundamentally avoids mod bias, it can use each of the 4 bytes from an ISAAC-32 word for a separate
character (excepting "re-rolls").
EFFICIENCY OF RE-ROLLING
Throwing away a portion of random data in order to avoid mod bias is slightly inefficient. How many bytes
from ISAAC do we expect to consume for every character in the token? It depends on the size of the
alphabet.
Session::Token masks off each byte using the smallest power of two greater than or equal to the alphabet
size minus one so the probability that any particular byte can be used is:
P = alphabet_size / next_power_of_two(alphabet_size)
For example, with the default base-62 alphabet "P" is "62/64".
In order to find the average number of bytes consumed for each character, calculate the expected value
"E". There is a probability "P" that the first byte will be used and therefore only one byte will be
consumed, and a probability "1 - P" that "1 + E" bytes will be consumed:
E = P*1 + (1 - P)*(1 + E)
E = P + 1 + E - P - P*E
0 = 1 - P*E
P*E = 1
E = 1/P
So for the default base-62 alphabet, the average number of bytes consumed for each character in a token
is:
E = 1/(62/64) = 64/62 ≅ 1.0323
Because of the next power of two masking optimisation described above, "E" will always be less than 2. In
the worst case scenario of an alphabet with 129 characters, "E" is roughly 1.9845.
This minor inefficiency isn't an issue because the ISAAC implementation used is quite fast and this
module is very thrifty in how it uses ISAAC's output.
INTRODUCING BIAS
If your alphabet contains the same character two or more times, this character will be more biased than a
character that only occurs once. You should be careful that your alphabets don't repeat in this way if
you are trying to create random session tokens.
However, if you wish to introduce bias this library doesn't try to stop you. (Maybe it should print a
warning?)
Session::Token->new(alphabet => '0000001', length => 5000)->get; # don't do this
## -> 0000000000010000000110000000000000000000000100...
Due to a limitation discussed below, alphabets larger than 256 aren't currently supported so your bias
can't get very granular.
Aside: If you have a constant-biased output stream like the above example produces then you can re-
construct an un-biased bit sequence with the von neumann algorithm. This works by comparing pairs of
bits. If the pair consists of identical bits, it is discarded. Otherwise the order of the different bits
is used to determine an output bit, ie 00 and 11 are discarded but 01 and 10 are mapped to output bits of
0 and 1 respectively. This only works if the bias in each bit is constant (like all characters in a
Session::Token are).
ALPHABET SIZE LIMITATION
Due to a limitation in this module's code, alphabets can't be larger than 256 characters. Everywhere the
above manual says "characters" it actually means bytes. This isn't a Unicode limitation per se, just the
maximum size of the alphabet. If you like, you can map tokens onto new alphabets as long as they aren't
more than 256 characters long. Here is how to generate a 128-bit minimum entropy token using the
lowercase greek alphabet (note that both forms of lowercase sigma are included which may not be
desirable):
use utf8;
my $token = Session::Token->new(alphabet => [map {chr} 0..25])->get;
$token = join '', map {chr} map {ord($_) + ord('α')} split //, $token;
# ρφνδαπξδββφδοςλχτμγσψδψζειετ
Here's an interesting way to generate a uniform random integer between 0 to 999 inclusive:
0 + Session::Token->new(alphabet => ['0'..'9'], length => 3)->get
If you wanted to natively support high code points, there is no point in hard-coding a limitation on the
size of Unicode or even the (higher) limitation of perl characters. Instead, arbitrary precision
"characters" should be supported with bigint. Here's an example of something similar in lisp: isaac.lisp
<http://hcsw.org/downloads/isaac.lisp>.
This module is not however designed to be the ultimate random number generator and at this time I think
changing the design as described above would interfere with its goal of being secure, efficient, and
simple.
TOKEN TEMPLATES
String::Random has a method called "randpattern" where you provide a pattern that serves as a template
when creating the token. You define the meaning of 1 or more template characters and each one that occurs
in the pattern is replaced by a random character from a corresponding alphabet.
Andrew Beverley requested this feature for Session::Token and I suggested approximately the following:
use Session::Token;
sub token_template {
my (%m) = @_;
%m = map { $_ => Session::Token->new(alphabet => $m{$_}, length => 1) } keys %m;
return sub {
my $v = shift;
$v =~ s/(.)/exists $m{$1} ? $m{$1}->get : $1/eg;
return $v;
};
}
In order to use "token_template" you should pass it key-vaue pairs of the different token characters and
the alphabets they represent. It will return a sub that should be passed the template pattern and it will
return the resulting random tokens.
For example, here is how to create UUID version 4
<https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_.28random.29> tokens:
sub uuid_v4_generator {
my $t = token_template(
x => [ 0..9, 'a'..'f' ],
y => [ 8, 9, 'a', 'b' ],
);
return sub {
return $t->('xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx');
}
}
"uuid_v4_generator" returns a generator function that will return tokens of the following form:
1b782499-9913-4726-a80a-25e7b2221a7c
90f85a64-d826-43bf-98e7-94ba87406bfb
b8b73175-3cce-4861-b43b-3dec5ed5d641
3afb64ab-6de3-4647-bbff-eb94dfa7d4b0
447d2001-2aec-4d32-9910-8c289ae34c48
Note that characters in the pattern which don't have template characters defined ("-" and 4 in the above
example) are passed through to the output token.
SEEDING
This module is designed to always seed itself from your kernel's secure random number source. You should
never need to seed it yourself.
However if you know what you're doing you can pass in a custom seed as a 1024 byte long string. For
example, here is how to create a "null seeded" generator:
my $gen = Session::Token(seed => "\x00" x 1024);
This is done in the test-suite to compare against Jenkins' reference ISAAC output, but obviously don't do
this in regular applications because the generated tokens will be the same every time your program is
run.
One valid reason for manually seeding is if you have some reason to believe that there isn't enough
entropy in your kernel's randomness pool and therefore you don't trust "/dev/urandom". In this case you
should acquire your own seed data from somewhere trustworthy (maybe "/dev/random" or a previously stored
trusted seed).
VARIABLE LENGTH TOKENS
As mentioned above, all tokens produced by a Session::Token generator are the same length. If you prefer
tokens of variable length, it is possible to post-process the tokens in order to achieve this so long as
you keep some things in mind.
If you randomly truncate tokens created by Session::Token, be careful not to introduce bias. For example,
if you choose the length of the token as a uniformly distributed random length between 8 and 10, then the
output will be biased towards shorter token sizes. Length 8 tokens should appear less frequently than
length 9 or 10 tokens because there are fewer of them.
Another approach is to eliminate leading characters of a given value in the same way as leading 0s are
commonly eliminated from numeric representations. Although this approach doesn't introduce bias, the
tokens 1 and 01 are not distinct so it does not increase token entropy given a fixed maximum token length
which is the main reason for preferring variable length tokens. The ideal variable length algorithm would
generate both 1 and 01 tokens (with identical frequency of course).
Implementing unbiased, variable-length tokens would complicate the Session::Token implementation
especially since you should still be able to specify minimum entropy variable-length tokens. Minimum
entropy is the primary input to Session::Token, not token length. This is the reason that the default
token length of 22 isn't hard-coded anywhere in the Session::Token source code (but 128 is).
The final reason that Session::Token discourages variable length tokens is that they can leak token
information through a side-channel. This could occur when a message is encrypted but the length of the
original message can be inferred from the encrypted ciphertext.
BUGS
Should check for biased alphabets and print warnings.
Would be cool if it could detect forks and warn or re-seed in the child process (without incurring
"getpid" overhead).
There is currently no way to extract the seed from a Session::Token object. Note when implementing this:
The saved seed must either store the current state of the ISAAC round as well as the 1024 byte "randsl"
array or else do some kind of minimum fast forwarding in order to protect against a partially duplicated
output-stream bug.
Doesn't work on perl 5.6 and below due to the use of ":raw" (thanks CPAN testers). It could probably use
"binmode" instead, but meh.
On windows we use Crypt::Random::Source::Strong::Win32 which has a big dependency tree. We should instead
use a slimmer module like Crypt::Random::Seed.
COMMAND-LINE APP
There is a command-line application called App::Session::Token which is a convenience wrapper around
Session::Token. You can generate session tokens by running the "session-token" binary:
$ echo "Your password is `session-token`"
Your password is 8Yom6z4AeB1RXxCGzklJFt
It supports all the options of this module via command line parameters, and multiple session tokens can
be generated with the "--num" (aka "-n") switch. For example:
$ session-token --alphabet ABC --entropy 32 --num 5
BACAACABCCCCAACBBBCAB
BCBACACBBCACCBABABCBA
ABBBCBABBACBBBCBBBCCA
AACCBBBCCAAACBABACABC
CCABCABBCCCAACAAACCAA
SEE ALSO
The Session::Token github repo <https://github.com/hoytech/Session-Token>
App::Session::Token
Presentation for Toronto Perl Mongers <https://www.youtube.com/watch?v=c2KZBTtrmZE?start=3705>
There are lots of different modules for generating random data. If the characterisations of any of them
below are inaccurate or out-of-date, please file a github issue and I will correct them.
Like this module, perl's rand() function implements a user-space PRNG seeded from "/dev/urandom".
However, perl's rand() is not secure. Perl doesn't specify a PRNG algorithm at all. On linux, whatever it
is is seeded with a mere 4 bytes from "/dev/urandom".
Data::Token is the first thing I saw when I looked around on CPAN. It has an inflexible and unspecified
alphabet. It tries to get its source of unpredictability from UUIDs and then hashes these UUIDs with
SHA-1. I think this is bad design because some standard UUID formats aren't designed to be unpredictable
at all. This is acknowledged in RFC 4122 section 6: "Do not assume that UUIDs are hard to guess; they
should not be used as security capabilities (identifiers whose mere possession grants access)." With
certain UUIDs, knowing a target's MAC address or the rough time the token was issued may help you predict
a reduced area of token-space to concentrate guessing attacks upon. I don't know if Data::Token uses
these types of UUIDs or the potentially secure "version 4" UUIDs, but because this wasn't addressed in
the documentation and because of an apparent misapplication of hash functions (if you really had an
unpredictable UUID, there would be no need to hash), I don't feel good about using this module.
There are several decent random number generators like Math::Random::Secure and Crypt::URandom but they
usually don't implement alphabets and some of them require you open or read from "/dev/urandom" for every
chunk of random bytes. Note that Math::Random::Secure does prevent mod bias in its random integers and
could be used to implement unbiased alphabets (slowly).
String::Random has a neat regexp-like language for specifying random tokens which is more flexible than
alphabets. However, it uses perl's rand() and its documentation fails to discuss performance, bias, or
security. See the "TOKEN TEMPLATES" section for a similar feature.
String::Urandom has alphabets, but it uses the flawed mod algorithm described above and opens
"/dev/urandom" for every token.
There are other modules like Data::Random, App::Genpass, String::MkPasswd, Crypt::RandPasswd,
Crypt::GeneratePassword, and Data::SimplePassword but they use insecure PRNGs such as rand() or mersenne
twister, don't adequately deal with bias, and/or don't let you specify generic alphabets.
Bytes::Random::Secure has alphabets (aka "bags"), uses ISAAC, and avoids mod bias using the re-roll
algorithm. It is much slower than Session::Token (even when using Math::Random::ISAAC::XS) but does
support alphabets larger than 256 and might work in environments without XS.
Neil Bowers has conducted a 3rd party review <http://neilb.org/reviews/passwords.html> of various
token/password generation modules including Session::Token.
Leo Zovic has created a Common Lisp implementation of session-token
<https://github.com/Inaimathi/session-token>.
AUTHOR
Doug Hoyte, "<doug@hcsw.org>"
COPYRIGHT & LICENSE
Copyright 2012-2016 Doug Hoyte.
This module is licensed under the same terms as perl itself.
ISAAC code:
By Bob Jenkins. My random number generator, ISAAC. Public Domain
perl v5.40.0 2024-10-20 Session::Token(3pm)