Posted on 2007-04-05 06:23:54-07 by dtlavelle
Problem with acquiring random seeds
I wrote a script using Math::Random::MT::Auto to acquire random seeds from the internet and place them into a table within a MySQL database. Here is my simple schema:
CREATE TABLE randseed32 (
    randid   bigint NOT NULL auto_increment,
    randseed bigint default NULL,
    KEY randid (randid)
);
Here is the small perl script:
use Math::Random::MT::Auto qw(rand rn_info get_seed set_seed get_state set_state);

# $dbh is an already-connected DBI handle to the MySQL database
my @rand_seed = get_seed();
for my $rand_seed (@rand_seed) {
    # Bind the value with a placeholder instead of interpolating it into the SQL
    $dbh->do("INSERT INTO randseed32 SET randseed = ?", undef, $rand_seed);
}
When I run this script for 100 iterations, waiting 10 seconds between each system call of the script, I obtain a table of 25,000 32-bit integers. The unexpected outcome was that, out of these 25,000, there were 3093 entries with a count of 2. It would seem that the web site (http://www.randomnumbers.info/) may recycle numbers, or maybe I am completely missing something important. I made a different database and changed the random seed source to 'random_org' (http://www.random.org/). I obtain a total of 32804 32-bit integers. Of these only 20 are repeated, but each of them is repeated 49 times!
+-----------------+------------+
| count(randseed) | randseed   |
+-----------------+------------+
|              49 |  538979961 |
|              49 |  543515987 |
|              49 |  543516788 |
|              49 |  543584032 |
|              49 |  544370534 |
|              49 |  544437353 |
|              49 |  544567129 |
|              49 | 1633972084 |
|              49 | 1634738273 |
|              49 | 1635020661 |
|              49 | 1646292335 |
|              49 | 1679848047 |
|              49 | 1684955506 |
|              49 | 1702065440 |
|              49 | 1702257000 |
|              49 | 1713399143 |
|              49 | 1767994469 |
|              49 | 1870209124 |
|              49 | 1897951861 |
|              49 | 1953461617 |
+-----------------+------------+
20 rows in set (0.00 sec)
This seems to be the result of exhausting the daily limit from random.org, which is currently set at 1,000,000 bits. I saw no errors dumped to the screen while running the script. However, it would seem that the 49 duplicates of the 20 integers come from seeds built from the same numerical inputs (is this from the time and PID of the master script used to call the above script every 10 seconds?). I do not see repeated numbers if I draw 100 iterations of seeds from /dev/urandom. Am I completely missing something and doing something blatantly wrong? Have others seen this as well?

I now have a couple of questions.

1) Does it make any sense to seed an MT with pseudorandom numbers from /dev/urandom?

2) Is an MT pseudorandom number generator better than the standard Perl rand() function when using only one 32-bit seed value?

I have a project for which I would like to generate about 20 million random numbers, and the MT algorithm may be a bit overkill. Any hints or suggestions would be greatly appreciated. Thank you.
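One way to probe where repeated values like those in the table come from is to look at their raw bytes. The sketch below is a diagnostic aid only, not part of Math::Random::MT::Auto; the helper names int_to_bytes and looks_like_text are made up for illustration, and little-endian byte order is an assumption. It unpacks four of the repeated integers and checks whether every byte is printable ASCII.

```perl
use strict;
use warnings;

# Turn a 32-bit unsigned integer into its four bytes, least significant
# byte first ('V' is the little-endian unsigned 32-bit pack format).
# Little-endian order is an assumption about how the values were read.
sub int_to_bytes {
    my ($value) = @_;
    return pack('V', $value);
}

# Return 1 if every byte is printable ASCII (space through tilde), else 0.
sub looks_like_text {
    my ($bytes) = @_;
    return $bytes =~ /\A[\x20-\x7E]+\z/ ? 1 : 0;
}

# Four of the values that were each repeated 49 times in the table.
for my $seed (543515987, 543516788, 544370534, 1635020661) {
    my $bytes = int_to_bytes($seed);
    printf "%10d => '%s' (%s)\n", $seed, $bytes,
        looks_like_text($bytes) ? 'printable ASCII' : 'binary';
}
```

These four values decode to 'See ', 'the ', 'for ' and 'uota'. That may indicate that a text reply, rather than random bytes, ended up in the seed once the quota ran out. This is an inference from the numbers alone and is not confirmed anywhere in this thread.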
Posted on 2007-04-09 14:51:22-07 by jdhedden in response to 4766
Re: Problem with acquiring random seeds
While MRMA does try to get a full seed array from random sources, it is not always guaranteed to do so. If a full seed is not acquired, a warning is issued, and the values in the seed are repeated to fill up the PRNG. In such a case, get_seed returns a full array with some duplicate values. For instance, when no sources are available, MRMA will use the PID and time as a partial seed, and then repeat them over and over until the PRNG is fully seeded. get_seed will then return (PID, time, PID, time, PID, time, ...). This phenomenon accounts for the repeated values in your table.
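The effect of repeating a partial seed can be imitated in a few lines of plain Perl. This is only a sketch of the idea, not MRMA's actual code: fill_seed and count_values are hypothetical helpers, the PID and time values are invented, and 624 (the MT19937 state size in 32-bit words) is assumed as the full seed length.

```perl
use strict;
use warnings;

# Build a seed of $want words by cycling through a shorter partial seed.
sub fill_seed {
    my ($want, @partial) = @_;
    die "partial seed must not be empty\n" unless @partial;
    return map { $partial[$_ % @partial] } 0 .. $want - 1;
}

# Count how many times each value occurs in a list.
sub count_values {
    my %count;
    $count{$_}++ for @_;
    return %count;
}

my @partial = (12345, 1175779434);    # hypothetical PID and epoch time
my @seed    = fill_seed(624, @partial);
my %count   = count_values(@seed);

printf "%d words, %d distinct values\n", scalar(@seed), scalar(keys %count);
printf "%d occurs %d times\n", $_, $count{$_} for sort { $a <=> $b } keys %count;
```

With a two-word partial seed, each value fills half of the 624 slots, so storing such a seed in a table yields heavily duplicated rows.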

For this reason, I don't recommend saving seeds as a source of random numbers.

That being said, you can guard against partial seeds by trapping warnings when doing srand, and then skipping the seed:

use Math::Random::MT::Auto ':!auto';

# Collect warnings instead of printing them
my @WARN;
$SIG{'__WARN__'} = sub { push(@WARN, @_); };

my $prng = Math::Random::MT::Auto->new();
for (1..50) {    # This will exhaust your daily quota
    @WARN = ();  # Discard warnings from earlier calls
    $prng->srand('random_org');
    if (@WARN) {
        print('Skipping seed: ', @WARN);
    } else {
        my @seed = $prng->get_seed();
        # Store seed in database
    }
    sleep(10);
}
You also asked about using /dev/urandom. This device draws from /dev/random until such time (if ever) that /dev/random is exhausted. Then it falls back to a PRNG to fill the rest of the request. So using /dev/urandom is a good idea if you're not using it too frequently.
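For comparison, 32-bit values can also be read straight from the device without going through MRMA. The following is a minimal sketch, assuming a Unix-like system where /dev/urandom exists; read_words is a hypothetical helper name, not a function of any module mentioned here.

```perl
use strict;
use warnings;

# Read up to $count 32-bit unsigned integers from a device or file.
sub read_words {
    my ($path, $count) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!\n";
    my $need = 4 * $count;
    my $buf  = '';
    while (length($buf) < $need) {
        # Append to $buf at its current end until enough bytes have arrived
        my $got = read($fh, $buf, $need - length($buf), length($buf));
        die "Read from $path failed: $!\n" unless defined $got;
        last if $got == 0;    # end of file: return a short list
    }
    close($fh);
    # Unpack only whole 4-byte groups ('L' is a native unsigned 32-bit value)
    return unpack('L*', substr($buf, 0, 4 * int(length($buf) / 4)));
}

my @words = read_words('/dev/urandom', 8);
print join(' ', @words), "\n";
```

Passing '/dev/random' instead reads from the blocking device, which may stall until enough entropy is available.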

If you want to guarantee a fully random seed, however, then use /dev/random. If you need a full seed, then trap warnings as shown above and use:
$prng->srand('/dev/random');
Is MRMA better than rand() even if using just one 32-bit seed? Yes, because MRMA can produce a longer sequence before repeating. (I think I give some idea of this in the POD, but maybe not.)