DATA SIMPLIFICATION: The Many Uses of Random Number Generators

Over the next few weeks, I will be writing on topics related to my latest book, Data Simplification: Taming Information With Open Source Tools (release date March 17, 2016). I hope I can convince you that this is a book worth reading. Blog readers can use the discount code COMP315 for a 30% discount at checkout.

If you are among the many students and professionals who are intimidated by statistics, then fear no more! With a little imagination, random number generators (to be accurate, pseudorandom number generators) can substitute for a wide range of statistical methods. As it happens, modern computers can perform two simple processes easily and very quickly: 1) generating random numbers, and 2) repeating sets of instructions thousands or millions of times. Using these two computational steps, we can estimate, with good accuracy, outcomes that would be difficult or intractable to work out by direct mathematical analysis. You are about to be rewarded with simple methods whereby every statistical test can be replicated and every probabilistic dilemma can be resolved, usually with a few lines of code (1-5).

To begin, let's perform a few very simple simulations that confirm what we already know intuitively. Imagine that you have a pair of dice, and you would like to know how often you might expect each of the numbers (from one to six) to appear after you've thrown one die (5). Let's simulate 600,000 throws of a die, using the Perl script, randtest.pl:

#!/usr/bin/perl ...
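
The listing for randtest.pl is cut off in this excerpt, so here is a rough stand-in for the same experiment. It is not the author's script: it is written in Python rather than Perl, and the function name simulate_die_throws and the optional seed argument are assumptions made for illustration. It uses only the two processes described above, generating pseudorandom numbers and repeating a step many times.

```python
import random
from collections import Counter


def simulate_die_throws(throws=600_000, seed=None):
    """Throw a fair six-sided die `throws` times and count each face.

    Hypothetical helper for illustration; not taken from randtest.pl.
    """
    # A private generator, so passing a seed makes the run repeatable
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(throws))
    # Report every face, even one that never came up in a very short run
    return {face: counts.get(face, 0) for face in range(1, 7)}


if __name__ == "__main__":
    throws = 600_000
    for face, count in simulate_die_throws(throws, seed=1).items():
        print(f"{face}: {count} ({count / throws:.4f})")
```

With a fair die, each face should come up in about one sixth of the throws, so each count should land close to 100,000, typically within a few hundred. The exact numbers depend on the seed and on the generator.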
Source: Specified Life
Category: Information Technology
Tags: computer science, data analysis, data repurposing, data simplification, Monte Carlo, probability, pseudorandom, resampling, simplifying data, simulations
Source Type: blogs