Sunday, October 12, 2014

The Reciprocal Normal Distribution

A recent question on OR-Exchange dealt with the reciprocal normal distribution. Specifically, if $k$ is a constant and $X$ is a Gaussian random variable, the distribution of $Y=k/X$ is reciprocal normal. The poster had questions about approximating the distribution of $Y$ with a Gaussian (normal) distribution.

This gave me a reason (excuse?) to tackle something on my to-do list: learning to use Shiny to create an interactive document containing statistical analysis (or at least statistical mumbo-jumbo). I won't repeat the full discussion here, but instead will link the Shiny document I created. It lets you tweak settings for an example of a reciprocal normal variable and judge for yourself how well various normal approximations fit. I'll just make a few short observations here:
  • No way does $Y$ actually have a normal distribution.
  • Dividing by $X$ suggests that you probably should be using a distribution with bounded support (e.g., a truncated normal distribution) for $X$. In particular, the original question had $X$ as the speed of something, $k$ as the (fixed) distance to travel and $Y$ as the travel time. Unless the driver is fond of randomly jamming the gear shift into reverse, chances are $X$ should be nonnegative; and unless this vehicle wants to break all laws of physics, $X$ probably should have a finite upper bound (check local posted speed limits for suggestions). That said, I yield to the tendency of academics to prefer tractable/well-known approximations (e.g., normal) over realistic ones.
  • The coefficient of variation of $X$ will be a key factor in determining whether approximating the distribution of $Y$ with a normal distribution is "good enough for government work". The smaller the coefficient of variation, the less likely it is that $X$ wanders near zero, where bad things happen. In particular, the less likely it is that $X$ gets anywhere near zero, the less skewness $Y$ suffers.
  • There is no one obvious way to pick parameters (mean and standard deviation) for a normal approximation to $Y$. I've suggested a few in the Shiny application, and you can try them to see their effect.
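For readers who want to poke at the coefficient-of-variation effect without opening the Shiny document, here is a minimal simulation sketch. It is written in Python (standard library only) rather than the R used for the Shiny document, and everything in it is an illustrative assumption: the function names, the sample size, the example values of $k$, $\mu$ and $\sigma$, and the choice of the first-order (delta method) approximation, mean $k/\mu$ and standard deviation $k\sigma/\mu^2$, as one candidate parameter pair. That pair is not necessarily among the options offered in the Shiny application. The sketch also uses an untruncated normal for $X$, so it is only sensible when $\sigma/\mu$ is small enough that draws near zero are vanishingly rare.

```python
import random
import statistics


def simulate_reciprocal(k, mu, sigma, n=100_000, seed=0):
    """Draw n samples of Y = k / X, with X ~ Normal(mu, sigma).

    Assumes sigma/mu is small, so X essentially never lands near zero.
    """
    rng = random.Random(seed)
    return [k / rng.gauss(mu, sigma) for _ in range(n)]


def sample_skewness(y):
    """Sample skewness: the mean of the cubed standardized values."""
    m = statistics.fmean(y)
    s = statistics.pstdev(y, mu=m)
    return statistics.fmean(((v - m) / s) ** 3 for v in y)


def delta_method_params(k, mu, sigma):
    """One candidate normal approximation (an assumption, not the
    only choice): first-order expansion of k/X around X = mu."""
    return k / mu, k * sigma / mu ** 2


if __name__ == "__main__":
    k, mu = 100.0, 50.0  # hypothetical distance and mean speed
    for cv in (0.05, 0.10, 0.20):
        sigma = cv * mu
        y = simulate_reciprocal(k, mu, sigma)
        approx_mean, approx_sd = delta_method_params(k, mu, sigma)
        print(
            f"cv={cv:.2f}  skewness={sample_skewness(y):.3f}  "
            f"sample mean={statistics.fmean(y):.4f}  "
            f"delta-method mean={approx_mean:.4f}  sd={approx_sd:.4f}"
        )
```

Running the loop with a few values of `cv` should show the sample skewness of $Y$ growing as the coefficient of variation of $X$ grows, which is the sense in which a normal approximation gets less defensible. Swapping in a truncated normal for $X$ would be a natural next step.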
I'd also like to give a shout-out to the tools I used to generate the interactive document, and to the folks at RStudio.com for providing free hosting at ShinyApps.io. The tool chain was:
  • R (version 3.1.1) to do the computations;
  • RStudio as the IDE for development (highly recommended);
  • R Markdown as the "language" for the document;
  • Shiny to handle the interactive parts;
  • various R packages/tools to generate the final product.
It's obvious that a lot of loving effort (and probably no small amount of swearing) has gone into the development of all those tools.
