Alternative for acroread (Adobe Reader) in LINUX?

Bill Oliver vendor at billoblog.com
Sat Oct 18 01:10:15 UTC 2014


On Fri, 17 Oct 2014, Ranjan Maitra wrote:

>>
>> What I mean is that R has the capability of generating PDFs, and R has
>> the capability of calculating various goodness of fit measures, but if
>> you want to check goodness of fit measures against, say, 50 PDFs, then
>> you have to write the package.  It's easier for me to use easyfit than
>> write the package.
>
> Never having heard of "easyfit" before now, I guess I am confused as to what you mean when you say fitting a pdf. What is the form of the pdfs that you want to fit? It is very unusual to want to fit 50 different parametric pdfs, unless what you mean is something totally different. In that case, have you considered going the (nonparametric) density estimation route?
>
> Many thanks,
> Ranjan
>

Well, this isn't really a Fedora thing, but since I think it's interesting I'll impose a little longer.

Here's the problem.  Let's say you have a set of data and you want to characterize it in order to use it as the basis of a model.  In order to do that, you really need to know the underlying PDF.
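This is roughly what tools like easyfit automate. As a minimal sketch of the idea (not the author's actual workflow), here's how you might fit several candidate parametric PDFs to a sample in Python with scipy and rank them by a Kolmogorov-Smirnov goodness-of-fit statistic; the candidate list and toy data are illustrative assumptions:

```python
# Sketch: fit candidate distributions by maximum likelihood and rank
# them by the K-S statistic. Candidates and toy data are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.uniform(1.0, 5.0, size=200)  # pretend we don't know the true PDF

candidates = ["norm", "uniform", "gamma", "lognorm", "expon"]
results = []
for name in candidates:
    dist = getattr(stats, name)
    params = dist.fit(data)                      # ML fit of the parameters
    ks = stats.kstest(data, name, args=params)   # goodness of fit
    results.append((ks.statistic, name))

for stat, name in sorted(results):               # best fit first
    print(f"{name:8s} KS = {stat:.3f}")
```

The point is that once the candidates are ranked, you can actually look at which family describes the data rather than assuming gaussian from the start.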

Here are two simple examples that I've run into in the past couple of years.  I'm a forensic pathologist, and I investigate unnatural death.  One common problem in the field is abusive head trauma -- can you tell from a child's injuries that they *must* have been inflicted by another person, or could they have come from some sort of accident?

There has been a great deal of biomechanical modeling involved with this issue.  Some of these models are based on physical measurements of the amount of force it takes to fracture the skull of a child.  One very commonly cited study of this actually uses a very small data set of donated skulls.  The data is reported as if it were gaussian, but in fact if you look at it, it is a uniform distribution.

It's a uniform distribution because the investigators took one or two skulls from infants of varying ages -- so what they really measured was the change in skull properties with age.  It's as if they did a study on "average human height" and took one sample at each month of age from birth to 3 years.

In situations like this, it's important to see and understand the underlying PDF, because the data are then used *as if they were gaussian* to create biomechanical models.  And that's wrong -- it's wrong to apply the "average" and "standard deviation" of the heights of people from birth to 3 years as the supposed "average" height of a newborn baby.  If you look at the distribution, the error becomes obvious.
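The height analogy is easy to put in numbers. A toy sketch (all figures hypothetical, assuming simple linear growth) showing how the pooled "mean and SD" of an age-dependent trait misrepresents any single age:

```python
# Sketch with made-up numbers: one subject per month of age, a trait
# that grows with age, and the misleading pooled "mean +/- SD".
import numpy as np

ages_months = np.arange(0, 37)          # birth to 3 years, one subject each
height_cm = 50 + 1.2 * ages_months      # hypothetical linear growth

pooled_mean = height_cm.mean()
pooled_sd = height_cm.std()
newborn = height_cm[0]

print(f"pooled mean = {pooled_mean:.1f} cm, SD = {pooled_sd:.1f} cm")
print(f"actual newborn height = {newborn:.1f} cm")
# The pooled sample is spread roughly uniformly across the growth range,
# so the pooled "average" lands near the 18-month value -- nowhere near
# a newborn -- and the SD describes growth, not individual variation.
```

The same artifact is what makes the pooled skull data look uniform rather than gaussian.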

A second example occurred when a group attempted to apply Benford's Law to look for bias in manner of death determination in forensic death investigation.  The investigators looked at the number of homicides, suicides, accidents, and natural deaths in their jurisdiction each month over a period of a couple of years, and it *seemed* as if it followed Benford's Law.

However, it was an artifact of their workload.  My office has about twice their caseload, so the counts in my data are roughly doubled and the leading digits shift accordingly.  The underlying distribution is really a pretty simple gaussian, and gaussian distributions tend not to follow Benford's Law.  Thus, knowing that the distribution fits a normal well is an argument that manner determination should *not* follow Benford's Law.  If, however, the data fit something like the gamma distribution well, it would *not* argue against the applicability of Benford's Law.
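You can see the scale artifact directly. A hedged sketch (all counts invented): first-digit frequencies of gaussian-distributed monthly counts, before and after doubling the caseload -- Benford-conforming data is scale-invariant, gaussian counts are not:

```python
# Sketch: leading digits of gaussian monthly case counts at two
# workloads, versus Benford's Law. All numbers are made up.
import numpy as np

def first_digit_freqs(counts):
    """Fraction of values whose leading digit is d, for d = 1..9."""
    digits = np.array([int(str(c)[0]) for c in counts])
    return np.array([(digits == d).mean() for d in range(1, 10)])

benford = np.log10(1 + 1 / np.arange(1, 10))  # P(d) = log10(1 + 1/d)

rng = np.random.default_rng(0)
monthly = rng.normal(25, 5, size=240).round().astype(int)  # ~25 cases/month
monthly = np.clip(monthly, 1, None)

f_small = first_digit_freqs(monthly)
f_double = first_digit_freqs(2 * monthly)  # an office with twice the workload

# Doubling the workload moves the mass from leading digit 2 toward
# digits 4-6: the first-digit distribution depends on scale, so any
# resemblance to Benford at one particular workload is an artifact.
print("Benford :", benford.round(2))
print("~25/mo  :", f_small.round(2))
print("~50/mo  :", f_double.round(2))
```

If the counts genuinely followed Benford's Law, the two rows would look the same.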


billo

