Talk:Generalized p-value

Issues

This article has a number of issues:

It does not define what a generalized p-value is, or how it differs from a regular p-value.

It should provide a concrete example of a generalized p-value.

The article (and the subject it discusses) seems to derive entirely from the work of one individual. The lack of independent interest in this topic raises questions about whether it belongs in Wikipedia.

The article does not provide a balanced view of the subject. When the sample size is small, the uncertainty in the model specification is as important as, or more important than, the uncertainty in the parameter estimates (since there is not enough data to check the model specification). Thus, achieving exact p-values in this sense seems to address the wrong question.

The article seems largely geared toward promoting the methods of the XPro software package.

Skbkekas (talk) 15:43, 2 December 2009 (UTC)

I've added a proper introductory sentence, telling the reader in the first two words that statistics is the topic, and setting the title phrase in bold. Putting the topic at the beginning justified removing the "context" tag, and I've done so.

I agree with "Skbkekas"'s first two bullet points. His fourth bullet point seems to make a substantial point, maybe worth mentioning in the article.

Concerning the fifth bullet point, one could delete the software promotion. Michael Hardy (talk) 21:53, 8 December 2009 (UTC)

My view, on having had a brief look at the publications, is that these are extremely unclear about what is going on or what the authors are trying to do, but that something important may well be going on. Has anyone done a search for other people citing the work?

I don't think "Skbkekas" is quite right. The point being made is that methods such as maximum likelihood lead to tests with acceptable properties in a general sense because of their asymptotic behaviour, but that, strictly, the probabilities of the test statistic exceeding the critical points are not only approximate but also dependent on the unknown values of nuisance parameters. I think the authors claim to have a method for assigning a value called "p", and presumably this assignment has an interpretation that makes sense in inference terms, although it would differ from the interpretation of a p-value ... and that this somehow takes care of the nuisance parameters. Melcombe (talk) 12:19, 9 December 2009 (UTC)

Concerning All Points User: sweeraha August 22, 2010

I have cleaned up the whole article.
It should be noted that unlike StatXact, XPro is a free software package.
Also note that there are now hundreds of generalized p-value papers in the literature, solving problems ranging from MANOVA under unequal covariances to BLUP in mixed models, problems where MLE-based methods are known to have very poor performance. —Preceding unsigned comment added by Sweeraha (talk • contribs) 20:01, 22 August 2010 (UTC)

There have been good improvements. But it is not clear how R can be calculated in practice since its calculation appears to depend not only on unknown parameters but also on extra random variables. Are there errors in the equation or steps missing?

The user-name Sweeraha appears similar to that of one of the authors of a primary source ... as a formality it would be good to either confirm this here or say otherwise ... there are rules about this, but it is mainly a matter of just not over-selling the topic (which looks OK at present). It would also be good for the article to have some citations to the "hundreds of generalized p-value papers in the literature". Melcombe (talk) 09:17, 24 August 2010 (UTC)

Thanks for your comments, which led me to revise the original write-up by another contributor. I am the author of the original article on generalized p-values that appeared in Econometrica and the author of two books on the subject, which provide references to a large fraction of the articles you desire to see. About your question, R plays a role similar to that of a posterior distribution in Bayesian statistics. It is the probabilities (or expected values in the case of point estimation) of R that one calculates in testing and in interval estimation, and hence there is no computational issue. The freely available XPro software package computes generalized p-values for widely used methods such as ANOVA under unequal variances. —Preceding unsigned comment added by sweeraha (talk) 01:48, 5 September 2010 (UTC)

But the article has (slightly snipped) ... "the generalized test variable

$$R={\frac {{\overline {x}}S}{s\sigma }}-{\frac {{\overline {X}}-\mu }{\sigma }},$$

Note that the distribution of $R$ and its observed value are both free of nuisance parameters." But the "observed value" of R clearly depends on σ and μ (according to this snipped formula, which is the first part of the equation in the article), so how can it be "free of nuisance parameters"? If testing for ρ, one or other of σ and μ is effectively a nuisance parameter. Perhaps the article isn't saying here what it should be saying. Melcombe (talk) 16:15, 7 September 2010 (UTC)

There is no error in the article. From the first expression, the observed value of R is $\rho$, the parameter of interest, and from the second expression it is clear that the distribution of R is free of unknown parameters. Please refer to the JASA paper by Tsui and Weerahandi for details about test variables. Sweeraha (talk), 9:50, 11 September 2010 (UTC)
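The claim about the observed value can be checked by direct substitution. What follows is only a sketch, assuming that "observed value" means replacing the random quantities $\overline{X}$ and $S$ in the snipped formula by their observed values $\overline{x}$ and $s$, and that $\rho = \mu/\sigma$ as in the article's example; the lower-case $r$ is notation introduced here for that observed value.

```latex
\begin{align*}
r &= \frac{\overline{x}\, s}{s\, \sigma} - \frac{\overline{x} - \mu}{\sigma} \\
  &= \frac{\overline{x}}{\sigma} - \frac{\overline{x}}{\sigma} + \frac{\mu}{\sigma} \\
  &= \frac{\mu}{\sigma} = \rho .
\end{align*}
```

Under these assumptions the observed value is the parameter of interest itself. It involves no nuisance parameter beyond $\rho$, but it is also not a number that can be computed from the data alone, which may be where the two readings in this thread diverge.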

Whatever R is, it is not a "parameter", since it depends on the data, and it is not a "statistic" in any usual sense, since it depends on some parameters. What is here may well coincide with what appeared in JASA, but that doesn't make it either right or understandable. If you are using some non-standard meaning for the words in the article, then you need to explain that meaning. Melcombe (talk) 09:33, 13 September 2010 (UTC)
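On the earlier question of how R can be calculated in practice, here is a rough Monte Carlo sketch of what "calculating probabilities of R" could look like. It rests on assumptions not stated in this thread: a normal sample of size n, $S^2$ taken as the divide-by-n sample variance so that $nS^2/\sigma^2$ is chi-square with n − 1 degrees of freedom, and $\sqrt{n}(\overline{X}-\mu)/\sigma$ standard normal and independent of $S$. Under those assumptions R can be rewritten as $(\overline{x}/s)\sqrt{U/n} - Z/\sqrt{n}$, which involves only the observed $\overline{x}$, $s$ and n. The function names `simulate_R` and `tail_probability` are hypothetical, and which tail is relevant depends on the hypothesis being tested.

```python
import math
import random


def simulate_R(xbar, s, n, draws=20_000, seed=0):
    """Draw from the distribution of R given the observed xbar and s.

    Assumption (not from the talk page): normal data, S^2 is the
    divide-by-n sample variance, so U = n*S^2/sigma^2 ~ chi-square(n-1)
    and Z = sqrt(n)*(Xbar - mu)/sigma ~ N(0, 1), independently.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        # chi-square(k) is Gamma(shape=k/2, scale=2)
        u = rng.gammavariate((n - 1) / 2.0, 2.0)
        out.append((xbar / s) * math.sqrt(u / n) - z / math.sqrt(n))
    return out


def tail_probability(xbar, s, n, rho0, draws=20_000, seed=0):
    """Monte Carlo estimate of Pr(R >= rho0) for the observed xbar, s, n."""
    r = simulate_R(xbar, s, n, draws=draws, seed=seed)
    return sum(1 for value in r if value >= rho0) / len(r)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any source.
    print(tail_probability(xbar=5.0, s=2.0, n=15, rho0=2.0))
```

Neither μ nor σ appears as an input, which is the sense in which the distribution of R would be free of unknown parameters under the stated assumptions.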

Another possible issue that someone should check: in the given example, the coefficient of variation is defined as $\rho ={\frac {\mu }{\sigma }}$, while its actual definition should be $\rho ={\frac {\sigma }{\mu }}$. Alternatively, maybe the author meant "signal-to-noise ratio", which can be expressed as ${\frac {\mu }{\sigma }}$. — Preceding unsigned comment added by 5.249.40.171 (talk) 07:38, 9 October 2014 (UTC)
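To illustrate the distinction raised here, a small sketch of the two sample quantities side by side. The data are made up for illustration, the function names are hypothetical, and the sample standard deviation (divide by n − 1) is used as the estimate of σ.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample estimate of sigma / mu."""
    return statistics.stdev(sample) / statistics.mean(sample)


def signal_to_noise(sample):
    """Sample estimate of mu / sigma, the reciprocal of the above."""
    return statistics.mean(sample) / statistics.stdev(sample)


# Made-up measurements, for illustration only.
sample = [9.8, 10.4, 10.1, 9.6, 10.6]
cv = coefficient_of_variation(sample)
snr = signal_to_noise(sample)

if __name__ == "__main__":
    print(cv, snr)
```

The two are reciprocals of each other, so a test stated for one can be restated for the other, but the label in the article should match the ratio actually used.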