In statistics, a likelihood-ratio test (LR test) is a statistical test used for comparing the goodness of fit of two statistical models—a null model (representing the null hypothesis) against an alternative model (representing an alternative hypothesis). The test is based on the ratio of the likelihoods of the two models; the ratio expresses how many times more likely the data are under one model than the other. This likelihood ratio, or equivalently its logarithm, can be used to compute a p-value, or compared to a critical value to decide whether to reject the null model.
When the two models being compared each have no unknown parameters, use of the likelihood-ratio test can be justified by the Neyman–Pearson lemma. The lemma demonstrates that the test has the highest power among all tests of the same significance level.
Suppose that we have a statistical model with parameter space $\Theta$. A null hypothesis is often stated by saying that the parameter $\theta$ is in a specified subset $\Theta_0$ of $\Theta$. The alternative hypothesis is thus that $\theta$ is in the complement of $\Theta_0$, i.e. in $\Theta \setminus \Theta_0$, which is denoted by $\Theta_0^{c}$.
The likelihood function is $L(\theta \mid x) = f(x \mid \theta)$ (where $f$ is the probability density function or probability mass function) viewed as a function of the parameter $\theta$, with $x$ held fixed at the value that was actually observed, i.e. the data. The likelihood-ratio test statistic, which is often denoted by $\Lambda$ (the capital Greek letter lambda), is as follows.
$$\Lambda(x) = \frac{\sup\{L(\theta \mid x) : \theta \in \Theta_0\}}{\sup\{L(\theta \mid x) : \theta \in \Theta\}}$$
Here, the $\sup$ notation refers to the supremum function.
A likelihood-ratio test is any test with critical region (or rejection region) of the form $\{x \mid \Lambda(x) \le c\}$, where $c$ is any number satisfying $0 \le c \le 1$.
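As a minimal sketch of the definition above, the following Python computes the ratio of the two suprema for a model that is not part of the original text and is assumed here purely for illustration: independent Bernoulli trials with null hypothesis p = p0 (with 0 < p0 < 1). The function names and the threshold `c = 0.2` are hypothetical choices, not values taken from any reference.

```python
import math

def bernoulli_log_likelihood(p, successes, n):
    """Log-likelihood of observing `successes` out of `n` Bernoulli(p) trials."""
    # Terms with a zero count are skipped, which avoids evaluating log(0).
    ll = 0.0
    if successes > 0:
        ll += successes * math.log(p)
    if n - successes > 0:
        ll += (n - successes) * math.log(1.0 - p)
    return ll

def likelihood_ratio(successes, n, p0=0.5):
    """Ratio of the supremum over the null set {p0} to the supremum over [0, 1]."""
    p_hat = successes / n  # maximum-likelihood estimate over the full space
    log_num = bernoulli_log_likelihood(p0, successes, n)
    log_den = bernoulli_log_likelihood(p_hat, successes, n)
    return math.exp(log_num - log_den)

lam = likelihood_ratio(successes=14, n=20)
c = 0.2  # illustrative threshold only; in practice c is set by the significance level
print(f"Lambda = {lam:.4f}; in critical region: {lam <= c}")
```

Working with log-likelihoods and exponentiating only at the end keeps the computation stable when the likelihoods themselves are very small.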
The likelihood-ratio test requires that the models be nested—i.e. the more complex model can be transformed into the simpler model by imposing constraints on the former's parameters. Many common test statistics are tests for nested models and can be phrased as log-likelihood ratios or approximations thereof: e.g. the Z-test, the F-test, the G-test, and Pearson's chi-squared test; for an illustration with the one-sample t-test, see below.
If the models are not nested, then instead of the likelihood-ratio test, there is a generalization of the test that can usually be used: for details, see relative likelihood.
Case of simple hypotheses
A simple-vs.-simple hypothesis test has completely specified models under both the null hypothesis and the alternative hypothesis, which for convenience are written in terms of fixed values of a notional parameter $\theta$:
$$H_0 : \theta = \theta_0, \qquad H_1 : \theta = \theta_1.$$
In this case, under either hypothesis, the distribution of the data is fully specified: there are no unknown parameters to estimate. For this case, a variant of the likelihood-ratio test is available:
$$\Lambda(x) = \frac{L(\theta_0 \mid x)}{L(\theta_1 \mid x)}.$$
(Some older references may use the reciprocal of the function above as the definition.) Thus, the likelihood ratio is small if the alternative model is better than the null model.
The likelihood-ratio test provides the decision rule as follows:
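- If $\Lambda > c$, do not reject $H_0$;
- If $\Lambda < c$, reject $H_0$;
- Reject $H_0$ with probability $q$ if $\Lambda = c$.
The values $c$ and $q$ are usually chosen to obtain a specified significance level $\alpha$, via the relation
$$q \cdot P(\Lambda = c \mid H_0) + P(\Lambda < c \mid H_0) = \alpha.$$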
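The choice of the threshold and the randomization probability can be sketched in code. The example below assumes a setting that does not appear in the original text: a binomial count from n trials, with success probability p0 under the null hypothesis and p1 under the alternative. The helper names `binom_pmf` and `randomized_lr_test` are hypothetical. The sketch lists every outcome's likelihood ratio with its null probability, then accumulates probability from the smallest ratio upward until the significance level is reached.

```python
import math

def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def randomized_lr_test(n, p0, p1, alpha):
    """Find (c, q) with q * P(Lambda = c | H0) + P(Lambda < c | H0) = alpha.

    Assumes 0 < p0, p1 < 1 and p0 != p1, so each outcome has its own ratio.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Pair each outcome's likelihood ratio with its probability under H0,
    # sorted so that the smallest ratios (strongest evidence against H0) come first.
    table = sorted(
        (binom_pmf(k, n, p0) / binom_pmf(k, n, p1), binom_pmf(k, n, p0))
        for k in range(n + 1)
    )
    below = 0.0  # running total of P(Lambda < c | H0)
    for c, prob_at_c in table:
        if below + prob_at_c >= alpha:
            # Randomize on the boundary to use up the remaining size exactly.
            q = (alpha - below) / prob_at_c
            return c, q
        below += prob_at_c
    raise ValueError("no threshold found; check the inputs")

c, q = randomized_lr_test(n=10, p0=0.5, p1=0.7, alpha=0.05)
print(f"c = {c:.4f}, q = {q:.4f}")
```

With these illustrative inputs the test rejects outright for 9 or 10 successes and rejects with probability of about 0.89 when exactly 8 successes are observed. Randomization is needed only because the statistic is discrete; for continuous data the boundary event has probability zero.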
The likelihood ratio is a function of the data $x$; therefore, it is a statistic. The likelihood-ratio test rejects the null hypothesis if the value of this statistic is too small. How small is too small depends on the significance level of the test, i.e. on what probability of Type I error is considered tolerable (Type I errors consist of the rejection of a null hypothesis that is true).
The numerator corresponds to the likelihood of an observed outcome under the null hypothesis. The denominator corresponds to the maximum likelihood of an observed outcome, varying parameters over the whole parameter space. Because the null parameter set is contained in the whole parameter space, the numerator can never exceed the denominator, so the likelihood ratio lies between 0 and 1. Low values of the likelihood ratio mean that the observed result was much less likely to occur under the null hypothesis than under the alternative. High values of the statistic mean that the observed outcome was nearly as likely to occur under the null hypothesis as under the alternative, and so the null hypothesis cannot be rejected.
The following example is adapted and abridged from Stuart et al. (1999, §22.2).
Suppose that we have a random sample, of size n, from a population that is normally distributed. Both the mean, μ, and the standard deviation, σ, of the population are unknown. We want to test whether the mean is equal to a given value, μ0.
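Thus, our null hypothesis is H0: μ = μ0 and our alternative hypothesis is H1: μ ≠ μ0. The likelihood function is
$$L(\mu, \sigma \mid x) = \left(2\pi\sigma^2\right)^{-n/2} \exp\left(-\sum_{i=1}^{n} \frac{(x_i - \mu)^2}{2\sigma^2}\right).$$
With some calculation (omitted here), it can then be shown that
$$\Lambda = \left(1 + \frac{t^2}{n-1}\right)^{-n/2},$$
where t is the t-statistic with n − 1 degrees of freedom. Hence we may use the known exact distribution of t to draw inferences.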
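The relationship between the likelihood ratio and the t-statistic can be checked numerically. The sketch below computes the ratio directly from the maximised normal likelihoods and compares it with the closed form in terms of t. The sample values are made up for illustration, and the function name `normal_lr_and_t` is hypothetical.

```python
import math

def normal_lr_and_t(sample, mu0):
    """Return (Lambda, t) for H0: mu = mu0 with unknown sigma, normal data."""
    n = len(sample)
    mean = sum(sample) / n
    ss_mean = sum((x - mean) ** 2 for x in sample)  # sum of squares about the sample mean
    ss_mu0 = sum((x - mu0) ** 2 for x in sample)    # sum of squares about mu0
    # Maximising over sigma gives sigma^2 = SS / n in each case, and the maximised
    # likelihood is proportional to SS^(-n/2), so the constants cancel in the ratio.
    lam = (ss_mu0 / ss_mean) ** (-n / 2)
    s = math.sqrt(ss_mean / (n - 1))                # sample standard deviation
    t = (mean - mu0) / (s / math.sqrt(n))           # one-sample t-statistic
    return lam, t

sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.9]  # made-up data
n = len(sample)
lam, t = normal_lr_and_t(sample, mu0=5.0)
closed_form = (1 + t**2 / (n - 1)) ** (-n / 2)
print(lam, closed_form)
```

The two printed values agree up to floating-point rounding. Because the ratio is a decreasing function of t squared, rejecting for small values of the ratio is equivalent to rejecting for large absolute values of t.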
Asymptotic distribution: Wilks’ theorem
If the distribution of the likelihood ratio corresponding to a particular null and alternative hypothesis can be explicitly determined then it can directly be used to form decision regions (to sustain or reject the null hypothesis). In most cases, however, the exact distribution of the likelihood ratio corresponding to specific hypotheses is very difficult to determine.
Assuming H0 is true, there is a fundamental result by Samuel S. Wilks: as the sample size $n$ approaches $\infty$, the test statistic $-2\log(\Lambda)$ asymptotically will be chi-squared distributed ($\chi^2$) with degrees of freedom equal to the difference in dimensionality of $\Theta$ and $\Theta_0$. This implies that for a great variety of hypotheses, we can calculate the likelihood ratio $\Lambda$ for the data and then compare $-2\log(\Lambda)$ to the $\chi^2$ value corresponding to a desired statistical significance as an approximate statistical test. Other extensions exist.
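A minimal sketch of this approximate test follows, under assumptions that are not in the original text: the data are independent Poisson counts and the null hypothesis fixes the rate at a positive value, so the full and null parameter spaces differ by one dimension. The function name `wilks_test_poisson` and the counts are hypothetical. For one degree of freedom the chi-squared tail probability can be written with the complementary error function, which avoids any dependency outside the standard library.

```python
import math

def wilks_test_poisson(counts, rate0):
    """Return (-2 log Lambda, approximate p-value) for H0: rate = rate0 (rate0 > 0)."""
    n = len(counts)
    total = sum(counts)
    rate_hat = total / n  # maximum-likelihood estimate of the rate
    # The factorial terms of the Poisson likelihood cancel in the ratio.
    if total == 0:
        stat = 2.0 * n * rate0
    else:
        stat = 2.0 * (total * math.log(rate_hat / rate0) - n * (rate_hat - rate0))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # Chi-squared survival function with 1 degree of freedom:
    # P(Z^2 > s) = erfc(sqrt(s / 2)) for a standard normal Z.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

counts = [3, 5, 4, 6, 2, 7, 5, 4, 6, 8]  # made-up data
stat, p_value = wilks_test_poisson(counts, rate0=3.0)
print(f"-2 log Lambda = {stat:.3f}, approximate p-value = {p_value:.5f}")
```

The p-value is only as good as the asymptotic approximation. With ten observations it should be read as a rough guide, not an exact probability.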
- Casella, George; Berger, R. L. (2002), Statistical Inference (Second ed.), Wadsworth Publishing, ISBN 0-534-24312-6.
- Cox, D. R.; Hinkley, D. V. (1974), Theoretical Statistics, Chapman & Hall, ISBN 0-412-12420-3.
- Mood, A. M.; Graybill, F. A.; Boes, D. C. (1974), Introduction to the Theory of Statistics (3rd ed.), McGraw-Hill.
- Neyman, J.; Pearson, E. S. (1933), "On the problem of the most efficient tests of statistical hypotheses" (PDF), Philosophical Transactions of the Royal Society of London A, 231 (694–706): 289–337, Bibcode:1933RSPTA.231..289N, doi:10.1098/rsta.1933.0009, JSTOR 91247.
- Stuart, A.; Ord, K.; Arnold, S. (1999), Kendall's Advanced Theory of Statistics, 2A, Arnold.
- Wilks, S. S. (1938), "The large-sample distribution of the likelihood ratio for testing composite hypotheses", Annals of Mathematical Statistics, 9: 60–62, doi:10.1214/aoms/1177732360.
- Glover, Scott; Dixon, Peter (2004), "Likelihood ratios: A simple and flexible statistic for empirical psychologists", Psychonomic Bulletin & Review, 11 (5): 791–806, doi:10.3758/BF03196706.
- Held, Leonhard; Sabanés Bové, Daniel (2014), Applied Statistical Inference—Likelihood and Bayes, Springer.
- Kalbfleisch, J. G. (1985), Probability and Statistical Inference, 2, Springer-Verlag.
- Perlman, Michael D.; Wu, Lang (1999), "The emperor's new tests", Statistical Science, 14 (4): 355–381, doi:10.1214/ss/1009212517.
- Perneger, Thomas V. (2001), "Sifting the evidence: Likelihood ratios are alternatives to P values", The BMJ, 322 (7295): 1184–5, PMC 1120301, PMID 11379590.
- Pinheiro, José C.; Bates, Douglas M. (2000), Mixed-Effects Models in S and S-PLUS, Springer-Verlag, pp. 82–93.
- Solomon, Daniel L. (1975), "A note on the non-equivalence of the Neyman-Pearson and generalized likelihood ratio tests for testing a simple null versus a simple alternative hypothesis", The American Statistician, 29 (2): 101–102, doi:10.1080/00031305.1975.10477383.