Validity in Test Administration | TESL Issues

Validity

The concept of construct validity was introduced by Cronbach and Meehl (1955).

Messick changed the way in which we understand validity. He described it as an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. On this view, validity is a unitary concept: it is the degree to which all the accumulated evidence supports the intended interpretation of test scores for the proposed purpose.

Validity is the most important consideration in test evaluation. The concept refers to the appropriateness, meaningfulness, and usefulness of the specific inferences made from test scores. Test validation is the process of accumulating evidence to support such inferences. A variety of inferences may be made from the scores produced by a given test, and there are many ways of accumulating evidence to support any particular inference.

In research design, the two main types of validity are internal and external validity. In testing and assessment, validity has traditionally been understood as the question of whether a test accurately measures what it is intended to measure (Hughes, 2003).

Early validity theory distinguished three types of validity: (1) criterion-oriented validity, (2) content validity, and (3) construct validity (Cronbach & Meehl, 1955).

Validity implies considerations of social responsibility, both to the candidate (protecting him or her against unfair exclusion) and to the receiving institution and those whose quality of health care will depend in part on the adequacy of the candidate’s communicative skill. In short, validity addresses issues of fairness, which have social implications.

Bachman (1990) presented validity as a unitary concept, requiring evidence to support the inferences that we make on the basis of test scores. Messick’s way of looking at validity has become the accepted paradigm in psychological, educational and language testing.

It has become accepted that the more tasks and items a test contains, the more reliable and valid it is likely to be, provided that the items are independent of one another (Fulcher & Davidson, 2007). If the response a learner makes to one item is influenced, or even dictated, by a response to another item, that item carries less unique information and adds little to the test.
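The link between test length and reliability can be illustrated with the Spearman-Brown prediction formula from classical test theory. This formula is not named in the sources cited here, so the sketch below is an illustration under stated assumptions: the added items are parallel to the existing ones and independent of them, and the function name `spearman_brown` and the starting reliability of 0.60 are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after changing test length by `length_factor`.

    Assumes the added items are parallel to, and independent of, the
    original items. If items depend on one another, the real gain is
    likely to be smaller than this prediction.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability
    )


if __name__ == "__main__":
    # Hypothetical test with reliability 0.60, then doubled and quadrupled.
    for k in (1, 2, 4):
        print(k, round(spearman_brown(0.60, k), 3))
```

Under these assumptions, doubling a test with a reliability of 0.60 gives a predicted reliability of 0.75, and the gain from each further doubling shrinks. The formula says nothing about validity itself: a longer test is more valid only if the added items still measure the intended construct.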

Validity theory in large-scale testing is centrally concerned with the generalisability of score meaning: to what extent is a particular test score meaningful beyond the specific context of the test that generated it? Part of this is the notion of reliability, or consistency (Fulcher & Davidson, 2007).

References

  1. Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.
  2. Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
  3. Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced resource book. Oxon: Routledge.
  4. Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
