Validity in Test Administration | TESL Issues


The two scholars who proposed the concept of validity were Cronbach and Meehl (1950s).

Messick changed the way in which we understand validity. He described validity as "an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores and other modes of assessment" (Messick, 1989, p. 13).

Validity is a unitary concept. That is, it is the degree to which all the accumulated evidence supports the same intended interpretation of test scores for the proposed purpose (AERA, 1999).

AERA (1985, p. 9): Validity is the most important consideration in test evaluation. The concept refers to the appropriateness, meaningfulness, and usefulness of the specific inferences made from test scores. Test evaluation is the process of accumulating evidence to support such inferences. A variety of inferences may be made from scores produced by a given test, and there are many ways of accumulating evidence to support any particular inference.

The two main types of validity that are important in conducting research are internal and external. Validity in testing and assessment has traditionally been understood to mean discovering whether a test measures accurately what it is intended to measure (Hughes, 1989).

Three types of validity in early theory: (1) criterion-oriented validity, (2) content validity, and (3) construct validity (Cronbach & Meehl, 1955).

Validity implies considerations of social responsibility, both to the candidate (protecting him or her against unfair exclusion) and to the receiving institution and those whose quality of health care will be a function in part of the adequacy of the candidate’s communicative skill (Messick). In short, validity is about addressing issues of fairness, which have social implications.

Messick presented validity as a unitary concept, requiring evidence to support the inferences that we make on the basis of test scores.

Messick’s (1989) way of looking at validity has become the accepted paradigm in psychological, educational and language testing.

It has become accepted that the more independent tasks and items a test contains, the more reliable and valid it is likely to be (Fulcher & Davidson, 2007). Independence matters because if the response a learner makes to one item is influenced, or even dictated, by a response to another item, that item carries less unique information.
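The link between test length and reliability can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that the sources cited here do not mention by name. The sketch below is only an illustration: the function name and the example values are hypothetical, and the formula assumes that the added items are parallel to (and as independent as) the existing ones. It speaks to reliability only, not to validity.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict the reliability of a test whose length is multiplied by
    `length_factor`, assuming the added items are parallel to the originals.

    Formula: r_new = k * r / (1 + (k - 1) * r)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)


# Hypothetical example: a test with reliability 0.60 is doubled in length.
predicted = spearman_brown(0.60, 2)
print(round(predicted, 2))  # 0.75
```

If the new items depend on earlier ones, the parallel-items assumption fails and the predicted gain overstates what the longer test would actually deliver.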

Validity theory in language testing is extremely concerned with the generalizability of score meaning. That is, to what extent is a particular score meaningful beyond the specific context of the test that generated the score? Part of this is the notion of reliability or consistency (Fulcher & Davidson, 2007).
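One common way to put a number on consistency is Cronbach's alpha, an index of internal consistency across items. The sketch below is offered as an illustration rather than as the method used by the authors cited above. The function name and the score matrix are hypothetical, and the data are constructed so that every item ranks the test takers identically.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a score matrix.

    Each row is one test taker; each column is one item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two test takers")
    # Sample variance of each item (column) across test takers.
    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each test taker's total score.
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: three test takers, three perfectly consistent items.
perfectly_consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(perfectly_consistent), 2))  # 1.0
```

Real response data would give a value below 1, and a high alpha shows only that the items behave consistently, not that the resulting score interpretation is valid.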
