There are three basic elements to look for when judging the quality of a psychological test: reliability, validity, and standardization.

RELIABILITY is a measure of the test’s consistency. A useful test is consistent over time. As an analogy, think of a bathroom scale. If it gives you one weight the first time you step on it and a different weight when you step on it a moment later, it is not reliable. Similarly, if an IQ test yields a score of 95 for an individual today and 130 next week, it is not reliable. Reliability can also be a measure of a test’s internal consistency. All of the items on a test should be measuring the same thing; from a statistical standpoint, the items should correlate with each other.

VALIDITY is a measure of a test’s usefulness. Scores on the test should be related to some other behavior that reflects personality, ability, or interest. For instance, a person who scores high on an IQ test would be expected to do well in school or on jobs requiring intelligence. A person who scores high on a scale of depression should be diagnosed as depressed by mental health professionals who assess him. A validity correlation reflects the degree to which such relationships exist. Relatively low correlations mean that some people may score high on a scale of depression without being depressed, and some people may score high on an IQ test and yet not do well in school.

STANDARDIZATION is the process of trying out the test on a group of people to see the scores that are typically obtained. In this way, any test taker can make sense of his score by comparing it to typical scores. Standardization provides a mean and standard deviation relative to a certain group, known as the norm group. When an individual takes the test, he can determine how far above or below the average his score is, relative to the norm group. When evaluating a test, it is very important to determine how the norm group was selected. For instance, if everyone in the norm group took the test by logging into a website, you are probably being compared to a group that is very different from the general population.
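
The three ideas above can be made concrete with a little arithmetic. The following is a minimal sketch rather than a standard procedure: the function names `pearson_r` and `z_score` and all of the scores are made up for illustration. It assumes that consistency over time is summarized by the correlation between two sittings of the same test, and that a score's distance from the average is expressed in standard deviations of the norm group.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def z_score(score, norm_scores):
    """Number of standard deviations a score lies above (+) or below (-)
    the mean of the norm group."""
    return (score - mean(norm_scores)) / stdev(norm_scores)


# Hypothetical data: the same five people tested twice, a week apart.
first_sitting = [95, 110, 102, 88, 120]
second_sitting = [97, 108, 104, 90, 118]

# Reliability over time: a correlation near 1 means scores stay consistent.
test_retest_r = pearson_r(first_sitting, second_sitting)

# Standardization: a tiny made-up norm group with mean 100 and SD 15.
norm_group = [85, 100, 115]
z = z_score(130, norm_group)  # 2.0, i.e. two standard deviations above average

print(f"test-retest correlation: {test_retest_r:.3f}")
print(f"z-score of 130 relative to the norm group: {z:.1f}")
```

The same `pearson_r` function could be applied to validity by correlating test scores with an outside measure, such as school grades. The z-score is only as meaningful as the norm group behind it, which is why the way that group was selected matters.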