Glossary

Below are definitions of common terminology used in the testing industry.

Assessment-based Training Development
Training programs teach the knowledge and skills needed for successful job performance. Assessments or tests determine appropriate starting points for learners, help monitor their progress, and confirm that they ultimately meet the training program’s objectives.
Candidate Services
Services provided to candidates and/or associations, such as application processing, eligibility determination, test scheduling, and renewal or recertification, to support continued success in their field.
Certification Testing
High-stakes, legally defensible testing that is proctored and delivered either by paper and pencil or by computer.
Classification System
An organized list or grouping of the performance domains, tasks, knowledge, skills, and abilities required for minimum competence.
Content Validity
The extent to which the content of an examination contains a balanced and adequate sample of questions representative of the knowledge and skills an individual needs for successful and competent job performance.
Criterion-referenced (Content-referenced) Test
A test or passing point designed to evaluate the specific knowledge or skills possessed by an examinee. Scores are based on what the examinee knows or can do, rather than on their relation to other examinees’ performance on the examination.
Cultural-sensitivity
Culture-fair tests provide an equal opportunity for success by persons of all cultures and life experiences. The content is either common or entirely unfamiliar to all persons regardless of their cultural background.
Diagnostic Test
A test used to diagnose or analyze the examinee’s specific areas of strength or weakness. Diagnostic practice tests are often very useful tools in preparing for certification examinations.
Domain
Performance domains are the major responsibilities or duties that make up the practice. Each domain can be characterized as a major heading in an outline format and may include a brief behavioral description.
Equivalent Form
Two or more forms of an examination that are parallel in content, length, and difficulty of the items included. Equivalent test forms should yield similar average scores and measures of variability for a given group of examinees.
High Stakes Testing
A test for which the results have significant and direct consequences for the test taker. An example is a test used to determine whether the candidate is minimally competent in the knowledge and skills required to practice their profession or be admitted to a chosen educational institution.
IBT/CBT: Internet-based Testing/Computer-based Testing
A secure method of examination delivery via the Internet to individual computers. Tests may be proctored or non-proctored.
Item Analysis
The statistical evaluation of single test items, including the item’s difficulty, discriminating power, and distractor effectiveness.
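As a rough illustration of two of the statistics named above, the Python sketch below computes item difficulty as the proportion of examinees answering correctly, and a simple upper-minus-lower discrimination index. The function names, the 0/1 scoring layout, and the default 27% group fraction (a commonly used convention) are assumptions made for this example. Distractor analysis is omitted.

```python
def item_difficulty(responses, item):
    """Proportion of examinees answering `item` correctly (the p-value)."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-minus-lower index: p(correct) in the top group minus the bottom group."""
    # Rank examinees by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    # Size of each comparison group (at least one examinee).
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


# Hypothetical scored responses: one row per examinee, 1 = correct, 0 = incorrect.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(item_difficulty(scores, 0))                    # 0.75
print(item_discrimination(scores, 1, fraction=0.5))  # 1.0
```

A difficulty near 1.0 indicates an easy item. A discrimination index near zero or below suggests the item does not separate stronger examinees from weaker ones.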
Item Bank
Items/questions available for use on the examination. Item banks generally are maintained in a secure electronic format and include item performance data, history of use, classification/content codes, and content validation documentation.
Item Development
The process of writing, reviewing, and validating examination items by subject matter experts under the direction of a psychometrician. Different item types may be used to evaluate knowledge and skills at a variety of cognitive levels (e.g., multiple choice, simulation, and oral/practical formats).

Job Analysis Study
The process of defining the knowledge, skills, and abilities required of the minimally competent practitioner. The job analysis study typically consists of two phases:
  1. A panel of subject matter experts defines the responsibilities of the minimally competent professional.
  2. Individuals working in the profession evaluate and validate the responsibilities identified by the panel. The second phase is typically accomplished using a survey.
Kuder-Richardson Formula(s)
Formulas for estimating the reliability of a test that are based on inter-item consistency and require only a single administration of the test.
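One widely used member of this family is KR-20, which for k dichotomously scored items is k/(k-1) multiplied by (1 minus the sum of p*q across items divided by the variance of total scores), where p is the proportion answering an item correctly and q = 1 - p. The Python sketch below is a minimal illustration under stated assumptions: the function name and the 0/1 response layout are hypothetical, and the population variance is used for total scores (some references use the sample variance instead).

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 reliability estimate for rows of 0/1 item scores (one row per examinee)."""
    k = len(responses[0])          # number of items
    n = len(responses)             # number of examinees
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # assumption: population variance of total scores
    # Sum of p * q across items, where p is the proportion correct on the item.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n
        pq_sum += p * (1 - p)
    # Note: undefined (division by zero) if every examinee has the same total score.
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data from a single administration: 4 examinees, 3 items.
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(scores))  # 0.75
```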
Mastery Test
A test designed to determine whether an examinee has mastered a specific unit of instruction or a single knowledge or skill.
Outsource Testing Solutions
Partnering with a vendor to develop and administer certification and testing programs.
Paper and Pencil Testing
A paper-based examination administered to the examinee using a test booklet, scannable answer sheet, and pencil.
Psychometrician
An individual who has training in the field of measurement and evaluation and who has experience in assessing people’s knowledge, skill, and ability. The psychometrician is experienced in applying methodologies that enhance the assessment instruments’ validity, reliability, and legal defensibility.
Random Sample
A sample of the members of some total population drawn in such a way that every member of the population has an equal chance of being included. If the sample is free of bias and representative of the total population, then findings for the sample can be generalized to the total population.
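As a small sketch of the equal-chance property, Python's standard `random.sample` draws without replacement so that every member of the population is equally likely to be selected. The function name, the candidate ID list, and the seed parameter are illustrative assumptions.

```python
import random


def draw_random_sample(population, size, seed=None):
    """Draw `size` distinct members, each with an equal chance of inclusion."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, size)


# Hypothetical population of 500 candidate ID numbers.
candidate_ids = list(range(1, 501))
survey_group = draw_random_sample(candidate_ids, 50, seed=2024)
print(len(survey_group))  # 50
```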
Reliability
The extent to which a measurement is consistent, dependable, and relatively free from errors of measurement.
Test Administration
Ensuring standardized testing conditions for all candidates during each administration of the examination.
Test Delivery
Administering the examination, via paper and pencil or Internet-based means, to examinees. Related services may include candidate eligibility determination, test scheduling, test distribution, and scoring.
Testing Experts
Individuals who are educated and experienced in developing, administering, scoring, and analyzing examinations.
Testing Facilities
The physical location and equipment associated with testing. Facilities must be consistent and conducive to the best possible performance by the examinee. Frequently used testing facilities include, but are not limited to, universities, community colleges, conference rooms, training rooms, and quiet locations with computer stations.
Test Specifications
Also called an examination blueprint. Test specifications identify the proportion of questions from each domain and task that will appear on the examination. Test specifications are derived by combining the overall evaluations of each task’s importance, criticality, and frequency and converting the results into percentages.
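The conversion to percentages can be sketched as follows. The definition above does not say how the three ratings are combined, so this example assumes a simple sum of each task's mean importance, criticality, and frequency ratings; other programs may use weighted or multiplicative models. The function name, task names, and ratings are hypothetical.

```python
def blueprint_percentages(ratings):
    """Convert per-task (importance, criticality, frequency) ratings to percentages."""
    # Assumption: the combined score for a task is the plain sum of its three ratings.
    combined = {task: sum(values) for task, values in ratings.items()}
    total = sum(combined.values())
    return {task: 100 * score / total for task, score in combined.items()}


# Hypothetical mean survey ratings on a 1-5 scale.
survey_ratings = {
    "Task A": (4, 4, 2),
    "Task B": (3, 3, 4),
    "Task C": (2, 2, 1),
}
for task, pct in blueprint_percentages(survey_ratings).items():
    print(f"{task}: {pct:.0f}%")
```

With these ratings, Task A and Task B each receive 40% of the examination and Task C receives 20%.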
 

Computer-based Testing Models

Adaptive Test
A variable-length test in which each item is selected based on the candidate's performance on previously presented items. Adaptive tests require a large item bank that has been pre-tested on a large candidate population.
Linear-on-the-Fly Test
A fixed-length test, with test items uniquely assembled for each candidate according to defined content and psychometric specifications. That is, upon each access, items are drawn from a large item bank according to the test content outline and statistical item performance data. A large item bank, from which the unique exam forms are drawn, reduces item exposure. Statistical performance data is required on all items in the item bank.
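A minimal sketch of the content side of this assembly is shown below: for each candidate, the required number of items per domain is drawn at random from the bank. The item bank layout, field names, and function name are assumptions for illustration, and the psychometric constraints (for example, targeting a form difficulty from item statistics) are deliberately left out.

```python
import random


def assemble_form(item_bank, outline, seed=None):
    """Draw a unique fixed-length form that matches the content outline.

    item_bank: list of dicts with hypothetical keys "id" and "domain".
    outline:   dict mapping each domain to the number of items required.
    """
    rng = random.Random(seed)
    form = []
    for domain, count in outline.items():
        pool = [item for item in item_bank if item["domain"] == domain]
        if len(pool) < count:
            raise ValueError(f"Not enough items in the bank for domain {domain!r}")
        # Sampling without replacement keeps any item from appearing twice on a form.
        form.extend(rng.sample(pool, count))
    return form


# Hypothetical bank: 20 items in each of two domains.
bank = [{"id": f"{d}-{i}", "domain": d} for d in ("Safety", "Assessment") for i in range(20)]
form = assemble_form(bank, {"Safety": 3, "Assessment": 2})
print([item["id"] for item in form])
```

The larger the pool for each domain relative to the number drawn, the less often any single item is exposed across candidates.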
Linear Test
A fixed test form of predetermined items and length. In computer-based testing, the sequence of test items may be consistent for each candidate or randomly presented (i.e., scrambled). A large item bank is not required; however, items should be pre-tested to ensure their reliability if immediate score reporting is desired.
 

Sample Item Types

Multiple-choice Item
A test item for which the examinee must choose the correct or best answer from several given options. Multiple-choice items assess the examinee’s ability to recall facts, apply specific knowledge to a given problem or situation, and reach an appropriate conclusion by analyzing or evaluating information. Multiple-choice examinations can evaluate an examinee’s depth and breadth of knowledge in an objective manner during a reasonable testing period.
Oral/Practical/Performance-based Problem
Oral/practical problems provide an opportunity to assess how well an examinee brings all of his or her skills to bear in situations similar to those faced on the job. This complex interplay of skills is much more difficult to evaluate or score than deciding whether the response to a single multiple-choice test item is right or wrong. The many facets of competent performance on a complex task introduce the possibility that evaluators may view an examinee's performance differently. If two evaluators do not provide the same rating for an examinee’s performance, then the examination's reliability is compromised. There are many ways to reduce the subjectivity in scoring oral/practical examinations and increase the consistency across evaluators. One way is to break competent performance into small units or individual aspects of a task for scoring by the evaluators. It is much more likely that evaluators will be consistent when they are rating specific items rather than just providing a global rating on the examinee’s adequacy of general performance.
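One simple way to check consistency across evaluators is to compare their ratings item by item on a scoring checklist. The sketch below computes plain percent agreement between two evaluators; the function name and the checklist data are hypothetical, and more refined statistics that correct for chance agreement are not covered here.

```python
def percent_agreement(rater_a, rater_b):
    """Fraction of checklist items on which two evaluators gave the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both evaluators must rate the same checklist items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical checklist for one examinee: 1 = step performed correctly, 0 = not.
evaluator_1 = [1, 1, 0, 1]
evaluator_2 = [1, 0, 0, 1]
print(percent_agreement(evaluator_1, evaluator_2))  # 0.75
```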
Simulation Problem
Simulation problems assess the examinee’s ability to resolve real-life decision-making situations efficiently. Examinees are presented with scenarios similar to those that might be encountered in actual practice. Examinees then must choose the most appropriate action(s) from among several alternatives. Simulation problems are objectively scored and may be offered in either a paper-based or computer-based format.
