Assessment of Examination Evaluation Data Essay

Assessment data is a tool instructors can use to determine if students are meeting course or learning outcomes. Assessments can be utilized in many ways, such as student practice, student self-assessment, determining readiness, determining grades, etc. The purpose of this assignment is to analyze sample test statistics to determine if student learning has taken place.

To address the questions below in this essay assignment, you will need to use sample statistics provided in the textbooks. For Questions 1-4, use the sample test statistics in Chapter 24 of Teaching in Nursing: A Guide for Faculty. For Questions 5-9, use Chapter 11 in The Nurse Educator’s Guide to Assessing Learning Outcomes.

In a 1,000-1,250 word essay, use the sample statistics data from the textbooks to respond to the following questions:

Explain what reliability is. Based on the sample statistics, is this test reliable? What evidence from the statistics supports your answer?
What trends are seen in the raw scores? How would an instructor use this information?
What is the range for this sample? What information does the range provide and why is it important?
What information does the standard error of measurement provide? Based on the data provided, does the test have a small or large standard error of measurement? How would an instructor use this information?
Explain the process of analyzing individual items once an instructor has analyzed basic concepts of measurement.
If one of the questions on the exam had a p value of 0.76, would it be a best practice to eliminate the item? Justify your answer.
If one of the questions on the exam has a negative PBI for the correct option and one or more of the distractors have a positive PBI, what information does this give the instructor? How would you recommend the instructor adjust this item?
Based on the sample statistics, has student learning taken place? Justify your answer with data.
Based on the sample statistics, what steps would you take to improve learning?

Prepare this assignment according to the guidelines found in the APA Style Guide, located in the Student Success Center.

This assignment uses a rubric. Please review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.

You are required to submit this assignment to LopesWrite. Please refer to the directions in the Student Success Center.

Analyzing Assessment Data


Assessment of exam results is necessary to determine if learners met the criteria for learning. To give meaning to the presented or collected data, it is necessary to analyze the data for context, understanding and finally to arrive at conclusions.  Therefore, analysis of assessment data gives meaning to the information, facilitating effective communication and the use of the assessment results (Madlung, 2018). The assessment would be used to determine the students’ learning and the efficacy of the teaching. The purpose of this assignment is to analyze assessment data using the provided sample test statistics.

Reliability refers to the degree to which a measurement, specification, or calculation produces consistent findings when repeated several times, which supports confidence in the measurement (Boateng et al., 2018). The reliability coefficient is used to measure the reliability of a set of test data. According to Billings and Halstead (2015), a test whose reliability coefficient is below 0.70 is deemed unreliable. The test statistics from the book show a reliability coefficient of 0.84, which indicates that the test is reliable. A coefficient of 0.70 to 0.80 is generally treated as acceptable, and reliability improves as the coefficient approaches 1.00, so a value of 0.84 exceeds the acceptable threshold.
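The sample statistics do not say which reliability coefficient was reported. As a rough sketch, assuming a KR-20 style coefficient for items scored right or wrong, the calculation might look like the following. The response matrix and the function name `kr20` are hypothetical illustrations and are not taken from the textbook data.

```python
from typing import List

def kr20(responses: List[List[int]]) -> float:
    """Kuder-Richardson 20 for a students-by-items matrix of 0/1 scores.

    Assumption: population variance (divide by N) is used for the total
    scores; some texts divide by N - 1, which shifts the result slightly.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    # Sum of p * q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)

# Hypothetical responses: 6 students (rows) by 5 items (columns).
sample = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

coefficient = kr20(sample)
print(f"KR-20 = {coefficient:.2f}")
print("meets 0.70 threshold:", coefficient >= 0.70)
```

A coefficient computed this way would be compared against the 0.70 threshold in the same way as the reported value of 0.84.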


The raw scores show that the largest group of students scored between 70 and 74.4. Therefore, the raw scores follow a bell-shaped trend. Similarly, the mode of the students’ scores falls between 70 and 74.4. The median of the presented data is 73, while the mean score is 72.690. Because the measures of central tendency are approximately equal, the data appear to follow a normal distribution. Given this trend, the instructor can assume a normal distribution, and other analyses that rely on that assumption can also be conducted.

Range is the statistical measure of the difference between the largest and smallest values (Billings & Halstead, 2015). The range also describes the smallest interval containing all of the sample data. For the sample statistics, the highest score is 92.0 and the lowest score is 48.0; therefore, the range is 44.0. The instructor can use the range to show how widely the scores are spread. The presented data sample seems to have a high standard error. Therefore, this information would be valuable to the instructor when deciding which inferential statistics are appropriate for the data analysis.

Once the measures of central tendency and dispersion have been calculated, they can be used to support the item analysis (Billings & Halstead, 2015). The key measures likely to be calculated in a class performance setting are item difficulty, item discrimination, and item distractors. Item difficulty is represented by the p-value. In item analysis, the p-value is not the probability from a statistical hypothesis test; it is the proportion of students who answered the item correctly. A p-value of 1.00 would mean that every student answered the question correctly.

On the other hand, a p-value of 0.3 shows that students found the question difficult, whereas a p-value of 0.8 indicates low difficulty. For item discrimination to be effective, item difficulty should be around p = 0.5. A discrimination index of 0.4 indicates high discrimination, whereas an index of 0 indicates no discrimination at all. Moderate discrimination ranges from 0.15 to 0.29.
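To make the two item statistics concrete, here is a minimal sketch of how item difficulty and an upper-lower discrimination index could be computed. The student data are invented for illustration, the function names are hypothetical, and the use of the top and bottom 27% of scorers as comparison groups is a common convention assumed here rather than something stated in the sample statistics.

```python
from typing import List

def item_difficulty(item: List[int]) -> float:
    """Proportion of students who answered the item correctly (1 = correct)."""
    return sum(item) / len(item)

def discrimination_index(item: List[int], totals: List[float],
                         fraction: float = 0.27) -> float:
    """Difference in item difficulty between top and bottom scorers.

    Assumption: groups are the top and bottom `fraction` of students
    ranked by total exam score.
    """
    n = len(item)
    group_size = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda i: totals[i])
    lower_group = ranked[:group_size]
    upper_group = ranked[-group_size:]
    p_upper = sum(item[i] for i in upper_group) / group_size
    p_lower = sum(item[i] for i in lower_group) / group_size
    return p_upper - p_lower

# Hypothetical data for 10 students: item result and total exam score.
item_results = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
total_scores = [92, 88, 85, 80, 76, 74, 72, 65, 55, 48]

p_value = item_difficulty(item_results)
d_index = discrimination_index(item_results, total_scores)
print(f"item difficulty p = {p_value:.2f}")
print(f"discrimination index = {d_index:.2f}")
```

With these invented responses the item is fairly easy and still separates the strongest scorers from the weakest, which is the pattern an instructor would hope to see before keeping an item.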

The instructor also needs to analyze item distractors. The level of correlation is measured using biserial correlation, where a value of 0 suggests that the item needs review. The most appropriate approach is to use open-ended questions. The point-biserial index (PBI) is also used to compare students’ overall performance with the way they answer specific questions (McDonald, 2017). A PBI lower than 0.2 points to a poorly performing item, while a PBI from 0.4 to 0.7 indicates a good item.

If one of the questions on the exam had a p-value of 0.76, the item should not be eliminated. Such a p-value shows that the question is not difficult, since about three quarters of the students answered it correctly. In addition, the item would assist in differentiating high-performing learners from low-performing learners within a classroom setting.

If one of the questions on the exam has a negative PBI for the correct option, this means that the learners who chose the correct answer tended to be those who performed poorly on the exam as a whole (McDonald, 2017). If one or more of the distractors have a positive PBI, this shows that learners who performed well on the exam tended to choose those wrong answers. In this case, the instructor can consider treating the distractor with the positive PBI as an acceptable answer. The other option would be to remove the item from the examination (McDonald, 2017).
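The pattern described above can be illustrated with a short sketch that computes a point-biserial value for every answer option. The student choices, total scores, and answer key below are invented, and the helper names are hypothetical; the correlation is computed as an ordinary Pearson correlation between a 0/1 indicator and the total score.

```python
from math import sqrt
from typing import Dict, List

def point_biserial(chose: List[int], totals: List[float]) -> float:
    """Pearson correlation between a 0/1 indicator and total scores."""
    n = len(totals)
    mean_x = sum(chose) / n
    mean_y = sum(totals) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(chose, totals))
    var_x = sum((x - mean_x) ** 2 for x in chose)
    var_y = sum((y - mean_y) ** 2 for y in totals)
    return cov / sqrt(var_x * var_y)

def option_pbis(choices: List[str], totals: List[float]) -> Dict[str, float]:
    """PBI for each answer option that at least one student selected."""
    return {
        option: point_biserial([1 if c == option else 0 for c in choices], totals)
        for option in sorted(set(choices))
    }

# Hypothetical item: "A" is the keyed answer, "B" and "C" are distractors.
keyed_answer = "A"
student_choices = ["B", "B", "B", "A", "B", "A", "C", "A", "A", "A"]
total_scores = [92, 88, 85, 80, 76, 74, 72, 65, 55, 48]

pbis = option_pbis(student_choices, total_scores)
for option, value in pbis.items():
    label = "key" if option == keyed_answer else "distractor"
    print(f"{option} ({label}): PBI = {value:+.2f}")
```

In this invented case the keyed answer has a negative PBI and distractor B has a positive one, which is the signal that the key may be wrong or the item ambiguous and that the item should be reviewed before it is scored.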

It is clear from the sample test statistics that learning has occurred, as several of the analysis measures show. For example, the presented data follow a normal distribution, which further implies that learning has occurred. The mean of the sample statistics is 73.0, while the standard deviation is 9.8. Moreover, the sample data follow a normal curve in which the median, mean, and mode scores are above average; this emphasizes that learning has occurred (Madlung, 2018).

Based on the provided dataset, various measures could be applied to improve learning. For example, students with poor performance can be provided with additional coaching classes to bring their performance in line with that of other students. Additionally, an investigation can be performed to identify challenges that may be hindering good performance among these students. This can significantly lower variability in the dataset (Madlung, 2018).


Analysis of the test results is important to examine if learning has occurred and to test whether learners have gained the target knowledge. Instructors can use descriptive statistics to analyze the performance of learners in exams as a whole. Item analysis, on the other hand, can be used to select the appropriate items for an exam.

Assessment is the process of evaluating the level of effectiveness of a procedure that is used in educating the healthcare providers. In the school setting, assessment may be carried out to establish whether the set objectives of a class have been met (McDonald, 2017). Assessment would be used to find out the effectiveness of the teaching approach. This paper discusses the various aspects of assessment data by the use of sample test statistics.

Reliability in statistics refers to the degree to which a measure produces consistent results when it is repeated several times. Reliability of data is measured by the reliability coefficient (Wildemuth, 2016). A reliability coefficient below 0.70 is considered unreliable. For test results to be considered reliable, the coefficient should be at least in the range of 0.70 to 0.80. As the reliability coefficient approaches 1.00, the level of reliability approaches 100%. The test statistics data have a reliability coefficient of 0.84, which indicates that the data is reliable.

The raw scores indicate that the frequency of the scores follows a bell-shaped trend, as the majority of the students scored between 70 and 74.4. This indicates that the mode of the scores also falls between 70 and 74.4. From the descriptive statistics presented, the mean score is 72.690 while the median is 73. Since the measures of central tendency are approximately equal, this implies that the data follows a normal distribution (Wildemuth, 2016). The trend would therefore help the instructor to assume a normal distribution, so that other analyses that rely on the assumption of normality could be carried out.

Range is a statistical measure that indicates the difference between the smallest and the largest values. It indicates the smallest possible interval that contains all the sample data. In the presented sample test statistics, the range is 44 since the highest score is 92.0 and the lowest score is 48.0. The range is important to the instructor since it helps to show the level of dispersion of the data (McDonald, 2017). The data has a large standard error and thus this information would help the instructor to make decisions on the suitable inferential statistics that would be used to analyse the data.
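The range and a standard error of measurement can be worked out from figures quoted in this essay: the highest and lowest scores, the standard deviation of 9.8, and the reliability coefficient of 0.84. The sketch below assumes the classical test theory formula, SEM = SD × sqrt(1 − reliability); the textbook may report its own value, and the variable and function names are hypothetical.

```python
from math import sqrt
from typing import Tuple

# Figures quoted in this essay for the sample test.
highest_score = 92.0
lowest_score = 48.0
standard_deviation = 9.8
reliability = 0.84

# Range: distance between the highest and lowest scores.
score_range = highest_score - lowest_score

# Assumed classical test theory estimate of the standard error of measurement.
sem = standard_deviation * sqrt(1 - reliability)

def score_band(observed: float, sem_value: float) -> Tuple[float, float]:
    """Return the observed score minus and plus one SEM."""
    return (observed - sem_value, observed + sem_value)

low, high = score_band(73.0, sem)
print(f"range = {score_range:.1f}")
print(f"SEM = {sem:.2f}")
print(f"band around a score of 73 = {low:.2f} to {high:.2f}")
```

Under that assumed formula the SEM comes to about 3.9 points. Whether that counts as small or large depends on how close individual scores sit to any pass mark the instructor applies.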

After calculating the measures of central tendency and the measures of dispersion, the instructor could then carry out item analysis. There are three major measures that would be calculated in a class performance setting. One is item difficulty, which is represented by the p-value. A p-value of 1.00 indicates that all the students answered the question correctly (Billings & Halstead, 2015). A p-value of 0.3 indicates that the question is difficult, while a value of 0.8 indicates low difficulty. A suitable p-value for separating learners from non-learners would be about 0.7. The item discrimination index quantifies the difference between the learners and non-learners. The item difficulty should be set at about p = 0.5 to ensure that the item discrimination is also effective. A discrimination value of 0.4 indicates high discrimination, while a value of 0 indicates no discrimination. Moderate discrimination is between 0.15 and 0.29. Item distractors are the other feature that instructors should consider analysing. Biserial correlation is used to measure the level of correlation (Schoening, 2018). A value of zero may indicate the need for revision. The suitable approach would be to ask open-ended questions. The other item analysis measure is the PBI, which compares the general performance of the learners with how they answer specific items. A PBI below 0.2 indicates a poor item, while a range between 0.4 and 0.7 indicates a very good item.

A test question with a p-value of 0.76 should not be eliminated from the examination. This p-value indicates a less difficult question, and the item would still help to distinguish between the learners and the non-learners in the classroom.

Where the PBI for the correct option is negative, this implies that the students who answered the item correctly tended to be those who performed poorly in the exam (Schoening, 2018). On the other hand, a positive PBI on the distractors indicates that students who performed well in the whole exam tended to answer the question wrongly. In such a situation, the instructor may consider accepting the distractor with the positive PBI as one of the correct answers. The other option would be to eliminate the item completely from the exam.

From the presented sample test statistics, it could be concluded that learning has taken place. This is based on the various measures of analysis that have been carried out. One of the analyses that contributes to this decision is the fact that the data follows a normal distribution.

The standard deviation of the data is 9.8, while the mean is 73. The interval that lies within ±1 standard deviation of the mean is 73 ± 9.8, which gives (63.2, 82.8). The number of students contained in this interval is 1 + 5 + 9 + 3 + 5 = 23. This represents 23/29 × 100 = 79.3% of the total number of students. The interval of two standard deviations from the mean contains all the students who took the examination (Billings & Halstead, 2015). This is an indication that the data follows a normal curve. The mean, mode, and median scores are also above average, which indicates that learning has taken place.
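The arithmetic in the paragraph above can be reproduced with a few lines of code. This is only a sketch that uses the figures quoted in the essay; the variable names are hypothetical.

```python
mean_score = 73.0
standard_deviation = 9.8

# One standard deviation either side of the mean.
lower_bound = mean_score - standard_deviation
upper_bound = mean_score + standard_deviation

# Frequencies of the score groups that fall inside the interval,
# as listed in the essay, and the class size.
counts_in_interval = [1, 5, 9, 3, 5]
total_students = 29

students_within = sum(counts_in_interval)
percent_within = students_within / total_students * 100

print(f"interval = ({lower_bound:.1f}, {upper_bound:.1f})")
print(f"{students_within} of {total_students} students = {percent_within:.1f}%")
```

For comparison, a normal curve places roughly 68% of scores within one standard deviation of the mean, so the 79.3% figure is best read as a rough check on the shape of the distribution rather than proof of normality.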

There are several measures that could be taken to improve learning based on the data. One measure would be to offer specialised extra classes to the two students who performed the poorest. This would help to identify the challenges they may have had and thus help to reduce the level of variability in the data. Reduced variability would indicate that more of the class is reaching a similar level of performance.

In summary, instructors ought to know how to analyse evaluation results so as to make the right decisions on whether learning has occurred during the course. The analysis could be carried out using descriptive statistics as well as item analysis. While item analysis contributes to the selection of suitable items for the examination, descriptive statistics help to analyse performance in the examination as a whole.


Billings, D. M., & Halstead, J. A. (2015). Teaching in nursing e-book: A guide for faculty. Elsevier Health Sciences.

McDonald, M. E. (2017). The nurse educator’s guide to assessing learning outcomes. Jones & Bartlett Learning.

Schoening, A. (2018). Interpreting exam performance: what do those stats mean anyway.

Wildemuth, B. M. (2016). Descriptive statistics. Applications of Social Research Methods to Questions in Information and Library Science, 338-47.


