An OSCE including critical simulation: an evaluation for medical student competence


Ming-Chen Hsieh
Shao-Yin Chua


OSCE and Standard Setting




Buddhist Tzu-Chi General Hospital; Tzu Chi University, Hualien, Taiwan


Failure to design a skill assessment tool is a missed opportunity to more fully understand and apply the results to the clinical performance of medical students.


To develop an OSCE station that assesses medical students' skills in evaluating critical conditions, applying evidence, and selecting appropriate treatment options with a simulated patient, and to assess the results by comparing the discrimination and reliability of standardized patient and simulated patient stations.

Summary of Work

OSCE performance scores of 58 seventh-year medical students at the University of Tzu-Chi School of Medicine were analyzed from April 10 to 11, 2011 using descriptive statistics and item discrimination. Through a consensus process, 13 OSCE cases were identified for evaluation.

Take-home Messages

The OSCE is an important tool for evaluating clinical competence and will soon be a gateway to the national medical practitioner licensing examination for medical students. How to raise OSCE quality and assess student abilities, such as the actual management of a patient in an emergency condition, is currently a critical issue.


This research study was initiated and coordinated by the Tzu-Chi Simulation Interest Group (TSIG). Special thanks to all staff members, feature team members, volunteers, and members.

Summary of Results

The discrimination statistics indicate that the only critical scenario station prepared with a high-fidelity simulator was effective in distinguishing between medical students.





The main findings of this study fall into two parts. First, in the quality estimation of each station and each item, the high-fidelity simulator station proved effective, including in discriminating between high-scoring and low-scoring students. Second, the station scores showed no significant correlation with the certification examination results, which suggests that a standardized scenario on a high-fidelity simulator can discern levels of patient care that the certification examination does not capture. Quality control is another important issue in test development, particularly for certifying examinations used to classify examinees.


Competency-based education has been tremendously popular in medical education for the past decade and is currently the mainstream method of teaching clinical knowledge, incorporating a new model for creating medical education objectives. Assessing student clinical skills is also a crucial element in their training. The Objective Structured Clinical Examination (OSCE) is a widely accepted tool for evaluating the clinical competence of medical students. Studies have demonstrated that the OSCE is an effective tool for evaluating the areas most critical to the performance of health care professionals: the ability to obtain information from a patient, establish rapport and communicate, and interpret data and solve problems. Although assessment may be part of an institutional or course evaluation process, or have other purposes, teachers use assessment for either summative or formative purposes. The station content varies according to student experience and the nature of the assessment. The types of problems portrayed in an OSCE are those students would commonly encounter in a clinic or hospital.

Standardized Patients (SPs) typically have general complaints, although some could present problems related to emergency conditions. Although students in training are familiar with basic practices in critical care medicine, few OSCEs include critical care stations. A critical action is defined as a scenario whose evaluation process is critical to ensuring an optimal patient outcome and avoiding medical error. Failure to address performance in such critical conditions is a missed opportunity to better understand and use the results of these examinations for a competence-based evaluation of medical students. Developing an OSCE station for complex critical conditions poses unique challenges. However, current technology allows for critical care scenarios, complete with cardiac and respiratory arrest on a computerized patient simulator in rapid transit stations.

The study design was chosen to allow quantitative measures of medical student performance to be collected while students managed a set of simulated critical shock emergencies. We developed an OSCE station to assess the critical condition evaluation skills of medical students in applying evidence and appropriate treatment options with a simulated patient. This investigation determined whether a critical management OSCE station plays a meaningful role in a summative examination and assessed the results by comparing the discrimination and reliability of standardized patient and simulated patient stations.

Summary of Work

Study Participants

Medical school in Taiwan begins as an undergraduate major and runs seven years. The Department of Medical Education and School of Medicine at Tzu Chi University in Hualien, Taiwan held examinations from April 10 to 11, 2010. This retrospective study collected and analyzed relevant information from the OSCE conducted with 7th-year medical students at Tzu Chi General Hospital in 2010. The fifty-eight participants, in their final undergraduate year, had completed training courses in various subjects, including internal medicine, surgery, pediatrics, and critical care.


Study Design

The development of the OSCE examination component was based on a collaborative effort led by faculty members who had experience with OSCEs. The OSCE examined the range of clinical competence in clinical scenarios including interviewing, physical examination skills, critical thinking, clinical judgments, and technical skills. All participants were instructed to perform all appropriate diagnostic and therapeutic actions and verbalize their thoughts and actions. This study focuses on assessing the critical thinking abilities of students.



Before the test, students had one week of hands-on participation to familiarize themselves with the simulators, which were demonstrated by an experienced operator.

During the OSCE, the simulated scenario was conducted in a general ward featuring a high-fidelity simulator. The iStan (METI, Medical Education Technologies, Inc., Sarasota, FL) provides a human-like, full-scale computerized mannequin in a realistic clinical setting. The scenario lasted fifteen minutes. Participants were given clear instructions to state the emergency diagnosis and the treatments they were instituting. We presented a fifty-five-year-old man who was admitted to the hospital due to pneumonia complicated by hypotension. Two conditions, respiratory failure and septic shock, were presented in stages. Prescription orders, vital signs records, electrocardiograms, and chest radiographs were available in the chart. Participants needed to assess the patient, which included reviewing the patient chart and performing a physical exam. The scenario ended when the patient began a downhill course. Following the station, participants had to provide a brief summary as a duty note that presented the assessment, problems, and plans in an organized format.



Audiovisual recordings were made of each scenario to facilitate scoring and to allow independent review and further analyses. The crisis evaluation, summary of the event, and scoring measures are presented in Appendix A. Five medically qualified educators designed the written sheet for the duty note, which comprised four parts: subjective, objective, diagnosis, and plan. The three-part checklist covered history taking and physical exam, imperative diagnosis with differential diagnosis, and management of septic shock. Reference resources for evaluating skills in the management of severe sepsis and septic shock were based on the Surviving Sepsis Campaign international guidelines (9). A panel of four experienced physicians set the passing score using a modified Delphi technique. For the OSCE, four experienced raters were formally trained in assessing each examination paper and were given specific instructions on scoring.

Each item was scored on a three-point scale: 0 (failed to perform), 1 (performed poorly or out of sequence), or 2 (properly performed in correct sequence).

Data processing and analysis

For descriptive analysis, data from the high-fidelity simulator station were analyzed, including the maximum score, minimum score, mean score, and standard deviation. We compared the pass rate and quality estimates between the standardized patient (SP) stations and the high-fidelity simulator (HFS) station (Table 1). The first measure was item difficulty (P), the proportion of participants who received credit for the item, based on the average of the two raters' values. A value of 1 indicates that all students received credit. The second measure was item discrimination (D), the correlation between the item-level score and the total checklist score. Here, higher values (i.e., D > 0.30) indicate that the item is able to discriminate between low- and high-ability individuals. In some instances (i.e., all or no students receiving credit), the D value cannot be calculated. The third measure was inter-rater reliability, estimated for the SP stations as the Pearson product-moment correlation coefficient between the two raters' scores of the same performance. A value of 1 indicates that the two raters were in perfect agreement on a particular element across all items in the SP station. For the high-fidelity simulator station, reliability was estimated as the Kendall coefficient of concordance for raters. A value larger than 0.9 shows high agreement between raters in individualized performance evaluations. Psychometric analysis was also performed for each evaluation item in the high-fidelity simulator station (Table 2). Student results of the national medical board examination were collected three months after this study. Analyses were performed with SPSS version 10.0 (SPSS Inc., Chicago, IL, USA). The association between each part of the station score and the results of the national board exam was assessed with the Pearson chi-square test.
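The three measures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: "received credit" is taken to mean a rater-averaged item score above 0, D is the uncorrected item-total Pearson correlation, Kendall's coefficient of concordance is computed without a tie correction, and all function names and sample scores are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation; None when it is undefined."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # no variance, e.g. all or no students received credit
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

def item_difficulty(item_scores):
    """P: proportion of students with credit (assumed here: score > 0)."""
    return sum(1 for s in item_scores if s > 0) / len(item_scores)

def item_discrimination(item_scores, total_scores):
    """D: correlation between the item score and the total checklist score."""
    return pearson(item_scores, total_scores)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def kendalls_w(ratings):
    """Kendall's W for m raters scoring the same n students (no tie correction)."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(r) for r in ratings]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical data for six students: one item (0-2 scale) and checklist totals.
item = [0, 1, 2, 2, 0, 1]
total = [10, 15, 25, 28, 12, 18]
p_value_item = item_difficulty(item)
d_value_item = item_discrimination(item, total)
discriminates = d_value_item is not None and d_value_item > 0.30
```

Returning None when an item has no variance mirrors the case noted above in which D cannot be calculated.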


Take-home Messages

Special thanks go to all the authors: Wei-Chun Cheng, Tsung-Ying Chen, Chun-Hou Huang, Hsiang-Man Liu and Jimmy Ong.

Summary of Results


After the scores from all raters were averaged for each student in the simulation station, the scores of the 58 students ranged from 12 points (33.33%) to 30 points (83.33%), with a mean ± standard deviation of 19.5 ± 4.032 points (54.13% ± 11.20%). In total, 21 students (36%) failed. The low-scoring group obtained 16.78 ± 2.95 points and the high-scoring group obtained 21.31 ± 3.59 points.
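The percentages in this paragraph are consistent with a station maximum of 36 points. That maximum is not stated in the text; it is inferred here from 12 points corresponding to 33.33%. A short arithmetic check under that assumption (names are hypothetical):

```python
MAX_POINTS = 36  # assumption: inferred from 12 points = 33.33%, not stated in the text

def to_percent(points, max_points=MAX_POINTS):
    """Convert a raw station score to a percentage of the assumed maximum."""
    return round(100 * points / max_points, 2)

lowest_pct = to_percent(12)        # lowest reported score
highest_pct = to_percent(30)       # highest reported score
sd_pct = to_percent(4.032)         # reported standard deviation
fail_pct = round(100 * 21 / 58)    # 21 of 58 students failed
```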


The discrimination statistics (D) indicate that the only station prepared with a high-fidelity simulator was effective in distinguishing between low- and high-ability medical students. For the SP stations, by contrast, the discrimination statistics indicate that low- and high-ability students performed similarly.


Three months later, all students took the national medical board qualification examination in Taiwan. Seven students failed the exam, giving a pass rate of 87.9%. A binomial logistic regression found no significant association between the results of the national board exam and the scores for physical examination, history taking, differential diagnosis, and management, or the average score of the entire test. The chi-square test likewise showed no significant association (p = .219).
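For readers who want to see the mechanics of the chi-square comparison, the sketch below computes a Pearson chi-square statistic for a 2 x 2 table of station result against board result. Only the margins come from the text (21 station failures, 7 board failures, 58 students); the split of students across the four cells is hypothetical, so the output is illustrative and is not expected to reproduce the reported p value. With one degree of freedom, the p value equals erfc(sqrt(chi2 / 2)).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and p value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p = erfc(sqrt(chi2 / 2))  # survival function of chi-square with 1 df
    return chi2, p

# Rows: station failed / passed. Columns: board failed / passed.
# HYPOTHETICAL cell counts chosen only to match the reported margins.
hypothetical_table = [[3, 18], [4, 33]]
chi2, p = chi_square_2x2(hypothetical_table)

board_pass_rate = round(100 * (58 - 7) / 58, 1)  # 51 of 58 students passed
```

With only seven board failures, some expected cell counts fall below 5, a situation in which an exact test is often preferred to the chi-square approximation.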





1. Lee M, Wimmers PF. Clinical competence understood through the construct validity of three clerkship assessments. Med Educ 2011;45(8):849-857.

2. Pell G, Fuller R, Homer M, et al. How to measure the quality of the OSCE: A review of metrics - AMEE guide no. 49. Med Teach 2010;32(10):802-811.

3. Townsend AH, McIlvenny S, Miller CJ, et al. The use of an objective structured clinical examination (OSCE) for formative and summative assessment in a general practice clinical attachment and its relationship to final medical school examination performance. Med Educ 2001;35(9):841-846.

4. Payne NJ, Bradley EB, Heald EB, et al. Sharpening the eye of the OSCE with critical action analysis. Acad Med 2008;83(10):900-905.

5. Williams RG. Have standardized patient examinations stood the test of time and experience? Teach Learn Med 2004;16(2):215-222.

6. Nackman GB, Bermann M, Hammond J. Effective use of human simulators in surgical education. J Surg Res 2003;115(2):214-218.

7. Rogers PL, Jacob H, Rashwan AS, et al. Quantifying learning in medical students during a critical care medicine elective: a comparison of three evaluation instruments. Crit Care Med 2001;29(6):1268-1273.

8. Lai CW. Experiences of accreditation of medical education in Taiwan. J Educ Eval Health Prof 2009;6:2.

9. Levy MM, Dellinger RP, Townsend SR, et al. The Surviving Sepsis Campaign: results of an international guideline-based performance improvement program targeting severe sepsis. Intensive Care Med 2010;36(2):222-231.

10. Papadakis MA, Teherani A, Banach MA, et al. Disciplinary action by medical boards and prior behavior in medical school. N Engl J Med 2005;353(25):2673-2682.

11. Huang YS, Liu M, Huang CH, et al. Implementation of an OSCE at Kaohsiung Medical University. Kaohsiung J Med Sci 2007;23(4):161-169.

12. Chang KY, Tsou MY, Chan KH, et al. Item analysis for the written test of Taiwanese board certification examination in anaesthesiology using the Rasch model. Br J Anaesth 2010;104(6):717-722.

13. Schumacher HR, Jr., Beasley RP. Medical education, hospitals, and health care on Taiwan. Ann Intern Med 1985;102(3):409-410.

14. Hendrix D, Hasman L. A survey of collection development for United States Medical Licensing Examination (USMLE) and National Board Dental Examination (NBDE) preparation material. J Med Libr Assoc 2008;96(3):207-216.

15. Walsh M, Bailey PH, Koren I. Objective structured clinical evaluation of clinical competence: an integrative review. J Adv Nurs 2009;65(8):1584-1595.

16. Troncon LE. Clinical skills assessment: limitations to the introduction of an "OSCE" (Objective Structured Clinical Examination) in a traditional Brazilian medical school. Sao Paulo Med J 2004;122(1):12-17.

17. Wayne DB, Didwania A, Feinglass J, et al. Simulation-based education improves quality of care during cardiac arrest team responses at an academic teaching hospital: a case-control study. Chest 2008;133(1):56-61.

18. Patricio M, Juliao M, Fareleira F, et al. A comprehensive checklist for reporting the use of OSCEs. Med Teach 2009;31(2):112-124.

