By Kathleen Kapusta, J.D.
Five women denied paramedic jobs with the City of Chicago when they failed its physical skills entrance exam should have prevailed on their Title VII disparate impact claims because the physical skills study was neither reliable nor validated under federal law, the Seventh Circuit ruled, reversing the district court’s entry of judgment for Chicago with instructions to enter judgment in their favor. The appeals court also remanded for a new trial on their disparate treatment claim, finding that the district court provided an erroneous jury instruction (Ernst v. City of Chicago, September 19, 2016, Manion, D.).
The five experienced paramedics sought and were denied paramedic jobs with the Chicago Fire Department because they failed Chicago’s physical skills entrance exam that was created by Human Performance Systems, Inc. (HPS). The president of HPS had previously created a physical test for the fire department’s entry-level firefighters that had a disparate impact on women. Suing the city, the plaintiffs argued that Chicago’s decision to rehire the president for the paramedic test without taking bids from anyone else reflected its desire to reduce the number of women it hired as paramedics.
Disparate treatment, impact.
The case was split into two parts. The plaintiffs’ disparate treatment claims went to a jury, and their disparate impact claims were tried in a separate bench trial. When the jury expressed confusion over a jury instruction, the court told the jurors that the instruction spoke for itself. Four minutes later, the jury returned a verdict for the defense. As to the disparate impact trial, the district court found that Chicago satisfied its burden of proving that the test was job-related and consistent with business necessity. Thus, it entered judgment for Chicago.
On appeal, the court turned first to the jury instruction provided by the district court in the plaintiffs’ disparate treatment trial, which stated: "To determine that a Plaintiff was not hired because of her gender, you must decide that the City would have hired the Plaintiff had she been male but everything else had been the same." Noting that the jury should have been instructed on the plaintiffs’ burden of proving that Chicago was motivated by anti-female bias when Chicago created the entrance exam that caused them not to be hired, the court found that instead, jurors were instructed on a different burden, which failed to address Chicago’s motive for creating the skills test. This instruction, said the court, focused on gender as a factor in the specific decisions not to hire these five plaintiffs, without expressly stating the mandatory question: whether Chicago had an anti-female motivation for creating its skills test.
Although the legal error was enough to establish prejudice, the court noted that the jurors saw it as the pivotal issue before them when they informed the court that the instruction was confusing and that they couldn’t deliberate further without clarification. Only four minutes after the district judge told them to take the instruction at face value, they returned a defense verdict. Thus, the court remanded the disparate treatment claims for a new trial with proper instructions.
As to the plaintiffs’ disparate impact claims, Chicago relied on the validity study to establish that its physical skills test was job-related and consistent with business necessity. Noting that federal regulations establish technical standards for validity studies relied upon by employers, the court pointed out that, based on the work of the HPS president, Chicago implemented a physical entrance exam with three components: a modified stair-climb, an arm-endurance test, and a leg lift.
In the Title VII context, the court explained, a validity study examines whether an employer is using an appropriate selection procedure in its hiring process. Federal regulations require that a validity study establish specific criteria that empirically demonstrate that the selection procedure predicts or significantly correlates with important job-performance elements. Here, the specific criteria were the physical skills that the president tested against work samples to see whether the skills could be validated as job-related skills. She ultimately found three valid.
The applicable federal regulations provide two specific guidelines for determining whether a sample population is representative: whether the individuals in the sample population (here, volunteer incumbent paramedics) are representative of individuals who are normally available in the Chicago paramedic market and whether, in a concurrent validity study, the test focuses on specific skills or knowledge that are the "primary" focus of skills or knowledge that Chicago paramedics learn on the job.
Volunteers not representative.
Turning to the president’s use of volunteer paramedics for her study, the court found that on its own, this was not a basis for setting aside her study results. By her own testimony, however, these volunteer paramedics did not represent the skill-set in the general population of Chicago paramedics as they performed better than public-sector and private-sector paramedics normally perform. Concerned that the study results might not be representative, she combined her data on 52 Chicago paramedics with a comparable data-set on 87 New York City paramedics. But there was no evidence that the New York City paramedics performed at a lower skill level than paramedics who are "normally available in the [Chicago] labor market," and Chicago did not explain why it believed that by adding 87 New York scores, the problem of abnormally high scores was resolved.
On the issue of reliability, the court noted that the reliability score for the lift and carry work sample was only 0.503. "That is a 50/50 chance of reliability," said the court, observing that there was no apparent effort to separate the lift and carry from the rest of the study. Because each of the three skills (modified stair-climb, arm endurance, and leg lift) was validated by correlating it against all three work samples (lift and carry, stair-chair push, and stretcher lift), the unreliability of the lift and carry score undermined all three skills that Chicago tested in its physical entrance exam. Thus, the president could not establish the validity of her study.
Moreover, even if reliability were fully established, validity would still be a problem. Because Chicago used the work-sample tests to validate the skills tests without ever validating the work samples themselves, the court found that it could not conclude that these work samples reflected "the primary focus of" paramedic skills learned on the job. There simply was no evidence that the work-sample test was a proper validation of job skills. On the contrary, the court questioned whether the work samples actually tested the skills that Chicago paramedics learn on the job, as expressly required by the governing regulations.
Addressing the work samples, the court found that it lacked the information needed to decide whether it was appropriate to time two of the tests. Moreover, the third test did not resemble skills learned on the job. Concluding that at least two out of three work samples were not valid, the court noted that the validity of the three skills tested in Chicago’s entrance examination depended on all three work samples being valid. This undermined the entire physical-skills entrance test that Chicago administers, said the court. Finding that the lack of connection between real job skills and tested job skills was fatal to Chicago’s case, the court reversed the judgment entered for Chicago on the disparate impact claims.