
Analysis of Observational Studies, Experiments, and Surveys
Overview

Introduction

We reviewed documentation from various automation-related experiments, surveys (including our own), observation studies, and other studies that represented a combination of approaches. In these documents we identified and recorded supportive and contradictory evidence related to the flight deck automation issues in 24 experiments, 29 surveys, and 15 observational studies. Strength ratings depended on the type of study reviewed, the methodology and type of subjects used in the study, and the type of evidence yielded by the study. Details are described below.

Experiments and Observation Studies

Some of the studies we reviewed were experiments conducted in simulators or in laboratories. Others were observation studies in which observers watched pilots in simulators or in flight operations. In such studies, subjects could perform in a manner consistent with or contrary to the problem suggested by an issue statement. We reviewed these studies for such supportive and contradictory evidence. Where the results were reported as percentages of subjects performing in a manner consistent with or contradictory to an issue, we assigned strengths according to the following table.

Strength % of Subjects
± 5 90-100%
± 4 75-89%
± 3 50-74%
± 2 25-49%
± 1 1-24%

For example, given an experiment in which 25% of the line pilot subjects did not know how to perform an important flight management system operation, we would have recorded two instances of evidence related to issue105 (understanding of automation inadequate): one supportive and one contradictory. The first would have been recorded with strength +2, for the pilots who lacked understanding; the second with strength -4, for the remaining 75% of pilots.

In such a study in which we could not determine exact percentages, we used the following table to assign strengths.

Strength % of Subjects
± 3 > 50% (e.g., 'most')
± 1 number cannot be determined from excerpt (e.g., 'some')

For example, if 'most' of the data from such a study supported (or contradicted) the problem suggested by the issue statement of issue105, we would have recorded evidence for it with strength +3 (or -3). If 'some' of the data supported (or contradicted) issue105, we would have recorded evidence for it with strength +1 (or -1).
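As a sketch of how the two tables above operate together, the following Python (our own illustrative code, not tooling from the study; the function names are ours) maps a percentage of subjects, or a vague quantifier, to an evidence-strength magnitude and reproduces the issue105 example:

```python
def strength_from_percentage(pct):
    """Evidence-strength magnitude for the percentage of subjects
    performing consistently with (or contrary to) an issue."""
    if 90 <= pct <= 100:
        return 5
    if 75 <= pct < 90:
        return 4
    if 50 <= pct < 75:
        return 3
    if 25 <= pct < 50:
        return 2
    if 1 <= pct < 25:
        return 1
    return 0  # 0% of subjects: no evidence of this kind to record

def strength_from_quantifier(word):
    """Fallback when the excerpt gives only a vague quantifier."""
    return 3 if word == "most" else 1  # 'some' or indeterminate -> 1

# issue105 example: 25% of subjects lacked understanding (supportive
# of the issue), so the remaining 75% did not (contradictory).
supportive = +strength_from_percentage(25)     # +2
contradictory = -strength_from_percentage(75)  # -4
```

The sign is applied afterward, since the same magnitude table serves both supportive (+) and contradictory (-) evidence.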

Some experimental studies we reviewed tested hypotheses. For each of these we based evidence strength on the type of subjects used in the experiment, the type of tasks the subjects performed, and the results. We used the following table to assign strengths.

Strength   Subjects                          Tasks                          Results
+5         line pilots trained on equipment  line operations                supportive
+4         line pilots trained on equipment  simulated line operations      supportive
+3         other line pilots                 simulated line operations      supportive
           line pilots trained on equipment  other simulated flight tasks   supportive
+2         other line pilots                 other simulated flight tasks   supportive
           GA/student pilots                 simulated line operations      supportive
+1         line pilots                       generic automation tasks       supportive
           GA/student pilots                 other flight automation tasks  supportive
-1         line pilots                       generic automation tasks       contradictory
           GA/student pilots                 other flight automation tasks  contradictory
-2         other line pilots                 other simulated flight tasks   contradictory
           GA/student pilots                 simulated line operations      contradictory
-3         other line pilots                 simulated line operations      contradictory
           line pilots trained on equipment  other simulated flight tasks   contradictory
-4         line pilots trained on equipment  simulated line operations      contradictory
-5         line pilots trained on equipment  line operations                contradictory

Subjects in such a study could be line pilots actually trained on the equipment used in the experiment, other line pilots (not trained on the equipment used in the experiment), general aviation (GA) pilots, or student pilots. The tasks the subjects performed in the experiments could conceivably be tasks performed in actual line operations or they could be tasks performed in simulated line operations, simulated flight tasks in part-task simulators, or generic automation tasks performed in laboratories. The results could either be supportive of or contradictory to an issue.

For example, consider an experiment conducted to test the hypothesis that flightcrews respond more quickly to air traffic control (ATC) clearances when flying the airplane manually than when using the flight management system (FMS). The experiment involves line pilots using a part-task simulator modeling equipment on which they have been trained. Each flies several scenarios, half of them manually and half with the FMS, and responds to ATC clearances. The results show that the mean time to begin complying with an ATC clearance is 4.5 seconds manually and 8.1 seconds with the FMS, a difference that is statistically significant at the p = 0.0963 level (good statistical significance for this type of experiment). This would be supportive evidence for issue161 (when using automation, pilot response to unanticipated events and clearances may be slower than it would be under manual control, possibly increasing the likelihood of unsafe conditions). We would rate the strength of this evidence as +3 (line pilots trained on equipment, other simulated flight tasks, supportive of the issue).
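The table and taxonomy above amount to a lookup keyed on subject type and task type, with the sign supplied by the direction of the results. A minimal sketch in Python (the dictionary keys simply restate the table's row labels; the code and names are ours, not the study's tooling):

```python
# Evidence-strength magnitudes for hypothesis-testing experiments,
# keyed on (subject type, task type). The sign comes from whether
# the results were supportive (+) or contradictory (-).
MAGNITUDE = {
    ("line pilots trained on equipment", "line operations"): 5,
    ("line pilots trained on equipment", "simulated line operations"): 4,
    ("other line pilots", "simulated line operations"): 3,
    ("line pilots trained on equipment", "other simulated flight tasks"): 3,
    ("other line pilots", "other simulated flight tasks"): 2,
    ("GA/student pilots", "simulated line operations"): 2,
    ("line pilots", "generic automation tasks"): 1,
    ("GA/student pilots", "other flight automation tasks"): 1,
}

def evidence_strength(subjects, tasks, supportive):
    """Signed evidence strength for a hypothesis-testing experiment."""
    sign = 1 if supportive else -1
    return sign * MAGNITUDE[(subjects, tasks)]

# The ATC/FMS example: trained line pilots, a part-task simulator
# ("other simulated flight tasks"), supportive results -> +3.
rating = evidence_strength(
    "line pilots trained on equipment", "other simulated flight tasks", True)
```

Because the table is symmetric about zero, only the magnitudes need to be stored; contradictory results simply negate them.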


Surveys

In some of the surveys we reviewed, respondents were asked to rate their level of agreement with assertions equivalent to our issue statements. We assigned strengths based on the percentage of respondents agreeing with or disagreeing with these assertions, according to the following table.

Strength % of Respondents
± 5 90-100%
± 4 75-89%
± 3 50-74%
± 2 25-49%
± 1 1-24%

For example, suppose that, in such a survey, 63% of the respondents agreed or strongly agreed with the assertion "Overall, the flight management system reduces workload." We would have recorded evidence related to issue079 (automation may adversely affect pilot workload) with strength -3, because these results contradict the problem suggested by issue079's issue statement. In this case we would not have recorded supportive evidence: the fact that some respondents (perhaps as many as 37%) did not agree that the FMS reduces workload does not mean they believe it actually increases workload.

In some of the surveys we reviewed, subjects were asked to respond to assertions equivalent to our issue statements by giving their level of agreement as Likert scores (e.g., 1 means strongly disagree, 5 means strongly agree). When the results were given as a mean score as a percentage of the maximum possible score (a percentage of 5, in the example), we used the following table to assign strengths to evidence.

Strength Mean Score as a % of Maximum
-5 0-9%
-4 10-19%
-3 20-29%
-2 30-39%
-1 40-49%
0 50%
+1 51-60%
+2 61-70%
+3 71-80%
+4 81-90%
+5 91-100%

The table is based on the assumptions that scores could range from 0% of maximum score (for strongly disagree) to 100% of maximum score (for strongly agree), that the survey question was worded consistently with the asserted issue, and that 50% of maximum score was neutral (did not agree or disagree).

If the assertion to which the subjects responded was worded opposite to the issue statement, we reversed the strength signs. We did not count evidence both for and against an issue unless response distribution information or minimum and maximum responses were given. For example, consider a survey in which pilot respondents were asked to give their level of agreement with the assertion "Pilots fully understand the flight management system." If the mean response was 1.9 on a scale of 1 (strongly disagree) to 5 (strongly agree), we would have recorded evidence with strength +2 (1.9 is 38% of 5), since the subjects as a group tended to disagree with the survey assertion, thereby agreeing with the issue statement of issue105. If it was also reported that the maximum response was 5, we would also have recorded evidence with strength -1, since at least one subject strongly disagreed with issue105.
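This mapping, together with the sign reversal for oppositely worded assertions, can be sketched as follows (illustrative Python of our own; the asymmetric bin edges follow the table above, and the function name is ours):

```python
def strength_from_mean_score(pct, reversed_wording=False):
    """Map a mean Likert score, expressed as a percentage of the
    maximum possible score, to an evidence strength in -5..+5.
    If the survey assertion was worded opposite to the issue
    statement, the sign is reversed."""
    if pct < 50:
        strength = -5 + int(pct // 10)        # 0-9% -> -5, ..., 40-49% -> -1
    elif pct == 50:
        strength = 0                          # neutral
    else:
        strength = min(5, 1 + int((pct - 51) // 10))  # 51-60% -> +1, ...
    return -strength if reversed_wording else strength

# issue105 example: mean response 1.9 on a 1-5 scale is 38% of the
# maximum (strength -2); the assertion was worded opposite to the
# issue statement, so the sign reverses to +2.
rating = strength_from_mean_score(38, reversed_wording=True)  # +2
```

Note the asymmetry in the table: the negative bins are ten points wide starting at 0%, exactly 50% is neutral, and the positive bins start at 51%.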

In surveys in which we could not determine exact percentages, we used the following table to assign strengths.

Strength % of Respondents
± 3 > 50% (e.g., 'most')
± 1 number cannot be determined from excerpt (e.g., 'some')

For example, if 'most' of the respondents from such a study agreed with an assertion consistent with the issue statement of issue079, we would have recorded evidence for it with strength +3. If 'some' of the respondents agreed with the assertion, we would have recorded evidence for it with strength +1.



Results

Experiments

We found evidence for flight deck automation issues in 24 of the experiments we reviewed. The experiments are listed alphabetically by author and include links to the bibliographic information and evidence found in the report.

Investigator(s) Short Description of Experiment
Airbus Industrie An experiment designed to compare the sidestick/fly-by-wire combination and conventional controls EVIDENCE
Airbus Industrie Two experimental studies comparing the performance of conventional instruments and advanced instruments in the A310. EVIDENCE
Barbato, G. An experiment designed to evaluate the impact on the pilot of pilot voice recognition and automatic target cueing integrated into a single-seat fighter cockpit simulator. EVIDENCE
Beringer, D.B. An experiment designed to explore some of the more serious and more subtle malfunctions that have a moderate probability of causing the termination of the flight in civil aviation. EVIDENCE
Beringer, D.B., & Harris, H.C., Jr. Two experiments designed to explore pilot response during autopilot malfunctions, and system malfunctions that influence the autopilot, that vary in how obvious their effects are and how quickly they are manifest. EVIDENCE
Edwards, R.E., Tolin, P., & Jonsen, G.L. A simulation study used to assess the impact of two navigation- and two flight control modes on pilot visual behavior. EVIDENCE
Inagaki, T., Takae, Y., & Moray, N. An experiment designed to explore the effect of Go/Abort messages presented on the interface to aid the pilot in making correct Go/No-Go decisions. EVIDENCE
Lin, H.X. & Salvendy, G. An experiment designed to explore whether a specific class of warnings can reduce human error by increasing users' level of conceptual knowledge. EVIDENCE
Lozito, S., McGann, A., & Corker, K. An experimental simulation used to investigate the effects of data-linked ATC communication on an automated flightdeck. EVIDENCE
Mosier, K.L., Skitka, L.J., Heers, S., & Burdick, M. An experimental simulation used to investigate omission and commission errors resulting from the use of automated cues as a heuristic replacement for vigilant information seeking and processing. EVIDENCE
Mumaw, R.J., Sarter, N.B., & Wickens, C.D. A fixed-based simulator experiment of Boeing 747-400 line pilots designed to address issues related to the role of pilot monitoring in the loss of mode awareness on automated flight decks. EVIDENCE
Muthard, E.K. & Wickens, C.D. An experiment designed to investigate the effects of automation on the pilot's task of plan monitoring and making plan revisions. EVIDENCE
Muthard, E.K. & Wickens, C.D. An experiment designed to investigate the effects of automation and task loading on the pilot's task of plan monitoring and making plan revisions. EVIDENCE
Petridis, R.S., Lyall, E.A., & Robideau, R.L. A study in which the activities of pilots were coded during flight and then coding was used to analyze the effect of automation EVIDENCE
Pritchett, A.R. & Johnson, E.N. A part-task simulator study conducted to explore incidents occurring in A320 aircraft involving Vertical Speed Mode EVIDENCE
Pritchett, A.R., & Johnson, E.N. An experimental simulator study was run to test pilot detection of an error in autopilot mode selection EVIDENCE
Riley, V., Lyall, E., & Wiener, E. Two experiments were designed to identify and characterize factors that influence pilot decisions about whether or not to use automation. The first was a simple computer-based experiment and the second was a series of simulator studies. EVIDENCE
Riley, V.A. Experiments designed to provide basic empirical evidence on how selected factors influence automation use decisions EVIDENCE
Roscoe, A.H. An experiment used to compare levels of workload between B767 and B707-200 EVIDENCE
Sarter, N.B. & Woods, D.D. A part-task simulator experiment designed to address issues related to pilot's proficiency in standard tasks, mental models of the FMS, and mode awareness EVIDENCE
Sarter, N.B. & Woods, D.D. An experimental simulation study of mode awareness and pilot-automation coordination on the flight deck of the A-320 EVIDENCE
Skitka, L.J., Mosier, K.L., Burdick, M., & Rosenblatt, B. Experiment to study whether two-person crews are as likely as one-person crews to commit errors due to automation bias. EVIDENCE
Speyer, J.J. & Blomberg, R.D. An experiment involving the interrogation of pilots for scaled workload assessment ratings EVIDENCE
Speyer, J.J., Fort, A., Fouillot, J.P., & Blomberg, R.D. A comparison of workload between DC-9 and A300FF using the Static Taskload Analysis EVIDENCE



Observation Studies

We found evidence for the flight deck automation issues in 15 of the observation studies we reviewed. The observation studies are listed alphabetically by author and include links to the bibliographic information and evidence found in the report.

Investigator(s) Short Description of Observation Study
Billings, C.E. Presents principles and guidelines for human-centered automation in aircraft and in the aviation system. EVIDENCE
Bruseberg, A., & Johnson, P. Discusses human-computer collaboration and its relationship to different foci that can be used to model temporal aspects of tasks in dynamic and complex work situations. EVIDENCE
Damos, D.L., John, R.S., & Lyall, E.A. An observational study designed to explore the relationship between the level of automation in the flight deck and the amount of time the pilot spends performing specific activities. EVIDENCE
Damos, D.L., John, R.S., & Lyall, E.A. An observational study designed to investigate how the frequency of 23 activities varied as a function of the level of automation in the flight deck. EVIDENCE
Hughes, D. Observations made during visits to TWA and Air Canada training centers in St. Louis and the flights on which the author rode jumpseat EVIDENCE
Norman, S.D., & Orlady, H.W. A discussion of the major ideas and concepts presented in the panels and papers about automation in the air transport system EVIDENCE
Orlady, H.W. A discussion of training issues for advanced technology aircraft EVIDENCE
Palmer, E.A. & Mitchell, C.M. Flight deck descent procedures were developed for a field evaluation of the CTAS Descent advisor conducted in the fall of 1995. EVIDENCE
Sarter, N.B. & Woods, D.D. An observation of simulator check ride in the process of transitioning to the B-737-300 aircraft EVIDENCE
Sarter, N.B. & Woods, D.D. A discussion about mode awareness problems in glass cockpits EVIDENCE
Sarter, N.B. & Woods, D.D. A study of the pilots' transition into the advanced technology B737-300 from non-advanced aircraft. EVIDENCE
Wiener, E.L. An observation made while author was riding in the jumpseat of a glass cockpit aircraft EVIDENCE
Wiener, E.L. A discussion of the management of human error in the cockpit. EVIDENCE
Wise, J.A., Abbott, D.W., Tilden, D., Dyck, J.L., Guide, P.C., Ryan, L. A discussion with workshop participants used to investigate the impact of automation in corporate aviation cockpits EVIDENCE
Woods, D.D. A discussion of intelligent interfaces EVIDENCE

Surveys

We found evidence for the flight deck automation issues in 29 of the surveys we reviewed. The surveys are listed alphabetically by author and include links to the bibliographic information and evidence found in the report.

Investigator(s) Short Description of Survey
Airbus Industrie A detailed survey given to teams of visiting aircrew about the A320 sidestick/fly-by-wire 'proof of concept' EVIDENCE
Braune, R. A survey of Deutsche Lufthansa pilots flying in a mixed fleet of 737-200/ -300 EVIDENCE
Bruseberg, A., & Johnson, P. This paper discusses the merits of drawing analogies between human-computer interaction and human-human collaboration in the light of the ever-advancing capability of computer systems. EVIDENCE
Curry, R.E. A survey of pilots during the introduction of an advanced technology B-767 aircraft EVIDENCE
Gras, A., Moricot, C., et al. A survey of pilots working for French airline companies to assess their attitudes about flight deck automation EVIDENCE
Hutchins, E., Holder, B., & Hayward, M. A survey of line pilots' attitudes about autoflight automation. EVIDENCE
James, M., McClumpha, A., Green, R., Wilson, P., & Belyavin, A. A survey of UK commercial pilots used to assess their opinions and attitudes toward advanced automated aircraft EVIDENCE
Last, S., & Alder, M. A survey of pilots to determine the views of line pilots about the lack of feedback movement of A320 thrust levers EVIDENCE
LUFTHANSA Airline A survey of pilots about general characteristics of airplane and electronic interfaces EVIDENCE
Lyall, B., Wilson, J., & Funk, K. A survey of pilots for evidence related to flight deck automation issues. EVIDENCE
Lyall, E.A. A survey of 737 pilots to assess the effects of allowing pilots to concurrently fly two derivatives of the Boeing 737 EVIDENCE
Lyall, E., Niemczyk, M. & Lyall, R. A survey of aviation experts used to compile evidence for problems or concerns about flightdeck automation. EVIDENCE
McClumpha, A.J., & James, M.R. A survey of UK commercial pilots used to assess their opinions and attitudes toward advanced automated aircraft EVIDENCE
Morters, K. A survey of pilots used to examine issues concerning the pilot-automated flight-deck interface on the B767 EVIDENCE
Noyes, J.M. & Starr, A.F. A survey of commercial flight crews to identify user requirements for designing the next generation of warning systems. EVIDENCE
Orlady, H.W., & Wheeler, W.A. A survey of pilots of advanced technology aircraft used to investigate training and maintenance of basic flying skills EVIDENCE
Rash, C.E., Adam, G.E., LeDuc, P.A., & Francis, G. The study identified which aspects of the two cockpit designs were most favorable or troublesome to the pilots, and identified differences in opinions across pilots who flew traditional or glass cockpit designs. EVIDENCE
Rudisill, M. A survey of line pilots' attitudes about flight deck automation - I EVIDENCE
Rudisill, M. A survey of line pilots' attitudes about flight deck automation - II EVIDENCE
Sarter, N.B. & Woods, D.D. A survey of pilots' experiences with training for and the operation of the A320 automation EVIDENCE
Sarter, N.B. & Woods, D.D. A survey of B-737-300 pilots' attitudes about the FMS EVIDENCE
Sherman, P.J., Helmreich, R.L., & Merritt, A. Survey of multi-national airline pilots' attitudes toward automation. EVIDENCE
Speyer, J.J. A survey of pilots about the quality of the man-machine interface EVIDENCE
Stefanovich, Y., & Thouanel, B. An exchange of opinions about the A320 among pilots EVIDENCE
Wiener, E.L. A study of the pilots' transition into the advanced technology B757 from non-advanced aircraft. EVIDENCE
Wiener, E.L. A longitudinal survey of pilots about transitioning from a traditional technology aircraft to a highly automated derivative model EVIDENCE
Wiener, E.L., Chidester, T.R., Kanki, B.G., Palmer, E.A., Curry, R.E., & Gregorich, S.E. A survey of pilots to assess subjective workload in a LOFT simulator experiment EVIDENCE
Wiener, E.L., Chidester, T.R., Kanki, B.G., Palmer, E.A., Curry, R.E., & Gregorich, S.E. A questionnaire given to pilots designed to elicit their opinions, experience level, and specific information and viewpoints on the DC-9 and MD-88 EVIDENCE
Wise, J.A., Abbott, D.W., Tilden, D., Dyck, J.L., Guide, P.C., Ryan, L. A survey of pilots who regularly fly corporate missions used to obtain information on various aspects of cockpit automation EVIDENCE


  Last update: 20 September 2007 Flight Deck Automation Issues Website  
© 1997-2013 Research Integrations, Inc.