Title: Information Search in an Autocorrelated Causal Learning Environment
Benjamin Margolin Rottman ([email protected])
Learning Research and Development Center, 3939 O’Hara St, Pittsburgh, PA 15260 USA

Abstract

When trying to determine which of two causes produces a more desirable outcome, if the outcome is autocorrelated (goes through higher and lower periods) it is critical to switch back and forth between the causes. If one first tries Cause 1 and then tries Cause 2, it is likely that an autocorrelated outcome would appear to change with the second cause even though it is merely undergoing normal change over time. Experiment 1 found that people tend to perseverate rather than alternate when testing the effectiveness of causes, and that perseveration is associated with substantial errors in judgment. Experiment 2 found that forcing people to alternate improves judgment. This research suggests that a debiasing approach that teaches people when to alternate may be warranted to improve causal learning.

Keywords: Information Search, Causal Inference, Autocorrelated Environment, Dynamic Environment

Introduction

As researchers we are all familiar with history as a threat to internal validity. For example, suppose that we are comparing two interventions. Designing an experiment in which all participants first experience Intervention 1 and then Intervention 2 is flawed because a historical event, maturational change, or order effect could confound the results and make it seem as if there is a real difference between the two interventions even if there is not. Despite the dubiousness of such a learning strategy, it seems common in everyday learning situations. For example, a person is prescribed a new blood pressure medicine, tries it for a week, notices an improvement, and concludes that the new medicine works better than the old medicine.
This inference is flawed because any number of other changes over time, such as changes in diet or stress, could be responsible for the change in blood pressure. Or, consider a parent who starts to bribe his child to behave better and notices an improvement. The change could be due to the bribe or to any number of other factors, such as starting to play a sport or growing more mature.

One way to increase the validity of such a “single-subject” design is to alternate between the two conditions (e.g., 1, 2, 1, 2) (Barlow & Hayes, 1979). With more alternations it is less likely that the baseline trend will correlate with the two conditions, reducing the likelihood of being fooled into believing that there is a difference merely due to the baseline trend. The current manuscript examines what sort of “experiments” people tend to design [e.g., (1, 1, 1, 2, 2, 2) vs. (1, 2, 1, 2, 1, 2)], and whether the experimental design influences their conclusions.

Information Search in Decisions from Experience

In general, when a learner has the opportunity to choose a piece of information to sample, this is called “active learning” or “information search.” One common information search task involves a learner repeatedly choosing between two or more options, x=1 or x=2; after each choice the learner receives the outcome Y. By sampling the two choices the learner forms an expectation of the outcome Y given the different choices of X, and can use that expectation to choose a value of X that produces a desired outcome of Y. Experiments of this sort can reveal the patterns that people use when selecting X, how the information search pattern influences what is learned, and how well the learner obtains the desired outcome.

Information search paradigms vary on many dimensions; here I focus on the difference between “stable” and “dynamic” environments. In a stable environment the outcome of Y given a particular choice (e.g., x=1) is stable over time.
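The repeated choose-and-observe paradigm described above can be sketched as a minimal simulation. This is an illustrative sketch, not the task used in the experiments reported here; the random learner and the `switch_rate` helper are assumptions for illustration, and the stable payoff distributions follow the normal-distribution example discussed next:

```python
import random

rng = random.Random(0)

def outcome(x):
    """Stable environment: each option's payoff distribution never changes.
    Here Y | x=1 ~ Normal(10, 2) and Y | x=2 ~ Normal(12, 2)."""
    return rng.gauss(10.0 if x == 1 else 12.0, 2.0)

def switch_rate(choices):
    """Proportion of trials on which the learner picked a different
    option than on the previous trial (an alternation measure)."""
    return sum(a != b for a, b in zip(choices, choices[1:])) / (len(choices) - 1)

# A learner who samples the two options at random and tracks running means.
choices, samples = [], {1: [], 2: []}
for t in range(60):
    x = rng.choice([1, 2])
    choices.append(x)
    samples[x].append(outcome(x))

means = {x: sum(ys) / len(ys) for x, ys in samples.items()}
print(switch_rate(choices), means)
```

A learner who trusts the running means should end up preferring x=2, since its average payoff is higher.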
For example, Y given x=1 could be determined by a normal distribution with mean=10 and SD=2, whereas Y given x=2 could be determined by a normal distribution with mean=12 and SD=2. Hills and Hertwig (2010) found that in a stable environment the sampling pattern that individual participants used influenced their beliefs about which choice produced higher payoffs. It appears that people who frequently switched back and forth between the two options were essentially comparing which choice produced a higher outcome on sequential choices. At the end they tended to choose the option that more frequently produced a higher outcome, even though on average it produced a lower mean outcome. In contrast, people who switched less frequently tended to choose the option that on average produced the higher outcome. In sum, perseverating was associated with maximizing expected value.

Other experiments have investigated information search in dynamic environments. A dynamic environment is one in which the probability of reward does not remain stable over time. The main type of dynamic environment that has been studied is one in which sometimes x=1 produces a higher reward than x=2, and sometimes it produces a lower reward than x=2 (e.g., Biele, Erev, & Ert, 2009; Daw, O’Doherty, Dayan, Seymour, & Dolan, 2006; Yi, Steyvers, & Lee, 2009). The outcome is autocorrelated in the sense that if x=1 is the better choice at Time 5, it will likely be the better choice at Time 6, but participants do not know how long it will remain the better choice. Dynamic environments have been used primarily in conjunction with tasks that involve both exploration and exploitation; participants are instructed
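The hazard raised in the Introduction, that an autocorrelated outcome can masquerade as a causal effect when the two options are tested in blocks rather than alternated, can be illustrated with a small simulation. The random-walk outcome, trial counts, and designs below are assumed parameters for illustration, not the materials of Experiments 1 and 2:

```python
import random

def drifting_outcome(n, sd=1.0, seed=None):
    """An autocorrelated outcome: a random walk that drifts through higher
    and lower periods regardless of which cause is chosen."""
    rng = random.Random(seed)
    y, series = 0.0, []
    for _ in range(n):
        y += rng.gauss(0.0, sd)
        series.append(y)
    return series

def apparent_effect(design, outcomes):
    """Mean outcome under Cause 2 minus Cause 1. Neither cause does
    anything here, so any nonzero difference is illusory."""
    y1 = [y for c, y in zip(design, outcomes) if c == 1]
    y2 = [y for c, y in zip(design, outcomes) if c == 2]
    return sum(y2) / len(y2) - sum(y1) / len(y1)

n = 12
blocked = [1] * 6 + [2] * 6   # perseverating design: 1,1,1,1,1,1,2,2,2,2,2,2
alternating = [1, 2] * 6      # alternating design:   1,2,1,2,1,2,1,2,1,2,1,2

runs = 2000
bias = {"blocked": 0.0, "alternating": 0.0}
for seed in range(runs):
    y = drifting_outcome(n, seed=seed)
    bias["blocked"] += abs(apparent_effect(blocked, y)) / runs
    bias["alternating"] += abs(apparent_effect(alternating, y)) / runs

print(bias)
```

Averaged over many simulated learners, the blocked (perseverating) design shows a much larger illusory difference between the two causes than the alternating design, which is exactly the confound that alternation is meant to break.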
Publication Year: 2014
Publication Date: 2014-01-01
Language: en
Type: article