From ml-cald-faculty-request@mlist-1.sp.cs.cmu.edu Wed Jan 14 19:07:48 1998
Received: from mlist-1.sp.cs.cmu.edu (MLIST-1.SP.CS.CMU.EDU [128.2.185.162]) by temper.stat.cmu.edu (8.6.10/8.6.6) with SMTP id TAA14066; Wed, 14 Jan 1998 19:07:46 -0500
Message-Id: <199801150007.TAA14066@temper.stat.cmu.edu>
Received: from mlist-1.sp.cs.cmu.edu by mlist-1.sp.cs.cmu.edu id aa13797; 14 Jan 98 19:06 EST
Received: from DAYLILY.LEARNING.CS.CMU.EDU by mlist-1.sp.cs.cmu.edu id aa13795; 14 Jan 98 19:05 EST
Received: from [127.0.0.1] by daylily.learning.cs.cmu.edu id aa10684; 14 Jan 98 19:04 EST
X-Mailer: exmh version 2.0gamma 1/27/96
From: Tom Mitchell
To: Tom Mitchell
cc: Diane Stidle, cald-faculty@cs.cmu.edu, STC-email@cs.cmu.edu, anderson@psy.cmu.edu, jack.mostow@cmu.edu, ken.koedinger@cmu.edu, jef@cs.cmu.edu, marcel.just@cmu.edu, macw@cmu.edu, Marcia.Lovett@cmu.edu, dan.olsen@cs.cmu.edu, vanlehn+@pitt.edu, Chi@vm2.cis.pitt.edu, resnick@vms.cis.pitt.edu, klatzky+@andrew.cmu.edu, cg09+@andrew.cmu.edu
Subject: Re: Next steps: NSF STC on Learning
In-reply-to: Your message of "Thu, 08 Jan 1998 18:54:24 EST." <199801082355.SAA05500@cmu1.acs.cmu.edu>
X-url: http://www.cs.cmu.edu/~tom/
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 14 Jan 1998 19:04:52 -0500
Sender: Tom_Mitchell@daylily.learning.cs.cmu.edu
Status: R

Folks,

Here is the list of contributed 200-word abstracts. We'll discuss these and the next steps at 4:30pm Thursday, Wean 4623. Please try to have at least one representative from each team present.

cheers
Tom

================================================================
Learning to Select Actions - in real-time dynamic decision making

Complex environments where machines, humans, or animals need to perform tasks are dynamic and require the ability to make decisions in real time.
We have experience with a variety of such environments, including air-traffic control simulation, network data traffic control, medical data monitoring, industrial process optimization, dynamic control of a robot manipulator, coordination of multiple robots in an adversarial domain, and financial investment selection. Our current investigations support our belief that sensitivity to dynamic changes in the patterns of an environment or task is essential to optimal performance, for both natural and artificial systems. We propose to investigate how learning occurs and can be used to improve a system's decision-making performance in a dynamic, real-time environment. Several questions will drive our proposed work. How can a system acquire robust state-action mappings for real-time decision making that handle the uncertainty of the environment? If a goal needs to be achieved and the learner must experiment with its actions to do so, how should it experiment? How can a learner focus attention on the most task-relevant features of the environment? How can statistically relevant patterns be captured? Which models represent the behavior of humans operating dynamic systems? How do people merge targeted instruction with their own experiences? What are appropriate biophysical models of neuromodulation and action selection? How do multiple decision makers learn to decompose a task and share dynamic information efficiently? How do teams adapt to changes in role and status among members? Working together, we expect this cross-fertilization to yield new procedures and algorithms for building better artificial systems and to give us insight into how natural systems perform optimally. We aim for a seamless integration of our expertise that will also lead to a comprehensive comparison of different approaches.
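As a concrete (and purely illustrative) instance of the experimentation question raised above, a sketch of epsilon-greedy action selection, one standard way a learner trades off exploring untried actions against exploiting its current estimates; the function name and interface here are assumptions for illustration, not part of any team's proposal:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest estimated value (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

With epsilon = 0 the learner always exploits; raising epsilon buys more information about the environment at the cost of short-term performance.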
GROUP MEMBERS:
**************
Manuela Veloso - Captain  veloso@cs.cmu.edu
John Anderson  ja+@CMU.EDU
Avrim Blum  avrim+@cs.cmu.edu
Howie Choset  choset+@cs.cmu.edu
Randy Gobbel  gobbel@andrew.cmu.edu
Marcel Just  just+@cmu.edu
Jay Kadane  kadane@stat.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
John Levine  jml@vms.cis.pitt.edu
Roy Maxion  maxion@cs.cmu.edu
Andrew Moore  awm@cs.cmu.edu
Lynne Reder  reder+@CMU.EDU
Reid Simmons  reids+@cs.cmu.edu
Sebastian Thrun  thrun@cs.cmu.edu

================================================================
Reinforcement learning in structured task environments

Reinforcement learning is prospering in several fields simultaneously: computer science, AI, animal learning, neural models of cognition, human cognition, and statistics. Reinforcement learning concerns systems (natural and artificial) that alter their behavior according to rewards and punishments received from the environment. Within computer science this notion has frequently been formalized as the problem of learning Markov Decision Processes. One of the central problems facing reinforcement learning within each field is that of structure. Tasks are made up of subtasks, which are in turn made up of subtasks, forming a hierarchy. Also, some tasks may have components that are shared with other tasks, or similar to components in other tasks. Conventional approaches to reinforcement learning do not exploit this structure, thereby missing major opportunities to improve performance. We propose a cross-disciplinary attack on the problem of structure in reinforcement learning. We will investigate the theoretical, practical, human cognitive, animal learning, and neural aspects. A major part of this attack will be to compare human performance with that of known algorithms on structured and unstructured problems, and on transfer from one problem to another.
We expect that, currently, the best machine algorithms will beat humans on unstructured tasks, but lose badly when there is structure and show far poorer transfer performance. The goal will be to analyze natural performance to understand what procedures humans use to beat the machine algorithms in structured tasks, and then to incorporate these methods into new algorithms. Neuronal recordings in animals and imaging of regional brain activity in humans will be used to gain insight into the internal representations and processes underlying performance and learning in reinforcement learning tasks (what more can you say in ~300 words?). Theoretical analysis and practical applications of the new algorithms will then be pursued to understand convergence properties and assess algorithm robustness in the face of real-world task constraints. Complementing the basic analysis of structure, we will also consider the role that structuring the delivery of reinforcement may play in facilitating learning. It is well known from the animal learning literature that complex behaviors in animals can be shaped by artful trainers. Such shaping could dramatically improve the performance of many reinforcement learning algorithms, but to our knowledge there is no theoretical analysis of what impact shaping would have on the convergence of various reinforcement learning algorithms, or of how shaping itself can be optimized or tuned in the context of the particular demands of the various relevant tasks. Again a multi-disciplinary approach will be used, with behavioral, neurophysiological, theoretical, and practical methodologies all contributing to the analysis and optimization of the role of shaping in reinforcement learning.
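To make the shaping idea concrete: in tabular Q-learning (one common formalization of reinforcement learning over Markov Decision Processes), a trainer-supplied shaping bonus is simply an extra reward term added to the update. A minimal sketch, with all names and the dictionary-of-dictionaries representation assumed for illustration only:

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9, shaping=0.0):
    """One tabular Q-learning step.

    Q        : dict mapping state -> {action: estimated value}
    shaping  : extra reward supplied by a trainer to encourage
               intermediate progress toward the goal.
    """
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + shaping + gamma * best_next - Q[s][a])
    return Q[s][a]
```

The open question the abstract raises is exactly how such a bonus affects convergence: a naive shaping term changes the effective reward function, so the learner may converge to a policy optimal for the shaped task rather than the original one.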
Jay McClelland - Captain  jlm@cnbc.cmu.edu
Manuela Veloso  Manuela_Veloso@gama.prodigy.cs.cmu.edu
Marsha Lovett  lovett@cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com
Andrew Moore  awm@cs.cmu.edu
John Anderson  ja+@CMU.EDU

===========================================================================
Learning in Large Feature Spaces

In the traditional approach to machine learning, one of the first things the designer must do is select a small number of (hopefully important) features to give to the learning algorithm. However, as machine learning becomes more ubiquitous, often embedded within other systems, it is becoming increasingly critical to remove the designer from this part of the loop. To do so we need to better understand and address the problem of focusing attention: how to learn even though there may be a huge quantity of potentially irrelevant information in the input. There has been some recent progress on a number of fronts: statistical methods for dimension reduction, theoretical analyses of algorithms for learning in large feature spaces, empirical results, and cognitive models of human behavior. However, achieving a more comprehensive understanding and solution requires combining forces from areas that to date have attacked this problem separately. For example, given measurements of human behavior on tasks where relevant information must be discovered [Roy's wafer fault detection problem], can we extract a model whose properties can be understood mathematically, and then translated to an algorithm for artificial learning? At a more immediate level, how do the biases of statistical methods for dimension reduction (feature selection) interact with the biases of the artificial learning algorithms (e.g., neural nets and decision trees) that are going to be using those features? How can one use background knowledge and experimentation to generate plausible features in the first place?
What about mixed-media data: how do we solve these problems when inputs come from a variety of sources (text, audio, video)? By combining insights from psychology, artificial learning, and theoretical foundations, our goal is to make substantial progress on these basic questions, crucial to both the science and engineering of learning.

Avrim Blum - Captain  avrim+@cs.cmu.edu
Peter Spirtes  ps7z@andrew.cmu.edu
Johan Kumlien  kumlien@cs.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
Christos Faloutsos  christos@cs.cmu.edu
Roy Maxion  maxion@cs.cmu.edu

===========================================================================
Unsupervised + Supervised Learning

Machine learning techniques may be classified into four broad groups according to the type of training information they use:

1. In SUPERVISED LEARNING, each training example consists of a set of input values and a "label" or set of desired output values. Techniques of this type, including most kinds of neural-net and decision-tree learning algorithms, have been well studied and can be very powerful. However, in many important problem domains it is impossible or impractical to obtain enough labeled training data to build an accurate model.

2. In UNSUPERVISED LEARNING, the training examples are not labeled. The learning system must discover useful patterns or clusters within this set of examples. In many domains -- speech recognition is a good example -- unlabeled data is plentiful (just turn on the radio) while labeled data is scarce and expensive. However, these unsupervised-learning algorithms are not as powerful or as well studied as supervised learning algorithms. While they will usually find some kind of structure within the data set, the structure they find may have no relation to the problem the user is trying to solve.

3. In REINFORCEMENT LEARNING, the system is allowed to perform (perhaps based on a stream of inputs), and its performance is evaluated by an external teacher.
Good performance is reinforced, while bad performance is punished. Reinforcement learning techniques have been well studied, especially in the domain of control problems, but they can be extremely inefficient. Most existing reinforcement-learning algorithms operate in a single space of fine-grained features, rather like planning a trip across the country by focusing on each footstep. Researchers have just begun to study hierarchical reinforcement learning, in which the planning is done first at a high level of abstraction and then refined through filling in of successive levels of detail.

4. In EXPLICIT TEACHING, an external (usually human) teacher provides rules or constraints that will govern the system's behavior. This input is usually provided in symbolic form. For example, a character-recognition system might be told that small lateral shifts and changes of scale do not change the classification of the character. Such constraints can be extremely useful, but we do not have good general techniques for combining symbolic constraints with statistical, data-driven machine learning.

Most of the real-world problems that we would like to study in the STC will require more than one of these techniques. Each of these approaches has been developed largely in isolation from the others, but we believe that the time has come to explore ways of combining them. By breaking down these boundaries, we should be able to build more powerful learning systems that will take better advantage of *all* the available information about the problem at hand. Some examples of approaches we would like to investigate:

* In a "mostly unsupervised" learning system, we might use a small amount of labeled training data to build a crude but appropriate structure for a classification system. We would then feed a much larger amount of unlabeled training data through the resulting system in order to refine it.
Some form of the expectation-maximization (EM) algorithm or graphical-model learning might be used in constructing such a system.

* We might use explicit teaching to create the hierarchical levels and general structure of a control system, and then use some combination of reinforcement learning and unsupervised learning to tune this system.

* Explicit teaching could be used to supply some structural constraints for a constructive neural network, which then could be trained using supervised learning. We have many specific (and rather ad hoc) examples of this technique, especially in systems for handwriting and speech recognition, but we do not yet have general techniques for combining symbolic and statistical learning.

* Biological systems employ all of these approaches in various combinations (though only some "higher" animals seem to use explicit symbolic descriptions). Careful investigation of human and animal learning and problem-solving should give us some insight into the ways in which these modalities are combined in biological systems.

Scott Fahlman - Captain  sef+@cs.cmu.edu
Tom Mitchell  tom.mitchell@cs.cmu.edu
Jay McClelland  jlm@cnbc.cmu.edu
John Lafferty  lafferty@cs.cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com
Randy Gobbel  gobbel@andrew.cmu.edu
Mark Schervish  mark@stat.cmu.edu
Roni Rosenfeld  roni@cmu.edu
James Garrett  garrett@CMU.EDU

===========================================================================
Learning to Interact with Physical Environment

The world is full of different types of objects -- wooden objects, metal, plastic, plants, animals; some rigid, some not. People have a great facility for recognizing the material properties of objects, even unfamiliar objects (e.g., distinguishing plastic from wood, man-made from natural materials), and for understanding the properties of such materials. Recognition takes place mainly through visual input, but also through haptic, and even acoustic, feedback.
A key problem in both natural and machine learning is how such representations of materials are acquired and utilized. While much research has focused on recognizing and classifying objects, comparatively little has been done to date on recognizing and classifying the material properties of objects (relevant work includes research on texture recognition, Shaffer's work on visual color models and specularity, Krotkov's work on acoustic recognition of material properties, [FILL IN OTHER EXAMPLES HERE]). This is a natural area for cross-cutting research: the natural learning community (e.g., Klatzky) has investigated material and geometric properties of objects, as sensed via the haptic system; the statistical learning community is interested in pattern recognition in very high-dimensional feature spaces, especially in the presence of uncertainty (e.g., sensor uncertainty); and the AI (vision and robotics) community has investigated object recognition and has some experience with modeling object properties for manipulation. A possible focus application to drive this area of research is the problem of modeling, understanding, and learning to recognize *non-rigid* objects of various materials. This is a difficult, yet important, problem for dealing with the physical environment.

Reid Simmons - Captain  reids+@cs.cmu.edu
Bobby Klatzky  klatzky+@andrew.cmu.edu
Manuela Veloso  Manuela_Veloso@gama.prodigy.cs.cmu.edu
Tom Mitchell  tom.mitchell@cs.cmu.edu
Randy Gobbel  gobbel@andrew.cmu.edu
Mark Schervish  mark@stat.cmu.edu
James Garrett  garrett@CMU.EDU

===========================================================================
Learning and Inference in Large Datasets

Models of learning have the potential to explain and predict performance in educational, industrial, and technological settings. Such models are often represented as complex, nonlinear systems that make detailed predictions about step-by-step actions across time.
While current techniques have brought us a long way toward an understanding of learning in basic tasks, research is now moving to the study of real-world applications. The sheer complexity of the learning processes and the immense amount and diversity of data in these contexts require new methods and better models. Three principal challenges arise. First, sophisticated models typically involve an implicit, rather than explicit, relationship between parameters and data, making it difficult, both conceptually and computationally, to fit these models to data and to make comparisons among competing models. Second, the unprecedented size of datasets (e.g., gigabytes of data from students interacting with an intelligent tutoring system, or even terabytes of data from customer purchase records) makes typical inference algorithms intractable and thus requires the incorporation of efficient algorithms for indexing, clustering, and compression. Third, the learning processes under study potentially depend on complex interactions among many features of the environment. If we must select a priori a subset of features to be included in a model, the model becomes unduly constrained and potentially inaccurate. Instead, we need techniques that effectively search the space of possible models and, guided by the data, automatically generate models that include the most predictive features. Moreover, such model development can be accelerated by replacing passive data mining with active exploration. For example, a factory controller or intelligent tutor can discover a prescriptive model faster by running controlled experiments rather than just by analyzing the consequences of its current behaviors. We propose the following:

* To apply and extend advanced methods of statistical inference that will make it computationally feasible to fit complicated learning models to data.
* To integrate state-of-the-art database retrieval methods with model-fitting techniques, to achieve fast inference from massive datasets. This cross-disciplinary effort will spur database research (by providing a compelling application: inferencing), which in turn will provide fast tools to enable inferencing from large datasets that are intractable today.

* To develop new methods for automated model generation that use the data to construct predictive combinations of features.

* To devise algorithms that adaptively and dynamically acquire data so as to maximize the available information for model fitting and model generation.

Intelligent tutoring systems provide a natural platform for testing new learning theories and for developing new algorithms and statistical methods that can process (both in real time and in post-processing) the immense datasets from recorded student interactions. This application area offers an opportunity to apply theoretical and machine learning approaches to the problem of human learning. Using intelligent tutors as a research vehicle builds on CMU's world leadership in this area and also provides an opportunity for direct practical outcomes of the center's basic research advances.

[We're over the word limit already, but we could provide a more detailed intelligent tutoring system example to demonstrate some of the above points. Also, it would be nice to add something about drawing on research on complex, natural learning processes (e.g., perceptual learning) that are carrying out model fitting and generation analogous to that described above.]
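As one toy illustration of fitting a learning model to performance data: a power-law learning curve, error = a * trials^(-b), is a standard descriptive model in the skill-acquisition literature and can be fit by least squares in log-log space. The function below is a hypothetical sketch, not any group's actual method, and uses only exact synthetic data:

```python
import math

def fit_power_law(trials, errors):
    """Fit errors ~ a * trials**(-b) by linear least squares on
    (log trials, log errors); returns the estimates (a, b)."""
    xs = [math.log(t) for t in trials]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return math.exp(my - slope * mx), -slope
```

Real tutoring-system data are of course far noisier and far larger than this, which is exactly where the proposed database and inference machinery would come in.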
Marsha Lovett - Captain  lovett+@CMU.EDU
Rob Kass  kass@stat.cmu.edu
Ken Koedinger  koedinger@cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Jay McClelland  jlm@cnbc.cmu.edu
Johan Kumlien  kumlien@cs.cmu.edu
Kannan Srinivasan  kannan.srinivasan@cmu.edu
Peter Spirtes  ps7z@andrew.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
Larry Wasserman  larry@stat.cmu.edu
Christos Faloutsos  christos@cs.cmu.edu

===========================================================================
Dynamic Data Discovery (3-D) (Previous title: Data Visualization)

The explosive growth of scientific databases is illustrated by the terabytes of data acquired in hundreds of scientific laboratories, such as the functional imaging studies in cognitive neuroscience, spike train patterns from large ensembles of neurons, particle detection data in high energy physics, and the human genome project. To help address the challenges that such databases pose for learning from data, scientists often look to data displays for both summary (e.g., see Tufte) and exploration. But traditional forms of static data display are not well suited to large-scale data sets, and thus interest has been turning to dynamic displays. We propose to systematically study the role of dynamic data displays for complex multi-dimensional data, drawing on models of human pattern recognition and on methods that integrate statistical modeling procedures with dynamic graphic displays. Such displays would provide advances over previous exploratory visualization techniques in statistics (e.g., Tukey, Cleveland) that have by and large relied on relatively low-dimensional data sets, familiar transformations (such as normalization), and static displays. One reason to incorporate dynamics in the data mining technique is the human observer's sensitivity to motion; it is a powerful pattern recognition cue that can be used in addition to 2-D patterns. Another is that many of the interesting processes in scientific research are themselves dynamic.
We look to displays that can be linked to statistical models, especially of a nonlinear nature; that are computationally efficient to create (given that massive amounts of data may be involved), such as time-sequence image processing; and that are efficient from the perspective of human pattern recognition. The proposed dynamic data discovery research also has potential uses in connection with the other research areas in the proposed S&T Center, e.g., functional neuroimaging data, and efforts to study, understand, and model the biological system's ability to develop perceptual models and skills, in primates via neural recordings.

Stephen Fienberg - Captain
Pat Carpenter
Marcel Just
Bill Eddy
David Casasent
Tai Sing Lee

Steve Fienberg - Captain  fienberg@stat.cmu.edu
Pat Carpenter  carpenter+@cmu.edu
Bill Eddy  bill@stat.cmu.edu
Roy Maxion  maxion@cs.cmu.edu
Christos Faloutsos  christos@cs.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu

================================================================
Learning + Spatial Domains

Will be sent directly to Tom, Steve, Jay 1/14/98, evening.

================================================================
Learning by Instruction plus Experience (implicit/explicit)

Below is the 200-word submission from the implicit/explicit learning team. We found it very hard to say *anything* in 200 words, and some of us are worried that the pre-proposal is going to appear too superficial if it consists of many such paragraphs. In short, the fate of these paragraphs should be an agenda item for Thursday's meeting.

Psychologists have found a dissociation between explicit and implicit forms of learning. The first of these utilizes explicit representations of abstract knowledge, as when verbal instructions are provided. Implicit learning proceeds without such explicit representations, as when a behavior is shaped by experience.
While psychologists have studied these two integrated mechanisms in humans, this dichotomy extends beyond natural learning systems. Computer scientists face this distinction as one between explicit programming and machine learning. Statisticians confront these issues in the guise of prior knowledge/model selection and algorithms for statistical inference. At this STC, these researchers will come together to forge new normative and effective techniques for integrated implicit/explicit learning. These techniques will guide models of human learning, advancing our understanding of the functional and biological architecture of these dual mechanisms. Teaching methods will be designed to leverage the strengths of both learning strategies. New software development tools will allow systems to arise from a mixture of explicit programming and experience-based learning. This will facilitate software fabrication and also allow programs to adapt to individual users and changing environments. And, at the foundation, will be formal mathematical approaches to the incorporation of arbitrary prior knowledge into the statistical analysis of empirical data.

Dave Noelle - Captain  noelle@acm.org
Sebastian Thrun  thrun@cs.cmu.edu
Dave Touretzky  Dave_Touretzky@cs.cmu.edu
Jay Kadane  kadane@stat.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
Ken Koedinger  koedinger@cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Lynne Reder  reder+@CMU.EDU

================================================================
Teaching Intelligent Tutoring Systems as Data for and Outcome of Learning Research

The creation and evaluation of intelligent tutoring systems (ITSs) has been an important driver of basic research in the learning sciences as well as a dramatic practical outcome of such research. ITS research has been fundamental, for instance, to the development of Anderson's ACT theories, including advances in machine learning techniques and Bayesian statistics that enhance our understanding of human learning.
Carnegie Mellon researchers have created a great diversity of ITSs (elementary reading, pre-school language, math, programming, logic) and have been world leaders in making such systems practically feasible (tutors in 40 schools), accessible to disadvantaged populations (urban schools, developmentally-delayed students), and instructionally effective (standard deviation achievement gains over normal instruction). Despite these hard-fought successes, fundamental problems remain in modeling student learning and automating interactive instruction, and thus our ITSs remain far from the instructional effectiveness that is possible (human tutors produce two standard deviation achievement gains). By combining the expertise of leading researchers in machine algorithms, theoretical foundations, and human psychology, the center can make great advances. Use of our ITSs has yielded a number of vast databases of student learning interactions. These databases, as well as on-line experiments with ITSs, provide an opportunity not only to extend human learning theories, but also to develop machine learning algorithms and statistical techniques to automatically extract and refine models of student learning and effective instruction.

------ Note: There are good connections to be made to other groups, in particular, "learning and inference from large datasets" and "learning from instruction or example (explicit/implicit)".

Ken Koedinger - Captain  koedinger@cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Rob Kass  kass@stat.cmu.edu
John Anderson  ja+@CMU.EDU
Jill Fain  jef@cs.cmu.edu
Richard Scheines  rs2l+@andrew.cmu.edu
Peter Brusilovsky  plb@cs.cmu.edu

================================================================
Language Learning

Models of language learning in humans and machines contrast in terms of both goals and methods.
Studies of human language learning have traditionally emphasized explicit, domain-specific knowledge, whereas computer algorithms that learn language have tended to rely on implicit, statistical, data-driven methods. Recently, psycholinguists have discovered that data-driven methods play an important role in explaining natural language learning in areas as diverse as infant speech segmentation, lexical semantic structures, and phrasal attachment during sentence parsing. At the same time, computer scientists have developed algorithms that use background domain knowledge to improve the accuracy of machine learning in a number of non-language learning tasks (e.g., for learning to choose actions in control problems). The challenge currently facing both approaches to language learning is to discover learning processes that take advantage of both data and domain knowledge in a synergistic fashion, and to understand whether/how rich and embodied cognitive representations can emerge dynamically from mindless data-driven techniques. For machines, we want to learn how embodiment and domain knowledge can facilitate learning, teaching, and the construction of a richer representational system. For humans, we want to learn how the statistical properties of language work to tune the processing system. We have a unique opportunity to explore these problems in the testbeds of our work on tutors (Reading Tutor, Simone Says), robots, digital libraries, the child language database (CHILDES), and functional MRI studies of language processing and learning. 
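The infant speech-segmentation finding mentioned above rests on a simple statistic: the transitional probability between adjacent syllables, which tends to be high within words and low across word boundaries. A minimal sketch of computing it from a syllable stream (the function name and toy syllables are illustrative assumptions, not material from any team's work):

```python
from collections import Counter

def transitional_probs(syllables):
    """P(next | current) for each adjacent syllable pair -- the
    statistic infants appear to track in segmentation studies."""
    pairs = list(zip(syllables, syllables[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(p[0] for p in pairs)
    return {p: pair_counts[p] / first_counts[p[0]] for p in pair_counts}
```

In a stream like "ba bi du ba bi ku ba bi du", "bi" always follows "ba" (transitional probability 1.0), while what follows "bi" varies, so a purely data-driven learner can posit a boundary after "bi" without any explicit lexical knowledge.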
Brian MacWhinney - Captain  macw@CMU.EDU
Tom Mitchell  tom.mitchell@cs.cmu.edu
Alex Waibel  ahw@speech2.cs.cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com
Dave Plaut  plaut+@cmu.edu
John Lafferty  lafferty@cs.cmu.edu
Randy Gobbel  gobbel@andrew.cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Johan Kumlien  kumlien@cs.cmu.edu
Jill Fain  jef@cs.cmu.edu
Roni Rosenfeld  roni@cmu.edu

================================================================
Perceptual Learning at CMU STC

How does a baby learn to perceive? As he crawls, plays with his toys, and touches his mother, his nervous system rapidly wires itself up so that he can interpret the visual environment and make sense of the 3D structure of objects in the world based on the 2D retinal images. The experience of manipulating real physical objects is critical to the emergence of his perceptual ability. How does this happen in biological and robotic systems? When a girl is learning to read, memorizing A, B, C, then words, and then phrases, letters gradually dissolve into words, and words into phrases, in her perceptual system as her reading speed increases with practice. Does the visual system rewire itself continuously to learn new spatial conjunctions of letters in words, or in general new visual skills? When a boy learns to play soccer, his perceptual system has to learn a different set of perceptual strategies and visual routines depending on the different maneuvering actions. How could these action-dependent perceptual routines be learned and linked together temporally? As a `mobot' moves around a hospital, a `webot' navigates the world wide web, or our eyes scan a visual scene, richer and richer internal representations of the world are (or should be) constructed dynamically and cumulatively, using a small amount of selected information at a time. How are these internal representations updated, combined, and composed flexibly and dynamically?
Answers to these questions are not only crucial to the understanding of brain and cognition, but also critical to advancing technologies in robotics, internet agents, and data mining. The focus of our team is on LIFE-LONG perceptual learning: the early development of perceptual ability, the emergence of higher-order perceptual structures, the acquisition of new perceptual skills, and the learning of spatiotemporal conjunctive constructs and visual routines in association with motor behaviors, in both natural and artificial systems. Our approaches include neuropsychological, psychophysical, and MRI studies on humans (Behrmann and McClelland); robot, sensor, computer, and data-mining experiments and applications (Veloso and Mitchell); computational modeling and theoretical studies (McCallum, Kass, McClelland, Mitchell, Lee); and neurophysiological experiments on awake monkeys (Lee and Miyashita).

Tai Sing Lee - Captain  tai@eagle.cnbc.cmu.edu
Jay McClelland  jlm@cnbc.cmu.edu
Tom Mitchell  tom.mitchell@cs.cmu.edu
Manuela Veloso  Manuela_Veloso@gama.prodigy.cs.cmu.edu
Marlene Behrmann  mb9h@crab.psy.cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com