From ml-cald-faculty-request@mlist-1.sp.cs.cmu.edu Wed Jan 14 19:07:48 1998
Received: from mlist-1.sp.cs.cmu.edu (MLIST-1.SP.CS.CMU.EDU [128.2.185.162]) by temper.stat.cmu.edu (8.6.10/8.6.6) with SMTP id TAA14066; Wed, 14 Jan 1998 19:07:46 -0500
Message-Id: <199801150007.TAA14066@temper.stat.cmu.edu>
Received: from mlist-1.sp.cs.cmu.edu by mlist-1.sp.cs.cmu.edu id aa13797; 14 Jan 98 19:06 EST
Received: from DAYLILY.LEARNING.CS.CMU.EDU by mlist-1.sp.cs.cmu.edu id aa13795; 14 Jan 98 19:05 EST
Received: from [127.0.0.1] by daylily.learning.cs.cmu.edu id aa10684; 14 Jan 98 19:04 EST
X-Mailer: exmh version 2.0gamma 1/27/96
From: Tom Mitchell
To: Tom Mitchell
cc: Diane Stidle, cald-faculty@cs.cmu.edu, STC-email@cs.cmu.edu, anderson@psy.cmu.edu, jack.mostow@cmu.edu, ken.koedinger@cmu.edu, jef@cs.cmu.edu, marcel.just@cmu.edu, macw@cmu.edu, Marcia.Lovett@cmu.edu, dan.olsen@cs.cmu.edu, vanlehn+@pitt.edu, Chi@vm2.cis.pitt.edu, resnick@vms.cis.pitt.edu, klatzky+@andrew.cmu.edu, cg09+@andrew.cmu.edu
Subject: Re: Next steps: NSF STC on Learning
In-reply-to: Your message of "Thu, 08 Jan 1998 18:54:24 EST." <199801082355.SAA05500@cmu1.acs.cmu.edu>
X-url: http://www.cs.cmu.edu/~tom/
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Wed, 14 Jan 1998 19:04:52 -0500
Sender: Tom_Mitchell@daylily.learning.cs.cmu.edu
Status: R

Folks,

Here is the list of contributed 200-word abstracts. We'll discuss these and the next steps at 4:30pm Thursday, Wean 4623. Please try to have at least one representative from each team present.

cheers
Tom

================================================================
Learning to Select Actions - in real-time dynamic decision making

Complex environments where machines, humans, or animals need to perform tasks are dynamic and require the ability to make decisions in real time.
We have experience with a variety of such environments, including air-traffic control simulation, network data traffic control, medical data monitoring, industrial process optimization, dynamic control of a robot manipulator, coordination of multiple robots in an adversarial domain, and financial investment selection. Our current investigations support our belief that sensitivity to dynamic changes in the patterns of an environment or task is essential to optimal performance, for both natural and artificial systems. We propose to investigate how learning occurs and can be used to improve a system's decision-making performance in a dynamic, real-time environment. Several questions will drive our proposed work. How can a system acquire robust state-action mappings for real-time decision making that handle the uncertainty of the environment? If a goal needs to be achieved and the learner must experiment with its actions to do so, how should it experiment? How can a learner focus attention on the most task-relevant features of the environment? How can statistically relevant patterns be captured? Which models represent the behavior of humans operating dynamic systems? How do people merge targeted instruction with their own experiences? What are appropriate biophysical models of neuromodulation and action selection? How do multiple decision makers learn to decompose a task and share dynamic information efficiently? How do teams adapt to changes in role and status among members? Working together, we expect this cross-fertilization to yield new procedures and algorithms for building better artificial systems and to give us insight into how natural systems perform optimally. We aim for a seamless integration of our expertise that will also lead to a comprehensive comparison of different approaches.
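As a concrete (and purely illustrative) instance of the experimentation question raised above, a sketch of epsilon-greedy action selection, one standard way a learner trades off exploring untried actions against exploiting its current estimates; the function name and interface here are assumptions for illustration, not part of any team's proposal:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest estimated value (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

With epsilon = 0 the learner always exploits; raising epsilon buys more information about the environment at the cost of short-term performance.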
GROUP MEMBERS:
**************
Manuela Veloso - Captain  veloso@cs.cmu.edu
John Anderson  ja+@CMU.EDU
Avrim Blum  avrim+@cs.cmu.edu
Howie Choset  choset+@cs.cmu.edu
Randy Gobbel  gobbel@andrew.cmu.edu
Marcel Just  just+@cmu.edu
Jay Kadane  kadane@stat.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
John Levine  jml@vms.cis.pitt.edu
Roy Maxion  maxion@cs.cmu.edu
Andrew Moore  awm@cs.cmu.edu
Lynne Reder  reder+@CMU.EDU
Reid Simmons  reids+@cs.cmu.edu
Sebastian Thrun  thrun@cs.cmu.edu

================================================================
Reinforcement learning in structured task environments

Reinforcement learning is prospering in several fields simultaneously: computer science, AI, animal learning, neural models of cognition, human cognition, and statistics. Reinforcement learning concerns systems (natural and artificial) that alter their behavior according to rewards and punishments received from the environment. Within computer science this notion has frequently been formalized as the problem of learning Markov Decision Processes. One of the central problems facing reinforcement learning within each field is that of structure. Tasks are made up of subtasks, which are in turn made up of subtasks, forming a hierarchy. Also, some tasks may have components that are shared with other tasks, or similar to components in other tasks. Conventional approaches to reinforcement learning do not exploit this structure, thereby missing major opportunities to improve performance. We propose a cross-disciplinary attack on the problem of structure in reinforcement learning. We will investigate the theoretical, practical, human cognitive, animal learning, and neural aspects. A major part of this attack will be to compare human performance with that of known algorithms on structured and unstructured problems, and on transfer from one problem to another.
We expect that, currently, the best machine algorithms will beat humans on unstructured tasks, but lose badly when there is structure and show far poorer transfer performance. The goal will be to analyze natural performance to understand what procedures humans use to beat the machine algorithms in structured tasks, and then to incorporate these methods into new algorithms. Neuronal recordings in animals and imaging of regional brain activity in humans will be used to gain insight into the internal representations and processes underlying performance and learning in reinforcement learning tasks (what more can you say in ~300 words?). Theoretical analysis and practical applications of the new algorithms will then be pursued to understand convergence properties and assess algorithm robustness in the face of real-world task constraints. Complementing the basic analysis of structure, we will also consider the role that structuring the delivery of reinforcement may play in facilitating learning. It is well known from the animal learning literature that complex behaviors in animals can be shaped by artful trainers. Such shaping could dramatically improve the performance of many reinforcement learning algorithms, but to our knowledge there is no theoretical analysis of what impact shaping would have on the convergence of various reinforcement learning algorithms, or of how shaping itself can be optimized or tuned in the context of the particular demands of the various relevant tasks. Again a multi-disciplinary approach will be used, with behavioral, neurophysiological, theoretical, and practical methodologies all contributing to the analysis and optimization of the role of shaping in reinforcement learning.
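To make the shaping idea concrete: in tabular Q-learning (one common formalization of reinforcement learning over Markov Decision Processes), a trainer-supplied shaping bonus is simply an extra reward term added to the update. A minimal sketch, with all names and the dictionary-of-dictionaries representation assumed for illustration only:

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9, shaping=0.0):
    """One tabular Q-learning step.

    Q        : dict mapping state -> {action: estimated value}
    shaping  : extra reward supplied by a trainer to encourage
               intermediate progress toward the goal.
    """
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + shaping + gamma * best_next - Q[s][a])
    return Q[s][a]
```

The open question the abstract raises is exactly how such a bonus affects convergence: a naive shaping term changes the effective reward function, so the learner may converge to a policy optimal for the shaped task rather than the original one.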
Jay McClelland - Captain  jlm@cnbc.cmu.edu
Manuela Veloso  Manuela_Veloso@gama.prodigy.cs.cmu.edu
Marsha Lovett  lovett@cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com
Andrew Moore  awm@cs.cmu.edu
John Anderson  ja+@CMU.EDU

===========================================================================
Learning in Large Feature Spaces

In the traditional approach to machine learning, one of the first things the designer must do is select a small number of (hopefully important) features to give to the learning algorithm. However, as machine learning becomes more ubiquitous, often embedded within other systems, it is becoming increasingly critical to remove the designer from this part of the loop. To do so we need to better understand and address the problem of focusing attention: how to learn even though there may be a huge quantity of potentially irrelevant information in the input. There has been some recent progress on a number of fronts: statistical methods for dimension reduction, theoretical analyses of algorithms for learning in large feature spaces, empirical results, and cognitive models of human behavior. However, achieving a more comprehensive understanding and solution requires combining forces from areas that to date have attacked this problem separately. For example, given measurements of human behavior on tasks where relevant information must be discovered [Roy's wafer fault detection problem], can we extract a model whose properties can be understood mathematically, and then translated to an algorithm for artificial learning? At a more immediate level, how do the biases of statistical methods for dimension reduction (feature selection) interact with the biases of the artificial learning algorithms (e.g., neural nets and decision trees) that are going to be using those features? How can one use background knowledge and experimentation to generate plausible features in the first place?
What about mixed-media data: how do we solve these problems when inputs come from a variety of sources (text, audio, video)? By combining insights from psychology, artificial learning, and theoretical foundations, our goal is to make substantial progress on these basic questions, crucial to both the science and engineering of learning.

Avrim Blum - Captain  avrim+@cs.cmu.edu
Peter Spirtes  ps7z@andrew.cmu.edu
Johan Kumlien  kumlien@cs.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
Christos Faloutsos  christos@cs.cmu.edu
Roy Maxion  maxion@cs.cmu.edu

===========================================================================
Unsupervised + Supervised Learning

Machine learning techniques may be classified into four broad groups according to the type of training information they use:

1. In SUPERVISED LEARNING, each training example consists of a set of input values and a "label" or set of desired output values. Techniques of this type, including most kinds of neural-net and decision-tree learning algorithms, have been well studied and can be very powerful. However, in many important problem domains it is impossible or impractical to obtain enough labeled training data to build an accurate model.

2. In UNSUPERVISED LEARNING, the training examples are not labeled. The learning system must discover useful patterns or clusters within this set of examples. In many domains -- speech recognition is a good example -- unlabeled data is plentiful (just turn on the radio) while labeled data is scarce and expensive. However, these unsupervised-learning algorithms are not as powerful or as well studied as supervised learning algorithms. While they will usually find some kind of structure within the data set, the structure they find may have no relation to the problem the user is trying to solve.

3. In REINFORCEMENT LEARNING, the system is allowed to perform (perhaps based on a stream of inputs), and its performance is evaluated by an external teacher.
Good performance is reinforced, while bad performance is punished. Reinforcement learning techniques have been well studied, especially in the domain of control problems, but they can be extremely inefficient. Most existing reinforcement-learning algorithms operate in a single space of fine-grained features, rather like planning a trip across the country by focusing on each footstep. Researchers have just begun to study hierarchical reinforcement learning, in which the planning is done first at a high level of abstraction and then refined through filling in of successive levels of detail.

4. In EXPLICIT TEACHING, an external (usually human) teacher provides rules or constraints that will govern the system's behavior. This input is usually provided in symbolic form. For example, a character-recognition system might be told that small lateral shifts and changes of scale do not change the classification of the character. Such constraints can be extremely useful, but we do not have good general techniques for combining symbolic constraints with statistical, data-driven machine learning.

Most of the real-world problems that we would like to study in the STC will require more than one of these techniques. Each of these approaches has been developed largely in isolation from the others, but we believe that the time has come to explore ways of combining them. By breaking down these boundaries, we should be able to build more powerful learning systems that will take better advantage of *all* the available information about the problem at hand. Some examples of approaches we would like to investigate:

* In a "mostly unsupervised" learning system, we might use a small amount of labeled training data to build a crude but appropriate structure for a classification system. We would then feed a much larger amount of unlabeled training data through the resulting system in order to refine it.
Some form of the expectation-maximization (EM) algorithm or graphical-model learning might be used in constructing such a system.

* We might use explicit teaching to create the hierarchical levels and general structure of a control system, and then use some combination of reinforcement learning and unsupervised learning to tune this system.

* Explicit teaching could be used to supply some structural constraints for a constructive neural network, which then could be trained using supervised learning. We have many specific (and rather ad hoc) examples of this technique, especially in systems for handwriting and speech recognition, but we do not yet have general techniques for combining symbolic and statistical learning.

* Biological systems employ all of these approaches in various combinations (though only some "higher" animals seem to use explicit symbolic descriptions). Careful investigation of human and animal learning and problem-solving should give us some insight into the ways in which these modalities are combined in biological systems.

Scott Fahlman - Captain  sef+@cs.cmu.edu
Tom Mitchell  tom.mitchell@cs.cmu.edu
Jay McClelland  jlm@cnbc.cmu.edu
John Lafferty  lafferty@cs.cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com
Randy Gobbel  gobbel@andrew.cmu.edu
Mark Schervish  mark@stat.cmu.edu
Roni Rosenfeld  roni@cmu.edu
James Garrett  garrett@CMU.EDU

===========================================================================
Learning to Interact with Physical Environment

The world is full of different types of objects -- wooden objects, metal, plastic, plants, animals; some rigid, some not. People have a great facility for recognizing the material properties of objects, even unfamiliar objects (e.g., distinguishing plastic from wood, man-made from natural materials), and for understanding the properties of such materials. Recognition takes place mainly through visual input, but also through haptic, and even acoustic, feedback.
A key problem in both natural and machine learning is how such representations of materials are acquired and utilized. While much research has focused on recognizing and classifying objects, comparatively little has been done to date on recognizing and classifying the material properties of objects (relevant work includes research on texture recognition, Shaffer's work on visual color models and specularity, Krotkov's work on acoustic recognition of material properties, [FILL IN OTHER EXAMPLES HERE]). This is a natural area for cross-cutting research: the natural learning community (e.g., Klatzky) has investigated material and geometric properties of objects, as sensed via the haptic system; the statistical learning community is interested in pattern recognition in very high-dimensional feature spaces, especially in the presence of uncertainty (e.g., sensor uncertainty); and the AI (vision and robotics) community has investigated object recognition and has some experience with modeling object properties for manipulation. A possible focus application to drive this area of research is the problem of modeling, understanding, and learning to recognize *non-rigid* objects of various materials. This is a difficult, yet important, problem for dealing with the physical environment.

Reid Simmons - Captain  reids+@cs.cmu.edu
Bobby Klatzky  klatzky+@andrew.cmu.edu
Manuela Veloso  Manuela_Veloso@gama.prodigy.cs.cmu.edu
Tom Mitchell  tom.mitchell@cs.cmu.edu
Randy Gobbel  gobbel@andrew.cmu.edu
Mark Schervish  mark@stat.cmu.edu
James Garrett  garrett@CMU.EDU

===========================================================================
Learning and Inference in Large Datasets

Models of learning have the potential to explain and predict performance in educational, industrial, and technological settings. Such models are often represented as complex, nonlinear systems that make detailed predictions about step-by-step actions across time.
While current techniques have brought us a long way toward an understanding of learning in basic tasks, research is now moving to the study of real-world applications. The sheer complexity of the learning processes and the immense amount and diversity of data in these contexts require new methods and better models. Three principal challenges arise. First, sophisticated models typically involve an implicit, rather than explicit, relationship between parameters and data, making it difficult, both conceptually and computationally, to fit these models to data and to make comparisons among competing models. Second, the unprecedented size of datasets (e.g., gigabytes of data from students interacting with an intelligent tutoring system, or even terabytes of data from customer purchase records) makes typical inference algorithms intractable and thus requires the incorporation of efficient algorithms for indexing, clustering, and compression. Third, the learning processes under study potentially depend on complex interactions among many features of the environment. If we must select a priori a subset of features to be included in a model, the model becomes unduly constrained and potentially inaccurate. Instead, we need techniques that effectively search the space of possible models and, guided by the data, automatically generate models that include the most predictive features. Moreover, such model development can be accelerated by replacing passive data mining with active exploration. For example, a factory controller or intelligent tutor can discover a prescriptive model faster by running controlled experiments rather than just by analyzing the consequences of its current behaviors. We propose the following:

* To apply and extend advanced methods of statistical inference that will make it computationally feasible to fit complicated learning models to data.
* To integrate state-of-the-art database retrieval methods with model-fitting techniques, to achieve fast inference from massive datasets. This cross-disciplinary effort will spur database research (by providing a compelling application: inferencing), which in turn will provide fast tools to enable inferencing from large datasets that are intractable today.

* To develop new methods for automated model generation that use the data to construct predictive combinations of features.

* To devise algorithms that adaptively and dynamically acquire data so as to maximize the available information for model fitting and model generation.

Intelligent tutoring systems provide a natural platform for testing new learning theories and for developing new algorithms and statistical methods that can process (both in real time and in post-processing) the immense datasets from recorded student interactions. This application area offers an opportunity to apply theoretical and machine learning approaches to the problem of human learning. Using intelligent tutors as a research vehicle builds on CMU's world leadership in this area and also provides an opportunity for direct practical outcomes of the center's basic research advances.

[We're over the word limit already, but we could provide a more detailed intelligent tutoring system example to demonstrate some of the above points. Also, it would be nice to add something about drawing on research on complex, natural learning processes (e.g., perceptual learning) that are carrying out model fitting and generation analogous to that described above.]
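As one toy illustration of fitting a learning model to performance data: a power-law learning curve, error = a * trials^(-b), is a standard descriptive model in the skill-acquisition literature and can be fit by least squares in log-log space. The function below is a hypothetical sketch, not any group's actual method, and uses only exact synthetic data:

```python
import math

def fit_power_law(trials, errors):
    """Fit errors ~ a * trials**(-b) by linear least squares on
    (log trials, log errors); returns the estimates (a, b)."""
    xs = [math.log(t) for t in trials]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return math.exp(my - slope * mx), -slope
```

Real tutoring-system data are of course far noisier and far larger than this, which is exactly where the proposed database and inference machinery would come in.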
Marsha Lovett - Captain  lovett+@CMU.EDU
Rob Kass  kass@stat.cmu.edu
Ken Koedinger  koedinger@cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Jay McClelland  jlm@cnbc.cmu.edu
Johan Kumlien  kumlien@cs.cmu.edu
Kannan Srinivasan  kannan.srinivasan@cmu.edu
Peter Spirtes  ps7z@andrew.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
Larry Wasserman  larry@stat.cmu.edu
Christos Faloutsos  christos@cs.cmu.edu

===========================================================================
Dynamic Data Discovery (3-D) (Previous title: Data Visualization)

The explosive growth of scientific databases is illustrated by the terabytes of data acquired in hundreds of scientific laboratories, such as the functional imaging studies in cognitive neuroscience, spike train patterns from large ensembles of neurons, particle detection data in high energy physics, and the human genome project. To help address the challenges that such databases pose for learning from data, scientists often look to data displays for both summary (e.g., see Tufte) and exploration. But traditional forms of static data display are not well suited to large-scale data sets, and thus interest has been turning to dynamic displays. We propose to systematically study the role of dynamic data displays for complex multi-dimensional data, drawing on models of human pattern recognition and on methods that integrate statistical modeling procedures with dynamic graphic displays. Such displays would provide advances over previous exploratory visualization techniques in statistics (e.g., Tukey, Cleveland) that have by and large relied on relatively low-dimensional data sets, familiar transformations (such as normalization), and static displays. One reason to incorporate dynamics in the data mining technique is the human observer's sensitivity to motion; it is a powerful pattern recognition cue that can be used in addition to 2-D patterns. Another is that many of the interesting processes in scientific research are themselves dynamic.
We look to displays that can be linked to statistical models, especially of a nonlinear nature; that are computationally efficient to create (given that massive amounts of data may be involved), such as time-sequence image processing; and that are efficient from the perspective of human pattern recognition. The proposed dynamic data discovery research also has potential uses in connection with the other research areas in the proposed S&T Center, e.g., functional neuroimaging data, and efforts to study, understand, and model the biological system's ability to develop perceptual models and skills, in primates via neural recordings.

Stephen Fienberg - Captain
Pat Carpenter
Marcel Just
Bill Eddy
David Casasent
Tai Sing Lee

Steve Fienberg - Captain  fienberg@stat.cmu.edu
Pat Carpenter  carpenter+@cmu.edu
Bill Eddy  bill@stat.cmu.edu
Roy Maxion  maxion@cs.cmu.edu
Christos Faloutsos  christos@cs.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu

================================================================
Learning + Spatial Domains

Will be sent directly to Tom, Steve, Jay 1/14/98, evening.

================================================================
Learning by Instruction plus Experience (implicit/explicit)

Below is the 200-word submission from the implicit/explicit learning team. We found it very hard to say *anything* in 200 words, and some of us are worried that the pre-proposal is going to appear too superficial if it consists of many such paragraphs. In short, the fate of these paragraphs should be an agenda item for Thursday's meeting.

Psychologists have found a dissociation between explicit and implicit forms of learning. The first of these utilizes explicit representations of abstract knowledge, as when verbal instructions are provided. Implicit learning proceeds without such explicit representations, as when a behavior is shaped by experience.
While psychologists have studied these two integrated mechanisms in humans, this dichotomy extends beyond natural learning systems. Computer scientists face this distinction as one between explicit programming and machine learning. Statisticians confront these issues in the guise of prior knowledge/model selection and algorithms for statistical inference. At this STC, these researchers will come together to forge new normative and effective techniques for integrated implicit/explicit learning. These techniques will guide models of human learning, advancing our understanding of the functional and biological architecture of these dual mechanisms. Teaching methods will be designed to leverage the strengths of both learning strategies. New software development tools will allow systems to arise from a mixture of explicit programming and experience-based learning. This will facilitate software fabrication and also allow programs to adapt to individual users and changing environments. And, at the foundation, will be formal mathematical approaches to the incorporation of arbitrary prior knowledge into the statistical analysis of empirical data.

Dave Noelle - Captain  noelle@acm.org
Sebastian Thrun  thrun@cs.cmu.edu
Dave Touretzky  Dave_Touretzky@cs.cmu.edu
Jay Kadane  kadane@stat.cmu.edu
Tai Sing Lee  tai@eagle.cnbc.cmu.edu
Ken Koedinger  koedinger@cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Lynne Reder  reder+@CMU.EDU

================================================================
Teaching Intelligent Tutoring Systems as Data for and Outcome of Learning Research

The creation and evaluation of intelligent tutoring systems (ITSs) has been an important driver of basic research in the learning sciences as well as a dramatic practical outcome of such research. ITS research has been fundamental, for instance, to the development of Anderson's ACT theories, including advances in machine learning techniques and Bayesian statistics that enhance our understanding of human learning.
Carnegie Mellon researchers have created a great diversity of ITSs (elementary reading, pre-school language, math, programming, logic) and have been world leaders in making such systems practically feasible (tutors in 40 schools), accessible to disadvantaged populations (urban schools, developmentally-delayed students), and instructionally effective (standard deviation achievement gains over normal instruction). Despite these hard-fought successes, fundamental problems remain in modeling student learning and automating interactive instruction, and thus our ITSs remain far from the instructional effectiveness that is possible (human tutors produce two standard deviation achievement gains). By combining the expertise of leading researchers in machine algorithms, theoretical foundations, and human psychology, the center can make great advances. Use of our ITSs has yielded a number of vast databases of student learning interactions. These databases, as well as on-line experiments with ITSs, provide an opportunity not only to extend human learning theories, but also to develop machine learning algorithms and statistical techniques to automatically extract and refine models of student learning and effective instruction.

------ Note: There are good connections to be made to other groups, in particular, "learning and inference from large datasets" and "learning from instruction or example (explicit/implicit)".

Ken Koedinger - Captain  koedinger@cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Rob Kass  kass@stat.cmu.edu
John Anderson  ja+@CMU.EDU
Jill Fain  jef@cs.cmu.edu
Richard Scheines  rs2l+@andrew.cmu.edu
Peter Brusilovsky  plb@cs.cmu.edu

================================================================
Language Learning

Models of language learning in humans and machines contrast in terms of both goals and methods.
Studies of human language learning have traditionally emphasized explicit, domain-specific knowledge, whereas computer algorithms that learn language have tended to rely on implicit, statistical, data-driven methods. Recently, psycholinguists have discovered that data-driven methods play an important role in explaining natural language learning in areas as diverse as infant speech segmentation, lexical semantic structures, and phrasal attachment during sentence parsing. At the same time, computer scientists have developed algorithms that use background domain knowledge to improve the accuracy of machine learning in a number of non-language learning tasks (e.g., for learning to choose actions in control problems). The challenge currently facing both approaches to language learning is to discover learning processes that take advantage of both data and domain knowledge in a synergistic fashion, and to understand whether/how rich and embodied cognitive representations can emerge dynamically from mindless data-driven techniques. For machines, we want to learn how embodiment and domain knowledge can facilitate learning, teaching, and the construction of a richer representational system. For humans, we want to learn how the statistical properties of language work to tune the processing system. We have a unique opportunity to explore these problems in the testbeds of our work on tutors (Reading Tutor, Simone Says), robots, digital libraries, the child language database (CHILDES), and functional MRI studies of language processing and learning. 
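The infant speech-segmentation finding mentioned above rests on a simple statistic: the transitional probability between adjacent syllables, which tends to be high within words and low across word boundaries. A minimal sketch of computing it from a syllable stream (the function name and toy syllables are illustrative assumptions, not material from any team's work):

```python
from collections import Counter

def transitional_probs(syllables):
    """P(next | current) for each adjacent syllable pair -- the
    statistic infants appear to track in segmentation studies."""
    pairs = list(zip(syllables, syllables[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(p[0] for p in pairs)
    return {p: pair_counts[p] / first_counts[p[0]] for p in pair_counts}
```

In a stream like "ba bi du ba bi ku ba bi du", "bi" always follows "ba" (transitional probability 1.0), while what follows "bi" varies, so a purely data-driven learner can posit a boundary after "bi" without any explicit lexical knowledge.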
Brian MacWhinney - Captain  macw@CMU.EDU
Tom Mitchell  tom.mitchell@cs.cmu.edu
Alex Waibel  ahw@speech2.cs.cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com
Dave Plaut  plaut+@cmu.edu
John Lafferty  lafferty@cs.cmu.edu
Randy Gobbel  gobbel@andrew.cmu.edu
Jack Mostow  mostow@cs.cmu.edu
Johan Kumlien  kumlien@cs.cmu.edu
Jill Fain  jef@cs.cmu.edu
Roni Rosenfeld  roni@cmu.edu

================================================================
Perceptual Learning at CMU STC

How does a baby learn to perceive? As he crawls, plays with his toys, and touches his mother, his nervous system rapidly wires itself up so that he can interpret the visual environment and make sense of the 3D structure of objects in the world based on the 2D retinal images. The experience of manipulating real physical objects is critical to the emergence of his perceptual ability. How does this happen in biological and robotic systems? When a girl is learning to read, memorizing A, B, C, then words, and then phrases, letters gradually dissolve into words, and words into phrases, in her perceptual system as her reading speed increases with practice. Does the visual system rewire itself continuously to learn new spatial conjunctions of letters in words, or in general new visual skills? When a boy learns to play soccer, his perceptual system has to learn a different set of perceptual strategies and visual routines depending on the different maneuvering actions. How could these action-dependent perceptual routines be learned and linked together temporally? As a `mobot' moves around a hospital, a `webot' navigates the world wide web, or our eyes scan a visual scene, richer and richer internal representations of the world are (or should be) constructed dynamically and cumulatively, using a small amount of selected information at a time. How are these internal representations updated, combined, and composed flexibly and dynamically?
Answers to these questions are not only crucial to the understanding of brain and cognition, but also critical to advancing technologies in robotics, internet agents, and data mining. The focus of our team is on LIFE-LONG perceptual learning: the early development of perceptual ability, the emergence of higher-order perceptual structures, the acquisition of new perceptual skills, and the learning of spatiotemporal conjunctive constructs and visual routines in association with motor behaviors, in both natural and artificial systems. Our approaches include neuropsychological, psychophysical, and MRI studies on humans (Behrmann and McClelland); robot, sensor, computer, and data-mining experiments and applications (Veloso and Mitchell); computational modeling and theoretical studies (McCallum, Kass, McClelland, Mitchell, Lee); and neurophysiological experiments on awake monkeys (Lee and Miyashita).

Tai Sing Lee - Captain  tai@eagle.cnbc.cmu.edu
Jay McClelland  jlm@cnbc.cmu.edu
Tom Mitchell  tom.mitchell@cs.cmu.edu
Manuela Veloso  Manuela_Veloso@gama.prodigy.cs.cmu.edu
Marlene Behrmann  mb9h@crab.psy.cmu.edu
Andrew McCallum  mccallum@sandbox.jprc.com