Artificial Intelligence


  If you are taking the Breadth Exam:

       Answer any 4 out of the 6 Basic questions B1 - B6.

  If you are taking the Depth Exam:

       Answer any 4 out of the 6 Basic questions B1 - B6,  and
       answer  any 4 out of the 13 Advanced Depth questions D1
       - D13.


                   Breadth (Basic) Questions


  B1.  The bias of a learning algorithm is defined to  be  all
       factors,  other  than  the  provided training examples,
       that influence the selection of possible  solutions  to
       the   learning   task   at   hand.    Provide  a  rough
       computational complexity argument  that  justifies  the
       need for bias in a learning algorithm.  For two (2) out
       of the three  (3)  learning  algorithms  listed  below,
       describe their biases and the role these biases play.

                      Michalski's AQ
                      Mitchell et al.'s EBG
                      Rumelhart et al.'s Back-Propagation


  B2.  One  may  attempt  to  parse  English  sentences  as  a
       context-free grammar -- that is, using productions such
       as ``S -> NP VP''.  Consider the sentences

           The boy painted the bottom of the barrel with a paintbrush.
           The boy painted the barrel with green paint after dinner.
           The boy painted the barrel with the paintbrush with short bristles.

       Describe a simple context-free grammar which will yield
       a correct parse (among others) of these sentences.  How
       many (not necessarily semantically correct) parses will
       be   obtained  for  these  sentences?   Explain  how  a
       language recognition system  might  disambiguate  these
       and find the correct parses.


  B3.  Explain how alpha-beta search works for the  game  tic-
       tac-toe.   More  specifically,  the  players take turns
       putting their mark on a 3 by 3 board, and the first one
       to  have  3  marks  in a line (horizontal, vertical, or
       diagonal) wins.  Do your search to  depth  2  (you  go,
       your  opponent  goes).  Say you move first, and show in
       detail  how  the  first  move  is  determined.   As   a
       heuristic,  use  the number of threats by you minus the
       number of threats by your opponent, where a threat by a
       player  is a line with 1 or 2 of that player's marks on
       it and none of the opponent's.  Explore the center move
       first.   You don't have to draw the entire tree search,
       but show enough to see explicitly how alpha-beta prunes
       the  min-max  tree.  How many nodes of the min-max tree
       are left unsearched by alpha-beta?


  B4.  One of the more important programs in  the  history  of
       artificial  intelligence  has  been the General Problem
       Solver (GPS) of Newell and Simon.   Describe  the  main
       components  of  GPS,  show their recursive structuring,
       illustrate the application of GPS  to  some  particular
       problem  domain  by  giving examples for that domain of
       GPS differences and operators, and describe why and how
       artificial  intelligence  scientists  have decided that
       the "weak methods" of very general  programs  like  GPS
       need to be supplemented.


  B5.  (a) What are hidden units  in  connectionist  networks?
       Why  are  they  important?   Consider  both what can be
       expressed and what can  be  learned  with  and  without
       hidden units.

       (b) Assume you have a training set of N examples and  a
       testing   set  of  M  examples.   Also  assume  that  a
       connectionist network is trained,  using  the  training
       set, until convergence (where convergence is defined as
       classification  correctness   on   the   training   set
       remaining  constant for 100,000 epochs).  Qualitatively
       plot, as  function  of  the  number  of  hidden  units,
       predicted  classification  correctness (at convergence)
       for both the training and test sets.  Explain the  form
       of your two curves.


  B6.  Given  an  initial  description  of  the   world,   the
       occurrence   of   some   events,  and  some  notion  of
       causality, we need to infer  what  is  true  after  the
       events  occur.   To  do  this,  we  need  some  kind of
       temporal representation and reasoning system.

       (a) How are basic events represented in  the  situation
       calculus?  Give an example.

       (b)  How  are  basic  events  represented  in   Allen's
       temporal  logic?   Give an example.  How does it differ
       from the situation calculus?

       (c) What functions do the frame axioms in the situation
       calculus  and the transitivity axioms in Allen's theory
       serve?  Describe briefly.


                      Advanced Depth Questions


  D1.  Some explanation-based learning systems  learn  control
       knowledge,    while   others   learn   macro-operators.
       Describe the difference  between  these  two  types  of
       acquired  knowledge,  and  indicate  the advantages and
       disadvantages of learning each type.  Briefly  describe
       one  developed system that learns control knowledge and
       one   that   acquires   macro-operators.    Design   an
       experiment  that  compares  learning macro-operators to
       learning control knowledge.  Describe the dependent and
       independent    variables,   and   clearly   state   the
       experimental hypotheses.


  D2.  Wyatt I. Gott has just received  a  large  database  of
       feature vectors (i.e., lists of attribute-value pairs).
       Each entry in the database contains the history of some
       small  business  over  the  last  year (e.g., number of
       employees, products, monthly sales,  monthly  expenses,
       etc.).   Wyatt  assigns  you  the  ill-defined  task of
       producing interesting statements about this database.

       For  four  (4)  of  the  eight  (8)  machine   learning
       paradigms listed below, state how you would apply it to
       your assigned task or  argue  why  doing  so  would  be
       inappropriate.   Be as specific as possible, mentioning
       specific systems where appropriate.  Be sure to discuss
       issues  such as computational tractability, sensitivity
       to noise and missing feature values, the need  for  you
       to  provide  a  small  amount of additional information
       (and any leverage this provides), etc.

                   Similarity-Based Learning from Examples
                   Conceptual Clustering and Discovery Systems
                   Explanation-Based Learning
                   Connectionist Learning
                   Genetic Algorithms
                   Analogical/Case-Based/Exemplar-Based Learning
                   Formal Learnability Theory
                   Integrated Approaches


  D3.  Give a precise description of one-layer and multi-layer
       connectionist  networks  -  the  overall  topology, the
       functions computed, the way learning  works.   Describe
       how  these  have  been  used, or could best be used, to
       handle  one  (1)  of  the  following  problems:  object
       recognition;  word  pronunciation;  logical  expression
       evaluation (including but not limited to exclusive-or).
       Describe  how  well  the  approach  you  just presented
       worked (or would work), and suggest  how  it  could  be
       extended to address one (1) of the following issues:

                 accommodation of delayed feedback during training
                 choice of a good network topology
                 incorporation of recursion


  D4.  You've just received a call from NASA to  help  process
       and  analyze  images  of Neptune received from Voyager.
       The  set  of  images  is  a  ``mosaic''  of   partially
       overlapping   images,   each   with  differing  imaging
       conditions (e.g., lighting and exposure time).  Briefly
       describe  procedures  that  you  would  try in order to
       reconstruct and enhance a single composite  image  from
       the set.


  D5.  Multiresolution image representations can  be  used  in
       many  ways  in  computer vision.  Two examples are fast
       feature detection and coarse-to-fine search.

       (a) Given an N x N image f and an M  x  M  template  t,
       specify how a match measure can be defined.

       (b) Instead  of  doing  matching  based  on  the  given
       resolutions  described in (a), say we wanted to match t
       at a finer scale.  Given an increase  in  scale  of  s,
       briefly  describe  two alternative ways of changing the
       scale of f and/or t  to  achieve  the  desired  result.
       What is the difference in cost of your two approaches?

       (c) Describe how coarse-to-fine search has been used in
       a    published   algorithm;   e.g.,   for   model-based
       recognition, or for stereo or motion analysis.


  D6.  You are asked to start a project to develop a  computer
       vision  system that will guard buildings by taking a TV
       picture of each person, and admitting them only if  the
       face is recognized.  Explain what the problems are, and
       assess  the  difficulties  and  chances   of   success.
       Describe  two  approaches  you  would  explore, and the
       previous work you would build on.  Describe and  assess
       the advantages and the problems with each approach.


  D7.  Describe two contrasting  multi-computer  architectures
       that  are  appropriate  for  developing fast, real-time
       vision systems  (ones  that  can  recognize  and  track
       moving  objects  imaged  in  TV  frames on the order of
       500-by-500 resolution input roughly 30 times a second).
       Describe  how  big and fast each architecture is in its
       best present-day version (this might only  be  a  paper
       design),  and specify how big and fast it would need to
       be to handle this task.  Describe the kind  of  program
       you  would  develop for each architecture, making clear
       where it is and is not important that the  program  fit
       the architecture.


  D8.  In what sense are so-called  "expert  systems"  expert?
       Present  the  typical  structure and describe the usual
       capabilities of expert systems, illustrated in terms of
       two actual examples from the literature, and comment on
       the  extent  to  which  such  systems  do  and  do  not
       constitute models of intelligence.


  D9.  The capabilities of any particular expert system  could
       be  realized  by  means  of  conventional  programming.
       Contrast  the  system  structuring  and  implementation
       techniques  of  expert  system technology with those of
       conventional programming, and comment on  the  relative
       advantages and disadvantages of the two approaches.


  D10. Consider the following clauses:

           1. X +o 1 = X
           2. X +o i(X) = 1
           3. X +o (Y +o Z) = X +o (Y +o Z)
           4. a +o a = a
           5. - (a = 1)

       Add the standard equality axioms, and  derive  []  from
       these  clauses using binary resolution only.  Note that
       without the equality axioms, you cannot derive anything
       from  these by binary resolution.  X,Y,Z are variables;
       a and 1 are constants.


  D11. In Prolog, answers  which  follow  logically  from  the
       program  are  not  always  computed.  In this question,
       consider  just  pure  Horn  Prolog,  and  consider  the
       following program:

           p(X,Y) :- p(X,B).
           p(X,Y) :- p(B,Y).
           p(a,b).

       Say the query is ``?- p(c,d)''.  Logically, the  answer
       should be ``yes''.

       (a) Explain why Prolog  will  not  return  this  answer
       under  any  of  the  possible  6  ways  of ordering the
       program.

       (b) Explain how to modify the Prolog search  by  adding
       subsumption  checks  so  that  this answer is obtained.
       Your answer  should  include  an  explanation  of  what
       subsumption is.

       (c) In view of (b), why don't existing  Prolog  systems
       use subsumption checks?

       (d) Will the modification described in (b) be  complete
       (in  the  sense that any answer which follows logically
       from the program will indeed be returned  in  a  finite
       amount  of  time)?    Explain the reasoning behind your
       answer.


  D12. Discuss the use of  frames  and  semantic  features  in
       sentence  parsing  and  text analysis.  Illustrate your
       discussion with an example.  What techniques  might  be
       used  to convert a frame-based sentence parser into one
       that uses semantic features exclusively?


  D13. Discuss   techniques   for   knowledge-based    machine
       translation in both specialized and general application
       domains.   What  levels  of   success   are   currently
       obtainable,  and what limitations seem to be imposed by
       unresolved epistemological problems?