Friday, November 9, 2012

Unifying thinking and acting

I've made some new friends who may be interested in AGI, so this is a summary of the theory of Genifer so far.

Central dogma

First, I assume you understand the central dogma of logic-based AI: "Knowledge is represented as a set of logic formulas in a knowledge base (KB), and the 3 modes of reasoning are deduction, abduction, and induction."  These basics are explained in all standard AI textbooks, such as Artificial Intelligence: A Modern Approach.
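
For concreteness, here is a toy illustration of the three modes over a single rule, written in plain Python. This is just my sketch for this post, not Genifer's actual representation:

    # One rule, and the three modes of reasoning around it.
    rule = ("rain", "wet_grass")          # "if it rains, the grass is wet"

    # Deduction: from the rule and its premise, conclude the consequent.
    #   rain,  rain -> wet_grass   |-   wet_grass
    def deduce(fact, rule):
        premise, conclusion = rule
        return conclusion if fact == premise else None

    # Abduction: from the rule and an observed consequent, hypothesize the premise.
    #   wet_grass,  rain -> wet_grass   |-   rain  (as a plausible explanation)
    def abduce(observation, rule):
        premise, conclusion = rule
        return premise if observation == conclusion else None

    # Induction: from repeated co-occurrences, propose the rule itself.
    #   {(rain, wet_grass), (rain, wet_grass), ...}   |-   rain -> wet_grass
    def induce(examples):
        premises, conclusions = zip(*examples)
        if len(set(premises)) == 1 and len(set(conclusions)) == 1:
            return (premises[0], conclusions[0])
        return None

    print(deduce("rain", rule))                     # wet_grass
    print(abduce("wet_grass", rule))                # rain
    print(induce([("rain", "wet_grass")] * 3))      # ('rain', 'wet_grass')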

Genifer also has some innovations that enhance classical logic, such as fuzzy-probabilistic truth values and concept composition (similar to combinatory logic and combinatory categorial grammar), but they are not the subject of this article.

AGI = thinking + acting

I am now confident that Genifer's logic is sufficient to cover all the "thinking" aspects (if you buy my theory), and I am also very satisfied with its elegance and simplicity.  What remains is to extend the system to cover the planning / acting aspects.

Recently I realized that reinforcement learning can be a suitable over-arching mechanism to unify thinking and acting.

Reinforcement learning (RL)

In a nutshell, RL takes as input a set of actions, a set of states, and a reward function over states, and outputs a policy (a mapping from states to actions) that maximizes reward in the long run.  One can also say that RL searches the plan space for good plans.  Richard Sutton and Andrew Barto's 1998 book contains a superb exposition of RL.  What is special about RL is that its problem setting is an agent acting in an environment, in contrast to most other machine-learning settings, which are essentially pattern recognition.  Thus, RL is uniquely suitable as the top-level architecture of an AGI.
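
To make the setting concrete, here is a minimal tabular Q-learning sketch in Python.  The environment interface (reset / step / actions) is a placeholder I am assuming for illustration, not anything from Genifer:

    import random
    from collections import defaultdict

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
        """Learn a policy by trial and error; env is any object with reset/step/actions."""
        Q = defaultdict(float)                        # Q[(state, action)] -> estimated value
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                # epsilon-greedy: mostly exploit the current estimates, sometimes explore
                if random.random() < epsilon:
                    action = random.choice(env.actions(state))
                else:
                    action = max(env.actions(state), key=lambda a: Q[(state, a)])
                next_state, reward, done = env.step(action)
                # temporal-difference update toward the one-step lookahead target
                best_next = 0.0 if done else max(Q[(next_state, a)] for a in env.actions(next_state))
                Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
                state = next_state
        # the learned policy: pick the highest-valued action in each state
        return lambda s: max(env.actions(s), key=lambda a: Q[(s, a)])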

Recall B. F. Skinner's behaviorism, which tries to reduce human psychology to a set of stimulus-response associations, for example food $\rightarrow$ saliva.  We know intuitively that this is inadequate, because higher animals also build cognitive models of the world in their brains.

In abstract terms, the maintenance of such a cognitive model can be viewed as a sequence of actions that manipulate (brain) memory states.  But because of the astronomical number of possible cognitive states (and thus of "cognitive actions"), reinforcement learning cannot effectively deal with cognition -- in other words, it is impractical for RL to figure out on its own how to perform "thinking".  (A critical limitation of RL is that the set of possible actions must not be too large.)  This is where the need for logic comes in: a logic system is an inductive bias that tells us how to maintain truths and perform sensible reasoning.

The need to include a cognitive model of the world in an intelligent agent is already recognized in Sutton and Barto's book (§9.2, "Integrating Planning, Acting, and Learning"), in what they call the Dyna architecture.

For your convenience, the gist of their Dyna architecture diagram: real experience drives both direct RL updates (to the value function / policy) and the learning of a world model, and the model generates simulated experience that is fed back into the same updates as planning.
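
Along the lines of the tabular Dyna-Q algorithm in that chapter, here is a minimal sketch, using the same assumed environment interface as the Q-learning sketch above.  Real experience updates the value function directly and also trains a model; the model then generates simulated experience ("thinking") that feeds the same update:

    import random
    from collections import defaultdict

    def dyna_q(env, episodes=200, planning_steps=10, alpha=0.1, gamma=0.9, epsilon=0.1):
        Q = defaultdict(float)
        model = {}                                    # model[(state, action)] = (reward, next_state)
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                if random.random() < epsilon:
                    action = random.choice(env.actions(state))
                else:
                    action = max(env.actions(state), key=lambda a: Q[(state, a)])
                next_state, reward, done = env.step(action)

                # (a) direct RL: learn from the real experience
                best_next = 0.0 if done else max(Q[(next_state, a)] for a in env.actions(next_state))
                Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

                # (b) model learning: remember what the world just did
                model[(state, action)] = (reward, next_state)

                # (c) planning: replay simulated experience drawn from the model
                for _ in range(planning_steps):
                    (s, a), (r, s2) = random.choice(list(model.items()))
                    best = max(Q[(s2, b)] for b in env.actions(s2)) if env.actions(s2) else 0.0
                    Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])

                state = next_state
        return Q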



The thinking-acting (logic-RL) interface

From the central dogma we have the 3 modes of inference: deduction, abduction, and induction.  But these are not algorithms per se; rather, they define search problems, and it is still up to us to design the algorithms that do the searching.  For example, a deduction algorithm may reach a point in the search space where it would be advantageous to ask the user for a key assumption, or to look at the environment to confirm that assumption, which would allow it to produce a better proof.  This is an example of "inference control" that involves interaction with the user or the environment.  Such complex control strategies seem most suitable for RL to learn.

So, we should break the inference algorithms (deduction, etc.) into micro-steps and then let RL learn policies over these actions.  For example, RL may tell deduction to search a few more layers, then check whether the current best solution is good enough under the current time constraints or whether to continue searching deeper.
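
As a hypothetical sketch of what this interface could look like, the following wraps deduction as an RL environment whose actions are inference micro-steps.  The action names and the prover methods (expand_one_layer, ask_user, observe, proof_quality, ...) are made up for illustration; they are not Genifer's actual API:

    INFERENCE_ACTIONS = [
        "EXPAND_BEST_NODE",     # grow the proof search by one layer
        "ASK_USER",             # ask the user for a missing assumption
        "CHECK_ENVIRONMENT",    # try to confirm an assumption by observation
        "RETURN_BEST_PROOF",    # stop and report the best proof found so far
    ]

    class DeductionEnv:
        """Wraps a prover so that inference control looks like an RL environment."""

        def __init__(self, prover, query, time_budget):
            self.prover = prover
            self.query = query
            self.budget = time_budget

        def reset(self):
            self.prover.start(self.query)             # initialize the proof search
            return self.prover.state()

        def actions(self, state):
            return INFERENCE_ACTIONS                  # same micro-steps in every state

        def step(self, action):
            if action == "RETURN_BEST_PROOF":
                # reward = quality of the proof actually produced
                return self.prover.state(), self.prover.proof_quality(), True
            if action == "EXPAND_BEST_NODE":
                self.prover.expand_one_layer()
            elif action == "ASK_USER":
                self.prover.add_assumption(self.prover.ask_user())
            elif action == "CHECK_ENVIRONMENT":
                self.prover.add_assumption(self.prover.observe())
            self.budget -= 1
            return self.prover.state(), 0.0, self.budget <= 0

A policy learned over this environment (for example by the q_learning sketch earlier) is exactly an inference-control strategy: when to keep searching, when to ask, and when to stop.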

Meta-reasoning

So far so good, but there is one more problem.  We are using RL to learn "how to think", yet sometimes decisions about how to think also depend on a cognitive model of the thinking process itself.  For example, if the current query is about quantum mechanics and the AGI knows it is not good at that topic, it should be more inclined to consult external references or experts.  Another example is when a contradiction arises between internal beliefs: the AGI should introspect on how each of the contradicting beliefs was derived, and then decide which one(s) to retract.

Interestingly, the cognitive model of the thinking process itself can be handled by the very same logic engine that handles the cognitive model of the external world.  The only difference is that the meta-cognitive model would produce "rewards" that guide the RL to think better.
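
As a very rough sketch of this idea, the meta-level could score a reasoning episode by querying a KB about the reasoner itself, and that score becomes the reward seen by the RL layer.  The predicate names and interfaces below are entirely made up for illustration:

    def meta_reward(meta_kb, episode_trace):
        """Score one reasoning episode using the meta-cognitive KB (hypothetical interfaces)."""
        reward = 0.0
        for event in episode_trace:
            # e.g. the meta-KB knows "I am weak at quantum mechanics"
            if event.action == "ASK_USER" and meta_kb.entails(("weak_topic", event.topic)):
                reward += 1.0          # consulting help on a known-weak topic is rewarded
            if meta_kb.entails(("contradicts", event.conclusion)):
                reward -= 1.0          # deriving a contradiction is penalized
        return reward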

So, this is my current design for a complete AGI architecture.  My next task is to spell out the details.  Thanks for reading!

8 comments:

  1. Hi YKY

    Don't take this too harshly, but my very friendly advice is to go back to the drawing board and start all over.

    All this Genifer theory is too primitive and will not hold. I can't tell you more, sorry.

    Take it or leave it.

    PS. You can delete this post after reading.
    Good luck.

    F./

  2. I won't delete your post - jokes are allowed :)

  3. This is just a first attempt at AGI architecture, so naturally there is a lot of room for improvement. Or the design may even be displaced by a paradigm shift (like the steam engine was later replaced by the combustion engine). I definitely want to see such improvements / changes.

  4. Hi YKY,

    I agree with you and I like your enthusiasm.
    The reason people don't like your Genifer is probably that they feel it is not the perfect solution, but probably nobody really knows which way to go.

    I don't like Genifer either (for no obvious reason), but I'm happy that you are trying and don't give up … you have a chance to get it right.

  5. This comment has been removed by the author.

  6. PS: OpenCog and NARS also use separate planners in addition to their logic modules. This configuration seems to be the status quo among current AGIs.

    @Anonymous: You have to control your jealousy / racism / sexism. I won't delete your post, as a reminder to others how immature some AGI folks can be.

  7. Just out of curiosity: you probably know both of the systems you mentioned – OpenCog and NARS – well. What do you think their biggest weakness is? Especially for the NARS system: why do you think the community around it is so small? And what would be Genifer's biggest advantage compared to them?

  8. I have made some friendly criticisms of both OpenCog and NARS and suggested combining our ideas, but they both declined and preferred to continue exploring their own directions.

    My belief is that a good AGI project should be open to external ideas; that gives the project vitality. That's why I offered to merge with NARS or OpenCog. Ironically, OpenCog and OpenNARS aren't really open, but those who come to my project tend to complain: "why don't you spell out all the details so we can code according to specs?"
