Latex Maths

Showing posts with label programming. Show all posts
Showing posts with label programming. Show all posts

Monday, July 12, 2010

Feature Matrix

YKY has highlighted the modules essential to the Genifer prototype.
SeH has included many more features.  Feel free to suggest more!

Thursday, July 8, 2010

implementation: the "Logic" class

Definition of the Logic class:

Formula------------ Constant ------------------------ Atom
                |                                          |
                --------- Variable                  ---------- Char, and String
                |                                          |
                --------- Apply-Operator       ---------- Number ------------ Integer, Real, Bool, etc...
                |                                                                        
                --------- Apply-Functor


Atom:  {  string: name,  int: index  }                                 // atoms are eg: "john", "mary"
Variable:  {  int: index  }
Apply-Operator:  {  char: op,  Formula[]: arguments  }          // op applied to 1 or more arguments
Apply-Functor:  {  Atom: functor, Formula[]: arguments  }    // functor applied to 1 or more arguments
                                                                                      // functor can be function or predicate
                                                                                      // functor's name is just an Atom

Definition of FOL terminology:

This set of terminology is standard for FOL:
  1. At the top level is the formula (or proposition, or sentence, or logic statement, all synonyms).
  2. A formula can be a rule or a fact.  A rule is of the form
         A <- X, Y, Z, ...
    And a fact does not contain the "<-" operator.
  3. Each formula consists of a number of literals joined by logical operators (aka connectives) such as AND and OR, and operations can be nested.  Note that NOT is a unary operator.  Eg:
    • (loves john mary)                                     is a literal
    • (NOT (female john))                                is a literal
    • (AND (loves john mary) (female mary))    are 2 literals joined by AND
    • (OR (AND (p X) (q Y))  (r Z))                  is nested
  4. Each literal consists of a predicate plus its arguments, eg:
    • (loves john mary)                    where love = predicate with 2 arguments
    • (male paul)                             where male = predicate with 1 argument
    • (female (daughter-of paul))      where female = predicate with 1 argument, "daughter-of" is a function symbol
  5. Terms (or arguments) can be:
    • variables   (eg:  X, Y, Z)
    • constants   (eg:  john, mary, 17, "String")
    • functions of other terms   (eg:  daughter-of(john),  sine(3.14))
That's all!

Our simplified terminology:

In anticipation of higher-order logic:
  1. The notion of predicates and functions are combined into functors.
  2. We can also absorb functors into operators, but this is not currently done.
  3. In FOL, a function-application is not a formula, but this is relaxed in our definition.  Potentially it can lead to ill-formed formulas.  For example, sine(3.14) is clearly not a formula;  but we want to allow some functions that return formulas.
The result is more elegant, and we can express this by a set of re-write rules:
     formula ===> constant
     formula ===> variable
     formula ===> (functor formula1 formula2 ...)
     formula ===> (op formula1 formula2 ...)

More about rules

Rules are of the form:
     A <- B, C, D, ...
where A is the "head", B,C,D,... is the "body", and "<-" is just a special operator like "AND";  we can call it "COND" for conditional.
Each "," is equivalent to an "AND".  So, the above rule is represented as:
     (COND A (AND B C D...))
where "AND" can accept more than 2 arguments.

( In our logic, <- translates to a link in a Bayesian network.  For this reason, it seems not very meaningful for a formula to have "<-" inside the body instead of at the head position. )

Some special cases:
  1. If a rule has no head, it is still a rule, such as:
         (interesting X)    means "everything is interesting"
    -- note that all variables are implicitly universally quantified, ie, the above is really:
         for-all X,  interesting X
  2. We can ignore the case where a rule has no body (it's not useful).
  3. A rule without variables is still a rule, eg:
         wet <- rains
         not cloudy <- sunny
Representing variables and constants

Notice that we can use any way we want to represent variables and constants internally.  In my Lisp code, variables are like "?1" etc and constants are "john" etc.  In a more sophisticated KB, we would of course use some kind of indexes.

Code

This is Russell's C# code, and Chuck's Cobra code.  When in doubt, please conform to the definition above.

Friday, June 25, 2010

Use cases

It's kind of hard to give detailed use cases, so I'll try to explain the main principles.

How can Genifer be taught?  A user can ask a query and force an answer.  Then Genifer will generate a number of hypotheses in the KB that enable to infer that answer next time.  Since some of these hypotheses can be wrong, the user should provide multiple examples to reduce the chance of wrong but accidentally correct hypotheses.

For example:
[ Sorry that this example is not related to programming... it's easier to think of a daily-life example.  But I'll try to use programming examples as much as possible. ]

    User:  "Given:  John eats spaghetti with Ben.  Conclude:  Ben eats spaghetti"
    User:  "Given:  John eats spaghetti with Paul.  Conclude:  Paul eats spaghetti"
    User:  "Given:  John dances with Mary.  Conclude:  Mary dances."

    User:  "Given:  John eats spaghetti with meatball.  Query:  Does meatball eat spaghetti?"
    Genifer:  "Yes."
    User:   "That's incorrect."

    etc  etc ....

After the above session, an example rule that Genifer will have learnt is:
    A VerbPhrase with B /\ B is a person -> B VerbPhrase
Notice that this rule may still have exceptions, but it is a useful rule.


Thus, inductive learning allows even lay people to teach Genifer.  This is very important in our strategy to let online users contribute to the KB.


Some more examples with the word 'with':

  • "factorial n" matches with "factorial 6"              (pattern matching in Haskell)
  • The function is evaluated with n = 6
  • We can surround the function name with back-quotes
  • [x,y,z] is a list with 3 elements
  • Lists can be built with the : operator.

What needs to be taught?


Two areas:
1.  semantics of basic / restricted English (with a vocabulary restricted to programming)
2.  knowledge about programming

Possible first application areas:

  1. Find-and-replace, eg: "Find names of variables"
  2. Style checking (a good candidate because it only requires inference and pattern matching, no acting)
  3. String manipulation, eg: "Convert to camel case"
Sorry that this is still very vague...  I have not figured out everything myself and need more time to think on it...