Monday, July 5, 2010

HOL update

YKY doesn't want to focus on HOL at the moment, but he suggested I write down my thoughts anyway. I was going to put it off since I'd really want to do a more detailed write-up taking care of all the loose ends, but since there is some interest from Russel, I'm outlining my approach now.

The approach is to translate system IG from the paper Illative Combinatory Logic. This is an untyped lambda-calculus logic which soundly & completely interprets first-order logic both "directly" (ie, as truth functions) and via curry-howard isomorphism (which indicates the strength of the type reasoning the system can interpret).

The way I'm planning on translating it is to treat a statement and its negation as essentially separate statements in the probability distribution; ie, the Liar sentence can have probability 0, but its negation can simultaneously have probability 0. Only when we know a statement is a proposition can we derive that P(prop) = 1 - P(~prop).

After this, what we do is just encode the valid sequents as rules in our existing logic. However, IG has an infinite number of valid sequents, specified with some sequent derivation rules. This means that we've got an infinite number of rules now-- we need an efficient mechanism for constructing valid rules as they are needed. The sequent derivation rules get turned into "meta-rules" in our logic. This is a bit complicated, but better than the alternatives (see my previous post).

One complication is that sequent-based need to be combined with something like the noisy-or combination rule [I need to add a reference...]. At present we're planning on dealing with this by having noisy-or be the default combination rule in the whole system; there are some cases where it is a terrible rule, though. YKY thinks we can determine which cases to apply it to using causal reasoning.

For efficient reasoning, we'd like to normalize terms before reasoning so that we can quickly recognize equality. However, in untyped HOL, some terms have nonterminating normalizations. There are a few things we can do. One is to just have a time cutoff on normalization attempts. Another is to try and do some type reasoning before normalization (so we'd only normalize when we could first prove termination). A third option that might have some interesting properties is what I'm calling "compressive normalization"... rather than the usual normalization, use the shortest form of a term as the normal form. This is non-terminating and we'd still need a time cutoff, but unlike the usual normal form, it exists even for terms which aren't propositions-- it just always exists. It might be useful to keep short forms of terms around for compression purposes.

Speaking of compression; integrating both HOL and likelihood reasoning into Gen means that might be feasible to specify a prior probability distribution over models within Gen itself, meaning iduction just becomes probabilistic deduction. However, I have not worked out details here.

That's all for today! :)


    Page 7, section 3, Noisy Or Classifier - The noisy-or model was first introduced by Pearl [9]. As its name suggests it is a generalization of the deterministic OR relation...

    The noisy-or classifier can have the following semantics. If an attribute, say Aj , is in a state aj then the instance (a1, . . . , aj , . . . , ak) is classified to class C = 1 unless there is an inhibitory effect, which has probability PM(A0j = 0|Aj = aj ). All inhibitory effects are assumed to be independent. Therefore the probability that an instance does not belong to class C, i.e. C = 0, is a product of all inhibitory effects Qj PM(A0j = 0 | Aj = aj)6.

  2. [9] J. Pearl. Probabilistic reasoning in intelligent systems: networks of plausible
    inference. Morgan Kaufmann, 1988.

  3. One thing I don't understand is the usefulness of those "higher-order terms" that you talk about. In most cases I assume the terms would be terminating.