This is not a new idea -- Ben and Jared in OpenCog have mentioned it before (in the context of MOSES):
Abram seems to have an idea where the GP is replaced by RL (reinforcement learning).
Yesterday I was analyzing the GP + IE idea in more detail:
- Let the GP side and the IE side gradually evolve in cycles, starting with $GP_1 + IE_1$.
- The key question is whether the cyclic route is faster than hand-coding IE. Initially, it would involve more work because the GP side needs to be custom-made (we cannot use off-the-shelf GP software). It may pay off only if $GP_1 + IE_1$ increases programming productivity significantly.
- A very weak $IE_1$ cannot increase programming productivity, because GP + weak IE is still too slow to be usable. For example, one idea is to have IE suggest a number of primitive functions when given a goal, so that GP can include those primitives in the genes for that population (a toy sketch follows this list). But even with current state-of-the-art GP, this cannot efficiently evolve programs longer than one line, even when the primitives are suggested.
- $IE_*$ (the ideal form of IE) will be able to deduce the program when given the desired goal: $$ G : \text{goal} \mapsto \{\, P : \text{program} \mid P \vdash G \,\}, $$ whereas the above $IE_1$ is too weak, merely suggesting primitives similar to the goal: $$ G : \text{goal} \mapsto \{\, x \mid x \approx G \,\}. $$ Perhaps we need to find something in between the weak $IE_1$ and $IE_*$.
- In other words, we simply have to hand-code $IE_1$ to reach a certain level of functionality before putting it to use with GP. That basic level seems to include:
- Ability to express simple plans (so that human teachers can supply basic programming knowledge as decompositions of tasks into sub-tasks).
- Ability to express similarity and to perform simple associative recall.
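To make the $GP_1 + IE_1$ cycle above concrete, here is a minimal Python sketch of the loop: the IE suggests goal-relevant primitives and the GP seeds and mutates its gene pool from them. Every name here (`suggest_primitives`, the genome encoding, the fitness stub) is a hypothetical placeholder, not an actual MOSES or OpenCog interface.

```python
import random

# Minimal sketch of the GP_1 + IE_1 loop. All components are stubs.
PRIMITIVES = ["car", "cdr", "cons", "if", "eq", "+", "-"]

def suggest_primitives(goal):
    """IE_1 stub: return primitives mentioned in (hence 'similar to') the
    goal text. A real IE would use associative recall, not substring match."""
    return [p for p in PRIMITIVES if p in goal] or PRIMITIVES

def random_program(primitives, length=3):
    """GP genome: a flat list of primitive symbols (toy representation)."""
    return [random.choice(primitives) for _ in range(length)]

def fitness(program, goal):
    """Fitness stub: how many genes the IE considers goal-relevant."""
    relevant = set(suggest_primitives(goal))
    return sum(1 for gene in program if gene in relevant)

def gp_with_ie(goal, pop_size=20, generations=10):
    """GP whose gene pool is seeded and mutated from IE_1's suggestions."""
    seed = suggest_primitives(goal)
    population = [random_program(seed) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda prog: fitness(prog, goal), reverse=True)
        survivors = population[: pop_size // 2]
        # Mutation draws from the IE-suggested primitives, not the full set.
        children = [
            [random.choice(seed) if random.random() < 0.2 else gene
             for gene in random.choice(survivors)]
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return population[0]

# Example: the IE narrows the gene pool to list primitives for a list goal.
print(gp_with_ie("reverse a list using cons, car and cdr"))
```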
The new insight may change our priorities during implementation...
GP does not have the flavor of "local computations"; very loosely speaking, GP is to RL what MCMC is to belief prop. That is why I prefer to think of RL as the starting point.
That said, obviously you are explicitly stating that it would not be off-the-shelf GP. You intend to suggest some form of GP which can utilize probabilistic-logical assertions which the IE makes about the kinds of solutions to look for. That is good! MOSES does not go that far, even though it makes internal probabilistic models. It would be nice if MOSES could use (or even make) arbitrary PLN models of the fitness landscape during its search, but that would be a big modification.
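As a rough illustration of that suggestion (and not MOSES's or PLN's actual machinery), the IE's probabilistic-logical assertions could shape the GP proposal distribution, e.g. as confidence weights over primitives. The assertion format below is invented for illustration.

```python
import random

# Hypothetical IE output: "the solution probably uses primitive p"
# with strength s. GP mutation is biased toward high-confidence primitives.
ie_assertions = {"cons": 0.9, "car": 0.7, "cdr": 0.7, "+": 0.1, "-": 0.1}

def sample_primitive():
    """Draw a primitive with probability proportional to IE confidence."""
    prims, weights = zip(*ie_assertions.items())
    return random.choices(prims, weights=weights, k=1)[0]

def mutate(parent, rate=0.3):
    """GP mutation whose new genes follow the IE-shaped distribution."""
    return [sample_primitive() if random.random() < rate else gene
            for gene in parent]

print(mutate(["car", "car", "+", "cons"]))
```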
Part of the difference between the RL mindset and the GP mindset is that RL wants to respond to new data in real time, whereas GP operates in batch mode. Which is better depends on the initial domain of application. Still, the real answer is not RL or GP, but "effective search for high-utility policies".
I am told there is some good stuff in the papers of Rina Dechter. I have yet to review them.
@Abram: That's a very good point. In fact, we can call the left-hand side "program search" in general; GP is just one kind of non-local stochastic search.
Another point is about top-down vs bottom-up. My personal favorite is to add architectural constraints to program search. For example, we already have some vague and relatively abstract ideas about what an inference engine would look like. (Or, ideas about what a logic or proof should look like in general, a field of study known as "ludics".) This will prune the program search space greatly.
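Purely as a toy illustration of the pruning effect (the skeleton and component pools are invented), fixing the architecture of candidate programs shrinks the space before any search happens:

```python
from itertools import product

# Architectural constraint as a generator: candidates are only produced
# within a fixed skeleton (match -> transform -> emit), so most of the
# raw space is never enumerated. The component names are made up.
MATCHERS   = ["match_atom", "match_pair", "match_var"]
TRANSFORMS = ["substitute", "unify", "rewrite"]
EMITTERS   = ["emit_clause", "emit_goal"]

def constrained_candidates():
    """Enumerate only programs that fit the inference-engine skeleton."""
    return list(product(MATCHERS, TRANSFORMS, EMITTERS))

# 3 * 3 * 2 = 18 skeleton-respecting candidates, versus 8**3 = 512
# unconstrained length-3 sequences over the same 8 components.
print(len(constrained_candidates()))
```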
On the other hand I'm not very familiar with the kind of knowledge that can be used in a bottom-up strategy. Perhaps you're thinking of small programs (= functions) that have high utility value? But how do you know the utility of the programs unless you have deployed them in a massive number of trials?
PS: I have Rina Dechter's book on constraint processing. CP seems to be THE appropriate framework to implement self-programming. Thanks for suggesting that!
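To show what I mean (a toy rendering only; the variables and constraints below are invented, not taken from Dechter's book), program slots can be cast as CSP variables, primitives as domain values, and architectural rules as constraints, solved here by plain backtracking:

```python
# Toy CSP formulation of program search, solved by plain backtracking.
# A real constraint-processing engine would add propagation and heuristics.

def backtrack(assignment, variables, domains, constraints):
    """Depth-first search that checks all constraints after each assignment."""
    if len(assignment) == len(variables):
        return assignment
    var = variables[len(assignment)]
    for value in domains[var]:
        candidate = {**assignment, var: value}
        if all(check(candidate) for check in constraints):
            result = backtrack(candidate, variables, domains, constraints)
            if result is not None:
                return result
    return None

def differ(a, x, y):
    """Constraint passes unless both variables are assigned and equal."""
    return x not in a or y not in a or a[x] != a[y]

variables = ["step1", "step2", "step3"]
domains = {v: ["load", "match", "rewrite", "emit"] for v in variables}
constraints = [
    # Each constraint is written to pass while its variables are unassigned.
    lambda a: a.get("step1", "load") == "load",    # must start by loading
    lambda a: a.get("step3", "emit") == "emit",    # must end by emitting
    lambda a: differ(a, "step1", "step2"),         # no immediate repetition
    lambda a: differ(a, "step2", "step3"),
]

print(backtrack({}, variables, domains, constraints))
# -> {'step1': 'load', 'step2': 'match', 'step3': 'emit'}
```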