2011-01-18

Baby Steps into Genetic Programming

1 Introduction

While my final ranking in the Google AI Contest was disappointing (280th), it was a very educational experience and was totally offset by Gábor Melis' dominating win using Common Lisp as well.

One of things that piqued my interest during the contest was a post on the AI Challenge forums about a bot written using genetic programming. Genetic programming (GP) and genetic algorithms have always held my interest but seeing the bot in action really motivated me to dive into the matter.

Genetic programming is inspired by biological evolution and is a way of solving problems by setting up an environment (tuned to the problem at hand!) and allowing computer programs to evolve towards a possible solution in that environment.

This article shows my initial exploration into GP using Common Lisp and should be an example of a typical REPL session (my session was a couple of hours divided over two evenings). The code has been reviewed, made a more readable and lispier but still looks very much like what I wrote initially. From the REPL session useful output has been cut and pasted into this article.

This is a really basic introduction into GP and is meant for people interested in GP and/or interested in (Common) Lisp. It is heavy on code and examples and light on theory. My intention is for the reader to play around with the functions on the REPL as they come by in the article.

The code in this article should be portable Common Lisp. If you need to know what a function or macro does please consult the Common Lisp HyperSpec.

1.1 Toy Project

I found A Field Guide to Genetic Programming on-line and decided to duplicate Bill Clementson's toy project of discovering the formula for calculating the area of a circle. Except for educational purposes we will be starting from scratch.

1.2 Environment

Many Common Lispers use Emacs, Slime and usually Unix or OS X as development environment. However, do not get the impression this is the only way to write Lisp. The development page on CLiki lists many other alternatives in various states of usability.

My recommendation is to use your favorite text editor and CLISP since it comes with command line history and tab completion built in. Write or paste the code in your editor and save the file after every addition as for example "gp.lisp". Then you only need to type:

(load "/path/to/gp.lisp")

once and can use the command line history to reload it.

2 Generating Random Code

As you might have read, the space.invaders bot was written in PHP. Our weapon of choice is Common Lisp and the first matter of business is generating random code.

2.1 Operators

(defparameter *operators* '(+ - * /))

To keep things simple and already knowing our target function (πr²) we opt for using four operators: addition, subtraction, multiplication and division. In the name of simplicity (again) they will have an arity of 2, so they will all take two arguments.

*OPERATORS* is the name of the variable. The *earmuffs* convention is used to signify global variables.

2.2 RANDOM-ELT

(defun random-elt (sequence)
  (let ((seq-length (length sequence)))
    (when (> seq-length 0)
      (elt sequence (random seq-length)))))

The function RANDOM-FORM in the next section uses RANDOM-ELT to select a random element from the list of operators.

2.3 Generating Random Function Forms

(defun random-form (operators)
  (append (list (random-elt operators))
          (loop repeat 2  ; arity
                collect (let ((random-nr (random 100)))
                          (cond ((< random-nr 50) (random-form operators))
                                ((< random-nr 75) (random 10.0))
                                (t                '=input=))))))

The function works as follows: it creates a list of three items. The first item is a random operator and the second and third items are the arguments to that operator:

(operator argument-1 argument-2)

The arguments are either a random number between 0.0 and 10.0 (arbitrary limit), a variable called =INPUT= (more about this later) or, and this is important, another list of three items. For example:

(* =INPUT= 2.345)

or

(+ 6.789 (- =INPUT= 0.123))

So the RANDOM-FORM function will call itself recursively if needed. The (arbitrary) probabilities for the ARGUMENT-1 and ARGUMENT-2 are: 50% of the time another random form, 25% of the time a random number and 25% of the time the =INPUT= variable.

The process of generating random forms described above is similar to the "grow" method described in Chapter 2.2 of the Field Guide.

2.3.1 Limiting RANDOM-FORM

(defun random-form (operators &optional (max-depth 4))
  (append (list (random-elt operators))
          (loop repeat 2  ; arity
                collect (let ((random-nr (random 100)))
                          (if (> max-depth 0)
                              (cond ((< random-nr 50)
                                     (random-form operators (- max-depth 1)))
                                    ((< random-nr 75) (random 10.0))
                                    (t                '=input=))
                              (cond ((< random-nr 50) (random 10.0))
                                    (t                '=input=)))))))

Since RANDOM-FORM can call itself recursively it can potentially create huge amounts of random code and even exhaust the stack of the Lisp implementation. Hence we need to place some limits on RANDOM-FORM. The addition of the MAX-DEPTH parameter and the deduction of MAX-DEPTH when RANDOM-FORM is called again prevents nasty things from happening.

2.4 =INPUT=

Since we want our generated code to calculate the area of a circle from an input value, the radius, we've added the possibility to generate the =INPUT= variable to RANDOM-FORM. This variable is the same as the input argument of the RUN-FORM function that is discussed later.

2.5 Testing RANDOM-ELT and RANDOM-FORM

Pasting RANDOM-ELT and RANDOM-FORM into the REPL and running the latter a couple of times will yield some randomly generated code:

CL-USER> (random-form *operators*)
(/ 2.8173013 5.378826)

CL-USER> (random-form *operators*)
(* =INPUT= =INPUT=)

CL-USER> (random-form *operators*)
(+ 7.9595613
 (- (- =INPUT= (* =INPUT= 0.57189345))
  (* (- (/ 6.9174767 9.723027) =INPUT=) (* =INPUT= =INPUT=))))

CL-USER> (random-form *operators*)
(- (- =INPUT= (- (/ 2.002045 (* =INPUT= 9.829036)) 4.531122)) =INPUT=)

3 Running Generated Code

(defun run-form (form input)
  (funcall (eval `(lambda (=input=) ,form))  ; note: backquote!
           input))

To run the generated forms we wrap them in a LAMBDA with an =INPUT= argument and funcall the evaluated lambda with an input value. The following REPL interaction shows line by line what happens in RANDOM-FORM:

CL-USER> (defparameter random-form (random-form *operators*))
RANDOM-FORM

CL-USER> random-form
(* 6.341989 =INPUT=)

CL-USER> `(lambda (=input=) ,random-form)
(LAMBDA (=INPUT=) (* 6.341989 =INPUT=))

CL-USER> (eval `(lambda (=input=) ,random-form))
#<FUNCTION (LAMBDA (=INPUT=)) {AF2F08D}>

CL-USER> (funcall (eval `(lambda (=input=) ,random-form)) 2)
12.683978

CL-USER> (* 6.341989 2)
12.683978

However, the generated code can be illegal and cause errors. Since we'll be running a great many pieces of generated code multiple times we don't want to see error messages and drop into the debugger. To keep things simple we let illegal forms return NIL so we can kill them off later. Adding an error handler to RUN-FORM will do this:

(defun run-form (form input)
  (let ((*error-output* (make-broadcast-stream)))
    (handler-case (funcall (eval `(lambda (=input=) ,form))
                           input)
      (error () nil))))

*ERROR-OUTPUT* is redirected so we won't be bothered by all kinds of compilation notices. They only serve to distract in this situation.

3.1 SBCL note

SBCL is a very good, open-source CL implementation but it doesn't compile code very fast. The way we currently handle RUN-FORM by recompiling the form every time causes the advancing of generations later in the article to be very slow . Compared to SBCL, which compiles to native code, CLISP, which compiles to byte-code, does the advancing of generations much faster.

Since we're in exploration mode we will leave RUN-FORM as it is.

4 Population

(defun create-initial-population (operators &optional (size 100))
  (loop repeat size
        collect (random-form operators)))

If you're not familiar with CL: the COLLECT statement in LOOP accumulates a random form on each iteration and returns them as a list once iteration has finished. For example:

CL-USER> (create-initial-population *operators* 10)
((* =INPUT= 6.8947406)
 (* (+ =INPUT= 2.1140647) 6.7930226)
 (+
  (/ (+ (* =INPUT= (/ 8.296512 =INPUT=)) 1.1989254)
   (- =INPUT= (/ =INPUT= (- 5.1625586 1.4763731))))
  =INPUT=)
 (- 1.5379852 2.1223724)
 (* =INPUT= (- 3.0632048 (* (+ (- 5.8116364 =INPUT=) 0.1915878) =INPUT=)))
 (/ 8.237898 =INPUT=)
 (* (/ (/ (- =INPUT= =INPUT=) 0.56726336) =INPUT=) =INPUT=)
 (+ (+ 7.147339 =INPUT=) =INPUT=)
 (- (- =INPUT= =INPUT=) =INPUT=)
 (+ (/ =INPUT= (- (* =INPUT= =INPUT=) (* (- =INPUT= =INPUT=) =INPUT=)))
  6.520609))

The way we're creating the initial population is not ideal. This is because RANDOM-FORM only creates one type of syntax tree (using the "grow" method mentioned earlier). For a more diverse initial population we'd need more code generation functions that would output different kinds of syntax trees.

The "full" method is easy to add and is left as an exercise to the reader. See the Field Guide for a description.

5 Fitness

(defun fitness (form fitness-fn test-input)
  (loop for input in test-input
        for output = (run-form form input)
        for target = (funcall fitness-fn input)
        for difference = (when output (abs (- target output)))
        for fitness = (when output (/ 1.0 (+ 1 difference)))
        when (null output) do (return-from fitness nil)
        collect fitness into fitness-values
        finally (return (reduce #'* fitness-values))))

We need a way to test the output of a generated function against our desired outcome and represent this as a number: the fitness. This is commonly a number between 0.0 and 1.0. The closer to 1.0 the better the fitness.

We also need to check whether the return value of RUN-FORM is NIL in which case it executed illegal code. FITNESS will return NIL in that case as well. (Poor man's exception handling but suffices for now.)

So the function needs: 1) the form to check, 2) the fitness function to check against and 3) test input. We also want to check against multiple input values.

The TEST-INPUT argument is a list of input values so the most direct approach will be to iterate over these values and running both the form and the fitness functions against them.

To get a fitness value between 0.0 and 1.0 we take the absolute difference between the output of the generated form (OUTPUT) and the output of the fitness function (TARGET). We then add 1 to this difference and use that to divide 1.0. To illustrate:

CL-USER> (defun fit (x) (/ 1.0 (+ 1 x)))
FIT
CL-USER> (fit 0)  ; no difference so the desired output
1.0
CL-USER> (fit 0.1)
0.9090909
CL-USER> (fit 5)
0.16666667
CL-USER> (fit 500)
0.001996008

We're not testing negative values since DIFFERENCE in FITNESS will never be negative.

REPL test of FITNESS:

CL-USER> (defparameter random-form (random-form *operators*))
RANDOM-FORM

CL-USER> random-form
(/ 8.552494 =INPUT=)
CL-USER> (fitness random-form (lambda (r) (* pi r r)) '(0 1 -2))
NIL  ; this is correct since we divided by zero

CL-USER> (setf random-form (random-form *operators*))
(- =INPUT= 5.246996)
CL-USER> (fitness random-form (lambda (r) (* pi r r)) '(0 1 -2))
9.168484389517398d-4

CL-USER> (setf random-form (random-form *operators*))
(* =INPUT= (+ (* =INPUT= 1.8322039) (- =INPUT= 6.812643)))
CL-USER> (fitness random-form (lambda (r) (* pi r r)) '(0 1 -2))
0.009196622001630076d0

CL-USER> (fitness '(* pi =input= =input=) (lambda (r) (* pi r r)) '(0 1 -2))
1.0d0
CL-USER> (fitness '(* (- pi 0.1) =input= =input=) (lambda (r) (* pi r r)) '(0 1 -2))
0.649350645706412d0

6 Generation Functions

Now everything is in place to start thinking about creating new generations from the initial population. Currently we will only support cross-overs and mutations.

6.1 Traversing Nodes

Before we get to cross-overs and mutations we need to look at three important functions: N-NODES, RANDOM-NODE and REPLACE-NODE. They all share a common, but for each slightly different, function: TRAVERSE-NODES.

It is a basic recursive function. It iterates through each element of the FORM list and if that element is a list as well it calls TRAVERSE-NODES again with that list element as argument.

(defun traverse-nodes-example (form)
  (labels ((traverse-nodes (subform &optional (indent ""))
             (loop for node in subform
                   do (format t "~D:~A ~S~%" (/ (length indent) 2) indent node)
                      (when (listp node)
                        (traverse-nodes node
                                        (concatenate 'string indent "  "))))))
    (traverse-nodes form)))

TRAVERSE-NODES-EXAMPLE goes through FORM exactly how TRAVERSE-NODES does and for each node it prints its nesting level and the node itself:

CL-USER> (traverse-nodes-example '(a (b c) (d (e f) g) h))
0: A
0: (B C)
1:   B
1:   C
0: (D (E F) G)
1:   D
1:   (E F)
2:     E
2:     F
1:   G
0: H

6.1.1 N-NODES

(defun n-nodes (form)
  (let ((nodes 1))
    (labels ((traverse-nodes (subform)
               (loop for node in subform
                     do (incf nodes)
                        (when (listp node)
                          (traverse-nodes node)))))
      (traverse-nodes form))
    nodes))

Helper function for RANDOM-NODE. Returns the number of nodes in FORM including the root node. Note that "(B C)" as well as B and C are counted as nodes:

CL-USER> (n-nodes '(b c))
3
CL-USER> (n-nodes '(a (b c) (d (e f) g) h))
12
CL-USER> (n-nodes '())
1

6.1.2 Picking Random Nodes

We want to be able to pick a random node from a form to perform operations on. RANDOM-NODE does this:

(defun random-node (form)
  (let* ((index 1)
         (nodes-1 (- (n-nodes form) 1))
         (random-node-index (+ (random nodes-1) 1)))
    (labels ((traverse-nodes (subform)
               (loop for node in subform
                     do (when (= index random-node-index)
                          (return-from random-node
                                       (list :index index :node node)))
                        (incf index)
                        (when (listp node)
                          (traverse-nodes node)))))
      (traverse-nodes form))))

It picks a RANDOM-NODE-INDEX and starts traversing the nodes of FORM. When it arrives at the index it returns both the node and the index as a property list:

(:index random-node-index :node random-node)

Will not return the root node at index 0.

6.1.3 Replacing Nodes

We need to be able to replace a node in a form with another node. The avoid bugs and confusion we'll let REPLACE-NODE return this result as a new form.

(defun replace-node (form node-index new-node)
  (let ((index 0))
    (labels ((traverse-nodes (subform)
               (loop for node in subform
                     do (incf index)
                     when (= index node-index)
                       collect new-node
                     when (and (/= index node-index)
                               (not (listp node)))
                       collect node
                     when (and (/= index node-index)
                               (listp node))
                       collect (traverse-nodes node))))
      (traverse-nodes form))))

Traverses the nodes of FORM and collects them to return as a new form. When its INDEX counter is equal to NODE-INDEX it collects NEW-NODE instead.

The function does not replace the root node (index 0).

6.2 Cross-overs

There are different kinds of cross-overs but we will be doing the most straight-forward one: replace a random node in form A with a random node from form B.

Cross-overs are the most used genetic operation in GP and some people have even suggested to never use mutations. They can be compared to the procreation of humans in which attributes of the male and female are represented in their offspring.

(defun cross-over (form1 form2 &key (debug nil))
  (let ((rnode1 (random-node form1))
        (rnode2 (random-node form2)))
    (when debug
      (format t "form1: ~S~%form2: ~S~%rnode1: ~S~%rnode2: ~S~%"
              form1 form2 rnode1 rnode2))
    (replace-node form1 (getf rnode1 :index) (getf rnode2 :node))))

CROSS-OVER takes two forms as arguments and returns a new form. The new form is largely similar to FORM1 but one random node is replaced by a random node from FORM2. The function RANDOM-NODE is used to pick these random nodes.

CL-USER> (cross-over '(1 (2 3) (4 (5 6) 7) 8) '(a (b c) (d (e f) g) h) :debug t)
form1: (1 (2 3) (4 (5 6) 7) 8)
form2: (A (B C) (D (E F) G) H)
rnode1: (:INDEX 1 :NODE 1)
rnode2: (:INDEX 3 :NODE B)
(B (2 3) (4 (5 6) 7) 8)

CL-USER> (cross-over '(1 (2 3) (4 (5 6) 7) 8) '(a (b c) (d (e f) g) h) :debug t)
form1: (1 (2 3) (4 (5 6) 7) 8)
form2: (A (B C) (D (E F) G) H)
rnode1: (:INDEX 3 :NODE 2)
rnode2: (:INDEX 7 :NODE (E F))
(1 ((E F) 3) (4 (5 6) 7) 8)

CL-USER> (cross-over '(1 (2 3) (4 (5 6) 7) 8) '(a (b c) (d (e f) g) h) :debug t)
form1: (1 (2 3) (4 (5 6) 7) 8)
form2: (A (B C) (D (E F) G) H)
rnode1: (:INDEX 2 :NODE (2 3))
rnode2: (:INDEX 6 :NODE D)
(1 D (4 (5 6) 7) 8)

CL-USER> (cross-over '(1 (2 3) (4 (5 6) 7) 8) '(a (b c) (d (e f) g) h) :debug t)
form1: (1 (2 3) (4 (5 6) 7) 8)
form2: (A (B C) (D (E F) G) H)
rnode1: (:INDEX 5 :NODE (4 (5 6) 7))
rnode2: (:INDEX 5 :NODE (D (E F) G))
(1 (2 3) (D (E F) G) 8)

6.3 Mutation

Mutations change a part of a form without needing another form for the operation. It is perhaps analogue to a cosmic ray changing a bit of genetic information in a biological entity.

We will be using subtree mutation since it is straight-forward to write and has a potentially large effect.

(defun mutate (form operators &key (debug nil))
  (let ((rform (random-form operators))
        (rnode (random-node form)))
    (when debug
      (format t "form: ~S~%rform: ~S~%rnode: ~S~%" form rform rnode))
    (replace-node form (getf rnode :index) rform)))

Mutation replaces a random node of FORM with a form created by RANDOM-FORM.

CL-USER> (mutate '(a (b c) (d (e f) g) h) *operators* :debug t)
form: (A (B C) (D (E F) G) H)
rform: (+ =INPUT= (- 4.4699216 (+ 9.623513 =INPUT=)))
rnode: (:INDEX 3 :NODE B)
(A ((+ =INPUT= (- 4.4699216 (+ 9.623513 =INPUT=))) C) (D (E F) G) H)

CL-USER> (mutate '(a (b c) (d (e f) g) h) *operators* :debug t)
form: (A (B C) (D (E F) G) H)
rform: (+ 4.5209084 (- 8.943897 (+ 6.657296 =INPUT=)))
rnode: (:INDEX 2 :NODE (B C))
(A (+ 4.5209084 (- 8.943897 (+ 6.657296 =INPUT=))) (D (E F) G) H)

7 Advancing a Generation

7.1 Evaluating a Population

(defun evaluate-population (population fitness-fn test-input)
  (loop for form in population
        for fitness = (fitness form fitness-fn test-input)
        when fitness collect (list :fitness fitness :form form) into result
        finally (return (sort result
                              (lambda (a b)
                                (> (getf a :fitness) (getf b :fitness)))))))

To advance a generation we need to do cross-overs and mutations to the forms in the population. The literature suggests we give forms with a higher fitness more chance to be a candidate for a cross-over or mutation.

EVALUATE-POPULATION evaluates a population and returns a list of the forms in that population and their fitness (minus any forms that gave errors). For ease of testing on the REPL and for use in our next function the output is sorted from best fitness to worst:

CL-USER> (defparameter population (create-initial-population *operators* 10))
POPULATION

CL-USER> (evaluate-population population (lambda (r) (* pi r r)) '(0 1 -2))
((:FITNESS 0.016815323605722716111d0 :FORM (- (- =INPUT= =INPUT=) =INPUT=))
 (:FITNESS 0.009437720627636827027d0 :FORM (- 1.5379852 2.1223724))
 (:FITNESS 0.007690745276452599039d0 :FORM (* =INPUT= 6.8947406))
 (:FITNESS 0.0031801166742824690382d0 :FORM
  (* =INPUT= (- 3.0632048 (* (+ (- 5.8116364 =INPUT=) 0.1915878) =INPUT=))))
 (:FITNESS 0.0016815216288940106286d0 :FORM (+ (+ 7.147339 =INPUT=) =INPUT=))
 (:FITNESS 2.6768631466526024146d-4 :FORM (* (+ =INPUT= 2.1140647) 6.7930226)))

This is the same population as created in the Population chapter. Note that four of the ten forms have been eliminated since they executed illegal code.

7.2 HEAD

(defun head (sequence &optional (amount 1))
  (if (<= amount 0)
      nil
      (if (< (length sequence) amount)
          sequence
          (subseq sequence 0 amount))))

Utility function used by ADVANCE-GENERATION. Returns AMOUNT elements from the start of SEQUENCE. If SEQUENCE is shorter than AMOUNT it will return the whole SEQUENCE.

7.3 Running the Advancement

(defun advance-generation (population fitness-fn operators test-input
                           &optional (max-population 100))
  (let ((epop (evaluate-population population fitness-fn test-input)))
    (format t "Best fitness of current population: ~S~%"
            (getf (first epop) :fitness))
    (loop for plist in (head epop max-population)
          for i from 0
          for fitness = (getf plist :fitness)
          for form = (getf plist :form)
          collect form
          when (<= (random 1.0d0) fitness)
            collect (if (<= (random 100) 90)
                        (cross-over form (getf (random-elt epop) :form))
                        (mutate form operators))
          ;; Add a new random form to the population now and then.
          when (<= (random 100) 2) collect (random-form operators))))

Using the list of forms and their fitness from EVALUATE-POPULATION as input we only loop over the first MAX-POPULATION items to keep the population size in check. We then check the form's fitness against a random number between 0.0 and 1.0 and if the number is lower than the fitness the form will be selected for either a cross-over (90% chance) or a mutation (10% chance).

So the form's fitness is its chance to be selected. Since the fitness of the forms in the initial population is usually very low it will take quite a few generations before something starts happening.

We collect the form and if it's been selected we also collect the result from either CROSS-OVER or MUTATE. These will all be accumulated and eventually returned as the new generation.

There's also a small chance a new random form will be added to the population since the way we're currently handling things it is very possible for a successful form to take over the entire population and make it too homogeneous for good results. This is a quick and easy hack to introduce some chaos into populations.

7.4 Finding a Solution

Lets run ADVANCE-GENERATION a hundred times over a new population (we also update the population on each iteration with SETF):

CL-USER> (defparameter population (create-initial-population *operators* 100))
POPULATION
CL-USER> (loop repeat 100
               for i from 0
               do (format t "[~S] " i)
                  (setf population
                        (advance-generation population
                                            (lambda (r) (* pi r r))
                                            *operators*
                                            '(0 1 -2))))
[0] Best fitness of current population: 0.08546627574466652d0
[...]
[43] Best fitness of current population: 0.08626213422506805d0
[...]
[66] Best fitness of current population: 0.10050107335294403d0
[...]

And again:

CL-USER> (loop repeat 100 for i from 0 do (format t "[~S] " i) (setf population (advance-generation population (lambda (r) (* pi r r)) *operators* '(0 1 -2))))
[...]
[26] Best fitness of current population: 0.3865141184760903d0
[...]
[93] Best fitness of current population: 0.49884414719455095d0
[...]

Lets do another 300 runs so we've done 500 in total and see what the best form looks like in the end:

CL-USER> (loop repeat 300 for i from 0 do (format t "[~S] " i) (setf population (advance-generation population (lambda (r) (* pi r r)) *operators* '(0 1 -2))))
[...]
[35] Best fitness of current population: 0.5397727425319018d0
[...]
[62] Best fitness of current population: 0.559234953025742d0
[...]
[117] Best fitness of current population: 0.6436990901489741d0
[...]
[179] Best fitness of current population: 0.9657338407311192d0
[...]
[201] Best fitness of current population: 0.9705968409735506d0
[202] Best fitness of current population: 0.9755729636966523d0
[203] Best fitness of current population: 0.9994359703065686d0
[...]

CL-USER> (defparameter best-form (first (evaluate-population population (lambda (r) (* pi r r)) '(0 1 -2))))
BEST-FORM
CL-USER> best-form
(:FITNESS 0.9994359703065686d0 :FORM
 (* (+ (+ =INPUT= =INPUT=) (+ =INPUT= (/ =INPUT= (+ 3.5050452 3.5518444))))
    =INPUT=))

CL-USER> (run-form (getf best-form :form) 0)
0.0
CL-USER> (run-form (getf best-form :form) 1)
3.1417055
CL-USER> (run-form (getf best-form :form) 2)
12.566822
CL-USER> (run-form (getf best-form :form) 3)
28.275349

Here's the output from the real function to calculate the area of a circle for comparison:

CL-USER> (run-form '(* pi =input= =input=) 0)
0.0d0
CL-USER> (run-form '(* pi =input= =input=) 1)
3.141592653589793d0
CL-USER> (run-form '(* pi =input= =input=) 2)
12.566370614359172d0
CL-USER> (run-form '(* pi =input= =input=) 3)
28.274333882308138d0

Not quite the impressive result of Bill Clementson's attempt but not half bad either!

8 Conclusion

I hope to have shown how one can start with a few simple functions (RANDOM-FORM and its RANDOM-ELT helper) and explore a topic which might interest you.

If you read the literature you'll notice that there are a lot of things still wrong with the current setup. As you will find out if you start experimenting with the code in this article bloat is a big problem, as well as a successful form taking over the entire population.

If the topic of GP interests you I would suggest you read up and look into:

  • eliminating bloat
  • improving the creation of the initial population
  • improvements to the cross-over and mutation functions

for the code in this article.

9 Thanks

Thanks to the denizens of #lisp at freenode for Common Lisp help in general.

Thanks to the following people for proofreading and comments: Marijn Haverbeke, Gábor Melis, David O'Toole and Matthias of the space.invaders team.

Keep in mind that any mistakes are mine.

Labels: , , , , ,

4 Comments:

OpenID AdamSanderson said...

Just as a note, getting your fitness function wrong can lead to many hours of amusing failure. I wrote an Othello AI that selected for the worst strategy possible. Sadly it still beat me so I didn't notice for a week ;)

Tuesday, 18 January, 2011  
Blogger cperkins said...

I had a project once where I had to take a bitmap of a curved line and figure out the mathematical formula that best described that curve. I used Lisp and a Genetic Algorithm, using a very similar technique. It worked surprisingly well. I was very pleased. Sadly, that part of the project was later dropped. But I kept the GA code around for years hoping to use the technique again, but never did. It's on some backup drive now.

I've often wanted to develop a little DSL where the _only_ tools available were AI techniques like genetic algorithms, neural nets, nondeterministic programming, etc. What would that be like? I'm sure it would be horrible but hopefully entertaining.

Chris Perkins

Thursday, 20 January, 2011  
Blogger Jim said...

Thanks for the fun article! When it comes to SBCL, I think you might want to try something like this for run-form:

(defmacro run-form-new (form input)
`(let ((*error-output* (make-broadcast-stream)))
(handler-case ((lambda (=input=) ,form) ,input)
(error () nil))))

Here are my timing results for the original implementation, then the new one:

CL-USER> (time (loop for i from 1 to 10000 do (run-form '(+ =input= 2) i)))
Evaluation took:
11.406 seconds of real time
11.215166 seconds of total run time (10.642486 user, 0.572680 system)
[ Run times consist of 0.632 seconds GC time, and 10.584 seconds non-GC time. ]
98.33% CPU
10,000 forms interpreted
30,000 lambdas converted
24,651,068,000 processor cycles
1,478,891,344 bytes consed

NIL


CL-USER> (time (loop for i from 1 to 10000 do (run-form-new (+ =input= 2) i)))
Evaluation took:
0.002 seconds of real time
0.001285 seconds of total run time (0.001270 user, 0.000015 system)
50.00% CPU
3,152,409 processor cycles
1,120,928 bytes consed

NIL

... of course, under CLISP, the timing results seem to be exactly the opposite (but not /quite/ as extreme)!

[3]> (time (loop for i from 1 to 10000 do (run-form '(+ =input= 2) i)))
Real time: 0.238547 sec.
Run time: 0.221582 sec.
Space: 5124580 Bytes
GC: 9, GC time: 0.048605 sec.
NIL

[4]> (time (loop for i from 1 to 10000 do (run-form-new (+ =input= 2) i)))
Real time: 6.631785 sec.
Run time: 6.232759 sec.
Space: 99943588 Bytes
GC: 191, GC time: 1.093844 sec.
NIL

Friday, 21 January, 2011  
Blogger Peter said...

Some months ago, I wrote a master-slave style CL library called CL-MW which could be easily used for parallel evaluation of genetic programming. In fact, that was one of the purposes for me writing it. Check out the higher-order example and the fleshed out manual.

http://www.cliki.net/cl-mw

Tuesday, 10 May, 2011  

Post a Comment

<< Home