Monday, November 16, 2015

Evolution in a Computer, pt. 3

After months of waiting, I welcome you to the final part of my humble introductory mini-series about evolutionary computation. And today's topic is Genetic Programming

Discovering functions

Recall the example of finding the minimum of the function from the first part of this mini-series? We had a function we didn't know anything about (actually, we did, but that was just for the sake of showing what we can do when we know) and we wanted to find a minimum of that function. Now, let's look at a different problem: we don't have the function at all, only a handful of $x, y$ pairs, and we want to find out what the function actually looks like. How to do it?

If we restrict ourselves to a specific set of functions, e.g. linear functions, we can use specialized methods like (multiple) linear regression that will just throw the function at us given the data. Period. However, this has the problem of restricting to only a specific set of functions. If the "secret" function was actually a linear function, it would be the best we could do but if it wasn't, we would get errorneous function.

We could take a different approach and use more complicated machine learning methods like Artificial neural networks (ANN) that have the universal approximation property (simplified: given enough neurons they are, in principle, capable of representing almost any function with arbitrarily low error). However, their problem is that they can be immensely complex and if one wanted to use it as a mathematical expression, it would be mess.

As you might already have guessed, we are going to use.. Evolutionary algoritm! Well, we know how to evolve binary strings and real numbers... how do we use it to find a function? Well, it is very complicated. You could design some algorithm that would translate binary string to a function but it would probably be very messy and not efficient, not speaking about the fixed length of the binary strings which would be a major problem. However, there is a representation that can be used... trees!

Tree representation

A mathematical expression can be described like a tree. Let's take some made-up expression like $$ (x + \sin{y})^2 - \cos{3x} $$ Where is the tree in there? It's easy - the operators (i.e. addition, mutiplication...) and functions (i.e. sin, cos...) are inner nodes and the variables (i.e. x, y...) and constants (i.e. numbers) are leaves of the tree. The corresponding tree for the expression above looks like this:

Evolution of trees

So now that we have a representation for math expressions, how do we actually evolve it? Well, pretty much in the same way as in classical GA - by selection, recombination and mutation. Selection is identical - the better solutions (in this case those that have lower error) are preferred more than the worse ones. Recombination, i.e. crossover, and muation are the more interesting ones.

Crossover

There are more types of crossover but I'm going to speak about the most basic type - the subtree crossover. This crossover works in the following way:

Pick a random node $ N_1 $ in the first parent
Pick a random node $ N_2 $ in the second parent
Exchange the subtrees rooting in $ N_1 $ and $ N_2 $ between the two parents

Look at this example: And that's it! It's quite simple. Sometimes some tweaks are used, e.g. giving more preference to nodes that are deeper in the tree, but in essence, the basics are the same - just swapping subtrees.

Mutation

As there is subtree crossover, there is subtree mutation too. Again, it is based around mangling subtrees. However, contrary to the crossover, in mutation we don't exchange subtrees between multiple solutions but we modify only one. The most basic subtree mutation looks like this:

Randomly pick a node
Throw away that node (and its subtree) and replace it with a new randomly generated subtree

No illustration here - it's really that simple.

Demo

This is a little demo of GP in action. The task is to find the function based on a bunch of samples (the red points; the sought function is the dashed curve). The actual function that we want the GP to find is $$ f(x) = 5 \sin{x^2} $$ ATTENTION: you might find that the algorithm "runs away" with bigger and bigger and messier and messier functions. I suggest you stop the algorithm and start again. This algorithm is really not tuned very well, it's just a demo.

Clicking Start starts the evolution. Clicking Cancel during the evolution terminates it. You can play with the population size (must be positive even number), crossover and mutation probs (must be between 0 and 1) and the tournament size (must be between 2 and size of population). The evolution runs indefinitely (i.e. until you stop it). You can also play with the number of available data points (the more the easier it should be for the algorithm to find the solution). This thing is written in Dart and you can find the source here.

Not just functions

We can evolve functions. But one could say “Fine, but where's the programming in that?” and she'll be totally right. So far we can make mathematical expressions. But the tree representation goes beyond that. All you need to do is to represent a program like a tree. And it is quite simple. Let's imagine that we control an ant in a 2D grid world. The ant can turn left, turn right and decide what to do based on whether there is a piece of food just ahead of it. And here is what a program in this domain might look like: As you can see, the only difference is that the branching point (or non-terminal) is not an operator but a decision-making point and the leaves are not variables and constants but actions. In fact, there could be many more non-terminals, e.g. cycles, function calls etc.

Closure property

All the Genetic Programming I introduced above is called Koza-style GP because it is the idea of John R. Koza which he published in his book [1]. For this type of GP to work a so called closure property must hold. Closure property holds if an output of every terminal and non-terminal is acceptable by every non-terminal as an input. That menas that whatever our nodes output, it must be possible to chain it with an arbitrary node.

If our non-terminal set is made of plus, minus and multiplication and our terminal set of variables and numbers, closure property holds - however we add, subtract or multiply numbers, we always get a number that can be furter added to, subtracted from or multiplied. However, if we extended our non-terminal set by a non-terminal such as if that would take three children - a condition, a result when condition is true and a result when condition is false - then the closure property does not hold - the condition requires a boolean value but that is different from numbers. We would have to employ some rule that would define true and false in terms of numbers.

In fact, it would be enough to just add division to the non-terminal set. Yes, it would still produce only numbers but what if the second argument to a division resulted to zero? We can't divide by zero so division cannot accept zero as its second argument. You see, it is sufficient that a non-terminal doesn't accept some of the values even though they are of the same type.

Summary

In this long-awaited (I hope :)) post I described a little bit of what is known as Genetic Programming. It is basically a way of evolving mathematical expressions, computer programs or basically any tree-like structure using the principles of genetic algorithms. It might (or might not) seem as a brilliant technique able to solve anything... but don't let yourself be fooled. Finding functions and programs is really hard and most of the time this vanilla GP doesn't work very well. But it's a start and it can solve at least some easy problems.

Regarding further posts I'm going to publish here, I don't have any clear plan. I suppose I might dive into specific algorithms or things I do or something totally different. We'll see :).

^[1] Koza, John R. Genetic programming: on the programming of computers by means of natural selection. Vol. 1. MIT press, 1992.

Monday, July 20, 2015

Evolution in a Computer, Pt. 2

After a long waiting, it's here! To seamlessly continue from the previous part, I prepared a little demo of a genetic algorithm. What we have here is a so-called Facility location problem, or a variation of it.

There is a bunch of cities (50 in our case) and we want to supply the cities with goods so we must open some factories. We must decide how many factories we are going to open and in which cities. We won't explicitly assign which cities are to be supplied by which factories but a city will be assigned to the nearest factory (imagine the factories have unlmitted capacity). The catch: we want to minimize the total cost:

every opened factory costs us some fixed amount of money
delivering goods from a factory to a city costs us an amount of money equal to the (euclidean) distance from the factory to the city

Clearly, we cannot try every combination of cities with a factory as it's $2^{50} \approx 1.13 \cdot 10^{15}$ combinations. That would took almost 36 years if checking one solution took one microsecond. Yes, this problem is NP-hard. Instead, we are going to use a... Genetic Algorithm!

So how do we represent a solution to our problem? Well, we'll use a binary string, as in proper GA, with the number of bits equal to the number of cities. Then $1$ in position $n$ means that the $n$-th city has a factory and vice versa, e.g. a string like 001000110 would mean that the third, seventh and eighth city would have a factory while the first, second, fourth, fifth, sixth and ninth wouldn't. Now, give it a try!

Clicking Start starts the genetic algorithm from beginning with the current map. Clicking New map generates a new (random) map. Clicking Stop during the optimization stops it. You can edit the cost of opening a facility, size of population (must be positive even number), probability of crossover and mutation (must be between 0 and 1) and size of tournament (must be between 2 and size of population). The evolution runs indefinitely (i.e. until you stop it). The world is 100 by 50 units. This thing is written in Dart and you can find the source here.

Real-valued Evolution

So far we have spoken about a genetic algorithm that uses binary strings. This means that our solutions are inherently discrete. But what if we wanted to really find the minimum of the function from the part one?

The „standard“ way

The first way of dealing with real number is to just express them using a binary standard, e.g. the IEEE 754 single-precision floating-point format. But this has problems. Imagine numbers 1.763 and -2.611. These would look like this (e.g. using this tool):

1.763 = 00111111111000011010100111111100
-2.611 = 11000000001001110001101010100000

And now imagine what would happen when we crossed them over using single-point crossover after the 5th bit:

00111111111000011010100111111100
11000000001001110001101010100000
turns into
00111000001001110001101010100000 = 3.9840699e-05
11000111111000011010100111111100 = -115539.96875

One could say "Well, that escalated quickly!" and indeed, these problems arise all the time. If we explore all the possible crossovers (there are 32 of them resulting in 64 numbers) they include gems like -3.252161001305617e+19, 3.591135600494495e-39 and NaN. Bit-flip mutation is going to have similar problems.

The good way

Much better way than sticking to the binary representation is just use the real numbers. Then, if we have a one-dimensional problem (like in searching for the minimum of that function), the genotype is just one real number. This brings a lot of changes in the crossover and mutation operators.

As we have seen, just crossing over the binary representation is bad. Therefore we need to find a better way of combining two individuals. We would like the offspring(s) to be „close“ to the parents, but be different from them. Since we are in the real numbers, we can use the power of probability and random sampling! One of the simplest real-valued crossover is a so-called arithmetic crossover, which is basically a weighted mean of two (or more!) parents with randomly chosen weights. Or more formally:

$$ \mathbf{x}^{Offs} = \sum_{i=1}^{N_{Pa}} \alpha_i \mathbf{x}_i $$ where $\sum_{i=1}^{N_{Pa}} \alpha_i = 1, \alpha_i \in (0, 1)$; $\alpha_i$ are picked randomly for each offspring.

And there are many other crossover operators, e.g. BLX-α, SBX, PCX...

The mutation is easier. One of the most used way is to just add a randomly generated number (or vector if our genotype is more numbers) from a distribution of our choice. The most used distributions are the uniform distribution (where we need to specify the bounds) and the normal distribution (where we need to specify the variance).

Next...

The final part of this introductory mini-series, which will finally get us to what I'm actually doing, is going to be big. It is going to be about Genetic Programming. So stay tuned!

Saturday, July 18, 2015

GECCO - days 3, 4 and 5

I'm very sorry I didn't post updates on GECCO as it went. I was quite busy at the conference as well as tired at the end of the day(s). So I'll sum up the three days of the main conference in this one post.

Day 3

It was the first day of the main conference, i.e. the part where the full papers were presented in a 25 minutes long talks (20 + 5 for questions and discussion). It started with a keynote by Ricard Solé from Universitat Pompeu Fabra entitled „Re-designing Nature with Synthetic Biology: from Artificial Ants and Tissues to a New Biosphere“. He talked about designing new organisms to do useful stuff like fixing cracks in concrete or repairing damaged land. To be honest I didn't understand most of it very much but it was interesting.

Here are two interesting talks I have visited. They are not the only ones but these I find the most interesting and having the highest impact on my own research.

Building Predictive Models via Feature Synthesis

A great talk given by Una-May O'Reilly, from MIT, introduced a new approach for building predictive models. It is based on Genetic Programming and LASSO, a linear regression method that penalizes coeffitients size. I still have to read the paper but from what was told during the talk, but this is really a step forward in evolutionary machine learning as it is supposed to be fast and produce good models.

Examining the „Best of Both Worlds“ of GE

This talk I would call „a bad day for GE“. GE, or Grammatical Evolution, is an algorithm for Genetic Programming which happens to be the one I use very extensively (I promise I will write about this! Really! Sometimes...). In his talk, Peter A. Whigham has shown that the GE is not as good algorithm as it is thought of. In fact he has shown that it is only a little bit better than random search. If one wants to use grammar-based GP approach he recommends to use Context-free Grammar Genetic Programming (CFGGP), an approach older than GE.

Day 4

The fourth day, or the second day of the main conference, started with a keynote by Kate Smith-Miles from the Monash University, entitled „Visualising the Diversity of Benchmark Instances and Generating New Test Instances to Elicit Insights into Algorithm Performance“. The title sounds scary but it really is not that. Kate described the problem of current algorithm research procedure:

Propose a new algorithm which is (usually) a minor variation on the existing ones.
Measure the performance on this, this ... and this benchmark.
Report that my new algorithm is overall a little bit better than the state of the art.
Publish a paper.

She argued that this approach can only show strengths of the algorithm and not its weaknesses, and she raised a question whether the benchmarks that are commonly being used are really that diverse. She continued on with her talk by demonstrating a new approach for performance measurement of algorithms by computing features of the problems, projecting this space into 2D and there visualising the areas where particular algorithms are the most effective and where lay their weaknesses.

Interactively Evolving Compositional Sound Synthesis Networks

An interesting talk at the Digital Entertainment Technologies and Arts track given by Amy Hoover. The research she presented was about using CPPNs evolved by NEAT to produce sound synthesisers. All this happens through an interactive evolution, i.e. the user manually selects which „solutions“ are good. After all, you can try the very product of the research here. http://bthj.is/breedesizer/

An Efficient Structural Diversity Technique for Genetic Programming

The first of the three best paper nominees in the Genetic Programming track. The talk itself was not that great but the presented research, or the idea behind it, is brilliant. It was about enhancing diversity in GP, but with a very very simple yet powerful mechanism: you record first (few) levels of the trees, called genetic markers, and you record a „density“, or how many individuals have the same genetic marker, and you then use some multi-objective paradigm to optimize both the fitnes and the density.

Genetic Programming with Epigenetic Local Search

Second of the best paper nominees. The talk was about an epigenetic search in GP. In biology, epigenetics is a subfield of genetics which looks at the changes in gene expression other than the changes in the DNA (very simplified). Here it meant the ability of parts of the genotype to be simply switched off. Since this would be very difficult to do with classical tree GP, the Lee Spector's Push language was used. The genotypes simply carry a bit for each element in the genotype which simply means whether this element is to be used or not. The epigenetic local search then plays with these bits.

Memetic Semantic Genetic Programming

The last of the best paper nominees and actually the one that won (though I voted for the first one). The research was focused on GP for boolean algebra (and extended for finite algebras). The „trick“ was to analyze where an individual is wrong and repairing the subtrees that cause it to be wrong, using a static library of trees generated once at the start.

Day 5

The last day of the conference. It started with a „ceremony“ where best paper winners were presented, as well as the winners of competitions and other awards. Then the next GECCO place was presented, which is going to be Denver, Colorado, USA.

After the ceremony there was the last keynote, given by Manuel Martín-Loeches from Complutense University of Madrid entitled „Origins and Evolution of Human Language: Saltation or Gradualism?“. And as the title says, it was about the evolution of the human language. It was very interesting and the message is that Manurel Martín-Loeches stands by the theory that the language was a result of gradual evolution by small mutations, rather than one (or a few) big macro-mutation(s).

The last session of GECCO did not have any talks that were very relevant for my research so I looked mostly for things that were just interesting...

Novelty Search for Soft Robotic Space Exploration

An interesting talk about using Novelty Search for evolving soft robots for space exploration. The research was focused on evolving robots made of soft materials of varying properties for exploration of other bodies, like the Moon or Mars. It used indirect encoding by means of evolving CPPNs that produce the robot instead of evolving the robot directly. The CPPN told whether a voxel is to be filled with the material and if yes then with which material. The presentation was quite funny because the presenter showed us videos of the evolved robots moving, some of which being very funny.

Final summary

This was my first GECCO and also my first conference of such magnitude. In my opinion the conference ran very smoothly, there were no problems. I successfuly presented both my papers, one at the Student Workshop and the second one in the regular poster session. I had a chance to talk with other researchers and students which was quite nice. To sum up, I consider GECCO 2015 to have been great and I hope it was not the last one for me.

Monday, July 13, 2015

GECCO - day 2

„Hola!“ as they say here in Madrid. The day two of GECCO 2015 is over. Well, still not the GECCO itself but the workshops and tutorials before. I presented my student poster today and I would like to mention two of the tutorials I attended.

Evolutionary Computation: A Unified Approach

Given by Kenneth A. De Jong, it was a nice and simple tutorial about the general aspects of evolutionary computation. The main goal of the tutorial was to give a broad overview of various characteristics and aspets of evolutionary algorithm, evolution strategies, genetic algorithms etc. and extract all of these into a „unyfing“ field called Evolutionary Computation. The tutorial was very basic and I can't say I was taught something new, but Kenneth's overall presentation put all the things into a single perspective which was very good and excellent to sort thoughts out.

Expressive Genetic Programming

Given by Lee Spector (GitHub), this was also a nice tutorial heavily targeted at the Push programming language designed for genetic programming. The language is, one could say, weird: everything is on stacks, including the program itself, everything reads from stacks and pushes to them or somehow mangles with them. It is designed with minimal syntax constraints. It is for a reason - so that random programs can exist and be valid. It's worth giving a look. There are many implementations of Push, like Clojush which is in, you guessed it, Clojure! I decided I'll try to implement it in Haskell and name it Hush :). I already created a GitHub repo, but there is nothing there yet.

Stay tuned for the report from today which is the third day and also the first day of the main conference. It is (almost) filled with interesting talks, unfortunately some of them I will have to miss (because there are other ones taking place at the same time slot). Stay tuned...

Saturday, July 11, 2015

Hello from Madrid! GECCO - day 1

Hello everyone from Madrid, Spain! First of all, it is quite hellish weather here - no clouds whatsoever and close to 40°C (I won't tell you Farenheits because I hate that!) in the shadow. At least there is some wind blowing... Anyway, today was the first day of GECCO, or more precisely the workshops and tutorials before the main conference. My main focus was the Student Workshop where I presented my paper which I won't write about yet because the Evolution in Computer, pt. 3 is not out yet. As a matter of fact, pt. 2 is not out yet too for which I'm very sorry (but it's almost ready to be released, maybe I'll post it while still in Madrid, stay tuned).

Low or No Cost Distributed Evolutionary Computation

One of the very few tutorials I had time to attend was this one given by JJ Merelo of Universidad de Granada.

The @gecco2015 talk on "low or no cost distributed evolutionary algorithms" http://t.co/ixkMNaeeRs (+ last minute changes) /cc @jormaral
— JJ Merelo (@jjmerelo) July 11, 2015

It was about how to use (almost) free (to you) resources to do your evolutionary computation job. He claimed to use JavaScript because it is a standard, it is everywhere and you can write both client and server app (through node.js) and that you should to it in JavaScript because in the end you must write both the client and server so why not do both in the same language. Good point, but he probably does not know about Dart. That one has all the attributes of JavaScript he presented and it is not **** :). By the way, if you want to know more about Dart and things around, look at a blog of a friend of mine, Jana Moudrá, who is also a Google Developer Expert on Dart.

Back to the tutorial. The main point was to distribute the evolutionary computation using pool EA. Unfortunately, he didn't talk about that, instead he talked about the technical realization of such distributed computation. The idea is to use web browsers, which are the new operating systems, of ordinary people to run parts of your computation. In other words, turn people into computing power by „convincing“ them to visit your page that contains the client code that will make them take part in the distributed computation. The second main idea is to use (almost) free services to run the server part (he mentioned Heroku and OpenShift). The third idea is to tell about this to people using social networks (he mentioned Twitter) to get as many „clicks“ as possible. But not to smuggle the dirty computation hidden somewhere in an otherwise innocent page, but be open, honest and give „something“ in return. That something is to be knowledge, e.g. code on github.

Should I rate the tutorial, I would say I'm a little bit disappointed. I expected more about the algorithmic part, but on the other hand the idea of using web browsers of ordinary people is brilliant.

That concludes the day one at the Genetic and Evolutionary Computation Conference or GECCO in Madrid, Spain. Stay tuned for the next day.

Monday, June 22, 2015

Evolution in a Computer

As I already mentioned in The First post, my main interest is the evolutionary computation, so I decided I'll write about some basics of it.

Optimisation

Because evolution is in fact a sort of optimisation, what that one actually is? Generally, optimisation is a process of developing or tuning a solution to some problem such that the solution is as good as we can get.

Example: suppose we want to find the minimum (e.g. because it is the money we have to spend) of the following function:

$$ f(x) = \mathrm{e}^x\left(x^3 - 3x^2 + 5x − 5\right) + 6 $$

In this case it is super-easy - we can just compute the derivative

$$ f'(x) = \mathrm{e}^x x \left(x + 1\right) \left(x - 1\right) $$

and then find the points where it equals zero

$$ x = \left\lbrace \begin{matrix} -1 \\ 0 \\ 1 \end{matrix} \right. $$

and from them pick the one with the lowest value

$$ f(-1) \approx 0.849688 \quad f(0) = 1 \quad f(1) \approx 0.563436 $$

... and the winner is... $x = 1$!

Black-Box Optimisation

The previous example was nice, because we had the analytical description of the function (and it was very "user-friendly"). But what if $f(x)$ is not differentiable (e.g. it's discrete)? Or what if $f(x)$ is a black-box and we know nothing about it?

We might want to systematically try some points, e.g. in a grid, or pick them randomly and try luck... but what if the process of getting $f(x)$ is expensive? What if $x$ is a set of parameters of an aerodynamical body and evaluating $f(x)$ means physically manufacturing the body and testing it in a wind tunnel? It is clear that we have to find some clever way of exploring the space of $x$.

Evolutionary Algorithms

Believe it or not, natural evolution is a process of optimisation. What is the $x$ and $f(x)$ then? Well, in nature, $x$ is the DNA that determines the organism (very simplified) and $f(x)$ is the success of the particular organism of surviving and reproducing in the given environment. So natural evolution optimizes the organisms to surivive and reproduce well in the environment; it seeks the "solution" to the life in the environment.

Nature is, generally, quite successful in finding good solutions. We mimick nature all the time (insect-like robots, bat-like sonar...) and evolutionary algorithms (EAs) are no exception.

The general structure of an EA is following:

initialize a population of solutions \(P\)
while not stopping condition
    \(\forall x \in P:\) compute \(f(x)\)
    select parents from \(P\)
    generate children from parents
    mutate children
    put children back in \(P\)
    possibly remove (kill) some individuals in \(P\)
return the individual with the best \(f(x)\) encountered

Genetic Algorithm

Genetic Algorithm (GA) is one particular type of EA. There are others but GA is probably the most famous and the one being extended and based upon the most. So how does the GA fit to the EA structure described above?

In GA, the genotype of a solution, or $x$, is a string of 0s and 1s of predefined fixed length (depends on the problem we are solving).

The population is a fixed-sized collection of solutions (the right word would be genotype but most of the time the difference is not important so I'll use solution or individual).

Selection is the crucial part of the algorithm. That's the part "where it all happens". Selection's job is to pick such individuals from $P$ which are better than the others but in a non-greedy way. One of the most widely used selection strategies is the Tournament selection.

Generating children from parets is done by crossover. In general, crossover is a process of combining the genotypes of two (or more) parents to create a genotype of one (or more) child. One of the most simple (and also the most widely used) crossover strategy is the One-point (or single-point) crossover.

Mutation is the second crutial part of the algorithm (together with crossover). It's the part where new genetic material is created. Mutation consists of a small random change in the genotype. The easiest and the most simple mutation is just to randomly pick one (or more) bits and flip their values.

The last two steps, that are putting children back into $P$ and killing some individuals in $P$, are usually bound together. In a so-called generational GA, the process of selecting parents, crossing them over to children and mutating the children is repeated until there is a number of children equal to the size of $P$ and then the $P$ is discarded and replaced with the generated children. On the other hand, in a so-called steady-state GA, only a few parents are selected (usualy just enough so that crossover can be performed) producing only a few children (defined by the crossover method). These few children are mutated and then they are put back into $P$ such that the size of $P$ remains the same. This means that some individuals must be killed. The strategy of chosing those individuals is called replacement strategy and has to be chosen. A very simple strategy might be just killing the worst individual(s).

Both the crossover and mutation are usually applied probabilistically, i.e. there is a predefined probability (chosen by the user) of two parents being subject to crossover (if they are not, they are usually just copied so that the children are identical to their parents) and another probability (also chosen by the user) of a child being subject to mutation. Crossover probability is usually rather high (0.8 is a usual value) while the mutation probability is usually very low (e.g. 0.01 per bit).

Final words

GA is definitely the only EA in town. I also only scratched the surface. The Evolution in a Computer is probably going to become a little mini-series covering some aspects of evolutionary computation. In the next post I'll write about other EAs which are more suitable for continuous problems. And also await a post about evolutionary programming! But it is really too early for that one.

Thursday, June 11, 2015

The First

So this is the very first post...

... so I'll start with a very brief introduction of myself. My name is Jan Žegklitz. I'm a PhD student at Czech Technical University in Prague. I study artificial intelligence, evolutionary algorithms and programming in particular. I also like programming in general, trying new languages and stuff...

In this blog I'm going to write about various things from my academic/hobby stuff. The nearest big one will be the GECCO conference in Madrid, Spain, which starts exactly in one month, where I'm presenting two papers:

Model Selection and Overfitting in Genetic Programming: Empirical Study (as a poster at the main conference)
Symbolic Regression by Grammar-based Multi-Gene Genetic Programming (at the Student Workshop)

I'll write more about these later...