Sunday, February 26, 2017

Quickie: new twitter username

So, I decided it's time to grow up and changed my Twitter username from @Gandalv to @zegkljan. Feel free to follow, tweet, or ignore :).

Tuesday, October 11, 2016

Schlieren imaging system

Hello and welcome to another blog post. This one is going to be neither about genetic algorithms or programming nor about space-related stuff. Today's blog post is about a nice physics phenomenon called the schlieren effect and its application to visualising air flows. There are going to be a lot of pictures, and if you get to the end, a video is waiting for you. Pleasant reading!

What is the schlieren effect?

The word "schlieren" comes from the German language and is, in fact, a plural of a word "schliere" which means "streak", so schlieren means streaks. And streaks are what all this is about, but let's call it schlieren. The schlieren in question are schlieren in gas. For example, when there is a heat source, the air around this source heatens up which makes it less dense which means that it also refracts light differently than the surrounding cold air. That leads to schlieren produced by light passing thourgh this non-homogeneous environment. You can see this effect every summer - look over a roof of a car that has been standing in sunlight for some time. You are going to see the image from behind the car subtly shaking as the hot air rises from the roof.

What is schlieren photography and how does it work?

Except for extreme cases like hot air rising from a black car on a summer day, schlieren are often next to invisible to the naked eye. For example, when you light a candle, you can't see much more than the flame itself. Another example might be the air stream from a hairdryer. However, with a proper schlieren photography setup, you can visualise these subtle differences.

There are several ways of visualising the tiny differences in air (gas) density. Here I'm going to focus on probably the simplest and cheapest one, the single mirror setup. This setup requires three key components:

  • a point light source,
  • a concave spherical mirror (parabolic can also work) and
  • a sharp edge.
The mirror must have a focal length (half its radius of curvature in the case of a spherical mirror) on the order of meters.

Now, the best way to illustrate how it actually works is with a picture:

As you can see, the point light source is placed at double the focal length from the mirror and a little bit off-axis. Double the focal length is the mirror's radius of curvature. If you put something at the center of a reflective sphere, its image gets reflected back onto itself. In this case the point gets reflected back to a point on the opposite side of the optical axis (technically, because it is off-axis, it does not focus exactly into a point, but the difference is so small that it can be ignored). The sharp edge, e.g. a razor's edge, is placed so that it passes exactly through the focused point of light. A camera is placed behind the edge so that it looks straight into the mirror through the focused point at the edge. The testing area is in front of the mirror.
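To attach numbers to this geometry: a spherical mirror's focal length is half its radius of curvature, so the source-to-mirror distance works out to $$ d = 2f = R, $$ which for the 1400 mm focal length mirror described below means placing the light source and the edge \( 2 \times 1400\ \mathrm{mm} = 2800\ \mathrm{mm} \) from the mirror.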

Normally, when there is no disturbance in the path of the light, it is just focused back to a point and the camera captures uniform brightness across the mirror. However, even a small disturbance in the air (gas) density very slightly bends the light. The light gets reflected from the mirror, and the slight deviation gets magnified as it travels towards the focus point. Since the light is diverted from its original path, it's going to be out of focus. Some of this disturbed light will get blocked by the razor, causing a dark spot in the image, and the other part will "dodge" the razor, causing a bright spot in the image.

When everything is set up and you light a candle in front of the mirror, it might look like below. This is the very first image I captured on my schlieren imaging setup:

First light #schlieren


My schlieren setup

My schlieren setup and the making of it are heavily inspired by this great guide, so everything said there more or less applies to my setup as well.

The mirror

The most expensive and most important component is the mirror. I have a primary telescope mirror: spherical, 160 mm in diameter, with a focal length of 1400 mm. That means the light source and the edge are 2800 mm from the mirror. I got my mirror on eBay from optics21406 for $69.99. I can say that the mirror is great, very precise.

The mirror looks like this:

New toy has arrived. 160 mm diameter, 1400 mm focal length, spherical.


It's a nice and robust piece of glass. It weighs about a kilogram! Quite a lot for what one would call a small mirror :).

Regarding the mirror: if you want to build your own schlieren system, forget about using your magnifying bathroom mirror. These are VERY imprecise. You need to get a telescope-grade mirror. The focal length must not be too short, otherwise the light might not get enough distance to be diverted far enough to be blocked by the razor. Also, size matters. Much. The bigger the mirror, the bigger the testing area is going to be. So forget about laser optics too - those are very small (since laser beams tend to be very narrow).

The mirror mount is very homemade :). Sticking to the guide I mentioned above, it's made of cardboard, specifically from the box the mirror arrived in:

Homemade mirror mount for #schlieren. Made from the box the mirror arrived in.


The light source and the edge

I use a super-bright white LED (3.4 V, 20 mA, I forgot the luminosity) shining through a piece of aluminium foil with a small hole in it. I power it from a 9 V battery through a 270 Ω resistor. The edge is just a simple razor edge. I have it all "mounted" and taped on a piece of polystyrene, which is itself mounted (taped) on a tripod. It looks like this:

#schlieren light source. 3.4 V, 20 mA super bright LED.


To be honest, this image was taken before I set the camera up. After I had set it up, I had to cover the LED from behind so that it wouldn't produce stray light.
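As a side note, the resistor value checks out with a quick Ohm's-law calculation (assuming the usual LED-in-series-with-resistor wiring): $$ R = \frac{V_\mathrm{battery} - V_\mathrm{LED}}{I_\mathrm{LED}} = \frac{9\ \mathrm{V} - 3.4\ \mathrm{V}}{20\ \mathrm{mA}} = 280\ \Omega, $$ so 270 Ω is simply the nearest standard value, driving the LED at roughly its rated current.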

The result

Finally, I recorded a video of some stuff in front of the mirror. To be specific, you will see a candle being lit and moved around, a hairdryer blowing on its own and onto an obstacle (my hand), and also the famous chemical reaction of baking soda and vinegar, producing carbon dioxide (which is heavier than air). Enjoy!

What can be improved

My current setup is far from perfect and there are some things that can certainly be improved.

The light source

The super-bright LED is very fine, but the rest is not. I think I could make the hole in the aluminium foil even smaller, which would give me better resolution of the disturbances. Also, as you saw in the photo above, the backwards reflection from the aluminium foil shone a lot into my camera's lens and I had to additionally cover it with a spare lens cover. The best solution would be to make a nice box with a switch, a proper place for the battery, a tripod thread for secure mounting, and maybe some additional screws for precise alignment.

The mirror

The mirror is amazing! I could, of course, use a bigger one, but that would be at a completely different budget level... But back to the troubles. It is terribly easy to get it dirty, and mine did get dirty. For telescope usage that is not a big deal, since there the whole mirror forms the whole image "collectively", so even with quite a significant amount of dust (judging by eye) it can still produce crystal clear images. Here, however, every spot on the mirror makes its own part of the image. And cleaning a telescope mirror is not an easy task if you really don't want to damage the protective and reflective coatings. I tried and kind of failed. Luckily, it was the good type of failure: I failed at removing the dirt, not by scratching the mirror.

The razor

In the video you could see two phenomena. The first one is that the hairdryer was much less visible when blowing horizontally than when blowing vertically. This has to do with the orientation of the razor: if the razor had been aligned horizontally, it would have been the other way around. The second one is that sometimes, e.g. when I put something in front of the mirror and it shook, some of the stuff in the mirror that looked like dirt shook too. That means it was not dirt on the mirror but some other aberration somewhere else. I suspect it could be the razor.

Another thing I would like to try is replacing the opaque razor with a color filter. The effect would then be colored instead of black and white. It could be quite nice :). Another idea that occurred to me was to replace the razor with yet another pinhole (about the same size as the one in front of the LED), or with two perpendicular razors, so that disturbances in all/both directions would be pronounced. But I'm not very sure about this.

Other stuff

As you could see, the colors in the video are wildly red. That's because I screwed up the white balance. I also had a lot of trouble with the stability of the setup, as I had set it up on carpet, so the razor kept sliding to one side all the time. Also, my tripods are far from ideal. The one with the light source is no big deal, but my camera tripod is a mess. It's very old, some pieces are broken, and it's not very stiff, which is kind of a problem when you need to align the camera precisely :). Some intermediate piece between the tripod and the camera, with extra collimation screws for precise alignment, would be awesome, but that is a pie in the sky for now.

I hope you enjoyed this first experiment. I will do more and better ones and report about them here. If you have any comments or questions, feel free to leave them below.

Sunday, September 11, 2016

ESA Citizens' Debate 2016

Yesterday, on Saturday, 10th September 2016, I took part in the first ever Citizens' Debate about space-related matters organized by ESA, the European Space Agency. This debate took place in all 22 member countries of ESA.

Each country had up to 100 people participating in the debate. Since I'm Czech and live in the Czech Republic, I was in Prague, the capital of the Czech Republic. The Prague debate was organized by the Technology Centre of the Czech Academy of Sciences and was held at the Masaryk dormitory. In order to participate in the debate, one had to apply through a web form, stating, among other information, some demographic data: age, job, region of residence, etc. The debate organizers then selected from the people who applied so as to cover as broad a part of the demographic spectrum as possible. I think this was successful, as I saw people of all genders, ages, shapes and sizes :).

The goal of the debate was for ESA to get a glimpse of what ordinary people think and feel about space in general and about ESA and what it is doing. The organisation of the debate was as follows. The participants were divided into small groups of about 8 people (at least in Prague) who sat around a table and debated the topics. The debate was divided into 5 distinct parts, called sequences, each dedicated to a particular topic. For each sequence, each participant received a questionnaire with a bunch of questions, and the people around the table then debated over the questions in the questionnaires. Moreover, there were facilitators: people who directed and constrained the small debates to keep them on topic and who also extracted and wrote down interesting ideas and opinions that came up in the debates. The questionnaires themselves, however, were filled in individually. In addition to the 5 sequences, there were two more questionnaires (at the beginning and at the very end) which were to be filled in completely individually, without any discussion.

There were three big topics whose discussion was closely watched, and whose essence was written down, by the facilitators. The first one was the issue of space debris and how to handle it. The second big topic was ESA financing. The last one was the future of space exploration, in the form of what we would like to see happen if we had the power to decide.

Overall, the debate was interesting and, as Johann-Dietrich Wörner, the Director General of ESA, said, the first such debate in history. ESA said that it would listen to the debate results very closely. The results of the debate (the first data are already coming in) and more info about the whole thing can be found at www.citizensdebate.space. The materials used in the debate should soon be available for everyone in the "Replica debate" section.

Tuesday, August 02, 2016

Symbolic Regression and Multi-Gene Genetic Programming

I'm back, with a blog post about something useful. Sorry for the long wait, I was really, really busy. Today's topic is a class of techniques for speeding up Genetic Programming algorithms in Symbolic Regression scenarios.

Regression vs. Classification

First of all, let me do a quick introduction/clarification regarding regression and classification. These two are machine learning tasks, similar in part, but different in the end. In both tasks we have a bunch of objects, each described by a set of numbers (not necessarily numbers, but for the sake of simplicity numbers will do great).

In classification the goal is to assign a class to an object based on the numbers it is described by. The class can be any discrete value. Examples are: true/false, red/green/blue, high/middle/low, a 1/2/3/4/5/6/7/8/9/10 rating, or any label. The special case with just two classes also has a special name: binary classification. A typical classification problem is spam detection - is an e-mail spam or ham?

In regression the goal is to assign a numerical value to each object, again based on the numbers describing that object. Generally, these assigned numbers can have any value. Typical regression problems are various prediction and estimation problems. For example, suppose there is some complicated machine and you want to predict fuel consumption based on some operational data.

Symbolic Regression

Symbolic Regression is a class of tasks for GP systems. Actually, it is the very first example I mentioned in Evolution in a Computer, pt. 3: we have a bunch of data points and we want to find a function that fits them accurately (enough). It doesn't matter where the data points come from, be it anything from a demonstrative problem as in that blog post to data from medical scans from which we need to assess a patient's condition. Whenever we want to find a function fitting some data, it is symbolic regression.

The word symbolic is quite important. It means that the output model is a mathematical formula that could, in principle, be read and interpreted by a human, or, for example, be given to a computer algebra system (CAS) to compute derivatives, find roots, etc. This is in contrast with, e.g., a neural network, which is characterized by its topology (how many neurons there are and how they are connected) and the values of the weights (how strongly pairs of neurons are connected). That is not something one could look at and see a pattern in, nor could it be given to some CAS, even though it still does regression.

So, from now on, let's talk about Symbolic Regression, or SR for short.

Problem of SR

If you played around with the SR example from the already mentioned previous blog post, you might have noticed that it does not work very well. In the better case it just took quite a long time to get to some sensible function. In the worse case it failed completely and got stuck with some obscure function.

There are several reasons why this does not work that well. I would like to write about one of them: fine-tuning the parameters. Vanilla GP works with a set of terminals and non-terminals and evolves functions by juggling around pieces of mathematical expressions, hoping to get meaningful results. However, this juggling is often also the only mechanism for getting specific numbers into the final function. Imagine that the true function (i.e. what we would like the evolution to find) is \( f(x) = \sin(3.18x - 6.57) \). The GP algorithm is very unlikely to have the numbers 3.18 and 6.57 in its terminal set. It must manipulate what it has to get (close to) these numbers.

Multi-Gene Genetic Programming

The idea of Multi-Gene Genetic Programming, or MGGP for short, is to decompose the problem into more, presumably simpler, subproblems. However, the decomposition is not exact, as (almost) nothing around GP is. MGGP addresses part of the problem with fine-tuning parameters.

How MGGP works

In classical GP, each individual is one tree, which is also the solution to the problem; in the case of SR it is the expression which is supposed to describe the data, and its output is the regression estimate. In MGGP, in contrast to GP, each individual is made of multiple trees which together make up the solution. These trees are called genes.

But how do we get a single number out of multiple expressions? The answer is linear combination. That means that the output of every gene is multiplied by some constant and then they are all summed together - basically a weighted sum of the genes. Moreover, in MGGP there is always an implicit gene which is always equal to the number 1. The reason for such a gene is that we get an additive term for free.

Example: let's have these three genes, with (for simplicity) only one variable to work with, \(x\): $$ g_1(x) = x $$ $$ g_2(x) = \sin(x) $$ $$ g_3(x) = \cos(x) $$ and of course there is always the implicit gene of 1; let's give it the index of 0: \(g_0(x) = 1\). Now, to get the final expression, there are some constants, let's call them \( w_i \), which multiply the individual genes. The final expression would be, in this case: $$ f(x) = w_0 g_0(x) + w_1 g_1(x) + w_2 g_2(x) + w_3 g_3(x) $$ or, to make it simpler (because \( g_0(x) \) is always going to be 1): $$ f(x) = w_0 + w_1 g_1(x) + w_2 g_2(x) + w_3 g_3(x) $$

By now you should have an idea, but you might be wondering: where do we get the constants (\( w_i \)) from? Yes, that is the key to MGGP. And the trick is really simple: we use the power of linear regression. Linear regression is a kind of regression analysis (SR is another kind of regression analysis) which produces models in the form of a linear combination of the input features. So, for example, if our problem has two input features, linear regression produces models like \( f(x_1, x_2) = w_0 + w_1 x_1 + w_2 x_2 \).

There are several ways to get the constants \( w_i \), but probably the most widely used is the ordinary least-squares solution. Let the input features be arranged in a matrix \( \mathbf{X} \) so that each datapoint (its input features) is one row of the matrix, and let the target values be arranged in a column vector \( \mathbf{y} \) so that the \(i\)-th element corresponds to the \(i\)-th row of the matrix \( \mathbf{X} \). If we have these, the vector of the constants, also called weights, can be computed as:

$$ \mathbf{w} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} $$
This way we get the weights that multiply the input features, but we don't get the additive term \( w_0 \). This is easily fixed by prepending a column of ones to the matrix \( \mathbf{X} \).

An identical procedure is applied in MGGP. Instead of using the plain input variables, we first compute “new” features by computing the outputs of the genes - each gene produces one new feature - and of course we create the implicit feature of 1. An important fact is that the computation of the weights is done at evaluation time, i.e. the weights are always computed optimally with respect to the genes that are available in the individual.
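To make this concrete, here is a minimal sketch in Python/NumPy of how one MGGP individual's weights could be computed at evaluation time (the genes here are hand-written stand-ins; in a real MGGP system they would be evolved trees):

import numpy as np

# Hypothetical genes of one individual.
genes = [lambda x: x, np.sin, np.cos]

def evaluate_individual(genes, x, y):
    # One column per gene, plus the implicit column of ones (g_0 = 1)
    # that provides the additive term w_0.
    X = np.column_stack([np.ones_like(x)] + [g(x) for g in genes])
    # Ordinary least squares; lstsq is numerically safer than forming
    # the inverse (X^T X)^-1 explicitly.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w, X @ w  # the weights and the individual's predictions

x = np.linspace(0, 10, 100)
y = 2.0 + 0.5 * x + 3.0 * np.sin(x)  # toy target data
w, y_hat = evaluate_individual(genes, x, y)
print(w)  # close to [2.0, 0.5, 3.0, 0.0]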

Additional operators

Since there are multiple genes, the crossover operator is split into two types:

  • Low-level crossover - this is a classical crossover known from GP: a random node is chosen in each individual and the subtrees rooted in these nodes are swapped.
  • High-level crossover - a new type of crossover arising from the fact that there are multiple genes: whole genes (one or more) are swapped between the individuals (see the sketch below).
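A rough sketch of one possible form of high-level crossover, assuming an individual is simply a list of gene trees (the exact bookkeeping differs between implementations such as GPTIPS):

import random

def high_level_crossover(parent1, parent2):
    # Swap k randomly chosen whole genes between the two individuals.
    child1, child2 = list(parent1), list(parent2)
    k = random.randint(1, min(len(child1), len(child2)))
    for i, j in zip(random.sample(range(len(child1)), k),
                    random.sample(range(len(child2)), k)):
        child1[i], child2[j] = child2[j], child1[i]
    return child1, child2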

What is it good for

The major benefit of this approach is that it lifts the duty of finding the linear parts from the shoulders of the underlying GP, which then needs to focus only on the non-linear parts. It also provides a kind of decomposition of the problem: multiple, possibly smaller, genes work together to create a reasonably good solution. This addresses the problem of fine-tuning the parameters to a certain degree - the top-level parameters, i.e. the linear coefficients, do not need to be found by evolution. However, the constants hidden inside the non-linear parts are not addressed by this mechanism and are sought traditionally (in GP terms).

References

  • GPTIPS - the leading implementation (and maybe the only one really used) of (not just) MGGP by Dominic Searson.
  • Hinchliffe MP, Willis MJ, Hiden H, Tham MT, McKay B & Barton, GW. Modelling chemical process systems using a multi-gene genetic programming algorithm. In Genetic Programming: Proceedings of the First Annual Conference (late breaking papers), 56-65. The MIT Press, USA, 1996.
  • Searson DP, Willis MJ & Montague GA. Co-evolution of non-linear PLS model components, Journal of Chemometrics, 21, 592-603, 2007.

Monday, January 18, 2016

Quickie: zegkljan.net

Hi folks, this is just a quick one. I have finally set up my own domain zegkljan.net. So, from now on, you can find my blog at blog.zegkljan.net. The domain itself currently redirects to the blog but in the future I want to have some other stuff there as well. Of course, the original address zegkljan.blogspot.com still works and redirects to the blog.

Monday, November 16, 2015

Evolution in a Computer, pt. 3

After months of waiting, I welcome you to the final part of my humble introductory mini-series about evolutionary computation. And today's topic is Genetic Programming.

Discovering functions

Remember the example of finding the minimum of a function from the first part of this mini-series? We had a function we didn't know anything about (actually, we did, but that was just for the sake of showing what we can do when we know) and we wanted to find a minimum of that function. Now, let's look at a different problem: we don't have the function at all, only a handful of \((x, y)\) pairs, and we want to find out what the function actually looks like. How do we do it?

If we restrict ourselves to a specific set of functions, e.g. linear functions, we can use specialized methods like (multiple) linear regression that will just throw the function at us given the data. Period. However, this has the problem of being restricted to only a specific set of functions. If the "secret" function was actually a linear function, this would be the best we could do, but if it wasn't, we would get an erroneous function.

We could take a different approach and use more complicated machine learning methods like artificial neural networks (ANNs), which have the universal approximation property (simplified: given enough neurons they are, in principle, capable of representing almost any function with arbitrarily low error). However, their problem is that they can be immensely complex, and if one wanted to use one as a mathematical expression, it would be a mess.

As you might already have guessed, we are going to use... an evolutionary algorithm! Well, we know how to evolve binary strings and real numbers... how do we use that to find a function? Well, it is not straightforward. You could design some algorithm that would translate a binary string into a function, but it would probably be very messy and inefficient, not to speak of the fixed length of the binary strings, which would be a major problem. However, there is a representation that can be used... trees!

Tree representation

A mathematical expression can be described as a tree. Let's take some made-up expression like $$ (x + \sin{y})^2 - \cos{3x} $$ Where is the tree in there? It's easy - the operators (i.e. addition, multiplication...) and functions (i.e. sin, cos...) are inner nodes, and the variables (i.e. x, y...) and constants (i.e. numbers) are leaves of the tree. [Tree diagram: "-" at the root, with a "^" subtree (holding "+" over x and sin y, and the constant 2) on one side and "cos" over "*" (holding 3 and x) on the other.]
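Such a tree is straightforward to represent in code. Here is a minimal Python sketch (just one of many possible representations, not how any particular GP library does it), using nested tuples where the first element is the node's label and the rest are its children:

import math

# (x + sin(y))^2 - cos(3*x) as a nested tuple tree: inner nodes are
# operators/functions, leaves are variables or constants.
tree = ('-',
        ('^', ('+', 'x', ('sin', 'y')), 2),
        ('cos', ('*', 3, 'x')))

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b, '^': lambda a, b: a ** b,
       'sin': math.sin, 'cos': math.cos}

def evaluate(node, env):
    if isinstance(node, tuple):   # inner node: apply its operator
        op, *children = node
        return OPS[op](*(evaluate(c, env) for c in children))
    if isinstance(node, str):     # leaf: look up a variable
        return env[node]
    return node                   # leaf: a constant

print(evaluate(tree, {'x': 1.0, 'y': 2.0}))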

Evolution of trees

So now that we have a representation for math expressions, how do we actually evolve them? Well, pretty much in the same way as in a classical GA: by selection, recombination and mutation. Selection is identical - the better solutions (in this case those with lower error) are preferred over the worse ones. Recombination, i.e. crossover, and mutation are the more interesting ones.

Crossover

There are more types of crossover, but I'm going to speak about the most basic one - the subtree crossover. This crossover works in the following way:

  1. Pick a random node \( N_1 \) in the first parent
  2. Pick a random node \( N_2 \) in the second parent
  3. Exchange the subtrees rooted in \( N_1 \) and \( N_2 \) between the two parents
Look at this example: [Diagram: the two parent trees with randomly picked nodes N1 and N2, and the same trees after the subtrees rooted in N1 and N2 have been swapped.] And that's it! It's quite simple. Sometimes some tweaks are used, e.g. giving more preference to nodes that are deeper in the tree, but in essence, the basics are the same - just swapping subtrees.
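For the curious, here is how subtree crossover might look in code, reusing the nested-tuple trees from the sketch above (a node is addressed by the path of child indices leading to it; real GP systems do this bookkeeping more efficiently):

import random

def all_paths(node, path=()):
    # Collect the address (tuple of child indices) of every node.
    paths = [path]
    if isinstance(node, tuple):
        for i, child in enumerate(node[1:]):
            paths += all_paths(child, path + (i,))
    return paths

def get_subtree(node, path):
    for i in path:
        node = node[i + 1]  # children start at index 1, label is at 0
    return node

def replace_subtree(node, path, new):
    if not path:
        return new
    children = [replace_subtree(c, path[1:], new) if j == path[0] else c
                for j, c in enumerate(node[1:])]
    return (node[0], *children)

def subtree_crossover(parent1, parent2):
    p1 = random.choice(all_paths(parent1))  # 1. random node N1
    p2 = random.choice(all_paths(parent2))  # 2. random node N2
    s1, s2 = get_subtree(parent1, p1), get_subtree(parent2, p2)
    return (replace_subtree(parent1, p1, s2),  # 3. swap the subtrees
            replace_subtree(parent2, p2, s1))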

Mutation

As there is subtree crossover, there is subtree mutation too. Again, it is based around mangling subtrees. However, contrary to crossover, in mutation we don't exchange subtrees between multiple solutions but modify only one. The most basic subtree mutation looks like this:

  1. Randomly pick a node
  2. Throw away that node (and its subtree) and replace it with a new randomly generated subtree
No illustration here - it's really that simple.
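Still, in code it is only a few lines. A sketch reusing the all_paths and replace_subtree helpers from the crossover example (grow stands for any random-tree generator; this one is a deliberately primitive made-up example):

import random

def grow(depth, ops=('+', '*'), leaves=('x', 1, 2, 3)):
    # Randomly generate a small tree; stop early with some probability.
    if depth == 0 or random.random() < 0.3:
        return random.choice(leaves)
    return (random.choice(ops), grow(depth - 1), grow(depth - 1))

def subtree_mutation(parent, max_depth=3):
    path = random.choice(all_paths(parent))        # 1. pick a random node
    return replace_subtree(parent, path, grow(max_depth))  # 2. replace it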

Demo

This is a little demo of GP in action. The task is to find a function based on a bunch of samples (the red points; the sought function is the dashed curve). The actual function that we want the GP to find is $$ f(x) = 5 \sin{x^2} $$ ATTENTION: you might find that the algorithm "runs away" with bigger and bigger and messier and messier functions. In that case I suggest you stop the algorithm and start again. This algorithm is really not tuned very well, it's just a demo.

Clicking Start starts the evolution. Clicking Cancel during the evolution terminates it. You can play with the population size (must be a positive even number), the crossover and mutation probabilities (must be between 0 and 1) and the tournament size (must be between 2 and the size of the population). The evolution runs indefinitely (i.e. until you stop it). You can also play with the number of available data points (the more there are, the easier it should be for the algorithm to find the solution). This thing is written in Dart and you can find the source here.

Not just functions

We can evolve functions. But one could say “Fine, but where's the programming in that?” and she'd be totally right. So far we can make mathematical expressions. But the tree representation goes beyond that. All you need to do is represent a program as a tree. And it is quite simple. Let's imagine that we control an ant in a 2D grid world. The ant can move forward, turn left, turn right, and decide what to do based on whether there is a piece of food just ahead of it. And here is what a program in this domain might look like: if food_ahead() then forward() else turn_left(). As you can see, the only difference is that the branching point (or non-terminal) is not an operator but a decision-making point, and the leaves are not variables and constants but actions. In fact, there could be many more non-terminals, e.g. cycles, function calls etc.

Closure property

All the Genetic Programming I introduced above is called Koza-style GP because it is the idea of John R. Koza, which he published in his book [1]. For this type of GP to work, a so-called closure property must hold. The closure property holds if the output of every terminal and non-terminal is acceptable by every non-terminal as an input. That means that whatever our nodes output, it must be possible to feed it to an arbitrary node.

If our non-terminal set consists of plus, minus and multiplication and our terminal set of variables and numbers, the closure property holds - however we add, subtract or multiply numbers, we always get a number that can be further added to, subtracted from or multiplied. However, if we extended our non-terminal set with a non-terminal such as if, taking three children - a condition, a result when the condition is true and a result when it is false - then the closure property would not hold: the condition requires a boolean value, which is different from numbers. We would have to employ some rule that defines true and false in terms of numbers.

In fact, it would be enough to just add division to the non-terminal set. Yes, it would still produce only numbers, but what if the second argument to a division evaluated to zero? We can't divide by zero, so division cannot accept zero as its second argument. You see, it is sufficient that a non-terminal doesn't accept some of the values, even though they are of the same type.
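The usual fix in Koza-style GP is so-called protected division: division is simply redefined to return a fixed value (conventionally 1) when the divisor is zero, which restores the closure property. In code:

def protected_div(a, b):
    # Defined for every input, so its output can be chained with any
    # other node; returning 1 on (near-)zero divisors is the usual choice.
    return a / b if abs(b) > 1e-9 else 1.0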

Summary

In this long-awaited (I hope :)) post I described a little bit of what is known as Genetic Programming. It is basically a way of evolving mathematical expressions, computer programs or basically any tree-like structure using the principles of genetic algorithms. It might (or might not) seem like a brilliant technique able to solve anything... but don't let yourself be fooled. Finding functions and programs is really hard, and most of the time this vanilla GP doesn't work very well. But it's a start, and it can solve at least some easy problems.

Regarding further posts I'm going to publish here, I don't have any clear plan. I suppose I might dive into specific algorithms or things I do or something totally different. We'll see :).


[1] Koza, John R. Genetic Programming: On the Programming of Computers by Means of Natural Selection. Vol. 1. MIT Press, 1992.

Monday, July 20, 2015

Evolution in a Computer, Pt. 2

After a long wait, it's here! To seamlessly continue from the previous part, I prepared a little demo of a genetic algorithm. What we have here is a so-called facility location problem, or a variation of it.

There is a bunch of cities (50 in our case) and we want to supply the cities with goods, so we must open some factories. We must decide how many factories to open and in which cities. We won't explicitly assign which cities are to be supplied by which factories; instead a city will be assigned to the nearest factory (imagine the factories have unlimited capacity). The catch: we want to minimize the total cost:

  • every opened factory costs us some fixed amount of money
  • delivering goods from a factory to a city costs us an amount of money equal to the (euclidean) distance from the factory to the city
Clearly, we cannot try every combination of cities with a factory, as there are \(2^{50} \approx 1.13 \cdot 10^{15}\) combinations. That would take almost 36 years if checking one solution took one microsecond. Yes, this problem is NP-hard. Instead, we are going to use a... Genetic Algorithm!

So how do we represent a solution to our problem? Well, we'll use a binary string, as in proper GA, with the number of bits equal to the number of cities. Then a \(1\) in position \(n\) means that the \(n\)-th city has a factory, and vice versa. E.g. a string like 001000110 would mean that the third, seventh and eighth cities have a factory while the first, second, fourth, fifth, sixth and ninth don't. Now, give it a try!
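Before you do: this is what the algorithm is actually minimizing. A sketch of the cost function in Python (the demo itself is written in Dart; the city coordinates and the opening cost are the map's inputs):

import math

def total_cost(bits, cities, opening_cost):
    # bits[i] == 1 means city i hosts a factory.
    factories = [cities[i] for i, b in enumerate(bits) if b]
    if not factories:
        return float('inf')  # no factory at all cannot supply anything
    cost = opening_cost * len(factories)
    for (cx, cy) in cities:
        # Every city is supplied by its nearest factory.
        cost += min(math.hypot(cx - fx, cy - fy) for (fx, fy) in factories)
    return cost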

Clicking Start starts the genetic algorithm from the beginning with the current map. Clicking New map generates a new (random) map. Clicking Stop during the optimization stops it. You can edit the cost of opening a facility, the size of the population (must be a positive even number), the probabilities of crossover and mutation (must be between 0 and 1) and the size of the tournament (must be between 2 and the size of the population). The evolution runs indefinitely (i.e. until you stop it). The world is 100 by 50 units. This thing is written in Dart and you can find the source here.

Real-valued Evolution

So far we have spoken about a genetic algorithm that uses binary strings. This means that our solutions are inherently discrete. But what if we wanted to really find the minimum of the function from part one?

The "standard" way

The first way of dealing with real numbers is to just express them using a binary standard, e.g. the IEEE 754 single-precision floating-point format. But this has problems. Imagine the numbers 1.763 and -2.611. These would look like this (e.g. using this tool):

 1.763 = 00111111111000011010100111111100
-2.611 = 11000000001001110001101010100000
And now imagine what would happen when we crossed them over using single-point crossover after the 5th bit:
00111111111000011010100111111100
11000000001001110001101010100000
turns into
00111000001001110001101010100000 = 3.9840699e-05
11000111111000011010100111111100 = -115539.96875
One could say "Well, that escalated quickly!" and indeed, these problems arise all the time. If we explore all the possible crossovers (there are 32 of them resulting in 64 numbers) they include gems like -3.252161001305617e+19, 3.591135600494495e-39 and NaN. Bit-flip mutation is going to have similar problems.

The good way

A much better way than sticking to the binary representation is to just use the real numbers directly. Then, if we have a one-dimensional problem (like searching for the minimum of that function), the genotype is just one real number. This brings a lot of changes to the crossover and mutation operators.

As we have seen, just crossing over the binary representation is bad. Therefore we need to find a better way of combining two individuals. We would like the offspring to be "close" to the parents, but different from them. Since we are in the real numbers, we can use the power of probability and random sampling! One of the simplest real-valued crossovers is the so-called arithmetic crossover, which is basically a weighted mean of two (or more!) parents with randomly chosen weights. Or more formally:

$$ \mathbf{x}^{Offs} = \sum_{i=1}^{N_{Pa}} \alpha_i \mathbf{x}_i $$ where \(\sum_{i=1}^{N_{Pa}} \alpha_i = 1, \alpha_i \in (0, 1)\); \(\alpha_i\) are picked randomly for each offspring.
And there are many other crossover operators, e.g. BLX-α, SBX, PCX...

Mutation is easier. One of the most used ways is to just add a randomly generated number (or vector, if our genotype is more than one number) drawn from a distribution of our choice. The most used distributions are the uniform distribution (where we need to specify the bounds) and the normal distribution (where we need to specify the variance).
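Both operators fit in a few lines. A sketch of the two-parent arithmetic crossover and a normal-distribution mutation in Python/NumPy (for more parents, draw more weights summing to one):

import numpy as np

rng = np.random.default_rng()

def arithmetic_crossover(x1, x2):
    # Weighted mean of the parents with a random weight.
    alpha = rng.uniform(0.0, 1.0)  # alpha_1; alpha_2 = 1 - alpha_1
    return alpha * x1 + (1.0 - alpha) * x2

def gaussian_mutation(x, sigma=0.1):
    # Add normally distributed noise to every coordinate.
    return x + rng.normal(0.0, sigma, size=x.shape)

parent1, parent2 = np.array([1.0, 2.0]), np.array([3.0, -1.0])
child = gaussian_mutation(arithmetic_crossover(parent1, parent2))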

Next...

The final part of this introductory mini-series, which will finally get us to what I'm actually doing, is going to be big. It is going to be about Genetic Programming. So stay tuned!