Tag Archives: AI

Learning AI – Part 3: Genetic Algorithms – Improving the algorithms and more advanced topics

 

Introduction

In this part, the final one (for the moment) on Genetic Algorithms, I will introduce some new topics, operators, and ways of thinking about and improving your Genetic Algorithm.

Most of the topics discussed here I gathered from the book AI Game Programming Wisdom 2 in the section regarding Genetic Algorithms.

Previous parts:

Learning AI – Part 1: Genetic Algorithms Intro

Learning AI – Part 2: Genetic Algorithms Continued

Source codes:

I do have to warn you though: because I am moving on to neural networks at the moment, I only created the base code, adapted from the book and its C++ sources, for my Unity project to be used later. So there are some problems with the code at the moment, but the samples can still serve as examples for you to modify and use however best fits you.

Also keep looking at this GitHub address for updates to the GA sample classes used in the previous posts and this one: https://github.com/lionadi/NatureOfCodePlayground/tree/master/AIEngineNew/Assets/Scripts/AI

Contents

Introduction

Niching techniques

Explicit fitness sharing

Speciation

The compatibility function

Co-evolution

The Michalewicz method

Mutation operators

Crossover operators

Genetic programming

Growth in Genetic Algorithms

Growth

Environment

Neutral networks

Comparing parametrization and growth

A few helper functions

Bibliography

 

Niching techniques

“Can be great for retaining population diversity, and are particularly useful where fitness landscape might contain multiple peaks or where it is essential to protect new innovation within a population.” (Rabin, 2003)

Explicit fitness sharing

“Is a mechanism where individuals with similar genetic properties are grouped together. Each individual’s score is then divided by the number of genomes within its group.

newFitness = oldFitness / numberOfNeighbors

This punishes the fitness scores of individuals who have many similar neighbors, thereby preventing any one group from growing too large and taking over the population.” (Rabin, 2003)

Speciation

“Goes one step further by separating the genomes into groups in the same way as explicit fitness sharing, but this time individuals are only allowed to breed with members of their own species. Typically, a species is killed when either its size decreases to zero or its fitness has not increased within a user-defined number of generations. This means that the individuals that would normally have died out early in the evolution of a population remain active for much longer, protected among their species members. Because of the protection you can experiment with much larger mutation rates than normal.” (Rabin, 2003)

The compatibility function

“To determine if one genome should belong in the same species as another, you must use a function that compares two gene strings and returns a compatibility distance. If the distance is below a user-defined threshold, then the genomes are considered to be in the same species. This compatibility function varies considerably depending on the encoding representation.

dist = Σ|xᵢ − yᵢ| / n

n = the number of genes in each chromosome

xᵢ and yᵢ = the values of gene i in two different individuals of the population

Each iteration of the genetic algorithm, the compatibility function is used to test every individual against the first member in each species. If the distance is within a user-defined threshold, then the individual is added to the appropriate species. If the individual is incompatible with all the current species, then a new species is created and the individual is added to that.” (Rabin, 2003)

Sample code on the species class which will hold different species:



public class Species
 {
 //this is the genome all other genomes in the population are
 //compared to see if they should be included in this species or not
 Host SampleGenome;

 List<Host> Members;

 //the number of generations that has passed without the fittest
 //member in the species showing an improvement.
 int numGensNoImprovement;

 float expectedNumberOfOffspring;

 float bestEverFitness;

 //the combined fitness scores of every member
 float totalFitness;

 //it's often useful to have an identity number for the species
 int id;



 public Species(Host firstMember, int id)
 {
 this.Members = new List<Host>();
 this.id = id;
 this.SampleGenome = firstMember;
 this.expectedNumberOfOffspring = 0;
 this.numGensNoImprovement = 0;
 Members.Add(firstMember);

 bestEverFitness = firstMember.Fitness();

 totalFitness += firstMember.Fitness();
 }

 public void AddGenome(Host genome)
 {
 Members.Add(genome);

 totalFitness += genome.Fitness();

 if (genome.Fitness() > bestEverFitness)
 {
 bestEverFitness = genome.Fitness();

 //fitness has improved so this can be reset
 numGensNoImprovement = 0;
 }
 }

 /// <summary>
 /// selects a genome from the species using tournament selection.
 ///
 /// As it is set presently it uses a high selection pressure.
 /// </summary>
 /// <returns></returns>
 public Host SpawnGenome()
 {
 //first chose the number in the tournament. selection_pressure must be
 //between zero and 1.0
 int NumInComp = 0; float selection_pressure = 0.5F;

 while (NumInComp < 1)
 {
 NumInComp = (int)(NumMembers() * selection_pressure);

 selection_pressure += 0.1F;
 }

 int winner = 0;

 for (int i = 0; i < NumInComp; ++i)
 {
 //Next's upper bound is exclusive, so this covers all members
 int ThisTry = RandomProvider.RND.Next(0, Members.Count());

 if (Members[ThisTry].Fitness() > Members[winner].Fitness())
 {
 winner = ThisTry;
 }
 }

 return Members[winner];
 }


 /// <summary>
 /// makes sure the sample genome is always the genome with the highest fitness found so far
 /// </summary>

 public void UpdateSampleGenome()
 {
 if (Members.Count > 0)
 {
 if (Members.Last().Fitness() > SampleGenome.Fitness())
 {
 SampleGenome = Members.Last();
 }

 else
 {
 Members[0] = SampleGenome;
 }
 }

 ++numGensNoImprovement;
 }

 public void Clear()
 {
 SampleGenome = Members[0];
 expectedNumberOfOffspring = 0.0F;
 Members.Clear();
 totalFitness = 0.0F;
 }

 public Host Sample() { return SampleGenome; }

 public float ExpectedOffspring() { return expectedNumberOfOffspring; }

 /// <summary>
 /// adjusts the fitness scores so they are set to the value of their expected number of offspring
 /// </summary>

 /// <param name="totalFitnessForPopoulation"></param>
 /// <param name="PopSize"></param>
 public void SetExpectedOffspring(float totalFitnessForPopoulation, int PopSize)
 {
 //check that we have some fitness scores to work with
 if (totalFitnessForPopoulation == 0.0) return;

 expectedNumberOfOffspring = 0.0F;

 for (int gen = 0; gen < Members.Count(); ++gen)
 {
 float ExpectedForThisIndividual =

 (Members[gen].Fitness() / totalFitnessForPopoulation) * PopSize;

 expectedNumberOfOffspring += ExpectedForThisIndividual;
 }
 }

 public int GenerationsNoImprovement() {return numGensNoImprovement;}

 public bool Empty() {return (Members.Count() == 0);}

 public float BestEverFitness() {return bestEverFitness;}
 public float TotalFitness() {return totalFitness;}

 /// <summary>
 /// this method applies explicit fitness sharing by dividing each member's
 /// fitness by the number of members in the species
 /// </summary>

 public void FitnessShare()
 {
 float NewTotal = 0.0F;

 //divide each member's fitness by the number in the species
 for (int m = 0; m < Members.Count(); ++m)
 {
 Members[m].SetFitness(Members[m].Fitness() / NumMembers());

 NewTotal += Members[m].Fitness();
 }

 totalFitness = NewTotal;
 }

 public int ID() {return id;}
 int NumMembers() {return Members.Count();}

 public bool IsLessThan(Species species)
 {
 return (this.BestEverFitness() < species.BestEverFitness());
 }

}


The helper functions to be used with the Species class:


 /// <summary>
 /// Separate the population into species. This separates the individuals into species of similar genomes.
 /// </summary>

 public void Speciate(ref List<Host> population)
 {
 //first clear the existing members and kill off any non developing
 //species
 this.Species.Clear();

 //now separate the population into species
 for (int gen = 0; gen < population.Count; ++gen)
 {
 bool bAdded = false;

 foreach(Species curSpecies in this.Species)
 {
 //calculate the compatibility score
 double cs = Compatibility(population[gen], curSpecies.Sample());

 //if the compatibility score is less than our tolerance then
 //this genome is added to the species
 if (cs < CompatibilityTolerance)
 {
 curSpecies.AddGenome(population[gen]);
 bAdded = true;

 break;
 }
 }//next species

 if (!bAdded)
 {
 //not compatible with any current species so create a new
 //species
 Species.Add(new Species(population[gen], NextSpeciesID++));
 }
 }//next genome

 //update all the species to make sure their sample member is set
 //to the best genome found so far. Kill off any empty species
 for (int x = this.Species.Count - 1; x >= 0; --x)
 {
 var curSpecies = this.Species[x];
 curSpecies.UpdateSampleGenome();

 if ((curSpecies.Empty() ||
 (curSpecies.GenerationsNoImprovement() > GenerationsAllowedWithoutImprovement)) &&
 (Species.Count > 1))
 {
 Species.RemoveAt(x);
 }
 }
 }

 /// <summary>
 /// this allocates a compatibility score between two genomes. If the
 /// score is below a certain threshold then the two genomes are
 /// considered to be of the same species
 /// </summary>
 /// <param name="g1"></param>
 /// <param name="g2"></param>
 /// <returns></returns>
 public float Compatibility(Host g1, Host g2)
 {
 //genomes of different lengths cannot be compared, so treat them
 //as maximally incompatible
 if (g1.DNA.Genes.Count != g2.DNA.Genes.Count) return float.MaxValue;

 float RunningTotal = 0.0F;

 for (int gene = 0; gene < g1.DNA.Genes.Count; ++gene)
 {
 //RunningTotal += Mathf.Abs(g1.Genes[gene] - g2.Genes[gene]);
 RunningTotal += Vector2.Distance(g1.DNA.Genes[gene], g2.DNA.Genes[gene]);
 }

 return RunningTotal / g1.DNA.Genes.Count;
 }

 /// <summary>
 /// this method calculates the amount of offspring each species
 /// should produce.
 /// </summary>
 /// <param name="AmountNeeded"></param>
 public void CalculateExpectedOffspring(int AmountNeeded)
 {
 //first calculate the total fitness of all active genomes
 float TotalFitness = 0.0F;

 foreach (Species curSpecies in this.Species)
 {
 //apply fitness sharing first
 curSpecies.FitnessShare();

 TotalFitness += curSpecies.TotalFitness();
 }

 //now it is necessary to calculate the expected amount of offspring
 //from each species
 double expec = 0.0;

 foreach (Species curSpecies in this.Species)
 {
 curSpecies.SetExpectedOffspring(TotalFitness, AmountNeeded);

 expec += curSpecies.ExpectedOffspring();
 }
 }

 /// <summary>
 /// this sorts the species by their best-ever fitness.
 /// </summary>
 public void SortAndAssignVisualAid()
 {
 if (this.Species.Count <= 0) return;

 //OrderByDescending does not sort in place, so assign the result back
 this.Species = this.Species.OrderByDescending(o => o.BestEverFitness()).ToList();
 }



Co-evolution

“If you have different species competing with each other, they have to work harder at solving the problem. If the fitness of one is somehow inversely proportional to the fitness of the other, you have competition. This can greatly increase the speed and quality of the evolution process.” (Rabin, 2003)
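The idea can be sketched in code. Below is a minimal illustration of my own (not from the book), using the Host type from these samples; RunTrial is a hypothetical helper you would implement for your own problem:

```csharp
//sketch: competitive fitness scoring between two co-evolving populations.
//Each predator is matched against a prey individual; the pair's scores
//sum to a constant, so one side's gain is the other side's loss.
void ScoreCompetitively(List<Host> predators, List<Host> prey)
{
    for (int i = 0; i < predators.Count && i < prey.Count; ++i)
    {
        //RunTrial is a hypothetical helper returning a value in [0, 1]
        //describing how well the predator did against this prey individual
        float outcome = RunTrial(predators[i], prey[i]);

        predators[i].SetFitness(outcome);
        prey[i].SetFitness(1.0F - outcome);
    }
}
```

Because one side's fitness falls as the other's rises, neither population can stand still: this is the competitive pressure the quote describes.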

The Michalewicz method

“Contains several mutation and crossover operators that are used in combination to fully exploit the characteristics of real value encoding GAs.

Mutation operators

  • Boundary mutation: With probability p, this operator changes the value of a gene to either its minimum possible value, Gmin, or its maximum possible value Gmax.
  • Replace mutation: With probability p, this operator resets a gene to a uniform random value between Gmin and Gmax.
  • Non-Uniform mutation: With probability p, this operator adjusts a gene’s size by a small random amount, the limits of which decrease over time.

Crossover operators

  • Arithmetical crossover: This simply averages the values of the genes at each locus.
  • Simple crossover: this operator is the same as a single point crossover.
  • Heuristics crossover: Given the parents G1 and G2 this operator creates an offspring according to Equation: Child = r(G2 – G1) + G2
    • Variable G2 is the fitter of the two parents, and r is a random number between 0 and 1.

Alone, these operators produce bad results, but combined they have a tendency to produce fitter offspring.

Tip: Instead of an adaptive probability distribution for the operators, giving each operator the same, equal probability can be faster while losing little in performance.” (Rabin, 2003)


/// <summary>
 /// displaces a gene's value by a small random amount (limited by
 /// MutationDelta)
 /// </summary>
 /// <param name="genes"></param>
 public void MutateUniform(ref List<Vector2> genes)
 {
 for (int i = 0; i < genes.Count; i++)
 {
 //do we mutate this gene?
 if ((float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1) < MutationRate)
 {
 float angle = UnityEngine.Random.Range(0, AIConstants.TWO_PI);

 //displace the gene by a small random vector, limited by MutationDelta
 genes[i] += (new Vector2(UnityEngine.Mathf.Cos(angle), UnityEngine.Mathf.Sin(angle))) * UnityEngine.Random.Range(0, this.MutationDelta);
 }
 }
 }
 }

 /// <summary>
 /// displaces the genes value by an amount described by a normal distribution
 /// </summary>
 /// <param name="genes"></param>
 public void MutateGaussian(ref List<Vector2> genes)
 {
 for (int i = 0; i < genes.Count; i++)
 {
 //do we mutate this gene?
 if ((float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1) < MutationRate)
 {
 //Box-Muller transform: a normally distributed value with
 //mean 0 and standard deviation 0.1
 float u1 = UnityEngine.Mathf.Max((float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1), 0.000001F);
 float u2 = (float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1);
 float gaussian = UnityEngine.Mathf.Sqrt(-2.0F * UnityEngine.Mathf.Log(u1)) * UnityEngine.Mathf.Cos(AIConstants.TWO_PI * u2) * 0.1F;

 //displace the gene along a random direction by the gaussian amount
 float angle = UnityEngine.Random.Range(0, AIConstants.TWO_PI);
 genes[i] += new Vector2(UnityEngine.Mathf.Cos(angle), UnityEngine.Mathf.Sin(angle)) * gaussian;
 }
 }
 }

 /// <summary>
 /// Sets a gene to either the min or max possible value (Vector2.zero or
 /// mutationBoundryValueMax in this implementation)
 /// </summary>
 /// <param name="genes"></param>
 public void MutateBoundary(ref List<Vector2> genes)
 {
 for (int i = 0; i < genes.Count; i++)
 {
 float angle = UnityEngine.Random.Range(0, AIConstants.TWO_PI);
 //pick a random vector and use it to decide between the min and max boundary
 Vector2 tempVector = new Vector2(UnityEngine.Mathf.Cos(angle), UnityEngine.Mathf.Sin(angle)) * UnityEngine.Random.Range(0, AIConstants.maxforce);
 if (tempVector.x < mutationBoundryValueMin.x && tempVector.y < mutationBoundryValueMin.y)
 {
 genes[i] = Vector2.zero;
 }
 else
 {
 genes[i] = this.mutationBoundryValueMax;
 }
 }
 } 
 
 

 public void MutateMichalewicz(ref List<Vector2> genes)
 {
 float chance = (float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1);

 if (chance <= 0.333)
 {
 MutateBoundary(ref genes);
 }

 else if (chance >= 0.667)
 {
 MutateUniform(ref genes);
 }

 else
 {
 MutateReplace(ref genes);
 }
 }
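MutateMichalewicz above calls a MutateReplace operator that I have not shown. Here is a minimal sketch of what it could look like, assuming the same RandomProvider and AIConstants helpers used in the other operators; it resets a gene to a new uniform random vector, matching the replace mutation described earlier:

```csharp
/// <summary>
/// resets a gene to a uniform random value between its minimum and
/// maximum possible values (here: a random direction scaled by a random
/// magnitude up to AIConstants.maxforce)
/// </summary>
/// <param name="genes"></param>
public void MutateReplace(ref List<Vector2> genes)
{
    for (int i = 0; i < genes.Count; i++)
    {
        //do we mutate this gene?
        if ((float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1) < MutationRate)
        {
            float angle = UnityEngine.Random.Range(0, AIConstants.TWO_PI);

            //replace the gene with a completely new random vector
            genes[i] = new Vector2(UnityEngine.Mathf.Cos(angle), UnityEngine.Mathf.Sin(angle))
                * UnityEngine.Random.Range(0, AIConstants.maxforce);
        }
    }
}
```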

/// <summary>
 /// the Michalewicz crossover operator choses one of three different
 /// crossover operators based on an even probability
 /// </summary>
 /// <param name="mum"></param>
 /// <param name="dad"></param>
 /// <param name="baby1"></param>
 /// <param name="baby2"></param>
 public void CrossoverMichalewicz(ref List<Vector2> mum, ref List<Vector2> dad, ref List<Vector2> baby1, ref List<Vector2> baby2)
 {
 if (((float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1) > CrossoverRate) || (mum == dad) || mum.Count <= 0 || dad.Count <= 0)
 {
 baby1 = mum;
 baby2 = dad;
 return;
 }

 float chance = (float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1);

 if (chance <= 0.333)
 {
 CrossoverTwoPoint(ref mum, ref dad, ref baby1, ref baby2);
 }

 else if (chance >= 0.667)
 {
 CrossoverAverage(ref mum, ref dad, ref baby1, ref baby2);
 }

 else
 {
 CrossoverHeuristic(ref mum, ref dad, ref baby1);
 CrossoverHeuristic(ref mum, ref dad, ref baby2);
 }
 }

/// <summary>
 /// This crossover operator simply averages the genes at each locus
 /// </summary>
 /// <param name="mum"></param>
 /// <param name="dad"></param>
 /// <param name="baby1"></param>
 /// <param name="baby2"></param>
 public void CrossoverAverage(ref List<Vector2> mum, ref List<Vector2> dad, ref List<Vector2> baby1, ref List<Vector2> baby2)
 {
 for (int gen = 0; gen < mum.Count; ++gen)
 {
 baby1.Add((mum[gen] + dad[gen]) * 0.5F);
 baby2.Add((mum[gen] + dad[gen]) * 0.5F);
 }
 }

 /// <summary>
 /// this operator creates a child whose genes are biased towards the
 /// fitter parent.
 ///
 /// The heuristic used is r(V1-V2) + V1 where V1 is the fitter parent
 /// </summary>
 /// <param name="fittest"></param>
 /// <param name="not_so_fit"></param>
 /// <param name="baby1"></param>
 public void CrossoverHeuristic(ref List<Vector2> fittest, ref List<Vector2> not_so_fit, ref List<Vector2> baby1)
 {

 float rate = (float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1);

 //iterate down the length of the genome using the heuristic
 for (int gene = 0; gene < fittest.Count; ++gene)
 {
 Vector2 NewGeneValue = fittest[gene] + rate *
 (fittest[gene] - not_so_fit[gene]);
 baby1.Add(NewGeneValue);
 }
 }

Genetic programming

“Genetic programming is a powerful extension to evolutionary computing. It allows considerable flexibility, allowing the application programmer to develop solutions of more diverse structure than those provided by genetic algorithms. Grammatical evolution is one way to implement genetic programming while minimizing the computational expense related to the creation and subsequent elimination of nonsense organisms.” (Rabin, 2003)

Growth in Genetic Algorithms

Idea: Enhancements through flexibility

  • Growth
  • Neutral networks
  • Variable-length genome
  • Speciation
  • Co-evolution.

Growth

Look at nature for examples of how to give more “life” to GAs, so that they do not look static, as if created by mechanical code. In nature things grow rather than being strictly parametrized. Look at growth methods from biology, chemistry, etc. (Rabin, 2003)

Environment

An example would be to look at the environment something lives in and how the two affect one another. What factors, things, and properties of both the environment and the entity play into environment-based growth? When does growth stop for a living organism? Take into consideration how the growth is going to be terminated. Look again into nature. (Rabin, 2003)

Neutral networks

A concept that increases evolvability. Have some junk DNA switched on inside an important host. This junk DNA is then subject to the same rules of evolution. It might not produce great results at first, but every now and then it might prove to be an advantage. It increases the probability of finding a better solution.

Implementation ideas:

  • Scanning through the genome with a logic that determines interaction values, which can be changed by a single mutation, for example.
  • Variable-length genomes: Have unequal crossover points. Both parents have their own crossover points so a child can have a genome much bigger or smaller than its two parents, which allows evolution to control the complexity of the solutions.
  • Another example would be to duplicate a gene once it is identified, or to just make random duplications during crossover.

This would mean that the system cannot be a simple parametrization system.

(Rabin, 2003)
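The unequal crossover idea above can be sketched as follows (my own illustration, not from the book): each parent gets its own, independently chosen cut point, so the children can end up with longer or shorter genomes than either parent.

```csharp
//sketch: variable-length crossover with unequal crossover points.
//baby1 = mum's head + dad's tail, baby2 = dad's head + mum's tail,
//so the children's genome lengths can differ from both parents'.
public void CrossoverUnequal(ref List<Vector2> mum, ref List<Vector2> dad,
    ref List<Vector2> baby1, ref List<Vector2> baby2)
{
    int cutMum = RandomProvider.RND.Next(0, mum.Count);
    int cutDad = RandomProvider.RND.Next(0, dad.Count);

    baby1.AddRange(mum.Take(cutMum));
    baby1.AddRange(dad.Skip(cutDad));

    baby2.AddRange(dad.Take(cutDad));
    baby2.AddRange(mum.Skip(cutMum));
}
```

Because genome lengths now vary, the fitness and compatibility functions would also need to cope with genomes of different lengths.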

Comparing parametrization and growth

Aspect          | Parametrization                 | Growth
Encoding scheme | + Intuitive (explicit)          | - Incomprehensible
                | +/- Environment independent     | +/- Environment dependent
                | + Immediate mapping             | - Needs growth period
Evolvability    | - Restricted/fixed complexity   | + Potentially limitless
                | - Designer biased               | + Evolution free to play
                | - Mutation effect fixed         | + Neutral networks
                | - Complexity tied to DNA length | + Controllable mutation
                | +/- Just optimization           | + Specialization

(Rabin, 2003)

A few helper functions


/// <summary>
 /// this function calculates the variance of the population
 /// </summary>
 /// <param name="hosts"></param>
 /// <returns></returns>
 float CalculateVariance(ref List<Host> hosts)
 {
 float RunningTotal = 0.0F;

 //first iterate through the population, summing the squared
 //deviations from the average fitness
 for (int gen = 0; gen < hosts.Count; ++gen)
 {
 RunningTotal += (hosts[gen].DNA.Fitness - this.AverageFitnessScore) *
 (hosts[gen].DNA.Fitness - this.AverageFitnessScore);
 }

 return RunningTotal / hosts.Count;
 }

 /// <summary>
 /// calculates the fittest and weakest genome and the average/total 
 /// fitness scores.
 /// </summary>
 /// <param name="source"></param>
 public void CalculateBestWorstAverageTotalFitnessScore(ref List<Host> source)
 {
 this.TotalFitnessScore = 0;
 this.BestFitnessScore = 0;
 this.WorstFitnessScore = float.MaxValue;
 foreach (Host host in source)
 {
 float normalizedFitnessScore = host.DNA.Fitness;

 if (normalizedFitnessScore < this.WorstFitnessScore)
 this.WorstFitnessScore = normalizedFitnessScore;

 if (normalizedFitnessScore > this.BestFitnessScore)
 this.BestFitnessScore = normalizedFitnessScore;

 this.TotalFitnessScore += normalizedFitnessScore;
 }

 this.AverageFitnessScore = this.TotalFitnessScore / source.Count;
 }


Bibliography

Rabin, S. (Ed.). (2003). AI Game Programming Wisdom 2. Charles River Media.


Learning AI – Part 1: Genetic Algorithms Intro

Hi,

This is my first post in a series of Artificial Intelligence posts on AI learning. I got interested in the subject a few weeks ago when I started to work on some ideas for a game prototype I had in my mind. I was reading a book on game AI and turned the pages to a section about AI learning. It was fascinating :)! Anyway, I remembered one of my older projects where I played around with physics and nature, so I decided to go back to it and to the book that inspired me to look at nature and how it works.

So in this first post I’ll show you some sample C# code for a simple Genetic Algorithm that solves a text puzzle. In other words, I am using a GA to find, as quickly as possible, an output from the algorithm that matches the text given as its input.

To start with, if you are not familiar with Genetic Algorithms, this source is a good place to start; my sample code at the moment is based on this page, and it is where I will continue from in the future parts of these upcoming blog posts on learning AI: http://natureofcode.com/book/chapter-9-the-evolution-of-code/

Logic and code

So let’s start. Oh, and remember, I am just starting to learn this, so double-check the info and test for yourself :):

To put it in simple words, a GA is about inheriting data from generation to generation, where the candidates closest to the desired goal are chosen, until the desired outcome is achieved generations later. This is a simplification of what is going on, and there are more variables to take into consideration. I will keep things as simple as I can and hope that you refer to the link above, or to more technical sources, for more in-depth knowledge.

To start with, a GA needs two things:

  1. genotype => Digital information that is passed down from generation to generation
  2. phenotype => this is the expression of data, how the data is going to be represented. This could be data that tells a system where an object is on the screen.

In this first part, where the code sample is trying to solve a text problem, the genotype and phenotype are one and the same class, called DNA:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace AIEngine.DataStructures
{
 public class DNA : IDNA
 {
 public List<Char> Genes { get; set; }
 public float Fitness { get; set; }
 public String Target { get; set; }

 public DNA()
 {
 this.Genes = new List<char>();
 }

 public DNA(int genesCount, String target)
 {
 this.InitializeGenes(genesCount);
 this.Target = target;
 }

 public void InitializeGenes(int genesCount)
 {
 this.Genes = new List<char>();
 if (genesCount < 0)
 throw new Exception("The genesCount must not be negative");
 else if (genesCount == 0)
 return;

 int index = 0;
 do
 {
 this.Genes.Add((char)RandomProvider.RND.Next(32, 128));
 index++;
 } while (index < genesCount);
 }

 public void EvalutateFitness()
 {
 int score = 0;
 for (int i = 0; i < this.Genes.Count; i++) 
 {

 if (this.Target[i] == this.Genes[i])
 score++;
 }

 this.Fitness = (float)score/this.Target.Length;
 }

 public IDNA Crossover(IDNA partner)
 {
 DNA child = new DNA(0, this.Target);
 int midpoint = RandomProvider.RND.Next(this.Genes.Count);

 for (int x = 0; x < this.Genes.Count; ++x)
 {
 if (x > midpoint)
 child.Genes.Insert(x, this.Genes[x]);
 else
 child.Genes.Insert(x, partner.Genes[x]);
 }

 return child;
 }

 public void Mutate(float mutationRate)
 {
 for (int i = 0; i < this.Genes.Count; i++) {
 float rndValue = (float)RandomProvider.GetRandomNumber(RandomProvider.RND, 0, 1);
 if (rndValue < mutationRate)
 {
 this.Genes[i] = (char)RandomProvider.RND.Next(32, 128);
 }
 }
 }

 public override string ToString()
 {
 return new StringBuilder().Append(this.Genes.ToArray()).ToString();
 }
 }
}

What this class does is generate random characters within a certain ASCII range. It also evaluates a fitness for a DNA object, to be used when deciding if the DNA will reproduce. To keep up variation and increase the chances of finding a solution faster, a crossover between two DNA objects is made. And as a last measure, to ensure that the best possible solution is achieved, mutation is introduced: a random character is replaced, at random times, in a DNA sequence of characters.

Next we will look at the population control class 😀

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using AIEngine.DataStructures;

namespace AIEngine
{
 public class Population
 {
 /// <summary>
 /// Change this value to alter how fast a problem is solved
 /// </summary>

 private int populationCount = 1000;
 private String target = "TO BE OR NOT TO BE";
 private List<DNA> population;
 private List<DNA> matingPool;
 private int perfectScore;
 private Boolean finished;
 private int generations;

 /// <summary>
 /// Change this value to alter how fast a problem is solved
 /// </summary>

 private float mutationRate = 0.01F; // This value of 0.01 with a population count of 1000 seems to be generating the fastest result with monte carlo mating

 /// <summary>
 ///
 /// </summary>
 /// <param name="target"></param>
 /// <param name="mutationRate"></param>
 /// <param name="populationCount"></param>
 public Population(String target, float mutationRate, int populationCount)
 {
 this.target = target;
 this.mutationRate = mutationRate;
 this.populationCount = populationCount;

 population = new List<DNA>();
 matingPool = new List<DNA>();
 for (int x = 0; x < this.populationCount; x++)
 {
 population.Add(new DNA(target.Length, target.ToUpper()));
 }

 this.CalculateFitness();

 this.finished = false;
 this.generations = 0;
 this.perfectScore = 1;
 }

 /// <summary>
 /// In each iteration we calculate the fitness of each DNA sequence to be used later in the algorithm logic
 /// </summary>

 public void CalculateFitness()
 {
 for (int x = 0; x < population.Count; x++)
 {
 population[x].EvalutateFitness();
 }
 }

 /// <summary>
 /// Here the algorithm implements a selection method for choosing the best DNA sequences from the population.
 /// </summary>

 public void NaturalSelection()
 {
 bool duplicateFound = true;
 matingPool.Clear();

 for (int x = 0; x < population.Count; x++)
 {
 //while (duplicateFound)
 {
 // Monte carlo method (faster than the commented-out approach below)
 int a = RandomProvider.RND.Next(population.Count);
 int b = RandomProvider.RND.Next(population.Count);
 int populationIndexChosen = -1;

 if (population[a].Fitness > population[b].Fitness)
 {
 populationIndexChosen = a;
 }
 else
 {
 populationIndexChosen = b;
 }

 matingPool.Add(population[populationIndexChosen]);

 //if (!matingPool.Exists( o => o.ToString() == population[populationIndexChosen].ToString()))
 //{
 // matingPool.Add(population[populationIndexChosen]);
 // duplicateFound = false;
 //}

 //int n = (int)(population[x].Fitness * 100);
 //for (int j = 0; j < n; j++)
 //{

 // matingPool.Add(population[x]);
 //}
 }
 }
 }

 /// <summary>
 /// Next we will generate a new population based on algorithmic logic of crossover between two random DNA sequences and adding some mutation into it.
 /// </summary>

 public void Generate()
 {
 for (int i = 0; i < population.Count; i++)
 {
 {
 int a = RandomProvider.RND.Next(matingPool.Count);
 int b = RandomProvider.RND.Next(matingPool.Count);
 // TODO: Avoid duplicates
 DNA partnerA = matingPool[a];
 DNA partnerB = matingPool[b];

 DNA child = (DNA)partnerA.Crossover(partnerB);
 child.Mutate(mutationRate);
 child.EvalutateFitness();
 population[i] = child;
 }
 }
 this.generations++;
 }

 public String GetBest()
 {
 float worldrecord = 0.0F;
 int index = 0;
 for (int i = 0; i < population.Count; i++)
 {
 if (population[i].Fitness > worldrecord)
 {
 index = i;
 worldrecord = population[i].Fitness;
 }
 }

 if (worldrecord == perfectScore) finished = true;
 return population[index].ToString();
 }

 public Boolean Finished()
 {
 return this.finished;
 }

 public int GetGenerations()
 {
 return generations;
 }

 // Compute average fitness for the population
 public float GetAverageFitness()
 {
 float total = 0;
 for (int i = 0; i < population.Count; i++)
 {
 total += population[i].Fitness;
 }

 return total / (population.Count);
 }

 public String AllPhrases()
 {
 StringBuilder everything = new StringBuilder();

 for (int i = population.Count - 1; i > population.Count - 11; i--)
 {
 everything.AppendLine(population[i].ToString());
 }
 return everything.ToString();
 }
 }
}

The Population class’s main function is to hold the main population of DNA sequences and the mating pool used to generate the next generation (or rather, to replace the current generation with the next one). Some statistical data can also be retrieved from the class, along with the information on when the problem has been solved; in this case, when the output is the same as the input.

Next is the actual main program, which will run the population to solve the problem.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using AIEngine.DataStructures;
using AIEngine;

namespace GeneticAlgoritmTextTest
{
 class Program
 {
 /// <summary>
 /// Change this value to alter how fast a problem is solved
 /// </summary>

 public static int PopulationCount = 1000;
 public static String target = "TO BE OR NOT TO BE";

 /// <summary>
 /// Change this value to alter how fast a problem is solved
 /// </summary>

 public static float mutationRate = 0.01F;
 static void Main(string[] args)
 {
 bool exit = false;

 // Create the population which is responsible for solving the problem.
 Population population = new Population(target, mutationRate, PopulationCount);
 while (!exit)
 {
 // In each iteration we calculate the fitness of each DNA sequence to be used later in the algorithm logic
 population.CalculateFitness();
 // Here the algorithm implements a selection method for chosing the best DNA sequences from the population.
 population.NaturalSelection();
 // Next we will generate a new population based on algorithmic logic of crossover between two random DNA sequences and adding some mutation into it.
 population.Generate();
 

 Console.WriteLine();
 Console.WriteLine(population.AllPhrases());
 Console.WriteLine("Cycle average fitness: " + population.GetAverageFitness());
 Console.WriteLine("Total generations: " + population.GetGenerations());
 Console.WriteLine("Best fitness in cycle: " + population.GetBest());

 // And before we go to the next iteration we check to see if the text puzzle has been solved.
 exit = population.Finished();
 }
 Console.ReadLine();
 }
 }
}

The code above simply creates a population which will try to solve the text puzzle. A fitness is calculated for each member of the population, a selection method is applied to the population, and mutation is introduced into random population members.

The end result should be something like this:

[Image: GABasicDNAStart – the first population generations]

[Image: GABasicDNAEnd – the final population generations]

There are two images above. The first one shows the very first populations in the algorithm. The second shows the very last population generations, up until the very last one, which solved the problem.

Optimization

There are two things to do if you want to optimize the algorithm in this example:

  1. Changing the mutation rate and population count variables.
    1. By playing around with these values you can find out the most “optimal” values to solve your desired problem. You can do this manually or you can write a piece of code which will do this for you based on the telemetry from the algorithm.
      1. Notice that too large a mutation rate will make the problem unsolvable, and likewise a too small or way too large population count will cause your algorithm to take a very long time.
  2. The fitness function logic
    1. Here I will simply quote Daniel Shiffman from The Nature of Code: “If you cannot define your problem’s goals and evaluate numerically how well those goals have been achieved, then you will not have successful evolution in your simulation.” So defining a good fitness function will go a long way towards creating a better result. Each problem will most likely have a very unique fitness function, though similarities can occur (still learning 🙂).
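To illustrate the first point, here is a rough sketch of what such a tuning loop could look like, using the Population class from this post (the parameter ranges are made-up examples, not recommendations):

```csharp
// sketch: brute-force search for the mutation rate / population count
// combination that solves the puzzle in the fewest generations
float[] mutationRates = { 0.005F, 0.01F, 0.02F, 0.05F };
int[] populationCounts = { 250, 500, 1000, 2000 };

int bestGenerations = int.MaxValue;

foreach (float rate in mutationRates)
{
    foreach (int count in populationCounts)
    {
        Population population = new Population("TO BE OR NOT TO BE", rate, count);

        while (!population.Finished())
        {
            population.CalculateFitness();
            population.NaturalSelection();
            population.Generate();
            population.GetBest(); // GetBest() also updates the finished flag
        }

        if (population.GetGenerations() < bestGenerations)
        {
            bestGenerations = population.GetGenerations();
            Console.WriteLine("New best: mutationRate=" + rate
                + " populationCount=" + count
                + " generations=" + bestGenerations);
        }
    }
}
```

Note that extreme values (for example, a very high mutation rate) may never converge, so in practice you would also want a cap on the number of generations per run.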

The source code for this project you will find here: https://github.com/lionadi/NatureOfCodePlayground/tree/master/NatureOfCodeCSharp%20AI%20Project

Also, as you can see, the code is a work in progress, not pretty and not ready :). So changes to the GitHub versions are due in the future.