Philip Maas / Bipedal Walker Evo · Commits

Commit dd6e4b74, authored 3 years ago by Philip Maas
Update README.md
parent e12a37ee

Showing 1 changed file: README.md (+29 additions, −4 deletions)
# Bipedal Walker Evo
This project tries to solve OpenAI's bipedal walker using three different approaches: Q-Learning, Mutation of Actions, and Evolution Strategies.
# Q-Learning
Coming soon
# Action Mutation
Reaches about 0 reward, which basically means the walker learns to avoid falling on its head.
## How it works
1. Generate a population with a starting number of randomized actions (we don't need enough actions to solve the whole problem right away).
2. Let the population play the game and reward every walker of the generation accordingly.
3. The best walker survives without mutating.
4. The better the reward, the higher the chance to pass actions on to the next generation. Each child has a single parent; there is no crossover.
5. Mutate all children and increment their number of actions.
## Hyperparameters
| Parameter         | Description                                                 | Interval  |
|-------------------|-------------------------------------------------------------|-----------|
| `POP_SIZE`        | Size of the population.                                     | [0;∞[     |
| `MUTATION_FACTOR` | Percentage of weights that will be mutated for each mutant. | [0;1]     |
| `ACTIONS_START`   | Number of actions in the first generation.                  | [0;1600]  |
| `INCREASE BY`     | Incrementation of steps for each episode.                   | [0;∞[     |
| `MAX_STEPS`       | Number of steps that are played in one episode.             | [0;1600]  |
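The steps and hyperparameters above can be sketched roughly as follows. This is a hypothetical illustration, not the project's actual code: the constant names mirror the table (`INCREASE BY` is written as `INCREASE_BY`, since Python identifiers can't contain spaces), and `play_episode` is a placeholder for rolling out BipedalWalker and summing the rewards.

```python
import numpy as np

POP_SIZE = 50
MUTATION_FACTOR = 0.1
ACTIONS_START = 10
INCREASE_BY = 5

def play_episode(actions):
    # Placeholder fitness; the real project would run the gym env here.
    return -float(np.sum(np.abs(actions)))

def evolve(generations=3, rng=None):
    rng = rng or np.random.default_rng(0)
    # 1. Population of short random action sequences (4 joint torques per step).
    pop = [rng.uniform(-1, 1, (ACTIONS_START, 4)) for _ in range(POP_SIZE)]
    for _ in range(generations):
        # 2. Reward every walker of the generation.
        rewards = np.array([play_episode(a) for a in pop])
        # 3. The best walker survives without mutating (elitism); it still
        #    receives INCREASE_BY fresh random actions so episodes grow.
        elite = pop[int(np.argmax(rewards))]
        next_pop = [np.vstack([elite, rng.uniform(-1, 1, (INCREASE_BY, 4))])]
        # 4. Reward-proportional selection; one parent per child, no crossover.
        probs = np.exp(rewards - rewards.max())
        probs /= probs.sum()
        parents = [pop[i] for i in rng.choice(POP_SIZE, POP_SIZE - 1, p=probs)]
        # 5. Mutate each child and append INCREASE_BY fresh random actions.
        for p in parents:
            child = p.copy()
            mask = rng.random(child.shape) < MUTATION_FACTOR
            child[mask] += rng.normal(0.0, 0.3, int(mask.sum()))
            child = np.clip(child, -1, 1)  # keep torques in the valid range
            next_pop.append(np.vstack([child, rng.uniform(-1, 1, (INCREASE_BY, 4))]))
        pop = next_pop
    return pop
```

Note how the action sequences themselves are the genome here: no neural net is involved yet, which is what separates this approach from the Evolution Strategies section below.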
# Evolution Strategies
After 1000 episodes, which is about 1 h of learning, it reaches a reward of ~250.\
Best score until now: 292/300 in 7000 episodes
## How it works
1. Generate a randomly weighted neural net
...
...
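The remaining steps are truncated in this excerpt, so the following is only a minimal sketch of one update, assuming the project follows the standard OpenAI-ES scheme: perturb the flattened weight vector `theta` with Gaussian noise, evaluate each perturbation, and move toward the perturbations that scored above average. `fitness` is a stand-in for an episode's total reward.

```python
import numpy as np

def fitness(theta):
    # Placeholder reward; the real project would run BipedalWalker with a
    # neural net whose weights are `theta`. This toy peaks at theta = 0.
    return -float(np.sum(theta ** 2))

def es_step(theta, pop_size=50, sigma=0.1, lr=0.05, rng=None):
    rng = rng or np.random.default_rng(0)
    # One Gaussian perturbation of the weights per population member.
    noise = rng.standard_normal((pop_size, theta.size))
    rewards = np.array([fitness(theta + sigma * n) for n in noise])
    # Normalize rewards so the update is invariant to reward scale.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Reward-weighted sum of the noise approximates the gradient of fitness.
    return theta + lr / (pop_size * sigma) * noise.T @ adv
```

Calling `es_step` in a loop steadily climbs the fitness landscape without ever computing an exact gradient, which is why this approach scales to reward signals like a walker's episode score.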
| `MAX_STEPS` | Number of steps that are played in one episode. | [0;1600] |
# Installation
We use Windows, Anaconda and Python 3.7.\
`conda create -n evo_neuro python=3.7`\
`conda activate evo_neuro`\
...
...
# Important Sources
Environment: https://github.com/openai/gym/wiki/BipedalWalker-v2 \
Table of all Environments: https://github.com/openai/gym/wiki/Table-of-environments \
OpenAI Website: https://gym.openai.com/envs/BipedalWalker-v2/ \
...
...