diff --git a/README.md b/README.md
index 2344a911bd9fc76bf73bc7e118cd826a8737b55d..aee404464acb43dc56738cef5c2bb3f9aa2b5c2e 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,8 @@ This project tries to solve OpenAI's bipedal walker in three different ways:
 ❌ Will get a reward of -64. But instead of spreading its legs, the walker tries to fall on its head in slow motion.\
 At least the walker learns to fall more slowly over time.
 
+![Average rewards while learning](./DeepQLearning/averageRewards.png)
+
 ## How it works
 1. Choose action based on Q-Function
 2. Execute chosen action or explore
@@ -36,7 +38,19 @@ At least the walker learns to fall more slowly over time.
 # Action Mutation
 ❌ Will get 0 reward, which basically means the walker learns to avoid falling on its head. The more actions the walker can use, the worse the reward.
 This is because the walker tries to generate movement by trembling with its legs. The covered distance doesn't make up for the punishment for taking actions, so after 1600 moves the walker ends up with a reward of around -60.
-![Reward](./MutateActions/5_50_50_0.2.png)
+![Rewards while learning](./MutateActions/5_50_50_0.2.png)
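+
+A rough back-of-the-envelope sketch of this trade-off (the cost and distance constants are assumed illustrative values, not the exact BipedalWalker ones):
+
+```python
+STEPS = 1600            # moves per episode, as in the runs above
+TORQUE_COST = 0.04      # assumed average punishment per trembling step
+DISTANCE_REWARD = 4.0   # assumed reward for the tiny distance covered
+
+# The torque punishment accumulates every step; the distance reward doesn't.
+total = DISTANCE_REWARD - STEPS * TORQUE_COST
+print(total)  # -60.0 -- roughly the reward observed after 1600 moves
+```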
 
 ## How it works
 1. Generate a population with a starting number of randomized actions (we don't need enough actions to solve the problem right now)
@@ -59,7 +73,7 @@ After 1000 episodes, which is about 1h of learning, it will reach ~250 reward.\
 ✅ Best score so far: 304/300 in under 7000 episodes with a decaying learning rate and mutation factor. \
 \
 Learning curve:\
-![Rewards while Learning](./EvolutionStrategies/Experiments/12_1_50_decaying_decaying_300/12_1_50_decaying_decaying_300.png)
+![Rewards while learning](./EvolutionStrategies/Experiments/12_1_50_decaying_decaying_300/12_1_50_decaying_decaying_300.png)
 \
 \
 Rewards of the fully trained agent in 50 episodes:\