Uses Q-learning to solve a navigation problem. The learning process runs for 500 iterations.
Accumulate paths
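
The two notes above can be sketched as a small tabular Q-learning loop that records the path taken in each iteration. This is only an illustrative sketch, not the original implementation: the 4x4 grid, the start and goal cells, the reward of -1 per step, the step cap, and the values of ALPHA, GAMMA and EPSILON are all assumptions, and it also assumes that one "iteration" means one episode from start to goal. Only the count of 500 iterations comes from the text.

```python
import random

# All values below except ITERATIONS are assumptions made for illustration.
ROWS, COLS = 4, 4                              # hypothetical grid size
START, GOAL = (0, 0), (3, 3)                   # hypothetical start and goal cells
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
ITERATIONS = 500                               # from the text
MAX_STEPS = 200                                # hypothetical cap on steps per iteration
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1          # hypothetical learning parameters


def step(state, action):
    """Move one cell; a move off the grid leaves the agent in place."""
    row = min(max(state[0] + action[0], 0), ROWS - 1)
    col = min(max(state[1] + action[1], 0), COLS - 1)
    return (row, col), -1.0  # assumed reward: -1 for every step taken


def train(seed=0):
    rng = random.Random(seed)
    # Q-table: one list of action values per grid cell, initialised to zero.
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    paths = []  # accumulated paths, one list of visited cells per iteration
    for _ in range(ITERATIONS):
        state, path = START, [START]
        for _step in range(MAX_STEPS):
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update; no bootstrapping from the terminal goal cell.
            if next_state == GOAL:
                target = reward
            else:
                target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            path.append(state)
            if state == GOAL:
                break
        paths.append(path)
    return q, paths


q_table, paths = train()
print(len(paths), "paths recorded; last path has", len(paths[-1]) - 1, "steps")
```

Here paths[i] holds the sequence of cells visited during iteration i, so comparing early and late entries gives a rough view of how the learned route changes over training.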