This paper considers the problem of an intelligent agent operating in non-Markovian environments. We propose to divide the problem into two subproblems: detecting non-Markovian states in the environment, and building the agent's internal representation of the original environment. The internal representation is free of non-Markovian states because a sufficient number of additional, dynamically created states and transitions is provided. The resulting environment can then be used with classical reinforcement learning algorithms (such as SARSA(λ)), whose convergence is guaranteed via the Bellman equation. A major difficulty is recognizing different "copies" of the same state. The paper contains a theoretical introduction, the ideas and problem description, and, finally, an illustration of the results and conclusions.
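To make the SARSA(λ) reference concrete, the following is a minimal tabular sketch of the algorithm with accumulating eligibility traces. The small deterministic chain environment, the constants, and all function names are illustrative assumptions, not the paper's actual setting.

```python
import random

# Minimal tabular SARSA(lambda) sketch on a hypothetical chain MDP:
# states 0..3, actions 0 (left) / 1 (right), terminal reward at state 3.
N_STATES, N_ACTIONS = 4, 2
ALPHA, GAMMA, LAM, EPS = 0.1, 0.9, 0.8, 0.1

def step(s, a):
    """Deterministic transition: action 1 moves right, action 0 moves left."""
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

def eps_greedy(Q, s):
    """Epsilon-greedy action selection over the tabular Q function."""
    if random.random() < EPS:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[(s, a)])

def sarsa_lambda(episodes=200, seed=0):
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in range(N_ACTIONS)}
    for _ in range(episodes):
        e = {k: 0.0 for k in Q}  # eligibility traces, reset each episode
        s, done = 0, False
        a = eps_greedy(Q, s)
        while not done:
            s2, r, done = step(s, a)
            a2 = eps_greedy(Q, s2)
            # TD error; bootstrap only if s2 is non-terminal
            delta = r + (0.0 if done else GAMMA * Q[(s2, a2)]) - Q[(s, a)]
            e[(s, a)] += 1.0  # accumulating trace
            for k in Q:
                Q[k] += ALPHA * delta * e[k]
                e[k] *= GAMMA * LAM  # decay all traces
            s, a = s2, a2
    return Q

Q = sarsa_lambda()
```

Note that this sketch assumes the environment is already Markovian; in the paper's scheme it would be applied to the agent's internal representation, after the non-Markovian states have been split into dynamically created copies.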