JMANI
Lecture 2: Playing OpenAI GYM Games by Sung Kim
link: https://www.youtube.com/watch?v=xgoO54qN4lY&list=PLlMkM4tgfjnKsCWav-Z2F-MMFRx-2gMGG&index=2
Frozen Lake World
S: start (the starting point)
F: frozen (a tile covered with ice)
H: hole (a hole in the ice)
G: goal (the destination)
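As an illustration of the legend above, here is a small sketch that applies it to the default 4x4 map Gym uses for FrozenLake. The helper names `state_to_pos` and `tile_at` are hypothetical and not part of Gym. The sketch assumes the row-major state numbering (state = row * 4 + col) that FrozenLake uses for its integer observations.

```python
# Default 4x4 FrozenLake layout (S: start, F: frozen, H: hole, G: goal).
MAP_4X4 = [
    "SFFF",
    "FHFH",
    "FFFH",
    "HFFG",
]
NCOL = len(MAP_4X4[0])


def state_to_pos(state):
    """Convert an integer state into a (row, col) pair (row-major order)."""
    return divmod(state, NCOL)


def tile_at(state):
    """Return the tile letter ('S', 'F', 'H' or 'G') for an integer state."""
    row, col = state_to_pos(state)
    return MAP_4X4[row][col]


if __name__ == "__main__":
    # Print position and tile type for the start, a hole, and the goal.
    for state in (0, 5, 15):
        print(state, state_to_pos(state), tile_at(state))
```

State 0 is the start in the top-left corner and state 15 is the goal in the bottom-right corner.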
- (1) The agent takes an action.
- (2) The environment returns the resulting state and reward for that action.
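The two steps above can be sketched without Gym at all. The class `TinyLake` and the function `run` below are hypothetical stand-ins written for this note, not Gym's implementation. They only mimic the classic `reset()` / `step(action)` interface on a deterministic 4x4 lake, assuming a reward of 1 for reaching the goal and 0 otherwise.

```python
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3


class TinyLake:
    """Hypothetical deterministic 4x4 lake that mimics the reset/step interface."""

    MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
    MOVES = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}

    def reset(self):
        # Put the agent back on the start tile and return the initial state.
        self.row, self.col = 0, 0
        return 0

    def step(self, action):
        d_row, d_col = self.MOVES[action]
        # Clamp to the board so that walking into a wall leaves the agent in place.
        self.row = min(max(self.row + d_row, 0), 3)
        self.col = min(max(self.col + d_col, 0), 3)
        tile = self.MAP[self.row][self.col]
        state = self.row * 4 + self.col
        reward = 1.0 if tile == "G" else 0.0
        done = tile in "HG"  # the episode ends in a hole or at the goal
        return state, reward, done, {}


def run(env, actions):
    """(1) The agent sends each action; (2) the env answers with state and reward."""
    env.reset()
    history = []
    for action in actions:
        state, reward, done, _ = env.step(action)
        history.append((action, state, reward))
        if done:
            break
    return history


if __name__ == "__main__":
    path = [RIGHT, RIGHT, DOWN, DOWN, DOWN, RIGHT]
    for action, state, reward in run(TinyLake(), path):
        print("Action:", action, "State:", state, "Reward:", reward)
```

The fixed action list plays the role of the agent here. A learning agent would choose each action from the state and reward it received in the previous step.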
OpenAI Gym
link: https://gym.openai.com/
Gym documentation: www.gymlibrary.ml
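import gym

env = gym.make("FrozenLake-v0")  # create the environment by name
observation = env.reset()  # reset the environment to its initial state
for _ in range(1000):
    env.render()  # print the current board
    action = env.action_space.sample()  # your agent here (this takes random actions)
    observation, reward, done, info = env.step(action)  # result of the action; done is True/False
    if done:
        observation = env.reset()  # the episode has ended, so start a new one
env.close()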
import gym
from gym.envs.registration import register
import sys, tty, termios
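class _Getch:
    """Read a key press from the terminal without waiting for Enter (Unix only)."""

    def __call__(self):
        fd = sys.stdin.fileno()
        old_settings = termios.tcgetattr(fd)
        try:
            tty.setraw(fd)
            # Arrow keys arrive as a 3-character escape sequence, e.g. '\x1b[A'
            ch = sys.stdin.read(3)
        finally:
            # Always restore the original terminal settings
            termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
        return ch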
inkey = _Getch()
# MACROS
LEFT = 0
DOWN = 1
RIGHT = 2
UP = 3
# Key mapping
arrow_keys = {
    '\x1b[A': UP,
    '\x1b[B': DOWN,
    '\x1b[C': RIGHT,
    '\x1b[D': LEFT
}
# Register FrozenLake with is_slippery=False (deterministic moves)
register(
    id='FrozenLake-v3',
    entry_point='gym.envs.toy_text:FrozenLakeEnv',
    kwargs={'map_name': '4x4', 'is_slippery': False}
)
env = gym.make('FrozenLake-v3')
state = env.reset()
env.render()  # Show the initial board
while True:
    # Choose an action from the keyboard
    key = inkey()
    if key not in arrow_keys:
        print("Game aborted!")
        break
    action = arrow_keys[key]
    state, reward, done, info = env.step(action)
    env.render()  # Show the board after the action
    print("State: ", state, "Action: ", action, "Reward: ", reward, "Info: ", info)
    if done:
        print("Finished with reward", reward)
        break
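The script above registers FrozenLake with `is_slippery` set to False, so every arrow key moves the agent exactly one tile in that direction. As a rough sketch of what the flag changes: as far as I understand Gym's FrozenLake, with `is_slippery=True` the agent moves in the intended direction or in one of the two perpendicular directions, each with probability 1/3. The helpers `possible_moves` and `sample_move` are hypothetical names used only for this illustration.

```python
import random

LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3


def possible_moves(action, is_slippery):
    """Return the directions the agent may actually move in for a chosen action."""
    if not is_slippery:
        # Deterministic world: the chosen action is always the executed one.
        return [action]
    # Slippery world (assumed rule): the intended direction plus its two
    # perpendicular neighbours, which are adjacent in the 0..3 action numbering.
    return [(action - 1) % 4, action, (action + 1) % 4]


def sample_move(action, is_slippery, rng=random):
    """Pick the executed direction uniformly from the possible ones."""
    return rng.choice(possible_moves(action, is_slippery))


if __name__ == "__main__":
    print(possible_moves(RIGHT, is_slippery=False))
    print(possible_moves(RIGHT, is_slippery=True))
```

With the slippery setting, pressing RIGHT may therefore send the agent DOWN or UP instead, which is why the keyboard demo turns it off.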