
Minwu Kim

[Reinforcement Learning] MADDPG

April 30, 2024

Last updated: June 5, 2024

Key points:

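Existing single-agent algorithms such as DQN, policy gradient (PG), and DDPG cannot properly model a partially observable Markov game.
The reason is that the MDP formulation they rely on does not account for the other agents' actions, so as the other agents learn, the environment looks non-stationary from each agent's point of view.
MADDPG therefore introduces a centralized critic with decentralized actors: during training each critic sees every agent's observations and actions, while each actor still acts on its own observation alone.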
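
To make the centralized critic / decentralized actor split concrete, here is a minimal sketch in plain Python. It only shows what each component takes as input, not the training loop. The class names (`Actor`, `CentralizedCritic`), the linear "networks", and the dimensions are all assumptions chosen for illustration; a real implementation would use neural networks, a replay buffer, and target networks.

```python
import random


def linear(weights, x):
    """Multiply a list-of-rows weight matrix by the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class Actor:
    """Decentralized actor: maps one agent's OWN observation to its action."""

    def __init__(self, obs_dim, act_dim, rng):
        # Hypothetical linear policy standing in for a policy network.
        self.weights = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                        for _ in range(act_dim)]

    def act(self, own_obs):
        return linear(self.weights, own_obs)


class CentralizedCritic:
    """Centralized critic: scores ALL agents' observations and actions."""

    def __init__(self, joint_dim, rng):
        # Hypothetical linear Q-function standing in for a critic network.
        self.joint_dim = joint_dim
        self.weights = [[rng.uniform(-1, 1) for _ in range(joint_dim)]]

    def q_value(self, all_obs, all_actions):
        # Concatenate every agent's observation and action into one input.
        joint = [v for obs in all_obs for v in obs]
        joint += [v for act in all_actions for v in act]
        assert len(joint) == self.joint_dim
        return linear(self.weights, joint)[0]


rng = random.Random(0)
n_agents, obs_dim, act_dim = 3, 4, 2  # illustrative sizes, not from the post

# One actor and one critic per agent.
actors = [Actor(obs_dim, act_dim, rng) for _ in range(n_agents)]
critics = [CentralizedCritic(n_agents * (obs_dim + act_dim), rng)
           for _ in range(n_agents)]

# Each agent receives only its own (partial) observation.
observations = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                for _ in range(n_agents)]

# Execution: every actor decides from its local observation alone.
actions = [actor.act(obs) for actor, obs in zip(actors, observations)]

# Training: every critic evaluates the joint observations and actions.
q_values = [critic.q_value(observations, actions) for critic in critics]
```

Because the critics are needed only during training, the agents can still be deployed with nothing but their own actors and local observations.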
