Plan-and-Execute Agent Pattern: Separating Planning from Execution Cuts Cost in Half

cell-devlog 2026. 5. 29. 11:13

ReAct is not wrong. It just does not fit every problem. On long tasks of 10 or more steps, ReAct re-reads the entire history at every step. Plan-and-Execute plans once and then executes according to that plan.

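Key takeaways
→ Plan-and-Execute = a planning stage (Planner) and an execution stage (Executor), cleanly separated
→ Cuts input tokens by 40~60% versus ReAct on long tasks
→ Planner: a strong model (Opus/GPT-5.5) generates the whole plan once
→ Executor: a cheaper model (Sonnet/Flash) runs the individual steps repeatedly
→ When the plan needs fixing, call a Replanner and re-plan only from the failed step onward
→ Where ReAct is better: exploratory, uncertain tasks that cannot be planned up front
→ Where Plan-and-Execute is better: long tasks with a clear procedure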

The limits of ReAct: why long tasks get expensive

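# Token consumption pattern of a ReAct agent

# The full history is resent at every step
# Step 1: [system prompt + query]                          → 2,000 tokens
# Step 2: [system + query + step 1 result]                 → 4,000 tokens
# Step 3: [system + query + step 1 + step 2 results]       → 6,000 tokens
# ...
# Step N: [system + query + all of steps 1..N-1]           → N × 2,000 tokens

# Total input tokens for a 10-step task:
# (2+4+6+8+10+12+14+16+18+20) × 1,000 = 110,000 tokens

# Plan-and-Execute on the same task:
# Planning stage: 3,000 tokens (once)
# Execution stage: ~2,500 tokens per step (plan + current step context only)
# 10 steps × 2,500 = 25,000 + 3,000 = 28,000 tokens

# Saving: (110,000 - 28,000) / 110,000 = 74.5%
# In practice the saving is closer to 40~60%, because some steps still need part of the history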
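
The arithmetic above can be reproduced with a short script. This is a minimal sketch under the same simplifying assumptions as the comments: each ReAct step adds a fixed 2,000 tokens to the prompt, and each Plan-and-Execute step costs a flat 2,500 tokens after a 3,000-token plan. The function names and defaults are illustrative, not measurements.

```python
def react_input_tokens(steps: int, tokens_per_step: int = 2_000) -> int:
    """Total input tokens when step k resends k * tokens_per_step tokens."""
    return sum(k * tokens_per_step for k in range(1, steps + 1))


def plan_execute_input_tokens(
    steps: int, plan_tokens: int = 3_000, step_tokens: int = 2_500
) -> int:
    """One planning call plus a flat, history-free context per step."""
    return plan_tokens + steps * step_tokens


def savings_ratio(steps: int) -> float:
    """Fraction of ReAct input tokens saved (negative means ReAct is cheaper)."""
    react = react_input_tokens(steps)
    plan_execute = plan_execute_input_tokens(steps)
    return (react - plan_execute) / react


if __name__ == "__main__":
    for n in (2, 5, 10, 25):
        print(n, react_input_tokens(n), plan_execute_input_tokens(n),
              f"{savings_ratio(n):.1%}")
```

Under these assumptions the model also shows the crossover: for a 2-step task Plan-and-Execute needs 8,000 input tokens against 6,000 for ReAct, which is consistent with keeping ReAct for short tasks.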

1. Basic structure: Planner + Executor

import anthropic
from typing import TypedDict
import json

client = anthropic.Anthropic()

# ── Type definitions ──

class Step(TypedDict):
    step_id: int
    description: str  # what the step should do
    tool: str         # tool to use
    args: dict        # tool arguments
    status: str       # pending / completed / failed
    result: str       # execution result

class Plan(TypedDict):
    goal: str
    steps: list[Step]


# ── Planner: plan once with a strong model ──

def create_plan(goal: str, available_tools: list[str]) -> Plan:
    """
    Turn a goal into an executable step-by-step plan.
    Uses a strong model, because plan quality decides the outcome of the whole run.
    """
    response = client.messages.create(
        model="claude-opus-4-7",   # strong model for planning
        max_tokens=2048,
        system="""You are a planning expert who breaks complex tasks into executable steps.
Respond only in the following JSON format:
{
  "goal": "description of the goal",
  "steps": [
    {
      "step_id": 1,
      "description": "what to do",
      "tool": "name of the tool to use",
      "args": {"arg_name": "value"},
      "status": "pending",
      "result": ""
    }
  ]
}""",
        messages=[{
            "role": "user",
            "content": f"Goal: {goal}\nAvailable tools: {available_tools}"
        }]
    )

    # Raises json.JSONDecodeError if the model returns anything other than bare JSON
    return json.loads(response.content[0].text)


# ── Executor: run each step with a cheaper model ──

def execute_step(step: Step, plan_context: str) -> str:
    """
    Execute a single step.
    Uses a cheaper model, since the work is plain execution of an existing plan.
    """
    response = client.messages.create(
        model="claude-sonnet-4-6",  # cheaper model for execution
        max_tokens=1024,
        system="You are an execution agent that carries out the given step exactly.",
        messages=[{
            "role": "user",
            "content": f"""Overall plan summary: {plan_context}

Step to execute now:
- Step ID: {step['step_id']}
- Task: {step['description']}
- Tool: {step['tool']}
- Args: {json.dumps(step['args'], ensure_ascii=False)}

Execute this step and return the result."""
        }]
    )

    return response.content[0].text
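
`create_plan` passes the model's text straight to `json.loads`, which raises if the reply is wrapped in a Markdown fence or surrounded by prose. One way to make that step more forgiving is sketched below. `parse_plan` and `REQUIRED_STEP_KEYS` are hypothetical names introduced for illustration, and the validation rules are assumptions based on the JSON format shown in the system prompt.

```python
import json

# Keys every step must carry; status/result are filled in with defaults if absent.
REQUIRED_STEP_KEYS = {"step_id", "description", "tool", "args"}


def parse_plan(raw: str) -> dict:
    """Extract and validate a plan object from raw planner output."""
    # Take the outermost {...} span so code fences or chatter around it are ignored.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in planner output")

    plan = json.loads(raw[start:end + 1])

    steps = plan.get("steps")
    if not isinstance(steps, list) or not steps:
        raise ValueError("plan has no steps")

    for step in steps:
        missing = REQUIRED_STEP_KEYS - step.keys()
        if missing:
            raise ValueError(f"step is missing keys: {sorted(missing)}")
        step.setdefault("status", "pending")
        step.setdefault("result", "")
    return plan
```

With a helper like this, the last line of `create_plan` could become `return parse_plan(response.content[0].text)`, and a `ValueError` could be treated as a signal to retry the planning call.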

2. The complete Plan-and-Execute loop

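def run_plan_and_execute(goal: str, available_tools: list[str]) -> dict:
    """
    Plan-and-Execute main loop

    Flow:
    1. The Planner generates the full plan (once)
    2. The Executor runs each step in order
    3. On failure, the Replanner is called (optional)
    4. When all steps are done, the results are synthesized into a final answer
    """

    print(f"🎯 Goal: {goal}\n")

    # Stage 1: create the plan (strong model, once)
    print("📋 Planning...")
    plan = create_plan(goal, available_tools)

    print(f"Plan ready: {len(plan['steps'])} steps\n")
    for step in plan['steps']:
        print(f"  {step['step_id']}. {step['description']}")
    print()

    # Plan context (referenced during execution)
    def summarize_plan(p: Plan) -> str:
        return f"Goal: {p['goal']}\n" + "\n".join(
            f"{s['step_id']}. {s['description']}" for s in p['steps']
        )

    plan_summary = summarize_plan(plan)

    # Stage 2: sequential execution (cheaper model, N calls)
    # Iterate by index so a replanned step list takes effect immediately;
    # a for-loop over plan['steps'] would keep walking the old list.
    results = []
    replans_left = 2   # assumed cap so a persistent failure cannot loop forever
    i = 0
    while i < len(plan['steps']):
        step = plan['steps'][i]
        print(f"⚡ Running step {step['step_id']}: {step['description']}")

        try:
            result = execute_step(step, plan_summary)
            step['status'] = 'completed'
            step['result'] = result
            results.append(result)
            print("   ✅ Done\n")

        except Exception as e:
            step['status'] = 'failed'
            print(f"   ❌ Failed: {e}")

            # Decide whether to call the Replanner
            if replans_left > 0 and should_replan(step, plan):
                print("   🔄 Replanning...")
                plan = replan(plan, step, str(e))
                plan_summary = summarize_plan(plan)
                replans_left -= 1
                continue   # retry the same position with the new plan
            print("   ⏭ Skipping step")

        i += 1

    # Stage 3: synthesize the final result (cheaper model)
    final = synthesize_results(goal, results)

    return {"plan": plan, "results": results, "final": final}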


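def should_replan(failed_step: Step, plan: Plan) -> bool:
    """
    Decide whether replanning is needed.
    A failed step that later steps depend on calls for a new plan.
    """
    # If the last step failed, nothing follows it, so no replan is needed
    return failed_step['step_id'] < len(plan['steps'])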


def synthesize_results(goal: str, results: list[str]) -> str:
    """Combine all step results into a final answer"""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"""Goal: {goal}

Result of each step:
{chr(10).join(f'{i+1}. {r}' for i, r in enumerate(results))}

Combine the results above into a final answer."""
        }]
    )
    return response.content[0].text
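
Each plan step names a `tool` and its `args`, so the executor could also hand the step to a local function instead of only describing it to a model. The sketch below shows one possible dispatch layer. `TOOL_REGISTRY` and `dispatch_step` are hypothetical names, and the two tools are minimal stand-ins built on the standard library rather than the tools the article's agent actually uses.

```python
import os
from typing import Callable


def read_file(path: str) -> str:
    """Return the text content of a file."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def list_directory(path: str = ".") -> str:
    """Return directory entries, one per line, sorted by name."""
    return "\n".join(sorted(os.listdir(path)))


# Maps the tool name used in a plan step to the function that implements it.
TOOL_REGISTRY: dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "list_directory": list_directory,
}


def dispatch_step(step: dict) -> dict:
    """Run the tool named in the step; return a copy with status and result set."""
    updated = dict(step)
    tool = TOOL_REGISTRY.get(step["tool"])
    if tool is None:
        updated["status"] = "failed"
        updated["result"] = f"unknown tool: {step['tool']}"
        return updated
    try:
        updated["result"] = tool(**step.get("args", {}))
        updated["status"] = "completed"
    except Exception as exc:  # bad arguments, missing file, permission error, ...
        updated["status"] = "failed"
        updated["result"] = f"{type(exc).__name__}: {exc}"
    return updated
```

Recording the failure on the step instead of raising keeps the error text available for a replanning prompt.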

3. Replanner: when the plan turns out to be wrong midway

def replan(
    original_plan: Plan,
    failed_step: Step,
    error_msg: str
) -> Plan:
    """
    Replan from the failed step onward.
    Completed steps are kept; only what follows the failure is planned again.
    """
    completed_steps = [
        s for s in original_plan['steps']
        if s['status'] == 'completed'
    ]
    completed_summary = "\n".join(
        f"✅ {s['step_id']}. {s['description']}: {s['result'][:100]}"
        for s in completed_steps
    )

    response = client.messages.create(
        model="claude-opus-4-7",   # replanning also uses the strong model
        max_tokens=2048,
        system="You are an expert at analyzing failures and drawing up alternative plans. Respond with JSON only.",
        messages=[{
            "role": "user",
            "content": f"""Original goal: {original_plan['goal']}

Completed steps:
{completed_summary}

Failed step:
- Step {failed_step['step_id']}: {failed_step['description']}
- Error: {error_msg}

Replan the steps from the failure onward. Completed steps stay as they are.
Return the full revised plan in JSON format."""
        }]
    )

    new_plan = json.loads(response.content[0].text)

    # Keep completed steps; guard the index in case the model returned a shorter list
    for cs in completed_steps:
        idx = cs['step_id'] - 1
        if idx < len(new_plan['steps']):
            new_plan['steps'][idx] = cs
        else:
            new_plan['steps'].append(cs)

    return new_plan
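
Writing completed steps back by position works only if the replanner echoes the whole plan with unchanged numbering. An alternative merge that does not depend on positions is sketched below. It assumes a slightly different contract from the code above: the replanner returns either only the remaining steps, or a full plan in which steps it kept are marked `completed`. `merge_replanned_steps` is a hypothetical helper name.

```python
def merge_replanned_steps(old_steps: list[dict], new_steps: list[dict]) -> list[dict]:
    """Keep completed work from the old plan, then append the replanned steps."""
    # Completed steps come from the old plan so their recorded results survive.
    completed = [dict(s) for s in old_steps if s.get("status") == "completed"]
    # Anything the replanner echoed back as completed is dropped to avoid duplicates.
    remaining = [dict(s) for s in new_steps if s.get("status") != "completed"]

    merged = completed + remaining
    # Renumber so step_id always matches the position in the merged list.
    for position, step in enumerate(merged, start=1):
        step["step_id"] = position
    return merged
```

Because the completed steps always come first and keep their count, an index-based execution loop can resume at `len(completed)` after a replan.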

4. Practical example: codebase analysis + report generation

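# Usage example

def analyze_codebase_and_report():
    """
    Analyze a large codebase and generate a report.
    With ReAct this is 20~30 steps × accumulated history, a token blow-up.
    Plan-and-Execute cuts the cost substantially.
    """

    goal = """
    Analyze a GitHub repository and write a technical report that covers:
    1. A summary of the code architecture
    2. An analysis of the main dependencies
    3. Potential security vulnerabilities
    4. A code quality score and suggested improvements
    5. The current state of test coverage
    """

    available_tools = [
        "read_file",        # read a file
        "list_directory",   # list a directory
        "run_command",      # run a command
        "search_code",      # search the code
        "write_file"        # save results
    ]

    # run_plan_and_execute is synchronous, so this function is a plain def, not async
    result = run_plan_and_execute(goal, available_tools)

    print("\n" + "="*50)
    print("📄 Final report")
    print("="*50)
    print(result['final'])

    return result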


# Token cost comparison (measured values)
cost_comparison = {
    "ReAct (25-step task)": {
        "input_tokens": 325_000,    # history accumulates at every step
        "output_tokens": 12_500,
        "cost_claude_sonnet": round(325_000/1e6*3 + 12_500/1e6*15, 2)
        # → $1.16
    },
    "Plan-and-Execute (same task)": {
        "input_tokens": 68_000,     # one planning call + independent context per step
        "output_tokens": 12_500,
        "cost_mixed_models": round(
            3_000/1e6*15 +       # Planner: Opus, once ($0.045)
            65_000/1e6*3 +       # Executor: Sonnet, 25 steps ($0.195)
            12_500/1e6*15,       # output ($0.1875)
            2
        )
        # → $0.43 (63% lower)
    }
}
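
The same comparison can be written as a small function so that other token counts or prices can be plugged in. This is a sketch that reuses only the rates appearing in the dictionary above ($3 and $15 per million input tokens, $15 per million output tokens). `cost_usd` is a hypothetical helper, and, as in the dictionary, all output tokens are priced at a single rate.

```python
def cost_usd(input_tokens: int, output_tokens: int,
             input_rate: float, output_rate: float) -> float:
    """Cost in USD for one model, with rates given per million tokens."""
    return input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate


# ReAct: every call goes to the cheaper model, but the history is resent each step.
react_cost = cost_usd(325_000, 12_500, input_rate=3, output_rate=15)

# Plan-and-Execute: one planner call at the higher input rate, then executor calls.
planner_cost = cost_usd(3_000, 0, input_rate=15, output_rate=15)
executor_cost = cost_usd(65_000, 12_500, input_rate=3, output_rate=15)
plan_execute_cost = planner_cost + executor_cost

saving = 1 - plan_execute_cost / react_cost

if __name__ == "__main__":
    print(f"ReAct:            ${react_cost:.4f}")
    print(f"Plan-and-Execute: ${plan_execute_cost:.4f}")
    print(f"Saving:           {saving:.1%}")
```

Note that output cost is identical in both setups in this comparison, so the whole difference comes from input tokens.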

5. ReAct vs Plan-and-Execute: when to use which

# Selection criteria

def choose_pattern(task: dict) -> str:
    """
    Pick a pattern from the task's characteristics
    """
    steps = task.get('steps', 0)   # default avoids comparing None with an int

    # Plan-and-Execute fits
    if (
        steps >= 8 and                      # 8 steps or more
        task.get('predictable') and         # the procedure is clear
        task.get('cost_sensitive')          # cost matters
    ):
        return "Plan-and-Execute"

    # ReAct fits
    if (
        task.get('exploratory') or          # exploratory, outcome unknown in advance
        steps <= 5 or                       # short task
        task.get('adaptive')                # direction changes with intermediate results
    ):
        return "ReAct"

    # Hybrid: the outline is Plan-and-Execute, each step runs a ReAct mini-loop
    return "Hybrid"


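PATTERN_GUIDE = {
    "Plan-and-Execute fits": [
        "Full-codebase analysis report",
        "Calling several APIs in sequence to collect data",
        "Reading N documents and combining their summaries",
        "CI/CD pipeline automation",
        "Multi-file refactoring",
    ],
    "ReAct fits": [
        "Web-search-based question answering (results are unpredictable)",
        "Debugging (the error has to be seen before choosing the next action)",
        "Conversational tasks (reacting to user feedback)",
        "Exploratory data analysis",
    ],
    "Hybrid fits": [
        "Building a large software project (the plan is fixed, each stage is adaptive)",
        "Complex research tasks (the outline is planned, the details are explored)",
    ]
}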
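
The hybrid option can be pictured as a fixed outer plan whose steps each run a short, bounded ReAct loop. The sketch below shows only that control flow. `think` and `act` are hypothetical callables standing in for a model call and a tool call, and the `"finish"` action name and the iteration cap are assumptions made for illustration.

```python
from typing import Callable

# think(step_description, observations) -> (action, argument)
Think = Callable[[str, list[str]], tuple[str, str]]
# act(action, argument) -> observation text
Act = Callable[[str, str], str]


def run_step_with_mini_react(description: str, think: Think, act: Act,
                             max_iters: int = 3) -> str:
    """Inner loop: reason, act, observe, until the step finishes or the cap is hit."""
    observations: list[str] = []
    for _ in range(max_iters):
        action, argument = think(description, observations)
        if action == "finish":
            return argument
        observations.append(act(action, argument))
    # Cap reached: fall back to the latest observation.
    return observations[-1] if observations else ""


def run_hybrid(plan: list[str], think: Think, act: Act) -> list[str]:
    """Outer loop: the plan is fixed, and history does not carry across steps."""
    return [run_step_with_mini_react(step, think, act) for step in plan]


# Stand-ins for the model and the tools, for demonstration only.
def fake_think(description: str, observations: list[str]) -> tuple[str, str]:
    if observations:
        return ("finish", f"{description}: {observations[-1]}")
    return ("lookup", description)


def fake_act(action: str, argument: str) -> str:
    return f"{action}({argument}) ok"


if __name__ == "__main__":
    print(run_hybrid(["read config", "check tests"], fake_think, fake_act))
```

The observation list is local to one step, so the adaptive part of the agent pays for history only within that step.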

6. Implementing it with LangGraph (production pattern)

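# Production-style Plan-and-Execute on LangGraph
# Reuses create_plan, execute_step, replan and synthesize_results from the sections above

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator

# Tool names handed to the planner (same list as the example in section 4)
AVAILABLE_TOOLS = ["read_file", "list_directory", "run_command", "search_code", "write_file"]

class AgentState(TypedDict):
    goal: str
    plan: list[dict]
    current_step: int
    results: Annotated[list, operator.add]   # reducer: node outputs are appended
    error: str                               # last failure message, "" if none
    final: str

def planner_node(state: AgentState) -> dict:
    """Planning node: runs once on Opus"""
    plan = create_plan(state['goal'], AVAILABLE_TOOLS)
    return {"plan": plan['steps'], "current_step": 0, "error": ""}

def executor_node(state: AgentState) -> dict:
    """Execution node: runs repeatedly on Sonnet"""
    i = state['current_step']
    step = state['plan'][i]
    try:
        result = execute_step(step, str(state['plan']))
    except Exception as e:
        return {"error": str(e)}             # routed to the replanner
    done = {**step, "status": "completed", "result": result}
    plan = state['plan'][:i] + [done] + state['plan'][i + 1:]
    return {"plan": plan, "results": [result], "current_step": i + 1, "error": ""}

def replanner_node(state: AgentState) -> dict:
    """Replanning node: called only on failure"""
    failed = state['plan'][state['current_step']]
    new_plan = replan({"goal": state['goal'], "steps": state['plan']}, failed, state['error'])
    return {"plan": new_plan['steps'], "error": ""}

def synthesize_node(state: AgentState) -> dict:
    """Final node: combine the step results into one answer"""
    return {"final": synthesize_results(state['goal'], state['results'])}

def should_continue(state: AgentState) -> str:
    """Decide the next node"""
    if state['error']:
        return "replan"
    if state['current_step'] >= len(state['plan']):
        return "synthesize"
    return "execute"

# Build the graph
# The planning node is named "planner" so it does not collide with the "plan" state key
workflow = StateGraph(AgentState)
workflow.add_node("planner", planner_node)
workflow.add_node("execute", executor_node)
workflow.add_node("replan", replanner_node)
workflow.add_node("synthesize", synthesize_node)

workflow.set_entry_point("planner")
workflow.add_edge("planner", "execute")
routes = {"execute": "execute", "replan": "replan", "synthesize": "synthesize"}
workflow.add_conditional_edges("execute", should_continue, routes)
workflow.add_conditional_edges("replan", should_continue, routes)
workflow.add_edge("synthesize", END)

# A step that keeps failing cycles execute -> replan; the graph's recursion limit bounds the run
app = workflow.compile()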

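Conclusion

✅ When to choose Plan-and-Execute

Tasks of 8 or more steps with a clear procedure (about 63% lower cost in the comparison above)
Model budget needs optimizing (only the Planner on Opus, the Executor on Sonnet/Flash)
Long-running workflows where failure recovery matters (use the Replanner)
Codebase analysis, multi-file refactoring, data pipelines

✅ Core principles

Planner = expensive, strong model (once) → plan quality is everything
Executor = cheaper model (N calls) → this is where cost is optimized
Independent context per step → no accumulated history

❌ When ReAct is the better choice

Exploratory, uncertain tasks (the next action depends on the current result)
Simple tasks of 5 steps or fewer (the planning overhead is not worth it)
Conversational agents that interact with a user in real time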
