Deep Research API Practical Guide: Collaborative Planning, MCP, and File Search Integration


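cell-devlog 2026. 6. 12. 10:21

This is how to implement "just research it for me" through an API. Instead of a simple summary, you can now attach an agent that explores dozens of sources on its own and even draws charts, all in a few lines of code.

One-line summary: the Gemini Deep Research API is a managed agent that runs multi-step research asynchronously, and the April update added Collaborative Planning, MCP server integration, File Search, and chart generation.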

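What is the Deep Research API?

It is fundamentally different from a regular generate_content call. A regular API call returns a response from a single inference pass, whereas Deep Research autonomously runs dozens of search, read, and synthesize loops and produces a report with citations. Because a task takes several minutes, you must run it asynchronously in the background.

The April update introduced two versions.

deep-research-preview-04-2026      → optimized for speed and efficiency, suited to UI streaming
deep-research-max-preview-04-2026  → maximum accuracy, suited to automated context gathering

The maximum research time is 60 minutes, and most tasks finish within 20 minutes.

Note: the agent is not accessible through generate_content. You must use the Interactions API.

Installation and basic usage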

pip install google-genai
export GEMINI_API_KEY="your-api-key"

This is the simplest form. Start the task with background=True and poll for the result.

import time
from google import genai

client = genai.Client()

# Start the research task
interaction = client.interactions.create(
    input="Analysis of the 2026 AI semiconductor market and its major players",
    agent="deep-research-preview-04-2026",
    background=True,
)

print(f"Research started: {interaction.id}")

# Poll until the task completes or fails
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        # Same accessor as the later examples in this post
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Failed: {interaction.error}")
        break
    time.sleep(10)
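
The polling loop above only stops when the task reports completed or failed. Since the research time is capped at 60 minutes, a client-side deadline is a reasonable extra guard. Below is a minimal sketch, not part of the SDK: wait_for_interaction, its parameters, and the injectable sleep and clock functions are hypothetical names introduced here. It assumes only the client.interactions.get(id=...) call and the status strings used in this post.

```python
import time
from typing import Any, Callable


def wait_for_interaction(
    client: Any,
    interaction_id: str,
    timeout_s: float = 60 * 60,    # the post cites a 60-minute maximum research time
    poll_interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll until the interaction completes, fails, or the deadline passes.

    Hypothetical helper: only `client.interactions.get(id=...)` and the
    "completed" / "failed" status strings come from the post.
    """
    deadline = clock() + timeout_s
    while True:
        interaction = client.interactions.get(id=interaction_id)
        if interaction.status == "completed":
            return interaction
        if interaction.status == "failed":
            raise RuntimeError(f"Research failed: {getattr(interaction, 'error', None)}")
        if clock() >= deadline:
            raise TimeoutError(f"Interaction {interaction_id} did not finish in {timeout_s}s")
        sleep(poll_interval_s)
```

With a real client this would be called as wait_for_interaction(client, interaction.id). Passing sleep and clock as arguments keeps the helper testable without waiting on real time.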

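Collaborative Planning: set the direction before you run

Instead of diving straight in, the agent first proposes a research plan, and execution starts only after the developer has revised and approved it. This prevents you from waiting 20 minutes on research that was heading in the wrong direction.

Step 1. Request a plan

plan = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Research comparing the coding performance of GPT-5.5 and Claude Opus 4.8",
    agent_config={
        "type": "deep-research",
        "collaborative_planning": True  # return the plan only, do not execute
    },
    background=True,
)

# Stop polling on failure too, so a failed task cannot loop forever
while (result := client.interactions.get(id=plan.id)).status not in ("completed", "failed"):
    time.sleep(5)
if result.status == "failed":
    raise RuntimeError(f"Planning failed: {result.error}")

print(result.outputs[-1].text)
# Prints the research plan proposed by the agent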

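Step 2. Refine the plan

Continue the conversation with previous_interaction_id to refine the plan. As long as you keep collaborative_planning=True, the agent stays in plan-revision mode.

refined = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Add real-world developer use cases to the benchmark section. Include Korean-language sources too.",
    agent_config={
        "type": "deep-research",
        "collaborative_planning": True
    },
    previous_interaction_id=plan.id,  # link to the previous turn
    background=True,
)

# Stop polling on failure too, so a failed task cannot loop forever
while (result := client.interactions.get(id=refined.id)).status not in ("completed", "failed"):
    time.sleep(5)
if result.status == "failed":
    raise RuntimeError(f"Plan revision failed: {result.error}")

print(result.outputs[-1].text)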

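Step 3. Approve and execute

⚠️ This is the most common mistake. Sending only "Looks good, start" does not trigger execution. You must explicitly turn the flag off with collaborative_planning=False.

report = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="The plan looks good. Please start.",
    agent_config={
        "type": "deep-research",
        "collaborative_planning": False  # leave this out and nothing runs ⚠️
    },
    previous_interaction_id=refined.id,
    background=True,
)

# Stop polling on failure too, so a failed task cannot loop forever
while (result := client.interactions.get(id=report.id)).status not in ("completed", "failed"):
    time.sleep(5)
if result.status == "failed":
    raise RuntimeError(f"Research failed: {result.error}")

print(result.outputs[-1].text)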
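
The three steps differ only in the collaborative_planning flag and the previous_interaction_id link, so it can help to build the request in one place and make the flag impossible to forget. This is a minimal sketch: planning_request and its phase argument are hypothetical names. The function only builds keyword arguments in the shape used in this post, so nothing is sent until you pass them to client.interactions.create(**kwargs).

```python
from typing import Any, Dict, Optional

AGENT = "deep-research-preview-04-2026"  # agent name as used in this post


def planning_request(
    prompt: str,
    phase: str,
    previous_interaction_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build create() kwargs for one phase: "plan", "refine", or "execute"."""
    if phase not in ("plan", "refine", "execute"):
        raise ValueError(f"unknown phase: {phase!r}")
    if phase != "plan" and previous_interaction_id is None:
        raise ValueError(f"phase {phase!r} must link to the previous interaction")

    kwargs: Dict[str, Any] = {
        "agent": AGENT,
        "input": prompt,
        "agent_config": {
            "type": "deep-research",
            # Only the execute phase turns planning off; forgetting this
            # is the pitfall described in Step 3.
            "collaborative_planning": phase != "execute",
        },
        "background": True,
    }
    if previous_interaction_id is not None:
        kwargs["previous_interaction_id"] = previous_interaction_id
    return kwargs
```

For example, the Step 3 call becomes client.interactions.create(**planning_request("The plan looks good. Please start.", "execute", refined.id)).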

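MCP server integration: connecting internal data sources

Use this when you want internal databases or in-house APIs, rather than the public web, as research sources. Three authentication methods are supported; two of them (a bearer token and no authentication) appear in the examples below.

# Bearer token authentication
interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Compare our Q1 2026 revenue data against industry trends",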
    tools=[
        {
            "type": "mcp_server",
            "name": "Internal BI Server",
            "url": "https://bi.company.com/mcp",
            "headers": {"Authorization": "Bearer my-internal-token"},
        }
    ],
    background=True,
)

# No authentication (public MCP)
interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Analyze currency risk based on the latest exchange-rate data",
    tools=[
        {
            "type": "mcp_server",
            "name": "FX Data Provider",
            "url": "https://fx-public.example.com/mcp",
            # no headers = no authentication
        }
    ],
    background=True,
)

To allow only specific tools, restrict them with allowed_tools.

tools=[
    {
        "type": "mcp_server",
        "name": "Finance MCP",
        "url": "https://finance.example.com/mcp",
        "headers": {"Authorization": "Bearer token"},
        "allowed_tools": ["get_stock_price", "get_market_data"]  # only these two are allowed
    }
]
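
The MCP entries above repeat the same dict shape with small variations, so a small builder keeps them consistent. The following is a sketch under assumptions: mcp_tool and its parameters are hypothetical names, and the HTTPS check is a design choice of this sketch rather than a documented API rule. The output mirrors the dict keys shown in this post (type, name, url, headers, allowed_tools).

```python
from typing import Any, Dict, Optional, Sequence


def mcp_tool(
    name: str,
    url: str,
    bearer_token: Optional[str] = None,
    allowed_tools: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Build an mcp_server tool entry in the dict shape shown in this post."""
    if bearer_token and not url.startswith("https://"):
        # Choice made by this sketch: never attach a token to a non-HTTPS URL.
        raise ValueError("refusing to send a bearer token to a non-HTTPS URL")
    tool: Dict[str, Any] = {"type": "mcp_server", "name": name, "url": url}
    if bearer_token:
        tool["headers"] = {"Authorization": f"Bearer {bearer_token}"}
    if allowed_tools is not None:
        tool["allowed_tools"] = list(allowed_tools)
    return tool
```

In practice the token would come from the environment rather than a literal, for example mcp_tool("Internal BI Server", "https://bi.company.com/mcp", os.environ["BI_MCP_TOKEN"]), where BI_MCP_TOKEN is a made-up variable name.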

Combining tools: using web and internal data together

Deep Research can use several tools at once.

Tool             Type              Default   Description
─────────────────────────────────────────────────────────
Google Search    google_search     ✅        Public web search
URL Context      url_context       ✅        Reads web pages
Code Execution   code_execution    ✅        Calculation and data analysis
MCP Server       mcp_server        -         Remote MCP connection
File Search      file_search       -         Searches uploaded documents

# Web + internal MCP + files at the same time
interaction = client.interactions.create(
    agent="deep-research-max-preview-04-2026",
    input="Write a competitor analysis report combining industry trends with our internal reports",
    tools=[
        {"type": "google_search"},
        {"type": "url_context"},
        {"type": "code_execution"},
        {
            "type": "mcp_server",
            "name": "Internal Reports",
            "url": "https://reports.company.com/mcp",
            "headers": {"Authorization": "Bearer token"},
        },
        {"type": "file_search"},
    ],
    background=True,
)

# Internal data only (web search fully blocked)
interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input="Analyze only the internal customer feedback data. Do not search externally.",
    tools=[
        {"type": "file_search"},   # uploaded files only
        {
            "type": "mcp_server",
            "name": "CRM",
            "url": "https://crm.company.com/mcp",
            "headers": {"Authorization": "Bearer token"},
        }
        # no google_search = web search disabled
    ],
    background=True,
)
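
The internal-only example depends on leaving web tools out of the list, which is easy to break during a later edit. A pre-flight check can catch that before the request is sent. This is a sketch with hypothetical names (WEB_TOOL_TYPES, web_tools_in, require_internal_only). Treating url_context as web-facing alongside google_search is an assumption drawn from the table above, as is the idea that an empty list may fall back to the default tools.

```python
from typing import Any, Dict, List, Sequence

# Tool types this sketch treats as reaching the public web (assumption based on the table).
WEB_TOOL_TYPES = frozenset({"google_search", "url_context"})


def web_tools_in(tools: Sequence[Dict[str, Any]]) -> List[str]:
    """Return the web-facing tool types present in a tools list, in order."""
    return [t["type"] for t in tools if t.get("type") in WEB_TOOL_TYPES]


def require_internal_only(tools: Sequence[Dict[str, Any]]) -> None:
    """Raise ValueError if an internal-only request could still reach the web."""
    if not tools:
        # Assumption: with no explicit tools, the defaults marked in the table apply.
        raise ValueError("an empty tools list may fall back to the default web tools")
    found = web_tools_in(tools)
    if found:
        raise ValueError(f"web-facing tools present: {found}")
```

Calling require_internal_only(tools) right before client.interactions.create(...) turns a silent data-scope mistake into an immediate error.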

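Chart generation and real-time streaming

Setting visualization="auto" makes the agent return charts and infographics as base64 images alongside the text report. Adding thinking_summaries="auto" also lets you watch the agent's intermediate reasoning in real time.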

import base64
from google import genai

client = genai.Client()

interaction_id = None
last_event_id = None
is_complete = False

def process_stream(stream):
    global interaction_id, last_event_id, is_complete
    for chunk in stream:
        if chunk.event_type == "interaction.start":
            interaction_id = chunk.interaction.id
        if chunk.event_id:
            last_event_id = chunk.event_id

        if chunk.event_type == "content.delta":
            if chunk.delta.type == "text":
                print(chunk.delta.text, end="", flush=True)
            elif chunk.delta.type == "thought_summary":
                print(f"\n💭 {chunk.delta.content.text}", flush=True)
            elif chunk.delta.type == "image" and chunk.delta.data:
                image_bytes = base64.b64decode(chunk.delta.data)
                with open("chart_output.png", "wb") as f:
                    f.write(image_bytes)
                print("\n📊 Chart saved: chart_output.png")
        elif chunk.event_type in ("interaction.complete", "error"):
            is_complete = True

stream = client.interactions.create(
    input="Analyze market-share trends for AI coding tools. Include a vendor comparison chart.",
    agent="deep-research-preview-04-2026",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto",  # show intermediate reasoning
        "visualization": "auto",       # generate charts automatically
    },
)
process_stream(stream)

# Reconnect if the connection drops
while not is_complete and interaction_id:
    status = client.interactions.get(interaction_id)
    if status.status != "in_progress":
        break
    stream = client.interactions.get(
        id=interaction_id,
        stream=True,
        last_event_id=last_event_id,
    )
    process_stream(stream)
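
The handler above writes every image to chart_output.png, so a report with several charts keeps only the last one. A minimal sketch of one way around that is to give each image delta its own numbered file. save_chart_images, out_dir, and prefix are hypothetical names. The event shape (event_type "content.delta", delta.type "image", base64 text in delta.data) is taken from the code above, and the .png extension follows the original example rather than any inspection of the bytes.

```python
import base64
from pathlib import Path
from typing import Any, Iterable, List


def save_chart_images(chunks: Iterable[Any], out_dir: Path, prefix: str = "chart") -> List[Path]:
    """Write every image delta in a stream to its own numbered file.

    Non-image events are skipped, so this can consume the same stream
    objects that process_stream handles.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    for chunk in chunks:
        if getattr(chunk, "event_type", None) != "content.delta":
            continue
        delta = chunk.delta
        if getattr(delta, "type", None) != "image" or not getattr(delta, "data", None):
            continue
        # Number files in arrival order: chart_01.png, chart_02.png, ...
        path = out_dir / f"{prefix}_{len(saved) + 1:02d}.png"
        path.write_bytes(base64.b64decode(delta.data))
        saved.append(path)
    return saved
```

Because a stream can only be iterated once, in a real handler the same numbering logic would sit inside process_stream instead of running as a second pass.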

Multimodal input: PDFs and images as research context

If you pass a paper PDF or a screenshot directly, the agent conducts its research grounded in that document.

interaction = client.interactions.create(
    agent="deep-research-preview-04-2026",
    input=[
        {
            "type": "text",
            "text": "Verify the main claims of this paper and survey related follow-up research"
        },
        {
            "type": "document",
            "uri": "https://arxiv.org/pdf/1706.03762",  # Attention is All You Need
            "mime_type": "application/pdf"
        },
    ],
    background=True,
)
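
When several documents go into one request, building the input list by hand gets repetitive. Here is a minimal sketch of a builder that emits one text part followed by document parts in the shape shown above. research_input is a hypothetical name. Image parts are left out on purpose, because this post does not show their exact shape.

```python
from typing import Any, Dict, List, Sequence, Tuple


def research_input(prompt: str, documents: Sequence[Tuple[str, str]] = ()) -> List[Dict[str, Any]]:
    """Build a multimodal input list: one text part, then one part per document.

    Each document is a (uri, mime_type) pair, matching the "document"
    part used in the example above.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    parts: List[Dict[str, Any]] = [{"type": "text", "text": prompt}]
    for uri, mime_type in documents:
        parts.append({"type": "document", "uri": uri, "mime_type": mime_type})
    return parts
```

The result is passed straight through as input=research_input("Verify the main claims of this paper", [("https://arxiv.org/pdf/1706.03762", "application/pdf")]).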

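Constraints worth knowing

→ Custom function-calling tools are not supported yet (use an MCP server instead)
→ Structured output is not supported (only text and images are returned)
→ background=True is required (synchronous execution is not possible)
→ store=True is required whenever background=True is used
→ Maximum research time is 60 minutes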
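
Several of these constraints can be checked on the client before a request goes out. The sketch below is hypothetical throughout: check_request is a made-up name, and both the "function" tool type and the "response_format" key are assumptions about how unsupported features would show up in the kwargs, not confirmed API fields. An omitted store is left alone, on the assumption that the API default satisfies the requirement.

```python
from typing import Any, Dict, List


def check_request(kwargs: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in create() kwargs; empty means OK."""
    problems: List[str] = []
    if kwargs.get("background") is not True:
        problems.append("background=True is required")
    if kwargs.get("store") is False:
        problems.append("store=True is required when background=True")
    for tool in kwargs.get("tools") or []:
        # Assumption: custom function tools would be declared with type "function".
        if tool.get("type") == "function":
            problems.append("custom function tools are unsupported; use an MCP server")
    # Assumption: structured output would be requested via a "response_format" key.
    if "response_format" in kwargs:
        problems.append("structured output is unsupported")
    return problems
```

Running it in a unit test or just before client.interactions.create(**kwargs) surfaces a configuration mistake in milliseconds instead of after a failed background task.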

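✅ Collaborative Planning is more than a convenience feature. The Max version in particular researches so deeply that a wrong direction wastes 20 minutes. In production, it is better to always go through the planning stage.

❌ If you send only "start" without the collaborative_planning=False flag, the agent stays in plan-revision mode. The official documentation also warns about this pitfall in bold.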
