# File Search

The `file_search` tool is a **built-in tool** for searching through your uploaded documents. Unlike `internet_search` (live web) and `url_content` (public URLs), this tool searches **your private uploaded files** stored in a **File Search Store**. It uses **Google Gemini 2.5 Flash** with native `fileSearch` grounding to retrieve and answer from document content.

### How It Works

1. You (or your system) upload documents to a **File Search Store** (via a separate upload API)
2. You pass the store name(s) in the `file_search_store_names` field of your request
3. The LLM receives the store names via an injected system message and calls the `file_search` tool automatically
4. Gemini 2.5 Flash searches the store using semantic/vector search and returns grounded answers with citations
5. The answer is returned as a standard chat completion
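The steps above boil down to one OpenAI-compatible HTTP request that carries the extra `file_search_store_names` field. A minimal sketch using only Python's standard library — the gateway URL and API key are placeholders, and the store name is the example used throughout this guide:

```python
import json
import urllib.request

# Request body: a standard chat completion plus the non-standard
# file_search_store_names field that activates the file_search tool.
body = {
    "model": "google/gemini-2.5-flash",
    "file_search_store_names": ["fileSearchStores/defaultstore-31ihhh9rmyef"],
    "messages": [{"role": "user", "content": "Summarize the uploaded document"}],
}

req = urllib.request.Request(
    "https://your-gateway.example/v1/chat/completions",  # placeholder endpoint
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # sends the request to a live gateway
```

The response comes back as a standard chat completion, so existing response-parsing code needs no changes.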

#### Key Differences from Other Tools

| Tool              | Trigger                                     | Source                              |
| ----------------- | ------------------------------------------- | ----------------------------------- |
| `internet_search` | Auto (from prompt)                          | Live public web via Perplexity      |
| `url_content`     | Auto (URL in message)                       | Specific public URL via Gemini      |
| `file_search`     | Auto + `file_search_store_names` in request | **Your private uploaded documents** |

### Use Cases & Examples

#### 1. Summarize an Uploaded Document

```json
{
  "model": "google/gemini-2.5-flash",
  "file_search_store_names": ["fileSearchStores/defaultstore-31ihhh9rmyef"],
  "messages": [
    { "role": "user", "content": "Summarize the uploaded document" }
  ]
}
```

#### 2. Ask a Specific Question from Documents

```json
{
  "model": "google/gemini-2.5-flash",
  "file_search_store_names": ["fileSearchStores/defaultstore-31ihhh9rmyef"],
  "messages": [
    { "role": "user", "content": "What is the definition of molarity and how is it calculated?" }
  ]
}
```

#### 3. Search Multiple Stores Simultaneously

```json
{
  "model": "google/gemini-2.5-flash",
  "file_search_store_names": [
    "fileSearchStores/q1-reports",
    "fileSearchStores/q2-reports"
  ],
  "messages": [
    { "role": "user", "content": "Compare Q1 and Q2 revenue figures from the reports" }
  ]
}
```

#### 4. Contract / Legal Document Analysis

```json
{
  "model": "google/gemini-2.5-flash",
  "file_search_store_names": ["fileSearchStores/contracts-store"],
  "messages": [
    { "role": "user", "content": "What are the termination clauses in the uploaded contract?" }
  ]
}
```

#### 5. With a System Prompt

```json
{
  "model": "google/gemini-2.5-flash",
  "file_search_store_names": ["fileSearchStores/support-docs"],
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful support agent. Only answer based on the uploaded documentation."
    },
    { "role": "user", "content": "How do I upgrade my plan?" }
  ]
}
```

### Tips & Best Practices

| Tip                            | Details                                                                                 |
| ------------------------------ | --------------------------------------------------------------------------------------- |
| **Always pass store names**    | `file_search_store_names` must be in every request — the tool won't activate without it |
| **Pass on every turn**         | In multi-turn conversations, include the field on each request                          |
| **Multiple stores**            | Pass an array to search across multiple document collections at once                    |
| **Any model works**            | Store names are injected as a system message — model-agnostic                           |
| **Be specific**                | "What is the refund policy?" retrieves more precisely than "Tell me about the document" |
| **Combine with system prompt** | Use a system prompt to constrain the model to only answer from the documents            |
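The "pass on every turn" tip is easy to enforce with a small helper that stamps the field onto every request body. This is a hypothetical sketch (the `build_turn` helper and store name are illustrative, not part of the API):

```python
STORE_NAMES = ["fileSearchStores/support-docs"]  # placeholder store name

def build_turn(history, user_message):
    """Append the new user message and return (updated history, request body).
    The store names are repeated on every turn, per the tip above."""
    history = history + [{"role": "user", "content": user_message}]
    body = {
        "model": "google/gemini-2.5-flash",
        "file_search_store_names": STORE_NAMES,  # included on each request
        "messages": history,
    }
    return history, body

history = []
history, turn1 = build_turn(history, "How do I upgrade my plan?")
# ...append the assistant's reply to history, then ask the follow-up:
history, turn2 = build_turn(history, "And how do I downgrade later?")
```

Without the field on the follow-up request, the second turn would answer from model knowledge alone instead of the uploaded documents.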

***

### Limitations

* **Requires pre-uploaded documents** — you must upload files to a store before querying (separate upload flow)
* **Stale store names fail gracefully** — if a store is deleted or renamed, queries return an error message rather than crashing
* **No raw chunk retrieval** — you get a synthesized answer, not a list of raw document excerpts
* **No filtering by file** — all files in the store are searched together; you can't target a specific file within a store
* **`file_search_store_names` is not a standard OpenAI field** — OpenAI SDK clients must use `extra_body` (Python) or `// @ts-ignore` (TypeScript) to pass it
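For Python SDK clients, that last point looks like the sketch below. The SDK call itself is shown commented out since it needs the `openai` package and a live endpoint; the base URL, key, and store name are all placeholders:

```python
# file_search_store_names is not a standard OpenAI field, so the Python SDK
# rejects it as a named argument. extra_body merges it into the JSON body.
extra = {"file_search_store_names": ["fileSearchStores/support-docs"]}  # placeholder

# With the openai package installed and a live gateway:
# from openai import OpenAI
# client = OpenAI(base_url="https://your-gateway.example/v1", api_key="YOUR_KEY")
# resp = client.chat.completions.create(
#     model="google/gemini-2.5-flash",
#     messages=[{"role": "user", "content": "How do I upgrade my plan?"}],
#     extra_body=extra,  # merged alongside the standard fields
# )
print(extra["file_search_store_names"])
```

TypeScript clients face the same issue: the field must be added to the request object past the SDK's type definitions, e.g. with `// @ts-ignore`.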

