diff --git a/docs/pages/product/apis-integrations.mdx b/docs/pages/product/apis-integrations.mdx
index e0fdfb05f8e14..15bc385d12cc2 100644
--- a/docs/pages/product/apis-integrations.mdx
+++ b/docs/pages/product/apis-integrations.mdx
@@ -12,7 +12,7 @@ applications.
 Despite varying protocols and query formats, all data APIs share common
 [querying concepts][ref-queries].
 
-<Diagram src="https://ucarecdn.com/4abc9729-66a3-489f-9d48-0bf27e728c87/" />
+<Diagram src="https://ucarecdn.com/aa9b3201-d254-4a38-a4a9-b3786af6167b/" />
 
 Also, there are [management APIs](#management-apis) to control Cube deployments
 externally.
@@ -38,8 +38,6 @@ Otherwise, connect via the [SQL API][ref-sql-api] directly.
 [REST API][ref-rest-api] or [GraphQL API][ref-graphql-api]. When using the REST API,
 the [JavaScript SDK][ref-js-sdk] can simplify integration with your front-end code.
 
-* For AI use cases, use the [AI API][ref-ai-api].
-
 <ReferenceBox>
 
 See this [GitHub issue](https://github.com/cube-js/cube/issues/1744#issuecomment-2291680777)
@@ -67,7 +65,7 @@ tools][ref-viz-tools]:
 | [User name and password][ref-auth-user-pass] | [DAX API][ref-dax-api]<br/>[MDX API][ref-mdx-api]<br/>[Semantic Layer Sync][ref-sls]<br/>[SQL API][ref-sql-api] |
 | [Kerberos][ref-auth-kerberos] and [NTLM][ref-auth-ntlm] | [DAX API][ref-dax-api]<br/>[MDX API][ref-mdx-api] |
 | [Identity provider][ref-auth-idp] | [Cube Cloud for Excel][ref-cube-cloud-for-excel]<br/>[Cube Cloud for Sheets][ref-cube-cloud-for-sheets] |
-| [JSON Web Token][ref-auth-jwt] | [REST API][ref-rest-api]<br/>[GraphQL API][ref-graphql-api]<br/>[AI API][ref-ai-api] |
+| [JSON Web Token][ref-auth-jwt] | [REST API][ref-rest-api]<br/>[GraphQL API][ref-graphql-api] |
 
 ## Management APIs
 
@@ -84,7 +82,6 @@ API][ref-orchestration-api].
 [ref-rest-api]: /product/apis-integrations/rest-api
 [ref-graphql-api]: /product/apis-integrations/graphql-api
 [ref-mdx-api]: /product/apis-integrations/mdx-api
-[ref-ai-api]: /product/apis-integrations/ai-api
 [ref-orchestration-api]: /product/apis-integrations/orchestration-api
 [ref-sls]: /product/apis-integrations/semantic-layer-sync
 [ref-js-sdk]: /product/apis-integrations/javascript-sdk
diff --git a/docs/pages/product/apis-integrations/_meta.js b/docs/pages/product/apis-integrations/_meta.js
index 1061f3fdcd807..0da2dbae8d9d8 100644
--- a/docs/pages/product/apis-integrations/_meta.js
+++ b/docs/pages/product/apis-integrations/_meta.js
@@ -8,7 +8,6 @@ module.exports = {
   "sql-api": "SQL API",
   "rest-api": "REST API",
   "graphql-api": "GraphQL API",
-  "ai-api": "AI API",
   "javascript-sdk": "JavaScript SDK",
   "orchestration-api": "Orchestration API",
 };
diff --git a/docs/pages/product/apis-integrations/ai-api.mdx b/docs/pages/product/apis-integrations/ai-api.mdx
deleted file mode 100644
index 06e0a8f76e553..0000000000000
--- a/docs/pages/product/apis-integrations/ai-api.mdx
+++ /dev/null
@@ -1,391 +0,0 @@
-# AI API
-
-The AI API provides a standard interface for interacting with large language models (LLMs) as a turnkey solution for text-to-semantic layer queries.
-
-Specifically, you can send the AI API a message (or conversation of messages) and it will return a Cube REST API query. Optionally, it will also run the query and return the results.
-
-<WarningBox>
-  The AI API is available on [Cube
-  Cloud](/getting-started#getting-started-with-cube-cloud) only. It is currently
-  in preview and should not be used for production workloads.
-</WarningBox>
-
-See [AI API reference][ref-ref-ai-api] for the list of supported API endpoints.
-
-<YouTubeVideo url="https://www.youtube.com/embed/Qpg4RxqndnE"/>
-
-## Configuration
-
-While the AI API is in preview, your Cube account team will enable and configure it for you.
-
-If you wish to enable or disable the AI API on a specific Cube deployment, you may do so by going to "Settings" in the Cube Cloud sidebar, then "Configuration", and then toggling the "AI API" configuration flag switch.
-
-To find your AI API endpoint in Cube Cloud, go to the <Btn>Overview</Btn> page,
-click <Btn>API credentials</Btn>, and choose the <Btn>AI API</Btn> tab.
-
-## Getting Started
-
-### Data modeling
-
-The AI API currently requires [views](/reference/data-model/view) in order to generate queries. This is because:
-
-1. Views let you create carefully-curated datasets, resulting in better outputs from LLMs. That is, you can choose exactly what is "ready" for the AI to see and what is not.
-2. Views define deterministic joins between Cubes, so the LLM does not have to "guess" at join ordering
-
-To use the AI API, set up one or more views before getting started.
-
-<InfoBox>
-  By default, the AI API syncs data model changes hourly. To manually trigger a
-  sync, go to "Settings" in the Cube Cloud sidebar, then "Data Catalog
-  Services", then hit "Sync" on the Cube connection.
-</InfoBox>
-
-### Authentication
-
-Authentication works the same as for the [REST API](/product/apis-integrations/rest-api#authentication).
-
-The API Token is passed via the Authorization Header. The token itself is a
-[JSON Web Token](https://jwt.io), the [Security section](/product/auth) describes
-how to generate it.
-
-### Example request
-
-Given the data model from the ["data modeling" section](#data-modeling) above, you could send a request with the following body:
-
-```json
-{
-  "messages": [
-    {
-      "role": "user",
-      "content": "Where do we have the highest aov this year?"
-    }
-  ]
-}
-```
-
-Based on the view(s) provided, the AI API generates a Cube REST API request that could be used to answer the user's question. For example, you might receive the following response:
-
-```json
-{
-  "message": "To find where we have the highest Average Order Value (AOV) this year, we can analyze the data by comparing the AOV across different dimensions such as cities or states.",
-  "cube_query": {
-    "measures": ["orders_view.average_order_value"],
-    "dimensions": ["orders_view.users_city"],
-    "timeDimensions": [
-      {
-        "dimension": "orders_view.created_at",
-        "dateRange": "this year"
-      }
-    ],
-    "order": {
-      "orders_view.average_order_value": "desc"
-    },
-    "limit": 10
-  }
-}
-```
-
-See [running queries](#running-queries) for details on how to run the Cube query generated.
-
-### Running queries
-
-You have two possible ways to run the query:
-
-#### 1. `runQuery` parameter
-
-Use the `runQuery` request parameter to have the AI API run the query and report results back. When doing this, the request above would become:
-
-```json
-{
-  "messages": [
-    {
-      "role": "user",
-      "content": "Where do we have the highest aov this year?"
-    }
-  ],
-  "runQuery": true
-}
-```
-
-The response will be the same as above, possibly followed by a second JSON object representing the response (see the [REST API reference](/product/apis-integrations/rest-api/reference#v1load) for its format).
-
-<WarningBox>
-  In some cases, the AI API will not generate a query, i.e. there will be no `cube_query` key in the first JSON object. 
-  When that happens, there will be no second object generated, as there are no results to show. This is expected and may
-  occur when the model needs more information or doesn't have the necessary fields to run the requested query.
-</WarningBox>
-
-<InfoBox>
-  Note that if the AI API generated a query, the response now contains two JSON objects separated by a newline
-  (`\n`). You are responsible for parsing these appropriately.
-</InfoBox>
-
-#### 2. `/load`
-
-Alternatively, you may take the generated `cube_query` from the response and then call the [REST API `/load` endpoint](/product/apis-integrations/rest-api/reference#v1load) with it in the `/load` request body. This is recommended for advanced use-cases where you need more control over formatting, pagination, etc. or if you are adding the AI API to an existing Cube REST API implementation.
-
-### Error Handling
-
-Occasionally you may encounter errors. There are a few common categories of errors:
-
-#### 1. Cannot answer question
-
-If the AI API is unable to generate a query because the view(s) in your data model do not have the appropriate fields to answer the question, you will receive a message like the following, and no `cube_query` in the response:
-
-```
-{
-    "message": "I'm sorry, but the current data modeling doesn't cover stock prices or specific company data like NVDA. I will notify the data engineering team about this request."
-}
-```
-
-#### 2. Invalid query
-
-Occasionally, the AI API may generate a query that is invalid or cannot be run. When this happens, you will receive an error upon running the query.
-
-One way of handling this is to pass the error message back into the AI API; it may then self-correct and provide a new, valid query.
-
-#### 3. Continue wait
-
-When using `"runQuery": true`, you might sometimes receive a query result containing `{ "error": "Continue wait" }`. If this happens, you should use `/load` ([described above](#2-load)) instead of `runQuery` to run the query, and handle retries as described in the [REST API documentation](/product/apis-integrations/rest-api#continue-wait).
-
-## Advanced Usage
-
-<InfoBox>
-    The advanced features discussed here are available on Cube version 1.1.7 and above.
-</InfoBox>
-
-### Custom prompts
-
-You can prompt the AI API with custom instructions. For example, you may want it to always
-respond in a particular language, or to refer to itself by a name matching your brand.
-Custom prompts also allow you to give the model more context on your company and data model,
-for example if it should usually prefer a particular view.
-
-To use a custom prompt, set the `CUBE_CLOUD_AI_API_PROMPT` environment variable in your deployment.
-
-<InfoBox>
-  Custom prompts add to, rather than overwrite, the AI API's existing prompting, so you
-  do not need to re-write instructions around how to generate the query itself.
-</InfoBox>
-
-### Meta tags
-
-The AI API can read [meta tags](/reference/data-model/view#meta) on your dimensions, measures, 
-segments, and views.
-
-Use the `ai` meta tag to give context that is specific to AI and goes beyond what is 
-included in the description. This can have any keys that you want. For example, you can use it
-to give the AI context on possible values in a categorical dimension:
-```yaml
-      - name: status
-        sql: status
-        type: string
-        meta:
-          ai:
-            values:
-              - shipped
-              - processing
-              - completed
-```
-
-### Value search
-
-By default, the AI API has no ability to see the contents of your data (for privacy reasons).
-However, this makes it difficult for the AI API to generate correct filters for some queries.
-
-Imagine you have a categorical `order_status` dimension with the possible values "shipped",
-"processing", and "completed". Without value search, asking "how many complete orders did
-we have today" might get you a query filtering on `order_status = 'Complete'` instead of
-the correct `order_status = 'completed'`.
-
-To solve this, the AI API can perform "value searches" where it introspects the values in
-selected categorical dimensions before running a query. Value search is opt-in and dimensions
-must be enabled for it individually. Currently, the AI API performs value search by running
-Cube queries using the `contains` filter operator against one or more chosen dimensions.
-The LLM will select dimensions from among those you have based on the question asked and
-generate possible values dynamically.
-
-<InfoBox>
-  When running value search queries, the AI API passes through the security context used
-  for the AI API request, so security is maintained and only dimensions the end user has
-  access to are able to be searched.
-</InfoBox>
-
-To enable value search on a dimension, set the `searchable` field to true under the `ai`
-meta tag, as shown below:
-```yaml
-    - name: order_status
-      sql: order_status
-      type: string
-      meta:
-        ai:
-          searchable: true
-```
-
-Note that enabling Value Search may lead to slightly longer AI API response times when it
-is used but should result in significantly more accurate queries in many situations. Value
-Search can only be used on string dimensions.
-
-### Other LLM providers
-
-<InfoBox>
-  These environment variables also apply to the [AI Assistant](/product/workspace/ai-assistant),
-  if it is enabled on your deployment.
-</InfoBox>
-
-If desired, you may "bring your own" LLM model by providing a model and API credentials
-for a supported model provider. Do this by setting environment variables in your Cube
-deployment.
-
-- `CUBE_CLOUD_AI_COMPLETION_MODEL` - The AI model name to use (varies based on provider). For example `gpt-4o`.
-- `CUBE_CLOUD_AI_COMPLETION_PROVIDER` - The provider. Must be one of the following:
-  - `amazon-bedrock`
-  - `anthropic`
-  - `azure`
-  - `cohere`
-  - `databricks`
-  - `deepseek`
-  - `fireworks`
-  - `google-generative-ai`
-  - `google-vertex-ai`
-  - `google-vertex-ai-anthropic`
-  - `groq`
-  - `mistral`
-  - `openai`
-  - `openai-compatible` (any provider with an OpenAI-compatible API; support may vary)
-  - `snowflake`
-  - `together-ai`
-  - `x-ai`
-
-See below for required variables by provider (required unless noted):
-
-#### AWS Bedrock
-
-<WarningBox>
-  The AI API currently supports only Anthropic Claude models on AWS Bedrock.
-  Other models may work but are not fully supported.
-</WarningBox>
-
-- `CUBE_CLOUD_AI_AWS_ACCESS_KEY_ID` - An access key for an IAM user with `InvokeModelWithResponseStream` permissions on the desired region/model.
-- `CUBE_CLOUD_AI_AWS_SECRET_ACCESS_KEY` - The corresponding access secret
-- `CUBE_CLOUD_AI_AWS_REGION` - A supported AWS Bedrock region, for example `us-west-2`
-- `CUBE_CLOUD_AI_AWS_SESSION_TOKEN` - The session token (optional)
-
-#### Anthropic
-
-- `CUBE_CLOUD_AI_ANTHROPIC_API_KEY`
-- `CUBE_CLOUD_AI_ANTHROPIC_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### Microsoft Azure OpenAI
-
-- `CUBE_CLOUD_AI_AZURE_RESOURCE_NAME`
-- `CUBE_CLOUD_AI_AZURE_API_KEY`
-- `CUBE_CLOUD_AI_AZURE_API_VERSION` (optional)
-- `CUBE_CLOUD_AI_AZURE_BASE_URL` (optional)
-
-#### Cohere
-
-- `CUBE_CLOUD_AI_COHERE_API_KEY`
-- `CUBE_CLOUD_AI_COHERE_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### Databricks
-
-<InfoBox>
-  The AI API uses [Databricks Foundation Model APIs](https://docs.databricks.com/aws/en/large-language-models/llm-serving-intro). 
-  Currently only `databricks-claude-3-7-sonnet` is supported, although other models may also work.
-</InfoBox>
-
-- `CUBE_CLOUD_AI_DATABRICKS_HOST` - for example, `your-instance-id.cloud.databricks.com` (do not include `https://`)
-- `CUBE_CLOUD_AI_DATABRICKS_TOKEN` - your personal access token
-
-#### DeepSeek
-
-- `CUBE_CLOUD_AI_DEEPSEEK_API_KEY`
-- `CUBE_CLOUD_AI_DEEPSEEK_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### Fireworks
-
-- `CUBE_CLOUD_AI_FIREWORKS_API_KEY`
-- `CUBE_CLOUD_AI_FIREWORKS_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### Google Generative AI
-
-- `CUBE_CLOUD_AI_GOOGLE_GENERATIVE_AI_API_KEY`
-- `CUBE_CLOUD_AI_GOOGLE_GENERATIVE_AI_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### GCP Vertex AI
-
-<WarningBox>
-  See <Btn>Google Vertex AI (Anthropic)</Btn> below if using Anthropic models
-</WarningBox>
-
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_PROJECT`
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_LOCATION`
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_CREDENTIALS`
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_PUBLISHER` - defaults to `google`; change if using another publisher (optional)
-
-#### GCP Vertex AI (Anthropic)
-
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_PROJECT`
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_LOCATION`
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_CREDENTIALS`
-- `CUBE_CLOUD_AI_GOOGLE_VERTEX_ANTHROPIC_PUBLISHER` - defaults to `anthropic`; change if using another publisher (optional)
-
-#### Groq
-
-- `CUBE_CLOUD_AI_GROQ_API_KEY`
-- `CUBE_CLOUD_AI_GROQ_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### Mistral
-
-- `CUBE_CLOUD_AI_MISTRAL_API_KEY`
-- `CUBE_CLOUD_AI_MISTRAL_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### OpenAI
-
-- `CUBE_CLOUD_AI_OPENAI_API_KEY`
-- `CUBE_CLOUD_AI_OPENAI_ORGANIZATION` - (optional)
-- `CUBE_CLOUD_AI_OPENAI_PROJECT` - (optional)
-- `CUBE_CLOUD_AI_OPENAI_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### OpenAI Compatible Providers
-
-<InfoBox>
-  Use this provider if your provider is not listed on this page but provides an
-  OpenAI compatible endpoint. Not all providers/models are supported.
-</InfoBox>
-
-- `CUBE_CLOUD_AI_OPENAI_COMPATIBLE_API_KEY`
-- `CUBE_CLOUD_AI_OPENAI_COMPATIBLE_BASE_URL`
-
-#### Snowflake Cortex
-
-<WarningBox>
-  We recommend using `claude-3-5-sonnet` (or any newer Claude models available)
-  on Snowflake Cortex with the Cube AI API. Other models may work but are not fully tested or supported.
-</WarningBox>
-
-<InfoBox>
-The Snowflake Cortex LLM REST API uses key pair authentication. 
-Please follow the steps in [Snowflake's documentation](https://docs.snowflake.com/en/user-guide/key-pair-auth#configuring-key-pair-authentication) to generate
-a key and assign it to a Snowflake user.
-
-We recommend creating a separate Snowflake user with limited permissions for
-use with the Cube AI API.
-</InfoBox>
-
-- `CUBE_CLOUD_AI_SNOWFLAKE_ACCOUNT`
-- `CUBE_CLOUD_AI_SNOWFLAKE_USERNAME`
-- `CUBE_CLOUD_AI_SNOWFLAKE_PRIVATE_KEY`
-
-#### Together AI
-
-- `CUBE_CLOUD_AI_TOGETHER_API_KEY`
-- `CUBE_CLOUD_AI_TOGETHER_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-#### xAI (Grok)
-
-- `CUBE_CLOUD_AI_X_AI_API_KEY`
-- `CUBE_CLOUD_AI_X_AI_BASE_URL` - uses a different URL prefix for API calls, such as if you are using behind a proxy (optional)
-
-[ref-ref-ai-api]: /product/apis-integrations/ai-api/reference
diff --git a/docs/pages/product/apis-integrations/ai-api/_meta.js b/docs/pages/product/apis-integrations/ai-api/_meta.js
deleted file mode 100644
index 2ece3a20a620f..0000000000000
--- a/docs/pages/product/apis-integrations/ai-api/_meta.js
+++ /dev/null
@@ -1,4 +0,0 @@
-module.exports = {
-  "privacy-security": "Privacy and Security",
-  "reference": "Reference"
-};
diff --git a/docs/pages/product/apis-integrations/ai-api/privacy-security.mdx b/docs/pages/product/apis-integrations/ai-api/privacy-security.mdx
deleted file mode 100644
index e755b4992b9a5..0000000000000
--- a/docs/pages/product/apis-integrations/ai-api/privacy-security.mdx
+++ /dev/null
@@ -1,28 +0,0 @@
-# Privacy and Security
-
-With Cube’s AI API, your credentials are never shared with AI, and neither is the connection to your data store. All access to the AI API is governed by the same security context as anything else in Cube Cloud.
-
-## Data Retention Policy
-
-By default, the Cube AI API uses Anthropic models via GCP VertexAI. Your data isn’t used by Google or Anthropic to train models or improve products.
-
-- Google does not retain customer data or use it for training or model improvement purposes.
-- Usage is governed by the [Anthropic on Vertex Commercial Terms of Service][ref-anthropic-tos], which specify that Anthropic does not receive access to prompts or outputs and may not train models on customer data.
-
-## Dynamic grounding with secure data retrieval
-
-- Relevant information from your Cube semantic layer is merged with the prompt to provide context.
-- The metadata available for grounding the prompt is limited to the permissions of the user executing the prompt.
-- Secure data retrieval preserves in place all standard Cube role-based controls for user permissions and column/row level access when merging grounding data from your Cube semantic layer.
-
-## Prompt Defense
-
-- Context provided by the semantic layer limits hallucinations by the LLM.
-- LLMs interface with existing Cube APIs further constraining their ability and limiting hallucinations, whilst providing enhanced transparency
-
-## Data Masking
-
-- Data masking policies enforced by Cube are also enforced in AI API usage.
-- You can configure what must and must not be masked in the Cube Semantic Layer.
-
-[ref-anthropic-tos]: https://www-cdn.anthropic.com/471bd07290603ee509a5ea0d5ccf131ea5897232/anthropic-vertex-commercial-terms-march-2024.pdf
diff --git a/docs/pages/product/apis-integrations/ai-api/reference.mdx b/docs/pages/product/apis-integrations/ai-api/reference.mdx
deleted file mode 100644
index f1f304a7e818b..0000000000000
--- a/docs/pages/product/apis-integrations/ai-api/reference.mdx
+++ /dev/null
@@ -1,183 +0,0 @@
-# AI API reference
-
-The [AI API](/product/apis-integrations/ai-api) provides the following endpoints.
-
-## `/query/completions`
-
-Generate a Cube query that can be used to answer a user's question, and (optionally) run the query and return its results.
-
-| Parameter  | Required | Description                                                                                     |
-| ---------- | -------- | ----------------------------------------------------------------------------------------------- |
-| `messages` | ✅ Yes   | An array of messages in the format: `{ "role": "user" \| "assistant", "content": "string" }`    |
-| `views`    |          | An array of view names (used to limit the views that the AI API can use to generate its answer) |
-| `runQuery` |          | Boolean (true or false) whether to run the query and return its results                         |
-| `options`  |          | An object in the format `{ "chart": true \| false }`
-
-Response
-
-- `message` - A message from the AI assistant describing the query, how it was chosen, why it could not generate the requested query, etc.
-- `cube_query` - A Cube [Query](/product/apis-integrations/rest-api/query-format) that could be used to answer the given question
-- `chart` - If the `chart` option is set to `true`, an object containing a chart spec for the generated query in the following format:
-    ```json
-    {
-        "type": "bar" | "line" | "pie" | "table" | "area" | "scatter",
-        "x": string,
-        "y": string[],
-        "pivot": string // optional; the field to pivot by, if any
-    }
-    ```
-
-### Examples
-
-#### Without `runQuery`
-
-Example request:
-
-```bash
-curl \
- -X POST  \
- -H "Content-Type: application/json" \
- -H "Authorization: EXAMPLE-API-TOKEN" \
- --data '{ "messages": [{ "role": "user", "content": "What cities have the highest aov this year?" }], "views": ["orders_view"] }' \
- https://YOUR_CUBE_API/cubejs-api/v1/ai/query/completions
-```
-
-Example response:
-
-```json
-{
-  "message": "To find the cities with the highest Average Order Value (AOV) this year, we can use the Orders View. This query will aggregate data to calculate the average order value per city for the current year.",
-  "cube_query": {
-    "measures": ["orders_view.average_order_value"],
-    "dimensions": ["orders_view.users_city"],
-    "timeDimensions": [
-      {
-        "dimension": "orders_view.created_at",
-        "granularity": "year",
-        "dateRange": "this year"
-      }
-    ],
-    "order": {
-      "orders_view.average_order_value": "desc"
-    },
-    "limit": 10
-  }
-}
-```
-
-#### With `runQuery`
-
-```bash
-curl \
- -X POST  \
- -H "Content-Type: application/json" \
- -H "Authorization: EXAMPLE-API-TOKEN" \
- --data '{ "messages": [{ "role": "user", "content": "What cities had the highest aov last year?" }], "runQuery": true}' \
- https://YOUR_CUBE_API/cubejs-api/v1/ai/query/completions
-```
-
-Example response:
-
-```json
-{
-    "message": "To find the city with the highest average order value for last year, we'll analyze the data by city and calculate the average order value for each. The query will group the results by users' city and sort them to identify the city with the highest average order value.",
-    "cube_query": {
-        "measures": [
-            "orders_view.average_order_value"
-        ],
-        "dimensions": [
-            "orders_view.users_city"
-        ],
-        "timeDimensions": [
-            {
-                "dimension": "orders_view.created_at",
-                "dateRange": "last year",
-                "granularity": "year"
-            }
-        ],
-        "order": {
-            "orders_view.average_order_value": "desc"
-        },
-        "limit": 1
-    }
-}
-{
-    "query": {
-        "measures": [
-            "orders_view.average_order_value"
-        ],
-        "dimensions": [
-            "orders_view.users_city"
-        ],
-        "timeDimensions": [
-            {
-                "dimension": "orders_view.created_at",
-                "dateRange": [
-                    "2023-01-01T00:00:00.000",
-                    "2023-12-31T23:59:59.999"
-                ],
-                "granularity": "year"
-            }
-        ],
-        "order": [
-            {
-                "id": "orders_view.average_order_value",
-                "desc": true
-            }
-        ],
-        "limit": 1,
-        "timezone": "UTC",
-        "filters": [],
-        "rowLimit": 1
-    },
-    "data": [
-        {
-            "orders_view.users_city": "San Francisco",
-            "orders_view.created_at.year": "2023-01-01T00:00:00.000",
-            "orders_view.created_at": "2023-01-01T00:00:00.000",
-            "orders_view.average_order_value": "322.619048"
-        }
-    ],
-    "lastRefreshTime": "2024-05-08T18:24:14.623Z",
-    "annotation": {
-        "measures": {
-            "orders_view.average_order_value": {
-                "title": "Orders View Average Order Value",
-                "shortTitle": "Average Order Value",
-                "type": "number",
-                "drillMembers": [],
-                "drillMembersGrouped": {
-                    "measures": [],
-                    "dimensions": []
-                }
-            }
-        },
-        "dimensions": {
-            "orders_view.users_city": {
-                "title": "Orders View Users City",
-                "shortTitle": "Users City",
-                "type": "string"
-            }
-        },
-        "segments": {},
-        "timeDimensions": {
-            "orders_view.created_at.year": {
-                "title": "Orders View Created at",
-                "shortTitle": "Created at",
-                "type": "time"
-            },
-            "orders_view.created_at": {
-                "title": "Orders View Created at",
-                "shortTitle": "Created at",
-                "type": "time"
-            }
-        }
-    },
-    "dataSource": "default",
-    "dbType": "snowflake",
-    "extDbType": "cubestore",
-    "external": false,
-    "slowQuery": false,
-    "total": null
-}
-```
diff --git a/docs/pages/product/configuration/visualization-tools.mdx b/docs/pages/product/configuration/visualization-tools.mdx
index 59d24d6c855bf..82242e684a8f1 100644
--- a/docs/pages/product/configuration/visualization-tools.mdx
+++ b/docs/pages/product/configuration/visualization-tools.mdx
@@ -257,9 +257,4 @@ out REST and GraphQL APIs.
     imageUrl="https://static.cube.dev/icons/mdx.svg"
     title="MDX API"
   />
-  <GridItem
-    url="/product/apis-integrations/ai-api"
-    imageUrl="https://static.cube.dev/icons/ai.svg"
-    title="AI API"
-  />
 </Grid>
diff --git a/docs/pages/product/deployment/cloud/pricing.mdx b/docs/pages/product/deployment/cloud/pricing.mdx
index ce5f77de102f4..e9017aba921c0 100644
--- a/docs/pages/product/deployment/cloud/pricing.mdx
+++ b/docs/pages/product/deployment/cloud/pricing.mdx
@@ -114,14 +114,6 @@ of deployments within a Cube Cloud account. The consumption is measured in 5-min
 | [Query History][ref-query-history] | <nobr>0..20</nobr> | Depends on a [chosen tier](#query-history-tiers) |
 | [Monitoring Integrations][ref-monitoring-integrations] | <nobr>1..4</nobr> | Depends on a [chosen tier](#monitoring-integrations-tiers) |
 
-The following resource types incur CCU consumption and apply to _individual requests_
-to deployments within a Cube Cloud account:
-
-| Resource type | CCUs per request | Notes |
-| --- | :---: | --- |
-| [AI API][ref-ai-api] | <nobr>0..1</nobr> | Depends on [configuration](#ai-requests-consumption) |
-| [AI Assistant][ref-ai-assistant] | <nobr>0..1</nobr> | Depends on [configuration](#ai-requests-consumption) |
-
 The following resource types incur CCU consumption and apply to the _whole Cube Cloud
 account_:
 
@@ -200,16 +192,6 @@ You can upgrade to a chosen tier in the
 You can [upgrade][ref-monitoring-integrations-config] to a chosen tier in the
 <Btn>Settings</Btn> of your deployment.
 
-### AI requests consumption
-
-[AI API][ref-ai-api] and [AI Assistant][ref-ai-assistant] consume CCUs per request apart from
-[Enterprise and above][cube-pricing] product tiers where customers can provide their own suitable
-LLM if wanted and then will be exempt from this charge:
-
-| <nobr>CCUs per request</nobr> |
-| :------------------------: |
-| 1                          |
-
 ### Audit Log tiers
 
 [Audit Log][ref-audit-log] collects, stores, and displays security-related events
@@ -365,6 +347,4 @@ product tier level. Payments are non-refundable.
 [ref-data-at-rest-encryption]: /product/caching/running-in-production#data-at-rest-encryption
 [ref-customer-managed-keys]: /product/workspace/encryption-keys
 [ref-semantic-catalog]: /product/workspace/semantic-catalog
-[ref-ai-api]: /product/apis-integrations/ai-api
-[ref-ai-assistant]: /product/workspace/ai-assistant
 [ref-query-history-export]: /product/workspace/monitoring#query-history-export
\ No newline at end of file
diff --git a/docs/pages/product/workspace.mdx b/docs/pages/product/workspace.mdx
index 6e8c8d7993a42..2ec185dac51d5 100644
--- a/docs/pages/product/workspace.mdx
+++ b/docs/pages/product/workspace.mdx
@@ -42,7 +42,6 @@ encryption in Cube Store][ref-cube-store-encryption].
 - Use [Budgets][ref-budgets] to control the usage and spend of your Cube
   Cloud account.
 - Use [Preferences][ref-prefs] to adjust the workspace to your liking.
-- Use [AI Assistant][ref-ai-assistant] to explore and query data with natural language.
 - Use [Semantic Catalog][ref-semantic-catalog] to search a unified view of connected data assets, see lineage, and explore connected BI content.
 
 ## Workspace tools in Cube Core
@@ -74,7 +73,6 @@ With Cube Core, you can:
 [ref-budgets]: /product/workspace/budgets
 [ref-prefs]: /product/workspace/preferences
 [ref-cli]: /product/workspace/cli
-[ref-ai-assistant]: /product/workspace/ai-assistant
 [ref-semantic-catalog]: /product/workspace/semantic-catalog
 [ref-encryption-keys]: /product/workspace/encryption-keys
 [ref-cube-store-encryption]: /product/caching/running-in-production#data-at-rest-encryption
diff --git a/docs/pages/product/workspace/_meta.js b/docs/pages/product/workspace/_meta.js
index 8e87b1a25f5f0..6252465843bf0 100644
--- a/docs/pages/product/workspace/_meta.js
+++ b/docs/pages/product/workspace/_meta.js
@@ -19,6 +19,5 @@ module.exports = {
   "budgets": "Budgets",
   "preferences": "Preferences",
   "cli": "CLI",
-  "ai-assistant": "AI Assistant",
   "semantic-catalog": "Semantic Catalog",
 }
diff --git a/docs/pages/product/workspace/access-control.mdx b/docs/pages/product/workspace/access-control.mdx
index ef06077aa6e96..bcb1167e765c7 100644
--- a/docs/pages/product/workspace/access-control.mdx
+++ b/docs/pages/product/workspace/access-control.mdx
@@ -115,7 +115,7 @@ Actions for the `Deployment` policy:
 | `Data Model Edit (all branches)`<br/>`Data Model Edit (dev branches only)` | Use the [development mode][ref-dev-mode], edit the data model, perform Git operations (e.g., commit, pull, push). |
 | `Queries & Metrics Access` | Use [Query History][ref-query-history] and [Performance Insights][ref-perf-insights]. |
 | `SQL Runner Access` | Use [SQL Runner][ref-sql-runner]. |
-| `Data Assets Access` | Use [Semantic Catalog][ref-semantic-catalog] and [AI Assistant][ref-ai-assistant]. |
+| `Data Assets Access` | Use [Semantic Catalog][ref-semantic-catalog]. |
 
 Actions for the `Report` policy:
 
@@ -139,4 +139,3 @@ Actions for the `ReportFolder` policy:
 [ref-perf-insights]: /product/workspace/performance
 [ref-sql-runner]: /product/workspace/sql-runner
 [ref-semantic-catalog]: /product/workspace/semantic-catalog
-[ref-ai-assistant]: /product/workspace/ai-assistant
\ No newline at end of file
diff --git a/docs/pages/product/workspace/ai-assistant.mdx b/docs/pages/product/workspace/ai-assistant.mdx
deleted file mode 100644
index 172945c35510d..0000000000000
--- a/docs/pages/product/workspace/ai-assistant.mdx
+++ /dev/null
@@ -1,135 +0,0 @@
-# AI Assistant
-
-Business users can ask questions about your organization's Cube data model and run queries using natural language.
-AI Assistant is integrated with the [Playground][ref-playground] and [Semantic Catalog][ref-catalog] so that users can easily explore their results further.
-
-<SuccessBox>
-
-AI Assistant is available in Cube Cloud on
-[Premium and above](https://cube.dev/pricing) product tiers.
-[Contact us](https://cube.dev/contact) for details.
-
-</SuccessBox>
-
-<Screenshot src="https://ucarecdn.com/9aee5273-e219-4ccc-aa5c-2c8b66a6c932/assistant.jpg" />
-
-## Getting Started
-
-AI Assistant is currently in preview. To get started, please ask your account team to enable AI Assistant for you.
-
-Then, if you've already set up Semantic Catalog, you're ready to use AI Assistant. If you haven't, do the following to enable AI Asisstant:
-
-1. In your Cube deployment sidebar, navigate to "Settings" and then "Catalog Services"
-2. Click the button to enable the Catalog. This will connect your Cube data model and enable AI Assistant.
-3. If you'd like to connect any downstream business intelligence tools, follow the [guide on the Semantic Catalog page][ref-catalog-downstream].
-
-## Using AI Assistant
-
-Users can ask questions and have conversations with the AI Assistant to better understand the data in your Cube data model, to pull data, and run basic analyses.
-There are two specific types of questions that the AI Assistant can answer: catalog questions and data queries. These are described in more detail below.
-
-### Catalog questions
-
-A catalog question is a question about what data is available to the user or what particular data assets mean (for example, a dimension or measure).
-
-**Example**
-
-A user might want to do some analysis around geographies, but they're not sure if their organization already has dashboards about users' locations
-or how granular their organization's data on user locations goes. They could ask:
-
-> What info do we have about user locations?
-
-The AI Assistant will reply with a summary and display the dashboards, charts, and/or Cube view(s) containing location information for the user to explore.
-
-<Screenshot src="https://ucarecdn.com/0fb3768f-1b1a-42d5-89f0-38555b0c7427/Screenshot20240624at41011PM.png" />
-
-### Data Queries
-
-A data query is one where the user wants the AI Assistant to generate and run a Cube query, and return the results.
-Users can get quick answers to questions instead of having to ask an analyst, file a ticket, or navigate a complex visualization tool.
-
-**Example**
-
-A sales analyst might want to know which cities orders are trending in lately. They could ask the following question:
-
-> Where did we have the most orders last month?
-
-The query will automatically run in the sidebar and can be opened in the [Playground][ref-playground] for further exploration.
-
-<Screenshot src="https://ucarecdn.com/4249ff1e-fae1-42c8-ad3a-b9e406ea2022/Screenshot20240624at34327PM.png" />
-
-## Advanced Usage
-
-<InfoBox>
-    The advanced features discussed here are available on Cube version 1.1.7 and above.
-</InfoBox>
-
-### Custom prompts
-
-You can prompt the AI Assistant with custom instructions. For example, you may want it to always
-respond in a particular language, or to refer to itself by a name matching your brand.
-Custom prompts also allow you to give the model more context on your company and data model,
-for example if it should usually prefer a particular view.
-
-To use a custom prompt, set the `CUBE_CLOUD_AI_ASSISTANT_PROMPT` environment variable in your deployment.
-
-<InfoBox>
-  Custom prompts add to, rather than overwrite, the AI Assistant's existing prompting.
-</InfoBox>
-
-### Meta tags
-
-The AI Assistant can read [meta tags](/reference/data-model/view#meta) on your dimensions, measures, 
-segments, and views.
-
-Use the `ai` meta tag to give context that is specific to AI and goes beyond what is 
-included in the description. This can have any keys that you want. For example, you can use it
-to give the AI context on possible values in a categorical dimension:
-```yaml
-      - name: status
-        sql: status
-        type: string
-        meta:
-          ai:
-            values:
-              - shipped
-              - processing
-              - completed
-```
-
-### Value search
-
-Value Search can be enabled for AI Assistant in the same way as for the AI API. See the 
-[AI API's documentation][ref-ai-api-value-search] for details and instructions.
-
-### Other LLM providers
-
-See the [AI API's documentation][ref-ai-api-providers] for information on how to "bring your own" LLM. 
-
-## FAQ and limitations
-
-### 1. What language model(s) does the AI Assistant use?
-
-- The AI Assistant currently uses Claude 3.5 Sonnet v2 from Anthropic (via Google Cloud), but this may change in the future
-
-### 2. Are conversations saved or used for training models?
-
-- Per our terms with the LLM provider(s), they do not use the conversations for training models.
-- They may save conversations for up to 30 days for abuse and fraud monitoring purposes.
-- Note that customer data (i.e. the results of queries) is _never_ visible to the LLM in the AI Assistant.
-
-### 3. Can the LLM hallucinate or give incorrect results?
-
-- We make every effort to avoid hallucinations and incorrect results. However, the nature of AI-based systems is that they may make mistakes from time to time.
-- If the model hallucinates data assets (such as dimensions, measures, or views) that don't exist, the user will see an error in the playground. It will _never_ return "fake" data, as a valid Cube query is needed to display results.
-
-### 4. How can I give feedback or train the model?
-
-- To give feedback to the model, use the thumbs-up and thumbs-down buttons that appear under each response.
-- If a model is consistently getting something wrong, it may be a sign that the data model is confusing or incomplete. Check things like field labels and descriptions and make sure that irrelevant fields are not marked as visible in your Cube data model.
-
-[ref-catalog]: /product/workspace/semantic-catalog
-[ref-playground]: /product/workspace/playground
-[ref-catalog-downstream]: /product/workspace/semantic-catalog#connecting-downstream-tools
-[ref-ai-api-providers]: /product/apis-integrations/ai-api#other-llm-providers
-[ref-ai-api-value-search]: /product/apis-integrations/ai-api#value-search
\ No newline at end of file
diff --git a/docs/redirects.json b/docs/redirects.json
index 0eb1d477a6b3b..96b718040bf13 100644
--- a/docs/redirects.json
+++ b/docs/redirects.json
@@ -49,11 +49,6 @@
     "destination": "/product/workspace/cli/reference",
     "permanent": true
   },
-  {
-    "source": "/reference/ai-api",
-    "destination": "/product/apis-integrations/ai-api/reference",
-    "permanent": true
-  },
   {
     "source": "/reference/graphql-api",
     "destination": "/product/apis-integrations/graphql-api/reference",
@@ -1304,4 +1299,4 @@
     "destination": "/guides/recipes/access-control/column-based-access",
     "permanent": true
   }
-]
\ No newline at end of file
+]