48 changes: 48 additions & 0 deletions docs/llmservice/models/qwen3.6-27b.md
@@ -0,0 +1,48 @@
# Qwen3.6-27B

## Overview

Qwen3.6-27B FP8 is a dense, 27-billion-parameter multimodal model developed by Alibaba's Qwen Team. It is optimized for agentic coding and reasoning while remaining practical to deploy for production workloads. On B.AI, it is positioned for coding assistance, technical reasoning, multimodal understanding, and tool-assisted workflows. The available input modalities, context limits, and tool capabilities depend on the B.AI model catalog and platform configuration.

## Key Features

* **Hybrid Gated DeltaNet Architecture**: Uses a hybrid attention design that combines efficient linear-attention-style layers with full self-attention layers, balancing inference efficiency with long-context performance.
* **Natively Multimodal**: Supports text, image, and video inputs at the model level, subject to B.AI platform configuration and availability.
* **Hybrid Thinking Mode**: Supports both thinking and non-thinking response modes where available, allowing different quality-speed tradeoffs per task.
* **Thinking Preservation**: Designed to preserve reasoning context across multi-turn conversations, improving coherence in agentic coding workflows.
* **Multi-Token Prediction (MTP)**: Uses multi-token prediction training to improve inference throughput.

## Best Use Cases

* **Agentic Coding**: Well-suited for autonomous code generation, debugging, and multi-step software engineering workflows.
* **Complex Reasoning Tasks**: Suitable for scientific, mathematical, engineering, and analytical problem-solving.
* **Multimodal Analysis**: Can be used for document, screenshot, chart, diagram, image, and video understanding when the relevant input modality is enabled.
* **Production Workloads**: A practical choice for workloads that need a balance of capability, latency, and usage cost.

## Capabilities and Limitations

| Capability | Description |
| :------------------- | :------------------------------------------------------------------------------------------------------ |
| **Reasoning** | Strong technical and analytical reasoning for structured problem-solving tasks |
| **Coding** | Suitable for code generation, debugging, refactoring, and agentic software workflows |
| **Creative Writing** | General-purpose text generation; primarily optimized for code and reasoning rather than creative output |
| **Multimodal** | Text, image, and video input at the model level; text output |
| **Context Window** | Up to 128k tokens, subject to platform configuration |
| **Max Output** | Up to 32,768 tokens, subject to platform configuration |
| **Tool Use** | Native function calling and tool use support where enabled |
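
The context-window and max-output limits in the table can be combined into a simple pre-flight check. The sketch below is illustrative only: it assumes that "128k" means 128,000 tokens and that prompt and output tokens share the same context window, neither of which this page confirms. The function name `fit_output_budget` is hypothetical and not part of any B.AI SDK.

```python
# Limits taken from the capability table; both are "subject to platform configuration".
CONTEXT_WINDOW_TOKENS = 128_000  # assumption: "128k" read as 128,000 tokens
MAX_OUTPUT_TOKENS = 32_768


def fit_output_budget(prompt_tokens: int, requested_output_tokens: int) -> int:
    """Return the largest output budget that fits the documented limits.

    Assumption: prompt and output tokens count against the same context window.
    """
    if prompt_tokens < 0 or requested_output_tokens <= 0:
        raise ValueError("token counts must be positive")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room for output in the context window")
    # The budget is capped by the request, the output limit, and the space left.
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining)
```

Under these assumptions, a 120,000-token prompt leaves room for at most 8,000 output tokens, even though the model-level output limit is higher.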

### Known Limitations

* Specific capability availability may depend on the B.AI integration, provider support, plan settings, and rollout status.
* Video input, tool use, long-context limits, and other advanced capabilities require compatible platform configuration.
* Public evaluations, third-party comparisons, policy behavior, and implementation details may change over time, so they are not treated as fixed guarantees in this documentation.

## Pricing

| Model | Input (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) | Billing Notes |
| :--- | --------------------: | --------------------------: | -------------------------: | ---------------------: | -----------------------: | :--- |
| **Qwen3.6-27B** | `0.19` | `0.19` | `0.19` | `2.99` | `-` | Cache reads and writes are billed at the same rate. |

:::info Pricing note
Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
:::
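
As a rough illustration of how the reference rates in the pricing table add up, the sketch below multiplies each token count by its listed rate. It applies the rates per token, as the column headers state, and assumes the four token categories are disjoint. The function `estimate_credits` and its parameters are hypothetical names used only for this example, and actual billing follows the platform display and final billing records.

```python
from decimal import Decimal

# Reference rates copied from the pricing table (Credits per token, as the
# column headers state). Actual billing may differ; see the pricing note.
REFERENCE_RATES = {
    "input": Decimal("0.19"),
    "cache_write": Decimal("0.19"),
    "cache_read": Decimal("0.19"),
    "output": Decimal("2.99"),
}


def estimate_credits(input_tokens=0, cache_write_tokens=0,
                     cache_read_tokens=0, output_tokens=0):
    """Estimate the reference cost of one request in Credits.

    Assumption: the four token counts are disjoint, so each token is
    charged under exactly one rate.
    """
    counts = {
        "input": input_tokens,
        "cache_write": cache_write_tokens,
        "cache_read": cache_read_tokens,
        "output": output_tokens,
    }
    total = Decimal("0")
    for kind, count in counts.items():
        if count < 0:
            raise ValueError(f"{kind} token count must be non-negative")
        total += REFERENCE_RATES[kind] * count
    return total
```

For example, `estimate_credits(input_tokens=1000, output_tokens=200)` evaluates to 788 Credits at the reference rates (1000 × 0.19 + 200 × 2.99). `Decimal` is used instead of floats so that the sums stay exact.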
1 change: 1 addition & 0 deletions docs/llmservice/pricing-and-usage.md
@@ -18,6 +18,7 @@ The platform uses a unified Credits system to measure and settle usage across al
| MiniMax M2.7 | 0.30 | 0.375 | 0.06 | 1.20 | - |
| Kimi K2.6 | 0.95 | 0.95 | 0.16 | 4.00 | - |
| Kimi K2.5 | 0.59 | 0.59 | 0.177 | 3.00 | - |
| Qwen3.6-27B | 0.19 | 0.19 | 0.19 | 2.99 | - |
| GLM-5.1 | 1.40 | 1.40 | 0.26 | 4.40 | - |
| GLM-5 | 1.00 | 1.00 | 0.20 | 3.20 | - |
| DeepSeek V3.2 | 0.29 | 0.29 | 0.145 | 0.44 | - |
@@ -0,0 +1,48 @@
# Qwen3.6-27B

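## Overview

Qwen3.6-27B FP8 is a dense 27-billion-parameter multimodal model developed by the Alibaba Qwen Team. It is optimized for agentic coding and complex reasoning tasks while remaining practical for production deployment. On B.AI, Qwen3.6-27B is suited to coding assistance, technical reasoning, multimodal understanding, and tool-assisted workflows. Specific input modalities, context length, and tool capabilities may change with the B.AI model catalog and platform configuration.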

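## Key Features

* **Hybrid Gated DeltaNet Architecture**: Uses a hybrid attention design that combines efficient linear-attention-style layers with full self-attention layers, balancing inference efficiency with long-context performance.
* **Natively Multimodal**: Supports text, image, and video inputs at the model level, subject to B.AI platform configuration and availability.
* **Hybrid Thinking Mode**: Supports both thinking and non-thinking response modes where available, so the balance between quality, speed, and cost can be adjusted per task.
* **Thinking Preservation**: Preserves reasoning context across multi-turn conversations, which helps improve coherence in agentic coding workflows.
* **Multi-Token Prediction (MTP)**: Uses multi-token prediction training to improve inference throughput.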

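## Best Use Cases

* **Agentic Coding**: Suited to autonomous code generation, debugging, and multi-step software engineering workflows.
* **Complex Reasoning Tasks**: Suited to scientific, mathematical, engineering, and analytical problem-solving.
* **Multimodal Analysis**: Can be used for document, screenshot, chart, diagram, image, and video understanding when the relevant input modality is enabled.
* **Production Workloads**: Suited to real-world business scenarios that need to balance capability, latency, and usage cost.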

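## Capabilities and Limitations

| Capability | Description |
| :--- | :--- |
| **Reasoning** | Strong reasoning for technical, analytical, and structured problem-solving tasks |
| **Coding** | Suited to code generation, debugging, refactoring, and agentic software workflows |
| **Creative Writing** | Supports general-purpose text generation; primarily optimized for code and reasoning rather than creative writing |
| **Multimodal** | Text, image, and video input at the model level; text output |
| **Context Window** | Up to 128k tokens, subject to platform configuration |
| **Max Output** | Up to 32,768 tokens, subject to platform configuration |
| **Tool Use** | Native function calling and tool use where enabled |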

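### Known Limitations

* Specific capability availability may depend on the B.AI integration, provider support, plan configuration, and feature rollout status.
* Video input, tool use, long-context limits, and other advanced capabilities require compatible platform configuration.
* Public evaluations, third-party comparisons, policy behavior, and implementation details may change over time, so this documentation does not treat them as fixed commitments.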

## Pricing

| Model | Input (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) | Billing Notes |
| :--- | --------------------: | --------------------------: | -------------------------: | -------------------: | ---------------------: | :--- |
| **Qwen3.6-27B** | `0.19` | `0.19` | `0.19` | `2.99` | `-` | Cache reads and cache writes are billed at the same rate. |

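:::info Pricing note
Prices in the documentation are B.AI standard reference prices for models and are provided only to explain base billing. B.AI may offer lower actual usage costs through top-up bonuses, account benefits, and similar means. Specific prices, bonus Credits, and account benefits are subject to the platform display and the final bill.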
:::
@@ -18,6 +18,7 @@
| MiniMax M2.7 | 0.30 | 0.375 | 0.06 | 1.20 | - |
| Kimi K2.6 | 0.95 | 0.95 | 0.16 | 4.00 | - |
| Kimi K2.5 | 0.59 | 0.59 | 0.177 | 3.00 | - |
| Qwen3.6-27B | 0.19 | 0.19 | 0.19 | 2.99 | - |
| GLM-5.1 | 1.40 | 1.40 | 0.26 | 4.40 | - |
| GLM-5 | 1.00 | 1.00 | 0.20 | 3.20 | - |
| DeepSeek V3.2 | 0.29 | 0.29 | 0.145 | 0.44 | - |
@@ -242,9 +242,9 @@ const sidebars = {
'llmservice/models/glm-5',
'llmservice/models/kimi-k2.6',
'llmservice/models/kimi-k2.5',
'llmservice/models/qwen3.6-27b',
'llmservice/models/minimax-m3',
'llmservice/models/minimax-m2.7',
'llmservice/models/minimax-m2.5',
],
},
{ type: 'doc', id: 'llmservice/memory', label: '记忆服务' },
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "@x402-tron/docs",
-"version": "1.2.25",
+"version": "1.2.26",
"description": "x402-tron documentation",
"license": "MIT",
"scripts": {
1 change: 1 addition & 0 deletions sidebars.js
@@ -239,6 +239,7 @@ const sidebars = {
'llmservice/models/glm-5',
'llmservice/models/kimi-k2.6',
'llmservice/models/kimi-k2.5',
'llmservice/models/qwen3.6-27b',
'llmservice/models/minimax-m3',
'llmservice/models/minimax-m2.7',
],