Kimi K2.7 Code

Kimi-K2.7-Code: A High-Performance 1T Parameter MoE Coding Agent by Moonshot AI

Introduction:

Kimi-K2.7-Code is an advanced coding-focused agentic model built by Moonshot AI, featuring 1T parameters, 256K context length, and superior long-horizon task completion.

Added On:

2026-06-15

Monthly Visitors:

27366.8K

Kimi K2.7 Code Product Information

Kimi-K2.7-Code: The Next Generation of Agentic Coding Models by Moonshot AI

Kimi-K2.7-Code is a coding-focused agentic model developed by Moonshot AI and built on Kimi K2.6. It is designed for complex software engineering workflows and targets real-world, long-horizon coding tasks, completing them end to end while keeping token usage low. It uses approximately 30% fewer thinking tokens than its predecessor, which makes it a more cost-effective option for developers and enterprises.

What's Kimi-K2.7-Code?

Kimi-K2.7-Code is a specialized Mixture-of-Experts (MoE) model engineered to act as a coding agent. Unlike general-purpose large language models, it is optimized for software development work such as debugging, refactoring, and architectural planning. The model has 1 trillion total parameters, of which 32 billion are activated per token, so each forward pass costs far less compute than a dense model of the same total size.

With a 256K-token context length, Kimi-K2.7-Code can take in large codebases in a single prompt, which suits long-context software projects. Its architecture uses MLA (Multi-head Latent Attention) and the SwiGLU activation function, and it handles both text and vision-related coding tasks.

Key Features of Kimi-K2.7-Code

1. Advanced Architecture

Kimi-K2.7-Code is built using a sophisticated MoE framework:

  • Total Parameters: 1T
  • Activated Parameters: 32B
  • Layers: 61 (including dense layers)
  • Experts: 384 total experts, with 8 selected per token.
  • Vision Encoder: MoonViT with 400M parameters, enabling the model to handle image and video inputs.
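The expert counts above can be illustrated with a toy router. This is a minimal sketch of generic top-k softmax gating, assuming random router scores; the source does not describe the model's actual routing function, so `route_token` and its renormalization step are illustrative only.

```python
import math
import random

TOTAL_PARAMS = 1_000_000_000_000   # 1T, from the list above
ACTIVE_PARAMS = 32_000_000_000     # 32B, from the list above
NUM_EXPERTS = 384
TOP_K = 8

def route_token(router_logits: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their softmax weights."""
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only (subtract the max for numerical stability).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router scores
chosen = route_token(logits)

print(f"experts used per token: {len(chosen)} of {NUM_EXPERTS} ({TOP_K / NUM_EXPERTS:.1%})")
print(f"parameters active per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

With the listed figures, each token touches about 2.1% of the experts and about 3.2% of the total parameters.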

2. Enhanced Token Efficiency

Efficiency is at the core of Kimi-K2.7-Code. It achieves a 30% reduction in thinking-token usage compared to Kimi K2.6, allowing for faster response times and reduced operational costs without sacrificing reasoning quality.
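As a rough illustration of what a 30% reduction means for one task, the arithmetic can be sketched as follows. The baseline of 10,000 thinking tokens is a hypothetical figure chosen for the example, not a measured value.

```python
def thinking_tokens_after_reduction(baseline_tokens: int, reduction: float = 0.30) -> int:
    """Thinking tokens expected after applying a fractional reduction."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))

baseline = 10_000  # hypothetical thinking tokens for one task on Kimi K2.6
reduced = thinking_tokens_after_reduction(baseline)
print(f"{baseline} -> {reduced} thinking tokens ({baseline - reduced} saved)")
```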

3. Native INT4 Quantization

Following the same path as Kimi-K2-Thinking, Kimi-K2.7-Code adopts native INT4 quantization. Storing weights in 4 bits reduces memory requirements, which allows high-performance deployment on a wider range of hardware configurations while preserving model quality.
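To get a feel for why the bit width matters at this scale, here is a back-of-the-envelope estimate of weight storage. It is a simplification that assumes every one of the 1T parameters is stored at the stated width; real checkpoints also carry quantization scales and may keep some tensors at higher precision, so actual sizes will differ.

```python
def weight_footprint_gb(num_params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

TOTAL_PARAMS = 1_000_000_000_000  # 1T total parameters

for label, bits in [("BF16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: ~{weight_footprint_gb(TOTAL_PARAMS, bits):,.0f} GB")
```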

4. Agentic Performance Benchmarks

In rigorous evaluations, Kimi-K2.7-Code has shown remarkable improvements across various benchmarks:

  • Kimi Code Bench v2: Scored 62.0 (up from 50.9 in K2.6).
  • MCP Mark Verified: Scored 81.1.
  • Kimi Claw 24/7 Bench: Scored 46.9.
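Only Kimi Code Bench v2 is listed with a K2.6 baseline, so it is the one score whose gain can be computed from the figures above. A small sketch of the absolute and relative improvement:

```python
def improvement(old: float, new: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain as a fraction of the old score)."""
    return new - old, (new - old) / old

abs_gain, rel_gain = improvement(50.9, 62.0)
print(f"Kimi Code Bench v2: +{abs_gain:.1f} points ({rel_gain:.1%} relative)")
```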

Use Cases for Kimi-K2.7-Code

Kimi-K2.7-Code is versatile enough to support a wide range of professional software development scenarios:

  • Complex Software Engineering: Managing long-horizon tasks that require multi-step reasoning and deep codebase understanding.
  • Multi-Modal Coding Support: Utilizing its vision capabilities to describe images or analyze video content related to UI/UX design or technical demonstrations.
  • Automated Debugging: Leveraging its thinking mode to trace errors across large files thanks to its 256K context window.
  • Coding Agent Frameworks: It works optimally with the Kimi Code CLI to act as a fully autonomous coding assistant.
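Before sending a whole repository to the model, it can help to estimate whether it fits in the 256K window. The sketch below uses a rough 4-characters-per-token heuristic rather than the model's real tokenizer, assumes "256K" means 262,144 tokens, and uses hypothetical helper names and defaults; treat the result as an order-of-magnitude check only.

```python
import os

CONTEXT_TOKENS = 256 * 1024   # assumption: "256K" means 262,144 tokens
CHARS_PER_TOKEN = 4           # rough heuristic, not the model's real tokenizer

def estimate_tokens(root: str, extensions: tuple[str, ...] = (".py", ".md")) -> int:
    """Estimate tokens for all matching files under root from their character counts."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN

def fits_in_context(root: str, reserve_for_output: int = 16_000) -> bool:
    """True if the estimated prompt still leaves reserve_for_output tokens free."""
    return estimate_tokens(root) + reserve_for_output <= CONTEXT_TOKENS
```

Calling `fits_in_context("path/to/repo")` returns a quick yes or no; the reserve leaves room for the model's thinking and answer tokens.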

How to Use Kimi-K2.7-Code

There are several ways to deploy and interact with Kimi-K2.7-Code, ranging from high-level libraries to direct API calls.

Using Transformers Library

You can quickly implement Kimi-K2.7-Code using the Hugging Face transformers library. Ensure your version is >=4.57.1 and <5.0.0.

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="moonshotai/Kimi-K2.7-Code", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
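The version requirement stated above (>=4.57.1 and <5.0.0) can be checked before loading the model. This is a minimal standard-library sketch with hypothetical helper names; it compares only the numeric release segment and ignores pre-release suffixes such as `.dev0` or `rc1`.

```python
import re
from importlib import metadata

MIN_VERSION = (4, 57, 1)   # inclusive lower bound from the text
MAX_VERSION = (5, 0, 0)    # exclusive upper bound from the text

def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '4.57.1.dev0' -> (4, 57, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))   # pad '4.57' to (4, 57, 0)

def transformers_version_ok(version: str) -> bool:
    """True if the version falls inside the supported range."""
    return MIN_VERSION <= parse_release(version) < MAX_VERSION

def check_installed() -> bool:
    """Check the installed transformers package; False if it is missing."""
    try:
        return transformers_version_ok(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```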

Deployment with vLLM

For high-throughput inference, vLLM is a recommended engine for Kimi-K2.7-Code.

# Install vLLM:
pip install vllm
# Start the server:
vllm serve "moonshotai/Kimi-K2.7-Code"
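Once the server is running, it can be queried over its OpenAI-compatible HTTP endpoint. The sketch below uses only the standard library and assumes vLLM's default address of `http://localhost:8000/v1` and that the served model name matches the path passed to `vllm serve`. The sampling values follow the recommendation in the FAQ of this page; the helper names and the `max_tokens` value are illustrative.

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "moonshotai/Kimi-K2.7-Code") -> dict:
    """Build an OpenAI-style chat completion payload with the recommended sampling settings."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,   # recommended for Thinking mode
        "top_p": 0.95,
        "max_tokens": 4096,
    }

def send_chat_request(payload: dict, base_url: str = "http://localhost:8000/v1") -> dict:
    """POST the payload to a running server and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read().decode("utf-8"))

# Example (requires the server started above to be running):
# reply = send_chat_request(build_chat_request("Write a function that reverses a linked list."))
# print(reply["choices"][0]["message"]["content"])
```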

Official Moonshot AI API

Moonshot AI provides an OpenAI-compatible API for Kimi-K2.7-Code. Note that thinking and preserve_thinking are always enabled for this model and cannot be turned off.

Chat Completion with Thinking Mode

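import openai

def simple_chat(client: openai.OpenAI, model_name: str):
    """Send one question and print the model's reasoning and final answer."""
    messages = [
        {'role': 'system', 'content': 'You are Kimi, an AI assistant created by Moonshot AI.'},
        {'role': 'user', 'content': 'which one is bigger, 9.11 or 9.9? think carefully.'},
    ]
    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=4096
    )
    message = response.choices[0].message
    # The reasoning field name depends on the server: some expose `reasoning`,
    # others `reasoning_content`, so try both before giving up.
    reasoning = getattr(message, 'reasoning', None) or getattr(message, 'reasoning_content', None)
    print(f'Reasoning: {reasoning}')
    print(f'Response: {message.content}')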
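The function above needs a configured client and a model name, neither of which is shown. The following sketch is one way to wire them up. The base URL default, the environment variable names, and the helper names are all assumptions for illustration; the API model identifier is not given on this page, so it is read from a hypothetical `KIMI_MODEL_NAME` variable rather than guessed.

```python
import os

DEFAULT_BASE_URL = "https://api.moonshot.ai/v1"  # assumption: confirm in the platform docs

def client_kwargs(env: dict) -> dict:
    """Collect OpenAI-client settings from environment-style variables."""
    api_key = env.get("MOONSHOT_API_KEY")  # hypothetical variable name
    if not api_key:
        raise RuntimeError("set MOONSHOT_API_KEY before calling the API")
    return {"api_key": api_key, "base_url": env.get("MOONSHOT_BASE_URL", DEFAULT_BASE_URL)}

def main() -> None:
    import openai  # imported lazily so client_kwargs works without the package installed

    client = openai.OpenAI(**client_kwargs(dict(os.environ)))
    model_name = os.environ["KIMI_MODEL_NAME"]  # hypothetical; the API model id is not given here
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "which one is bigger, 9.11 or 9.9? think carefully."}],
        stream=False,
        max_tokens=4096,
    )
    print(response.choices[0].message.content)

# Call main() once the environment variables are set.
```

With the client built this way, `simple_chat(client, model_name)` can be called in place of the inline request.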

FAQ

Q: What is the context length of Kimi-K2.7-Code? A: The model supports a massive context length of 256K tokens.

Q: Does Kimi-K2.7-Code support multi-modal inputs? A: Yes, Kimi-K2.7-Code supports both image and video inputs. Note that video support is currently experimental and primarily available through the official Moonshot AI API.

Q: What license is Kimi-K2.7-Code released under? A: Both the code repository and the model weights are released under the Modified MIT License.

Q: What are the recommended settings for inference? A: When serving the model through third-party inference engines such as vLLM or SGLang, a temperature of 1.0 and a top_p of 0.95 are recommended for Thinking mode. Instant mode is not supported.

Q: Where can I get support for Kimi-K2.7-Code? A: You can reach out to the Moonshot AI team at [email protected] for any questions regarding the model or its implementation.

Kimi-K2.7-Code is a major milestone for developers looking for an agentic model that truly understands the complexities of modern software development workflows.
