Model interaction
Model interaction is the process of interacting with the model to get the desired output. It is the process of sending a prompt to the model and getting a response.
📄️ OpenAI-compatible API
An OpenAI-compatible API implements the same request and response formats as OpenAI's official API, allowing developers to switch between different models without changing existing code.
📄️ Anthropic-compatible API
An Anthropic-compatible API mirrors Anthropic's Messages API so Claude-based clients, SDKs, and agent tools can use another model or provider with minimal code changes.
📄️ Function calling
Learn what function calling is and its use case.
📄️ Structured outputs
Structured outputs are model responses in defined formats like JSON or XML, making AI-generated data predictable, machine-readable, and easy to integrate into applications and workflows.
📄️ Model Context Protocol
Learn what Model Context Protocol (MCP) is and its use case.
📄️ Prompt engineering
Understand prompt engineering for LLM inference. Learn system & user prompts, zero-shot & few-shot prompting, KV cache impact, token costs, and production best practices.
📄️ LLM inference parameters
LLM inference parameters are request-time settings that control randomness, output length, repetition, stopping behavior, reproducibility, and structured generation.
Stay updated with the handbook
Get the latest insights and updates on LLM inference and optimization techniques.
- Monthly insights
- Latest techniques
- Handbook updates