
Client Libraries: Using and Developing LLM Client Libraries Across Languages




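Overview of Mainstream LLM Client Libraries

LLM client libraries differ in maturity and feature coverage from one language ecosystem to another. When choosing one, weigh language support, API completeness, and community activity.

The Python Ecosystem

Python is the dominant language for LLM application development and has the widest selection of client libraries: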

# Official OpenAI SDK
from openai import OpenAI

client = OpenAI(api_key="your-key")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# LangChain wrapper
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0.7)
response = llm.invoke("Hello")

The JavaScript/TypeScript Ecosystem

// Official OpenAI SDK
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "your-key" });
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});

// Vercel AI SDK
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await generateText({
  model: openai("gpt-4o"),
  prompt: "Hello",
});

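The Go Ecosystem

// Community go-openai library (sashabaranov/go-openai)
import (
    "context"

    openai "github.com/sashabaranov/go-openai"
)

// NewClient takes the API key as its only argument
client := openai.NewClient("your-key")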
resp, err := client.CreateChatCompletion(
    context.Background(),
    openai.ChatCompletionRequest{
        Model: openai.GPT4o,
        Messages: []openai.ChatCompletionMessage{
            {Role: openai.ChatMessageRoleUser, Content: "Hello"},
        },
    },
)

Criteria for Choosing a Client Library

1. API coverage

Check whether the client library covers the core features of the target LLM API:

# Check support for key features
required_features = [
    "chat_completions",
    "streaming",
    "function_calling",
    "embeddings",
    "image_generation",
    "token_counting",
]

def check_feature_support(sdk_class, features):
    # hasattr only matches top-level attribute names, so treat the
    # result as a rough first pass rather than a definitive answer
    return [feature for feature in features if hasattr(sdk_class, feature)]
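
Because `hasattr` only inspects top-level attribute names, a flat label such as `chat_completions` will not match SDKs that nest their resources (the OpenAI Python client, for example, exposes `client.chat.completions.create`). One possible refinement is to resolve a dotted attribute path instead. The sketch below is illustrative only: the `FEATURE_PATHS` mapping and the `supports` and `check_feature_paths` helpers are assumptions made for this example, not part of any SDK, and the paths would need adjusting for each library.

```python
from types import SimpleNamespace

# Hypothetical mapping from a feature label to the attribute path that
# implements it on a client object. Adjust the paths for the SDK under review.
FEATURE_PATHS = {
    "chat_completions": "chat.completions.create",
    "embeddings": "embeddings.create",
    "image_generation": "images.generate",
}

_MISSING = object()  # sentinel so that attributes set to None still count


def supports(client, dotted_path: str) -> bool:
    """Return True if every attribute along dotted_path exists on client."""
    current = client
    for name in dotted_path.split("."):
        current = getattr(current, name, _MISSING)
        if current is _MISSING:
            return False
    return True


def check_feature_paths(client, feature_paths: dict[str, str]) -> dict[str, bool]:
    """Map each feature label to whether its attribute path resolves."""
    return {label: supports(client, path) for label, path in feature_paths.items()}


# Stand-in client for demonstration; a real check would pass an SDK client.
fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: None)),
    embeddings=SimpleNamespace(create=lambda **kw: None),
)
report = check_feature_paths(fake_client, FEATURE_PATHS)
```

An attribute that resolves only shows that the method exists; it says nothing about whether a given model or account can actually use the feature, so a small live request remains the more reliable check.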

2. Quality of error handling

A good client library surfaces structured, typed errors:

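import time

from openai import (
    APIConnectionError,
    APITimeoutError,
    AuthenticationError,
    RateLimitError,
)

# Application-level exceptions (the names are illustrative)
class LLMAuthError(Exception): ...
class LLMConnectionError(Exception): ...

try:
    response = client.chat.completions.create(...)
except RateLimitError as e:
    # Rate limited: honor Retry-After (default 60 s) before the caller retries
    wait_time = e.response.headers.get("Retry-After", 60)
    time.sleep(float(wait_time))
except AuthenticationError as e:
    # Key problem: prompt the user to check it
    raise LLMAuthError("API key is invalid or expired") from e
except APIConnectionError as e:
    # Network problem (APITimeoutError is a subclass of this): retryable
    raise LLMConnectionError("Unable to connect to the API server") from e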
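
The rate-limit branch above waits once but does not show the retry itself. A common way to structure that retry is exponential backoff with jitter. The following is a minimal, library-agnostic sketch: `RetryableError` and `call_with_retry` are hypothetical names introduced here, and in real code you would map the SDK's rate-limit and timeout exceptions onto the retryable case. The official OpenAI Python SDK also exposes a `max_retries` client option, which may already be sufficient for simple cases.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying (rate limits, timeouts)."""


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RetryableError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The cap doubles on each attempt up to max_delay; the actual
            # wait is drawn uniformly from [0, cap] ("full jitter").
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
    raise RuntimeError("unreachable")


# Demonstration with a flaky function that succeeds on its third call.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryableError("try again")
    return "ok"


slept: list[float] = []  # record the delays instead of really sleeping
result = call_with_retry(flaky, sleep=slept.append)
```

Injecting `sleep` keeps the helper testable without real delays. Authentication failures are deliberately not retried, since repeating the request cannot fix an invalid key.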

3. Streaming support

Streaming output is a basic requirement for LLM applications:

# Streaming in Python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True,
)

collected_content = []
for chunk in stream:
    # Some chunks (e.g. a final usage chunk) carry no choices, so guard the index
    if chunk.choices and chunk.choices[0].delta.content:
        content = chunk.choices[0].delta.content
        collected_content.append(content)
        print(content, end="", flush=True)

full_response = "".join(collected_content)

Developing a Custom Client Library

When the official SDK does not meet your needs, you can build a custom client library:

import json
from dataclasses import dataclass
from typing import AsyncIterator

import httpx

@dataclass
class LLMConfig:
    api_key: str
    base_url: str = "https://api.openai.com/v1"
    timeout: float = 30.0

class CustomLLMClient:
    def __init__(self, config: LLMConfig):
        self.config = config
        self._client = httpx.AsyncClient(
            base_url=config.base_url,
            headers={"Authorization": f"Bearer {config.api_key}"},
            timeout=config.timeout,
        )

    async def aclose(self) -> None:
        # Release pooled connections when the client is no longer needed
        await self._client.aclose()

    async def chat(self, messages: list[dict], **kwargs) -> dict:
        # kwargs carries the remaining request fields, e.g. model="gpt-4o"
        response = await self._client.post(
            "/chat/completions",
            json={"messages": messages, **kwargs},
        )
        response.raise_for_status()
        return response.json()

    async def stream_chat(self, messages: list[dict], **kwargs) -> AsyncIterator[str]:
        async with self._client.stream(
            "POST",
            "/chat/completions",
            json={"messages": messages, "stream": True, **kwargs},
        ) as response:
            response.raise_for_status()
            async for line in response.aiter_lines():
                if not line.startswith("data: "):
                    continue  # skip blank separators and non-data fields
                payload = line[len("data: "):]
                if payload == "[DONE]":
                    break
                choices = json.loads(payload).get("choices") or []
                content = choices[0]["delta"].get("content") if choices else None
                if content:  # skip empty or null deltas
                    yield content
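
The line handling inside `stream_chat` is the part most likely to break on unexpected input, so it can help to factor it into a pure function that is testable without a network connection. The sketch below assumes OpenAI-style server-sent events (`data: {...}` lines terminated by `data: [DONE]`); the helper name `iter_sse_content` and the sample payloads are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield non-empty content deltas from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # this sketch ignores malformed events
        choices = event.get("choices") or []
        if not choices:
            continue  # e.g. a usage-only event
        content = (choices[0].get("delta") or {}).get("content")
        if content:
            yield content


# Hand-written sample stream covering the edge cases handled above.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    ": keep-alive",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: {"choices": [], "usage": {"total_tokens": 3}}',
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
text = "".join(iter_sse_content(sample))
```

With a helper like this, the streaming method only needs to pass `response.aiter_lines()` output through it (in an async variant), and the parsing rules can be tested independently of the HTTP layer.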

Connection Pooling and Performance Optimization

In high-concurrency scenarios, connection pool management is critical:

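import asyncio

import httpx

# Reuse one connection pool across all requests
class PooledLLMClient:
    def __init__(self, config: LLMConfig):  # LLMConfig as defined in the previous section
        self._client = httpx.AsyncClient(
            base_url=config.base_url,
            headers={"Authorization": f"Bearer {config.api_key}"},
            timeout=config.timeout,
            limits=httpx.Limits(
                max_connections=100,
                max_keepalive_connections=20,
                keepalive_expiry=30,
            ),
        )

    async def _chat_single(self, prompt: str, model: str = "gpt-4o") -> str:
        # Send one prompt and return the text of the first choice
        response = await self._client.post(
            "/chat/completions",
            json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]

    async def batch_chat(self, prompts: list[str]) -> list[str]:
        # gather runs the requests concurrently and preserves input order
        tasks = [self._chat_single(p) for p in prompts]
        return await asyncio.gather(*tasks)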
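
Note that `batch_chat` starts every request at once. The pool limits cap the number of open sockets, but a large batch can still queue up inside the pool or trip provider rate limits. One common mitigation is to bound the number of in-flight calls with `asyncio.Semaphore`. The sketch below is a standalone illustration: `gather_bounded` and `fake_worker` are hypothetical names, and the worker stands in for a real API call.

```python
import asyncio
from typing import Awaitable, Callable


async def gather_bounded(
    worker: Callable[[str], Awaitable[str]],
    prompts: list[str],
    max_concurrency: int = 10,
) -> list[str]:
    """Run worker over prompts with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> str:
        async with semaphore:  # wait here while the limit is reached
            return await worker(prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(p) for p in prompts))


# Stand-in worker that records peak concurrency instead of calling an API.
stats = {"in_flight": 0, "peak": 0}


async def fake_worker(prompt: str) -> str:
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # simulate network latency
    stats["in_flight"] -= 1
    return prompt.upper()


results = asyncio.run(
    gather_bounded(fake_worker, ["a", "b", "c", "d", "e"], max_concurrency=2)
)
```

In the pooled client, the same pattern would wrap `_chat_single`, with `max_concurrency` chosen at or below the pool's `max_connections` so that requests wait on the semaphore instead of on the connection pool.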

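Summary

When choosing an LLM client library, weigh API coverage, error handling, streaming support, and community activity together. For special requirements, you can build a custom SDK on top of an HTTP client, while paying attention to connection pool management and performance.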