serving-llms-vllm

Version 1.0.0 • Author: Orchestra Research

vLLM: high-throughput LLM serving with an OpenAI-compatible API and quantization support.

📥 Install command

hermes skill install serving-llms-vllm
Category: uncategorized
Version: 1.0.0
Author: Orchestra Research
Last synced: 2026-06-06

🏷️ Tags

vLLM, Inference Serving, PagedAttention, Continuous Batching, High Throughput, Production, OpenAI API, Quantization, Tensor Parallelism