
DeepSeek-OCR vLLM Docker Image

This Docker image packages the DeepSeek-OCR model with vLLM for serving OCR requests via an OpenAI-compatible API.

Prerequisites

Before building the image, clone the model repository locally:

git clone https://cnb.cool/ai-models/deepseek-ai/DeepSeek-OCR model

Build the Image

docker build -t deepseek-ocr/deepseek-ocr:latest .

Run the Container

docker run -d \
  --name deepseek-ocr \
  -p 8080:8080 \
  --gpus all \
  --ipc=host \
  deepseek-ocr/deepseek-ocr:latest

Run the prebuilt image directly:

docker run -d \
  --name deepseek-ocr \
  -p 8080:8080 \
  --gpus all \
  --ipc=host \
  docker.cnb.cool/ai-models/deepseek-ai/deepseek-ocr-vllm:latest

Limit GPU memory:

docker run -d \
  --name deepseek-ocr \
  -p 8080:8080 \
  --gpus '"device=0,memory=10G"' \
  --ipc=host \
  docker.cnb.cool/ai-models/deepseek-ai/deepseek-ocr-vllm:latest
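The server needs some time to load the model after the container starts. A minimal readiness probe, sketched here with Python's standard library, can poll the /v1/models endpoint before any requests are sent; the URL, timeout, and function name are illustrative assumptions, not part of the image.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(base_url: str, timeout_s: float = 300.0, interval_s: float = 2.0) -> bool:
    """Poll the /v1/models endpoint until it answers 200 or the deadline passes."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet; retry shortly
        time.sleep(interval_s)
    return False


# Example: block until the container is serving, or give up after 5 minutes.
# ready = wait_for_server("http://localhost:8080")
```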

Usage Example

List the available models:

curl http://localhost:8080/v1/models
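The endpoint returns the standard OpenAI list shape. A quick sketch of extracting the served model ids from the response body; the sample payload below is illustrative, and the actual id depends on how the model was loaded.

```python
import json


def model_ids(models_json: str) -> list:
    """Extract the id of each served model from a /v1/models response body."""
    payload = json.loads(models_json)
    return [entry["id"] for entry in payload.get("data", [])]


# Illustrative response body in the standard OpenAI list shape.
sample = '{"object": "list", "data": [{"id": "deepseek-ocr", "object": "model"}]}'
print(model_ids(sample))  # ['deepseek-ocr']
```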

Run OCR:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer EMPTY" \
  -d '{
    "model": "deepseek-ocr",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://ofasys-multimodal-wlcb-3-toshanghai.oss-accelerate.aliyuncs.com/wpf272043/keepme/image/receipt.png"
            }
          },
          { "type": "text", "text": "Free OCR." }
        ]
      }
    ],
    "max_tokens": 2048,
    "temperature": 0.0,
    "skip_special_tokens": false,
    "extra_body": {
      "vllm_xargs": {
        "ngram_size": 30,
        "window_size": 90,
        "whitelist_token_ids": [128821, 128822]
      }
    }
  }'

Interpret the image:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer EMPTY" \
  -d '{
    "model": "deepseek-ocr",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://ofasys-multimodal-wlcb-3-toshanghai.oss-accelerate.aliyuncs.com/wpf272043/keepme/image/receipt.png"
            }
          },
          { "type": "text", "text": "\n 这是一张" }
        ]
      }
    ],
    "max_tokens": 2048,
    "temperature": 0.0,
    "skip_special_tokens": false,
    "extra_body": {
      "vllm_xargs": {
        "ngram_size": 30,
        "window_size": 90,
        "whitelist_token_ids": [128821, 128822]
      }
    }
  }'
The same OCR request via the OpenAI Python client:

import time

from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",
    base_url="http://localhost:8080/v1",
    timeout=3600,
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://ofasys-multimodal-wlcb-3-toshanghai.oss-accelerate.aliyuncs.com/wpf272043/keepme/image/receipt.png"
                },
            },
            {"type": "text", "text": "Free OCR."},
        ],
    }
]

start = time.time()
response = client.chat.completions.create(
    model="/workspace/model",
    messages=messages,
    max_tokens=2048,
    temperature=0.0,
    extra_body={
        "skip_special_tokens": False,
        # args used to control custom logits processor
        "vllm_xargs": {
            "ngram_size": 30,
            "window_size": 90,
            # whitelist: <td>, </td>
            "whitelist_token_ids": [128821, 128822],
        },
    },
)
print(f"Response costs: {time.time() - start:.2f}s")
print(f"Generated text: {response.choices[0].message.content}")

Alternative Usage with Grounding

For document-to-markdown conversion:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "your_image_url"}},
            {"type": "text", "text": "<|grounding|>Convert the document to markdown."},
        ],
    }
]
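The message shape is the same for every prompt, so it can be factored into a small builder; a sketch, where the helper name is hypothetical and only the message structure comes from the examples above.

```python
def build_vision_messages(image_url: str, prompt: str) -> list:
    """Assemble the single-turn multimodal message list the chat endpoint expects."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        }
    ]


# Same grounding request as above, built via the helper.
messages = build_vision_messages(
    "your_image_url",
    "<|grounding|>Convert the document to markdown.",
)
```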

Custom Parameters

You can pass additional vLLM parameters when running the container:

docker run -d \
  --name deepseek-ocr \
  -p 8080:8080 \
  --gpus all \
  --ipc=host \
  deepseek-ocr/deepseek-ocr:latest \
  --max-model-len 8192 \
  --max-num-batched-tokens 4096

Model Sizes

DeepSeek-OCR supports different processing modes:

  • Tiny: base_size = 512, image_size = 512
  • Small: base_size = 640, image_size = 640
  • Base: base_size = 1024, image_size = 1024
  • Large: base_size = 1280, image_size = 1280
  • Gundam: base_size = 1024, image_size = 640, crop_mode = True
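The modes above can be captured in a small lookup table. A sketch: the names and sizes mirror the list, but crop_mode is only stated for Gundam, so crop_mode = False for the other modes is an assumption, and the helper name is hypothetical.

```python
# Processing modes from the list above: base_size, image_size, crop_mode.
MODES = {
    "tiny":   {"base_size": 512,  "image_size": 512,  "crop_mode": False},
    "small":  {"base_size": 640,  "image_size": 640,  "crop_mode": False},
    "base":   {"base_size": 1024, "image_size": 1024, "crop_mode": False},
    "large":  {"base_size": 1280, "image_size": 1280, "crop_mode": False},
    "gundam": {"base_size": 1024, "image_size": 640,  "crop_mode": True},
}


def mode_config(name: str) -> dict:
    """Look up a processing mode by name, case-insensitively."""
    try:
        return MODES[name.lower()]
    except KeyError:
        raise ValueError(f"unknown mode {name!r}; choose from {sorted(MODES)}") from None


print(mode_config("Gundam"))  # {'base_size': 1024, 'image_size': 640, 'crop_mode': True}
```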

References