
A Secure, Easy-to-Use Open API
This article covers two integration approaches (direct debugging and an API service) and includes VSCode-specific configuration tips.
1. Project Structure
```
deepseek-vscode/
├── models/              # model files
│   └── deepseek-7b-chat/
├── src/
│   ├── api.py           # API service
│   └── client.py        # client test script
├── .env                 # environment variables
└── requirements.txt     # dependency list
```
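The tree references a `requirements.txt` that the article never lists. A minimal sketch, derived from the packages imported in the code below (versions left unpinned, as the article does not state them):

```
transformers
torch
fastapi
uvicorn
```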
Create `src/deepseek_demo.py`:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_PATH = "./models/deepseek-7b-chat"

_model = None
_tokenizer = None

def load_model():
    # Load the model and tokenizer once, then reuse them across calls
    global _model, _tokenizer
    if _model is None:
        _tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
        _model = AutoModelForCausalLM.from_pretrained(
            MODEL_PATH,
            device_map="auto",
            torch_dtype=torch.float16,
        )
    return _model, _tokenizer

def generate_response(prompt):
    model, tokenizer = load_model()
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=200)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Press F5 in VSCode to start debugging
if __name__ == "__main__":
    while True:
        query = input("User: ")
        print("DeepSeek:", generate_response(query))
```
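The `apply_chat_template` call converts a list of role/content messages into the single prompt string the model was trained on. Conceptually it works like the sketch below; the markers shown here are hypothetical, since the real template is defined per-model in the tokenizer's configuration:

```python
def render_chat(messages):
    # Simplified illustration of a chat template; real templates differ per model
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}")
    parts.append("<|assistant|>\n")  # generation prompt: cue the model to reply
    return "\n".join(parts)

prompt = render_chat([{"role": "user", "content": "Hello"}])
# prompt is "<|user|>\nHello\n<|assistant|>\n"
```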
Create `src/api.py`:
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from src.deepseek_demo import generate_response
import uvicorn

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.get("/chat")
async def chat(q: str):
    try:
        response = generate_response(q)
        return {"response": response}
    except Exception as e:
        return {"error": str(e)}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
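The project tree lists `src/client.py`, but the article never shows it. A minimal stdlib-only sketch (the URL and helper names are assumptions; the service must already be running on port 8000):

```python
# src/client.py - minimal test client for the /chat endpoint
from urllib.parse import urlencode
from urllib.request import urlopen
import json

BASE_URL = "http://localhost:8000"

def chat_url(q, base=BASE_URL):
    # Build the GET URL with the query string percent-encoded
    return f"{base}/chat?{urlencode({'q': q})}"

def ask(q):
    # Send the request and decode the JSON response
    with urlopen(chat_url(q)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (requires the API service to be running):
#   print(ask("Explain the basics of quantum computing"))
```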
Create `.vscode/launch.json` to configure debugging:
```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Launch API service",
      "type": "python",
      "request": "launch",
      "program": "src/api.py",
      "args": [],
      "env": {"PYTHONPATH": "${workspaceFolder}"}
    },
    {
      "name": "Interactive debugging",
      "type": "python",
      "request": "launch",
      "program": "src/deepseek_demo.py",
      "console": "integratedTerminal",
      "env": {"PYTHONPATH": "${workspaceFolder}"}
    }
  ]
}
```
For interactive testing, create a notebook (`.ipynb`) or use `# %%` cell markers in a Python file:
```python
# %%
from src.deepseek_demo import generate_response

# Test model responses interactively
def test_model(prompt):
    response = generate_response(prompt)
    print(f"Input: {prompt}\nOutput: {response}")

test_model("Explain the basic principles of quantum computing")
```
Use the Python Debugger during debugging; recommended workspace settings (`.vscode/settings.json`):
```json
{
  "python.analysis.extraPaths": ["./src"],
  "python.languageServer": "Pylance",
  "jupyter.kernels.trusted": true,
  "debugpy.allowRemote": true,
  "python.terminal.activateEnvironment": true
}
```
Create a `Dockerfile`:
```dockerfile
FROM nvidia/cuda:12.2.0-base-ubuntu22.04
WORKDIR /app
COPY . .
RUN apt-get update && \
    apt-get install -y python3.10 python3-pip && \
    pip3 install -r requirements.txt
CMD ["python3", "src/api.py"]
```
Use the Dev Containers extension for one-click containerized development.
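Note that `COPY . .` copies the entire project, including the multi-gigabyte `models/` directory, into the build context. If you would rather mount the weights at run time (e.g. with `docker run -v`), a `.dockerignore` can exclude them; a sketch, to be adjusted to your layout:

```
# .dockerignore
models/
.env
__pycache__/
```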
| Issue | Solution |
| --- | --- |
| Module import errors | Add `PYTHONPATH=/path/to/project-root` to the `.env` file |
| CUDA version mismatch | Use VSCode's Dev Container feature to create an isolated environment |
| Long-text generation stutters | Install the Transformer Tokens extension to monitor token consumption in real time |
| Garbled Chinese output | Set the terminal encoding: `"terminal.integrated.defaultProfile.windows": "Command Prompt"` |
Performance test example (VSCode terminal):
```shell
# Launch a stress test
python -m src.stress_test --threads 4 --requests 100
```
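The article does not show `src/stress_test.py`, so here is a minimal stdlib-only sketch matching the flags above (the `--url` flag and internal names are assumptions):

```python
# src/stress_test.py - minimal load tester for the /chat endpoint
import argparse
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Stress-test the /chat endpoint")
    parser.add_argument("--threads", type=int, default=4)
    parser.add_argument("--requests", type=int, default=100)
    parser.add_argument("--url", default="http://localhost:8000/chat")
    return parser.parse_args(argv)

def one_request(url):
    # Time a single GET round trip
    start = time.perf_counter()
    with urlopen(f"{url}?{urlencode({'q': 'ping'})}") as resp:
        resp.read()
    return time.perf_counter() - start

def main(argv=None):
    args = parse_args(argv)
    with ThreadPoolExecutor(max_workers=args.threads) as pool:
        latencies = list(pool.map(lambda _: one_request(args.url),
                                  range(args.requests)))
    print(f"avg latency: {sum(latencies) / len(latencies):.3f}s")

# To enable "python -m src.stress_test", add:
#   if __name__ == "__main__": main()
```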
Recommended extension combination:
With the configuration above, VSCode gives you:
This article is reposted from @CSDNddv_08 on CSDN.