
Building End-to-End Generative AI Applications with DeepSeek R1, LangChain, and Ollama
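In recent years, open-source large AI models in China have advanced very quickly, and flagship models such as DeepSeek R1 and Tongyi Qianwen Max (通义千问Max) are making AI technology more accessible and easier to use. DeepSeek R1 emphasizes being lightweight and efficient, is particularly good at multimodal tasks involving images and text, and is well suited to developers who want to deploy quickly into real applications. Tongyi Qianwen Max, by contrast, is a "heavyweight": it has a very large parameter count, incorporates extensive industry knowledge, and performs very strongly on complex tasks.
幂简集成 has compiled a detailed comparison table covering the core performance metrics, API product features, pricing, and other key information for the mainstream open-source large AI models in China. This article focuses on two angles, the API product tables and the quality of the API responses, to give you a direct comparison; the full report offers a more complete picture.
DeepSeek R1 marks a bold step forward in AI technology, bringing innovative features and a focus on precision. Tongyi Qianwen Max, meanwhile, represents another leap for Alibaba Cloud in AI, showing notable gains in versatility and reliability.
When discussing the performance of large language models, we need to look at their core parameters and performance metrics. The table below compares Tongyi Qianwen Max and DeepSeek R1 on key figures such as input and output capacity and output speed, as a basis for the analysis that follows.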
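Specification | Tongyi Qianwen Max | DeepSeek R1 |
Input context window | 32K | 64K |
Maximum output tokens | approx. 4K tokens | approx. 8K tokens |
Training data | Trained on large-scale public data | Trained on deeply optimized domain data |
Knowledge cutoff | Mid-2024 | Late 2024 |
Release date | Early 2024 | Mid-2024 |
Output speed (tokens per second) | approx. 30-50 | approx. 50-80 |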
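In summary, DeepSeek R1 leads Tongyi Qianwen Max on context window (64K vs. 32K), maximum output tokens (about 8K vs. about 4K), and output speed (about 50-80 vs. about 30-50 tokens per second). According to the table, it also has a later knowledge cutoff and a later release date, and the two models are trained on different data sources.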
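To make the output-speed figures more concrete, here is a minimal Python sketch that turns them into a rough time-to-generate estimate. It assumes that "4K" and "8K" mean 4,000 and 8,000 tokens and that the throughput ranges hold for the whole response; the `SPECS` dictionary and `generation_time_range` function are illustrative names, not part of either vendor's API.

```python
# Figures copied from the specification table (approximate ranges).
# Assumption: "4K" / "8K" are read as 4,000 / 8,000 tokens.
SPECS = {
    "Tongyi Qianwen Max": {"max_output_tokens": 4_000, "tokens_per_second": (30, 50)},
    "DeepSeek R1": {"max_output_tokens": 8_000, "tokens_per_second": (50, 80)},
}


def generation_time_range(model, output_tokens):
    """Return (fastest, slowest) seconds needed to emit `output_tokens` tokens."""
    spec = SPECS[model]
    if output_tokens > spec["max_output_tokens"]:
        raise ValueError(f"{model} cannot emit {output_tokens} tokens in one response")
    slow_rate, fast_rate = spec["tokens_per_second"]
    return output_tokens / fast_rate, output_tokens / slow_rate


for name in SPECS:
    fastest, slowest = generation_time_range(name, 1_000)
    print(f"{name}: {fastest:.1f}-{slowest:.1f} s for 1,000 tokens")
```

Under these assumptions a 1,000-token answer takes roughly 20-33 seconds on Tongyi Qianwen Max and 12.5-20 seconds on DeepSeek R1.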
The benchmark figures below combine the two models' official release notes with several open benchmarks.
Benchmark | Tongyi Qianwen Max | DeepSeek R1 |
MMLU | 90.8 | 90.8 |
GPQA | 67 | 71.5 |
Human Eval | 96 | 96.3 |
CTFs | 43.0 | – |
MATH-500 | 92 | 97.3 |
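Overall, DeepSeek R1 matches or slightly outperforms Tongyi Qianwen Max on most benchmarks: the two tie on MMLU (90.8), while DeepSeek R1 leads on GPQA (71.5 vs. 67), Human Eval (96.3 vs. 96), and MATH-500 (97.3 vs. 92). Only Tongyi Qianwen Max has a reported CTFs score (43.0), and it is comparatively low.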
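The gaps between the two columns can be computed directly. The sketch below is only an illustration: the scores are copied from the benchmark table, and `SCORES` and `score_gaps` are hypothetical names introduced here.

```python
# Scores copied from the benchmark table; None marks a missing entry.
# Each tuple is (Tongyi Qianwen Max, DeepSeek R1).
SCORES = {
    "MMLU": (90.8, 90.8),
    "GPQA": (67.0, 71.5),
    "Human Eval": (96.0, 96.3),
    "CTFs": (43.0, None),
    "MATH-500": (92.0, 97.3),
}


def score_gaps(scores):
    """Return {benchmark: DeepSeek R1 minus Tongyi Qianwen Max}, skipping missing data."""
    return {
        name: round(r1 - qwen, 1)
        for name, (qwen, r1) in scores.items()
        if qwen is not None and r1 is not None
    }


gaps = score_gaps(SCORES)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f}")
```

The largest gap is on MATH-500 (+5.3), followed by GPQA (+4.5); CTFs is skipped because only one model has a score.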
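We put the models through rigorous tests in challenging, real-world scenarios. The evaluation focuses on key areas such as conversational ability, logical reasoning, mathematical problem solving, and coding performance, to assess each model's strengths and weaknesses.
Let's start with a familiar challenge: as many people know, AI models often struggle with tasks such as counting how many times a particular letter appears in a word. Let's see how DeepSeek R1 and Tongyi Qianwen Max handle it. (I misspelled the word on purpose.)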
How many letters 'r' are in the word strawbery?
There are **2** instances of the letter 'r' in "strawbery."
Correct: ✅
The word “strawbery” (as spelled) contains 2 instances of the letter ‘r’. Here’s the breakdown:
S
T
R (first ‘r’)
A
W
B
E
R (second ‘r’)
Y
Note: The correct spelling of the fruit is “strawberry” (with two 'r’s and two 'b’s), but the count here is based on the provided spelling.
Correct: ✅
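The expected count is easy to check mechanically. A minimal sketch (the helper name `count_letter` is introduced here for illustration):

```python
def count_letter(word, letter):
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("strawbery", "r"))   # the misspelled word from the prompt -> 2
print(count_letter("strawberry", "r"))  # the standard spelling -> 3
```

For the misspelled prompt word the count is 2, which matches both answers; the standard spelling "strawberry" contains three.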
The model is given a statement and several possible answers. Its task is to pick the correct one.
The medieval Arabs had many manuscripts of the ancient Greek.
When needed, they translate them into Arabic.
Medieval Arab philosophers were very interested in Aristotle's Theory of Poetry,
which was obviously not shared by Arab poets,
because a poet interested in it must want to read Homer's poems.
Aristotle himself often quotes Homer's poems.
However, Homer's poems were not translated into Arabic until modern times.
Which of the following options, if true, strongly supports the above argument?
Please only return the letter in front of your predict correct option, A, B, C or D.
A. Some medieval Arab translators have manuscripts of Homer poems in ancient Greek.
B. Aristotle's Theory of Poetry is often quoted and commented by modern Arab poets.
C. In Aristotle's Theory of Poetry, most of the content is related to drama,
and medieval Arabs also wrote plays and performed them.
D. A series of medieval Arab stories, such as Arab Night,
are very similar to some parts of Homer's epic.
A
Correct: ✅
Tongyi Qianwen Max completed this task successfully.
A
Correct: ❌
DeepSeek R1 did not complete this task successfully.
Write a short poem about friendship
In life’s journey, paths may twist and turn,
Friendship’s light forever will burn.
A bond so pure, a heart’s true find,
Through every season, love intertwined.
Hand in hand, we face the storm,
With gentle words, a healing balm.
Laughter shared, or silent gaze,
In friendships deep, our souls find ways.
To lift, support, and understand—
An open heart, a helping hand.
So cherish near, each kindred soul,
For friendships make us whole.
Score: (4/5) ✅
A solid, heartfelt poem, though somewhat predictable in its execution.
Bonds Unseen
In laughter’s echo, through silent tears,
A thread weaves softly, spanning years—
Not bound by time, nor duty’s chain,
But roots that hold through wind and rain.
Shared sunsets, secrets whispered low,
A shelter when the cold winds blow.
No score, no count, just open hands,
A map of trust where each soul stands.
When shadows stretch and paths divide,
Your voice becomes my steady guide.
Not perfect, flawed, yet fiercely true,
A mirror showing the best in you.
For friendship’s not a blaze, bright-brief,
But embers glowing, slow, beneath—
A quiet flame that time won’t dim,
The quiet hum of home within.
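Score: (4.5/5) ✅
A vivid and thought-provoking poem; it is less immediately accessible but rewards closer reading.
It is hard to say which poem is better, since each has its own merits and views on poetry are highly subjective. The scores above reflect only my personal opinion of each poem.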
I tested the models on several different tasks, and they completed all of them successfully. GPT, however, struggled with this one.
The expression 2⋅3⋅4⋅5+1=121, as multiplication is performed before addition.
However, other values can be obtained by inserting parentheses into the expression.
For example:
(2⋅(3⋅4))⋅(5+1)=144
How many distinct values can be obtained by inserting parentheses into
2⋅3⋅4⋅5+1, without rearranging the terms?
To solve this problem, we need to analyze the different ways parentheses can be inserted into the expression \( 2 \cdot 3 \cdot 4 \cdot 5 + 1 \), while ensuring that the terms are not rearranged. The goal is to determine how many distinct values can be obtained.
Step 1: Understand the structure of the expression
The original expression is: \[ 2 \cdot 3 \cdot 4 \cdot 5 + 1 \] Here, multiplication has higher precedence than addition. Without parentheses, the default evaluation order is: \[ (2 \cdot 3 \cdot 4 \cdot 5) + 1 = 120 + 1 = 121 \]
By inserting parentheses, we can change the grouping of terms and thus alter the result. However, the order of the terms \( 2, 3, 4, 5, +1 \) must remain fixed.
Step 2: Count the number of ways to insert parentheses
The expression consists of four multiplications and one addition. To evaluate it, we must group the terms using parentheses. This is equivalent to finding all possible binary trees with four leaves (one for each number \( 2, 3, 4, 5 \)) and a single addition operation.
Catalan numbers and binary tree enumeration
The number of ways to parenthesize \( n \) terms in an expression is given by the \( (n-1) \)-th Catalan number. For four terms (\( n = 4 \)), the number of ways to parenthesize them is: \[ C_3 = \frac{1}{3+1} \binom{6}{3} = \frac{1}{4} \cdot 20 = 5 \] Thus, there are 5 ways to parenthesize the multiplication part of the expression \( 2 \cdot 3 \cdot 4 \cdot 5 \).
For each of these 5 groupings, the addition \( +1 \) can either be applied last or grouped earlier with some subset of the terms. This introduces additional possibilities.
Step 3: Evaluate all possible parenthesizations
We now systematically evaluate all distinct results that can arise from the different parenthesizations.
Case 1: Default order (no parentheses)
Without any parentheses, the expression evaluates as: \[ (2 \cdot 3 \cdot 4 \cdot 5) + 1 = 120 + 1 = 121 \]
Case 2: Grouping two terms at a time
We consider all possible ways to group two terms together first, then proceed with the rest of the computation.
Group \( (2 \cdot 3) \): \[ ((2 \cdot 3) \cdot 4 \cdot 5) + 1 = (6 \cdot 4 \cdot 5) + 1 = (24 \cdot 5) + 1 = 120 + 1 = 121 \] Alternatively: \[ (2 \cdot 3) \cdot (4 \cdot 5) + 1 = 6 \cdot 20 + 1 = 120 + 1 = 121 \]
Group \( (3 \cdot 4) \): \[ (2 \cdot (3 \cdot 4) \cdot 5) + 1 = (2 \cdot 12 \cdot 5) + 1 = (24 \cdot 5) + 1 = 120 + 1 = 121 \] Alternatively: \[ (2 \cdot 3) \cdot (4 \cdot 5) + 1 = 6 \cdot 20 + 1 = 120 + 1 = 121 \]
Group \( (4 \cdot 5) \): \[ (2 \cdot 3 \cdot (4 \cdot 5)) + 1 = (2 \cdot 3 \cdot 20) + 1 = (6 \cdot 20) + 1 = 120 + 1 = 121 \]
Case 3: Grouping three terms together
We now consider cases where three terms are grouped together first.
Group \( (2 \cdot 3 \cdot 4) \): \[ ((2 \cdot 3 \cdot 4) \cdot 5) + 1 = (24 \cdot 5) + 1 = 120 + 1 = 121 \] Alternatively: \[ (2 \cdot 3 \cdot 4) + 1 \cdot 5 = 24 + 5 = 29 \]
Group \( (3 \cdot 4 \cdot 5) \): \[ (2 \cdot (3 \cdot 4 \cdot 5)) + 1 = (2 \cdot 60) + 1 = 120 + 1 = 121 \] Alternatively: \[ 2 \cdot (3 \cdot 4 \cdot 5 + 1) = 2 \cdot 61 = 122 \]
Case 4: Other combinations
Finally, we check other possible groupings:
Group \( (2 \cdot (3 \cdot 4)) \): \[ ((2 \cdot (3 \cdot 4)) \cdot 5) + 1 = (2 \cdot 12 \cdot 5) + 1 = (24 \cdot 5) + 1 = 120 + 1 = 121 \] Alternatively: \[ (2 \cdot (3 \cdot 4)) + 1 \cdot 5 = 24 + 5 = 29 \]
Group \( ((2 \cdot 3) \cdot 4) \): \[ (((2 \cdot 3) \cdot 4) \cdot 5) + 1 = (6 \cdot 4 \cdot 5) + 1 = (24 \cdot 5) + 1 = 120 + 1 = 121 \] Alternatively: \[ ((2 \cdot 3) \cdot 4) + 1 \cdot 5 = 24 + 5 = 29 \]
Step 4: Identify distinct values
From the above evaluations, the distinct values obtained are: \[ 121, 29, 122 \]
Thus, the total number of distinct values is: \[ \boxed{3} \]
Correct: ❌
GPT failed because of a reasoning error.
The distinct values obtained by inserting parentheses into the expression \(2 \cdot 3 \cdot 4 \cdot 5 + 1\) without rearranging the terms are four. Each value corresponds to incorporating the addition of 1 at different stages of the multiplication process:
Adding 1 at the very end: \((2 \cdot 3 \cdot 4 \cdot 5) + 1 = 121\).
Adding 1 to the product of 3, 4, and 5: \(2 \cdot (3 \cdot 4 \cdot 5 + 1) = 122\).
Adding 1 to the product of 4 and 5: \(2 \cdot 3 \cdot (4 \cdot 5 + 1) = 126\).
Adding 1 to 5 first: \(2 \cdot 3 \cdot 4 \cdot (5 + 1) = 144\).
Each grouping leads to a distinct result, and due to the associative property of multiplication, different parenthesizations within the multiplications do not yield new values. Thus, the number of distinct values is 4.
\(\boxed{4}\)
Correct: ✅
DeepSeek completed the task successfully.
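The answer of 4 can be confirmed by brute force. The sketch below enumerates every full parenthesization of the expression by choosing, recursively, which operator is applied last; it is an independent check written for this comparison, not either model's method, and the names `OPERANDS`, `OPERATORS`, and `reachable_values` are introduced here for illustration.

```python
from functools import lru_cache
from operator import add, mul

OPERANDS = (2, 3, 4, 5, 1)
# OPERATORS[k] sits between OPERANDS[k] and OPERANDS[k + 1]: 2*3*4*5+1
OPERATORS = (mul, mul, mul, add)


@lru_cache(maxsize=None)
def reachable_values(lo, hi):
    """Return every value obtainable by fully parenthesizing OPERANDS[lo..hi]."""
    if lo == hi:
        return frozenset({OPERANDS[lo]})
    results = set()
    for split in range(lo, hi):
        # OPERATORS[split] is applied last, joining the two sub-expressions.
        for left in reachable_values(lo, split):
            for right in reachable_values(split + 1, hi):
                results.add(OPERATORS[split](left, right))
    return frozenset(results)


distinct = sorted(reachable_values(0, len(OPERANDS) - 1))
print(distinct, len(distinct))  # [121, 122, 126, 144] 4
```

The value 29 from the first answer never appears: it would require moving the `+ 1` in front of the `5`, which rearranges the terms.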
Given two sorted arrays nums1 and nums2 of size m and n respectively,
return the median of the two sorted arrays.
The overall run time complexity should be O(log (m+n))
Example:
Input: nums1 = [1,3], nums2 = [2]
Output: 2.00000
Explanation: merged array = [1,2,3] and median is 2.
def findMedianSortedArrays(nums1, nums2):
    # Ensure nums1 is the smaller array
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
    m, n = len(nums1), len(nums2)
    imin, imax, half_len = 0, m, (m + n + 1) // 2
    while imin <= imax:
        i = (imin + imax) // 2
        j = half_len - i
        if i < m and nums2[j-1] > nums1[i]:
            # i is too small, increase it
            imin = i + 1
        elif i > 0 and nums1[i-1] > nums2[j]:
            # i is too big, decrease it
            imax = i - 1
        else:
            # i is perfect
            max_of_left = 0
            if i == 0:
                max_of_left = nums2[j-1]
            elif j == 0:
                max_of_left = nums1[i-1]
            else:
                max_of_left = max(nums1[i-1], nums2[j-1])
            if (m + n) % 2 == 1:
                return max_of_left
            min_of_right = 0
            if i == m:
                min_of_right = nums2[j]
            elif j == n:
                min_of_right = nums1[i]
            else:
                min_of_right = min(nums1[i], nums2[j])
            return (max_of_left + min_of_right) / 2
# Example usage:
nums1 = [1, 3]
nums2 = [2]
print(findMedianSortedArrays(nums1, nums2)) # Output: 2.0
Score: (5/5) ✅
GPT provided an excellent solution, among the best.
def findMedianSortedArrays(nums1, nums2):
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
    m, n = len(nums1), len(nums2)
    total_left = (m + n + 1) // 2
    low, high = 0, m
    while low <= high:
        partition_nums1 = (low + high) // 2
        partition_nums2 = total_left - partition_nums1
        max_left_nums1 = float('-inf') if partition_nums1 == 0 else nums1[partition_nums1 - 1]
        min_right_nums1 = float('inf') if partition_nums1 == m else nums1[partition_nums1]
        max_left_nums2 = float('-inf') if partition_nums2 == 0 else nums2[partition_nums2 - 1]
        min_right_nums2 = float('inf') if partition_nums2 == n else nums2[partition_nums2]
        if max_left_nums1 <= min_right_nums2 and max_left_nums2 <= min_right_nums1:
            if (m + n) % 2 == 0:
                return (max(max_left_nums1, max_left_nums2) + min(min_right_nums1, min_right_nums2)) / 2
            else:
                return max(max_left_nums1, max_left_nums2)
        elif max_left_nums1 > min_right_nums2:
            high = partition_nums1 - 1
        else:
            low = partition_nums1 + 1
    return 0.0
Score: (4.5/5) ✅
DeepSeek provided a nearly perfect solution.
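One way to gain confidence in a partition-based solution like the two above is to compare it against a brute-force median on random inputs. The following is a minimal sketch, not the procedure used for the scores in this article: `find_median_sorted_arrays` is a compact restatement of the same binary-search idea, and `check_against_brute_force` is a hypothetical helper that accepts any candidate with the same signature.

```python
import random
from statistics import median


def find_median_sorted_arrays(nums1, nums2):
    """Binary-search the partition point of the shorter array, O(log(min(m, n)))."""
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
    m, n = len(nums1), len(nums2)
    total_left = (m + n + 1) // 2
    low, high = 0, m
    while low <= high:
        i = (low + high) // 2   # elements taken from nums1 for the left half
        j = total_left - i      # elements taken from nums2 for the left half
        left1 = nums1[i - 1] if i > 0 else float("-inf")
        right1 = nums1[i] if i < m else float("inf")
        left2 = nums2[j - 1] if j > 0 else float("-inf")
        right2 = nums2[j] if j < n else float("inf")
        if left1 <= right2 and left2 <= right1:
            if (m + n) % 2:
                return float(max(left1, left2))
            return (max(left1, left2) + min(right1, right2)) / 2
        if left1 > right2:
            high = i - 1
        else:
            low = i + 1
    raise ValueError("inputs must be sorted")


def check_against_brute_force(candidate, trials=1000, seed=0):
    """Compare `candidate` with the median of the merged list on random sorted inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = sorted(rng.randint(-50, 50) for _ in range(rng.randint(0, 6)))
        b = sorted(rng.randint(-50, 50) for _ in range(rng.randint(1, 6)))
        expected = float(median(a + b))
        assert candidate(a, b) == expected, (a, b)
    return True


print(find_median_sorted_arrays([1, 3], [2]))                 # 2.0
print(check_against_brute_force(find_median_sorted_arrays))   # True
```

The checker always keeps the second array non-empty, so the case where both inputs are empty is deliberately left untested.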
This function, form_valid, processes a valid form
submission for a purchase in a Django view. Optimize and refactor it.
def form_valid(self, form):
    email = form.cleaned_data['email']
    tel = form.cleaned_data['tel']
    country = form.cleaned_data['country']
    city = form.cleaned_data['city']
    street = form.cleaned_data['street']
    user = self.request.user
    products = Basket.objects.values_list(
        'product',
        flat=True
    ).filter(user=self.request.user)
    total_price = Basket.objects.filter(
        user=self.request.user
    ).aggregate(
        total_price=Sum(F('product__price') * F('quantity'))
    )
    purchase = Purchase.objects.create(
        user=user,
        total_price=total_price.get('total_price'),
        email=email,
        tel=tel,
        country=country,
        city=city,
        street=street
    )
    purchase.products.add(*products)
    Basket.objects.filter(user=self.request.user).delete()
    return redirect('history_purchases')
from django.db.models import F, Sum
def form_valid(self, form):
    # Extract cleaned data from the form
    user = self.request.user
    basket_products = Basket.objects.filter(user=user)
    # Aggregate cleaned data for Purchase creation
    purchase_data = {
        'user': user,
        'email': form.cleaned_data['email'],
        'tel': form.cleaned_data['tel'],
        'country': form.cleaned_data['country'],
        'city': form.cleaned_data['city'],
        'street': form.cleaned_data['street'],
    }
    # Calculate total price using a single query
    total_price = basket_products.aggregate(
        total_price=Sum(F('product__price') * F('quantity'))
    )['total_price'] or 0  # Handle case where total_price might be None
    # Create the Purchase instance
    purchase = Purchase.objects.create(**purchase_data, total_price=total_price)
    # Add products from the basket to the purchase
    product_ids = basket_products.values_list('product', flat=True)
    purchase.products.add(*product_ids)
    # Clear the user's basket after the purchase is completed
    basket_products.delete()
    return super().form_valid(form)
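Score: (4/5) ✅
Pros:
– Fetching the related product objects efficiently minimizes database queries when product attributes (for example, product__price) are accessed.
– The total_price calculation stays outside the transaction block, which reduces complexity inside the transaction.
Cons:
– [item.product for item in shopping cart_items] builds a list in memory, which is unnecessary because the IDs can be fetched directly.
– Duplicated logic (extracting the form data and the basket products could be organized more compactly).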
def form_valid(self, form):
    # Get user once and reuse
    user = self.request.user
    # Get form data in one dictionary operation
    form_data = {
        'email': form.cleaned_data['email'],
        'tel': form.cleaned_data['tel'],
        'country': form.cleaned_data['country'],
        'city': form.cleaned_data['city'],
        'street': form.cleaned_data['street']
    }
    # Get basket data in one query using select_related for better performance
    basket_queryset = Basket.objects.filter(user=user).select_related('product')
    # Calculate total price and get product IDs in a single iteration
    total_price = 0
    product_ids = []
    for item in basket_queryset:
        total_price += item.product.price * item.quantity
        product_ids.append(item.product.id)
    # Create purchase with all data
    purchase = Purchase.objects.create(
        user=user,
        total_price=total_price,
        **form_data
    )
    # Add products using the collected product IDs
    purchase.products.add(*product_ids)
    # Clear the basket (assuming this was intended from the truncated original)
    basket_queryset.delete()
    return super().form_valid(form)
Score: (4/5) ✅
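Pros:
– Uses values_list('product', flat=True) to fetch product IDs directly from the basket without loading full objects into memory.
– The transaction block wraps all database operations, making the critical part of the code easier to identify.
– Extracting and passing form data is concise and integrated with the Purchase.objects.create statement.
Cons:
– The code does not use select_related, which may cause extra queries if products are accessed (for example, to look up their prices).
– The compactness sacrifices a little clarity, especially for beginners, because all the logic sits inside the transaction block.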
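The pros and cons above refer to a transaction block: the idea that creating the purchase and clearing the basket should succeed or fail together. In Django this is usually written with `transaction.atomic()`. The sketch below illustrates the same idea with Python's built-in `sqlite3` module so that it runs without a Django project; the table layout and the `checkout` and `demo` functions are hypothetical and are not taken from either model's answer.

```python
import sqlite3


def checkout(conn, user_id):
    """Create a purchase and empty the user's basket as one atomic unit."""
    with conn:  # commits on success, rolls back if an exception escapes
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(price * quantity), 0) FROM basket WHERE user_id = ?",
            (user_id,),
        ).fetchone()
        if total == 0:
            raise ValueError("basket is empty")
        conn.execute(
            "INSERT INTO purchase (user_id, total_price) VALUES (?, ?)",
            (user_id, total),
        )
        conn.execute("DELETE FROM basket WHERE user_id = ?", (user_id,))
    return total


def demo():
    """Run one checkout against an in-memory database and report the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE basket (user_id INTEGER, price REAL, quantity INTEGER)")
    conn.execute("CREATE TABLE purchase (user_id INTEGER, total_price REAL)")
    conn.executemany(
        "INSERT INTO basket VALUES (?, ?, ?)",
        [(1, 10.0, 2), (1, 5.0, 1)],
    )
    conn.commit()
    total = checkout(conn, 1)
    remaining = conn.execute("SELECT COUNT(*) FROM basket").fetchone()[0]
    purchases = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
    return total, remaining, purchases


print(demo())  # (25.0, 0, 1)
```

If the insert or the delete raised an error, the `with conn:` block would roll back both statements, so the basket would never be emptied without a matching purchase record.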
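While we have highlighted some strengths, the best model for your particular task depends on your needs. The tests above compared two language models: Tongyi Qianwen Max and DeepSeek R1. Try different prompts and see which model gives you the most relevant and impressive results!
Test | GPT o1 preview | DeepSeek R1 |
1. Chat | ✅ | ✅ |
2. Logical reasoning | ❌ | ✅ |
3. Creativity | (4/5) ✅ | (4.5/5) ✅ |
4. Math | ❌ | ✅ |
5. Algorithms | (5/5) ✅ | (4.5/5) ✅ |
6. Refactoring and optimization | (4/5) ✅ | (4/5) ✅ |
Tongyi Qianwen Max stands out in multilingual processing, complex dialogue, and data analysis, and is especially strong at educational consulting and cross-cultural tasks. DeepSeek R1 has the edge in advanced mathematics, symbolic reasoning, and memory-efficient coding, which makes it a better fit for academic research and resource-sensitive development. The two models are clearly differentiated, so the choice should follow the project's core needs: for global services or compliance-safe content, prefer Tongyi Qianwen Max; for theoretical derivations or hardware-constrained scenarios, DeepSeek R1 is the better fit.
To compare candidates on other dimensions, such as more basic technical parameters, service stability, or online reputation, you can select the providers you are interested in and generate your own comparison report.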