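Spring AI: calling tools and MCP services
This article walks through implementing an MCP (Model Context Protocol) service with the Spring AI framework. Using DDR4 memory price analysis as the example, it shows how to build an agent that supports tool calling. It covers the server and client setup, including model choice (Gemma3 and Qwen3.5), the MCP tool annotations, and the YAML configuration. After finding that Gemma3 does not support tool calling, the author switches to Qwen3.5 and restructures the system so that fetching real-time data is separated from LLM reasoning.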
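Preface
A recent project of mine needed AI support. In practice, a lot of open-ended content can now be generated by AI: text, images, even audio and video. Vibe coding is being applied at scale and shines for CRUD work. But a model's knowledge comes from pre-training; unless it is continually retrained, an LLM's knowledge is frozen at its training cutoff. That is why RAG, function calling, and MCP exist, and all of them serve agents. We normally interact with a large model through prompts; an agent is a wrapper that combines these capabilities, and the currently popular openclaw is essentially an agent doing this kind of decision-making and tool invocation.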
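Preparing the demo
Spring AI already wraps the MCP capabilities and supports Ollama natively. I am on a MacBook with no NVIDIA GPU, so I chose a GGUF model for convenience: gemma3:4b with Q4_K_M quantization. That model choice turned out to be a pitfall for MCP.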
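MCP server
The pom. My MCP server also uses a large language model (many models today are multimodal and handle more than text); if your MCP service does not use an LLM, you can drop the Ollama dependency.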
<properties>
    <maven.compiler.source>21</maven.compiler.source>
    <maven.compiler.target>21</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
        <version>3.5.8</version>
    </dependency>
    <dependency>
        <groupId>org.aspectj</groupId>
        <artifactId>aspectjweaver</artifactId>
        <version>1.9.7</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-ollama</artifactId>
    </dependency>
</dependencies>
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.1.2</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
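Let's write a quick service, using the current surge in memory prices as the example. I used to joke about Apple's gold-priced memory; now that all memory is gold-priced, Apple looks like good value 😅. Back to the point: the service layer.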
package com.feng.mcp.server;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.util.HashMap;
import java.util.Map;

@Service
public class InfoService {

    @Autowired
    private ChatClient chatClient;

    public Map<String, String> getDramInfo(String dramType) {
        System.out.println("dramType:" + dramType);
        Map<String, String> map = new HashMap<>();
        map.put("dramType", dramType);
        map.put("type", "DDR4 memory comes in 3 types: laptop, desktop, and low-power (LPDDR4)");
        map.put("environment", "AI is driving demand for HBM, DRAM chip makers are limiting output, and profiteers are hoarding stock");
        map.put("priceStatus", "DDR4 laptop memory modules have already risen 300% in price");
        return map;
    }

    public String getBuySuggestion(String dramType, String dramInfo) {
        // Ask the LLM to analyze the data passed in by the caller
        System.out.println("dramType:" + dramType);
        System.out.println("dramInfo:" + dramInfo);
        return chatClient.prompt("Based on the real-time information from MCP, analyze whether DDR4 laptop memory is worth buying right now")
                .user(dramType + ":\n" + dramInfo)
                .call()
                .content();
    }
}
MCP tool definitions
package com.feng.mcp.server;

import org.springaicommunity.mcp.annotation.McpTool;
import org.springaicommunity.mcp.annotation.McpToolParam;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import java.util.Map;

@Service
public class MemeryMcpToolsService {

    @Autowired
    private InfoService infoService;

    @McpTool(name = "getDDR4MemeryInfo", description = "Get dram information of ddr4 laptop")
    public Map<String, String> getDDR4MemeryInfo(@McpToolParam(description = "dram type") String dramType) {
        return infoService.getDramInfo(dramType);
    }

    @McpTool(name = "buyDDR4Suggestion", description = "get suggestion of ddr4 laptop when go to buy")
    public String buyDDR4Suggestion(@McpToolParam(description = "dram type") String dramType,
            @McpToolParam(description = "dram info", required = false) String dramInfo) {
        return infoService.getBuySuggestion(dramType, dramInfo);
    }
}
YAML configuration
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        # deepseek-r1:1.5b
        model: qwen3.5:4b
    mcp:
      server:
        enabled: true # Enable MCP Server
        protocol: STATELESS # Use Stateless streamable HTTP method
        annotation-scanner:
          enabled: true # Enable annotation scanning for @McpTool annotation
        streamable-http:
          mcp-endpoint: /api/mcp-Service # Define endpoint for MCP Client to discover available tools
        capabilities:
          tool: true # Indicate this MCP Server provides tool capabilities
  application:
    name: dram-buy
  aop:
    auto: true
    proxy-target-class: true
The client reuses the chat code from the previous article with a few changes:
@Autowired
private SyncMcpToolCallbackProvider provider;

@PostMapping("/dram_buy")
public String buyDram(@RequestParam("question") String question) {
    List<Message> messageList = new ArrayList<>();
    messageList.add(new UserMessage(question));
    return chatClient.prompt().messages(messageList).toolCallbacks(provider).call().content();
}
Then point the client at the MCP server:
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        # deepseek-r1:1.5b
        model: qwen3.5:4b
        # model: gemma3:4b
    mcp:
      client:
        streamable-http:
          connections:
            buy-dram: # custom connection name
              url: http://localhost:8080
              endpoint: /api/mcp-Service
server:
  port: 8380
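That completes a simple MCP service and its configuration. On a GPT development platform this would be configured through a web page instead. You still write the MCP service yourself, but most of the plumbing is wrapped for you, and calls pass through an MCP gateway proxy.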
Troubleshooting
I tried running it:
POST localhost:8380/chat/dram_buy?question=I want to buy DDR4 laptop memory, is now a good time to buy and is the price reasonable
It failed with an error:
org.springframework.ai.retry.NonTransientAiException: HTTP 400 - {"error":"registry.ollama.ai/library/gemma3:4b does not support tools"}
at org.springframework.ai.retry.autoconfigure.SpringAiRetryAutoConfiguration$2.handleError(SpringAiRetryAutoConfiguration.java:126) ~[spring-ai-autoconfigure-retry-1.1.2.jar:1.1.2]
at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:58) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.StatusHandler.lambda$fromErrorHandler$1(StatusHandler.java:71) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.StatusHandler.handle(StatusHandler.java:146) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.applyStatusHandlers(DefaultRestClient.java:838) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$readBody$4(DefaultRestClient.java:827) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient.readWithMessageConverters(DefaultRestClient.java:216) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.readBody(DefaultRestClient.java:826) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$body$0(DefaultRestClient.java:757) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:586) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchange(DefaultRestClient.java:540) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.RestClient$RequestHeadersSpec.exchange(RestClient.java:680) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.executeAndExtract(DefaultRestClient.java:821) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.body(DefaultRestClient.java:757) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.ai.ollama.api.OllamaApi.chat(OllamaApi.java:115) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
at org.springframework.ai.ollama.OllamaChatModel.lambda$internalCall$1(OllamaChatModel.java:248) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:357) ~[spring-retry-2.0.12.jar:na]
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:230) ~[spring-retry-2.0.12.jar:na]
at org.springframework.ai.ollama.OllamaChatModel.lambda$internalCall$3(OllamaChatModel.java:248) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
at io.micrometer.observation.Observation.observe(Observation.java:564) ~[micrometer-observation-1.14.13.jar:1.14.13]
at org.springframework.ai.ollama.OllamaChatModel.internalCall(OllamaChatModel.java:246) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
at org.springframework.ai.ollama.OllamaChatModel.call(OllamaChatModel.java:231) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
at org.springframework.ai.chat.client.advisor.ChatModelCallAdvisor.adviseCall(ChatModelCallAdvisor.java:56) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
at org.springframework.ai.chat.client.advisor.DefaultAroundAdvisorChain.lambda$nextCall$1(DefaultAroundAdvisorChain.java:114) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
at io.micrometer.observation.Observation.observe(Observation.java:564) ~[micrometer-observation-1.14.13.jar:1.14.13]
at org.springframework.ai.chat.client.advisor.DefaultAroundAdvisorChain.nextCall(DefaultAroundAdvisorChain.java:113) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.lambda$doGetObservableChatClientResponse$1(DefaultChatClient.java:539) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
at io.micrometer.observation.Observation.observe(Observation.java:564) ~[micrometer-observation-1.14.13.jar:1.14.13]
at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.doGetObservableChatClientResponse(DefaultChatClient.java:537) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.content(DefaultChatClient.java:517) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
at com.feng.chat.demo.controller.ChatController.buyDram(ChatController.java:49) ~[classes/:na]
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:258) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:191) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:991) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:896) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1089) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014) ~[spring-webmvc-6.2.14.jar:6.2.14]
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) ~[spring-webmvc-6.2.14.jar:6.2.14]
at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) ~[tomcat-embed-core-10.1.49.jar:6.0]
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) ~[spring-webmvc-6.2.14.jar:6.2.14]
at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) ~[tomcat-embed-core-10.1.49.jar:6.0]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) ~[tomcat-embed-websocket-10.1.49.jar:10.1.49]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.2.14.jar:6.2.14]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.2.14.jar:6.2.14]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) ~[spring-web-6.2.14.jar:6.2.14]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.2.14.jar:6.2.14]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:165) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:88) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:113) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:83) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:72) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:903) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1774) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:973) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:491) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]
The error says the gemma3 model does not support tools. I checked the Ollama website and, sure enough, it does not 😅
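Before wiring a model into a tool-calling client, it may be worth checking whether Ollama reports tool support for it. The following is a minimal sketch, assuming a local Ollama at `http://localhost:11434` whose `POST /api/show` response includes a `capabilities` array (recent Ollama versions return one; older ones may not). The names `OllamaToolSupport`, `supportsTools`, and `queryOllama` are hypothetical, and the regex parsing is a shortcut to avoid a JSON dependency.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OllamaToolSupport {

    // Matches the "capabilities" array in an /api/show response body.
    private static final Pattern CAPABILITIES =
            Pattern.compile("\"capabilities\"\\s*:\\s*\\[([^\\]]*)\\]");

    /** Returns true if the response body lists "tools" among the capabilities. */
    static boolean supportsTools(String showResponseJson) {
        Matcher m = CAPABILITIES.matcher(showResponseJson);
        return m.find() && m.group(1).contains("\"tools\"");
    }

    /** Asks a running Ollama instance about one model (assumed endpoint: POST /api/show). */
    static boolean queryOllama(String baseUrl, String model) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/show"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"model\":\"" + model + "\"}"))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && supportsTools(response.body());
    }

    public static void main(String[] args) {
        String model = args.length > 0 ? args[0] : "gemma3:4b";
        try {
            System.out.println(model + " supports tools: "
                    + queryOllama("http://localhost:11434", model));
        } catch (Exception e) {
            System.out.println("Could not reach Ollama: " + e.getMessage());
        }
    }
}
```

Running a check like this at startup would surface the problem as a clear message instead of an HTTP 400 on the first chat request.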

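Startup order matters: start the MCP server first, then the MCP client. Otherwise the client cannot connect and fails to start. This is where an MCP gateway earns its keep.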
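Short of a gateway, one way to soften the startup-order dependency is to have the client's launcher wait until the server port accepts connections. Here is a minimal sketch using plain sockets. `PortWaiter` and `waitForPort` are hypothetical names, port 8080 is the server port from the client configuration, and the check only covers TCP reachability, not whether the MCP endpoint itself is ready.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortWaiter {

    /**
     * Tries to open a TCP connection up to maxAttempts times, sleeping delayMillis
     * between failed attempts. Returns true as soon as one attempt succeeds.
     */
    static boolean waitForPort(String host, int port, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 1000); // 1 s connect timeout
                return true;
            } catch (IOException e) {
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Up to 10 attempts, half a second apart.
        boolean up = waitForPort("localhost", 8080, 10, 500);
        System.out.println(up ? "MCP server port is reachable" : "MCP server did not come up");
    }
}
```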
Result
To support MCP, I switched to qwen3.5:4b, a model that supports tools.

Testing it: with qwen3.5 in think mode the call took too long and timed out.
Caused by: java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 20000ms in 'source(MonoDeferContextual)' (and no fallback has been configured)
I'll turn think mode off; a base-model M4 Mac simply does not have the compute for it.
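@Bean
public ChatClient initChatClient(ChatClient.Builder builder) {
    // The ChatClient.Builder is injected as a method parameter
    return builder.defaultSystem("You are an AI assistant named Alpha. You help users by answering the questions they raise.")
            .defaultOptions(OllamaChatOptions.builder()
                    .model("qwen3.5:4b")
                    .disableThinking() // think mode off to avoid the timeout
                    .build())
            .build();
}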
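Ideally this would be a per-request option, turning thinking on or off for each chat request. Only a few scenarios need think mode, since it is slow and resource-hungry.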


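My example is not a great one, so let's change it. The LLM can already do the analysis itself, so there is no need to call an LLM through MCP. Instead, the tool returns forecast information about when buying makes sense, and the AI summarizes and analyzes it for us. That removes the LLM call inside the MCP server and speeds things up.
The code: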
// In MemeryMcpToolsService
@McpTool(name = "buyDDR4BuyTimeSuggestion", description = "get suggestion of ddr4 laptop when go to buy")
public Map<String, String> buyDDR4Suggestion(@McpToolParam(description = "dram type") String dramType,
        @McpToolParam(description = "buy time") Date buyTime) {
    return infoService.getBuyTimeSuggestion(dramType, buyTime);
}

// In InfoService
public Map<String, String> getBuyTimeSuggestion(String dramType, Date buyTime) {
    // Return forecast data only; the calling LLM does the analysis
    System.out.println("dramType:" + dramType);
    System.out.println("buyTime:" + buyTime);
    Map<String, String> map = new HashMap<>();
    map.put("dramType", dramType);
    map.put("type", "DDR4 memory comes in 3 types: laptop, desktop, and low-power (LPDDR4)");
    map.put("environment", "AI is driving demand for HBM, DRAM chip makers are limiting output, and profiteers are hoarding stock; chip makers forecast that prices may fall back to normal levels around 2028");
    map.put("priceStatus", "Prices are expected to return to normal around 2028, once DRAM makers ramp up mass production and AI data center construction is complete");
    return map;
}
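The tool declares its `buyTime` parameter as `java.util.Date`, but tool arguments arrive as JSON produced by the model, so in practice the value is a string whose format the model chooses. One defensive option, offered as an assumption and not something the original code does, is to accept a `String` and parse it leniently. A minimal sketch with hypothetical names (`BuyTimeParser`, `parse`):

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

final class BuyTimeParser {

    /**
     * Parses a model-supplied date argument. Accepts ISO dates such as "2028-01-15"
     * and ISO date-times such as "2028-01-15T10:00:00Z" (only the date part is used).
     * Returns Optional.empty() for null, blank, or unparseable input.
     */
    static Optional<LocalDate> parse(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        String text = raw.trim();
        // Keep only the date part of an ISO date-time.
        int t = text.indexOf('T');
        if (t > 0) {
            text = text.substring(0, t);
        }
        try {
            return Optional.of(LocalDate.parse(text));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("2028-01-15"));
        System.out.println(parse("next spring"));
    }
}
```

Returning an empty `Optional` lets the tool fall back to a default (for example, today) instead of failing the whole tool call on a malformed date.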
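In short: the LLM parses the natural-language question and extracts the key information. The LLM itself is offline and has no up-to-date data, so, following the configuration, it calls the MCP service to read the latest information, then continues reasoning and gives us the answer.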
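That round trip can be sketched without any framework. In this simplified, hypothetical model, a stubbed "model" first asks for a tool, the agent runs it, and the result is fed back for the final answer. `ToolLoopSketch`, `ModelReply`, and the stub logic are illustrative only; the real loop in Spring AI works on chat messages and JSON tool schemas.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ToolLoopSketch {

    /** Either a request to call a tool (toolName != null) or a final answer. */
    record ModelReply(String toolName, String toolArgument, String answer) {
        boolean wantsTool() {
            return toolName != null;
        }
    }

    /** Stub standing in for the LLM: asks for data once, then answers from it. */
    static ModelReply stubModel(String question, String toolResult) {
        if (toolResult == null) {
            return new ModelReply("getDDR4MemeryInfo", "ddr4", null);
        }
        return new ModelReply(null, null,
                "Based on live data (" + toolResult + "): wait before buying.");
    }

    /** Runs the loop: ask the model, execute any requested tool, feed the result back. */
    static String run(String question, Map<String, Function<String, String>> tools) {
        String toolResult = null;
        for (int step = 0; step < 5; step++) { // cap iterations to avoid endless tool calls
            ModelReply reply = stubModel(question, toolResult);
            if (!reply.wantsTool()) {
                return reply.answer();
            }
            Function<String, String> tool = tools.get(reply.toolName());
            if (tool == null) {
                return "Unknown tool: " + reply.toolName();
            }
            toolResult = tool.apply(reply.toolArgument());
        }
        return "Gave up after too many tool calls";
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> tools = new HashMap<>();
        tools.put("getDDR4MemeryInfo", dramType -> dramType + " laptop memory is up 300%");
        System.out.println(run("Should I buy DDR4 laptop memory now?", tools));
    }
}
```

The point of the sketch is the division of labor: the tool only supplies data, and the model decides when to call it and how to turn the result into an answer.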

Confirming the MCP tool was called
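Besides watching the server's console output, one way to confirm a tool is reachable is to call it directly. MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below builds such a request for the `getDDR4MemeryInfo` tool and posts it. `McpCallSketch` is a hypothetical name, the URL combines the server port and endpoint from the configuration, and whether a stateless server accepts this request without a prior `initialize` handshake is an assumption to verify against your setup.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class McpCallSketch {

    /** Builds a JSON-RPC 2.0 "tools/call" request with a single string argument. */
    static String toolsCallPayload(int id, String toolName, String argName, String argValue) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\",\"params\":{\"name\":\"" + escape(toolName)
                + "\",\"arguments\":{\"" + escape(argName) + "\":\"" + escape(argValue) + "\"}}}";
    }

    /** Minimal JSON string escaping for backslashes and double quotes. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        String payload = toolsCallPayload(1, "getDDR4MemeryInfo", "dramType", "ddr4");
        System.out.println(payload);
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/api/mcp-Service"))
                .header("Content-Type", "application/json")
                // Streamable HTTP clients are expected to accept both JSON and SSE responses.
                .header("Accept", "application/json, text/event-stream")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (Exception e) {
            System.out.println("MCP server not reachable: " + e.getMessage());
        }
    }
}
```

If the call goes through, the `dramType:` line printed by `InfoService` should also appear in the server console.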

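Using tools
Tools are used much like MCP services: for things the model cannot do itself, or when it needs up-to-date data. See the official documentation: tools.html#_information_retrieval
I tried having qwen3.5 do arithmetic and numeric comparison. That used to be a disaster area, but LLMs can handle it now 😅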

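Presumably mathematical reasoning has been folded into model training. Reportedly, newer GPT versions will fold agent calling in as well, natively supporting openclaw-style agent invocation and reasoning.
So let's give the agent access to the current time by writing a tool: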
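import java.time.ZonedDateTime;
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.context.i18n.LocaleContextHolder;

public class DateToolService {
    @Tool(description = "Get the current date and time in the user's timezone")
    String getCurrentDateTime() {
        // ZonedDateTime.now(zone) converts the current instant into the user's zone
        return ZonedDateTime.now(LocaleContextHolder.getTimeZone().toZoneId()).toString();
    }
}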
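One detail worth checking in a date tool like this: a pattern such as `LocalDateTime.now().atZone(zone)` reads the wall clock in the server's default zone and then only attaches a zone label, without converting. If the user's zone differs from the server's, the result is off by the zone difference, whereas `ZonedDateTime.now(zone)` converts the instant. A minimal sketch with a fixed clock; the instant and zones are arbitrary illustrative choices, and `ZoneLabelDemo` is a hypothetical name.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class ZoneLabelDemo {

    /** Server wall-clock time with the user's zone attached as a label (no conversion). */
    static ZonedDateTime labelled(Clock serverClock, ZoneId userZone) {
        return LocalDateTime.now(serverClock).atZone(userZone);
    }

    /** The same instant, converted into the user's zone. */
    static ZonedDateTime converted(Clock serverClock, ZoneId userZone) {
        return ZonedDateTime.now(serverClock.withZone(userZone));
    }

    public static void main(String[] args) {
        // Fixed instant so the output is reproducible; the server is assumed to run in UTC.
        Clock serverClock = Clock.fixed(Instant.parse("2026-01-01T20:00:00Z"), ZoneId.of("UTC"));
        ZoneId userZone = ZoneId.of("Asia/Shanghai"); // UTC+8, no daylight saving
        System.out.println("labelled:  " + labelled(serverClock, userZone));
        System.out.println("converted: " + converted(serverClock, userZone));
    }
}
```

With these inputs the labelled value still falls on January 1 while the converted one falls on January 2, which is exactly the kind of difference that changes the answer to "what is tomorrow's date?".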
Then ask about the date and see what happens:
@PostMapping("/getTimeNow")
public String getTimeNow() {
    return chatClient.prompt("What is tomorrow's date?")
            .tools(new DateToolService())
            .call()
            .content();
}
Result:

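Summary
Handling MCP services with Spring AI is straightforward. In real applications, though, you mostly wrap things in an agent, and the agent calls MCP services to fetch data. For example, when a business user wants a report, the agent calls an MCP service connected to the database to fetch the data, then builds the reports the user asked for. Weather lookups and the like are the classic examples. The main problem being solved is the LLM's need for real-time data: an LLM has to be retrained to learn anything outside its training data, and these mechanisms exist to extend what it can do.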