Preface

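I recently worked on a project that needed AI support. These days a lot of open-ended content can be generated by AI: text, images, even speech and video. Vibe coding is being adopted at scale and shines at CRUD work. But an LLM's knowledge comes from pretraining; unless the model is continually retrained, what it knows is frozen at a cutoff. That is why RAG, function calling, and MCP exist, and all of them serve agents. We normally interact with a large model through prompts, while an agent is a packaged entity that combines these capabilities. The currently popular openclaw is, at its core, an agent making decisions and invoking tools.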

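Preparing the demo

Spring AI already wraps the various MCP capabilities and supports Ollama natively. I'm on a MacBook with no NVIDIA GPU, so for convenience I picked a GGUF model: gemma3:4b with Q4_K_M quantization. That model choice turned out to be a trap for MCP.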

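MCP server

The pom. Here the MCP server also uses a large language model (in practice many models are now multimodal, not text-only). If your MCP service does not use an LLM, you can drop the Ollama dependency.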

    <properties>
        <maven.compiler.source>21</maven.compiler.source>
        <maven.compiler.target>21</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <version>3.5.8</version>
        </dependency>
        <dependency>
            <groupId>org.aspectj</groupId>
            <artifactId>aspectjweaver</artifactId>
            <version>1.9.7</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-ollama</artifactId>
        </dependency>
    </dependencies>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>1.1.2</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

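Let's write a throwaway service, using the recent surge in RAM prices as the example. I used to joke about Apple's gold-priced memory; now all memory is gold-priced, so Apple suddenly looks like good value 😅. Back on topic, here is the service layer: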

package com.feng.mcp.server;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.HashMap;
import java.util.Map;

@Service
public class InfoService {


    public Map<String, String> getDramInfo(String dramType) {
        System.out.println("dramType:" + dramType);
        Map<String, String> map = new HashMap<>();
        map.put("dramType", dramType);
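        map.put("type", "DDR4 memory comes in 3 types: laptop, desktop, and low-power (LPDDR4)");
        map.put("environment", "AI is driving demand for HBM; DRAM chip makers are limiting output and profiteers are hoarding stock");
        map.put("priceStatus", "DDR4 laptop memory prices have already risen 300%");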
        return map;
    }

    @Autowired
    private ChatClient chatClient;

    public String getBuySuggestion(String dramType, String dramInfo) {
        // call llm analysis
        System.out.println("dramType:" + dramType);
        System.out.println("dramInfo:" + dramInfo);
        return chatClient.prompt("Based on the real-time MCP data, analyze whether DDR4 laptop memory is worth buying right now").user(dramType+":\n"+dramInfo).call().content();
    }
}

MCP tool definitions

package com.feng.mcp.server;

import org.springaicommunity.mcp.annotation.McpTool;
import org.springaicommunity.mcp.annotation.McpToolParam;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.Map;

@Service
public class MemeryMcpToolsService {

    @Autowired
    private InfoService infoService;

    @McpTool(name = "getDDR4MemeryInfo", description = "Get dram information of ddr4 laptop")
    public Map<String, String> getDDR4MemeryInfo(@McpToolParam(description = "dram type") String dramType) {
        return infoService.getDramInfo(dramType);
    }

    @McpTool(name = "buyDDR4Suggestion", description = "get suggestion of ddr4 laptop when go to buy")
    public String buyDDR4Suggestion(@McpToolParam(description = "dram type") String dramType,
                          @McpToolParam(description = "dram info", required = false) String dramInfo) {
        return infoService.getBuySuggestion(dramType, dramInfo);
    }
}

YAML configuration

spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        # deepseek-r1:1.5b
        model: qwen3.5:4b
    mcp:
      server:
        enabled: true  # Enable MCP Server
        protocol: STATELESS  # Use Stateless streamable HTTP method
        annotation-scanner:
          enabled: true  # Enable annotation scanning for @McpTool annotation
        streamable-http:
          mcp-endpoint: /api/mcp-Service  # Define endpoint for MCP Client to discover available tools
        capabilities:
          tool: true  # Indicate this MCP Server provides tool capabilities
  application:
    name: dram-buy
  aop:
    auto: true
    proxy-target-class: true

The client reuses the chat controller from the previous article, with a few changes:

    @Autowired
    private SyncMcpToolCallbackProvider provider;

    @PostMapping("/dram_buy")
    public String buyDram(@RequestParam("question") String question) {
        List<Message> messageList = new ArrayList<>();
        messageList.add(new UserMessage(question));
        return chatClient.prompt().messages(messageList).toolCallbacks(provider).call().content();
    }

Then point the client at the MCP service:

spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        # deepseek-r1:1.5b
        model: qwen3.5:4b
#        model: gemma3:4b
    mcp:
      client:
        streamable-http:
          connections:
            buy-dram: # custom connection name
              url: http://localhost:8080
              endpoint: /api/mcp-Service

server:
  port: 8380

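That completes a simple MCP server and its client configuration. On a GPT development platform the configuration would be done through a web page instead. You still have to write the MCP service yourself, but most of the plumbing is wrapped for you, and calls are proxied through an MCP gateway.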
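
To make the client/server exchange less abstract: an MCP client invokes a tool by sending a JSON-RPC 2.0 message with method `tools/call` to the server endpoint. The sketch below only builds such a message for the `getDDR4MemeryInfo` tool. The class and method names are my own, the JSON escaping is deliberately minimal, and the exact message Spring AI produces (ids, extra fields) may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper: builds the JSON-RPC body of an MCP "tools/call" request. */
final class McpToolCallPayload {

    /** Escapes backslashes and double quotes; enough for this sketch, not a full JSON encoder. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders string arguments as a flat JSON object, keeping the map's iteration order. */
    static String toJsonObject(Map<String, String> args) {
        return args.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    /** Builds the full request: the tool name and its arguments go under "params". */
    static String toolsCall(int id, String toolName, Map<String, String> args) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"tools/call\""
                + ",\"params\":{\"name\":" + quote(toolName)
                + ",\"arguments\":" + toJsonObject(args) + "}}";
    }

    public static void main(String[] argv) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("dramType", "ddr4");
        System.out.println(toolsCall(1, "getDDR4MemeryInfo", args));
    }
}
```

With the configuration above, a body like this would be POSTed to `http://localhost:8080/api/mcp-Service`. Per the MCP specification, a streamable HTTP client also sends `Accept: application/json, text/event-stream`.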

Problem analysis

I tried a request:

POST localhost:8380/chat/dram_buy?question=I want to buy DDR4 laptop memory. Is now a good time to buy, and is the price reasonable?

It failed with an error:

org.springframework.ai.retry.NonTransientAiException: HTTP 400 - {"error":"registry.ollama.ai/library/gemma3:4b does not support tools"}
	at org.springframework.ai.retry.autoconfigure.SpringAiRetryAutoConfiguration$2.handleError(SpringAiRetryAutoConfiguration.java:126) ~[spring-ai-autoconfigure-retry-1.1.2.jar:1.1.2]
	at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:58) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.StatusHandler.lambda$fromErrorHandler$1(StatusHandler.java:71) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.StatusHandler.handle(StatusHandler.java:146) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.applyStatusHandlers(DefaultRestClient.java:838) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$readBody$4(DefaultRestClient.java:827) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient.readWithMessageConverters(DefaultRestClient.java:216) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.readBody(DefaultRestClient.java:826) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.lambda$body$0(DefaultRestClient.java:757) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchangeInternal(DefaultRestClient.java:586) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultRequestBodyUriSpec.exchange(DefaultRestClient.java:540) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.RestClient$RequestHeadersSpec.exchange(RestClient.java:680) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.executeAndExtract(DefaultRestClient.java:821) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.body(DefaultRestClient.java:757) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.ai.ollama.api.OllamaApi.chat(OllamaApi.java:115) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
	at org.springframework.ai.ollama.OllamaChatModel.lambda$internalCall$1(OllamaChatModel.java:248) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
	at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:357) ~[spring-retry-2.0.12.jar:na]
	at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:230) ~[spring-retry-2.0.12.jar:na]
	at org.springframework.ai.ollama.OllamaChatModel.lambda$internalCall$3(OllamaChatModel.java:248) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
	at io.micrometer.observation.Observation.observe(Observation.java:564) ~[micrometer-observation-1.14.13.jar:1.14.13]
	at org.springframework.ai.ollama.OllamaChatModel.internalCall(OllamaChatModel.java:246) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
	at org.springframework.ai.ollama.OllamaChatModel.call(OllamaChatModel.java:231) ~[spring-ai-ollama-1.1.2.jar:1.1.2]
	at org.springframework.ai.chat.client.advisor.ChatModelCallAdvisor.adviseCall(ChatModelCallAdvisor.java:56) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
	at org.springframework.ai.chat.client.advisor.DefaultAroundAdvisorChain.lambda$nextCall$1(DefaultAroundAdvisorChain.java:114) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
	at io.micrometer.observation.Observation.observe(Observation.java:564) ~[micrometer-observation-1.14.13.jar:1.14.13]
	at org.springframework.ai.chat.client.advisor.DefaultAroundAdvisorChain.nextCall(DefaultAroundAdvisorChain.java:113) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.lambda$doGetObservableChatClientResponse$1(DefaultChatClient.java:539) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
	at io.micrometer.observation.Observation.observe(Observation.java:564) ~[micrometer-observation-1.14.13.jar:1.14.13]
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.doGetObservableChatClientResponse(DefaultChatClient.java:537) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
	at org.springframework.ai.chat.client.DefaultChatClient$DefaultCallResponseSpec.content(DefaultChatClient.java:517) ~[spring-ai-client-chat-1.1.2.jar:1.1.2]
	at com.feng.chat.demo.controller.ChatController.buyDram(ChatController.java:49) ~[classes/:na]
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
	at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:258) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:191) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:991) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:896) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1089) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) ~[tomcat-embed-core-10.1.49.jar:6.0]
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) ~[spring-webmvc-6.2.14.jar:6.2.14]
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) ~[tomcat-embed-core-10.1.49.jar:6.0]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) ~[tomcat-embed-websocket-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.2.14.jar:6.2.14]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.2.14.jar:6.2.14]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) ~[spring-web-6.2.14.jar:6.2.14]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.2.14.jar:6.2.14]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:162) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:138) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:165) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:88) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:113) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:83) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:72) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:903) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1774) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:973) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:491) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63) ~[tomcat-embed-core-10.1.49.jar:10.1.49]
	at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]

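The error says the gemma3 model does not support tools. I checked the Ollama website and, sure enough, it doesn't 😅

Startup order also matters: the MCP server has to be running before the MCP client starts, otherwise the client cannot connect and fails to start. This is one reason an MCP gateway earns its place.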
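
One way to live with the startup-order requirement during local development is to wait for the server port before launching the client. This is a minimal plain-Java sketch, not something Spring AI provides; `McpServerProbe` and its methods are hypothetical names, and an open TCP port only shows that something is listening, not that the MCP endpoint is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical startup helper: polls a TCP port until something accepts connections. */
final class McpServerProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Retries isListening up to maxAttempts times, sleeping between failed attempts. */
    static boolean waitUntilListening(String host, int port, int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isListening(host, port, 500)) {
                return true;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // 8080 is the MCP server port used in this article's client configuration.
        boolean up = waitUntilListening("localhost", 8080, 10, 1000);
        System.out.println(up ? "MCP server port is open" : "MCP server is not reachable");
    }
}
```

Running this before the client (for example from a launch script) turns a startup failure into a short wait. A gateway solves the same problem more robustly, because the client then depends on one stable address only.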

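Results

To get MCP working, I switched to qwen3.5:4b, a model that supports tools.

A first test: qwen3.5 ran in think mode, took too long, and the call timed out:

Caused by: java.util.concurrent.TimeoutException: Did not observe any item or terminal signal within 20000ms in 'source(MonoDeferContextual)' (and no fallback has been configured)

So I turned think mode off; a base-model M4 Mac simply doesn't have the compute for it.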

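    @Bean
    public ChatClient initChatClient(ChatClient.Builder builder) {
        // Thinking is disabled in the default options, so it applies to every request.
        return builder.defaultSystem("You are an AI assistant named Alpha. You help users with questions on the topics they raise.")
                .defaultOptions(OllamaChatOptions.builder()
                        .model("qwen3.5:4b")
                        .disableThinking()
                        .build()).build();
    }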

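Ideally this would be a per-request parameter, so thinking is switched on or off request by request. Only a few scenarios need think mode, since it is slow and resource-hungry.

Official documentation: thinking_mode_reasoning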
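
As a rough illustration of the per-request idea, the sketch below builds the body of a direct call to Ollama's `/api/chat` endpoint, where each request carries its own `think` flag. It bypasses Spring AI entirely; `OllamaChatRequestBody` is a hypothetical name, the escaping is minimal, and I am assuming a recent Ollama version that accepts `think` for thinking-capable models.

```java
/** Hypothetical sketch: builds an Ollama /api/chat request body with a per-request think flag. */
final class OllamaChatRequestBody {

    /** Escapes backslashes, double quotes and newlines; not a full JSON encoder. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n") + "\"";
    }

    /** One user message, non-streaming; the caller decides per request whether to think. */
    static String build(String model, String userMessage, boolean think) {
        return "{\"model\":" + quote(model)
                + ",\"messages\":[{\"role\":\"user\",\"content\":" + quote(userMessage) + "}]"
                + ",\"think\":" + think
                + ",\"stream\":false}";
    }

    public static void main(String[] args) {
        // Quick lookup: skip thinking to keep latency low.
        System.out.println(build("qwen3.5:4b", "What is 2 + 2?", false));
        // Harder analysis: pay for thinking only on this request.
        System.out.println(build("qwen3.5:4b", "Compare DDR4 and DDR5 pricing trends.", true));
    }
}
```

With Spring AI the same idea would mean passing `OllamaChatOptions` on the individual request rather than in `defaultOptions`; the documentation referenced above is the place to confirm the exact call.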

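This example is not a great one, so let's change it. Analysis is something the LLM can already do on its own, so there is no need to call an LLM inside the MCP tool. Instead, the tool now returns forecast information about when to buy, and the AI summarizes and analyzes it for us. That removes the LLM call inside the MCP server and speeds things up.

The code: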

    @McpTool(name = "buyDDR4BuyTimeSuggestion", description = "get suggestion of ddr4 laptop when go to buy")
    public Map<String, String> buyDDR4Suggestion(@McpToolParam(description = "dram type") String dramType,
                          @McpToolParam(description = "buy time") Date buyTime) {
        return infoService.getBuyTimeSuggestion(dramType, buyTime);
    }

    public Map<String, String> getBuyTimeSuggestion(String dramType, Date buyTime) {
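        // Return static forecast data; the tool no longer calls an LLM itself.
        System.out.println("dramType:" + dramType);
        System.out.println("buyTime:" + buyTime);
        Map<String, String> map = new HashMap<>();
        map.put("dramType", dramType);
        map.put("type", "DDR4 memory comes in 3 types: laptop, desktop, and low-power (LPDDR4)");
        map.put("environment", "AI is driving demand for HBM; DRAM chip makers are limiting output and profiteers are hoarding stock. According to forecasts from the major DRAM makers, prices could fall back to normal levels around 2028");
        map.put("priceStatus", "Prices are expected to return to normal around 2028, once DRAM makers ramp up mass production and the AI data center build-out is finished");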
        return map;
    }

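In essence, the LLM parses the natural-language question and extracts the key information. Because the LLM is not online and has no up-to-date data, it calls the configured MCP service to read the latest information, then continues reasoning and gives us an answer.

Confirming that MCP was called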
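
The flow described here can be sketched as a small loop. This is a simplified model of the idea, not Spring AI's implementation: the `Model` interface stands in for the LLM, the tool map stands in for the MCP server, and every name in the block is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the loop: model asks for a tool, host runs it, model continues. */
final class ToolCallLoop {

    /** A model turn: either a tool request (toolName != null) or a final answer. */
    record Reply(String toolName, String argument, String answer) {
        static Reply callTool(String name, String argument) { return new Reply(name, argument, null); }
        static Reply finalAnswer(String text) { return new Reply(null, null, text); }
    }

    /** Stand-in for the LLM: sees the transcript so far and decides the next step. */
    interface Model {
        Reply next(List<String> transcript);
    }

    static String run(Model model, Map<String, Function<String, String>> tools,
                      String question, int maxRounds) {
        List<String> transcript = new ArrayList<>();
        transcript.add("user: " + question);
        for (int round = 0; round < maxRounds; round++) {
            Reply reply = model.next(transcript);
            if (reply.toolName() == null) {
                return reply.answer();
            }
            Function<String, String> tool = tools.get(reply.toolName());
            String result = (tool == null)
                    ? "error: unknown tool " + reply.toolName()
                    : tool.apply(reply.argument());
            // Feed the tool result back so the model can keep reasoning with fresh data.
            transcript.add("tool[" + reply.toolName() + "]: " + result);
        }
        return "error: no final answer after " + maxRounds + " rounds";
    }

    public static void main(String[] args) {
        // Scripted fake model: first asks for the tool, then answers from its result.
        Model fake = transcript -> transcript.size() == 1
                ? Reply.callTool("getDDR4MemeryInfo", "ddr4")
                : Reply.finalAnswer("Based on " + transcript.get(1));
        Map<String, Function<String, String>> tools =
                Map.of("getDDR4MemeryInfo", type -> type + " laptop prices are up 300%");
        System.out.println(run(fake, tools, "Should I buy DDR4 now?", 3));
    }
}
```

The round limit matters in practice: a model that keeps requesting tools without ever answering would otherwise loop forever.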

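Using tools

Tools work much like an MCP service: they cover what the model cannot do on its own, or what needs up-to-date data. See the official documentation: tools.html#_information_retrieval

I tried having qwen3.5 do arithmetic and numeric comparisons, which used to be a disaster area for LLMs. It turns out LLMs can handle that now 😅

Presumably mathematical reasoning has been folded into model training. Reportedly, new GPT versions will do the same for agent calling, with native support for openclaw-style agent invocation and reasoning.

So let's bring a time lookup into the agent by writing a tool: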

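import java.time.LocalDateTime;

import org.springframework.ai.tool.annotation.Tool;
import org.springframework.context.i18n.LocaleContextHolder;

public class DateToolService {
    // Exposed to the model as a tool; the description tells it when to call this method.
    @Tool(description = "Get the current date and time in the user's timezone")
    String getCurrentDateTime() {
        return LocalDateTime.now().atZone(LocaleContextHolder.getTimeZone().toZoneId()).toString();
    }
}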
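
As an aside, the tool above returns only the current time, so questions like "tomorrow" still depend on the model doing date arithmetic. A possible variant does the arithmetic in code instead. The sketch below is plain Java with a hypothetical `TomorrowDateTool` class; in a Spring AI project the method would presumably carry the same `@Tool` annotation, which is left out here so the block compiles without Spring on the classpath.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

/** Hypothetical plain-Java variant of a date tool that does the arithmetic itself. */
final class TomorrowDateTool {

    private final Clock clock;

    TomorrowDateTool(Clock clock) {
        this.clock = clock;
    }

    /** Returns tomorrow's date in ISO-8601 form, evaluated in the clock's time zone. */
    String getTomorrowDate() {
        return LocalDate.now(clock).plusDays(1).toString();
    }

    public static void main(String[] args) {
        // Normal use: the system clock in the default zone.
        System.out.println(new TomorrowDateTool(Clock.systemDefaultZone()).getTomorrowDate());

        // Deterministic use: a fixed clock makes the result reproducible in tests.
        Clock fixed = Clock.fixed(Instant.parse("2025-12-31T23:30:00Z"), ZoneId.of("UTC"));
        System.out.println(new TomorrowDateTool(fixed).getTomorrowDate()); // 2026-01-01
    }
}
```

Injecting a `Clock` is a design choice for testability: the time zone is explicit, and the year-boundary case can be checked without waiting for New Year's Eve.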

Then ask the model about the date, registering DateToolService on the request:

    @PostMapping("/getTimeNow")
    public String getTimeNow(){
        return chatClient.prompt("What is tomorrow's date?")
                .tools(new DateToolService())
                .call()
                .content();
    }

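Result:

Summary

Handling MCP services with Spring AI is straightforward. In real applications, though, you are mostly building agents, and the agent calls MCP services to fetch data. For example, when business users want a report, the agent calls an MCP service connected to the database, pulls the data, and builds the reports they asked for. Weather lookups and the like are the classic examples; the main problem being solved is the LLM's need for real-time data. An LLM has to be retrained to learn anything outside the model, so these mechanisms exist to extend what it can do.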
