Image segmentation is a core task in computer vision in which every pixel of an image is assigned to a specific class or object.

In recent years, advances in deep learning have driven remarkable progress in image segmentation.

This article introduces 11 commonly used deep learning segmentation models to help you understand and work with these powerful tools.


1. U-Net

U-Net is a classic image segmentation model, originally designed for biomedical image segmentation. It uses an encoder-decoder architecture with skip connections that carry low-level features to the decoder, preserving fine detail.

Code example (a simplified sketch; the skip connections of the full U-Net are omitted):

import torch
import torch.nn as nn

class UNet(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(UNet, self).__init__()
        # Encoder part
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        # Decoder part
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(64, out_channels, kernel_size=2, stride=2)
        )

    def forward(self, x):
        # Encoder
        x1 = self.encoder(x)
        # Decoder
        x2 = self.decoder(x1)
        return x2

# Create model instance
model = UNet(in_channels=3, out_channels=1)
print(model)

Output:

UNet(  
  (encoder): Sequential(  
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))  
    (1): ReLU()  
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))  
    (3): ReLU()  
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)  
  )  
  (decoder): Sequential(  
    (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))  
    (1): ReLU()  
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))  
    (3): ReLU()  
    (4): ConvTranspose2d(64, 1, kernel_size=(2, 2), stride=(2, 2), padding=(0, 0))  
  )  
)  
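A quick sanity check of the sketch above: the encoder's max-pooling halves the spatial resolution and the decoder's stride-2 transposed convolution doubles it back, so the output matches the input size. The snippet below demonstrates this round-trip on the two layers in isolation (shapes chosen to match the sketch):

```python
import torch
import torch.nn as nn

# Pooling halves spatial resolution; a stride-2 transposed conv doubles it back
pool = nn.MaxPool2d(kernel_size=2, stride=2)
up = nn.ConvTranspose2d(64, 1, kernel_size=2, stride=2)

x = torch.rand(1, 64, 128, 128)
down = pool(x)        # -> (1, 64, 64, 64)
restored = up(down)   # -> (1, 1, 128, 128)
print(down.shape, restored.shape)
```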

2. DeepLab v3+

DeepLab v3+ is a CNN-based segmentation model that uses atrous (dilated) convolutions to capture multi-scale context and an ASPP (Atrous Spatial Pyramid Pooling) module to strengthen its representations.
Code example:

import torch.nn as nn
import torchvision.models.segmentation as segmentation_models

# Load a pretrained DeepLab v3 model (torchvision ships v3, not v3+)
model = segmentation_models.deeplabv3_resnet101(pretrained=True)
# Replace the final classifier layer to output a single channel
model.classifier[4] = nn.Conv2d(256, 1, kernel_size=1)
print(model)

Output:

DeepLabV3(  
  (backbone): IntermediateLayerGetter(  
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)  
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)  
    (relu): ReLU(inplace=True)  
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)  
    (layer1): Bottleneck(...)  
    (layer2): Bottleneck(...)  
    (layer3): Bottleneck(...)  
    (layer4): Bottleneck(...)  
  )  
  (classifier): DeepLabHead(  
    (aspp): ASPP(  
      (convs): ModuleList(  
        (0): ASPPConv(...)  
        (1): ASPPConv(...)  
        (2): ASPPConv(...)  
        (3): ASPPPooling(...)  
      )  
      (project): Sequential(  
        (0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))  
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)  
        (2): ReLU(inplace=True)  
        (3): Dropout(p=0.5, inplace=False)  
      )  
    )  
    (classifier): Conv2d(256, 1, kernel_size=(1, 1), stride=(1, 1))  
  )  
  (aux_classifier): None  
)  

3. Mask R-CNN

Mask R-CNN extends Faster R-CNN for instance segmentation: in addition to detecting objects, it produces a pixel-level mask for each detected instance.
Code example:

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Load the pretrained Mask R-CNN model
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)

# To change the number of classes, replace the prediction heads
# (assigning to out_features alone does not resize the layer weights)
num_classes = 2  # background + 1 object class
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)
print(model)

4. PSPNet

PSPNet (Pyramid Scene Parsing Network) is a segmentation model built around pyramid pooling: it pools features at several scales to capture global context.
Code example:

import torch.nn as nn
import torchvision.models.segmentation as segmentation_models

# torchvision does not ship a PSPNet; FCN-ResNet101 is used here as a stand-in
model = segmentation_models.fcn_resnet101(pretrained=True)
# Replace the final classifier layer to output a single channel
model.classifier[4] = nn.Conv2d(512, 1, kernel_size=1)
print(model)
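The defining component of PSPNet, absent from the stand-in above, is the pyramid pooling module: the feature map is average-pooled to several grid sizes, upsampled back, and concatenated with the original features. A minimal sketch of that idea (bin sizes 1/2/3/6 as in the paper; the real module also reduces each branch's channels with a 1x1 convolution):

```python
import torch
import torch.nn.functional as F

feat = torch.rand(1, 64, 32, 32)

# Pool to 1x1, 2x2, 3x3, and 6x6 grids, then upsample back to the input size
pooled = [F.adaptive_avg_pool2d(feat, size) for size in (1, 2, 3, 6)]
upsampled = [F.interpolate(p, size=feat.shape[2:], mode="bilinear",
                           align_corners=False) for p in pooled]

# Concatenate the original features with all pooled branches
out = torch.cat([feat] + upsampled, dim=1)
print(out.shape)  # (1, 320, 32, 32): 64 original + 4 x 64 pooled channels
```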

5. SegNet

SegNet is an encoder-decoder segmentation model. It upsamples using the max-pooling indices recorded by the encoder, which helps preserve fine detail.
Code example:

import torch
import torch.nn as nn

class SegNet(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(SegNet, self).__init__()
        # Encoder part
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU()
        )
        # Pooling is kept outside nn.Sequential because it returns
        # two values: the pooled map and the max-pooling indices
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
        # Unpooling likewise needs two arguments (features and indices)
        self.unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)
        # Decoder part
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1)
        )
    def forward(self, x):
        # Encoder
        x = self.encoder(x)
        x, indices = self.pool(x)
        # Decoder: unpool with the stored indices, then refine
        x = self.unpool(x, indices)
        x = self.decoder(x)
        return x
# Create model instance
model = SegNet(in_channels=3, out_channels=1)
print(model)
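The key detail is that nn.MaxUnpool2d needs both the pooled feature map and the indices produced by nn.MaxPool2d(return_indices=True), which is why the unpooling step cannot sit inside an nn.Sequential chain. In isolation:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.rand(1, 1, 4, 4)
y, indices = pool(x)    # y: (1, 1, 2, 2); indices record each max position
z = unpool(y, indices)  # back to (1, 1, 4, 4): zeros except at the maxima
print(y.shape, z.shape)
```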

6. LinkNet

LinkNet is a lightweight segmentation model that links encoder and decoder with skip connections, reducing parameter count and improving computational efficiency.
Code example (a simplified sketch; the encoder-decoder links are omitted):

import torch  
import torch.nn as nn  
class LinkNet(nn.Module):  
    def __init__(self, in_channels, out_channels):  
        super(LinkNet, self).__init__()  
        # Encoder part  
        self.encoder = nn.Sequential(  
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.MaxPool2d(kernel_size=2, stride=2)  
        )  
        # Decoder part  
        self.decoder = nn.Sequential(  
            nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1)  
        )  
    def forward(self, x):  
        # Encoder  
        x1 = self.encoder(x)  
        # Decoder  
        x2 = self.decoder(x1)  
        return x2  
# Create model instance  
model = LinkNet(in_channels=3, out_channels=1)  
print(model)  

7. HRNet

HRNet (High-Resolution Network) maintains high-resolution feature maps throughout the network via parallel multi-resolution branches and repeated fusion, which improves segmentation accuracy.
Code example:

import torch  
import torch.nn as nn  
class HRNet(nn.Module):  
    def __init__(self, in_channels, out_channels):  
        super(HRNet, self).__init__()  
        # High-resolution branch  
        self.high_res_branch = nn.Sequential(  
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU()  
        )  
        # Low-resolution branch  
        self.low_res_branch = nn.Sequential(  
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),  
            nn.BatchNorm2d(32),  
            nn.ReLU(),  
            nn.MaxPool2d(kernel_size=2, stride=2)  
        )  
        # Fusion module  
        self.fusion = nn.Sequential(  
            nn.Conv2d(96, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, out_channels, kernel_size=1)  
        )  
    def forward(self, x):  
        # High-resolution branch  
        high_res = self.high_res_branch(x)  
        # Low-resolution branch  
        low_res = self.low_res_branch(x)  
        # Upsample low-resolution feature map  
        low_res_upsampled = nn.functional.interpolate(low_res, scale_factor=2, mode='bilinear', align_corners=True)  
        # Fuse high-resolution and low-resolution feature maps  
        fused = torch.cat([high_res, low_res_upsampled], dim=1)  
        # Output  
        output = self.fusion(fused)  
        return output  
# Create model instance  
model = HRNet(in_channels=3, out_channels=1)  
print(model)  
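The fusion step above relies on the upsampled low-resolution branch matching the high-resolution branch spatially, and on the channel counts adding up (64 + 32 = 96). In isolation:

```python
import torch
import torch.nn.functional as F

high = torch.rand(1, 64, 32, 32)   # high-resolution branch output
low = torch.rand(1, 32, 16, 16)    # low-resolution branch output (pooled once)

# Upsample the low-resolution map 2x so both branches align spatially
low_up = F.interpolate(low, scale_factor=2, mode="bilinear", align_corners=True)
fused = torch.cat([high, low_up], dim=1)
print(fused.shape)  # (1, 96, 32, 32): channels concatenate, 64 + 32
```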

8. RefineNet

RefineNet is a multi-path refinement network that progressively refines feature maps through its refinement modules, improving segmentation accuracy.
Code example:

import torch  
import torch.nn as nn  
class RefineNet(nn.Module):  
    def __init__(self, in_channels, out_channels):  
        super(RefineNet, self).__init__()  
        # Encoder part  
        self.encoder = nn.Sequential(  
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.MaxPool2d(kernel_size=2, stride=2)  
        )  
        # Multi-path refinement module  
        self.refine_module = nn.Sequential(  
            nn.Conv2d(64, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU()  
        )  
        # Decoder part  
        self.decoder = nn.Sequential(  
            nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1)  
        )  
    def forward(self, x):  
        # Encoder  
        x1 = self.encoder(x)  
        # Multi-path refinement  
        x2 = self.refine_module(x1)  
        # Decoder  
        x3 = self.decoder(x2)  
        return x3  
# Create model instance  
model = RefineNet(in_channels=3, out_channels=1)  
print(model)  

9. BiSeNet

BiSeNet (Bilateral Segmentation Network) is an efficient real-time segmentation model: a spatial path captures spatial detail while a context path captures global context.
Code example:

import torch  
import torch.nn as nn  
class BiSeNet(nn.Module):  
    def __init__(self, in_channels, out_channels):  
        super(BiSeNet, self).__init__()  
        # Spatial Path  
        self.spatial_path = nn.Sequential(  
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU()  
        )  
        # Context Path  
        self.context_path = nn.Sequential(  
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.MaxPool2d(kernel_size=2, stride=2)  
        )  
        # Fusion Module  
        self.fusion = nn.Sequential(  
            nn.Conv2d(128, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, out_channels, kernel_size=1)  
        )  
    def forward(self, x):  
        # Spatial Path  
        spatial = self.spatial_path(x)  
        # Context Path  
        context = self.context_path(x)  
        # Upsample context feature map  
        context_upsampled = nn.functional.interpolate(context, scale_factor=2, mode='bilinear', align_corners=True)  
        # Fuse spatial and context feature maps  
        fused = torch.cat([spatial, context_upsampled], dim=1)  
        # Output  
        output = self.fusion(fused)  
        return output  
# Create model instance  
model = BiSeNet(in_channels=3, out_channels=1)  
print(model)  

10. OCRNet

OCRNet (Object-Contextual Representations for semantic segmentation) improves segmentation by enriching pixel features with object-region context through an attention mechanism.
Code example (the sketch below approximates the object-context step with plain convolutions):

import torch  
import torch.nn as nn  
class OCRNet(nn.Module):  
    def __init__(self, in_channels, out_channels):  
        super(OCRNet, self).__init__()  
        # Encoder part  
        self.encoder = nn.Sequential(  
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.MaxPool2d(kernel_size=2, stride=2)  
        )  
        # Object-context module  
        self.object_context = nn.Sequential(  
            nn.Conv2d(64, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU(),  
            nn.Conv2d(64, 64, kernel_size=3, padding=1),  
            nn.BatchNorm2d(64),  
            nn.ReLU()  
        )  
        # Decoder part  
        self.decoder = nn.Sequential(  
            nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1)
        )
    def forward(self, x):
        # Encoder
        x1 = self.encoder(x)
        # Object-context module
        x2 = self.object_context(x1)
        # Decoder
        x3 = self.decoder(x2)
        return x3
# Create model instance
model = OCRNet(in_channels=3, out_channels=1)
print(model)
