C++开发必备之stl进阶（八 unordered_map）

假设我们有一个自定义类型作为键，需要自定义哈希函数。int x;int y;// 自定义哈希函数return 0;如果需要不同的键比较逻辑，可以通过提供自定义比较函数来实现。然而，对于，键比较函数需要用于处理哈希冲突时键的相等性判断，通常不需要修改，除非有特殊需求。当使用自定义类型作为键时，需要自定义哈希函数和键比较函数。// 自定义键类型int age;// 自定义哈希函数return 0;有时

挨踢小明

2239人浏览 · 2024-09-18 14:51:06

挨踢小明 · 2024-09-18 14:51:06 发布

`std::unordered_map`

std::unordered_map 是 C++ 标准模板库（STL）中的一个关联容器，提供了键值对（key-value pair）的存储和快速访问。与 std::map 不同，std::unordered_map 基于哈希表实现，因此它不保持元素的有序性，但在平均情况下提供了更快的查找、插入和删除操作。

1. 基本概念

std::unordered_map 是一个键值对容器，允许通过键（key）快速访问对应的值（value）。它内部使用哈希表来实现元素的存储，因此：

无序存储：元素的存储顺序与插入顺序无关。
快速访问：平均时间复杂度为 O(1) 的查找、插入和删除操作。
唯一键：每个键在 unordered_map 中是唯一的，不能有重复的键。

2. 声明与初始化

std::unordered_map 的声明方式如下：

#include <unordered_map>
#include <string>

// 默认的 unordered_map，键为 std::string，值为 int
std::unordered_map<std::string, int> umap;

// 指定初始桶数量和哈希函数
std::unordered_map<std::string, int> umap_custom(100, std::hash<std::string>());

示例：初始化 std::unordered_map

#include <iostream>
#include <unordered_map>
#include <string>

int main() {
    // 使用初始化列表
    std::unordered_map<std::string, int> umap = {
        {"apple", 3},
        {"banana", 5},
        {"orange", 2}
    };

    // 输出内容
    for (const auto& pair : umap) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

输出可能为：

banana: 5
orange: 2
apple: 3

注意：输出顺序不固定，因为 unordered_map 不保持元素的有序性。

3. 常用操作

3.1 插入元素

使用 operator[]：如果键存在，则返回对应的值；如果键不存在，则插入键并默认初始化值。
```
umap["grape"] = 4;  // 插入或更新键 "grape"
```
使用 insert()：尝试插入一个键值对，不会覆盖已存在的键。
```
umap.insert({"melon", 1});
umap.insert(std::make_pair("kiwi", 2));
```
使用 emplace()：在容器内部直接构造元素，避免不必要的拷贝。
```
umap.emplace("peach", 6);
```
3.2 访问元素

使用 operator[]：访问或插入元素。

int apple_count = umap["apple"];

使用 at()：访问元素，如果键不存在则抛出 std::out_of_range 异常。

try {
    int banana_count = umap.at("banana");
} catch (const std::out_of_range& e) {
    std::cerr << "Key not found!" << std::endl;
}

使用 find()：查找元素，返回迭代器。

auto it = umap.find("orange");
if (it != umap.end()) {
    std::cout << "Found orange: " << it->second << std::endl;
}

3.3 删除元素

使用 erase()：删除指定键或迭代器指向的元素。

umap.erase("apple");           // 删除键 "apple"
umap.erase(umap.find("banana"));  // 删除键 "banana"

使用 clear()：清空整个容器。

3.4 其他操作

6.1 负载因子与桶数量

检查键是否存在：

if (umap.find("kiwi") != umap.end()) {
    std::cout << "Kiwi exists in the map." << std::endl;
}

获取容器大小：

size_t size = umap.size();

检查容器是否为空：

if (umap.empty()) {
    std::cout << "The unordered_map is empty." << std::endl;
}

4. 迭代器与遍历

std::unordered_map 提供了多种迭代器类型，用于遍历容器中的元素。

4.1 正向遍历

#include <iostream>
#include <unordered_map>
#include <string>

int main() {
    std::unordered_map<std::string, int> umap = {
        {"apple", 3},
        {"banana", 5},
        {"orange", 2}
    };

    for (auto it = umap.begin(); it != umap.end(); ++it) {
        std::cout << it->first << ": " << it->second << std::endl;
    }

    return 0;
}

4.2 范围基于的 for 循环

for (const auto& pair : umap) {
    std::cout << pair.first << ": " << pair.second << std::endl;
}

4.3 使用 const_iterator

for (std::unordered_map<std::string, int>::const_iterator it = umap.cbegin(); it != umap.cend(); ++it) {
    std::cout << it->first << ": " << it->second << std::endl;
}

4.4 反向遍历

std::unordered_map 不支持反向迭代器，因为哈希表不保持元素的顺序。然而，可以通过复制元素到其他有序容器（如 std::vector）并进行反向遍历。

#include <iostream>
#include <unordered_map>
#include <vector>
#include <algorithm>

int main() {
    std::unordered_map<std::string, int> umap = {
        {"apple", 3},
        {"banana", 5},
        {"orange", 2}
    };

    // 将元素复制到 vector
    std::vector<std::pair<std::string, int>> vec(umap.begin(), umap.end());

    // 反向遍历
    for (auto it = vec.rbegin(); it != vec.rend(); ++it) {
        std::cout << it->first << ": " << it->second << std::endl;
    }

    return 0;
}

5. 自定义哈希函数和键比较

默认情况下，std::unordered_map 使用 std::hash<Key> 作为哈希函数，并使用 std::equal_to<Key> 作为键比较函数。但是，你可以自定义这两个函数以适应特定的需求。

5.1 自定义哈希函数

假设我们有一个自定义类型作为键，需要自定义哈希函数。

#include <iostream>
#include <unordered_map>
#include <string>

struct Point {
    int x;
    int y;

    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

// 自定义哈希函数
struct PointHash {
    std::size_t operator()(const Point& p) const {
        return std::hash<int>()(p.x) ^ (std::hash<int>()(p.y) << 1);
    }
};

int main() {
    std::unordered_map<Point, std::string, PointHash> pointMap;

    Point p1 {1, 2};
    Point p2 {3, 4};

    pointMap[p1] = "Point 1";
    pointMap[p2] = "Point 2";

    for (const auto& pair : pointMap) {
        std::cout << "(" << pair.first.x << ", " << pair.first.y << "): " << pair.second << std::endl;
    }

    return 0;
}

输出可能为：

(1, 2): Point 1
(3, 4): Point 2

5.2 自定义键比较函数

如果需要不同的键比较逻辑，可以通过提供自定义比较函数来实现。然而，对于 unordered_map，键比较函数需要用于处理哈希冲突时键的相等性判断，通常不需要修改，除非有特殊需求。

6. 性能与复杂度

查找、插入和删除：
- 平均时间复杂度：O(1)
- 最坏时间复杂度：O(n)（当所有元素都落入同一个桶时）
空间复杂度：O(n)，其中 n 是元素的数量。
负载因子（Load Factor）：load_factor() 表示当前元素数量与桶数量的比值。较高的负载因子可能导致更多的哈希冲突，降低性能。

调整负载因子：

rehash()：设置桶的数量，使得负载因子不超过指定值。

reserve()：预留空间以减少后续的 rehash 操作。

std::unordered_map<std::string, int> umap;

// 设置最少桶数量
umap.rehash(100);

// 预留空间
umap.reserve(1000);

7. 常见用法示例

7.1 统计字符出现次数

#include <iostream>
#include <unordered_map>
#include <string>

int main() {
    std::string text = "hello world";
    std::unordered_map<char, int> charCount;

    for (char c : text) {
        if (c != ' ') {  // 忽略空格
            charCount[c]++;
        }
    }

    for (const auto& pair : charCount) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

输出可能为：

h: 1
e: 1
l: 3
o: 2
w: 1
r: 1
d: 1

7.2 使用 std::unordered_map 实现字典

#include <iostream>
#include <unordered_map>
#include <string>

int main() {
    std::unordered_map<std::string, std::string> dictionary = {
        {"apple", "A fruit that grows on trees."},
        {"banana", "A long yellow fruit."},
        {"orange", "A citrus fruit."}
    };

    std::string word = "banana";
    auto it = dictionary.find(word);
    if (it != dictionary.end()) {
        std::cout << word << ": " << it->second << std::endl;
    } else {
        std::cout << word << " not found in the dictionary." << std::endl;
    }

    return 0;
}

输出：

banana: A long yellow fruit.

7.3 合并两个 std::unordered_map

#include <iostream>
#include <unordered_map>
#include <string>

int main() {
    std::unordered_map<std::string, int> map1 = {
        {"apple", 1},
        {"banana", 2}
    };

    std::unordered_map<std::string, int> map2 = {
        {"orange", 3},
        {"banana", 4},  // 重复键
        {"grape", 5}
    };

    for (const auto& pair : map2) {
        // 如果键不存在，则插入；如果存在，则更新值
        map1[pair.first] = pair.second;
    }

    for (const auto& pair : map1) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

输出可能为：

banana: 4
apple: 1
orange: 3
grape: 5

说明：键 "banana" 的值被更新为 4。

8. 高级用法

8.1 自定义哈希函数与键比较

当使用自定义类型作为键时，需要自定义哈希函数和键比较函数。

#include <iostream>
#include <unordered_map>
#include <string>

// 自定义键类型
struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// 自定义哈希函数
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        return std::hash<std::string>()(p.name) ^ (std::hash<int>()(p.age) << 1);
    }
};

int main() {
    std::unordered_map<Person, std::string, PersonHash> personMap;

    Person p1 {"Alice", 30};
    Person p2 {"Bob", 25};
    Person p3 {"Charlie", 35};

    personMap[p1] = "Engineer";
    personMap[p2] = "Doctor";
    personMap[p3] = "Teacher";

    for (const auto& pair : personMap) {
        std::cout << pair.first.name << " (" << pair.first.age << "): " << pair.second << std::endl;
    }

    return 0;
}

输出可能为：

Alice (30): Engineer
Bob (25): Doctor
Charlie (35): Teacher

8.2 使用 `std::unordered_map` 与智能指针

std::unordered_map 可以与智能指针（如 std::shared_ptr 和 std::unique_ptr）结合使用，管理对象的生命周期。

#include <iostream>
#include <unordered_map>
#include <memory>
#include <string>

struct Employee {
    std::string name;
    double salary;

    Employee(const std::string& n, double s) : name(n), salary(s) {}
};

int main() {
    std::unordered_map<int, std::shared_ptr<Employee>> employeeMap;

    employeeMap.emplace(1, std::make_shared<Employee>("Alice", 70000));
    employeeMap.emplace(2, std::make_shared<Employee>("Bob", 80000));
    employeeMap.emplace(3, std::make_shared<Employee>("Charlie", 90000));

    // 访问并修改
    if (employeeMap.find(2) != employeeMap.end()) {
        employeeMap[2]->salary += 5000;  // 增加薪水
    }

    for (const auto& pair : employeeMap) {
        std::cout << "ID: " << pair.first << ", Name: " << pair.second->name 
                  << ", Salary: " << pair.second->salary << std::endl;
    }

    return 0;
}

输出：

ID: 1, Name: Alice, Salary: 70000
ID: 2, Name: Bob, Salary: 85000
ID: 3, Name: Charlie, Salary: 90000

8.3 使用 `std::unordered_map` 实现多值映射

如果一个键对应多个值，可以使用 std::unordered_map 的值部分为 std::vector 或其他容器。

#include <iostream>
#include <unordered_map>
#include <vector>
#include <string>

int main() {
    std::unordered_map<std::string, std::vector<int>> multiMap;

    // 插入多个值
    multiMap["even"] = {2, 4, 6, 8};
    multiMap["odd"] = {1, 3, 5, 7};

    // 添加更多值
    multiMap["even"].push_back(10);
    multiMap["odd"].push_back(9);

    // 遍历并输出
    for (const auto& pair : multiMap) {
        std::cout << pair.first << ": ";
        for (int num : pair.second) {
            std::cout << num << " ";
        }
        std::cout << std::endl;
    }

    return 0;
}

输出：

odd: 1 3 5 7 9 
even: 2 4 6 8 10

8.4 使用 `std::unordered_map` 与自定义分区

有时需要对数据进行分区或分组，可以结合使用 std::unordered_map 和算法。

#include <iostream>
#include <unordered_map>
#include <vector>
#include <string>

struct Product {
    std::string name;
    std::string category;
    double price;
};

int main() {
    std::vector<Product> products = {
        {"Laptop", "Electronics", 1200.0},
        {"Smartphone", "Electronics", 800.0},
        {"T-Shirt", "Clothing", 20.0},
        {"Jeans", "Clothing", 40.0},
        {"Blender", "Home Appliance", 60.0},
        {"Microwave", "Home Appliance", 150.0}
    };

    // 分组产品按类别
    std::unordered_map<std::string, std::vector<Product>> categoryMap;

    for (const auto& product : products) {
        categoryMap[product.category].push_back(product);
    }

    // 输出分组结果
    for (const auto& pair : categoryMap) {
        std::cout << "Category: " << pair.first << std::endl;
        for (const auto& product : pair.second) {
            std::cout << "  - " << product.name << ": $" << product.price << std::endl;
        }
    }

    return 0;
}

输出：

Clothing
  - T-Shirt: $20
  - Jeans: $40
Home Appliance
  - Blender: $60
  - Microwave: $150
Electronics
  - Laptop: $1200
  - Smartphone: $800

9. 与 `std::map` 的比较

特性	`std::unordered_map`	`std::map`
底层实现	哈希表	红黑树（平衡二叉搜索树）
元素顺序	无序	按键的升序排列
查找复杂度	平均 O(1)	O(log n)
插入复杂度	平均 O(1)	O(log n)
迭代器	无特定顺序	有序迭代器
内存使用	通常比 `std::map` 更高	相对较低
使用场景	需要快速访问，元素顺序不重要	需要按键有序访问

9.1 示例对比

#include <iostream>
#include <unordered_map>
#include <map>
#include <string>

int main() {
    // 使用 unordered_map
    std::unordered_map<std::string, int> umap = {
        {"banana", 5},
        {"apple", 3},
        {"orange", 2}
    };

    std::cout << "unordered_map contents:" << std::endl;
    for (const auto& pair : umap) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    // 使用 map
    std::map<std::string, int> omap = {
        {"banana", 5},
        {"apple", 3},
        {"orange", 2}
    };

    std::cout << "\nmap contents:" << std::endl;
    for (const auto& pair : omap) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

输出：

unordered_map contents:
banana: 5
orange: 2
apple: 3

map contents:
apple: 3
banana: 5
orange: 2

说明：std::map 按键有序，而 std::unordered_map 的输出顺序可能不固定。

10. 注意事项与最佳实践

10.1 选择合适的底层容器

默认情况下，使用 std::deque 作为底层容器通常足够。如果有特殊需求，如自定义哈希函数，确保底层容器满足性能和功能要求。
当使用自定义类型作为键时，确保 hash 和 operator== 正确实现，以避免哈希冲突和错误的键比较。

10.2 自定义哈希函数

struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// 自定义哈希函数
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        return std::hash<std::string>()(p.name) ^ (std::hash<int>()(p.age) << 1);
    }
};

10.3 处理哈希冲突

哈希冲突会影响 unordered_map 的性能，选择一个好的哈希函数和调整桶数量可以减少冲突的发生。
使用 emplace() 和 try_emplace() 来构造元素，避免不必要的拷贝和移动操作。
10.4 避免不必要的拷贝

// 使用 emplace 直接构造元素
umap.emplace("grape", 4);

// 使用 try_emplace 仅在键不存在时插入
umap.try_emplace("kiwi", 2);

10.5 预分配桶空间

通过 reserve() 或 rehash() 预先分配足够的桶空间，以减少后续的重新哈希操作，提高性能。
```
umap.reserve(1000);  // 预留空间
```
10.6 使用合适的键类型
选择合适的键类型，尽量使用简单且支持高效哈希和比较的类型。例如，使用 std::string、int 等基本类型作为键通常表现良好。
频繁的删除操作可能导致哈希表频繁重新调整，影响性能。考虑使用懒删除或其他优化策略。
10.7 避免频繁的 erase 操作

以下是一个更为综合的示例，展示了 std::unordered_map 的多种功能：

#include <iostream>
#include <unordered_map>
#include <string>
#include <vector>
#include <algorithm>

// 自定义类型
struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// 自定义哈希函数
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        return std::hash<std::string>()(p.name) ^ (std::hash<int>()(p.age) << 1);
    }
};

int main() {
    // 使用 std::unordered_map 存储 Person 到职位的映射
    std::unordered_map<Person, std::string, PersonHash> personJobMap;

    // 插入元素
    personJobMap.emplace(Person{"Alice", 30}, "Engineer");
    personJobMap.emplace(Person{"Bob", 25}, "Doctor");
    personJobMap.emplace(Person{"Charlie", 35}, "Teacher");

    // 访问元素
    Person searchPerson = {"Bob", 25};
    auto it = personJobMap.find(searchPerson);
    if (it != personJobMap.end()) {
        std::cout << it->first.name << " is a " << it->second << "." << std::endl;
    } else {
        std::cout << searchPerson.name << " not found." << std::endl;
    }

    // 遍历所有元素
    std::cout << "\nAll entries:" << std::endl;
    for (const auto& pair : personJobMap) {
        std::cout << pair.first.name << " (" << pair.first.age << "): " << pair.second << std::endl;
    }

    // 更新元素
    personJobMap[Person{"Alice", 30}] = "Senior Engineer";

    // 删除元素
    personJobMap.erase(Person{"Charlie", 35});

    // 输出更新后的内容
    std::cout << "\nAfter updates:" << std::endl;
    for (const auto& pair : personJobMap) {
        std::cout << pair.first.name << " (" << pair.first.age << "): " << pair.second << std::endl;
    }

    return 0;
}

在上述示例中，我们展示了如何使用自定义类型作为键，如何插入、访问、更新和删除元素，以及如何遍历 std::unordered_map。

通过掌握 std::unordered_map 的各种用法和优化技巧，你可以在实际开发中高效地管理和操作键值对数据。