Python data analysis: code for a depression study

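Summary of the depression patient data analysis: the dataset contains 8,400 consultation records from patients with depression. Its main fields are patient information (name, sex, age), consultation label, date, title, number of exchanges, and doctor information (name, hospital, department). The analysis shows that: 1) some date values are missing; 2) the patient name column has to be split to obtain sex and age; 3) this split is carried out during preprocessing. The dataset can be used to study the sex and age distribution of patients, consultation frequency, how consultations are spread across doctors and departments, and similar medical data questions.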
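The split of `Patient_name` into sex and age mentioned in the summary can also be sketched with a regular expression. This is only an illustration, not the notebook's own method: the helper name `parse_patient` is hypothetical, the input format `患者：女  43岁` is taken from the notebook output, and the rule that any age not given purely in years becomes 1 mirrors the cleanup step used later in the notebook.

```python
import re

# Pattern for strings such as "患者：女  43岁" (format taken from the notebook output).
# The helper below is a hypothetical illustration, not part of the original notebook.
_PATIENT_RE = re.compile(r"患者：(?P<sex>\S+)\s+(?P<num>\d+)(?P<unit>\S*)")


def parse_patient(text):
    """Return (sex, age_in_years), or (None, None) when the text does not match."""
    m = _PATIENT_RE.match(text)
    if m is None:
        return None, None
    # Ages not given purely in years (days, months, mixed forms) are set to 1,
    # mirroring the rule the notebook applies when it cleans the Age column.
    age = int(m.group("num")) if m.group("unit") == "岁" else 1
    return m.group("sex"), age


print(parse_patient("患者：女  43岁"))  # ('女', 43)
```

With pandas, such a function could be applied through `df['Patient_name'].map(parse_patient)`, which keeps the parsing logic in one place and returns `(None, None)` for malformed rows instead of raising an `IndexError`.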

slh7773 · published 2025-06-06 15:04:29

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Import the data"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
  "text/plain": [
   "  Patient_name Label   Date  \\\n",
   "0    患者：女  43岁    压抑  05.28   \n",
   "\n",
   "                                               Title  Communications Doctor  \\\n",
   "0  压抑 个人情况：去年1月份开始夫妻两地分居，孩子13岁男孩住校，平... 这种情况是否需要去...             115    杨胜文   \n",
   "\n",
   "  Hospital Faculty  \n",
   "0  襄阳市安定医院     心理科  "
  ]
 },
 "execution_count": 27,
 "metadata": {},
 "output_type": "execute_result"
}

],
"source": [
"import pandas as pd\n",
"from pyecharts.charts import Pie\n",
"from pyecharts import options as opts\n",
"\n",
"df = pd.read_csv('YiYuZheng.csv')\n",
"df.head(1)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 8400 entries, 0 to 8399\n",
"Data columns (total 8 columns):\n",
" #   Column          Non-Null Count  Dtype \n",
"---  ------          --------------  ----- \n",
" 0   Patient_name    8400 non-null   object\n",
" 1   Label           8400 non-null   object\n",
" 2   Date            8288 non-null   object\n",
" 3   Title           8400 non-null   object\n",
" 4   Communications  8400 non-null   int64 \n",
" 5   Doctor          8400 non-null   object\n",
" 6   Hospital        8400 non-null   object\n",
" 7   Faculty         8400 non-null   object\n",
"dtypes: int64(1), object(7)\n",
"memory usage: 525.1+ KB\n"
]
}
],
"source": [
"# Inspect the data\n",
"df.info()"
]
},
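{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output shows that the Date column has missing values and is not stored as a date type.\n",
"\n",
"The Patient_name column mixes several pieces of information, so age and sex need to be split out."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data preprocessing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Splitting out age and sex"
]
},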
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
  "text/plain": [
   "  Patient_name           Label   Date  \\\n",
   "0    患者：女  43岁              压抑  05.28   \n",
   "1    患者：女  32岁        生气。心梗。抑郁  05.28   \n",
   "2    患者：女  15岁       情绪低落，烦躁抑郁  05.28   \n",
   "3    患者：女  16岁              抑郁  05.28   \n",
   "4    患者：女  67岁  焦虑症 严重躯干反应、抑郁症  05.28   \n",
   "\n",
   "                                               Title  Communications Doctor  \\\n",
   "0  压抑 个人情况：去年1月份开始夫妻两地分居，孩子13岁男孩住校，平... 这种情况是否需要去...             115    杨胜文   \n",
   "1      生气。心梗。抑郁 郁郁寡欢。被他人语言刺激。卧床不起。没动力。心疼。受伤 是什么病。怎么办              12    郭汉法   \n",
   "2  情绪低落，烦躁抑郁 情绪低落，压抑烦躁，思考能力降低。长时间学习，睡眠时间少。睡... 还有...               2    郭苏皖   \n",
   "3  抑郁 前面已简述，2024年夏季中考，本来学习非常好，非常自律，自... 已经服用9个月的艾...               2     刘丽   \n",
   "4  焦虑症 严重躯干反应   抑郁症 草酸加量以后，还是有比较严重的躯干反应，主要表现为背痛 脖...               2    刘晓华   \n",
   "\n",
   "             Hospital      Faculty Sex Age  \n",
   "0             襄阳市安定医院          心理科   女  43  \n",
   "1             泰安八十八医院        临床心理科   女  32  \n",
   "2              南京脑科医院        医学心理科   女  15  \n",
   "3  联勤保障部队第九〇四医院（常州院区）  精神3科（物质依赖科）   女  16  \n",
   "4           上海市精神卫生中心          精神科   女  67  "
  ]
 },
 "execution_count": 29,
 "metadata": {},
 "output_type": "execute_result"
}

],
"source": [
"# Extract sex into a new column\n",
"# '患者：女  43岁' split on a single space gives ['患者：女', '', '43岁']; take the first item,\n",
"# then split it on the full-width colon to get ['患者', '女'] and keep the last part\n",
"df['Sex'] = df['Patient_name'].map(lambda x: x.split(\" \")[0]).map(lambda x: x.split(\"：\")[-1])\n",
"\n",
"# Extract age into a new column\n",
"# Same split on a single space; take the third item and drop the trailing '岁'\n",
"df['Age'] = df['Patient_name'].map(lambda x: x.split(\" \")[2][:-1])\n",
"\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Handling missing values"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Patient_name        0\n",
"Label               0\n",
"Date              112\n",
"Title               0\n",
"Communications      0\n",
"Doctor              0\n",
"Hospital            0\n",
"Faculty             0\n",
"Sex                 0\n",
"Age                 0\n",
"dtype: int64"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.isnull().sum()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"# Only a few rows are missing and filling them in is not appropriate here, so drop them\n",
"df.dropna(inplace=True)  # drop the rows in place"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Int64Index: 8288 entries, 0 to 8399\n",
"Data columns (total 10 columns):\n",
" #   Column          Non-Null Count  Dtype \n",
"---  ------          --------------  ----- \n",
" 0   Patient_name    8288 non-null   object\n",
" 1   Label           8288 non-null   object\n",
" 2   Date            8288 non-null   object\n",
" 3   Title           8288 non-null   object\n",
" 4   Communications  8288 non-null   int64 \n",
" 5   Doctor          8288 non-null   object\n",
" 6   Hospital        8288 non-null   object\n",
" 7   Faculty         8288 non-null   object\n",
" 8   Sex             8288 non-null   object\n",
" 9   Age             8288 non-null   object\n",
"dtypes: int64(1), object(9)\n",
"memory usage: 712.2+ KB\n"
]
}
],
"source": [
"df.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fixing the Date column"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"# df['Date']\n",
"# Convert to string type\n",
"df['Date'] = df['Date'].astype(str)\n",
"\n",
"# Normalize the Date column to a single year-month-day format\n",
"def trans_date(tag):\n",
"    if tag.startswith(\"20\"):  # starts with \"20\", so the year is already present\n",
"        tag = tag.replace(\".\", \"-\")\n",
"    else:\n",
"        tag = \"2025-\" + tag.replace(\".\", \"-\")  # otherwise prepend the year\n",
"    return tag\n",
"\n",
"df['Date'] = df['Date'].map(trans_date)  # apply the conversion to every value\n",
"\n",
"# Convert to a datetime type\n",
"df['Date'] = pd.to_datetime(df['Date'])\n",
"\n",
"# df.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data visualization"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"from pyecharts.globals import ThemeType  # import the chart themes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Patient sex distribution"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<pyecharts.render.display.HTML at 0x18dbfdb6100>"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Prepare the data: count the rows for each sex\n",
"data = df['Sex'].value_counts()\n",
"# data\n",
"x = data.index.tolist()\n",
"y = data.tolist()\n",
"\n",
"# Draw the pie chart\n",
"pie = (\n",
"    Pie(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))  # set the theme\n",
"    .add(\"\",\n",
"         [list(z) for z in zip(x, y)],  # data must be packed as [[key, value], [key, value], ...]\n",
"         label_opts=opts.LabelOpts(formatter=\"{b}:{d}%\")  # show labels as percentages\n",
"        )\n",
"    .set_global_opts(title_opts=opts.TitleOpts(title=\"Patient sex distribution\"))\n",
")\n",
"pie.render_notebook()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Patient age distribution"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"# Prepare the data\n",
"# 1. Convert age to a numeric type\n",
"# df['Age'] = df['Age'].astype(int)\n",
"# The age data is irregular (some values have the form 'X岁Y月'), so clean it once more:\n",
"# any value that mentions days or months is set to 1\n",
"df['Age'] = df['Age'].map(lambda x: \"1\" if (\"天\" in x or \"个\" in x or \"月\" in x) else x).astype(int)\n",
"# df.info()"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<pyecharts.render.display.HTML at 0x1f3c6256940>"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = df['Faculty'].value_counts()[:10]  # keep the ten most frequent departments\n",
"\n",
"pie = (\n",
"    Pie()\n",
"    .add('', [list(z) for z in zip(data.index.tolist(), data.tolist())],  # pie data format: [[key1, value1], [key2, value2], ...]\n",
"         label_opts=opts.LabelOpts(formatter=\"{b}:{d}%\")  # label format\n",
"        )\n",
")\n",
"pie.render_notebook()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
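The Date step in the notebook prepends a year to values such as `05.28` and then lets pandas parse the result. The same rule can be sketched as a small standard-library function, which makes it easy to check in isolation. The name `normalize_date` and the `default_year` parameter are assumptions made for this illustration; the two input shapes (`MM.DD` and `YYYY.MM.DD`) are inferred from the notebook's `startswith("20")` check.

```python
from datetime import date


def normalize_date(tag, default_year=2025):
    """Convert 'MM.DD' or 'YYYY.MM.DD' into a datetime.date.

    Hypothetical helper: default_year stands in for the hard-coded
    "2025-" prefix used in the notebook.
    """
    parts = [int(p) for p in tag.strip().split(".")]
    if len(parts) == 2:
        # No year present, e.g. "05.28": prepend the assumed year
        parts.insert(0, default_year)
    year, month, day = parts
    # date() raises ValueError for impossible dates such as month 13
    return date(year, month, day)


print(normalize_date("05.28"))  # 2025-05-28
```

Counting the dot-separated parts avoids relying on the string starting with `20`, and invalid values raise a `ValueError` instead of being converted silently.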
![image](https://i-blog.csdnimg.cn/direct/f3381692840e4ae2872e741d34349439.png#pic_center)
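Both pie charts in the notebook are fed a list of `[label, count]` pairs built from `value_counts()`. As a rough sketch of that data shape without pandas, the same pairs can be produced with `collections.Counter`. The function name `top_pairs` and the sample department list are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def top_pairs(values, n=None):
    """Return [[label, count], ...] sorted by descending count.

    Hypothetical helper; n=None keeps every label.
    """
    return [[label, count] for label, count in Counter(values).most_common(n)]


# Made-up sample values, for illustration only
faculties = ["心理科", "精神科", "心理科", "临床心理科", "心理科", "精神科"]
print(top_pairs(faculties, 2))  # [['心理科', 3], ['精神科', 2]]
```

A list in this form is what the notebook builds with `[list(z) for z in zip(x, y)]` and passes as the data argument of `Pie().add`.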
