Common operations for handling binary files uploaded from the front end in Python Flask (to be updated)

王小希ww · Published 2022-10-16 02:26:25

I. Saving audio, video, and text files locally

Reference

Core code:

with open(file_path, "wb") as out_file:  # open for [w]riting as [b]inary
    out_file.write(buffer_video)

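The mode "wb" opens a file for writing in binary format: an existing file is overwritten, and a missing file is created. "wb+" (used in the examples below) does the same and additionally allows reading from the same handle; for a plain save, "wb" is enough.

Note: a byte stream involves no character encoding, so open() takes no encoding argument here (passing one in binary mode raises ValueError).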
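
The difference between the two modes can be sketched with a temporary file. The helper name write_and_read_back and the file name demo.bin are made up for this illustration:

```python
import os
import tempfile

def write_and_read_back(path, data):
    """Write bytes with 'wb+', then rewind and read them back from the same handle."""
    with open(path, "wb+") as f:   # create or truncate, binary, readable as well
        f.write(data)
        f.seek(0)                  # rewind before reading
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:    # plain binary write; f.read() is not allowed in this mode
        f.write(b"old content")
    echoed = write_and_read_back(path, b"\x00\x01new")   # the old content is overwritten
    size_on_disk = os.path.getsize(path)
```

Both modes truncate the existing file, so after the second write only the five new bytes remain on disk.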

1) Saving binary video

If the front end uploads a video, write its bytes to disk with with open; note the wb+ mode.

import os
from flask import request

# Runs inside a Flask view function; save_dir is the target folder (defined elsewhere)
fileStorage = request.files['videofile']  # uploaded video file
buffer_video = fileStorage.read()
filename = fileStorage.filename  # name of the uploaded file
# Save the binary video stream to a file first, then read it with OpenCV. See https://stackoverflow.com/questions/57865656/save-video-in-python-from-bytes
os.makedirs(save_dir, exist_ok=True)  # create the folder if it is missing
file_path = os.path.join(save_dir, "temp." + filename.split(".")[-1])
with open(file_path, "wb+") as out_file:  # open for [w]riting as [b]inary
    out_file.write(buffer_video)
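
Calling read() loads the whole upload into memory, which can be costly for large videos. One alternative, sketched below under some assumptions, is to copy the upload stream to disk in chunks. The helper save_stream is hypothetical, and io.BytesIO stands in for fileStorage.stream so the snippet runs without a real request:

```python
import io
import os
import shutil
import tempfile

def save_stream(stream, save_dir, filename, chunk_size=64 * 1024):
    """Copy a binary file-like object to save_dir/temp<ext> in chunks."""
    os.makedirs(save_dir, exist_ok=True)
    ext = os.path.splitext(filename)[1]          # ".mp4", or "" if there is no extension
    file_path = os.path.join(save_dir, "temp" + ext)
    with open(file_path, "wb") as out_file:
        shutil.copyfileobj(stream, out_file, chunk_size)
    return file_path

# Stand-in for fileStorage.stream in a real request
fake_upload = io.BytesIO(b"\x00\x00\x00\x18ftypmp42" * 1000)
with tempfile.TemporaryDirectory() as tmp:
    saved = save_stream(fake_upload, os.path.join(tmp, "videos"), "clip.mp4")
    saved_size = os.path.getsize(saved)
```

In a Flask view, Werkzeug's FileStorage also provides a save() method that writes the upload to a path directly, which may be simpler when no custom logic is needed.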

2) Saving binary audio

If the front end uploads audio, the bytes are written to disk with with open in the same way.

'''Speech to text'''
@speechB.route('/predict_text_from_audio', methods=['POST'])
def speech2word():
    if (request.method == 'POST'):  # the route only accepts POST
        os.makedirs(voice2text_save_path, exist_ok=True)  # create the folder if it is missing
        fileStorage = request.files['audiofile']  # uploaded audio file
        buffer_data = fileStorage.read()
        filename = fileStorage.filename  # name of the uploaded file
        temp_path = os.path.join(voice2text_save_path, 'demo.' + filename.split(".")[-1])
        with open(temp_path, 'wb+') as f:
            f.write(buffer_data)  # write the bytes out as an audio file
        text = speech2word_Handler.predict_word_with_voice(temp_path)
        return text
    else:
        return jsonify({'code': 400, 'msg': 'Operation failed: please use the POST method'})
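
Because the route always writes to the same demo.<ext> file, two requests arriving at the same time could overwrite each other's upload. A possible way around this, assuming a random name per request is acceptable, is sketched below; unique_upload_path is a hypothetical helper, not part of Flask:

```python
import os
import uuid

def unique_upload_path(save_dir, filename):
    """Return save_dir/<random hex><ext>, keeping only the extension of the client-supplied name."""
    ext = os.path.splitext(os.path.basename(filename))[1].lower()
    return os.path.join(save_dir, uuid.uuid4().hex + ext)

p1 = unique_upload_path("uploads", "voice.WAV")
p2 = unique_upload_path("uploads", "voice.WAV")   # same input, different path
```

Only the extension of the uploaded name is reused, since the name itself comes from the client and should not be trusted as a path. Werkzeug's secure_filename() is another option when the original name has to be kept.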

3) Saving binary text files

If the front end uploads a document as binary data (pdf, docx, txt, and so on), it is likewise written to disk with with open.

'''Speech synthesis'''
@speechB.route('/predict_audio_from_text', methods=['POST'])
def speechSynthetic():
    if (request.method == 'POST'):  # return the audio file first; if that fails, return an audio URL the front end can access
        input_type = request.form.get("type")  # check for None before calling int()
        if (input_type is None): input_type = request.json['type']
        input_type = int(input_type)

        if (not os.path.isdir(text2voice_save_path)):  # create nested folders
            os.makedirs(text2voice_save_path, mode=0o777)

        ret = True
        if (input_type == 0):  # type=0: plain text string
            text = request.form.get('text')  # the text is sent as a form-data field
        elif (input_type == 1):  # type=1: binary text file
            fileStorage = request.files['textfile']  # uploaded binary file
            buffer_data = fileStorage.read()
            filename = fileStorage.filename
            suffix = filename.split(".")[-1]
            filePath = os.path.join(text2voice_save_path, 'demo.' + suffix)
            save_file_from_byte(buffer_data, filePath)  # save the binary file
            text, ret = read_file(filePath)  # extract the text (ret=False means the file could not be parsed)
        else:  # any other value would otherwise leave text undefined
            text, ret = "Exception: type must be 0 or 1", False

        if (ret == True):
            savePath = os.path.join(text2voice_save_path, 'demo.wav')
            wav_path = text2voice_Handler.handle_speech_2_voice(input_text=text, savePath=savePath)
            timeStamp = str(time.mktime(time.localtime(time.time())))
            data = "http://" + ip + ":" + port + "/get_audio?file_path=" + wav_path + "&timeStamp=" + timeStamp  # URL for fetching the file
            return jsonify({'data': data, 'error_flag' : False})
        else:
            return jsonify({'data': text, 'error_flag' : True})
    else:
        return jsonify({'code': 400, 'msg': 'Operation failed: please use the POST method'})

Here save_file_from_byte() is the code that writes the file:

# Save a binary stream to a file
def save_file_from_byte(file_byte,filePath):
    with open(filePath, 'wb+') as f:
        f.write(file_byte)  # write the bytes to a local file

II. Reading the text file that was just saved

1) Reading txt

with open(filePath, encoding='utf-8') as f:  # the with block closes the file afterwards
    text = f.read()
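
Uploaded txt files are not always UTF-8, in which case the read above raises UnicodeDecodeError. A minimal sketch of a fallback, assuming the only other encoding worth trying is GBK (that choice, and the helper name read_text_with_fallback, are assumptions for illustration):

```python
import os
import tempfile

def read_text_with_fallback(path, encodings=("utf-8", "gbk")):
    """Try each encoding in turn; return (text, encoding_used)."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue                     # try the next candidate
    raise ValueError("none of %r could decode %s" % (encodings, path))

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "demo.txt")
    with open(sample, "wb") as f:
        f.write("你好".encode("gbk"))    # bytes that are not valid UTF-8
    decoded, used = read_text_with_fallback(sample)
```

The order of the candidates matters: UTF-8 is strict and fails fast on foreign bytes, so it is tried first, while a permissive fallback can decode unrelated data without raising.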

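2) Reading docx

Reference: reading Word content with python_docx (https://blog.csdn.net/qq_38870145/article/details/124076591)

First run pip install python-docx, then use the following code:

from docx import Document
doc = Document(filePath)
text = ""  # start with an empty string before accumulating
for i in doc.paragraphs:
    text = text + str(i.text)
print(text)

Note: this reads .docx files, but not legacy .doc files.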

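3) Reading pdf

Reference: a jb51.net article on reading PDF files with Python (https://www.jb51.net/article/258597.htm)

First run pip install pdfplumber, then use the following code:

import pdfplumber
text = ""  # start with an empty string before accumulating
with pdfplumber.open(filePath) as pdf:
    for page in pdf.pages:
        text = text + (page.extract_text() or "")  # guard against pages with no extractable text
print(text)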
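
Concatenating pages directly glues the last word of one page to the first word of the next. A small sketch of joining page texts with a separator while skipping empty pages follows; join_page_texts and FakePage are hypothetical names, and FakePage only imitates the extract_text() method so the snippet runs without a PDF:

```python
def join_page_texts(pages, sep="\n"):
    """Concatenate extract_text() results, skipping pages that yield nothing."""
    parts = []
    for page in pages:
        page_text = page.extract_text()
        if page_text:                  # skips both None and ""
            parts.append(page_text)
    return sep.join(parts)

class FakePage:
    """Hypothetical stand-in for a pdfplumber page, for illustration only."""
    def __init__(self, text):
        self._text = text

    def extract_text(self):
        return self._text

combined = join_page_texts([FakePage("page one"), FakePage(None), FakePage("page two")])
```

With a real file, the same helper could be called as join_page_texts(pdf.pages) inside the with pdfplumber.open(...) block.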

4) Complete code

import os
from docx import Document
import pdfplumber

# Read a file and return its text
def read_file(filePath):
    '''
    @param filePath: path of the file to read
    @return: (text, True) on success, (error message, False) for unsupported file types
    '''
    # plain-text file types
    file_types = ['txt','md']
    file_type = filePath.split(".")[-1].lower()  # accept upper-case extensions too
    text = ""
    if (file_type == 'docx'):  # see https://blog.csdn.net/qq_38870145/article/details/124076591
        doc = Document(filePath)
        for i in doc.paragraphs:
            text = text + str(i.text)
    elif(file_type == 'pdf'):  # see https://www.jb51.net/article/258597.htm
        with pdfplumber.open(filePath) as pdf:
            for page in pdf.pages:
                text = text + (page.extract_text() or "")  # guard against pages with no extractable text
    elif(file_type in file_types):
        with open(filePath, encoding='utf-8') as f:
            text = f.read()
    else:
        return "Exception: Only `*.pdf`, `*.docx`, `*.txt`, `*.md` files can be read",False
    return text,True
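
The route above treats ret=False as a parsing failure, but read_file() only returns False for unsupported extensions; a corrupt file would raise instead. One way to keep the same (text, ret) contract while also catching parse errors is a small dispatch table. This is a sketch under assumptions: READERS, read_plain and read_by_extension are hypothetical names, and only the plain-text reader is registered so the snippet needs no third-party packages:

```python
import os
import tempfile

def read_plain(path):
    """Read a UTF-8 text file."""
    with open(path, encoding="utf-8") as f:
        return f.read()

# docx and pdf readers could be registered here in the same way
READERS = {"txt": read_plain, "md": read_plain}

def read_by_extension(path, readers=READERS):
    """Return (text, True) on success or (error message, False) on failure."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    reader = readers.get(ext)
    if reader is None:
        return "Exception: unsupported file type '%s'" % ext, False
    try:
        return reader(path), True
    except Exception as exc:           # parsing failed: report it instead of crashing the request
        return "Exception: %s" % exc, False

with tempfile.TemporaryDirectory() as tmp:
    note = os.path.join(tmp, "note.TXT")
    with open(note, "w", encoding="utf-8") as f:
        f.write("hello")
    result = read_by_extension(note)
```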

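III. Converting mp3 to wav and setting the sample rate

Reference

import os
import soundfile as sf
import librosa
file_path = 'demo.mp3'
y, s = librosa.load(file_path, sr=16000)  # load the audio and resample it to 16000 Hz
wav_path = os.path.splitext(file_path)[0] + ".wav"  # replace only the extension, even if the path contains other dots
sf.write(wav_path, y, 16000)  # write the file (mp3 converted to wav)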
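
To confirm that a converted file really has the expected sample rate, the header can be inspected with the standard-library wave module, which reads PCM WAV files. The sketch below is an illustration under assumptions: wav_info is a hypothetical helper, and a one-second silent file is generated as a stand-in for the converted demo.wav:

```python
import os
import tempfile
import wave

def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        return rate, w.getnchannels(), w.getnframes() / rate

# Build a one-second silent 16 kHz mono file as a stand-in for the converted output
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.wav")
    with wave.open(demo, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)              # 16-bit samples
        w.setframerate(16000)
        w.writeframes(b"\x00\x00" * 16000)
    info = wav_info(demo)
```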
