With pyspark 2.4.x running on Python 3.8, importing pyspark fails with TypeError: an integer is required (got type bytes). How can this error be fixed?

Error message

The error message may look like this:

Traceback (most recent call last):
  File "/xxx/xxx/xxx.py", line 2, in <module>
    from pyspark.sql import SparkSession
  File "/xxx/xxx/lib/python3.8/site-packages/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/xxx/xxx/lib/python3.8/site-packages/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/xxx/xxx/lib/python3.8/site-packages/pyspark/accumulators.py", line 97, in <module>
    from pyspark.serializers import read_int, PickleSerializer
  File "/xxx/xxx/lib/python3.8/site-packages/pyspark/serializers.py", line 72, in <module>
    from pyspark import cloudpickle
  File "/xxx/xxx/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 145, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/xxx/xxx/lib/python3.8/site-packages/pyspark/cloudpickle.py", line 126, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)

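Cause and solution

This exception is raised because Spark 2.4.x does not support Python 3.8. The cloudpickle module bundled with pyspark calls types.CodeType with the argument list used before Python 3.8, and Python 3.8 added a new positional parameter (posonlyargcount) to that constructor. To fix it, downgrade the Python environment that runs the code to 3.7 or lower.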
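
To catch this combination before the import fails, a small guard can compare the two versions. The following is only a minimal sketch: the function name hits_known_incompatibility and the version string "2.4.8" are hypothetical placeholders, and the check covers nothing beyond the one case described here (Spark 2.4.x on Python 3.8 or newer).

```python
import sys


def hits_known_incompatibility(spark_version, python_version=None):
    """Return True for the case described above: Spark 2.4.x on Python 3.8+.

    spark_version is a string such as "2.4.5"; python_version is a
    (major, minor) tuple and defaults to the running interpreter.
    A False result only means this one known problem does not apply.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    # Compare only the major and minor components of the Spark version.
    spark_major_minor = tuple(int(part) for part in spark_version.split(".")[:2])
    return spark_major_minor == (2, 4) and tuple(python_version[:2]) >= (3, 8)


if __name__ == "__main__":
    # "2.4.8" is a placeholder; substitute the pyspark version you have installed.
    if hits_known_incompatibility("2.4.8"):
        print("pyspark 2.4.x cannot be imported on Python 3.8+; use Python 3.7 or lower.")
```

Running such a check at the top of a launcher script, before any pyspark import, turns the long traceback into a one-line message that points at the Python version.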
