java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String
2021/7/13 14:07:32
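Problem
I recently ran into a problem when using Spark to read TensorFlow TFRecord files and write them back out:
import org.apache.spark.sql.SaveMode

// Read the data
val df = spark.read.format("tfrecords").load(readPath).toDF()

// Write the data
df.write.mode(SaveMode.Overwrite).format("tfrecords").option("recordType", "Example").save(writePath)
The write failed with the following error: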
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String
    at org.apache.spark.sql.catalyst.util.GenericArrayData.getUTF8String(GenericArrayData.scala:75)
    at org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
    at org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
    at org.apache.spark.sql.catalyst.util.ArrayData.toArray(ArrayData.scala:178)
    at org.tensorflow.spark.datasources.tfrecords.serde.DefaultTfRecordRowEncoder$.org$tensorflow$spark$datasources$tfrecords$serde$DefaultTfRecordRowEncoder$$encodeFeature(DefaultTfRecordRowEncoder.scala:132)
    at org.tensorflow.spark.datasources.tfrecords.serde.DefaultTfRecordRowEncoder$$anonfun$encodeExample$1.apply(DefaultTfRecordRowEncoder.scala:64)
    at org.tensorflow.spark.datasources.tfrecords.serde.DefaultTfRecordRowEncoder$$anonfun$encodeExample$1.apply(DefaultTfRecordRowEncoder.scala:61)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.tensorflow.spark.datasources.tfrecords.serde.DefaultTfRecordRowEncoder$.encodeExample(DefaultTfRecordRowEncoder.scala:61)
    at org.tensorflow.spark.datasources.tfrecords.DefaultSource$$anonfun$2.apply(DefaultSource.scala:59)
    at org.tensorflow.spark.datasources.tfrecords.DefaultSource$$anonfun$2.apply(DefaultSource.scala:56)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:129)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 10 more
The jar I was using was: spark-tensorflow-connector_2.11-1.6.0.jar
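The "Caused by" frames show the cast failing in GenericArrayData.getUTF8String, reached from the connector's encodeFeature via ArrayData.toArray. That suggests, though the trace alone does not prove it, that the failure is hit while encoding a column of type array<string>. A minimal sketch for listing such columns, assuming only the standard Spark SQL type classes; the object and method names (SchemaCheck, stringArrayColumns) are hypothetical and not part of Spark or the connector:

```scala
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Hypothetical helper, not part of Spark or spark-tensorflow-connector.
object SchemaCheck {
  // Returns the names of top-level columns whose type is array<string>.
  // Nested structs are not inspected.
  def stringArrayColumns(schema: StructType): Seq[String] =
    schema.fields.toSeq.collect {
      case StructField(name, ArrayType(StringType, _), _, _) => name
    }
}
```

Calling SchemaCheck.stringArrayColumns(df.schema) on the DataFrame read above would show which columns, if any, take this encoding path.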
Solution:
This was mainly a version problem. Replacing the Spark connector jar with the following version fixed it:
spark-tensorflow-connector_2.11-1.14.0.jar
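If the project pulls the connector through a build tool rather than a hand-copied jar, the same swap can be expressed as a dependency change. A sketch for sbt, assuming the artifact is resolved from Maven Central under the org.tensorflow group (the post itself only names the jar file, so the coordinates are an assumption to verify against your repository):

```scala
// build.sbt (sketch): the group ID "org.tensorflow" is an assumption;
// the artifact name and version come from the jar file name in the post.
libraryDependencies += "org.tensorflow" % "spark-tensorflow-connector_2.11" % "1.14.0"
```

The _2.11 suffix is the Scala binary version and should match the Scala version of the Spark build in use.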