Databricks notebook failing with "Caused by: java.i…" - Databricks - 17284

rpshgupta · ‎06-19-2022

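org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 458.0 failed 4 times, most recent failure: Lost task 0.3 in stage 458.0 (TID 2247) (172.18.102.75 executor 1): com.databricks.sql.io.FileReadException: Error while reading file abfss:[email protected]/file.csv. It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved. If Delta cache is stale or the underlying files have been removed, you can invalidate Delta cache manually by restarting the cluster.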

at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.logFileNameAndThrow(FileScanRDD.scala:417)

at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:369)

at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)

at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:509)

at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.$anonfun$hasNext$1(FileScanRDD.scala:322)

at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)

at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)

at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:317)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)

at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:513)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)

at org.apache.spark.sql.execution.collect.UnsafeRowBatchUtils$.encodeUnsafeRows(UnsafeRowBatchUtils.scala:80)

at org.apache.spark.sql.execution.collect.Collector.$anonfun$processFunc$1(Collector.scala:155)

at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$3(ResultTask.scala:75)

at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)

at org.apache.spark.scheduler.ResultTask.$anonfun$runTask$1(ResultTask.scala:75)

at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)

at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:55)

at org.apache.spark.scheduler.Task.doRunTask(Task.scala:156)

at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:125)

at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)

at org.apache.spark.scheduler.Task.run(Task.scala:95)

at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:825)

at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1658)

at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:828)

at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:683)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.io.FileNotFoundException: Operation failed: "The specified path does not exist.", 404, HEAD, https://adls.dfs.core.windows.net/raw/file.csv?upn=false&action=getStatus&timeout=90

at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1344)

at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.open(AzureBlobFileSystem.java:266)

at com.databricks.spark.metrics.FileSystemWithMetrics.open(FileSystemWithMetrics.scala:336)

at org.apache.hadoop.fs.FileSystem.lambda$openFileWithOptions$0(FileSystem.java:4633)

at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52)

at org.apache.hadoop.fs.FileSystem.openFileWithOptions(FileSystem.java:4631)

at org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4768)

at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:92)

at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:65)

at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.readFile(CSVDataSource.scala:108)

at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.$anonfun$buildReader$2(CSVFileFormat.scala:169)

at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:156)

at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:143)

at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:353)

... 31 more
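The "Caused by" line reports the failing request as an HTTPS URL, while Spark reads use the abfss:// form. Below is a minimal sketch of the mapping between the two, assuming the usual ADLS Gen2 layout `https://<account>.dfs.core.windows.net/<container>/<path>`. The helper name `abfss_uri_from_dfs_url` is made up for illustration and is not part of any library.

```python
from urllib.parse import urlsplit


def abfss_uri_from_dfs_url(url: str) -> str:
    """Map an ADLS Gen2 REST URL to the matching abfss:// URI.

    Assumption: the URL has the form
    https://<account>.dfs.core.windows.net/<container>/<path>?<query>
    """
    parts = urlsplit(url)
    # The first path segment is the container (filesystem); the rest is the file path.
    container, _, file_path = parts.path.lstrip("/").partition("/")
    if not parts.hostname or not container:
        raise ValueError(f"not an ADLS Gen2 path URL: {url!r}")
    # The query string (upn, action, timeout) belongs to the REST call, so it is dropped.
    return f"abfss://{container}@{parts.hostname}/{file_path}"


# URL copied from the "Caused by" line of the stack trace above.
failing_url = (
    "https://adls.dfs.core.windows.net/raw/file.csv"
    "?upn=false&action=getStatus&timeout=90"
)
print(abfss_uri_from_dfs_url(failing_url))
```

In a Databricks notebook, the resulting URI (or its parent directory) could then be passed to `dbutils.fs.ls` to confirm whether the file is still there at the moment the job runs.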

Hubert_Dudek1 · ‎06-20-2022

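It looks like it is pointing to a file that no longer exists. As the error says, please try running REFRESH TABLE on the table so that the links to the files are updated in the Hive metastore. If that doesn't help, please share your code.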
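For reference, here is a rough sketch of how that refresh might be issued from PySpark. It assumes `spark` is an existing SparkSession (predefined in Databricks notebooks); the helper name, the table name, and the path are placeholders, not values from this thread. `spark.catalog.refreshTable` is the programmatic counterpart of the SQL `REFRESH TABLE` command, and `spark.catalog.refreshByPath` is the path-based variant for data that is read directly from files.

```python
def invalidate_spark_file_cache(spark, table_name=None, path=None):
    """Ask Spark to drop cached data and metadata for a table and/or a path.

    `spark` is an existing SparkSession; `table_name` and `path` are
    placeholders supplied by the caller.
    """
    if table_name is None and path is None:
        raise ValueError("pass table_name, path, or both")
    if table_name is not None:
        # Same effect as running the SQL command: REFRESH TABLE <table_name>
        spark.catalog.refreshTable(table_name)
    if path is not None:
        # Invalidates cached data and metadata for DataFrames that read from this path.
        spark.catalog.refreshByPath(path)
```

In a notebook this might be called as `invalidate_spark_file_cache(spark, path="/path")` just before re-running the read. A refresh only clears stale listings; it cannot bring back a file that has actually been deleted or moved.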

rpshgupta · ‎06-20-2022

@Hubert Dudek There is no table. I am just writing/reading parquet files.

Hubert_Dudek1 · ‎06-28-2022

Please share your code. Then we will be able to help.

Kaniz · ‎06-27-2022

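Hi @Rupesh gupta, the example below uses the read method, together with the parquet format of the resulting DataFrameReader, to read the Parquet file at the specified location into a DataFrame and then display the DataFrame's contents. You can read your parquet files this way.

parquetDF = spark.read.format("parquet").load("/path")
parquetDF.show(truncate=False)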
