Reading percentage values in Spark (no casting) - Databricks - 34122

sarvesh · ‎12-01-2021

I have an xlsx file with a single column:

percentage

30%

40%

50%

-10%

0.00%

0%

0.10%

110%

99.99%

99.98%

-99.99%

-99.98%

When I read this file using Apache Spark, the output I get is:

+----------+
|percentage|
+----------+
|       0.3|
|       0.4|
|       0.5|
|      -0.1|
|       0.0|
|     0.001|
|       1.1|
|    0.9999|
|    0.9998|
+----------+

Expected output:

+----------+
|percentage|
+----------+
|       30%|
|       40%|
|       50%|
|      -10%|
|     0.00%|
|        0%|
|     0.10%|
|      110%|
|    99.99%|
|    99.98%|
+----------+

My code:

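import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .appName("trimTest")
  .master("local[*]")
  .getOrCreate()

// Read the workbook with the spark-excel data source; the first row is the header
val df = spark.read
  .format("com.crealytics.spark.excel")
  .option("header", "true")
  .option("maxRowsInMemory", 1000)
  .option("inferSchema", "true")
  .load("data/percentage.xlsx")

df.printSchema()

df.show(10)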

I don't want to use casting or set inferSchema to false. I want a way to read the percentage values as percentages, not as doubles or strings.

Hubert_Dudek1 · ‎12-01-2021

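The values in the output are actually correct, because that is how they are stored in Excel (what you see in Excel is just the cell formatting). The same holds in Spark: 100% = 1.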

If you want to display percentages, for example in a dashboard, you just need to concatenate the % sign.

.withColumn("rate", concat((col("rate") * 100).cast("int"), lit("%")))
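Note that casting to int truncates, so a stored 0.9999 would be shown as 99% and 0.001 as 0%. A minimal sketch of a variant that keeps two decimals, assuming the column is called rate as in the snippet above. The names withPercentLabel and rate_label are hypothetical and not part of Spark; format_number and concat are standard functions in org.apache.spark.sql.functions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, format_number, lit}

object PercentLabel {
  // Hypothetical helper: adds a display-only string column next to the
  // numeric one, so the original double stays available for arithmetic.
  def withPercentLabel(df: DataFrame, source: String, target: String, decimals: Int = 2): DataFrame =
    df.withColumn(target, concat(format_number(col(source) * 100, decimals), lit("%")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("percentLabel").master("local[*]").getOrCreate()
    import spark.implicits._

    // Small in-memory stand-in for the Excel column.
    val df = Seq(0.3, -0.1, 0.001, 0.9999).toDF("rate")
    withPercentLabel(df, "rate", "rate_label").show()

    spark.stop()
  }
}
```

Keep in mind that format_number also inserts thousands separators, and that the label is a string column, so it is only suitable for display.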

sarvesh · ‎12-01-2021

Casting is not what I want. Suppose I get a large Excel file with millions of rows; casting would be super slow.

werners1 · ‎12-01-2021

Not necessarily. Millions of rows is not that much. For Excel it is, but not for Spark.
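One way to check this on your own cluster is to run a formatting expression over a synthetic column of a few million values. The sketch below is only an illustration under assumptions: the data is random rather than read from Excel, labelledCount and rate_label are hypothetical names, and the timing it prints will vary by machine and does not include the cost of the Excel read itself.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, format_number, lit, rand}

object CastCostSketch {
  // Hypothetical helper: builds `rows` random fractions, turns each one
  // into a percent label, and counts the labels.
  def labelledCount(spark: SparkSession, rows: Long): Long = {
    val rates = spark.range(rows).select(rand(42L).as("rate"))
    val labelled = rates.withColumn(
      "rate_label",
      concat(format_number(col("rate") * 100, 2), lit("%")))
    // The filter depends on the label, so every label is computed before counting.
    labelled.filter(col("rate_label").endsWith("%")).count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("castCost").master("local[*]").getOrCreate()

    val start = System.nanoTime()
    val n = labelledCount(spark, 5000000L)
    val seconds = (System.nanoTime() - start) / 1e9
    println(f"labelled $n%d rows in $seconds%.1f s")

    spark.stop()
  }
}
```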

werners1 · ‎12-01-2021

Indeed. This is how Excel stores percentages. What you see is only the cell formatting.
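The separation between the stored number and its display can be sketched in plain Scala. This is an illustration, not Excel's own format engine: Cell and display are hypothetical names, and the patterns are java.text.DecimalFormat patterns chosen to resemble Excel's percent formats.

```scala
import java.text.{DecimalFormat, DecimalFormatSymbols}
import java.util.Locale

object CellFormatSketch {
  // Hypothetical model of a spreadsheet cell: the stored value and the
  // display pattern are two separate pieces of information.
  final case class Cell(value: Double, pattern: String)

  // A '%' in a DecimalFormat pattern multiplies the value by 100 for display.
  def display(cell: Cell): String =
    new DecimalFormat(cell.pattern, DecimalFormatSymbols.getInstance(Locale.ROOT))
      .format(cell.value)

  def main(args: Array[String]): Unit = {
    val cells = Seq(
      Cell(0.3, "0%"),
      Cell(0.0, "0.00%"),
      Cell(0.001, "0.00%"),
      Cell(0.9999, "0.00%"))
    cells.foreach(c => println(s"stored=${c.value} shown=${display(c)}"))
  }
}
```

A reader that only picks up the stored value, as in the output shown in the question, sees 0.3 and 0.001; the trailing % and the number of decimals live in the pattern.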

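Databricks notebooks do not have the possibility (?) to format the output.

But it is easy to use a BI tool on top of Databricks, where you can change the formatting.

That is, in my opinion, how it should be done.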