Solved: Problem with groupBy - Page 2 - Databricks - 32202

Braxx · ‎01-05-2022

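I want to group a DataFrame by "Product" and "Market" and aggregate the remaining columns, which are specified in col_list. The real list has many more columns, but to keep things simple let's take the example below.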

Unfortunately, I am getting this error:

"TypeError: unhashable type: 'Column'"

on the line with exprs

col_list = ["Value", "Units"]
exprs = {sum(x).alias(x) for x in col_list}
df2 = df1.groupBy("Product", "Market").agg(exprs)

TIA
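
A note on where the error likely comes from: the curly braces in the exprs line build a set, and a set has to hash every element, while the message says Column objects cannot be hashed. The sketch below reproduces the same kind of failure in plain Python with a hypothetical stand-in class, FakeColumn (not part of PySpark). It assumes the cause is the usual Python rule that a class overloading == without defining __hash__ becomes unhashable. A list comprehension has no hashing requirement, so it works.

```python
class FakeColumn:
    """Hypothetical stand-in for a column expression (not a PySpark class)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Overloading == without defining __hash__ makes instances unhashable.
        return ("eq", self.name, other)


col_list = ["Value", "Units"]

try:
    # Curly braces build a set, so every element must be hashable.
    exprs_set = {FakeColumn(x) for x in col_list}
    set_error = None
except TypeError as err:
    set_error = str(err)

# Square brackets build a list, which never hashes its elements.
exprs_list = [FakeColumn(x) for x in col_list]

print(set_error)
print([c.name for c in exprs_list])
```

The set comprehension fails with an "unhashable type" message, while the list version keeps both expressions in order.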

Pholo · ‎01-10-2022

Hi @Shivers Robert,

Try using something like this:

import pyspark.sql.functions as F

def year_sum(year, column_year, column_sum):
    # Keep column_sum only for rows of the given year, otherwise null
    return F.when(F.col(column_year) == year, F.col(column_sum)).otherwise(F.lit(None))

display(df.select(*[F.sum(year_sum(i, "year", "your_column_variable")).alias(str(i)) for i in [2018, 2019]]))

#### You can also use the pivot method
display(df.groupby(F.lit("fake")).pivot("year").agg(F.sum("your_column_variable")).drop("fake"))

Let me know if it works.
