pyspark.pandas.DataFrame.transpose

DataFrame.transpose() → pyspark.pandas.frame.DataFrame

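Transpose index and columns.

Reflect the DataFrame over its main diagonal by writing rows as columns and vice versa. The property T is an accessor to the method transpose().

Note

This method relies on an expensive operation because of the nature of big data. Internally it has to generate one row for each value and then group twice, which is a huge operation. To prevent misuse, the input length is limited by the 'compute.max_rows' option, and a ValueError is raised when the DataFrame has more rows than that limit.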

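    >>> import pyspark.pandas as ps
    >>> from pyspark.pandas.config import option_context
    >>> with option_context('compute.max_rows', 1000):
    ...     ps.DataFrame({'a': range(1001)}).transpose()
    Traceback (most recent call last):
      ...
    ValueError: Current DataFrame has more then the given limit 1000 rows.
    Please set 'compute.max_rows' by using 'pyspark.pandas.config.set_option'
    to retrieve to retrieve more than 1000 rows. Note that, before changing the
    'compute.max_rows', this operation is considerably expensive.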
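
To make the cost described in the note more concrete, here is a rough pure-Python sketch of the "one row per value, then regroup" idea. It is only an illustration on an in-memory dictionary, not the actual pandas-on-Spark implementation; the function name `transpose_records` and the nested-dict table layout are assumptions made for this example.

```python
from collections import defaultdict


def transpose_records(rows):
    """Transpose a table given as {row_label: {column_label: value}}.

    Hypothetical helper for illustration only.
    """
    # Step 1: emit one (row_label, column_label, value) triple per cell.
    triples = [
        (row_label, col_label, value)
        for row_label, cells in rows.items()
        for col_label, value in cells.items()
    ]
    # Step 2: regroup by the old column label, so that the old row
    # labels become the new column labels.
    transposed = defaultdict(dict)
    for row_label, col_label, value in triples:
        transposed[col_label][row_label] = value
    return dict(transposed)


table = {0: {"col1": 1, "col2": 3}, 1: {"col1": 2, "col2": 4}}
result = transpose_records(table)
print(result)  # {'col1': {0: 1, 1: 2}, 'col2': {0: 3, 1: 4}}
```

The intermediate list holds one entry per cell, so it grows with rows × columns. On a distributed dataset the regrouping step also implies shuffling data between workers, which is presumably why the input length is capped by default.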
           

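Returns

DataFrame: The transposed DataFrame.

Notes

Transposing a DataFrame with mixed dtypes results in a homogeneous DataFrame with the coerced dtype. For instance, if int and float values have to be placed in the same column, the column becomes float. If type coercion is not possible, the operation fails.

Also note that the values in the index should be unique, because they become the column names of the result.

In addition, if Spark 2.3 is used, the types should always be exactly the same.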
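
As a small sketch of how the uniqueness requirement could be checked before transposing, the helper below reports labels that occur more than once. It assumes the index labels have already been collected into a plain Python list; the name `duplicated_labels` is hypothetical and not part of the pandas-on-Spark API.

```python
from collections import Counter


def duplicated_labels(labels):
    """Return the labels that occur more than once, in first-seen order."""
    counts = Counter(labels)
    return [label for label, n in counts.items() if n > 1]


# Assumed example input: index labels gathered from a DataFrame.
index_labels = ["a", "b", "a", "c"]
dupes = duplicated_labels(index_labels)
if dupes:
    print(f"Index labels are not unique: {dupes}")
```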

Examples

Square DataFrame with homogeneous dtype

    >>> d1 = {'col1': [1, 2], 'col2': [3, 4]}
    >>> df1 = ps.DataFrame(data=d1, columns=['col1', 'col2'])
    >>> df1
       col1  col2
    0     1     3
    1     2     4
          

    >>> df1_transposed = df1.T.sort_index()
    >>> df1_transposed
          0  1
    col1  1  2
    col2  3  4
          

When the dtype is homogeneous in the original DataFrame, we get a transposed DataFrame with the same dtype:

    >>> df1.dtypes
    col1    int64
    col2    int64
    dtype: object
    >>> df1_transposed.dtypes
    0    int64
    1    int64
    dtype: object
          

Non-square DataFrame with mixed dtypes

    >>> d2 = {'score': [9.5, 8],
    ...       'kids': [0, 0],
    ...       'age': [12, 22]}
    >>> df2 = ps.DataFrame(data=d2, columns=['score', 'kids', 'age'])
    >>> df2
       score  kids  age
    0    9.5     0   12
    1    8.0     0   22
          

    >>> df2_transposed = df2.T.sort_index()
    >>> df2_transposed
              0     1
    age    12.0  22.0
    kids    0.0   0.0
    score   9.5   8.0
          

When the DataFrame has mixed dtypes, we get a transposed DataFrame with the coerced dtype:

    >>> df2.dtypes
    score    float64
    kids       int64
    age        int64
    dtype: object
          

    >>> df2_transposed.dtypes
    0    float64
    1    float64
    dtype: object
          
