pyspark.pandas.read_json

pyspark.pandas.read_json(path: str, lines: bool = True, index_col: Union[str, List[str], None] = None, **options: Any) → pyspark.pandas.frame.DataFrame

Convert a JSON string to a DataFrame.

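Parameters

path : str
    File path.
lines : bool, default True
    Read the file as one JSON object per line. For now, this should always be True.
index_col : str or list of str, optional, default: None
    Index column of the table in Spark.
options : dict
    All other options are passed directly to Spark's data source.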
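
The `lines` parameter refers to the line-delimited JSON layout, in which each line of the file is a complete JSON object. As a minimal sketch of that layout only, the snippet below writes and parses such a file with the Python standard library. It does not use Spark and is not how `read_json` is implemented; the file name `foo.json` and the variable names are hypothetical, chosen for illustration.

```python
import json
import os
import tempfile

# Two records shaped like the rows of the DataFrame used in the examples.
records = [{"col 1": "a", "col 2": "b"}, {"col 1": "c", "col 2": "d"}]

# Write one JSON object per line, the layout that lines=True refers to.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "foo.json")  # hypothetical file name
with open(file_path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Read the file back by parsing each non-empty line independently.
with open(file_path, encoding="utf-8") as f:
    parsed = [json.loads(line) for line in f if line.strip()]
```

Because every line parses on its own, a file in this layout can be split at line boundaries, which is presumably why a distributed reader favors it over a single multi-line JSON document.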

Examples

    >>> df = ps.DataFrame([['a', 'b'], ['c', 'd']],
    ...                   columns=['col 1', 'col 2'])

    >>> df.to_json(path=r'%s/read_json/foo.json' % path, num_files=1)
    >>> ps.read_json(
    ...     path=r'%s/read_json/foo.json' % path
    ... ).sort_values(by="col 1")
      col 1 col 2
    0     a     b
    1     c     d

    >>> df.to_json(path=r'%s/read_json/foo.json' % path, num_files=1, lineSep='___')
    >>> ps.read_json(
    ...     path=r'%s/read_json/foo.json' % path, lineSep='___'
    ... ).sort_values(by="col 1")
      col 1 col 2
    0     a     b
    1     c     d

You can preserve the index across the round trip as shown below.

    >>> df.to_json(path=r'%s/read_json/bar.json' % path, num_files=1, index_col="index")
    >>> ps.read_json(
    ...     path=r'%s/read_json/bar.json' % path, index_col="index"
    ... ).sort_values(by="col 1")
          col 1 col 2
    index
    0         a     b
    1         c     d
