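I get the following error when I try to load an H2O model with mlflow and use it for prediction.
Error:
```
Error: Job with key $03017f00000132d4ffffffff$_990da74b0db027b33cc49d1d90934149 failed with an exception: java.lang.IllegalArgumentException: Test/Validation dataset has no columns in common with the training set
```
Source code: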
```python
# !pip install requests
# !pip install tabulate
# !pip install "colorama>=0.3.8"
# !pip install future
# !pip install -f http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Py.html h2o
# !pip install mlflow
# !wget https://github.com/mlflow/mlflow-example/blob/master/wine-quality.csv

import random
import mlflow
import mlflow.h2o
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator

h2o.init()
wine = h2o.import_file(path="winequality.csv")
r = wine['quality'].runif()
train = wine[r < 0.7]
test = wine[0.3 <= r]

mlflow.set_tracking_uri("https://mlflow.xxxxxxx.cloud/")
mlflow.set_experiment("H2ORandomForestEstimator")

def train_random_forest(ntrees):
    with mlflow.start_run():
        rf = H2ORandomForestEstimator(ntrees=ntrees)
        train_cols = [n for n in wine.col_names if n != "quality"]
        rf.train(train_cols, "quality", training_frame=train, validation_frame=test)

        mlflow.log_param("ntrees", ntrees)
        mlflow.log_metric("rmse", rf.rmse())
        mlflow.log_metric("r2", rf.r2())
        mlflow.log_metric("mae", rf.mae())
        mlflow.h2o.log_model(rf, "model")
        h2o.save_model(rf)
        predict = rf.predict(test)
        print(predict.head())

for ntrees in [10, 20, 50, 100]:
    train_random_forest(ntrees)
```

```python
import mlflow
logged_model = 's3://mlflow-sagemaker/1/66f7c015fe8d4fb080940f3d31003f49/artifacts/model'

# Load model as a PyFuncModel.
loaded_model = mlflow.pyfunc.load_model(logged_model)

# Predict on a Pandas DataFrame.
import pandas as pd
loaded_model.predict(pd.DataFrame(test))
```
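I ran this on Databricks and it worked without any problems. I suggest making sure your wget path is correct: the URL you posted downloads the GitHub HTML page, not the raw CSV. That may be what caused the problem.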
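To check the point above, here is a minimal sketch using only the standard library. It rewrites a GitHub `/blob/` page URL into its `raw.githubusercontent.com` form and applies a rough heuristic to tell whether downloaded text is an HTML page. The helper names `to_raw_github_url` and `looks_like_html` are hypothetical and are not part of mlflow, h2o, or the original post.

```python
from urllib.parse import urlparse


def to_raw_github_url(url):
    """Map a github.com '/blob/' page URL to its raw.githubusercontent.com form."""
    parts = urlparse(url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc == "github.com" and len(segments) >= 5 and segments[2] == "blob":
        owner, repo, _, branch, *rest = segments
        return "https://raw.githubusercontent.com/" + "/".join([owner, repo, branch] + rest)
    return url  # any other URL is returned unchanged


def looks_like_html(head):
    """Rough heuristic: True if the start of a file looks like an HTML page."""
    start = head.lstrip().lower()
    return start.startswith("<!doctype html") or start.startswith("<html")
```

For a file that has already been downloaded, something like `looks_like_html(open("wine-quality.csv").read(512))` should return `False` when the file really is a CSV.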
```
%sh wget https://raw.githubusercontent.com/mlflow/mlflow-example/master/wine-quality.csv
```

```python
import random
import mlflow
import mlflow.h2o
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator

h2o.init()
wine = h2o.import_file(path="./wine-quality.csv")
r = wine['quality'].runif()
train = wine[r < 0.7]
test = wine[0.3 <= r]

def train_random_forest(ntrees):
    with mlflow.start_run():
        rf = H2ORandomForestEstimator(ntrees=ntrees)
        train_cols = [n for n in wine.col_names if n != "quality"]
        rf.train(train_cols, "quality", training_frame=train, validation_frame=test)

        mlflow.log_param("ntrees", ntrees)
        mlflow.log_metric("rmse", rf.rmse())
        mlflow.log_metric("r2", rf.r2())
        mlflow.log_metric("mae", rf.mae())
        mlflow.h2o.log_model(rf, "model")
        h2o.save_model(rf)
        predict = rf.predict(test)
        print(predict.head())

for ntrees in [10, 20, 50, 100]:
    train_random_forest(ntrees)

import mlflow
logged_model = 's3://mlflow-s3-sagemaker/1/58e5371188ed4t649d2d75686a9f155d/artifacts/model'

# Load model as a PyFuncModel.
loaded_model = mlflow.pyfunc.load_model(logged_model)

# Predict on a Pandas DataFrame.
import pandas as pd
loaded_model.predict(pd.DataFrame(test))
```
Error
```
OSError: Job with key $03017f00000132d4ffffffff$_9993cede52525f90fe9729b1ddb24cf7 failed with an exception: java.lang.IllegalArgumentException: Test/Validation dataset has no columns in common with the training set
stacktrace:
java.lang.IllegalArgumentException: Test/Validation dataset has no columns in common with the training set
	at hex.Model.adaptTestForTrain(Model.java:1568)
```
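The message says that the frame passed to `predict` shares no column names with the training set, so it may help to compare column names before calling the loaded model. The sketch below is a plain-Python pre-check. `check_columns` is a hypothetical helper, and the thread does not confirm that lost column names are the cause here. That remains an assumption.

```python
def check_columns(training_columns, input_columns):
    """Return the training columns present in the input, or raise if none are."""
    input_names = {str(c) for c in input_columns}
    shared = [c for c in training_columns if c in input_names]
    if not shared:
        raise ValueError(
            "input has no columns in common with the training set; "
            f"first input columns: {sorted(input_names)[:5]}"
        )
    return shared


# Hypothetical usage with the objects from the thread (not run here):
#   pdf = test.as_data_frame()   # H2OFrame -> pandas DataFrame with named columns
#   check_columns([n for n in wine.col_names if n != "quality"], pdf.columns)
#   loaded_model.predict(pdf)
```

If the check raises, printing the input's column names next to `wine.col_names` should show whether the names were lost or changed during conversion to pandas.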