Solved: How to start an mlflow server with postgres back... - Databricks - 21778

naveen_marthala · ‎05-01-2022

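I am trying out mlflow in Docker containers.

I already have postgres running in Docker. When I start the mlflow server against an empty database, everything works as expected: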

2022/05/01 13:57:45 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2022/05/01 13:57:45 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 451aebb31d03, add metric step
INFO  [alembic.runtime.migration] Running upgrade 451aebb31d03 -> 90e64c465722, migrate user column to tags
INFO  [alembic.runtime.migration] Running upgrade 90e64c465722 -> 181f10493468, allow nulls for metric values
INFO  [alembic.runtime.migration] Running upgrade 181f10493468 -> df50e92ffc5e, Add Experiment Tags Table
INFO  [alembic.runtime.migration] Running upgrade df50e92ffc5e -> 7ac759974ad8, Update run tags with larger limit
INFO  [alembic.runtime.migration] Running upgrade 7ac759974ad8 -> 89d4b8295536, create latest metrics table
INFO  [89d4b8295536_create_latest_metrics_table_py] Migration complete!
INFO  [alembic.runtime.migration] Running upgrade 89d4b8295536 -> 2b4d017a5e9b, add model registry tables to db
INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Adding registered_models and model_versions tables to database.
INFO  [2b4d017a5e9b_add_model_registry_tables_to_db_py] Migration complete!
INFO  [alembic.runtime.migration] Running upgrade 2b4d017a5e9b -> cfd24bdc0731, Update run status constraint with killed
INFO  [alembic.runtime.migration] Running upgrade cfd24bdc0731 -> 0a8213491aaa, drop_duplicate_killed_constraint
INFO  [alembic.runtime.migration] Running upgrade 0a8213491aaa -> 728d730b5ebd, add registered model tags table
INFO  [alembic.runtime.migration] Running upgrade 728d730b5ebd -> 27a6a02d2cf1, add model version tags table
INFO  [alembic.runtime.migration] Running upgrade 27a6a02d2cf1 -> 84291f40a231, add run_link to model_version
INFO  [alembic.runtime.migration] Running upgrade 84291f40a231 -> a8c4a736bde6, allow nulls for run_id
INFO  [alembic.runtime.migration] Running upgrade a8c4a736bde6 -> 39d1c3be5f05, add_is_nan_constraint_for_metrics_tables_if_necessary
INFO  [alembic.runtime.migration] Running upgrade 39d1c3be5f05 -> c48cb773bb87, reset_default_value_for_is_nan_in_metrics_table_for_mysql
INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.

But when I start a new container running the mlflow server against the same database, I get a migration error. Full traceback:

2022/05/01 16:43:28 ERROR mlflow.cli: Error initializing backend store
2022/05/01 16:43:28 ERROR mlflow.cli: Detected out-of-date database schema (found version bd07f7e963c5, but expected c48cb773bb87). Take a backup of your database, then run 'mlflow db upgrade <database_uri>' to migrate your database to the latest schema. NOTE: schema migration may result in database downtime - please consult your database's documentation for more detail.
Traceback (most recent call last):
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/cli.py", line 411, in server
    initialize_backend_stores(backend_store_uri, default_artifact_root)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/server/handlers.py", line 258, in initialize_backend_stores
    _get_tracking_store(backend_store_uri, default_artifact_root)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/server/handlers.py", line 243, in _get_tracking_store
    _tracking_store = _tracking_store_registry.get_store(store_uri, artifact_root)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/tracking/_tracking_service/registry.py", line 39, in get_store
    return self._get_store_with_resolved_uri(resolved_store_uri, artifact_uri)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/tracking/_tracking_service/registry.py", line 49, in _get_store_with_resolved_uri
    return builder(store_uri=resolved_store_uri, artifact_uri=artifact_uri)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/server/handlers.py", line 111, in _get_sqlalchemy_store
    return SqlAlchemyStore(store_uri, artifact_uri)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/store/tracking/sqlalchemy_store.py", line 141, in __init__
    mlflow.store.db.utils._verify_schema(self.engine)
  File "/home/naveend/.local/lib/python3.9/site-packages/mlflow/store/db/utils.py", line 53, in _verify_schema
    raise MlflowException(
mlflow.exceptions.MlflowException: Detected out-of-date database schema (found version bd07f7e963c5, but expected c48cb773bb87). Take a backup of your database, then run 'mlflow db upgrade <database_uri>' to migrate your database to the latest schema. NOTE: schema migration may result in database downtime - please consult your database's documentation for more detail.

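In the future, I plan to move the whole setup to AWS ECS (FARGATE) with AWS RDS postgres. So, when the mlflow server container restarts, I won't be able to empty the database or migrate the schema by running 'mlflow db upgrade <database_uri>', because I will be on serverless containers. How do I get past this limitation so that the mlflow server container can keep restarting and continue using the same postgres?

The server runs in "--serve-artifacts" mode.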

naveen_marthala · ‎06-15-2022

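This is the fix I found, and it has been working without any flaws on my side.

The postgres server must be started first, and only after that should the mlflow server be started. I had been starting the mlflow server while my postgres server was still launching.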
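
The startup ordering described above could be enforced with a small wrapper around the server command. The following is only a rough sketch, not something from the original setup: the function names, the `BACKEND_STORE_URI` variable, and the use of `PGHOST`/`PGPORT` to locate postgres are all assumptions made for illustration. It polls the postgres TCP port and only then hands over to `mlflow server`.

```python
import os
import socket
import sys
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that the port accepts connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def main() -> None:
    # Assumption: these environment variables describe where postgres lives.
    host = os.environ.get("PGHOST", "localhost")
    port = int(os.environ.get("PGPORT", "5432"))
    # Hypothetical variable name holding the backend store URI.
    backend_store_uri = os.environ["BACKEND_STORE_URI"]

    if not wait_for_port(host, port):
        sys.exit(f"postgres at {host}:{port} did not become reachable in time")

    cmd = [
        "mlflow", "server",
        "--backend-store-uri", backend_store_uri,
        "--serve-artifacts",
        "--host", "0.0.0.0",
    ]
    # Replace this process with the mlflow server once postgres is reachable.
    os.execvp(cmd[0], cmd)
```

Calling `main()` from the container's entrypoint script would delay the server until the port opens. An open port does not guarantee that postgres is ready to answer queries, so a stricter readiness check (for example postgres's own `pg_isready` utility) may be the better choice in practice.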

Prabakar · ‎05-01-2022

Hi @Naveen Marthala, I looked around and found the code at the GitHub link.

naveen_marthala · ‎05-01-2022

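Hello @Prabakar Ammeappin, I want to learn how to fix this and launch the mlflow server with a pre-existing schema, and why the server needs to generate a new schema every time. I guess the source code won't help me much. Or am I missing something?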

Kaniz · ‎05-13-2022

Hi @Naveen Marthala, please go through the link to learn more about launching an mlflow server.

Kaniz · ‎05-18-2022

Hi @Naveen Marthala, just a friendly follow-up. Do you still need help, or did the responses above help you find a solution? Please let us know.

How to start an mlflow server with a postgres backend that is already filled with metadata of many experiments?