
Integrating Local Open LLMs (LLM-jp) with MLflow Prompt Engineering UI by ss-13
👉 Japanese Article of this repository
This repository provides a practical guide and working example for integrating the MLflow Prompt Engineering UI with Japanese LLMs that are not officially supported, specifically models from the LLM-jp project.
The MLflow Prompt Engineering UI is an experimental feature that lets users test and optimize prompts through a no-code interface. It supports OpenAI and other hosted models out of the box, but local models, such as those from LLM-jp, require manual setup.
This repository shows how to:
- Use MLflow Model Serving to serve local LLM-jp models via pyfunc.PythonModel
- Expose the local models through MLflow AI Gateway
- Run prompt optimization experiments via Prompt Engineering UI
To run the pipeline:
- Serve local models via pyfunc.PythonModel (see the wrapper sketch after this list):

      python wrap_pyfunc.py

- MLflow Tracing and Streamlit UI:

      mlflow ui --backend-store-uri sqlite:///mlruns/mlflow.db --port 5000
      streamlit run app.py --server.fileWatcherType none

- AI Gateway and Prompt Engineering UI:

      export MLFLOW_DEPLOYMENTS_TARGET="http://127.0.0.1:7000"
      export OPENAI_API_KEY=""  # only needed if you use an OpenAI model for LLM-as-a-judge
      mlflow models serve -m ./saved_model/ --no-conda --port 5001
      mlflow gateway start --config-path config.yaml --port 7000
      mlflow server --port 5000
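The wrapping step is handled by wrap_pyfunc.py in the repository. As a rough, hedged sketch of what a pyfunc.PythonModel wrapper for an LLM-jp model can look like (the model name, generation parameters, and input/output columns below are assumptions, not copied from the actual script):

```python
# Hedged sketch: wrap a Hugging Face LLM-jp model as an MLflow pyfunc model
# so it can be served with `mlflow models serve -m ./saved_model/`.
import mlflow.pyfunc
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "llm-jp/llm-jp-3-3.7b-instruct"  # illustrative; any LLM-jp-3 instruct model

class LLMJpPyfunc(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the tokenizer and model once when the serving process starts.
        self.tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
        self.model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    def predict(self, context, model_input):
        # Assume a DataFrame with a "prompt" column and return one completion per row.
        outputs = []
        for prompt in model_input["prompt"].tolist():
            inputs = self.tokenizer(prompt, return_tensors="pt")
            generated = self.model.generate(**inputs, max_new_tokens=256)
            outputs.append(self.tokenizer.decode(generated[0], skip_special_tokens=True))
        return pd.DataFrame({"candidates": outputs})

if __name__ == "__main__":
    # Save the wrapped model locally; this directory is what `mlflow models serve` points at.
    mlflow.pyfunc.save_model(path="./saved_model", python_model=LLMJpPyfunc())
```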
The MLflow client can interface with a SQLAlchemy-compatible database (e.g., SQLite, PostgreSQL, MySQL) as the backend store. Saving metadata to a database gives you cleaner management of your experiment data while skipping the effort of setting up a tracking server.
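Because the client talks to the backend store directly, the same code can log to SQLite during development and to a shared database later. A hedged example (the experiment name and PostgreSQL URI are placeholders; the SQLite path matches the `mlflow ui` command above):

```python
# Hedged sketch: log runs straight to a SQLAlchemy backend, no tracking server needed.
import mlflow

mlflow.set_tracking_uri("sqlite:///mlruns/mlflow.db")
# mlflow.set_tracking_uri("postgresql://user:password@localhost:5432/mlflow")  # alternative backend

mlflow.set_experiment("llm-jp-prompt-engineering")  # placeholder experiment name
with mlflow.start_run():
    mlflow.log_param("model", "llm-jp/llm-jp-3-3.7b-instruct")
    mlflow.log_metric("latency_sec", 1.23)
```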
graph TD;
A[Hugging Face Hub Model Card] -->|Convert to| B[MLflow PyFunc Model];
B -->|Save Locally| C[Local MLflow Model Storage];
C -->|Serve with MLflow| D[MLflow Model Serving];
D -->|Expose via AI Gateway| E[MLflow AI Gateway Endpoint];
E -->|Use in| F[MLflow Prompt Engineering UI];
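Once the gateway is running, the endpoint can be sanity-checked from Python before it is used in the Prompt Engineering UI. A hedged sketch (the endpoint name `completions` is an assumption and should match whatever config.yaml declares):

```python
# Hedged sketch: query the MLflow AI Gateway endpoint that fronts the local LLM-jp model.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("http://127.0.0.1:7000")  # same URL as MLFLOW_DEPLOYMENTS_TARGET
response = client.predict(
    endpoint="completions",  # assumption; use the endpoint name defined in config.yaml
    inputs={"prompt": "日本の首都はどこですか？", "max_tokens": 128},
)
print(response)
```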
Japanese prompt input may cause encoding issues in Prompt Engineering UI.
Workaround: use ASCII-only prompts or sanitize input manually.
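One hedged way to apply the manual sanitization workaround is to NFKC-normalize the prompt and drop anything that cannot be encoded as ASCII; this is a generic pre-processing sketch, not code from this repository:

```python
# Hedged sketch of the manual workaround: normalize the prompt and drop
# characters that cannot be encoded as ASCII before using it in the UI.
import unicodedata

def sanitize_prompt(prompt: str) -> str:
    normalized = unicodedata.normalize("NFKC", prompt)
    return normalized.encode("ascii", errors="ignore").decode("ascii")

print(sanitize_prompt("Ｐｌｅａｓｅ summarize in English."))  # -> "Please summarize in English."
```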
- LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
- LLM-jp: a cross-organizational project developing large language models strong at Japanese (Japanese-language overview)
- MLflow Tracing for LLM Observability
- LLM 勉強会 (LLM study group)
- Hugging Face LLM-jp
- Prompt Engineering UI (Experimental)
- MLflow AI Gateway (Experimental)
- LLM-jp-3 Fine-tuned Models
- llm-jp/llm-jp-3-150m-instruct2
- llm-jp/llm-jp-3-150m-instruct3
- llm-jp/llm-jp-3-440m-instruct2
- llm-jp/llm-jp-3-440m-instruct3
- llm-jp/llm-jp-3-980m-instruct2
- llm-jp/llm-jp-3-980m-instruct3
- llm-jp/llm-jp-3-1.8b
- llm-jp/llm-jp-3-1.8b-instruct
- llm-jp/llm-jp-3-1.8b-instruct2
- llm-jp/llm-jp-3-1.8b-instruct3
- llm-jp/llm-jp-3-3.7b
- llm-jp/llm-jp-3-3.7b-instruct
- llm-jp/llm-jp-3-3.7b-instruct2
- llm-jp/llm-jp-3-3.7b-instruct3
- llm-jp/llm-jp-3-7.2b-instruct
- llm-jp/llm-jp-3-7.2b-instruct2
- llm-jp/llm-jp-3-7.2b-instruct3
ss-13
I’ve been experimenting with MLflow’s Prompt Engineering UI, which lets you do no-code prompt tuning across multiple LLMs. While it officially supports models like OpenAI out of the box, I wanted to try it with Japanese open-source models from the LLM-jp project.
This repo shows how to serve these models locally using MLflow’s pyfunc model interface, expose them via the MLflow AI Gateway, and compare prompt performance through the UI.
It includes a working setup with:
- Hugging Face LLM-jp models (e.g. llm-jp-3-3.7b-instruct)
- MLflow Model Serving
- MLflow Gateway
- Prompt Engineering UI
- Streamlit UI for experiment tracking (a tracing sketch follows this list)
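As a hedged illustration of the tracing piece, a generation call to the locally served model can be wrapped with MLflow Tracing so it shows up alongside the experiments (the function name, payload shape, and ports below are assumptions based on the commands in the README):

```python
# Hedged sketch: trace a call to the pyfunc model served by `mlflow models serve`.
import mlflow
import requests

mlflow.set_tracking_uri("sqlite:///mlruns/mlflow.db")

@mlflow.trace
def generate(prompt: str) -> dict:
    # The standard MLflow scoring endpoint accepts dataframe_split payloads.
    resp = requests.post(
        "http://127.0.0.1:5001/invocations",
        json={"dataframe_split": {"columns": ["prompt"], "data": [[prompt]]}},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()

print(generate("自己紹介をしてください。"))
```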
GitHub: https://github.com/suzuki-2001/mlflow-llm-jp-integration
Japanese article explaining the project: https://zenn.dev/shosuke_13/articles/21d304b5f80e00