👋 Join our WeChat or Discord communities
You can use Claude Code with GLM Coding Plan and enter the following prompt to quickly deploy this project:
Access the documentation and install AutoGLM for me https://raw.githubusercontent.com/zai-org/Open-AutoGLM/refs/heads/main/README_en.md
Phone Agent is a mobile intelligent assistant framework built on AutoGLM. It understands phone screen content in a multimodal manner and helps users complete tasks through automated operations. The system controls devices via ADB (Android Debug Bridge), perceives screens using vision-language models, and generates and executes operation workflows through intelligent planning. Users simply describe their needs in natural language, such as "Open eBay and search for wireless earphones." and Phone Agent will automatically parse the intent, understand the current interface, plan the next action, and complete the entire workflow. The system also includes a sensitive operation confirmation mechanism and supports manual takeover during login or verification code scenarios. Additionally, it provides remote ADB debugging capabilities, allowing device connection via WiFi or network for flexible remote control and development.
⚠️ This project is for research and learning purposes only. It is strictly prohibited to use for illegal information acquisition, system interference, or any illegal activities. Please carefully review the Terms of Use.
| Model | Download Links |
|---|---|
| AutoGLM-Phone-9B | 🤗 Hugging Face 🤖 ModelScope |
| AutoGLM-Phone-9B-Multilingual | 🤗 Hugging Face 🤖 ModelScope |
AutoGLM-Phone-9B is optimized for Chinese mobile applications, while AutoGLM-Phone-9B-Multilingual supports English scenarios and is suitable for applications containing English or other language content.
Python 3.10 or higher is recommended.
Choose the appropriate tool based on your device type:
MacOS configuration: In Terminal or any command line tool
# Assuming the extracted directory is ~/Downloads/platform-tools. Adjust the command if different.
export PATH=${PATH}:~/Downloads/platform-tools
Windows configuration: Refer to third-party tutorials for configuration.
MacOS/Linux configuration:
# Assuming the extracted directory is ~/Downloads/harmonyos-sdk/toolchains. Adjust according to actual path.
export PATH=${PATH}:~/Downloads/harmonyos-sdk/toolchains
Windows configuration: Add the HDC tool directory to the system PATH environment variable
Settings > About Phone > Build Number and tap it rapidly about 10 times until a popup shows "Developer mode has been enabled." This may vary slightly between phones; search online for tutorials if you can't find it.Settings > Developer Options > USB Debugging and enable itadb devices to see if device information appears. If not, the connection has failed.Please carefully check the relevant permissions

Note: HarmonyOS devices use native input methods and do not require ADB Keyboard.
If you are using an Android device:
Download the installation package and install it on the corresponding Android device.
Note: After installation, you need to enable ADB Keyboard in Settings > Input Method or Settings > Keyboard List for it to work.(or use command adb shell ime enable com.android.adbkeyboard/.AdbIMEHow-to-use)
pip install -r requirements.txt pip install -e .
Make sure your USB cable supports data transfer, not just charging.
Ensure ADB is installed and connect the device via USB cable:
# Check connected devices
adb devices
# Output should show your device, e.g.:
# List of devices attached
# emulator-5554 device
Make sure your USB cable supports data transfer, not just charging.
Ensure HDC is installed and connect the device via USB cable:
# Check connected devices
hdc list targets
# Output should show your device, e.g.:
# 7001005458323933328a01bce01c2500
You can choose to deploy the model service yourself or use a third-party model service provider.
If you don't want to deploy the model yourself, you can use the following third-party services that have already deployed our model:
1. z.ai
--base-url: https://api.z.ai/api/paas/v4--model: autoglm-phone-multilingual--apikey: Apply for your own API key on the z.ai platform2. Novita AI
--base-url: https://api.novita.ai/openai--model: zai-org/autoglm-phone-9b-multilingual--apikey: Apply for your own API key on the Novita AI platform3. Parasail
--base-url: https://api.parasail.io/v1--model: parasail-auto-glm-9b-multilingual--apikey: Apply for your own API key on the Parasail platformExample usage with third-party services:
# Using z.ai
python main.py --base-url https://api.z.ai/api/paas/v4 --model "autoglm-phone-multilingual" --apikey "your-z-ai-api-key" "Open Chrome browser"
# Using Novita AI
python main.py --base-url https://api.novita.ai/openai --model "zai-org/autoglm-phone-9b-multilingual" --apikey "your-novita-api-key" "Open Chrome browser"
# Using Parasail
python main.py --base-url https://api.parasail.io/v1 --model "parasail-auto-glm-9b-multilingual" --apikey "your-parasail-api-key" "Open Chrome browser"
If you prefer to deploy the model locally or on your own server:
For Model Deployment section in requirements.txt.python3 -m vllm.entrypoints.openai.api_server \ --served-model-name autoglm-phone-9b-multilingual \ --allowed-local-media-path / \ --mm-encoder-tp-mode data \ --mm_processor_cache_type shm \ --mm_processor_kwargs "{\"max_pixels\":5000000}" \ --max-model-len 25480 \ --chat-template-content-format string \ --limit-mm-per-prompt "{\"image\":10}" \ --model zai-org/AutoGLM-Phone-9B-Multilingual \ --port 8000
This model has the same architecture as GLM-4.1V-9B-Thinking. For detailed information about model deployment, you can also check GLM-V for model deployment and usage guides.
After successful startup, the model service will be accessible at http://localhost:8000/v1. If you deploy the model on a remote server, access it using that server's IP address.
After starting the model service, you can use the following command to verify the deployment:
python scripts/check_deployment_en.py --base-url http://localhost:8000/v1 --model autoglm-phone-9b-multilingual
If using a third-party model service:
# Novita AI
python scripts/check_deployment_en.py --base-url https://api.novita.ai/openai --model zai-org/autoglm-phone-9b-multilingual --apikey your-novita-api-key
# Parasail
python scripts/check_deployment_en.py --base-url https://api.parasail.io/v1 --model parasail-auto-glm-9b-multilingual --apikey your-parasail-api-key
Upon successful execution, the script will display the model's inference result and token statistics, helping you confirm whether the model deployment is working correctly.
Set the --base-url and --model parameters according to your deployed model. For example:
# Android device - Interactive mode
python main.py --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual"
# Android device - Specify task
python main.py --base-url http://localhost:8000/v1 "Open Maps and search for nearby coffee shops"
# HarmonyOS device - Interactive mode
python main.py --device-type hdc --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual"
# HarmonyOS device - Specify task
python main.py --device-type hdc --base-url http://localhost:8000/v1 "Open Maps and search for nearby coffee shops"
# Use API key for authentication
python main.py --apikey sk-xxxxx
# Use English system prompt
python main.py --lang en --base-url http://localhost:8000/v1 "Open Chrome browser"
# List supported apps (Android)
python main.py --list-apps
# List supported apps (HarmonyOS)
python main.py --device-type hdc --list-apps
from phone_agent import PhoneAgent
from phone_agent.model import ModelConfig
# Configure model
model_config = ModelConfig(
base_url="http://localhost:8000/v1",
model_name="autoglm-phone-9b-multilingual",
)
# Create Agent
agent = PhoneAgent(model_config=model_config)
# Execute task
result = agent.run("Open eBay and search for wireless earphones")
print(result)
Phone Agent supports remote ADB/HDC debugging via WiFi/network, allowing device control without a USB connection.
Ensure the phone and computer are on the same WiFi network, as shown below:

Ensure the phone and computer are on the same WiFi network:
Settings > System & Updates > Developer OptionsUSB Debugging and Wireless Debugging# Android device - Connect via WiFi, replace with the IP address and port shown on your phone
adb connect 192.168.1.100:5555
# Verify connection
adb devices
# Should show: 192.168.1.100:5555 device
# HarmonyOS device - Connect via WiFi
hdc tconn 192.168.1.100:5555
# Verify connection
hdc list targets
# Should show: 192.168.1.100:5555
# List all connected devices
adb devices
# Connect to remote device
adb connect 192.168.1.100:5555
# Disconnect specific device
adb disconnect 192.168.1.100:5555
# Execute task on specific device
python main.py --device-id 192.168.1.100:5555 --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual" "Open TikTok and browse videos"
# List all connected devices
hdc list targets
# Connect to remote device
hdc tconn 192.168.1.100:5555
# Disconnect specific device
hdc tdisconn 192.168.1.100:5555
# Execute task on specific device
python main.py --device-type hdc --device-id 192.168.1.100:5555 --base-url http://localhost:8000/v1 --model "autoglm-phone-9b-multilingual" "Open TikTok and browse videos"
from phone_agent.adb import ADBConnection, list_devices
# Create connection manager
conn = ADBConnection()
# Connect to remote device
success, message = conn.connect("192.168.1.100:5555")
print(f"Connection status: {message}")
# List connected devices
devices = list_devices()
for device in devices:
print(f"{device.device_id} - {device.connection_type.value}")
# Enable TCP/IP on USB device
success, message = conn.enable_tcpip(5555)
ip = conn.get_device_ip()
print(f"Device IP: {ip}")
# Disconnect
conn.disconnect("192.168.1.100:5555")
from phone_agent.hdc import HDCConnection, list_devices
# Create connection manager
conn = HDCConnection()
# Connect to remote device
success, message = conn.connect("192.168.1.100:5555")
print(f"Connection status: {message}")
# List connected devices
devices = list_devices()
for device in devices:
print(f"{device.device_id} - {device.connection_type.value}")
# Disconnect
conn.disconnect("192.168.1.100:5555")
Connection Refused:
adb tcpip 5555Connection Dropped:
--connect to reconnectMultiple Devices:
--device-id to specify which device to use--list-devices to view all connected devicesThe system provides both Chinese and English prompts, switchable via the --lang parameter:
--lang cn - Chinese prompt (default), config file: phone_agent/config/prompts_zh.py--lang en - English prompt, config file: phone_agent/config/prompts_en.pyYou can directly modify the corresponding config files to enhance model capabilities in specific domains or disable certain apps by injecting app names.
| Variable | Description | Default Value |
|---|---|---|
PHONE_AGENT_BASE_URL | Model API URL | http://localhost:8000/v1 |
PHONE_AGENT_MODEL | Model name | autoglm-phone-9b |
PHONE_AGENT_API_KEY | API key for authentication | EMPTY |
PHONE_AGENT_MAX_STEPS | Maximum steps per task | 100 |
PHONE_AGENT_DEVICE_ID | ADB/HDC device ID | (auto-detect) |
PHONE_AGENT_DEVICE_TYPE | Device type (adb or hdc) | adb |
PHONE_AGENT_LANG | Language (cn or en) | en |
from phone_agent.model import ModelConfig
config = ModelConfig(
base_url="http://localhost:8000/v1",
api_key="EMPTY", # API key (if required)
model_name="autoglm-phone-9b-multilingual", # Model name
max_tokens=3000, # Maximum output tokens
temperature=0.1, # Sampling temperature
frequency_penalty=0.2, # Frequency penalty
)
from phone_agent.agent import AgentConfig
config = AgentConfig(
max_steps=100, # Maximum steps per task
device_id=None, # ADB device ID (None for auto-detect)
lang="en", # Language: cn (Chinese) or en (English)
verbose=True, # Print debug info (including thinking process and actions)
)
When verbose=True, the Agent outputs detailed information at each step:
================================================== 💭 Thinking Process: -------------------------------------------------- Currently on the system desktop, need to launch eBay app first -------------------------------------------------- 🎯 Executing Action: { "_metadata": "do", "action": "Launch", "app": "eBay" } ================================================== ... (continues to next step after executing action) ================================================== 💭 Thinking Process: -------------------------------------------------- eBay is now open, need to tap the search box -------------------------------------------------- 🎯 Executing Action: { "_metadata": "do", "action": "Tap", "element": [499, 182] } ================================================== 🎉 ================================================ ✅ Task Completed: Successfully opened eBay and searched for 'wireless earphones' ==================================================
This allows you to clearly see the AI's reasoning process and specific operations at each step.
Phone Agent supports 50+ mainstream Chinese applications:
| Category | Apps |
|---|---|
| Social & Messaging | X, Tiktok, WhatsApp, Telegram, FacebookMessenger, GoogleChat, Quora, Reddit, Instagram |
| Productivity & Office | Gmail, GoogleCalendar, GoogleDrive, GoogleDocs, GoogleTasks, Joplin |
| Life, Shopping & Finance | Amazon shopping, Temu, Bluecoins, Duolingo, GoogleFit, ebay |
| Utilities & Media | GoogleClock, Chrome, GooglePlayStore, GooglePlayBooks, FilesbyGoogle |
| Travel & Navigation | GoogleMaps, Booking.com, Trip.com, Expedia, OpenTracks |
Run python main.py --list-apps to see the complete list.
Phone Agent supports 60+ HarmonyOS native apps and system apps:
| Category | Apps |
|---|---|
| Social & Messaging | WeChat, QQ, Weibo, Feishu, Enterprise WeChat |
| E-commerce & Shopping | Taobao, JD.com, Pinduoduo, Vipshop, Dewu, Xianyu |
| Food & Delivery | Meituan, Meituan Waimai, Dianping, Haidilao |
| Travel & Navigation | 12306, Didi, Tongcheng, Amap, Baidu Maps |
| Video & Entertainment | Bilibili, Douyin, Kuaishou, Tencent Video, iQIYI, Mango TV |
| Music & Audio | QQ Music, Qishui Music, Ximalaya |
| Lifestyle & Social | Xiaohongshu, Zhihu, Toutiao, 58.com, China Mobile |
| AI & Tools | Doubao, WPS, UC Browser, CamScanner, Meitu |
| System Apps | Browser, Calendar, Camera, Clock, Cloud, File Manager, Gallery, Contacts, SMS, Settings |
| Huawei Services | AppGallery, Music, Video, Books, Themes, Weather |
Run python main.py --device-type hdc --list-apps to see the complete list.
The Agent can perform the following actions:
| Action | Description |
|---|---|
Launch | Launch an app |
Tap | Tap at specified coordinates |
Type | Input text |
Swipe | Swipe the screen |
Back | Go back to previous page |
Home | Return to home screen |
Long Press | Long press |
Double Tap | Double tap |
Wait | Wait for page to load |
Take_over | Request manual takeover (login/captcha) |
Handle sensitive operation confirmation and manual takeover:
def my_confirmation(message: str) -> bool:
"""Sensitive operation confirmation callback"""
return input(f"Confirm execution of {message}? (y/n): ").lower() == "y"
def my_takeover(message: str) -> None:
"""Manual takeover callback"""
print(f"Please complete manually: {message}")
input("Press Enter after completion...")
agent = PhoneAgent(
confirmation_callback=my_confirmation,
takeover_callback=my_takeover,
)
Check the examples/ directory for more usage examples:
basic_usage.py - Basic task executionDevelopment requires dev dependencies:
pip install -e ".[dev]"
pytest tests/
phone_agent/ ├── __init__.py # Package exports ├── agent.py # PhoneAgent main class ├── adb/ # ADB utilities │ ├── connection.py # Remote/local connection management │ ├── screenshot.py # Screen capture │ ├── input.py # Text input (ADB Keyboard) │ └── device.py # Device control (tap, swipe, etc.) ├── actions/ # Action handling │ └── handler.py # Action executor ├── config/ # Configuration │ ├── apps.py # Supported app mappings │ ├── prompts_zh.py # Chinese system prompts │ └── prompts_en.py # English system prompts └── model/ # AI model client └── client.py # OpenAI-compatible client
Here are some common issues and their solutions:
Try resolving by restarting the ADB service:
adb kill-server adb start-server adb devices
If the device is still not recognized, please check:
Some devices require both debugging options to be enabled:
Please check in Settings → Developer Options that both options are enabled.
This usually means the app is displaying a sensitive page (payment, password, banking apps). The Agent will automatically detect this and request manual takeover.
Error message like UnicodeEncodeError gbk code
Solution: Add the environment variable before running the code: PYTHONIOENCODING=utf-8
Error like: EOF when reading a line
Solution: Use non-interactive mode to specify tasks directly, or switch to a TTY-mode terminal application.
If you find our work helpful, please cite the following papers:
@article{liu2024autoglm, title={Autoglm: Autonomous foundation agents for guis}, author={Liu, Xiao and Qin, Bo and Liang, Dongzhu and Dong, Guang and Lai, Hanyu and Zhang, Hanchen and Zhao, Hanlin and Iong, Iat Long and Sun, Jiadai and Wang, Jiaqi and others}, journal={arXiv preprint arXiv:2411.00820}, year={2024} } @article{xu2025mobilerl, title={MobileRL: Online Agentic Reinforcement Learning for Mobile GUI Agents}, author={Xu, Yifan and Liu, Xiao and Liu, Xinghan and Fu, Jiaqi and Zhang, Hanchen and Jing, Bohao and Zhang, Shudan and Wang, Yuting and Zhao, Wenyi and Dong, Yuxiao}, journal={arXiv preprint arXiv:2509.18119}, year={2025} }
This section is specifically designed for AI assistants (such as Claude Code) to automate the deployment of Open-AutoGLM.
If you are a human reader, you can skip this section and follow the documentation above.
Open-AutoGLM is a phone agent framework:
The architecture consists of two parts:
Before starting deployment, confirm the following items with the user:
Ask the user explicitly: Do you already have access to an AutoGLM model service?
Option A: Use an already-deployed model service (Recommended)
http://xxx.xxx.xxx.xxx:8000/v1)--base-url parameterOption B: Deploy model locally (High system requirements)
# 1. Install ADB tools
# MacOS:
brew install android-platform-tools
# Or download manually: https://developer.android.com/tools/releases/platform-tools
# Windows: Download, extract, and add to PATH environment variable
# 2. Verify ADB installation
adb version
# Should output version information
# 3. Connect phone and verify
# Connect phone via USB cable, tap "Allow USB debugging" on phone
adb devices
# Should output device list, e.g.:
# List of devices attached
# XXXXXXXX device
If adb devices shows empty list or unauthorized:
adb kill-server && adb start-server and retry# 1. Clone repository (if not already cloned)
git clone https://github.com/zai-org/Open-AutoGLM.git
cd Open-AutoGLM
# 2. Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
pip install -e .
Note: No need to clone model repository; models are called via API.
If user chooses Option A (using already-deployed model):
You can use the following third-party model services:
z.ai
--base-url: https://api.z.ai/api/paas/v4--model: autoglm-phone-multilingual--apikey: Apply for your own API key on the z.ai platformNovita AI
--base-url: https://api.novita.ai/openai--model: zai-org/autoglm-phone-9b-multilingual--apikey: Apply for your own API key on the Novita AI platformParasail
--base-url: https://api.parasail.io/v1--model: parasail-auto-glm-9b-multilingual--apikey: Apply for your own API key on the Parasail platformExample usage:
# Using z.ai
python main.py --base-url https://api.z.ai/api/paas/v4 --model "autoglm-phone-multilingual" --apikey "your-z-ai-api-key" "Open Chrome browser"
# Using Novita AI
python main.py --base-url https://api.novita.ai/openai --model "zai-org/autoglm-phone-9b-multilingual" --apikey "your-novita-api-key" "Open Chrome browser"
# Using Parasail
python main.py --base-url https://api.parasail.io/v1 --model "parasail-auto-glm-9b-multilingual" --apikey "your-parasail-api-key" "Open Chrome browser"
Or use the URL provided by the user directly and skip local model deployment steps.
If user chooses Option B (deploy model locally):
# 1. Install vLLM
pip install vllm
# 2. Start model service (will auto-download model, ~20GB)
python3 -m vllm.entrypoints.openai.api_server \
--served-model-name autoglm-phone-9b-multilingual \
--allowed-local-media-path / \
--mm-encoder-tp-mode data \
--mm_processor_cache_type shm \
--mm_processor_kwargs "{\"max_pixels\":5000000}" \
--max-model-len 25480 \
--chat-template-content-format string \
--limit-mm-per-prompt "{\"image\":10}" \
--model zai-org/AutoGLM-Phone-9B-Multilingual \
--port 8000
# Model service URL: http://localhost:8000/v1
# Execute in the Open-AutoGLM directory
# Replace {MODEL_URL} with the actual model service address
python main.py --base-url {MODEL_URL} --model "autoglm-phone-9b-multilingual" "Open Gmail and send an email to File Transfer Assistant: Deployment successful"
Expected Result:
| Error Symptom | Possible Cause | Solution |
|---|---|---|
adb devices shows nothing | USB debugging not enabled or cable issue | Check developer options, replace cable |
adb devices shows unauthorized | Phone not authorized | Tap "Allow USB debugging" on phone |
| Can open apps but cannot tap | Missing security debugging permission | Enable "USB Debugging (Security Settings)" |
| Chinese/text input corrupted or missing | ADB Keyboard not enabled | Enable ADB Keyboard in system settings |
| Screenshot returns black screen | Sensitive page (payment/banking) | Normal behavior, system will handle automatically |
| Cannot connect to model service | Wrong URL or service not running | Check URL, confirm service is running |
ModuleNotFoundError | Dependencies not installed | Run pip install -r requirements.txt |
adb devices can see the device# Check ADB connection
adb devices
# Restart ADB service
adb kill-server && adb start-server
# Install dependencies
pip install -r requirements.txt && pip install -e .
# Run Agent (interactive mode)
python main.py --base-url {MODEL_URL} --model "autoglm-phone-9b-multilingual"
# Run Agent (single task)
python main.py --base-url {MODEL_URL} --model "autoglm-phone-9b-multilingual" "your task description"
# View supported apps list
python main.py --list-apps
Deployment success indicator: The phone can automatically execute user's natural language instructions.