The Hunyuan Translation Model comprises a translation model, Hunyuan-MT-7B, and an ensemble model, Hunyuan-MT-Chimera. The translation model translates source text into the target language, while the ensemble model integrates multiple candidate translations to produce a higher-quality result. The models support mutual translation among 33 languages, including five ethnic minority languages of China.
| Model Name | Description | Download |
|---|---|---|
| Hunyuan-MT-7B | Hunyuan 7B translation model | 🤗 Model |
| Hunyuan-MT-7B-fp8 | Hunyuan 7B translation model, FP8-quantized | 🤗 Model |
| Hunyuan-MT-Chimera | Hunyuan 7B translation ensemble model | 🤗 Model |
| Hunyuan-MT-Chimera-fp8 | Hunyuan 7B translation ensemble model, FP8-quantized | 🤗 Model |
Prompt template for ZH⇔XX translation (in Chinese; it reads "Translate the following text into <target_language>, without additional explanation."):

把下面的文本翻译成<target_language>，不要额外解释。 <source_text>
Prompt template for XX⇔XX translation, excluding ZH⇔XX:

Translate the following segment into <target_language>, without additional explanation. <source_text>
Prompt template for Hunyuan-MT-Chimera (ensemble refinement):

Analyze the following multiple <target_language> translations of the <source_language> segment surrounded in triple backticks and generate a single refined <target_language> translation. Only output the refined translation, do not explain. The <source_language> segment: ```<source_text>``` The multiple <target_language> translations: 1. ```<translated_text1>``` 2. ```<translated_text2>``` 3. ```<translated_text3>``` 4. ```<translated_text4>``` 5. ```<translated_text5>``` 6. ```<translated_text6>```
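A minimal sketch of filling these templates programmatically. The helper names (`build_translation_prompt`, `build_chimera_prompt`) are our own for illustration and are not part of the Hunyuan release:

```python
def build_translation_prompt(source_text: str, target_language: str) -> str:
    """Prompt for XX<=>XX translation (use the Chinese template for ZH<=>XX)."""
    return (
        f"Translate the following segment into {target_language}, "
        f"without additional explanation.\n\n{source_text}"
    )


def build_chimera_prompt(source_text: str, source_language: str,
                         target_language: str, candidates: list[str]) -> str:
    """Prompt asking Hunyuan-MT-Chimera to fuse several candidate translations."""
    numbered = " ".join(
        f"{i}. ```{c}```" for i, c in enumerate(candidates, start=1)
    )
    return (
        f"Analyze the following multiple {target_language} translations of the "
        f"{source_language} segment surrounded in triple backticks and generate "
        f"a single refined {target_language} translation. Only output the refined "
        f"translation, do not explain. "
        f"The {source_language} segment: ```{source_text}``` "
        f"The multiple {target_language} translations: {numbered}"
    )


# Example: fuse six candidate translations of one English sentence.
# The candidate list here is a placeholder, not real model output.
prompt = build_chimera_prompt(
    "It's on the house.", "English", "Chinese",
    candidates=["这是免费的。"] * 6,
)
```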
First, install transformers; v4.56.0 is recommended:

```
pip install transformers==4.56.0
```
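To confirm which version is installed (older versions may not include support for this model), a quick check:

```python
import transformers

# The recommended version is 4.56.0.
print(transformers.__version__)
```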
The following code snippet shows how to use the transformers library to load and run the model, using tencent/Hunyuan-MT-7B as an example.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "tencent/Hunyuan-MT-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    device_map="auto",  # you may want to use bfloat16 and/or move to GPU here
)

# The request follows the XX<=>XX prompt template shown above.
messages = [
    {"role": "user", "content": "Translate the following segment into Chinese, without additional explanation.\n\nIt’s on the house."},
]

# Render the chat template and tokenize it in one step.
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=False,
    return_tensors="pt",
)

outputs = model.generate(tokenized_chat.to(model.device), max_new_tokens=2048)
output_text = tokenizer.decode(outputs[0])
```
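Note that `tokenizer.decode(outputs[0])` returns the prompt together with the generated continuation, including special tokens. A minimal sketch for extracting only the translation, using the standard transformers slicing idiom:

```python
# Slice off the prompt tokens so only the newly generated text remains,
# and drop special tokens such as the end-of-sequence marker.
prompt_length = tokenized_chat.shape[1]
translation = tokenizer.decode(
    outputs[0][prompt_length:],
    skip_special_tokens=True,
).strip()
print(translation)
```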
We recommend the following parameters for inference. Note that the model does not use a default system prompt.
```json
{
    "top_k": 20,
    "top_p": 0.6,
    "repetition_penalty": 1.05,
    "temperature": 0.7
}
```
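These map directly onto the sampling arguments of `model.generate`. A minimal sketch; note that `do_sample=True` is our assumption, since sampling must be enabled for temperature and top-p to take effect:

```python
outputs = model.generate(
    tokenized_chat.to(model.device),
    max_new_tokens=2048,
    do_sample=True,          # enable sampling so the parameters below apply
    top_k=20,
    top_p=0.6,
    repetition_penalty=1.05,
    temperature=0.7,
)
```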