

🤗 Hugging Face  |  🤖 ModelScope  |  🖥️ Official Website  |  🕹️ Demo

GITHUB

Model Introduction

The Hunyuan Translation Model comprises a translation model, Hunyuan-MT-7B, and an ensemble model, Hunyuan-MT-Chimera. The translation model is used to translate source text into the target language, while the ensemble model integrates multiple translation outputs to produce a higher-quality result. It primarily supports mutual translation among 33 languages, including five ethnic minority languages in China.

Key Features and Advantages

  • In the WMT25 competition, the model achieved first place in 30 of the 31 language categories it entered.
  • Hunyuan-MT-7B achieves industry-leading performance among models of comparable scale.
  • Hunyuan-MT-Chimera-7B is the industry's first open-source translation ensemble model, elevating translation quality to a new level.
  • A comprehensive training framework for translation models is proposed, spanning pretraining → cross-lingual pretraining (CPT) → supervised fine-tuning (SFT) → translation enhancement → ensemble refinement, achieving state-of-the-art (SOTA) results for models of similar size.

Related News

  • 2025.9.1 We have open-sourced Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B on Hugging Face.

 

Model Links

| Model Name | Description | Download |
|---|---|---|
| Hunyuan-MT-7B | Hunyuan 7B translation model | 🤗 Model |
| Hunyuan-MT-7B-fp8 | Hunyuan 7B translation model, FP8 quantized | 🤗 Model |
| Hunyuan-MT-Chimera | Hunyuan 7B translation ensemble model | 🤗 Model |
| Hunyuan-MT-Chimera-fp8 | Hunyuan 7B translation ensemble model, FP8 quantized | 🤗 Model |

Prompts

Prompt Template for ZH<=>XX Translation.

把下面的文本翻译成<target_language>,不要额外解释。 <source_text>

Prompt Template for XX<=>XX Translation, excluding ZH<=>XX.

Translate the following segment into <target_language>, without additional explanation. <source_text>
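For illustration, the sketch below fills the two templates above. The `build_prompt` helper and its argument names are not from the model card; the blank line separating the instruction from the source text follows the transformers example further below.

```python
# Illustrative helper (not from the model card) that picks between the two templates.
def build_prompt(source_text: str, target_language: str, involves_chinese: bool) -> str:
    """Fill the Hunyuan-MT prompt template for a single translation request."""
    if involves_chinese:
        # ZH<=>XX template (instruction in Chinese)
        return f"把下面的文本翻译成{target_language},不要额外解释。\n\n{source_text}"
    # XX<=>XX template (instruction in English)
    return (
        f"Translate the following segment into {target_language}, "
        f"without additional explanation.\n\n{source_text}"
    )

print(build_prompt("It's on the house.", "中文", involves_chinese=True))
print(build_prompt("Es geht aufs Haus.", "English", involves_chinese=False))
```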

Prompt Template for Hunyuan-MT-Chimera-7B

Analyze the following multiple <target_language> translations of the <source_language> segment surrounded in triple backticks and generate a single refined <target_language> translation. Only output the refined translation, do not explain. The <source_language> segment: ```<source_text>``` The multiple <target_language> translations: 1. ```<translated_text1>``` 2. ```<translated_text2>``` 3. ```<translated_text3>``` 4. ```<translated_text4>``` 5. ```<translated_text5>``` 6. ```<translated_text6>```
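The sketch below assembles this ensemble prompt from a list of candidate translations. The `build_chimera_prompt` helper and the example candidates are illustrative and not part of the model card; the template text itself follows the prompt above, which shows six candidate slots.

```python
# Illustrative helper (not from the model card) that fills the Chimera template.
def build_chimera_prompt(source_language: str, target_language: str,
                         source_text: str, candidates: list[str]) -> str:
    numbered = " ".join(f"{i}. ```{t}```" for i, t in enumerate(candidates, start=1))
    return (
        f"Analyze the following multiple {target_language} translations of the "
        f"{source_language} segment surrounded in triple backticks and generate a "
        f"single refined {target_language} translation. Only output the refined "
        f"translation, do not explain. "
        f"The {source_language} segment: ```{source_text}``` "
        f"The multiple {target_language} translations: {numbered}"
    )

# Example with six candidate translations, e.g. sampled from Hunyuan-MT-7B.
prompt = build_chimera_prompt(
    "English", "Chinese", "It's on the house.",
    ["这是免费的。", "这是店家请客。", "这杯算我们的。",
     "这是赠送的。", "免费赠送。", "这单由本店买单。"],
)
print(prompt)
```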

 

Use with transformers

First, please install transformers; v4.56.0 is recommended.

pip install transformers==4.56.0

The following code snippet shows how to use the transformers library to load and apply the model.

We use tencent/Hunyuan-MT-7B as an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "tencent/Hunyuan-MT-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")  # You may want to use bfloat16 and/or move to GPU here

messages = [
    {"role": "user", "content": "Translate the following segment into Chinese, without additional explanation.\n\nIt's on the house."},
]
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=False,
    return_tensors="pt",
)

outputs = model.generate(tokenized_chat.to(model.device), max_new_tokens=2048)
output_text = tokenizer.decode(outputs[0])
```
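The decode call above returns the full sequence, prompt included. If only the translation is wanted, one option (not part of the original snippet) is to drop the prompt tokens before decoding:

```python
# Keep only the tokens generated after the prompt, then drop special tokens.
generated_ids = outputs[0][tokenized_chat.shape[-1]:]
translation = tokenizer.decode(generated_ids, skip_special_tokens=True)
print(translation)
```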

We recommend the following parameters for inference. Note that the model does not use a default system prompt.

{ "top_k": 20, "top_p": 0.6, "repetition_penalty": 1.05, "temperature": 0.7 }
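A minimal sketch of passing these values to generate, reusing `model` and `tokenized_chat` from the snippet above; `do_sample=True` is an assumption here (sampling must be enabled for temperature/top_p/top_k to take effect) and is not stated in the model card.

```python
outputs = model.generate(
    tokenized_chat.to(model.device),
    max_new_tokens=2048,
    do_sample=True,           # assumption: enable sampling so the knobs below apply
    top_k=20,
    top_p=0.6,
    temperature=0.7,
    repetition_penalty=1.05,
)
```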
