2. Open-Source Translation Workflow (Dr. Andrew Ng)
Project repository: https://github.com/andrewyng/translation-agent
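# 2.1 Basic Idea
After the model produces a first-pass translation, a reflect-then-revise workflow has it critique its own output and rewrite it, with the aim of improving the quality of the final translation.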
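# 2.2 Key Steps
Step 1: Initial translation
- Input: the source language (source_lang), the target language (target_lang), and the source text (source_text)
- Role: a linguist whose task is to translate the text
- Output: a first-pass translation of the source text (translation_1), produced from all of the inputs
Step 2: Reflection
- Input: the source language (source_lang), the target language (target_lang), the source text (source_text), and the first-pass translation (translation_1)
- Role: a linguist whose task is to read the source text and its translation and suggest how to improve the translation
- Output: a list of improvement suggestions (reflection) for the first-pass translation (translation_1), produced from all of the inputs
Step 3: Improved translation
- Input: the source language (source_lang), the target language (target_lang), the source text (source_text), the first-pass translation (translation_1), and the improvement suggestions (reflection)
- Role: a linguist specializing in translation editing (close to the Step 1 role, but framed as an editor in the system message)
- Output: a second, improved translation (translation_2), produced from all of the inputs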
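The three steps above can be sketched as a small pipeline. This is a minimal illustration rather than the project's code: the `LLM` callable type, the `TranslationTrace` dataclass, the `translate_with_reflection` function, and the abbreviated prompts are assumptions made for the example. Only the names `source_lang`, `target_lang`, `source_text`, `translation_1`, `reflection`, and `translation_2` come from the steps themselves.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (prompt, system_message) -> completion text.
LLM = Callable[[str, str], str]


@dataclass
class TranslationTrace:
    """Intermediate results of the three-step workflow."""
    translation_1: str
    reflection: str
    translation_2: str


def translate_with_reflection(
    llm: LLM, source_lang: str, target_lang: str, source_text: str
) -> TranslationTrace:
    # Abbreviated, illustrative role descriptions (not the project's prompts).
    translator = f"You are an expert linguist translating from {source_lang} to {target_lang}."
    reviewer = f"You are an expert linguist reviewing {source_lang} to {target_lang} translations."

    # Step 1: initial translation.
    translation_1 = llm(
        f"Translate this {source_lang} text into {target_lang}:\n{source_text}",
        translator,
    )
    # Step 2: reflection on the first draft.
    reflection = llm(
        f"Source:\n{source_text}\nTranslation:\n{translation_1}\n"
        "List specific suggestions for improving the translation.",
        reviewer,
    )
    # Step 3: improved translation that takes the suggestions into account.
    translation_2 = llm(
        f"Source:\n{source_text}\nTranslation:\n{translation_1}\n"
        f"Suggestions:\n{reflection}\nRewrite the translation accordingly.",
        translator,
    )
    return TranslationTrace(translation_1, reflection, translation_2)
```

Because the model is passed in as a plain callable, a stub that returns canned strings is enough to exercise the flow without network access.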
# 2.3 Key Code
Source file: utils.py
# 2.3.1 Initial Translation Function
def one_chunk_initial_translation(
    source_lang: str, target_lang: str, source_text: str
) -> str:
    """
    Translate the entire text as one chunk using an LLM.

    Args:
        source_lang (str): The source language of the text.
        target_lang (str): The target language for translation.
        source_text (str): The text to be translated.

    Returns:
        str: The translated text.
    """
    system_message = f"You are an expert linguist, specializing in translation from {source_lang} to {target_lang}."

    translation_prompt = f"""This is an {source_lang} to {target_lang} translation, please provide the {target_lang} translation for this text. \
Do not provide any explanations or text apart from the translation.
{source_lang}: {source_text}
{target_lang}:"""

    # The f-string above is already interpolated, so this .format() call adds
    # nothing; it can raise if source_text itself contains braces.
    prompt = translation_prompt.format(source_text=source_text)

    # get_completion is a helper defined elsewhere in utils.py.
    translation = get_completion(prompt, system_message=system_message)
    return translation
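One detail in the function above is worth illustrating: `translation_prompt` is an f-string, so `{source_text}` has already been substituted by the time `.format(...)` runs. The sketch below uses two hypothetical helpers, `build_prompt_fstring_then_format` and `build_prompt_fstring_only`, with a shortened prompt to show how the two approaches behave when the source text contains braces. It illustrates Python string behavior only and is not a change to the project.

```python
def build_prompt_fstring_then_format(source_text: str) -> str:
    # Mirrors the pattern above: interpolate with an f-string, then call .format().
    prompt = f"English: {source_text}\nSpanish:"
    return prompt.format(source_text=source_text)


def build_prompt_fstring_only(source_text: str) -> str:
    # The f-string alone already produces the final prompt.
    return f"English: {source_text}\nSpanish:"


plain = "Hello, world"
with_braces = "Use {name} as a placeholder"

# Both approaches agree when the text has no braces.
assert build_prompt_fstring_then_format(plain) == build_prompt_fstring_only(plain)

# With braces in the text, the extra .format() call treats {name} as a field.
try:
    build_prompt_fstring_then_format(with_braces)
    raised = False
except KeyError:
    raised = True
```

On this input the extra `.format()` call raises `KeyError`, because `{name}` is read as a replacement field, while the f-string-only version passes the text through unchanged.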
# 2.3.2 Reflection Function
def one_chunk_reflect_on_translation(
    source_lang: str,
    target_lang: str,
    source_text: str,
    translation_1: str,
    country: str = "",
) -> str:
    """
    Use an LLM to reflect on the translation, treating the entire text as one chunk.

    Args:
        source_lang (str): The source language of the text.
        target_lang (str): The target language of the translation.
        source_text (str): The original text in the source language.
        translation_1 (str): The initial translation of the source text.
        country (str): Country specified for target language.

    Returns:
        str: The LLM's reflection on the translation, providing constructive criticism and suggestions for improvement.
    """
    system_message = f"You are an expert linguist specializing in translation from {source_lang} to {target_lang}. \
You will be provided with a source text and its translation and your goal is to improve the translation."

    # The two branches differ only in the sentence about the target country.
    if country != "":
        reflection_prompt = f"""Your task is to carefully read a source text and a translation from {source_lang} to {target_lang}, and then give constructive criticism and helpful suggestions to improve the translation. \
The final style and tone of the translation should match the style of {target_lang} colloquially spoken in {country}.
The source text and initial translation, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT> and <TRANSLATION></TRANSLATION>, are as follows:
<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>
<TRANSLATION>
{translation_1}
</TRANSLATION>
When writing suggestions, pay attention to whether there are ways to improve the translation's \n\
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),\n\
(ii) fluency (by applying {target_lang} grammar, spelling and punctuation rules, and ensuring there are no unnecessary repetitions),\n\
(iii) style (by ensuring the translations reflect the style of the source text and takes into account any cultural context),\n\
(iv) terminology (by ensuring terminology use is consistent and reflects the source text domain; and by only ensuring you use equivalent idioms {target_lang}).\n\
Write a list of specific, helpful and constructive suggestions for improving the translation.
Each suggestion should address one specific part of the translation.
Output only the suggestions and nothing else."""
    else:
        reflection_prompt = f"""Your task is to carefully read a source text and a translation from {source_lang} to {target_lang}, and then give constructive criticism and helpful suggestions to improve the translation. \
The source text and initial translation, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT> and <TRANSLATION></TRANSLATION>, are as follows:
<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>
<TRANSLATION>
{translation_1}
</TRANSLATION>
When writing suggestions, pay attention to whether there are ways to improve the translation's \n\
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),\n\
(ii) fluency (by applying {target_lang} grammar, spelling and punctuation rules, and ensuring there are no unnecessary repetitions),\n\
(iii) style (by ensuring the translations reflect the style of the source text and takes into account any cultural context),\n\
(iv) terminology (by ensuring terminology use is consistent and reflects the source text domain; and by only ensuring you use equivalent idioms {target_lang}).\n\
Write a list of specific, helpful and constructive suggestions for improving the translation.
Each suggestion should address one specific part of the translation.
Output only the suggestions and nothing else."""

    # As in the initial translation function, the f-string is already
    # interpolated, so this .format() call is redundant and can raise if the
    # inserted text contains braces.
    prompt = reflection_prompt.format(
        source_lang=source_lang,
        target_lang=target_lang,
        source_text=source_text,
        translation_1=translation_1,
    )
    reflection = get_completion(prompt, system_message=system_message)
    return reflection
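The two branches of the reflection prompt differ only in the sentence about `country`. As a sketch of that difference, the hypothetical helper `country_clause` below returns the extra sentence when a country is given and an empty string otherwise. `build_reflection_intro` is likewise an illustrative name that assembles only the opening of the prompt; neither function is part of `utils.py`.

```python
def country_clause(target_lang: str, country: str = "") -> str:
    """Return the country-specific style sentence, or "" when no country is given."""
    if country == "":
        return ""
    return (
        "The final style and tone of the translation should match the style of "
        f"{target_lang} colloquially spoken in {country}."
    )


def build_reflection_intro(source_lang: str, target_lang: str, country: str = "") -> str:
    """Assemble the opening of the reflection prompt (abbreviated for illustration)."""
    parts = [
        "Your task is to carefully read a source text and a translation from "
        f"{source_lang} to {target_lang}, and then give constructive criticism "
        "and helpful suggestions to improve the translation.",
    ]
    clause = country_clause(target_lang, country)
    if clause:
        # Only the country-aware variant carries the extra style sentence.
        parts.append(clause)
    return "\n".join(parts)
```

Calling `build_reflection_intro("English", "Spanish", "Mexico")` yields the opening sentence followed by the style sentence, while omitting `country` yields the opening sentence alone.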
# 2.3.3 Improved Translation Function
def one_chunk_improve_translation(
    source_lang: str,
    target_lang: str,
    source_text: str,
    translation_1: str,
    reflection: str,
) -> str:
    """
    Use the reflection to improve the translation, treating the entire text as one chunk.

    Args:
        source_lang (str): The source language of the text.
        target_lang (str): The target language for the translation.
        source_text (str): The original text in the source language.
        translation_1 (str): The initial translation of the source text.
        reflection (str): Expert suggestions and constructive criticism for improving the translation.

    Returns:
        str: The improved translation based on the expert suggestions.
    """
    system_message = f"You are an expert linguist, specializing in translation editing from {source_lang} to {target_lang}."

    prompt = f"""Your task is to carefully read, then edit, a translation from {source_lang} to {target_lang}, taking into
account a list of expert suggestions and constructive criticisms.
The source text, the initial translation, and the expert linguist suggestions are delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT>, <TRANSLATION></TRANSLATION> and <EXPERT_SUGGESTIONS></EXPERT_SUGGESTIONS> \
as follows:
<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>
<TRANSLATION>
{translation_1}
</TRANSLATION>
<EXPERT_SUGGESTIONS>
{reflection}
</EXPERT_SUGGESTIONS>
Please take into account the expert suggestions when editing the translation. Edit the translation by ensuring:
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),
(ii) fluency (by applying {target_lang} grammar, spelling and punctuation rules and ensuring there are no unnecessary repetitions), \
(iii) style (by ensuring the translations reflect the style of the source text)
(iv) terminology (inappropriate for context, inconsistent use), or
(v) other errors.
Output only the new translation and nothing else."""

    translation_2 = get_completion(prompt, system_message)
    return translation_2
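The improvement prompt separates its three inputs with XML-style tags so the model can tell them apart. The helper names `wrap_in_tag` and `build_sections` below are hypothetical, introduced only to show the delimiting pattern in isolation. The tag names are the ones used in the prompt.

```python
def wrap_in_tag(tag: str, text: str) -> str:
    """Surround text with <TAG> ... </TAG> on separate lines."""
    return f"<{tag}>\n{text}\n</{tag}>"


def build_sections(source_text: str, translation_1: str, reflection: str) -> str:
    """Join the three delimited inputs in the order used by the improvement prompt."""
    return "\n\n".join(
        [
            wrap_in_tag("SOURCE_TEXT", source_text),
            wrap_in_tag("TRANSLATION", translation_1),
            wrap_in_tag("EXPERT_SUGGESTIONS", reflection),
        ]
    )
```

Keeping each input inside its own tag pair means a multi-line source text or a long list of suggestions cannot be mistaken for part of the instructions around it.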
Last updated: 2025/12/19, 15:17:48