As a translator, you may save yourself some work by starting from a reasonably good translation produced by a machine and modifying that translation to make it perfect.
Thus, before you start working on a translation, you might have the PO file pretranslated.
This process, also called machine translation, is nowadays best performed through a Large Language Model (LLM). (See https://en.wikipedia.org/wiki/Machine_translation#Neural_MT, https://en.wikipedia.org/wiki/Neural_machine_translation#Generative_LLMs.)
We don't recommend using machine translation through a web service in the cloud, controlled by someone other than yourself. Such a machine translation service has major drawbacks (it could go away any time, it could be used to spy on you or manipulate you, or the costs could go up beyond your control); see https://www.gnu.org/philosophy/who-does-that-server-really-serve.en.html. Additionally, such a service typically has some cost (between $10 and $25 per megabyte, as of 2025).
Instead, we recommend a Large Language Model execution engine that runs on hardware under your control. This can be a desktop computer, or for instance a single-board computer in your local network.
At this point (in 2025), a Large Language Model execution engine that is Free Software is ‘ollama’, which can be downloaded from https://ollama.com/.
Next, you will need to pick a Large Language Model. There are two properties to watch out for:
Together with an LLM of reasonable quality,
such as the model ministral-3:14b,
the system requirements are as follows:
ollama, 9 GB for the model).
A GPU supported by ollama, to provide an optional speedup.
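Once a model is chosen, it can be fetched and inspected with ollama's command-line interface. A minimal sketch, guarded so that it degrades gracefully when ollama is absent (the model name is the one used in the examples of this manual):

```shell
# Pull the model and list what is installed, if ollama is available.
if command -v ollama >/dev/null 2>&1; then
  ollama pull ministral-3:14b   # downloads the model (several GB)
  ollama list                   # lists installed models and their sizes
else
  echo "ollama is not installed"
fi
```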
Additional configuration:
If you run ollama on your computer directly,
no further configuration is needed.
If you run ollama on a separate machine,
and want to make it accessible from all machines in the LAN:
Edit the file ‘/etc/systemd/system/ollama.service’,
adding a line: Environment="OLLAMA_HOST=0.0.0.0".
See https://github.com/ollama/ollama/issues/703.
If you run ollama in a virtual machine,
make port 11434 accessible through port forwarding.
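Whichever setup you use, you can verify that the server is reachable before running the tools described below. A sketch using curl against ollama's version endpoint (the URL is ollama's default; adjust it for a remote server):

```shell
# Query the server's version endpoint; print a diagnostic either way.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
if curl -fsS --max-time 2 "$OLLAMA_URL/api/version" >/dev/null 2>&1; then
  echo "ollama is reachable at $OLLAMA_URL"
else
  echo "ollama is NOT reachable at $OLLAMA_URL"
fi
```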
Invoking the msgpre Program

msgpre [option...]
The msgpre program pretranslates a translation catalog.
Warning: The pretranslations might not be what you expect. They might be of the wrong form, be of poor quality, or reflect some biases.
Input PO file.
Add directory to the list of directories. Source files are searched relative to this list of directories. The resulting ‘.po’ file will be written relative to the current directory, though.
If no inputfile is given or if it is ‘-’, standard input is read.
Write output to specified file.
The results are written to standard output if no output file is specified or if it is ‘-’.
Keep fuzzy messages unmodified. Pretranslate only untranslated messages.
Specifies the type of Large Language Model execution engine.
The default and only valid value is ollama.
Specifies the URL of the server that runs the Large Language Model execution engine.
For ollama, the default is http://localhost:11434.
Specifies the model to use. This option is mandatory; no default exists. The specified model must already be installed in the Large Language Model execution engine.
Specifies the prompt to use before each msgid from the PO file.
It allows you to specify extra instructions for the LLM.
The prompt should include an instruction like
"Translate into target language.".
Some hints for good prompts are described in the article
“How to write AI prompts for translation”
https://poeditor.com/blog/ai-prompts-for-translation/.
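For example, a prompt for German might look like this (the exact wording is an assumption, not a recommendation from this manual):

```
Translate into German. Keep format directives such as %s or {0} unchanged.
Output only the translation, with no quotes and no commentary.
```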
Specifies a command to post-process the output from the LLM. This should be a Bourne shell command that reads from standard input and writes to standard output.
For instance, the ministral-3:14b model
often emphasizes part of the output with ‘**’ characters.
To eliminate these markers,
you could use the command ‘sed -e 's/[*][*]//g'’.
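The effect of that sed command can be tried out on a sample string:

```shell
# Remove all '**' emphasis markers from the input.
printf '%s\n' 'Willkommen **beim** GNU-Projekt!' | sed -e 's/[*][*]//g'
# prints: Willkommen beim GNU-Projekt!
```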
Assume the input file is a Java ResourceBundle in Java .properties
syntax, not in PO file syntax.
Assume the input file is a NeXTstep/GNUstep localized resource file in
.strings syntax, not in PO file syntax.
Specify whether or when to use colors and other text attributes.
See The --color option for details.
Specify the CSS style rule file to use for --color.
See The --style option for details.
Always write an output file even if it contains no message.
Write the .po file using indented style.
Do not write ‘#: filename:line’ lines.
Generate ‘#: filename:line’ lines (default).
The optional type can be either ‘full’, ‘file’, or
‘never’. If it is not given or ‘full’, it generates the
lines with both file name and line number. If it is ‘file’, the
line number part is omitted. If it is ‘never’, it completely
suppresses the lines (same as --no-location).
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn't support the GNU extensions.
Write out a Java ResourceBundle in Java .properties syntax. Note
that this file format doesn't support plural forms and silently drops
obsolete messages.
Write out a NeXTstep/GNUstep localized resource file in .strings syntax.
Note that this file format doesn't support plural forms.
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
Generate sorted output. Note that using this option makes it much harder for the translator to understand each message's context.
Sort output by file location.
Display this help and exit.
Output version information and exit.
Suppress progress indicators.
To pretranslate the file foo.po:
msgpre --model=ministral-3:14b < foo.po > foo-pretranslated.po
Note that this command can take a long time, depending on the model and the available hardware.
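To pretranslate several PO files in one go, a shell loop can drive msgpre. The sketch below only prints the commands it would run (remove the ‘echo’ to execute them; the ‘po/’ directory is an assumption):

```shell
# Dry run: print one msgpre command per PO file in po/.
for po in po/*.po; do
  [ -e "$po" ] || continue          # skip if the glob matched nothing
  out="${po%.po}-pretranslated.po"
  echo "msgpre --model=ministral-3:14b < $po > $out"
done
```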
Invoking the spit Program

spit [option...]
The spit program
passes its input to a Large Language Model (LLM) instance
and prints the response.
With the --to option,
it translates its input to the specified language
through a Large Language Model (LLM) and prints the translation.
Warning: The output might not be what you expect. It might be of the wrong form, be of poor quality, or reflect some biases.
Specifies the type of Large Language Model execution engine.
The default and only valid value is ollama.
Specifies the URL of the server that runs the Large Language Model execution engine.
For ollama, the default is http://localhost:11434.
Specifies the model to use. This option is mandatory; no default exists. The specified model must already be installed in the Large Language Model execution engine.
Specifies the target language.
language may be specified
as an ISO 639 language code (such as fr for French),
as a combination of an ISO 639 language code and an ISO 3166 country code
(such as fr_CA for French in Canada,
or zh_TW for traditional Chinese),
or as the English name of a language (such as French).
The effect of this option is to add a prompt similar to "Translate to language:".
Specifies the prompt to use before the input that comes from standard input. It allows you to specify extra instructions for the LLM.
This option overrides the --to option.
Specifies a command to post-process the output. This should be a Bourne shell command that reads from standard input and writes to standard output.
For instance, the ministral-3:14b model
often emphasizes part of the output with ‘**’ characters.
To eliminate these markers,
you could use the command ‘sed -e 's/[*][*]//g'’.
Display this help and exit.
Output version information and exit.
Machine translation of a single sentence:
$ echo 'Translate into German: "Welcome to the GNU project!"' \
| spit --model=ministral-3:14b \
--postprocess="sed -e 's/[*][*]//g'"
"Willkommen zum GNU-Projekt!"
The perfect translation would be "Willkommen beim GNU-Projekt!".
As you can see, some manual adjustment after the machine translation is needed.
This document was generated by Bruno Haible on January 13, 2026 using texi2html 1.78a.