Which rules to use to analyze short machine-translated snippets?

izlt · April 2, 2020, 6:56am

Hello everyone!

First post here. I have a collection of English text snippets generated by a machine translation engine. These are sentences or segments of sentences. I don’t have access to any reference translations or even the foreign-language originals. Generally, they don’t have spelling errors but may be disfluent or awkward. I am trying to see if I can use LT to analyze the level of disfluency/awkwardness in an automatic or semi-automatic fashion, for example by counting the number of errors of certain types that LT reports for each snippet.
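To make the counting idea concrete, here is a minimal sketch of what I have in mind. It assumes an LT server reachable over the `/v2/check` HTTP endpoint; the URL and port below are assumptions for a locally started server, and the helper names `check` and `count_by_category` are my own, not part of LT. The sample response is hand-written to mimic the shape of the JSON that LT returns, and its rule IDs are placeholders rather than real rules.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: a LanguageTool server running locally on this port.
LT_URL = "http://localhost:8081/v2/check"


def check(text, language="en-US", url=LT_URL):
    """Send one snippet to the /v2/check endpoint and return the parsed JSON."""
    data = urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")
    with urllib.request.urlopen(url, data=data) as resp:
        return json.load(resp)


def count_by_category(response):
    """Tally the reported matches by rule category ID (e.g. STYLE, GRAMMAR)."""
    return Counter(m["rule"]["category"]["id"] for m in response.get("matches", []))


# Hand-written stand-in for a server response; rule IDs are placeholders.
sample_response = {
    "matches": [
        {"rule": {"id": "PLACEHOLDER_RULE_A", "category": {"id": "STYLE"}}},
        {"rule": {"id": "PLACEHOLDER_RULE_B", "category": {"id": "GRAMMAR"}}},
        {"rule": {"id": "PLACEHOLDER_RULE_C", "category": {"id": "STYLE"}}},
    ]
}

counts = count_by_category(sample_response)
```

With a server running, `count_by_category(check(snippet))` would give the per-snippet tally, and the counts for the categories of interest could then be compared across snippets.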

Question: what set of standard rules available with the latest release should I use?

Again, I am not trying to assess the adequacy of the translations or compare them with a reference. I imagine this is similar to the task of a native English speaker correcting sentences written by someone who is learning English. (Not sure if there is in fact a better analogy.)

TIA!

Mike_Unwalla · April 7, 2020, 8:14am

Hello,

The rules you want are probably the style rules. In the standalone version of LT, you can see them on the Style Rules tab: