Predicting Crowd-based Translation Quality with Language-independent Feature Vectors


Research in recent years has shown that machine translation results can be greatly enhanced with the help of mono- or bilingual human contributors, e.g. by asking humans to proofread or correct the output of machine translation systems. However, it remains difficult to determine the quality of individual revisions. This paper proposes a method for determining the quality of individual contributions by analyzing task-independent data, such as completion time and number of keystrokes. An initial evaluation showed promising F-measure values above 0.8 for support vector machine and decision tree classifiers on a combined test set of Vietnamese and German translations.
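
As a rough illustration of the idea described above, the following minimal sketch classifies contributions from language-independent features and scores the result with the F-measure. It is not the authors' implementation: the two features (completion time and keystroke count) are taken from the examples named above, but the toy rows, the labels, and the helper names `fit_stump`, `predict`, and `f_measure` are assumptions made for illustration, and a depth-one decision tree stands in for the full classifiers.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F-measure (harmonic mean of precision and recall) for label 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predict(stump: Stump, X: Sequence[Sequence[float]]) -> List[int]:
    """Label a row 1 if its feature lies on the stump's positive side."""
    j, thr, polarity = stump
    return [1 if polarity * (row[j] - thr) >= 0 else 0 for row in X]


def fit_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Stump:
    """Depth-one decision tree: exhaustively pick the feature, threshold,
    and polarity with the fewest training errors."""
    best_errors = None
    best_stump: Stump = (0, 0.0, 1)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                candidate = (j, thr, polarity)
                errors = sum(p != t for p, t in zip(predict(candidate, X), y))
                if best_errors is None or errors < best_errors:
                    best_errors, best_stump = errors, candidate
    return best_stump


# Made-up toy rows, NOT data from the paper:
# (completion time in seconds, keystroke count); label 1 = acceptable revision.
X = [(12.0, 3), (95.0, 48), (80.0, 40), (8.0, 1), (110.0, 60), (15.0, 5)]
y = [0, 1, 1, 0, 1, 0]

stump = fit_stump(X, y)
score = f_measure(y, predict(stump, X))
print(stump, score)
```

Because none of the features depends on the text of the translation, the same feature vector can be computed for any language pair. Note that the score here is computed on the training rows only to keep the sketch short; a real evaluation would use a held-out test set.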

Proceedings of the 4th Human Computation Workshop