Data files for ucto, a rule-based tokenization package used to tokenize
texts in different languages.

WWW: https://languagemachines.github.io/ucto/