The Tweeties are a series of foundation models, each paired with a tokenizer native to its target language so that it understands and generates text in that language better. Each model is adapted from an existing model using trans-tokenization and then further pre-trained on existing corpora.
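
As a rough illustration of the adaptation step, the sketch below assumes that trans-tokenization initializes the embedding of each new target-language token as a weighted average of the embeddings of the source-model tokens it is aligned with. The function name, the `alignments` structure, the toy vocabulary, and the fallback for unaligned tokens are all hypothetical choices made for this example; they are not taken from the Tweeties codebase.

```python
from typing import Dict, List

Vector = List[float]


def trans_tokenize_embeddings(
    source_embeddings: Dict[str, Vector],
    alignments: Dict[str, Dict[str, float]],
) -> Dict[str, Vector]:
    """Build target-token embeddings as weighted averages of source-token embeddings.

    `alignments` maps each target token to {source_token: weight}. This layout is
    an assumption for illustration; real alignment weights would come from a
    token-level alignment of parallel text.
    """
    dim = len(next(iter(source_embeddings.values())))

    # Fallback for target tokens without any usable alignment (an assumption):
    # the mean of all source embeddings.
    count = len(source_embeddings)
    mean_vector = [
        sum(vec[i] for vec in source_embeddings.values()) / count
        for i in range(dim)
    ]

    target_embeddings: Dict[str, Vector] = {}
    for target_token, weights in alignments.items():
        # Keep only positively weighted source tokens present in the source vocabulary.
        known = {
            src: w
            for src, w in weights.items()
            if src in source_embeddings and w > 0
        }
        total = sum(known.values())
        if total == 0:
            target_embeddings[target_token] = list(mean_vector)
            continue

        # Normalize the weights so they sum to 1, then accumulate the average.
        vector = [0.0] * dim
        for src, w in known.items():
            for i, value in enumerate(source_embeddings[src]):
                vector[i] += (w / total) * value
        target_embeddings[target_token] = vector

    return target_embeddings


# Toy example with a two-dimensional embedding space (hypothetical values).
source = {"cat": [1.0, 0.0], "kitty": [0.0, 1.0], "dog": [2.0, 2.0]}
aligned = {
    "kat": {"cat": 3.0, "kitty": 1.0},  # mostly aligned with "cat"
    "hond": {"dog": 1.0},               # one-to-one alignment
    "xyz": {},                          # no alignment, uses the fallback
}
new_embeddings = trans_tokenize_embeddings(source, aligned)
```

In this toy setup, `"kat"` ends up three quarters of the way toward `"cat"`, `"hond"` copies `"dog"` exactly, and `"xyz"` receives the average source embedding. The further pre-training mentioned above would then refine such initial embeddings on target-language text.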