The future of AI
speaks Ethiopian
Dataset.ET is an open dataset initiative building speech, text, and translation datasets for 80+ Ethiopian languages to enable AI research and language technology.
Our Mission
Preserving languages.
Powering intelligence.
Every language carries a universe of knowledge. No language should be left behind as the world moves toward AI.
Safeguard 80+ Languages
Every Ethiopian language carries centuries of knowledge, poetry, and identity. We create high-quality datasets that ensure no language faces digital extinction.
Train Smarter AI
From Amharic's Ge'ez script to Oromo grammar, our corpus enables AI to understand, speak, and reason in Ethiopian languages with true fluency.
Community-Owned Data
Open infrastructure built by researchers, developers, and language enthusiasts across Ethiopia. Your data, your rules, your future.
Languages
Teaching AI to understand Ethiopia
Building high-quality datasets to train the next generation of Ethiopian AI models, and expanding to more languages every quarter.
Amharic (አማርኛ): 500k+
Afaan Oromo (Afaan Oromoo): 450k+
Somali (Af Soomaali): 200k+
Tigrinya (ትግርኛ): 200k+
Sidamo (Sidaamu Afoo): Coming soon
Wolaytta: Coming soon
+ 74 more languages on our roadmap
How It Works
From contribution to AI
Every contribution, no matter how small, helps bridge the gap between Ethiopian languages and artificial intelligence.
Contribute
Submit sentences, paragraphs, or documents in your native Ethiopian language through our portal.
Validate
Community reviewers verify every contribution for quality, accuracy, and linguistic structure.
Structure
Validated data is cleaned, annotated, and stored in our open-access corpus with full attribution.
Power AI
Researchers and developers use the corpus to train models that understand Ethiopian languages.
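The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Contribution`, `review`, `structure`, and `MIN_APPROVALS`, the two-reviewer threshold, and the record layout are all hypothetical and do not describe Dataset.ET's actual schema or API.

```python
import unicodedata
from dataclasses import dataclass, field

# Hypothetical names and values throughout; not the project's real schema.
MIN_APPROVALS = 2  # assumed number of reviewer approvals required


@dataclass
class Contribution:
    text: str
    language: str      # language tag, e.g. "am" for Amharic (assumed convention)
    contributor: str   # kept so the corpus record carries attribution
    approvals: set = field(default_factory=set)
    status: str = "submitted"


def review(item: Contribution, reviewer: str) -> Contribution:
    """Validate step: record one community reviewer's approval."""
    if reviewer == item.contributor:
        raise ValueError("contributors cannot approve their own submission")
    item.approvals.add(reviewer)
    if len(item.approvals) >= MIN_APPROVALS:
        item.status = "validated"
    return item


def structure(item: Contribution) -> dict:
    """Structure step: clean a validated item into a corpus record."""
    if item.status != "validated":
        raise ValueError("only validated contributions enter the corpus")
    # Normalize Unicode to NFC and collapse runs of whitespace.
    cleaned = " ".join(unicodedata.normalize("NFC", item.text).split())
    return {"text": cleaned, "lang": item.language, "attribution": item.contributor}


# Contribute -> Validate -> Structure
item = Contribution("  ሰላም   ለዓለም ", "am", "abebe")
review(item, "sara")
review(item, "dawit")
record = structure(item)
print(record)
```

Keeping the contributor on the record is one simple way to carry attribution through to the corpus; a real pipeline would likely add annotation and richer quality checks at the Structure step.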
Community
Built by
Ethiopians,
for everyone.
Volunteers, researchers, linguists, and developers who believe in the power of language technology for Ethiopia.
Contributors
Donate your voice, translate text, and help expand our datasets across multiple Ethiopian languages.
Validators
Ensure quality by reviewing and verifying submitted audio and text contributions.
Developers
Build innovative AI tools and models using our open-source datasets and APIs.
Linguists
Provide expert guidance on grammar, phonetics, and cultural nuances of Ethiopian languages.
Get Involved
Your language matters.
Help us teach AI.
Every sentence you contribute builds the open infrastructure that will power Ethiopian AI for generations. No technical skills required.