Open Data

Dataset Release

We are currently in the data collection phase.

Dataset.ET is building speech, text, and translation datasets for 80+ Ethiopian languages to enable AI research and language technology. As soon as the dataset reaches our quality and volume milestones, we will release it here for open-source AI research.

Status

Actively Collecting Contributions

Availability

Coming Soon — Open Source
