Free Data Sets
Get five free data sets, each with approximately 1,000 articles in one of five languages.
Sign up and get free data sets
A number of leading organisations and educational institutions have used our archived data sets for their predictive analysis, machine learning, and natural language processing. The archive consists of more than 6 billion news articles from the past two decades.
We have documentation readily available, but the easiest way to get familiar with our data structure is to see the data for yourself. Fill in the form below to receive five free data sets.
The five free data sets
Each data set consists of approximately 1,000 articles in a single language. The articles were collected from randomly selected news sources, and every data set is available in both JSON and XML formats.
- English articles
- Spanish articles
- French articles
- German articles
- Russian articles
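Once you have a JSON sample, a quick way to get familiar with its structure is to count which fields appear in the articles. The following is a minimal sketch in Python, not an official loader. It assumes the file holds either a JSON array of article objects or a single object; the function names and the file name in the usage note are hypothetical, and no field names are assumed.

```python
import json
from collections import Counter


def load_records(path):
    """Load a JSON file and return its content as a list of records.

    Assumption: the file is either a JSON array of article objects or a
    single JSON object. A single object is wrapped in a list.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return data if isinstance(data, list) else [data]


def field_counts(records):
    """Count how often each top-level key appears across the records.

    Items that are not JSON objects (dicts) are skipped.
    """
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

For example, `field_counts(load_records("english_articles.json"))` would return a mapping from each top-level field name to the number of articles that contain it, where `english_articles.json` stands in for whatever the downloaded file is actually called.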
If you have any questions or if you are looking for different samples from our archive, please contact us at firstname.lastname@example.org.