TEDxKinda Project: Big Data Sets
Big data has become big business around the world. Companies that have their own big data sets, like Google or Facebook, use those data sets for business purposes and often go to great lengths to keep them private. Because of this, and because of the privacy issues involved in sharing data, big data sets are not always readily available. Most of the data sets that are openly available come from government or nonprofit organizations. Some of these are BIG and some are smaller.
This is a list of some of the most accessible and usable big data sets and collections. You may use any of these for your TEDxKinda research, or compile your own subsets of data by using a searchable database. Be sure to check with your teacher that the data sets you find are appropriate for your project:
- Google Public Data Explorer—130 data sets from Bureau of Labor Statistics, U.S. Census Bureau, etc.
- data.gov—an online repository of data sets from the U.S. government
- Google’s Ngram Data—data on Google’s catalog of millions of books, including raw data sets
- Google Trends—detailed search history information, including CSV downloads
- NOAA National Climatic Data Center
- Knoema—“free to use public and open data platform for users with interests in statistics and data analysis, visual storytelling and making infographics”
- Geocommons—“all about open data analysis and maps”
- Stat Silk—“interactive maps of open data”
- Better World Flux—“a beautiful interactive visualization of information on what really matters in life”
- Gapminder—“unveiling the beauty of statistics for a better world view”
- DataLab—comprehensive database from the National Center for Education Statistics
- Measure of America—“easy-to-use yet methodically sound tools for understanding the well-being and opportunity in America”
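
Several of the sources above let you download data as a CSV file. If you want to compile your own subset, the short Python sketch below shows one possible way to do it using only the standard library. Everything in it is an assumption made for illustration: the sample data, the column names (state, year, value), and the helper functions load_rows, subset, and average are all invented and do not come from any of the listed sites. Your real file will have different columns.

```python
import csv
import io

# Hypothetical sample standing in for a CSV file downloaded from one of the
# sources above; the column names and values are invented for illustration.
SAMPLE_CSV = """state,year,value
Ohio,2010,5.0
Ohio,2011,6.5
Texas,2010,7.0
Texas,2011,8.0
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def subset(rows, column, wanted):
    """Keep only the rows whose `column` equals `wanted`."""
    return [row for row in rows if row[column] == wanted]


def average(rows, column):
    """Average a numeric column; CSV values arrive as strings, so convert first."""
    values = [float(row[column]) for row in rows]
    return sum(values) / len(values)


rows = load_rows(SAMPLE_CSV)
ohio = subset(rows, "state", "Ohio")
print(len(ohio), average(ohio, "value"))
```

To try this with a real download, you could replace SAMPLE_CSV with the contents of your own file (for example, by opening it and calling read()) and change the column names to match.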