TEDxKinda Project

“The goal is to turn data into information, and information into insight.” – Carly Fiorina, former chief executive of HP

https://www.youtube.com/embed/I6APyXdHBcg

Big data is all around us. What can it do for you, for us, and for the world? Tara’s use of big data to win a political campaign is no longer a novel strategy:

The Obama team understood the importance of execution and the difficulties of data complexities. One of that group’s first priorities before the election was to undertake a massive, 18-month-long database merge so that all data could be housed in a single repository. The database focus also allowed the Obama camp to think expansively in its approach to metrics. “We are going to measure every single thing in this campaign,” campaign manager Jim Messina told TIME. Messina was also given good resources to work with: according to the article, he hired an analytics department five times bigger than the 2008 operation. Quite the contrast to the rushed approach taken by the Romney campaign.

So what was the Obama camp able to do with this big data trove? One senior official from the campaign told TIME that the group “ran the election 66,000 times every night” based on the day’s data and allocated resources based on likely outcomes. In addition, the demographic information they collected and scored against other factors allowed them to find more targeted ways to buy television advertising to reach their “microtargeted” voters.

Source: 2012—The First Big Data Election, Harvard Business Review
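To get a feel for what "running the election" thousands of times might mean, here is a minimal Monte Carlo sketch in Python. It is not the campaign's actual method, which the article does not describe. The states, vote counts, and win probabilities are invented for illustration, and each state is assumed to be decided independently.

```python
import random

# Hypothetical data: state name -> (electoral votes, assumed win probability).
# These numbers are made up for illustration only.
STATES = {
    "State A": (20, 0.55),
    "State B": (15, 0.48),
    "State C": (10, 0.60),
}
VOTES_NEEDED = 23  # majority of the 45 hypothetical electoral votes


def simulate_once(rng):
    """Simulate one election and return the electoral votes won."""
    return sum(votes for votes, p in STATES.values() if rng.random() < p)


def win_rate(runs=66_000, seed=0):
    """Estimate the chance of winning by repeating the simulation many times."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    wins = sum(simulate_once(rng) >= VOTES_NEEDED for _ in range(runs))
    return wins / runs


print(f"Estimated chance of winning: {win_rate():.3f}")
```

Rerunning a simulation like this after updating the probabilities with each day's data is one plausible way to decide where extra resources would change the outcome most.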

Every day, big data analysis leads to innovative ideas, applications, and knowledge. Many of these innovations and discoveries are shared via dynamic TED talk presentations. TED is a nonprofit organization devoted to “Ideas Worth Spreading.” Many of TED’s best presenters utilize big data analysis techniques, too. For example, here is a presentation called “Your Phone Company is Watching,” by Malte Spitz.

https://www.youtube.com/embed/Gv7Y0W0xmYQ

Now, it’s your turn.

Assignment

Create a TEDxKinda presentation based on big data analysis.

Groups will complete extensive research to choose a topic/theme (i.e., identify a problem), conduct in-depth data analysis, and discuss the results via a TEDxKinda presentation, which will be performed for a live audience and recorded.

Each group in your class will choose its own meaningful topic, with the following caveat:

Data sets will not be available on every topic, so you may first want to identify potential data sets before you narrow down your ideas to a particular topic or theme. There is a wide variety of public data sets and tools that you may choose from in order to complete your presentation, but you will have to be creative, flexible, focused, and diligent in order to synthesize all the components into a composed, powerful presentation. Feel free to use and edit the Big Data Sets and Tools for Big Data Analysis resources.

Each TEDxKinda presentation should:

  • include at least two of the following data analysis strategies:
    • cluster analysis,
    • classification,
    • linear regression,
    • association rule mining, and/or
    • outlier, anomaly, and change detection.
  • utilize automated summarization,
  • provide insightful analysis and evaluation of the topic and related big data sets, and
  • be professional and engaging to the audience.
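As a starting point, two of the strategies above (linear regression and outlier detection) can be sketched in a few lines of Python using only the standard library. The data, the function names, and the two-standard-deviation threshold are assumptions chosen for illustration; a real project would use a much larger data set and probably a dedicated analysis tool.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up example data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]
slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # predicted score for 6 hours

# Made-up sensor readings with one suspicious value.
readings = [10, 11, 9, 10, 12, 10, 11, 48]
outliers = find_outliers(readings)
```

The regression gives you a prediction to discuss, and the outlier check tells you which data points deserve a closer look before you draw conclusions from them.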

For extra credit, you can try to apply appropriate crowdsourcing strategies and achieve useful results!

Classes that want to go the extra mile can create their own TED-Ed club at school, or invite the outside world to their TEDxKinda talk and organize an official TEDxYouth@{insert your city’s name} event, like TEDxYouth@Austin. This requires a little extra work, but changing the world is never easy!

Submission

Your submission will be in the form of a presentation, including speaker notes. The presentation you submit must:

  • utilize the data analysis strategies as described above and include information from these analyses explicitly,
  • provide insightful analysis and evaluation of the topic and related big data sets,
  • contain at least three informative and aesthetically pleasing data visualizations,
  • use key terminology from the glossary appropriately and as necessary, and
  • include effective speaker notes for asynchronous presentations (and evaluation).

When you are finished, you must submit a copy of your presentation (e.g., PowerPoint or Prezi), including speaker notes. If your submission includes multiple files, zip them together into one file. Be sure to name the file you upload appropriately.

Learning Goals

Over the course of this module and this project, you will learn to:

  • Relate the impact of computing to ubiquitous and large-scale data processing.
  • Outline the purpose of each of the following data processing tasks: collection, knowledge extraction, data storage, and analysis.
  • Define “data” and explain the characteristics of usable and useful data.
  • Extract structured information from unstructured data.
  • Apply the technique of crowdsourcing to a novel data collection problem.
  • Apply the basic features and functionality of modern relational databases.
  • Evaluate the trade-offs associated with the storage of structured and unstructured data.
  • Debate the implications of large-scale data storage and data persistence on privacy and utility, including the costs associated with each.
  • Appraise the trade-off of utility and confidence in descriptive, predictive, and prescriptive data analysis.
  • Perform traditional statistical hypothesis testing and exploratory data analysis.
  • Visually perform cluster analysis, modeling the dynamics of groups.
  • Visually perform anomaly/outlier/change detection and discuss the impact on potential inferences drawn from the data.
  • Synthesize a prediction through regression over known data.
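Cluster analysis, one of the goals above, can be sketched with a plain k-means loop in Python. This is one common clustering technique, not necessarily the one your tools will use. The points and starting centroids below are invented for illustration.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up 2D points that form two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
```

Plotting the points colored by cluster, with the centroids marked, is a simple way to turn this result into one of your data visualizations.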

Rubric

Criteria | Scoring Notes | Score
Data Analysis—Method 1
Performs one of the following data analysis strategies (association rule mining; classification; regression analysis; cluster analysis; or anomaly, outlier, and change detection) on a large data set accurately and appropriately.
Weighted: 20%
  • Students must use one of the big data analysis techniques learned throughout the course.
  • The technique must be applied accurately and appropriately.
  • The technique must apply to the topic or theme.
2 pts 0 pts
Data Analysis—Method 2
Performs a second strategy from the following list (association rule mining; classification; regression analysis; cluster analysis; or anomaly, outlier, and change detection) on a large data set accurately and appropriately.
Weighted: 20%
  • Students must use a different one of the big data analysis techniques learned throughout the course.
  • The technique must be applied accurately and appropriately.
  • The technique must apply to the topic or theme.
2 pts 0 pts
Data Analysis—Automated Summarization
Performs automated summarization strategies on a large data set accurately and appropriately.
Weighted: 10%
  • The technique must be applied accurately and appropriately.
  • The automated summarization strategy used must apply to the topic or theme.
1 pt 0 pts
Insight
Provides insightful analysis of the data set and makes clear connections to the TEDxKinda presentation topic.
Weighted: 20%
  • Students clearly understand the data collected.
  • Students draw conclusions relating to the topic and share them concisely throughout the presentation.
  • Students are able to answer simple questions about the topic or theme.
2 pts 0 pts
Quality
The presentation should be easily heard, easy to comprehend, and aesthetically pleasing.
Weighted: 10%
  • Speakers are easily heard.
  • The presentation remains on topic.
  • The videos and/or slides are aesthetically pleasing and easy to read.
1 pt 0 pts
Engagement
The presentation is engaging and interesting throughout.
Weighted: 10%
  • Speakers use appropriate visualizations.
  • Speakers keep the audience engaged during the presentation.
1 pt 0 pts
Presenter Notes
The presenter notes are complete and thoroughly describe the presentation.
Weighted: 10%
  • Presenter notes are submitted prior to the presentation.
  • Notes are complete and thorough, describing the full presentation.
1 pt 0 pts
TOTAL 10 pts