Among the college generation, everyone seems to find an excuse not to carry on a face-to-face conversation anymore. As we hide behind computer screens and Instagram likes, many areas of our lives are being affected by social media, even dating. With dating apps targeted at young adults, such as Tinder or Bumble, people are resorting to swiping to find their perfect match instead of meeting random people and sparking conversation.
- One of the data sets that I found compared opinions of online dating from 2005 to 2013, and it would be interesting to see how many of those who participated are now married to or seriously involved with someone they met online. It could also be possible to compare whether their opinions on online dating changed after using it for a certain period of time.
- The second data set that I found was a demographic breakdown of Tinder users, and it would interest me to see what each gender uses the app for (i.e., just to meet people, looking for a serious relationship, etc.). I would also compare the ages of users to see whether older users look for more serious relationships.
- With the Tinder demographic data set, I would compare the ages of people from different socioeconomic areas to see if certain areas or cities use the app more frequently, and why they might be using it. If there is a smaller area using dating apps, I would also look into why people there are seeking alternative routes of meeting people.
So, I found this really cool dataset about names. It includes all sorts of cool data divided into all sorts of subcategories. For example, you can see the top 1000 surnames in the United States, how many people have those surnames, what their nationality is, etc. (There are 127,073 Carpenters in the U.S.) You can also see things like the top 1000 female first names. (2.629% of women are named Mary.)
This started to bring a lot of questions to mind:
- Why do these names occur so often? For first names, what era were they popular in? Did a celebrity inspire these names?
- Is it possible to predict the next “popular” child’s name? In my generation, there are a lot of Marys, Sarahs, Ashleys, Matts, Wills, etc. What name will be popular in the next generation’s classroom?
- What nationality are these names typically associated with? Is there a surprising name associated with a nationality that it is not typically thought of as related to?
Here is the link: http://names.mongabay.com/data/
One dataset that looked particularly interesting to me was a dataset about the lottery. As we all know, the Powerball jackpot was a record-breaking $1.5 billion this past January, and everyone and their brother bought a ticket in hopes of winning. Below are three story ideas I would use to articulate a well-developed story based on this data.
- Which state has the highest number of tickets sold? Based on the numbers listed in the dataset it would be easy to identify which state sold the most tickets. However, there is a disclaimer at the bottom saying “I offer no warranty as to the accuracy of any of my presented information.” Any good journalist checks and double-checks his or her information.
- Which state did the winning ticket come from and how does that compare to the total number of tickets sold? It would be interesting to find the odds of winning the lottery and how they compare to everyday events. For instance, the New York Times reported that “The odds of being struck by lightning this year are one in 1.19 million, making it about 246 times as likely as winning the Powerball jackpot.”
- What is the average income of the state with the most sold tickets? It would be interesting to look at the wealth of the state with the most sold Powerball tickets to see if it had an effect on the number of tickets sold.
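As a rough check on the lightning comparison quoted above, the jackpot odds can be worked out by counting ticket combinations. This is only a sketch: it assumes the Powerball rules in effect in January 2016 (five white balls drawn from 69 plus one red ball from 26), which is not stated in the dataset, and it takes the one-in-1.19-million lightning figure straight from the quote.

```python
from math import comb

# Assumption: game format in effect since October 2015
# (5 white balls out of 69, plus 1 red Powerball out of 26).
WHITE_BALLS = 69
WHITE_PICKS = 5
RED_BALLS = 26

# Figure taken from the New York Times quote: one in 1.19 million.
LIGHTNING_ODDS = 1_190_000


def jackpot_combinations(white=WHITE_BALLS, picks=WHITE_PICKS, red=RED_BALLS):
    """Count the distinct tickets; exactly one of them wins the jackpot."""
    return comb(white, picks) * red


combos = jackpot_combinations()
ratio = combos / LIGHTNING_ODDS

print(f"Jackpot odds: 1 in {combos:,}")
print(f"A lightning strike is about {ratio:.0f} times as likely")
```

Under those assumptions the ratio comes out close to the 246 reported in the quote, which is a small example of the double-checking mentioned in the first story idea.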
I found three datasets (unfortunately I think you have to pay to see the actual data) that could complement each other nicely for several stories:
- It is possible that despite my search this topic has already been covered, but I think a simple, predominantly visual/graphic story – probably just for online and not print publications – focused on comparing the baseline numbers of how many applications are available to each of the three operating system categories would be useful and interesting to some. Coming from a large family that couldn’t be more divided on Apple v. Windows v. Android, I think a visual comparison could tip the scales in favor of one or another for people either new to the smartphone world, thinking about switching, or in my case seeking evidence for a lively dinner debate.
- Building off of the first pitch, it might be more informative to break down the apps available to each of the three operating systems by category. With a little additional local market research, this story could be made timely and relevant to a number of heavily populated areas. Using Nashville as an example, perhaps the headline could focus on live music and/or music production applications available to each operating system, and the story could be an interactive graphic, like those we viewed in class last week, that also presents the data for other categories like lifestyle apps or games. Or, you could choose one app category and put it in the context of numerous cities or regions. For instance, the number of restaurant and/or food production apps available to each operating system in the U.S.’s 15 largest cities. There are several ways to make the story timely or local; it would just be a matter of choosing the right application categories and applying them to relevant places.
- Gearing up for one of my first exams of the semester, I had a professor tell us that smart watches were absolutely prohibited during exam time. I’ve heard the standard no cell phone/electronics spiel hundreds of times throughout my time in the education system, but the topic of wearable technology really stuck out to me. I think a story (disclaimer: I have no way of knowing if this data is in the datasets linked, it could require additional datasets) that took the apps provided by Apple, Android and Windows and found the number of each that were compatible with their corresponding wearable technology would be informative and well-received. Additionally, the story could benefit from further categorization into the apps from which you can only view posts/stats and those that support input from the wearable technology itself. Then it would be much easier for consumers to see – through the app data – which devices were functional in the sense of interaction, and which ones were just expensive screens on which to view information without being able to contribute to it or update it.
The first dataset I found most interesting was one from Reddit. Especially after coming to college, caffeine intake has become a huge topic of discussion. This dataset shows the amount of caffeine (in mg) in each type of drink. A story idea that I would be interested in writing about would be to compare the amount of caffeine in coffee, soda, tea, energy drinks, and shots, as well as the nutritional value of each drink within those categories. If I could get more information, I would also like to add in which is most effective in the mornings versus at night. The dataset I used is: http://www.caffeineinformer.com/the-caffeine-database
The second dataset I found interesting was a dataset about datasets. I thought it would be interesting to do a story about how they gather all of their data and maybe discuss how you can use such a large amount of data and how it can benefit any readers. I might even be able to link to sources that used the data to give examples of how it can be used. I looked at: https://github.com/caesar0301/awesome-public-datasets
The third dataset I found was a public health dataset. Especially with the Zika virus, or Ebola before that, health concerns are always on everyone’s mind. It would be interesting to write a story comparing times when there was no serious health concern with times when there might have been some sensationalism happening with regard to the health risk. Using a dataset that shows international health statuses could provide an interesting story covering overall health of citizens, mortality rate, life expectancy rate, and anything else you can think of. The dataset I used was: http://stats.oecd.org/index.aspx?DataSetCode=HEALTH_STAT
After looking at a workout-based dataset on t-nation.com, I came to the following conclusions for my dataset-based stories. The dataset I analyzed was a workout based on Bulgarian training, and it detailed the reps, sets, and weights to “simplify” the workout.
Looking at the data led me to these three questions for a story:
- Does the workout really become simplified? Exactly who does the rep/set scheme apply to, and how can it be adjusted for each individual?
- What makes Bulgarian training effective? Is it the weights utilized with the rep/set scheme? Does the data provided show how the workout is effective?
- What is the un-simplified version? Does it provide different data (reps/weight/sets), and why must it be simplified?
Attached is the link
Threadbase gathers information and statistics about clothing and different brands. One specific dataset that I found interesting was on how no two brands have the same sizing, so for example, if you purchase a small from Topshop it will be sized and fit differently than a small from Abercrombie.
1) Why are the sizes so different? Shouldn’t there be (or isn’t there) a standard for sizing? I would interview people to uncover the problems with this. And does it really bother people when they have to buy a size that they really aren’t?
2) An article just uncovering the truth and pointing out that, for example, a J.Crew medium T-shirt is like an XL T-shirt from Zara in terms of sizing. That is crazy to me. I know I wouldn’t want to buy an XL T-shirt if I was really a medium. I’m not sure that people really realize the differences in this. I never really thought about it like this until I came across this data set. I think it would help people know where to shop and what stores fit them in the best way.
3) What is this like for women? This dataset was more geared towards men’s clothing, but I know there are the same discrepancies between sizing in women’s clothing. It would be helpful and interesting to uncover sizing issues for women as well, so they know what stores make sizing that will fit them the best.
This data set shows the median household income for the United States in terms of the Consumer Price Index (something used to measure price level changes in goods purchased by households). I’m interested in where the number is now and what has caused it to move as it has (considering that politicians are going to make a lot of economic claims between now and November).
2. and 3. https://research.stlouisfed.org/fred2/series/CIVPART
The labor force participation rate is the number of people who are employed or actively looking for work divided by the total number of Americans of working age. This set raises two major questions that data-based stories can try to answer:
1. Is the decline in labor force participation caused by changing demography, continually poor economic performance, some mixture of the two, or something else entirely?
2. Are the solutions being proposed by candidates (think of anything they say will “create jobs”) going to address the issue in a feasible manner?
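The participation-rate definition above can be written out as a small calculation. This is only a sketch of the arithmetic as described in this post; the function name and the round numbers are hypothetical and are not taken from the FRED series.

```python
def participation_rate(employed, looking_for_work, working_age_population):
    """Return the labor force participation rate as a percentage.

    Labor force = people employed + people actively looking for work.
    """
    if working_age_population <= 0:
        raise ValueError("working_age_population must be positive")
    labor_force = employed + looking_for_work
    return 100 * labor_force / working_age_population


# Hypothetical round numbers, for illustration only (not real data).
rate = participation_rate(
    employed=150_000_000,
    looking_for_work=8_000_000,
    working_age_population=250_000_000,
)
print(f"Participation rate: {rate:.1f}%")
```

Writing it this way makes the first question easier to frame: the rate can fall because the numerator shrinks (fewer people working or looking) or because the denominator grows (more people of working age, for instance retirees, who are not in the labor force).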
- Pensacola, Florida, my hometown (as well as two others’ in this class), was ranked in 2009 as the city with the worst water quality in America. Since then, our water provider, ECUA, has cleaned up its act (quite literally), but now that #flintwatercrisis is happening, I’m curious if there is a way to compare the atrocities. I’m hoping to find that Pensacola’s water quality issues are blandly mild salsa in comparison to Flint’s issues. The links below are sites I have found with helpful data.
Lead testing results for water sampled by residents
2. Another data set that hits close to home involves a rare heart disorder called Brugada Syndrome. My mom was diagnosed with it two years ago and since then it’s been a learning process. It is a fairly new discovery (1992), so there is very little conclusive data on the topic. It’s extremely rare in women, sometimes hereditary, and is the leading cause of sudden unexplained death #fun… So for this story idea, I’m looking to (somehow) gauge the likelihood of women being diagnosed vs. men, the ages when it was first discovered on ECGs, deaths caused by untreated cases, etc.
3. Thirdly, I have been looking into this topic for more reasons than just this assignment, as I am trying to find a job come May (p.s. my second major and the one I’m attempting to work in is Interior Design)… I am interested in the varying salaries of interior designers across different states. There are a lot of data sets on topics like this, so one won’t be difficult to find; the one I linked below seemed to be the most helpful. I think it would be interesting (just personally) to pick cities/states I am interested in living in and compare average salaries and hourly rates for designers in those areas specifically.
// c h a n d l e r m o o r e //
This may be a little morbid, but seeing as I am binge watching Criminal Minds, I thought this dataset would be appropriate.
The dataset that I found was the Texas Department of Criminal Justice Death Row Information.
It’s a comprehensive list of everyone executed in the state of Texas since 1982.
**spoiler alert, there have been 533 to date**
My story ideas:
- I thought everyone’s last words were really interesting. I read through some of them and although there were some outliers, most of the executed were remorseful. What
- Texas has THE highest execution rate among its 49 counterparts. Why? Are more heinous crimes committed in Texas? Are Texans just more “flip switchin” happy?
- Why are more men executed in Texas than women? There are very few females on the list and I’ve always been interested in why more men are executed.