Data Tells a Story: fighting fire with data; nature versus nurture; texting while driving

Pierce Quint

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: fighting fire with data; nature versus nurture; and texting when you’re not supposed to.

New York City Fights Fire with Data

The NYC Fire Department (FDNY) has been mining data to better predict where fires may start. Organizing data from city agencies into about 60 risk factors, its system creates lists of the buildings that are most at risk.

Before implementing its data mining system, the FDNY kept all of its inspection records on paper, stored at each firehouse, “so there was no way to share information among other fire companies, battalions or divisions.” Now it has a data warehouse that the whole department can access.

The system has also streamlined the FDNY’s inspection process. For about 350,000 buildings, each unit makes 26 different kinds of inspections, which were originally tracked on clipboards. Now the entire inspection workflow has been automated and statistics are collected.

Nature v nurture: research shows it’s both

Researchers at the University of Queensland in Australia have attempted to answer that age-old question: is it nature or nurture? Their answer: it’s both.

Teaming with scientists at the VU University of Amsterdam, the researchers reviewed nearly every twin study from around the world in the past 50 years, which involved more than 14.5 million pairs of twins and 17,804 traits. Their findings showed that “the variation for human traits and diseases is 49 per cent genetic, and 51 per cent due to environmental factors and/or measurement errors.”

It’s not just texting: drivers are Instagramming and video chatting too

While almost all 50 states have laws that explicitly ban texting while driving, a study from Braun Research has shown that drivers also email, check Facebook, and take photos while behind the wheel.

The market research firm conducted a phone survey of more than 2,000 smartphone-owning drivers aged 16 to 65. Sixty-one percent admitted to texting while driving, and 33 percent to emailing. Browsing the Internet and using Facebook are neck-and-neck at 28 and 27 percent, respectively.

At the same time, most people surveyed realized that using their phones while driving was dangerous. Only 27 percent thought they could take video safely, while over 78 percent considered texting or emailing while driving as a “very serious threat to safety.”

There still isn’t a lot of data on how dangerous it is to use a smartphone while driving. Very few people want to admit that their accidents were caused by being distracted by their phones, and as a result such accidents are underreported. However, a recent study by AAA showed, via footage captured by in-vehicle cameras, that 12 percent of teen drivers involved in accidents were using phones at the time.

Texting at the playground: New study shows how much time parents spend buried in smartphones

Good news, parents: you’re not using your phone in front of your kids as much as you think you are.

Researchers collected 33 hours of data at various North Seattle playgrounds, tracking how long caregivers used their phones as well as conducting interviews. Their findings showed that while 44 percent of parents and caregivers worry about excessive phone usage in front of their charges, the majority spent less than five percent of their time on their phones on the playground, while 41 percent didn’t use their phones at all.

Those who did use their phones did so for a short time: almost 30 percent of usages were less than 10 seconds long and more than half were less than one minute.

However, researchers also discovered that during phone usage, it was difficult for children to get their caregivers’ attention. In 18 cases the adult didn’t respond at all to the child, while in 70 instances of a child trying to get the attention of an adult not on the phone, the child usually got a quick reply.

Here’s how quickly interviewers decide whether or not to hire you

A study has shown that the majority of hirers make their decisions in the first 15 minutes of the interview.

Looking at 600 30-minute interviews with college and graduate students, the researchers also found that fewer than 5% of interviewers decided in the first minute, and about a quarter decided in the first five minutes, debunking the myth that most hiring decisions are made in five minutes or less.

Other revelations included: a longer interview is better in that it gives the applicant the chance to “break through subjective filters” and for the interviewer to get more information; interviewers who made small talk decided more quickly, perhaps relying on emotion and gut instinct; and those who interviewed in a more structured way took longer to decide.

Even order matters. Being fourth “seems to offer the best chance of having a substantive interview,” while being near the end hurts your chances.

[Photo via Flickr: “Pierce Quint,” CC BY 2.0 by Lane Pearman]

Data Tells a Story: confiding in the family pet; wasted health care; measuring vanity

Smaranda & Arpagic

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: confiding in the family pet; wasted health care; measuring vanity.

Children ‘more likely to confide in pets than siblings’

Using data from a 10-year study of 100 families in the United Kingdom, a Cambridge University psychiatry researcher has found that children facing “emotional difficulties,” such as “bereavement, divorce, instability and illness” place higher importance on their pets for support than on their siblings, perhaps because children feel “their pets are not judging them.”

The data also showed that in the U.S. “children are more likely to live with a pet than their natural father,” specifically, “about two thirds of children live with their father, while about four in five of families with school-age children have a pet.” Moreover, having a pet “encouraged more social behaviour, such as ‘helping, sharing, and co-operating’.”


A study with more than a million Medicare patients showed that a “huge proportion” had received low-value care — in other words, care that was simply a waste.

Researchers took a look at how often patients received “one of twenty-six tests or treatments that scientific and professional organizations have consistently determined to have no benefit or to be outright harmful,” such as an EEG for an “uncomplicated headache,” or a CT or MRI scan for “low-back pain in patients without any signs of a neurological problem.” In one year, 25 to 42 percent of Medicare patients “received at least one of the twenty-six useless tests and treatments.”

The data also shows that almost every family in the U.S. “has been subject to overtesting and overtreatment in one form or another,” and that millions are receiving useless drugs and operations, along with scans and tests that aren’t just ineffective but might be harmful as well. As a result, “costs appear to take thousands of dollars out of the paychecks of every household each year,” resulting in reduced spending on real needs such as “food, clothing, education, and shelter.”

Israeli big data teaches farmers a cup of joe means better crops

Big data is helping farmers more effectively and efficiently manage their farms.

One platform places sensors on plant life, livestock, and farming equipment, and collects data on factors such as “environment, temperature and humidity, how much animals are eating, activity among animals, soil conditions for plants, the level of pests in an area, and much more.” Next, the data is analyzed and “compared to guidelines for ideal production under the circumstances,” and suggestions for improvement are sent back to the farmers.

In addition to raw data, the platform also considers cultural issues and how people work. For instance, it found that in Serbia, farmers who drank coffee first thing in the morning were more productive than those who didn’t, and so recommended that farmers ingest caffeine before setting out to work.

Cuba Turns To Analytics, Big Data To Help Tourism

Every year Cuba receives around 2.8 million visitors, half from Canada. However, once Americans are free to travel to the City of Columns, that number is expected to increase by two million in the first year alone. To help accommodate such an influx, Cuba is turning to big data to help improve its tourist industry and infrastructure in general.

The Cuban government is using a platform to monitor all of their hotel and tourist establishments; social media in the most tourist-y locations; and mentions of “Cuba” on social media. The platform also categorizes and segments the information.

However, the biggest challenge the country faces is shoddy Internet connections and availability, with an Internet penetration of just five percent. While the U.S. is planning to authorize the “sale of consumer communications devices, related software, applications, hardware, and services” to Cuba, the Cuban government still “needs to lift its own import bans.”

“Vanity capital” is the new metric for narcissism, and analysts say its value worldwide is greater than Germany’s GDP

Bank of America Merrill Lynch is trying to quantify vanity.

Specifically, the bank has put out a report putting a price tag “on the amount we spend globally on products and services that enhance our appearance or prestige.” Defining exactly what those products and services are is tricky: jewelry, art, and a private jet are prestigious for sure, but smartphones, wedding rings, and Ivy League educations may seem more like necessities. Still, these numbers in aggregate certainly tell interesting tales.

For instance, China tops all countries in growth of both vanity and non-vanity capital from 2009 to 2014, reflecting growth in per capita income. Changes in gender and spending habits are another factor: women’s economic standing around the world is improving, “which allows them to buy more stuff,” while men in general have become more fashion conscious, purchasing “man-bags” and the like.

Other factors include social media, which “makes narcissism and envy ubiquitous,” and e-commerce, which increases shopping options.

[Photo via Flickr: “Smaranda & Arpagic,” CC BY 2.0 by Cristian Bortes]

Data Tells a Story: humblebragging; Nepal; healthy food for healthy brains


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: the ineffectiveness of humblebragging; data over nepotism to help Nepal; and healthy food for a healthy brain.

Harvard study: Humblebragging doesn’t work

A recent paper from the Harvard Business School proves something we might already know: humblebragging is the worst.

The series of studies showed that people generally dislike bragging disguised as self-deprecation or complaints, and even prefer straightforward bragging or complaining over humblebragging.

In one study, researchers had raters evaluate a dataset of over 700 tweets that had been categorized as humblebrags. The results showed that humblebragging was negatively correlated with likability, perceived sincerity, and perceived competence.

In another study, 122 college students were asked a typical job interview question, “What is your biggest weakness?” Those who answered truthfully (“Sometimes I overreact to situations”) were seen as more hireable than those who humblebragged (“I’m too demanding when it comes to fairness”).

Humblebraggers were also seen as less sincere, and while true braggers and complainers weren’t well-liked, they were still seen as more sincere than humblebraggers.

How Big Data Could Shape the American Workforce

Places like the New York City Labor Market Information Service, or LMIS, and the Grace Institute, a not-for-profit that provides free job training for women, are “taking a data-driven approach to workforce development.”

Using publicly available data, LMIS seeks patterns in job listings. For example, of 2,000 nursing positions, LMIS found “that half of them require a bachelors’ degree, showing the steady march of ‘credential creep,’ requiring higher and higher levels of education than were previously needed for a given job.”

The Grace Institute performed a six-month deep dive into labor statistics, looking for “a ‘sweet spot’ that would combine Grace’s success in training for entry-level administrative jobs with a growing sector that offered room for people in those entry-level jobs to grow too.” They found such a field — the healthcare sector — and used LMIS data to design a more targeted training program.

Use Data, Not Nepotism, to Deliver Aid in Nepal

According to Ravi Kumar, a co-founder of Code for Nepal, Nepal has had a long history of nepotism, in which “people with connections and power have access to most of the resources, especially during times of crisis.” Kumar argues that the delivery of aid, especially now in the aftermath of a devastating earthquake, should instead “be driven by the evidence on the ground and socio-economic data.”

For example, the data shows “that villages outside the Kathmandu Valley need aid the most,” but they are probably not getting enough because the damage in such remote areas is probably being underreported. For instance, according to Nepal’s 2011 census, the district of Nuwakot has about 59,000 households. About 45,000 houses have been reported as destroyed or damaged, which leaves only about one in four intact, and yet only 1,300 injuries have been reported. As a result, Nuwakot has probably received far less help than it actually needs.
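The Nuwakot numbers are easy to check for yourself. Here is a minimal Python sketch that uses only the figures quoted above; the variable names are ours, and it assumes one house per household so that the two counts can be compared, which is a simplification.

```python
# Figures quoted in the post for the district of Nuwakot.
households = 59_000        # 2011 census, approximate
houses_damaged = 45_000    # reported as destroyed or damaged
injuries_reported = 1_300

# Assumption: one house per household, so the counts are comparable.
damaged_share = houses_damaged / households
intact_share = 1 - damaged_share
injuries_per_100_damaged = 100 * injuries_reported / houses_damaged

print(f"damaged or destroyed: {damaged_share:.0%}")   # 76%
print(f"left intact: {intact_share:.0%}")             # 24%
print(f"injuries per 100 damaged houses: {injuries_per_100_damaged:.1f}")  # 2.9
```

Fewer than three reported injuries for every hundred damaged houses is the kind of mismatch that suggests underreporting rather than good luck.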

Code for Nepal offers “an interactive map of the effects of Nepal’s earthquake,” using “district-level data to show injury tolls, death counts, and houses damaged to determine where aid is needed the most.”

New study shows that people stop listening to new music at 33

A new study using data from music streaming site Spotify and music data site Echo Nest shows that people generally stop listening to new music at the ripe old age of 33.

The study found that teens listen “almost exclusively to top Billboard hits”; that those in their 20s begin exploring indie music; and that “tastes level off” once people are in their early to mid-30s, also known as “taste freeze.”

The data also surfaced gender differences. Women of all ages “are more likely to be streaming popular artists than men are,” perhaps because the most popular songs tend to be from female solo vocalists, and “the decline in popular music streaming is much steeper for men.” For instance, women “show a slow and steady decline in pop music listening from 13-49, while men drop precipitously starting from their teens until their early 30s.”

Brain food is real: Study shows how diet affects memory as we age

Over the course of five years, researchers followed almost 28,000 people “aged 55 or older from 40 different countries,” and found that those “who consumed the most nutritious food had a nearly 25 percent reduction in the risk of mental decline compared [to] those with the least healthy diets.”

Participants self-reported the kinds of foods they ate and were tested on memory and thinking ability at the beginning of the study, after two years, and at the end. Data points included 10 different aspects of cognition; the ability to remember and recall a list of objects; arithmetic abilities; and attention span.

Over the five years of the study, almost 4,700 people “suffered a decline in thinking and memory,” and “those consuming the most nutritious diets were 24 percent less likely to have cognitive declines compared to people consuming the least healthy foods.”

The link between healthy diet and healthy brain activity may have to do with nutritious food’s positive effects on cardiovascular risk factors and cardiovascular disease, which “is an important mechanism for reducing the risk of cognitive decline,” says the study’s lead author. Another theory is that a healthy diet may reduce cellular inflammation, and that “If you’re eating well, odds are your brain is less stressed.” And as we all know, a less stressed brain is a happier one.

[Photo via Flickr: “hy-vee,” CC BY 2.0 by Dean Hochman]

Data Tells a Story: the dangers of air pollution, too much TV, and sitting for too long

She has since stopped watching TV

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: the dangers of air pollution, too much TV, and sitting for too long.

China To Use Big Data To Rate Citizens In New ‘Social Credit System’

Think of it as a credit score on steroids.

China hopes to use data such as financial standing, criminal record, and social media behavior to give a kind of “social credit score” to every citizen, which would hold everyone “accountable for financial decisions as well as moral choices.”

The goals of this numeric rating system would be to build “a harmonious Socialist society”; strengthen “the sincerity consciousness of the members of society”; and promote “‘socialist core values’ such as patriotism, respecting the elderly, working hard and avoiding extravagant consumption.”

Alibaba is already doing something similar with their users’ shopping data, although that focuses on loan, rather than moral, worthiness.

Olympics study links Chinese pollution to lower birth weights

A recent study sampling data from 83,672 births in Beijing found that those babies who spent their eighth month in the womb during the 2008 Olympics (early August through late September) were “23 grams larger at birth, compared to the same period in 2007 and 2009.”

This is because, “as a condition of hosting the Olympics,” Beijing reduced pollution levels between 18% and 59% by closing factories and power plants, seeding clouds, and restricting traffic. “Air pollution can result in lower birth weight,” according to one of the researchers, and while that doesn’t necessarily mean a baby is less healthy, low birth weight is related to some diseases later in life and higher risk during the baby’s first month.

The significant link was seen only in the eighth month of pregnancy when infants “enjoy a growth spurt.”

Air Pollution Tied to Brain Aging

A separate study with 943 men and women over 60 showed a link between air pollution and premature aging of the brain.

Researchers examined data from the participants’ MRI exams, how close they lived to major highways, and satellite data that measured a type of pollution “that easily enters the lungs and bloodstream.”

Researchers found that those exposed to the highest amount of this pollution “had a 46 percent increased risk for covert brain infarcts, the brain damage commonly called ‘silent strokes,’” and “that each additional two micrograms per cubic meter increase in [the pollutant] was linked to a decrease in cerebral brain volume equivalent to about one year of natural aging.”

Study makes surprising link between TV time and childhood obesity

A study run by the U.S. Department of Education found that of over 10,000 kindergarteners around the country, those who watched more than an hour of television a day were 52% more likely to be overweight than those who watched less.

The data points researchers took into account before and after the year-long study were BMI, amount of TV time, and amount of computer time. They found that kids who watched an hour or more of TV per day were “39% more likely to become overweight between kindergarten and first grade,” and “86% more likely to become obese during that time.” Interestingly, they found no correlation between computer time and BMI.

The American Academy of Pediatrics currently recommends less than two hours of total screen time (that’s computers and TV) per day, which, at least as per this study, seems like too much.

Study finds we think better on our feet, literally

Standing desks aren’t just for grown-ups anymore.

While previous studies have shown that standing desks can help reduce childhood obesity (those standing burned 15 percent more calories than those sitting), researchers have also found that being on their feet helped kids have at least 12 percent greater “on-task engagement” in classrooms than those who stayed on their rears.

The study involved 300 kids in second through fourth grade. The researchers observed the students over the course of a school year, looking for data points that fell into “on-task engagement” such as answering a question, raising their hand, and participating in discussion, as well as “off-task” behavior like talking when they weren’t supposed to.

This increased percentage of on-task engagement equals “an extra seven minutes per hour of engaged instruction time.”
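The seven-minute figure lines up with simple arithmetic on the 12 percent gain. A quick Python sketch, assuming the gain applies evenly across a full 60-minute hour (our assumption, not necessarily how the researchers derived it):

```python
MINUTES_PER_HOUR = 60
engagement_gain = 0.12  # "at least 12 percent greater" on-task engagement

# Assumption: the gain applies uniformly to a full hour of instruction.
extra_minutes = engagement_gain * MINUTES_PER_HOUR

print(f"{extra_minutes:.1f} extra engaged minutes per hour")  # 7.2
```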

[Photo via Flickr: “She has since stopped watching TV,” CC BY 2.0 by Nathan Walker]

Data Tells a Story: when college is worth it; creating new foods; protecting our forests

Forest near Vřesina

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: when college is worth it; creating new foods; and protecting our forests.

College is worth it if you have these six experiences

A study from the Gallup-Purdue Index has shown that “six elements of emotional support and experiential learning in college” are correlated with “long-term career and life success.”

The study involved 30,000 US college graduates and examined “the degree to which graduates were engaged in their work and thriving in their purpose, social, financial, community, and physical well-being,” all predictors of outcomes such as worker productivity, absenteeism, healthcare cost burden, and many more.

The Gallup research found that 25% of US college graduates “fail to thrive in their overall careers and lives,” and that same percentage miss out on all six of the elements correlated with success.

The six elements are a professor who made them “excited about learning”; professors who cared about them as a person; a supportive mentor; the chance to work on a long-term project; an internship; and high involvement in extra-curricular activities.

Can We Find Meaning In Our Wearable Data? Exist Thinks So

Sure, our wearables tell us how many steps we’ve taken, how many calories burned, how much sleep we’ve gotten (or how little), but what do we do with all this data?

One platform uses data from wearables, apps, and services to find patterns and trends that users can act upon. Self-reported data such as mood can be measured against other data like number of steps, amount of sleep, or what music you’re listening to, which might show, for example, correlation between mood and physical activity, quality of sleep, or type of music.

The platform also recognizes unusual activity — a change in bedtime is one example — that the user might not have noticed.
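To get a feel for what measuring mood against steps could look like, here is a small Python sketch of a Pearson correlation. The week of numbers is made up purely for illustration, the function and variable names are ours, and nothing here is taken from the platform itself.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: one made-up week of step counts and self-reported mood (1 to 5).
steps = [3000, 5000, 8000, 10000, 12000, 4000, 9000]
mood = [2, 3, 4, 4, 5, 2, 4]

r = pearson(steps, mood)
print(f"correlation between steps and mood: {r:.2f}")
```

A value near 1 would mean mood and step count tend to rise together in this sample; it would not show that walking more causes a better mood.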

From Big Data to Big Bets on Food Science

While Uber uses big data to optimize transportation, Airbnb to streamline lodging, and big pharma to discover new drugs, other startups are using big data to create new foods.

One small team of data scientists is building “a massive database of all known plant proteins,” which could amount to as many as 18 billion, with the idea of creating “new food sources for an expanding global population—sources that are cheaper, safer, and healthier than what we have today.”

With this database, the researchers can target their efforts by predicting how proteins will interact; identifying “combinations likely to produce enjoyable foods”; and pinpointing “what will produce the right tastes, textures, and colors.”

Predicting Tropical Deforestation With Big Data

Data scientists and rainforest conservationists are teaming up to use big data to predict instances of illegal logging, which endangers wildlife and makes climate change worse.

The team is using deep learning — a type of artificial intelligence “that involves processing tremendous amounts of data to solve problems in a way that roughly mimics the human brain” — to analyze “satellite images of tens of millions of acres of forest…to discover patterns that identify indicators of deforestation risk,” such as “road-building in previously undisturbed areas.”

With such predictive data, authorities can be alerted even before the tree cutting begins.

3 Cities Using Open Data in Creative Ways to Solve Problems

Three cities are taking advantage of open data by using it to make life better for their residents.

In New Orleans, both city officials and citizens “can evaluate the city’s progress in confronting urban blight” through the BlightSTAT program. The mayor’s office also uses the BlightSTAT data to make decisions about the fight against blight, which has decreased by 30 percent since the program began in 2010.

San Francisco has partnered with Yelp to make restaurants’ health inspection data more readily available (as we discussed in an earlier post), which not only lets city residents know about food hazards but also has the potential to “shame repeat-offender restaurants into complying with health standards.”

Finally, Louisville has partnered with a medical service provider to plant GPS trackers in inhalers. The trackers “measure when and where in the city people use them most,” and match these “hotspots of inhaler-user with air quality data” so that public officials can better target their interventions.

[Photo via Flickr: “Forest near Vřesina,” CC BY 2.0 by Jiri Brozovsky]

Data Tells a Story: a healthy world; cyberattacks; where dogs come from

Labs & a Portie

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: trying to make the world healthier; factors in cyberattacks; and where dogs come from.

Turning to Big, Big Data to See What Ails the World

The Global Burden of Disease study seeks to compile and analyze a huge amount of data to try to “understand what sickens us and kills us in every country in the world.”

Getting the data hasn’t been easy. Researchers have had to comb through birth and death records, hospital files, and household surveys, pulling together data points about everything from acne to liver cancer to non-poisonous animal bites.

The study was innovative because it measured not just causes of death but causes of illness and years lost to disability. Different types of health losses were given a value. For instance, severe depression was weighted at .655, or six and a half years of life lost for each decade, and researchers were surprised to find that in 2010, major depression caused more total health loss than TB. They also found that osteoarthritis caused more years of life lost than natural disasters, and neck pain caused more than cancer.
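The weighting works like a multiplier on time lived with a condition. A minimal Python sketch of that idea, using the .655 weight quoted above; the function name is ours, and this per-person calculation is a simplification rather than the study’s full methodology:

```python
def years_lost_to_disability(weight: float, years_lived: float) -> float:
    """Healthy years lost while living with a condition of the given weight."""
    return weight * years_lived

# Severe depression, weighted at 0.655, over one decade.
lost = years_lost_to_disability(0.655, 10)
print(f"{lost:.2f} years lost per decade")  # 6.55
```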

Countries have been able to use this kind of data to change their spending in health resources and other areas. Iran, for one, found that “traffic injury was its leading preventable cause of health loss in 2003, and put money into building new roads and retraining police.”

Gestational Diabetes Is Linked to Autism Risk

Researchers at Kaiser Permanente Southern California have found that “children born to mothers who developed gestational diabetes before 26 weeks of pregnancy were at a 63 percent increased risk of being diagnosed with autism spectrum disorder,” although that risk dropped to 42 percent when controlling for such factors as maternal age, household income, and the mother’s pre-existing conditions.

The study looked at more than 320,000 babies born between 1995 and 2009, and the results have two applications. The first is the importance of early prenatal care, since the link between gestational diabetes and autism could mean “a fetus’ early exposure to uncontrolled high blood sugar may somehow affect brain development.” The second is extra vigilance around the infant’s developmental milestones for mothers diagnosed with gestational diabetes before 26 weeks.

Ted Cruz Campaign Using Advanced Data-Mining to Target Voters

Be slightly worried: Ted Cruz has a data team, and they’re using data “not just to find out what issues matter to you, but why you care about them.”

However, this isn’t really new. Political candidates have been gathering the data dust of your digital footprint since Bill Clinton’s campaign in 1992. By 1996, says one campaign-data scientist, most campaigns were online, and by 2000, politicians and their teams were using data analytics to tailor their campaigns.

Cruz’s team will be gathering data via surveys and personality profiles to predict how voters might respond to their messaging.

User mistakes aid most cyber attacks, Verizon and Symantec studies show

Two studies have shown that “the vast majority of hacking attacks are successful because employees click on links in tainted emails, companies fail to apply available patches to known software flaws, or technicians do not configure systems properly.”

Verizon found that more than two-thirds of “the 290 electronic espionage cases it learned about in 2014 involved phishing,” or trick emails. With so many people clicking on tainted links or attachments, hackers only need to send phishing emails to 10 employees to get inside “corporate gates” 90 percent of the time.
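The 10-emails figure is a nice example of how small odds add up. Here is a Python sketch of the arithmetic, assuming every recipient clicks independently and with the same probability; that assumption and the function names are ours, not the studies’.

```python
def p_at_least_one(p_click: float, recipients: int) -> float:
    """Chance that at least one of the recipients clicks."""
    return 1 - (1 - p_click) ** recipients

def click_rate_needed(target: float, recipients: int) -> float:
    """Per-person click rate that yields the target chance of one click."""
    return 1 - (1 - target) ** (1 / recipients)

# What per-person click rate makes 10 emails succeed 90 percent of the time?
rate = click_rate_needed(0.90, 10)
print(f"per-person click rate needed: {rate:.1%}")  # 20.6%
print(f"check: {p_at_least_one(rate, 10):.0%}")      # 90%
```

Under those assumptions, only about one employee in five has to fall for the email for a batch of 10 to get the attacker in nine times out of ten.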

Solving the Mystery of Dog Domestication

Archaeologists have long been trying to figure out where domesticated dogs come from.

A study from 1997 which analyzed DNA from more than 300 modern dogs and wolves determined that “dogs may have been domesticated as early as 135,000 years ago,” while later studies “argued for a more recent origin—less than 30,000 years ago—but the exact time and location remained unclear.”

More recently, a researcher saw a pattern in his own dog data: breeds from East Asia were more “genetically diverse,” an indicator of “more ancient origins.” After a genetic analysis of more than 1,500 dogs, he concluded that “the animals had likely arisen in a region south of China’s Yangtze River less than 16,300 years ago—a time when humans were transitioning from hunting and gathering to rice farming.”

However, another researcher argued that “looking at modern DNA is a mistake,” and conducted his own study focusing on ancient DNA. After comparing the DNA of “18 dog- and wolflike bones from Eurasia and America to that of modern dogs and wolves from around the world,” he found that “the DNA of today’s dogs most closely matched that of ancient wolves.”

Now a new study brings in new technology: a computer that can take thousands of measurements “beyond mere length and width to determine actual shapes” such as eye circlets and “the jut and jag of every tooth.” While ancient DNA can tell you where the animal came from, one of the researchers says, only such “morphometric data can show you domestication in progress.”

[Photo via Flickr: “Labs & a Portie,” CC BY 2.0 by OakleyOriginals]

Data Tells a Story: resumes and gender; fighting cancer with data; the science behind happiness


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: gender differences in resumes; fighting cancer with data; and some science behind happiness.

The resume gap: Are different gender styles contributing to tech’s dismal diversity?

A machine learning expert used big data to try to understand the differences between men’s and women’s tech resumes, and perhaps to explain why there’s such a large gender gap in the tech industry.

After analyzing 1,100 resumes — 512 from men and 588 from women — the biggest difference she found was that while women’s resumes were longer, they had fewer concrete details.

Only 19% of women’s resumes fit on one page, as opposed to 61% of men’s. However, men provided more specific content: 91% of men included “bulleted verb statements that describe their achievements on the job, but only 36% of women do.”

Decoding and Defeating Cancer with Data Science

A group of UC Santa Cruz researchers is fighting cancer not with medicine but with data.

The data scientists are analyzing the molecular data of tumors to try to understand why treatment isn’t working. For example, tumors have traditionally been classified and treated “based on where in the body they originated.” The researchers, however, are using molecular data to “reveal similarities between tumors in different parts of the body that were not apparent before.”

In a study of 12 tumor types, the researchers’ data suggests that “as many as one in five may need re-classification.” Moreover, the misclassified tumors “were far more likely to be unresponsive to treatment.”

In addition, the researchers “are developing a social media-inspired platform” that links patients, doctors, researchers, and data from biopsy samples in real time. This will allow computer scientists and physicians to share their discoveries immediately, “instead of waiting months or years for the results to appear in peer-reviewed journals.”

Out With The Caraway, In With The Ginger: 50 Years Of American Spice Consumption

Armed with data from the website of the U.S. Department of Agriculture’s Economic Research Service, FiveThirtyEight reporter Anna Maria Barry-Jester wanted to understand some spicy stories.

The data details “the availability of various spices every year from 1966 to 2012,” which can serve “as a proxy for consumption.” Over the past 50 years, spice consumption overall has nearly tripled, but the popularity of individual spices has been more varied.

For instance, the 1980s and ’90s saw a 2,000 to 3,000 percent increase in production of chile peppers. A spice shop owner saw the same popularity in her business and “couldn’t figure out the obsession until one day a customer walked in with a Rick Bayless cookbook,” which focuses on Mexican cuisine.

Another example focuses on turmeric. Turmeric consumption rose and fell between 1966 and 2004, but after 2011 it jumped nearly 70%. Multiple recent studies have shown that turmeric helps prevent diabetes, is as effective as ibuprofen, and has “shown potential for slowing the damage from neurological diseases.”

Using Biometric Data to Make Simple Objects Come to Life

A project on display in Dublin shows how objects can come “alive” with a human touch: a sugar bowl rises and falls with your breath rate; a tea bag “bobs up and down to the beat of your heart”; and a record player speeds up or slows down “based on skin conductance.”

The project centers on the idea of “using the electricity naturally surging through our bodies to transfer electronic signals to inanimate objects,” which could be extended from sugar bowls and tea bags to your car settings changing “depending on who’s gripping the wheel, or transferring files from your computer to your phone with your finger-tip.”

The Science Of Why You Should Spend Your Money On Experiences, Not Things

You might think a physical object that lasts a long time would bring more happiness than a fleeting experience — and you would be wrong.

A study from Cornell University showed that something called adaptation makes us less happy about objects. People’s self-reported data showed that while happiness for material and experiential purchases started out the same, over time their “satisfaction with the things they bought went down, whereas their satisfaction with experiences they spent money on went up.”

This is because people adapt to material purchases. In other words, that new TV fades into the background and “becomes part of the new normal,” while an experience like rock climbing or exploring a foreign city becomes part of who we are.

Another study showed that even some negative experiences bring happiness later, specifically after people “have the chance to talk about it.” Something that may have once been stressful becomes “a funny story to tell at a party or [is] looked back on as an invaluable character-building experience.”

Yet another study showed that people are less likely to negatively compare experiences than material purchases — it’s easier to compare and feel competitive about carats in a ring or shoe brands than it is about personal experiences.

[Photo via Flickr: “Happiness,” CC BY 2.0 by Moyan Brenn]

Data Tells a Story: Don Draper-bot; the importance of family time; the truth about apples


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: a Don Draper-bot; the importance of family time; and the truth about apples.

Computers can now get you to buy more stuff by tapping into your emotions

One company wants to use data to take the human writer out of copywriter.

It took eight years for the company to build up a database of 500,000 phrases commonly used in advertisements. They then categorized the phrases by emotional content (positive, negative, and neutral) and subdivided those into more specific emotions such as gratitude and excitement.

To get a phrase, the customer uses a program to enter product information, message requirements, and a particular feeling from a “wheel of emotions” (although the company suggests letting the program pick it).

The program also tests its messages against human-written ones, and, according to the company, “the messages chosen by the computer have a 95% probability for performing better than something written by a human.”

Study Has ‘Unexpected’ Findings on How Family Time Affects Kids

A study from the University of Toronto and Bowling Green State University has shown that “spending just six hours a week with their moms reduced the likelihood of negative behavior in teens.”

The researchers analyzed data from the Panel Study of Income Dynamics Child Development Supplement, specifically the amount of time mothers spent with their children in “the 3-year-old to 11-year-old age bracket and the 12-year-old to 18-year-old age bracket.”

They found that while the amount of maternal time didn’t matter for most “offspring behaviors, emotions or academics” in either childhood or adolescence, it did matter for the number of delinquent behaviors among adolescents. Teens who spent as little as six hours a week with their mothers (less than an hour per day) had fewer delinquent behaviors.

Why do private school students do better? It’s not their education, study finds

A Statistics Canada study of students from six Canadian provinces found that private school students did better academically not because of school resources and practices, but because of “socio-economic factors and peers who tend to have university-educated parents.”

Looking at about 7,000 students across almost 1,180 schools, researchers found several differences:

Compared with public school students, higher percentages of private school students lived in two-parent families with both biological parents; their total parental income was higher; and they tended to live in homes with more books and computers.

New study investigates the link between family income and brain development

In a study of 1,099 participants, researchers at The Saban Research Institute of Children’s Hospital Los Angeles and Columbia University Medical Center have found that “among children from the lowest-income families, small differences in income were associated with relatively large differences” in brain surface area for regions “associated with skills important for academic success.”

The researchers found that “improved performance in cognitive skills was also associated with higher income.” They also point out that “family income is linked to nutrition, health care, schools, play areas and even air quality – all factors that can contribute to brain development,” and suggest that “wider access to resources likely afforded by the more affluent may lead to differences in a child’s brain structure.”

Daily Apple Not Associated With Reduced Doctor Use

A University of Michigan study discovered the unthinkable: an apple a day does not keep the doctor away.

Researchers reviewed data from 8,728 people and found that 9% were daily apple eaters. These daily apple eaters were also less likely to smoke, had higher average education levels, were more likely “to be from racial and ethnic minorities,” and were less likely to use prescription medications. However, they weren’t “any less likely to have seen a doctor more than once during the past year.”

[Photo via Flickr: “Camera Test Apple,” CC BY 2.0 by Kirinohana]

Data Tells a Story: Cricket World Cup; personalizing beauty; the story behind Ikea


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: making cricket more interesting; personalizing beauty; and how Ikea took over the world.

Science: Sunny days make British office workers sad

Contrary to what many might think, happiness is not correlated with sunshine, at least according to a study from the University of Westminster in England.

Researchers compared daily weather patterns from 1991 to 2009 with wellbeing data from the British Household Panel Survey, which began in 1991, and found there “was no significant variation in reported happiness between sunny days and cloudy ones.”

What they did find was a “small but statistically significant correlation between unhappiness at work and sunny days.” In other words, on a nice day people would rather be outside than stuck in the office.

How the ICC is using data analytics to make the Cricket World Cup more interesting

The International Cricket Council (ICC) is mining 40 years’ worth of Cricket World Cup data “to produce insights that enhance the viewer’s experience,” and to “improve team performance and strategies out on the field.”

After analyzing “statistics on scores, player performance, player profiles and more,” the ICC came away with several findings. For instance, they discovered that the team with the best-equipped bowling line-up, and not necessarily “the power-packed batting [line-up],” might be the team most likely to win the World Cup.

They also found the characteristics that make a skilled player, which explains why countries not usually thought of as dominant cricket nations, such as the UAE and Ireland, perform well: they “were found to have strong performing players with similar characteristics.”

Why VCs Are Hungry for Beauty Data Startups

A beauty start-up that began as a give-away site is now using its data to power personalization tools and help brands give their consumers better experiences.

As part of their give-away program, the start-up invited consumers to periodically answer questions, of which there were 1,300. The company has collected over 10 million data points and is using this data for customer personalization and hypertargeting.

For example, they found that almost half of acne sufferers want “overnight results”; a quarter of people are loyal to certain perfume brands and aren’t eager to try new ones; and 37% are “most influenced by bloggers and editors as to what skincare products to purchase.”

Can data analysis reveal the most bigoted corners of Reddit?

When a post on Reddit asked Redditors to nominate the most “toxic communities” on the site, Ben Bell, a data scientist at a text-analytics start-up, thought there should be an objective way to measure toxicity.

To do so, Bell pulled a sample of comments from the top 250 subreddits and from the forums mentioned in the toxicity thread. He used sentiment analysis to code each comment as positive, negative, or neutral, and human annotators then examined the negative comments to determine their toxicity.

He found that in some subreddits, “the community is proactive enough at self-policing that the average score for a bigoted comment is negative,” and “at the other end of the spectrum are those communities which seem to deliberately encourage bigotry.” He also found another kind of toxicity, that which is directed outwards — in other words, a subreddit that “focuses on highlighting bad content around the rest of Reddit.”
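The workflow described above (automatic sentiment coding, human review of the negative comments, then a look at how each community votes on the flagged ones) can be sketched roughly as follows. This is only an illustrative sketch, not Bell's actual method: the word lists, the sample comments, the field names, and the stand-in for the human annotation step are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon; a real study would use a trained sentiment model.
NEGATIVE_WORDS = {"hate", "stupid", "awful"}
POSITIVE_WORDS = {"love", "great", "thanks"}


def sentiment(text):
    """Label a comment positive, negative, or neutral by counting lexicon hits."""
    words = text.lower().split()
    balance = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


def negative_comments(comments):
    """Keep only the comments flagged negative, to be queued for human review."""
    return [c for c in comments if sentiment(c["text"]) == "negative"]


def average_bigoted_score(annotated):
    """Average vote score of comments marked bigoted, grouped by subreddit."""
    scores = defaultdict(list)
    for c in annotated:
        if c["bigoted"]:
            scores[c["subreddit"]].append(c["score"])
    return {sub: mean(vals) for sub, vals in scores.items()}


# Made-up sample data, for illustration only.
comments = [
    {"subreddit": "a", "text": "I hate those people", "score": -4},
    {"subreddit": "a", "text": "thanks, great post", "score": 10},
    {"subreddit": "b", "text": "they are stupid and awful", "score": 7},
    {"subreddit": "b", "text": "what time is it", "score": 1},
]

queue = negative_comments(comments)
# Stand-in for the human annotation step: every queued comment is marked bigoted.
annotated = [dict(c, bigoted=True) for c in queue]
averages = average_bigoted_score(annotated)
```

In this toy data, a negative average for a subreddit would correspond to a community that votes flagged comments down, and a positive average to one that rewards them.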

How Ikea took over the world

It’s not just the meatballs.

Market research is “at the heart of Ikea’s expansion.” For example, the furniture company gathered data about morning routines from over 8,000 people in eight cities. They found that people from Shanghai were “fastest out the door” (56 minutes) while those from Mumbai were the most leisurely, clocking in at two and a half hours before leaving. Those most likely to work in the bathroom? Stockholmers and New Yorkers.

What researchers also found was that regardless of city, women spend more time than men picking out their outfits, “a process many find stressful.” Ikea’s solution? A freestanding mirror called the Knapper onto which one can hang clothes and accessories the night before to decrease morning stress.

To make up for unreliably reported data (in other words, sometimes people lie, whether consciously or not), the company incorporates observed data too, and sometimes finds its items being used in unexpected ways. For instance, via cameras set up in homes, they found that residents in Shenzhen, China often sat on the floor, “using the sofas as a backrest.”

[Photo via Flickr: “cricket 16,” CC BY 2.0 by Barry Skeates]

Swagger + SmartBear!


Since Swagger’s creation in 2011, we’ve seen phenomenal uptake of Swagger in the API community. From startups to enterprises, Swagger has become a common word in both REST APIs and integration architecture. We are proud to have brought frictionless development between architects, API devs, client devs, and documentation, even for the company of one!

Fast forward a few years and several thousand espressos, and we see Swagger actively supported in nearly every programming language and deployed across tens of thousands of servers. Thanks to you, Swagger is far and away the clear leader in the API description landscape. We have accomplished this by being completely open source, transparent about our plans and goals, and most importantly by being vendor neutral. The official Swagger tools are downloaded 7,000 times a day now! And after Microsoft’s recent announcement of Swagger support across their Azure services and tooling (see the March Azure Announcement), this will increase even more rapidly.

Despite our passion and dedication for Swagger over the last several years, Swagger has outgrown Reverb. We need to guide Swagger to its next phase of growth with more resources and focus on the API space (Reverb stays plenty busy with its publisher products!) while staying true to the transparent and open-source nature that has enabled us to grow so quickly. That said, I’m both proud and excited that SmartBear has stepped up to officially lead the Swagger project!

Why SmartBear?

A change has been in the making for the last few months. During that period, I’ve spent a lot of time talking with potential Swagger partners. With such wide industry adoption, there’s intense interest in keeping Swagger a common standard for API descriptions. Being the “glue” between services is certainly a privilege, but comes with a great responsibility to the industry. We had to ensure that the open spirit is accelerated with our partner.

We’ve been working with SmartBear for several years now. In addition to being an important yet neutral vendor in the API space, they have a track record with open source with their industry-leading SoapUI project. Critical to the success of Swagger, both past and future, has been the open attitude to both the spec and the source. SmartBear has not only a solid grasp of APIs and their importance, but with SoapUI, they are connecting to both small shops with their OSS version as well as providing support for enterprises with Pro versions.

This is where Swagger will continue its path. Both the Swagger specification (the connective tissue for the API) and the tooling will remain completely open source. SmartBear is in fact pushing the openness of Swagger forward to the next level by engaging industry leaders to establish an open governance model for the Swagger specification. The benefits of a common and shared standard in API description have proven to be invaluable, and we don’t intend to take that for granted.

Expect great things to come in this next stage of growth for Swagger. We want APIs everywhere, and to enable the developer to focus on making great products, not API plumbing.

If you have any questions for the Swagger team, please reach out to! Thank you for all your support, and we look forward to the next stage of Swagger’s growth!