Data Tells a Story: when college is worth it; creating new foods; protecting our forests

Forest near Vřesina

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: when college is worth it; creating new foods; and protecting our forests.

College is worth it if you have these six experiences

A study from the Gallup-Purdue Index has shown that “six elements of emotional support and experiential learning in college” are correlated with “long-term career and life success.”

The study involved 30,000 US college graduates and examined “the degree to which graduates were engaged in their work and thriving in their purpose, social, financial, community, and physical well-being,” all predictors of outcomes such as worker productivity, absenteeism, and healthcare cost burden.

The Gallup research found that 25% of US college graduates “fail to thrive in their overall careers and lives,” and that same percentage miss out on all six of the elements correlated with success.

The six elements include a professor who made them “excited about learning”; a supportive mentor; the chance to work on a long-term project; an internship; and high involvement in extra-curricular activities.

Can We Find Meaning In Our Wearable Data? Exist Thinks So

Sure, our wearables tell us how many steps we’ve taken, how many calories burned, how much sleep we’ve gotten (or how little), but what do we do with all this data?

One platform uses data from wearables, apps, and services to find patterns and trends that users can act upon. Self-reported data such as mood can be measured against other data like number of steps, amount of sleep, or what music you’re listening to, which might show, for example, a correlation between mood and physical activity, quality of sleep, or type of music.

The platform also recognizes unusual activity — a change in bedtime is one example — that the user might not have noticed.

From Big Data to Big Bets on Food Science

While Uber uses big data to optimize transportation, Airbnb to streamline lodging, and big pharma to discover new drugs, other startups are using big data to create new foods.

One small team of data scientists is building “a massive database of all known plant proteins,” which could amount to as many as 18 billion, with the idea of creating “new food sources for an expanding global population—sources that are cheaper, safer, and healthier than what we have today.”

With this database, the researchers can target their efforts by predicting how proteins will interact; identifying “combinations likely to produce enjoyable foods”; and pinpointing “what will produce the right tastes, textures, and colors.”

Predicting Tropical Deforestation With Big Data

Data scientists and rainforest conservationists are teaming up to use big data to predict instances of illegal logging, which endangers wildlife and makes climate change worse.

The team is using deep learning — a type of artificial intelligence “that involves processing tremendous amounts of data to solve problems in a way that roughly mimics the human brain” — to analyze “satellite images of tens of millions of acres of forest…to discover patterns that identify indicators of deforestation risk,” such as “road-building in previously undisturbed areas.”

With such predictive data, authorities can be alerted even before the tree cutting begins.

3 Cities Using Open Data in Creative Ways to Solve Problems

Three cities are taking advantage of open data by using it to make life better for their residents.

In New Orleans, both city officials and citizens “can evaluate the city’s progress in confronting urban blight” through the BlightSTAT program. The mayor’s office also uses the BlightSTAT data to make decisions about the fight against blight, which has decreased by 30 percent since the program began in 2010.

San Francisco has partnered with Yelp to make restaurants’ health inspection data more readily available (as we discussed in an earlier post), which not only lets city residents know about food hazards but also has the potential to “shame repeat-offender restaurants into complying with health standards.”

Finally, Louisville has partnered with a medical service provider to plant GPS trackers in inhalers. The trackers “measure when and where in the city people use them most,” and match these “hotspots of inhaler-user with air quality data” so that public officials can better target their interventions.

[Photo via Flickr: “Forest near Vřesina,” CC BY 2.0 by Jiri Brozovsky]

Data Tells a Story: a healthy world; cyberattacks; where dogs come from

Labs & a Portie

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: trying to make the world healthier; factors in cyberattacks; and where dogs come from.

Turning to Big, Big Data to See What Ails the World

The Global Burden of Disease study seeks to compile and analyze a huge amount of data to try to “understand what sickens us and kills us in every country in the world.”

Getting the data hasn’t been easy. Researchers have had to comb through birth and death records, hospital files, and household surveys, pulling together data points about everything from acne to liver cancer to non-poisonous animal bites.

The study was innovative because it measured not just causes of death but causes of illness and years lost to disability. Different types of health losses were given a value. For instance, severe depression was weighted at .655, or six and a half years of life lost for each decade, and researchers were surprised to find that in 2010, major depression caused more total health loss than TB. They also found that osteoarthritis caused more years of life lost than natural disasters, and neck pain caused more than cancer.
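The weighting arithmetic can be sketched in a few lines of Python. This is a minimal illustration, assuming health loss is simply the disability weight multiplied by the years lived with a condition; the function name is hypothetical, and the study's actual method is more involved than this.

```python
def years_lost_to_disability(weight: float, years_lived: float) -> float:
    """Healthy years lost, assuming loss = disability weight x years lived.

    weight is on a 0-to-1 scale (0 = full health, 1 = equivalent to death).
    The simple product is an assumption made for illustration only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("disability weight must be between 0 and 1")
    if years_lived < 0:
        raise ValueError("years lived cannot be negative")
    return weight * years_lived


# Severe depression at the 0.655 weight quoted above, over one decade.
decade_loss = years_lost_to_disability(0.655, 10)
print(f"{decade_loss:.2f} healthy years lost per decade")
```

Under that assumption, a decade lived with severe depression counts as roughly six and a half healthy years lost, which matches the figure quoted above.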

Countries have been able to use this kind of data to change their spending in health resources and other areas. Iran, for one, found that “traffic injury was its leading preventable cause of health loss in 2003, and put money into building new roads and retraining police.”

Gestational Diabetes Is Linked to Autism Risk

Researchers at Kaiser Permanente Southern California have found that “children born to mothers who developed gestational diabetes before 26 weeks of pregnancy were at a 63 percent increased risk of being diagnosed with autism spectrum disorder,” although that risk dropped to 42 percent when controlling for such factors as maternal age, household income, and the mother’s pre-existing conditions.

The study looked at more than 320,000 babies between 1995 and 2009, and the results have two applications. First, they underscore the importance of early prenatal care, since the link between gestational diabetes and autism could mean “a fetus’ early exposure to uncontrolled high blood sugar may somehow affect brain development.” Second, they call for extra vigilance at the infant’s developmental milestones when the mother was diagnosed with gestational diabetes before 26 weeks.

Ted Cruz Campaign Using Advanced Data-Mining to Target Voters

Be slightly worried: Ted Cruz has a data team, and they’re using data “not just to find out what issues matter to you, but why you care about them.”

However, this isn’t really new. Political candidates have been gathering the data dust of your digital footprint since Bill Clinton’s campaign in 1992. By 1996, says one campaign-data scientist, most campaigns were online, and by 2000, politicians and their teams were using data analytics to tailor their campaigns.

Cruz’s team will be gathering data via surveys and personality profiles to predict how voters might respond to their messaging.

User mistakes aid most cyber attacks, Verizon and Symantec studies show

Two studies have shown that “the vast majority of hacking attacks are successful because employees click on links in tainted emails, companies fail to apply available patches to known software flaws, or technicians do not configure systems properly.”

Verizon found that more than two-thirds of “the 290 electronic espionage cases it learned about in 2014 involved phishing,” or trick emails. As a result of so many people clicking on tainted links or attachments, hackers only need to send phishing emails to 10 employees to get inside “corporate gates” 90 percent of the time.
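The "10 emails, 90 percent" figure implies a per-recipient click rate, which can be backed out with a short calculation. The sketch below assumes every recipient clicks independently and with the same probability; that is a simplification for illustration, not something the studies state, and the function names are hypothetical.

```python
def chance_of_at_least_one_click(click_rate: float, emails_sent: int) -> float:
    """Probability that at least one of the recipients clicks.

    Assumes independent recipients who all share the same click rate.
    """
    return 1.0 - (1.0 - click_rate) ** emails_sent


def implied_click_rate(target_chance: float, emails_sent: int) -> float:
    """Invert the formula above: the click rate needed to hit target_chance."""
    return 1.0 - (1.0 - target_chance) ** (1.0 / emails_sent)


# Figures quoted above: 10 emails, success 90 percent of the time.
rate = implied_click_rate(0.90, 10)
print(f"implied per-recipient click rate: {rate:.1%}")
print(f"round-trip check: {chance_of_at_least_one_click(rate, 10):.0%}")
```

Under these assumptions, roughly one recipient in five would need to click for ten emails to succeed 90 percent of the time.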

Solving the Mystery of Dog Domestication

Archaeologists have long been trying to figure out where domesticated dogs come from.

A 1997 study that analyzed DNA from more than 300 modern dogs and wolves determined that “dogs may have been domesticated as early as 135,000 years ago,” while later studies “argued for a more recent origin—less than 30,000 years ago—but the exact time and location remained unclear.”

More recently, a researcher saw a pattern in his own dog data: breeds from East Asia were more “genetically diverse,” an indicator of “more ancient origins.” After a genetic analysis of more than 1,500 dogs, he concluded that “the animals had likely arisen in a region south of China’s Yangtze River less than 16,300 years ago—a time when humans were transitioning from hunting and gathering to rice farming.”

However, another researcher argued that “looking at modern DNA is a mistake,” and conducted his own study focusing on ancient DNA. After comparing the DNA of “18 dog- and wolflike bones from Eurasia and America to that of modern dogs and wolves from around the world,” he found that “the DNA of today’s dogs most closely matched that of ancient wolves.”

Now a new study brings in new technology: a computer that can take thousands of measurements “beyond mere length and width to determine actual shapes” such as eye circlets and “the jut and jag of every tooth.” While ancient DNA can tell you where the animal came from, one of the researchers says, only such “morphometric data can show you domestication in progress.”

[Photo via Flickr: “Labs & a Portie,” CC BY 2.0 by OakleyOriginals]

Data Tells a Story: resumes and gender; fighting cancer with data; the science behind happiness


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: gender differences in resumes; fighting cancer with data; and some science behind happiness.

The resume gap: Are different gender styles contributing to tech’s dismal diversity?

A machine learning expert used big data to try to understand the differences between men’s and women’s tech resumes, and perhaps to explain why there’s such a large gender gap in the tech industry.

After analyzing 1,100 resumes (512 from men and 588 from women), she found that the biggest difference was that women’s resumes were longer but had fewer concrete details.

Only 19% of women’s resumes fit on one page, as opposed to 61% of men’s. However, men provided more specific content: 91% of men included “bulleted verb statements that describe their achievements on the job, but only 36% of women do.”

Decoding and Defeating Cancer with Data Science

A group of UC Santa Cruz researchers are fighting cancer not with medicine but with data.

The data scientists are analyzing the molecular data of tumors to try to understand why treatment isn’t working. For example, tumors have traditionally been classified and treated “based on where in the body they originated.” The researchers, however, are using molecular data to “reveal similarities between tumors in different parts of the body that were not apparent before.”

In a study of 12 tumor types, the researchers’ data suggests that “as many as one in five may need re-classification.” Moreover, the misclassified tumors “were far more likely to be unresponsive to treatment.”

In addition, the researchers “are developing a social media-inspired platform” that links patients, doctors, researchers, and data from biopsy samples in real time. This will allow computer scientists and physicians to share their discoveries immediately, “instead of waiting months or years for the results to appear in peer-reviewed journals.”

Out With The Caraway, In With The Ginger: 50 Years Of American Spice Consumption

Armed with data found on the Economic Research Service of the U.S. Department of Agriculture’s website, FiveThirtyEight reporter Anna Maria Barry-Jester wanted to understand some spicy stories.

The data details “the availability of various spices every year from 1966 to 2012,” which can serve “as a proxy for consumption.” Over the past 50 years, spice consumption overall has nearly tripled, but the popularity of individual spices has been more varied.

For instance, the 1980s and ’90s saw a 2,000 to 3,000 percent increase in production of chile peppers. A spice shop owner saw the same popularity in her business and “couldn’t figure out the obsession until one day a customer walked in with a Rick Bayless cookbook,” which focuses on Mexican cuisine.

Another example focuses on turmeric. Between 1966 and 2004, turmeric consumption went up and down; after 2011, it jumped nearly 70%. Multiple recent studies have shown that turmeric helps prevent diabetes, is as effective as ibuprofen, and has “shown potential for slowing the damage from neurological diseases.”

Using Biometric Data to Make Simple Objects Come to Life

A project on display in Dublin shows how objects can come “alive” with a human touch: a sugar bowl rises and falls with your breath rate; a tea bag “bobs up and down to the beat of your heart”; and a record player speeds up or slows down “based on skin conductance.”

The project centers on the idea of “using the electricity naturally surging through our bodies to transfer electronic signals to inanimate objects,” which could be extended from sugar bowls and tea bags to your car settings changing “depending on who’s gripping the wheel, or transferring files from your computer to your phone with your finger-tip.”

The Science Of Why You Should Spend Your Money On Experiences, Not Things

You might think a physical object that lasts a long time would bring more happiness than a fleeting experience — and you would be wrong.

A study from Cornell University showed that something called adaptation makes us less happy about objects. People’s self-reported data showed that while happiness for material and experiential purchases started out the same, over time their “satisfaction with the things they bought went down, whereas their satisfaction with experiences they spent money on went up.”

This is because people became adapted to the material purchases. In other words, that new TV fades into the background and “becomes part of the new normal,” while an experience like rock climbing or exploring a foreign city becomes part of who we are.

Another study showed that even some negative experiences bring happiness later, specifically after people “have the chance to talk about it.” Something that may have once been stressful becomes “a funny story to tell at a party or [is] looked back on as an invaluable character-building experience.”

Yet another study showed that people are less likely to negatively compare experiences than material purchases — it’s easier to compare and feel competitive about carats in a ring or shoe brands than it is about personal experiences.

[Photo via Flickr: “Happiness,” CC BY 2.0 by Moyan Brenn]

Data Tells a Story: Don Draper-bot; the importance of family time; the truth about apples


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: a Don Draper-bot; the importance of family time; and the truth about apples.

Computers can now get you to buy more stuff by tapping into your emotions

One company wants to use data to take the human writer out of copywriter.

It took eight years for the company to build up a database of 500,000 phrases commonly used in advertisements. They then categorized the phrases by emotional content (positive, negative, and neutral) and subdivided those into more specific emotions such as gratitude and excitement.

To get a phrase, the customer uses a program to enter product information, message requirements, and a particular feeling from a “wheel of emotions” (although the company suggests letting the program pick it).

The program also tests its messages against human-written ones, and, according to the company, “the messages chosen by the computer have a 95% probability for performing better than something written by a human.”

Study Has ‘Unexpected’ Findings on How Family Time Affects Kids

A study from the University of Toronto and Bowling Green State University has shown that “spending just six hours a week with their moms reduced the likelihood of negative behavior in teens.”

The researchers analyzed data from the Panel Study of Income Dynamics Child Development Supplement, specifically the amount of time mothers spent with their children in “the 3-year-old to 11 year-old age bracket and the 12-year-old to 18-year-old age bracket.”

They found that while the amount of maternal time didn’t matter for “offspring behaviors, emotions or academics” in either childhood or adolescence, it did matter for the number of delinquent behaviors among adolescents. Teens who spent as little as six hours a week with their mothers (less than an hour a day) had fewer delinquent behaviors.

Why do private school students do better? It’s not their education, study finds

A Statistics Canada study of students from six Canadian provinces found that private school students did better academically not because of school resources and practices but because of “socio-economic factors and peers who tend to have university-educated parents.”

Looking at about 7,000 students across almost 1,180 schools, researchers found that, compared with public school students, higher percentages of private school students lived in two-parent families with both biological parents; their total parental income was higher; and they tended to live in homes with more books and computers.

New study investigates the link between family income and brain development

In a study of 1,099 participants, researchers at The Saban Research Institute of Children’s Hospital Los Angeles and Columbia University Medical Center have found that “among children from the lowest-income families, small differences in income were associated with relatively large differences” in brain surface area for regions “associated with skills important for academic success.”

The researchers found that “improved performance in cognitive skills was also associated with higher income.” They also point out that “family income is linked to nutrition, health care, schools, play areas and even air quality – all factors that can contribute to brain development,” and suggest that “wider access to resources likely afforded by the more affluent may lead to differences in a child’s brain structure.”

Daily Apple Not Associated With Reduced Doctor Use

A University of Michigan study discovered the unthinkable: an apple a day does not keep the doctor away.

Researchers reviewed data from 8,728 people and found that 9% were daily apple eaters. These daily apple eaters were also less likely to smoke, had higher average education levels, were more likely “to be from racial and ethnic minorities,” and were less likely to use prescription medications. However, they weren’t “any less likely to have seen a doctor more than once during the past year.”

[Photo via Flickr: “Camera Test Apple,” CC BY 2.0 by Kirinohana]

Data Tells a Story: Cricket World Cup; personalizing beauty; the story behind Ikea

cricket 16

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: making cricket more interesting; personalizing beauty; and how Ikea took over the world.

Science: Sunny days make British office workers sad

Contrary to what many might think, happiness is not correlated with sunshine, at least according to a study from the University of Westminster in England.

Researchers compared daily weather patterns from 1991 to 2009 with wellbeing data from the British Household Panel Survey, which began in 1991, and found there “was no significant variation in reported happiness between sunny days and cloudy ones.”

What they did find was a “small but statistically significant correlation between unhappiness at work and sunny days.” In other words, on a nice day people would rather be outside than stuck in the office.

How the ICC is using data analytics to make the Cricket World Cup more interesting

The International Cricket Council (ICC) is mining 40 years’ worth of Cricket World Cup data “to produce insights that enhance the viewer’s experience,” and to “improve team performance and strategies out on the field.”

After analyzing “statistics on scores, player performance, player profiles and more,” the ICC came away with several findings. For instance, they discovered that the team with the best-equipped bowling line-up, and not necessarily “the power-packed batting [line-up],” might be the team most likely to win the World Cup.

They also found the characteristics that make a skilled player, which explains why countries not usually thought of as dominant cricket nations, such as the UAE and Ireland, perform well: they “were found to have strong performing players with similar characteristics.”

Why VCs Are Hungry for Beauty Data Startups

A beauty start-up that began as a give-away site is now using its data to power personalization tools and help brands give their consumers better experiences.

As part of its give-away program, the start-up invited consumers to periodically answer questions from a set of 1,300. The company has collected over 10 million data points and is using this data for customer personalization and hypertargeting.

For example, they found that almost half of acne sufferers want “overnight results”; a quarter of people are loyal to certain perfume brands and aren’t eager to try new ones; and 37% are “most influenced by bloggers and editors as to what skincare products to purchase.”

Can data analysis reveal the most bigoted corners of Reddit?

When a post on Reddit asked Redditors to nominate the most “toxic communities” on the site, Ben Bell, a data scientist at a text-analytics start-up, thought there should be an objective way to measure toxicity.

To do so, Bell pulled a sample of comments from the top 250 subreddits and from the forums mentioned in the toxicity thread. Using sentiment analysis, he coded each comment as positive, negative, or neutral; human annotators then examined the negative comments to determine their toxicity.

He found that in some subreddits, “the community is proactive enough at self-policing that the average score for a bigoted comment is negative,” and “at the other end of the spectrum are those communities which seem to deliberately encourage bigotry.” He also found another kind of toxicity, that which is directed outwards — in other words, a subreddit that “focuses on highlighting bad content around the rest of Reddit.”

How Ikea took over the world

It’s not just the meatballs.

Market research is “at the heart of Ikea’s expansion.” For example, the furniture company gathered data about morning routines from over 8,000 people in eight cities. They found that people from Shanghai were “fastest out the door” (56 minutes) while those from Mumbai were the most leisurely, clocking in at two and a half hours before leaving. Those most likely to work in the bathroom? Stockholmers and New Yorkers.

What researchers also found was that regardless of city, women spend more time than men picking out their outfits, “a process many find stressful.” Ikea’s solution? A freestanding mirror called the Knapper onto which one can hang clothes and accessories the night before to decrease morning stress.

To make up for unreliably reported data (in other words, sometimes people lie, whether consciously or not), the company incorporates observed data too, and sometimes finds their items being used in unexpected ways. For instance, via cameras set up in homes, they found that residents in Shenzhen, China often sat on the floor, “using the sofas as a backrest.”

[Photo via Flickr: “cricket 16,” CC BY 2.0 by Barry Skeates]

Swagger + SmartBear!


Since Swagger’s creation in 2011, we’ve seen phenomenal uptake of Swagger in the API community. Among startups and enterprises alike, Swagger has become a common word with both REST APIs and integration architecture. We are proud to have brought frictionless development between architects, API devs, client devs, and documentation—even the company of one!

Fast forward a few years and several thousand espressos, and we see Swagger actively supported in nearly every programming language and deployed across tens of thousands of servers. Thanks to you, Swagger is far and away the clear leader in the API description landscape. We have accomplished this by being completely open source, transparent about our plans and goals, and most importantly by being vendor neutral. The official Swagger tools are downloaded 7,000 times a day now! And after Microsoft’s recent announcement of Swagger support across their Azure services and tooling (see the March Azure Announcement), this will increase even more rapidly.

Despite our passion for and dedication to Swagger over the last several years, Swagger has outgrown Reverb. We need to guide Swagger to its next phase of growth with more resources and focus on the API space (Reverb stays plenty busy with its publisher products!) while staying true to the transparent and open-source nature that has enabled us to grow so quickly. That said, I’m both proud and excited that SmartBear has stepped up to officially lead the Swagger project!

Why SmartBear?

A change has been in the making for the last few months. During that period, I’ve spent a lot of time talking with potential Swagger partners. With such wide industry adoption, there’s intense interest in keeping Swagger a common standard for API descriptions. Being the “glue” between services is certainly a privilege, but comes with a great responsibility to the industry. We had to ensure that the open spirit is accelerated with our partner.

We’ve been working with SmartBear for several years now. In addition to being an important yet neutral vendor in the API space, they have a track record with open source with their industry-leading SoapUI project. Critical to the success of Swagger, both past and future, has been the open attitude to both the spec and the source. SmartBear has not only a solid grasp of APIs and their importance, but with SoapUI, they are connecting to both small shops with their OSS version as well as providing support for enterprises with Pro versions.

This is where Swagger will continue its path. Both the Swagger specification—the connective tissue for the API—and the tooling will remain completely open source. SmartBear is in fact pushing the openness of Swagger forward to the next level by engaging industry leaders to establish an open governance model for the Swagger specification. The benefits of a common and shared standard in API description have proven to be invaluable, and we don’t intend to take that for granted.

Expect great things to come in this next stage of growth for Swagger. We want APIs everywhere, and to enable the developer to focus on making great products, not API plumbing.

If you have any questions for the Swagger team, please reach out! Thank you for all your support, and we look forward to the next stage of Swagger’s growth!

Data Tells a Story: Ferguson; taxis versus Uber; how music helps


Welcome to another installment of Data Tells a Story, in which we round up our favorite data stories of the week. The latest: what data says about Ferguson; when to take a taxi instead of Uber; how music helps.

Ferguson isn’t an anomaly: The real lesson of the Department of Justice’s explosive report

Data from the Justice Department’s recent report on the Ferguson, Missouri police department showed that from 2012 through 2014, African Americans accounted for 93 percent of all arrests in Ferguson although they make up only 67 percent of the population.

But this pattern is evident beyond the city of Ferguson. Another analysis showed that from 2000 to 2013, the stop rate for blacks in St. Louis County increased by 522 percent, compared with a 284 percent increase in the stop rate for whites.

Stop rates have also increased at the state level, showing a 385 percent increase for African Americans and 252 percent for whites, during that same time period.
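The percentages in this section are easier to compare once they are converted to ratios. The sketch below is a rough illustration using only the figures quoted above; the function names are hypothetical, and the comparison ignores everything the underlying analyses may have controlled for.

```python
def representation_ratio(share_of_arrests: float, share_of_population: float) -> float:
    """How over-represented a group is: 1.0 means arrests match population share."""
    return share_of_arrests / share_of_population


def growth_multiplier(percent_increase: float) -> float:
    """Convert a percent increase into a multiple of the starting value."""
    return 1.0 + percent_increase / 100.0


# Ferguson, 2012 through 2014: 93 percent of arrests, 67 percent of population.
ferguson_ratio = representation_ratio(0.93, 0.67)

# St. Louis County stop rates, 2000 to 2013.
county_black = growth_multiplier(522)
county_white = growth_multiplier(284)

print(f"Ferguson arrest share vs. population share: {ferguson_ratio:.2f}x")
print(f"County stop rate, blacks: {county_black:.2f}x the 2000 level")
print(f"County stop rate, whites: {county_white:.2f}x the 2000 level")
```

Read this way, a 522 percent increase means the stop rate ended at a little over six times its starting level, not five times.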

Computers can now predict violent outbreaks around the world

Political scientists at Yale University have found that while US efforts in Afghanistan “to win villagers’ hearts and minds were successful enough to render their villages Taliban targets,” they weren’t effective enough to encourage villagers to provide useful intelligence about improvised explosive devices (IEDs).

Researchers discovered this by surveying 2,754 men in 204 Afghan villages “about their level of support for the Taliban and the International Security Assistance Force (ISAF),” a NATO-led security mission in Afghanistan, and combining that data with data on “insurgent violence and the locations of military bases and aid projects.”

They fed these factors into a statistical model which showed that villages’ levels of support for ISAF could predict the degree of IED attacks. For instance, a village with “modest” support for ISAF would experience 13 more attacks on average over the following five months than one strongly against the ISAF.

The Early Warning Project is hoping to use such data to be able to predict violent outbreaks before they happen and to provide additional aid in order to stop them.

Data scientists have isolated the exact times a Yellow Taxi is a better deal than an Uber

To Uber or to taxi? That’s the question, and some researchers have figured it out, at least in New York.

Computer scientists from the University of Cambridge and Belgium’s University of Namur compared trip and fare data for “every yellow cab ride taken in 2013” with data from Uber’s system, “which allows anyone to query how much a fare between two points would cost.” What they found was that “Uber is more expensive than a yellow cab for a trip in New York City that costs less than $35.” In other words, if you have a short ride within Manhattan, it’s cheaper to wait for a cab.

Of course there’s an app for all this, which pairs the researchers’ findings with your location and destination, and tells you if you’re better off with a cab or an Uber.

How Big Data Busted Abe Lincoln

While we think of big data and computer technology as going hand-in-hand, data analysis has been used as far back as the mid-1800s to surface surprising stories.

A 19th-century U.S. law provided congressmen “compensation for travel to and from their districts” at 40 cents a mile. New York Tribune editor Horace Greeley was “shocked” when he saw the sums, considering them “an outrageous waste” of taxpayers’ money. He also thought that the “disbursements were a wasteful relic of an earlier time, when travel to and from the far-flung reaches of the United States would have been a costly, bruising affair.” During Greeley’s time, steamships and trains were becoming more and more available, making travel faster and cheaper.

The editor asked one of his reporters to compare the “shortest path from each congressman’s district to the Capitol” (using a U.S. Post Office book of mail routes) with each congressman’s mileage reimbursements. Greeley then printed the findings in his paper.

Honest Abe was one of the worst culprits, having received about $677 in excess mileage, “more than $18,700 today.” Only worse was Jefferson Davis, who received an extra $736.80.

The House was up in arms, claiming that Greeley’s charges were “absolutely false.” However, the representatives eventually passed a bill “to change the computation of mileage to ‘the shortest continuous mail route,’” although the Senate would kill it. Still later, Congress lowered the per-mile rate from 40 cents to 20.
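The dollar figures in this story can be translated back into distance with a one-line calculation. This is a back-of-the-envelope sketch, assuming the excess payments were pure mileage billed at the flat 40-cent rate; the function name is hypothetical, and Lincoln's amount is the approximate figure quoted above.

```python
RATE_PER_MILE = 0.40   # dollars per mile, as quoted above
REDUCED_RATE = 0.20    # the later, lowered rate


def implied_excess_miles(excess_dollars: float, rate_per_mile: float) -> float:
    """Miles billed beyond the shortest route, assuming a flat per-mile rate."""
    if rate_per_mile <= 0:
        raise ValueError("rate per mile must be positive")
    return excess_dollars / rate_per_mile


lincoln_miles = implied_excess_miles(677.00, RATE_PER_MILE)  # "about $677"
davis_miles = implied_excess_miles(736.80, RATE_PER_MILE)

print(f"Lincoln: roughly {lincoln_miles:,.0f} excess miles billed")
print(f"Davis: roughly {davis_miles:,.0f} excess miles billed")
print(f"Davis's excess at the reduced rate: ${davis_miles * REDUCED_RATE:,.2f}")
```

Under that assumption, each excess payment corresponds to well over a thousand extra miles, and halving the rate would have halved the overpayment for the same billed distance.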

Data-driven therapy: How The Sync Project wants to use music as medicine

Using biometric data gathered through various wearables, the Sync Project seeks to understand the psychological effects of music and why some types of music “can enhance our moods, boost concentration, trigger emotional reactions or pump up our energy levels.”

The global collaborative wants to go beyond using its findings as a “potential music recommendation algorithm”: it wants to be able to use music as therapy. For instance, the Sync Project’s CEO says “music has helped his son, who suffers from autism, communicate in ways that he hasn’t been able to before,” and “although he has trouble expressing himself, he’s able to sing through an entire Beatles song and feel relaxed after an episode.”

Other research has shown the positive effects of music on Alzheimer’s and dementia patients. In a well-known video, a nearly vegetative patient with Alzheimer’s “awakens” when listening to music from his youth.

[Photo via Flickr: “um som e o mar,” CC BY 2.0 by Raíssa Viza]

Data Tells a Story: gender equality; thinning Arctic sea ice; Twitter was right


Welcome to another installment of Data Tells a Story, in which we round up our favorite data stories of the week. The latest: gender equality; thinning Arctic sea ice; and Twitter was right.

Gender equality report: an example of how big data can address big problems

A report published by the Bill & Melinda Gates Foundation and the Bill, Hillary & Chelsea Clinton Foundation shows that while “the status of women and girls has improved substantially since 1995,” there’s still much work to be done.

After collecting and analyzing 850,000 gender-related data points over a 20-year period from nonprofit organizations such as the United Nations and the World Bank, researchers came up with a multitude of findings. For instance, they found that “almost two-thirds of the world’s illiterate adults, 496 million people, are women.” But they also found some good news: since 1995, the maternal mortality rate has decreased by 42 percent, with South Asia showing the most improvement.

For their next project, the foundations are combining data sets from the UN and other organizations to assess what those organizations have achieved “over the past two decades in the field of gender development.”

Arctic Sea Ice ‘Thinning Dramatically,’ Study Finds

A new study has found that Arctic sea ice is “thinning at a steadier and faster rate than researchers previously thought.”

The researchers acquired data from multiple sources, “making them the first to combine all available observations on Arctic sea-ice thickness into one study.” One data set from 1975 to 2000 showed that Arctic sea ice had thinned 36%. However, a larger data set used in the new study showed that this was “a little less than half” of the actual ice thinning rate, and “that the leveling off of sea ice thinning in the 1990s was only temporary.”

Fukushima data show rise and fall in food radioactivity

Four years after the Fukushima nuclear disaster, researchers have found that “few people are likely to have eaten food that exceeded strict Japanese limits on radioactive contamination.”

The researchers used data provided by a massive food-monitoring program, which sampled “foods before they hit the market for levels of radioactive elements such as caesium-137,” and banned “producers or areas that exceeded regulatory limits.” So not only did the program support food safety, it provided researchers with almost 900,000 samples collected between 2011 and 2014.

The scientists found that “during the first year after the accident, 3.3% of food from the Fukushima region had above-limit contamination” (these foods were prevented from ever reaching the market). This percentage rose slightly in 2012 but by 2014 had fallen to 0.6%.

UK draws billions in unrecorded inflows, much from Russia: study

A Deutsche Bank study has shown that Britain is attracting more than a billion pounds “of capital inflows a month not recorded by official statistics,” and up to 40 percent of this might be from Russia.

The report said that financial institutions are misreporting data and using “tax avoidance and accounting methods.” In addition, Britain has a “perceived ‘safe-haven’ status” for stolen cash, “with tens of thousands of London properties owned by secretive companies,” according to another report.

The Deutsche Bank findings are part of a broader study of “net errors and omission” (NEOs) “across major economies.” Such NEOs “could have big implications for foreign exchange rates.”

Study of TV Viewers Backs Twitter’s Claims to Be Barometer of Public Mood

Twitter has long contended that it’s “a reliable barometer of the public’s changing moods and interests,” and that volume of tweets correlates with the popularity of television programs. Now Nielsen has the data to back those claims.

In their study, researchers measured the brain activity of about 300 people as they watched several TV shows. The researchers then compared those measurements with tweets about those same shows, and found that “number of tweets correlated closely with TV viewers’ depth of engagement with whatever was appearing on the screen at that moment,” and that the more engaged the 300 viewers became with a particular segment, the more intense Twitter activity became.

Such data can be used to predict which shows will be most popular, Nielsen says. For instance, the more a forthcoming show is tweeted about, the more popular its premiere may be.
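For readers wondering what “correlated closely” looks like in practice, here is a minimal Python sketch of a Pearson correlation between two series. It is not Nielsen’s method, which the story does not describe; the engagement scores and tweet counts below are hypothetical, made up purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-minute values for one segment of a show
engagement = [0.2, 0.4, 0.5, 0.7, 0.9]   # made-up engagement scores
tweets = [110, 180, 260, 300, 420]       # made-up tweet counts
print(round(pearson(engagement, tweets), 2))
```

A value near 1 means the two series rise and fall together; a value near 0 means tweet volume says little about engagement.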

[Photo via Flickr, “Iceberg,” CC BY 2.0 by NOAA’s National Ocean Service]

Data Tells a Story: online dating; killer whales; the evolution of American music


It’s time once again for the latest batch of our favorite data stories. This week: what data tells us about online dating, killer whales, and the evolution of American music.

Data can tell you how to up your online dating game

According to Vox, some data analysis is able to show what works — and what doesn’t — in online dating.

For instance, one study, after analyzing more than 150,000 first messages, discovered that those who used words that focused more on the other person, such as “you,” were more likely to get a response than those who focused more on “me” or “I.”

Overly casual language was another data point that seemed to make a difference. OkCupid researchers analyzed 500,000 first messages and “found that casual spellings like ‘ur’ and ‘wat’ in first messages pushed the reply rate well below average.” However, first messages with “haha” or “lol” resulted in above-average reply rates.

Finally, a 2006 study of 6,500 heterosexual online daters found that 60% of women who reached out to men first received a response compared to just 35% of men who made initial contact.

How Google’s using big data and machine-learning to aid drug discovery

Google, working with Stanford University, is looking at how using data from a variety of sources “can better determine which chemical compounds will serve as ‘effective drug treatments for a variety of diseases,’” says VentureBeat.

To accelerate drug discovery, Google proposes using deep learning, “a system that involves training systems called artificial neural networks on lots of information derived from key data inputs, and then introducing new information to the mix.” Google’s models use data “from many different experiments to increase prediction accuracy across many diseases.” Their results suggest that adding still more data could improve performance further.

Drug discovery is normally a long, arduous, and costly process. Google suggests that automating and improving predictive techniques should “not only speed up the drug discovery process but cut the costs.”

After Menopause, Killer Whale Moms Become Pod Leaders

Killer whales are among only a “handful of animals” that live many years after menopause. Scientists at the University of Exeter, the University of York, and the Center for Whale Research wanted to find out why.

The research team examined “35 years’ worth of observational data,” including decades’ worth of photographs, and, like all good data researchers, noticed a pattern: “Post-menopausal females, the oldest in the group, typically swam at the front and directed their pods’ movements in a variety of scenarios.”

To try to explain this, the researchers then narrowed their dataset “to years when killer whales’ primary food supply, salmon, was critically low,” and found that, because the whales have such a specialized diet, “the ability to find fish becomes invaluable to the whales’ survival and reproductive success,” especially when salmon are in short supply.

That’s where “killer whales with years of hunting experience,” such as post-menopausal mothers, come in. One way these females “may boost the survival of their kin is through the transfer of ecological knowledge,” which would explain why they’re the leaders of the pack.

Unplanned Births: Another Outcome of Economic Inequality?

In 2008, says The Atlantic, data showed that unplanned pregnancies were five times more likely for women in poverty. A Boston University study attempted to answer why.

The researchers looked at the data of 3,885 single women between the ages of 15 and 44 who weren’t trying to get pregnant. Across five different economic brackets, the team found a couple of similarities. For instance, in every bracket two-thirds had had sex in the past year, with women in the highest income bracket reporting the highest rate of activity. Therefore, frequency of sex wasn’t a factor.

Regarding how upset the women would be if they got pregnant, the numbers were also almost the same, with one in three saying “they would not be all that upset” and two-thirds saying that it would be “very upsetting.”

The areas of discrepancy were in contraception use and occurrence of abortions. Only 11 percent of women in the highest income bracket said they didn’t use contraception while more than twice that percentage reported the same in the poorest group. As for abortions, the wealthiest group was “more than three times as likely to have the procedure than the lowest-income group.”

While different cultural or religious views should also be taken into consideration, such findings might support the idea that poorer women have less access to contraception and safe and affordable abortions. Contraception coverage is required for many federally-backed insurance plans, but abortion, “except in the case of rape, incest, or life-threatening emergency,” is often prohibited. Even some private insurers are not allowed to cover the procedure.

In addition, women in a higher income bracket might be better informed about more effective contraception, such as IUDs, which are more expensive upfront but cheaper in the long run than less reliable options.

Genetic Data Tools Reveal How Pop Music Evolved In The US

A team of researchers at Queen Mary University of London has applied its number-crunching techniques to study American pop music’s evolution.

The researchers analyzed more than 17,000 songs from the US Billboard Hot 100 from 1960 to 2010. First, they rated each song “in one of 8 different harmonic categories and one of 8 different timbre categories.” They then used an algorithm to “find objective categories of musical genre that depend only on the musical qualities,” which resulted in “13 separate styles of music.” Next they used enrichment analysis, a bioinformatics technique, to search for tags “that were more commonly associated with songs in each music style.”
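The story doesn’t say exactly which statistics the team used, but one common form of enrichment analysis is a hypergeometric over-representation test. It asks: if tags were scattered at random, how likely is it that a style would contain at least this many songs with a given tag? Below is a minimal Python sketch of that general technique. The function name and the example counts are hypothetical, not taken from the study.

```python
from math import comb


def enrichment_p_value(total_songs, songs_with_tag, style_size, tagged_in_style):
    """Chance of seeing at least `tagged_in_style` tagged songs in a style
    of `style_size` songs drawn at random (hypergeometric upper tail)."""
    upper = min(songs_with_tag, style_size)
    tail = sum(
        comb(songs_with_tag, k) * comb(total_songs - songs_with_tag, style_size - k)
        for k in range(tagged_in_style, upper + 1)
    )
    return tail / comb(total_songs, style_size)


# Hypothetical counts: 1,000 songs overall, 100 carry the tag,
# and 20 of the 50 songs in one style carry it.
p = enrichment_p_value(1000, 100, 50, 20)
print(f"p-value: {p:.3g}")
```

A very small p-value suggests the tag is over-represented in that style rather than landing there by chance.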

They came away with several interesting findings. For example, the frequency of jazz or blues style has been declining since 1960 while rock has always fluctuated. Rap is rare before 1980 but afterwards skyrockets and remains the dominant genre for the next 30 years.

The team also identified three revolutions: “a major one around 1991 and two smaller ones around 1964 and 1983.” The one in 1964 was the most complex, with an increase in soul and rock and a decline in doowop. The 1983 change saw increases in new wave, disco, and hard rock, and a drop in soft rock and country, while 1991 saw a rise in rap-related tags.

Finally, the researchers took on a question heavy on many music lovers’ minds: the Rolling Stones or the Beatles? The answer? Neither. They discovered that the British didn’t start the American music revolution of 1964 at all, and that it was already well underway before the British invaded.

[Photo via Flickr: “Online romance,” CC BY 2.0 by Don Hankins]

Bigger on the inside: the TARDIS, technology, your brain, and beyond


We’ve always said that Reverb is bigger on the inside. Behind all of our products (whether a recommendations plug-in that fits into any site, a news app only as big as the palm of your hand, or now an analytics dashboard that gives publishers user and content data at their fingertips) is big technology embodied by our unique Interest Graph.

How big? Inside our Interest Graph are 50 million unique words modeled, 600 million users connected, and more than 30 billion web pages processed. Billions of API calls are made a month, and into all this technology have gone more than a million people hours — all powered by what has to be tens of thousands of cups of coffee.

All of this got us thinking: what else is bigger on the inside? Let’s take a look.


The TARDIS is one of the most famous examples of dimensional transcendentalism, or the (fictional) idea that an object’s interior can be bigger than its exterior due to something called transdimensional engineering.

A time machine and spacecraft, the TARDIS looks from the outside to be the size of a British police box, but on the inside “is actually infinite in size.” Thanks to augmented reality, this amazing TARDIS replica really is bigger on the inside.

The TARDIS isn’t the only living space that’s roomier than it looks. There are the wizarding tents in the Harry Potter universe, Snoopy’s doghouse (complete with rec room, birdhouse, basement, den, etc.), and Oscar’s trashcan, which “boasts such amenities as a farm, swimming pool, ice-rink, bowling alley, and a piano.”

bag of holding

In the Dungeons & Dragons universe, the bag of holding is a bag capable of holding much more than its small size implies. Similar are Mary Poppins’s carpetbag and Hermione Granger’s beaded handbag, complete with Undetectable Extension Charm, a spell that allows the bag to hold as many items as needed.

Now there’s even a brand of Bag of Holding messenger bags which are “so big, you might think [they’re] bigger on the inside.”


Reverb’s technology was built on words and so of course we believe that books are always bigger on the inside. At an average of less than a pound each, books contain tens of thousands of words, a multitude of characters (or almost 7 million if you’re the world’s longest novel), storylines, and worlds.

Get yourself a Kindle and your universe grows exponentially. Get a TARDIS little library and you have worlds within an infinite world.


Computers started out as big as a room and now you don’t even need a pocket of holding to hold one.

Slim laptops as light as air aren’t the only compact computers out there. There’s the niftily named matchbox computer, which is, says Computer World, a PC that can be “crammed into a space not much larger than” (you guessed it) “a matchbox.” One of the more popular matchbox computers is the Raspberry Pi.

Finally, one of the tiniest computers out there is the microcontroller. The microcontroller resides on a single integrated circuit, also known as a microchip, and contains “a processor core, memory, and programmable input/output peripherals.” Some microchips, such as those used for animal tracking, are as small as a grain of rice.

your brain

Your brain is much bigger than a grain of rice, but it’s still pretty small compared to all the stuff inside.

The average adult brain, which is only about three pounds, holds about 100 billion neurons, or cells that act as processors and transmitters of information through connectors called synapses. There are between 1,000 and 10,000 synapses per neuron, which means you could have up to (what’s 100 billion times 10,000?) a very, very large number of connections firing.

That’s a ton of activity for something the size of two fists.
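For the curious, the synapse math above is easy to check. This short Python sketch uses only the rough estimates quoted in this post (about 100 billion neurons, 1,000 to 10,000 synapses each); the variable names are just for illustration.

```python
NEURONS = 100_000_000_000               # about 100 billion neurons
SYNAPSES_PER_NEURON = (1_000, 10_000)   # low and high estimates per neuron

# Multiply the neuron count by each per-neuron estimate
low, high = (NEURONS * s for s in SYNAPSES_PER_NEURON)
print(f"{low:,} to {high:,} synapses")
```

The high end works out to 10 to the 15th power, or one quadrillion connections.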


Reverb works a little bit like your brain does. Through our recommendation plug-in and news app, our technology makes connections quickly and gives you recommendations based on your interests. It remembers what you like and don’t like, and gives you more of what you want and less of what you don’t.

Our newest product, Reverb Insights, takes in millions of data points from publishers and other content owners, makes lightning quick connections, and tells stories based on that data. For instance, we found that the topic, Children and Grief, was high-performing in our news app data in connection with a popular news story. This told us that it isn’t just Celebrity our readers are concerned about.

Want to learn more? Check out our website, the announcement on our blog, and this terrific write-up at TechCrunch. You can also request a demo by emailing us at

[Photo via Flickr: “brain,” CC BY 2.0 by Lovelorn Poets]