Reverb has been acquired by NDN!


I’m tremendously excited to announce that Reverb has been acquired by News Distribution Network! NDN is number two in comScore’s ranking of online News & Information video properties and touches over 30 million monthly unique users. Reverb will join NDN at their new Silicon Valley office.

It has been an amazing journey. Seven years ago, I had the privilege of being invited to join Erin McKean at Wordnik by Roger McNamee, who met Erin during her 2008 TED talk. In her presentation, Erin postulated how writing in the medium of the Internet should create an opportunity to move beyond the constraints of the traditional dictionary. Even an army of lexicographers can’t keep pace with the evolution in the English language (and there’s a shortage of lexicographers, globally). Why should we hold ourselves back when we have the Internet and nearly infinite compute?

The Wordnik team charged forward and developed the world’s largest English dictionary. We built technology to understand words and context from text from a completely different angle. We leveraged MongoDB and were the largest known production deployment for some time. We created a robust public API that has served billions of requests to over 10,000 developers. We gave back to Open Source by releasing the REST API framework Swagger, which is now the most popular of its kind in the world, downloaded over 10,000 times a day.

So where did Reverb come from? Literally, the playground. While watching my kids play at a park in Menlo Park, I met Brian Leonard of TaskRabbit. “You know a lot about words,” Brian said. “We have a challenge for you.” How could a (then) tiny company understand the ideas in a single paragraph well enough to match a task with the right person? We applied our technology to TaskRabbit and powered their production categorization and real-time matching. TaskRabbit grew their business, focused on their product, and are now the leading micro-task service in the world, powering aspects of the Home Services portion of Amazon. While TaskRabbit has since grown and operates their own categorization and matching systems, we’re quite proud to have been a part of their beginnings.

From then on, we focused on personalization and higher-level understanding of text. While Wordnik was about words, we were focusing on documents and people. After some technical breakthroughs in the personalization space, we launched Reverb and spun off Wordnik as a not-for-profit, which continues to be run by Erin McKean.

Reverb focuses on ideas in text. We built a wildly successful recommendation engine and deployed it across thousands of sites on the web. As we hoped, using semantic understanding of ideas in text as a basis for recommendation showed orders-of-magnitude better engagement (and happiness) across over a billion datapoints.

We saw two trends with our recommendation product that were pivotal in the company. First, we found that personalizing content to each individual has a tremendous impact on engagement. Interests are as unique as people themselves, and software should be smart enough to respond to you as an individual. Next, news is tough. People love news but there are important subtleties that make a tremendous difference.

When we were introduced to NDN, we saw an excellent match for what we do best. Reverb’s understanding of content paired with NDN’s distribution and publisher relationships fit together in the most complementary fashion imaginable.

After almost seven years as an independent company, we at Reverb have built products and technology that have changed the way people think of content. Now, as part of NDN, Reverb will reach more people, and in ways we never could have on our own. Reverb adds powerful personalization, recommendation, and mobile technologies to NDN’s platform of over 4,000 publishers, 400 content providers, and industry-leading advertisers. It’s an exciting time for us all, and I’m certain NDN will be amazed at what we can do together!

Thanks to the Reverb Team, investors, and supporters in this amazing journey!

Tony Tam, Founder & CEO


Links & Footnotes:

[0] Wordnik is still free and still the largest English dictionary in the world

[1] Swagger is now being led by SmartBear and is most recently the framework of choice for Microsoft Azure, among others

[2] TaskRabbit has raised almost $40M and is providing services in over 8 US locations

Data Tells a Story: all work and no play; the best hotel deals; a smart surfboard

All work and no play...

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: all work and no play; the best hotel deals; a smart surfboard.

The Highest Paying Jobs of the Future Will Eat Your Life

Technology is supposed to make our lives easier. However, for many people, it’s making life harder, at least in terms of work.

The number of work hours began falling during the Great Depression, according to research, but started rising again in the 1970s. Thirty years ago, more highly paid U.S. workers were less likely to work long hours than lower paid workers, but by 2006, the opposite was true: “The best-paid were twice as likely to work long hours as the poorly paid.”

Research has also shown the impact of such increased hours, namely lower productivity, poorer health, “higher chances for injuring yourself and others,” and reduced sleep. While one study showed that “lower-income workers who work two jobs sleep less than anyone,” another study found that “higher-income people slept less than the poor.”

One reason for these increased hours is technology, which means “we’re all available 24/7,” and “there are no boundaries, no breaks.”

Researchers use mobile phone data to predict employment shocks

Researchers at Northeastern University have shown that mobile phone data can be used to “detect, track, and predict changes in the economy at multiple levels.” For example, they found that “call detail records can be used to predict unemployment rates up to four months before the release of official reports and more accurately than using historical data alone.”

Using “the power of algorithms,” researchers analyzed call record data from “two undisclosed European countries” over a 15-month period between 2006 and 2007. They identified cell phone users who had been laid off, and tracked their “mobility and social interactions,” including “total calls, number of incoming calls, number of outgoing calls, and calls made to individuals physically located at the plant.” Researchers found that being laid off had a “systematic dampening effect,” for instance, causing the total number of calls to drop by 51 percent.

A second study examined call records from thousands of subscribers in a European country “that had experienced macroeconomic disruptions,” looking at behavioral changes possibly caused by layoffs to try to predict “general unemployment statistics.” The researchers did indeed find that “changes in mobility and social behavior predicted unemployment rates before the release of official reports and more accurately than traditional forecasts.”

Using Data to Take the Guesswork out of Getting Pregnant

Doctors helping couples trying to conceive have limited access to data. For instance, fertility specialists might know how big or small a chance the couple has of having a baby naturally, with fertility injections, or in vitro.

But with more data, more of a story can be told. The couple might also be told that fertility injections could increase their chance of conceiving by over 25 percent; that IVF may increase the chances by well over 50 percent; and IVF with donor eggs, 92 percent. Unfortunately, the doctor doesn’t have the data to prove such figures, which is what one biotech firm is trying to change with their data analysis tool.

The tool lets fertility specialists compare their patients’ metrics “to a database of hundreds of thousands of other patients’ data,” and “then uses predictive analytics to calculate a patient’s most likely outcomes.” The tool can predict the likelihood of pregnancy and how that might change over time.

One of the biggest problems with fertility treatment is that women quit prematurely. One study analyzed 6,000 patient records and compared those who stopped treatment after two cycles and those who continued. The study found that if those who stopped had continued for one more month, “40 percent of them would have gotten pregnant.” Such concrete data might help convince women to not end treatment early.

Big Data Reveals When Summer Hotel Deals Are

In analyzing their hotel data, Hipmunk found where hotels are most — and least — expensive during the summer months.

You can find the most expensive room this summer in Sonoma, California, with a nightly rate of $417. California cities take the next two priciest spots as well with rooms going for $338 and $311 in Napa and Santa Barbara, respectively.

How about the cheapest summer rates? You’ll find them in uncomfortably hot cities or ski destinations. You can get a room for $97 a night in Scottsdale, Arizona; $136 in Miami, Florida; and $151 in Breckenridge, Colorado.

Hipmunk advises that to save some money, travel slightly off-season, which means, for instance, “visiting a ski resort in early spring, a winery in late fall, or a beach town just before the summer rush starts.”

This Surfboard Maps Waves and Gathers Ocean Data for Researchers

Soon you’ll be able to hang ten with a smart surfboard.

Developed by an environmental filmmaker and a surfboard engineer, the Smartphin “can collect information about water temperature, acidity and salt content” via sensors that reside in a fin “that can be mounted on the body of a surfboard.”

The data has a twofold benefit: it can “help scientists understand climate change’s effects on the ocean,” and also help surfers “figure out where to catch the best waves.”

[Photo via Flickr: “All work and no play…” CC BY 2.0 by Joakim Nordlander]

Data Tells a Story: fighting fire with data; nature versus nurture; texting while driving

Pierce Quint

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: fighting fire with data; nature versus nurture; and texting when you’re not supposed to.

New York City Fights Fire with Data

The NYC Fire Department (FDNY) has been mining data to better predict where fires may start. Organizing data from city agencies into about 60 risk factors, their system creates lists of buildings that are most at-risk.

Before implementing their data mining system, the FDNY kept all of their inspection records on paper, stored at each firehouse, “so there was no way to share information among other fire companies, battalions or divisions.” Now they have a data warehouse that the whole department can access.

The system has also streamlined the FDNY’s inspections process. For about 350,000 buildings, each unit makes 26 different kinds of inspections, which were originally tracked on clipboards. Now the entire inspection workflow has been automated and statistics are collected.

Nature v nurture: research shows it’s both

Researchers at the University of Queensland in Australia have attempted to answer that age-old question: is it nature or nurture? Their answer: it’s both.

Teaming with scientists at the VU University of Amsterdam, the researchers reviewed nearly every twin study from around the world in the past 50 years, which involved more than 14.5 million pairs of twins and 17,804 traits. Their findings showed that “the variation for human traits and diseases is 49 per cent genetic, and 51 per cent due to environmental factors and/or measurement errors.”

It’s not just texting: drivers are Instagramming and video chatting too

While almost all 50 states have laws that explicitly ban texting while driving, a study from Braun Research has shown that drivers also email, check Facebook, and take photos while behind the wheel.

The market research firm conducted a phone survey with over 2,000 smartphone-owning drivers aged 16 to 65. Sixty-one percent admitted to texting while driving, and 33 percent to emailing. Browsing the Internet and using Facebook are neck-and-neck at 28 and 27 percent, respectively.

At the same time, most people surveyed realized that using their phones while driving was dangerous. Only 27 percent thought they could take video safely, while over 78 percent considered texting or emailing while driving a “very serious threat to safety.”

There still isn’t a lot of data on how dangerous simultaneous driving and smartphone-using is. Very few people want to admit that their accidents were caused by being distracted by their phones, and as a result such accidents are underreported. However, a recent study by AAA showed, via footage captured by in-vehicle cameras, that 12 percent of teen drivers involved in accidents were using phones at the time.

Texting at the playground: New study shows how much time parents spend buried in smartphones

Good news, parents: you’re not using your phone in front of your kids as much as you think you are.

Researchers collected 33 hours of data at various North Seattle playgrounds, tracking how long caregivers used their phones as well as conducting interviews. Their findings showed that while 44 percent of parents and caregivers worry about excessive phone usage in front of their charges, the majority spent less than five percent of their time on their phones on the playground, while 41 percent didn’t use their phones at all.

Those who did use their phones did so for a short time: almost 30 percent of usages were less than 10 seconds long and more than half were less than one minute.

However, researchers also discovered that during phone usage, it was difficult for children to get their caregivers’ attention. In 18 cases the adult didn’t respond at all to the child, while in 70 instances of a child trying to get the attention of an adult not on the phone, the child usually got a quick reply.

Here’s how quickly interviewers decide whether or not to hire you

A study has shown that the majority of hirers make their decisions in the first 15 minutes of the interview.

Looking at 600 30-minute interviews with college and graduate students, the researchers also found that less than 5% of interviewers decided in the first minute, and about a quarter decided in the first five minutes, debunking the myth that most hiring decisions are made in five minutes or less.

Other revelations included: a longer interview is better in that it gives the applicant the chance to “break through subjective filters” and for the interviewer to get more information; interviewers who made small talk decided more quickly, perhaps relying on emotion and gut instinct; and those who interviewed in a more structured way took longer to decide.

Even order matters. Being fourth “seems to offer the best chance of having a substantive interview,” while being near the end hurts your chances.

[Photo via Flickr: “Pierce Quint,” CC BY 2.0 by Lane Pearman]

Data Tells a Story: confiding in the family pet; wasted health care; measuring vanity

Smaranda & Arpagic

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: confiding in the family pet; wasted health care; measuring vanity.

Children ‘more likely to confide in pets than siblings’

Using data from a 10-year study of 100 families in the United Kingdom, a Cambridge University psychiatry researcher has found that children facing “emotional difficulties,” such as “bereavement, divorce, instability and illness” place higher importance on their pets for support than on their siblings, perhaps because children feel “their pets are not judging them.”

The data also showed that in the U.S. “children are more likely to live with a pet than their natural father,” specifically, “about two thirds of children live with their father, while about four in five of families with school-age children have a pet.” Moreover, having a pet “encouraged more social behaviour, such as ‘helping, sharing, and co-operating’.”


A study with more than a million Medicare patients showed that a “huge proportion” had received low-value care — in other words, care that was simply a waste.

Researchers took a look at how often patients received “one of twenty-six tests or treatments that scientific and professional organizations have consistently determined to have no benefit or to be outright harmful,” such as an EEG for an “uncomplicated headache,” or a CT or MRI scan for “low-back pain in patients without any signs of a neurological problem.” In one year, 25 to 42 percent of Medicare patients “received at least one of the twenty-six useless tests and treatments.”

The data also shows that almost every family in the U.S. “has been subject to overtesting and overtreatment in one form or another,” and that millions are receiving useless drugs and operations, as well as scans and tests that aren’t just ineffective but might be harmful as well. As a result, “costs appear to take thousands of dollars out of the paychecks of every household each year,” resulting in reduced spending on real needs such as “food, clothing, education, and shelter.”

Israeli big data teaches farmers a cup of joe means better crops

Big data is helping farmers more effectively and efficiently manage their farms.

One platform places sensors on plant-life, livestock, and farming equipment, and collects data on factors such as “environment, temperature and humidity, how much animals are eating, activity among animals, soil conditions for plants, the level of pests in an area, and much more.” Next, the data is analyzed and “compared to guidelines for ideal production under the circumstances,” and suggestions for improvement are sent back to the farmers.

In addition to raw data, the platform also considers cultural issues and how people work. For instance, it found that in Serbia, farmers who drank coffee first thing in the morning were more productive than those who didn’t, and so recommended that farmers ingest caffeine before setting out to work.

Cuba Turns To Analytics, Big Data To Help Tourism

Every year Cuba receives around 2.8 million visitors, half from Canada. However, once Americans are free to travel to the City of Columns, that number is expected to increase by two million in the first year alone. To help accommodate such an influx, Cuba is turning to big data to help improve their tourist industry and infrastructure in general.

The Cuban government is using a platform to monitor all of their hotel and tourist establishments; social media in the most tourist-y locations; and mentions of “Cuba” on social media. The platform also categorizes and segments the information.

However, the biggest challenge the country faces is shoddy Internet connections and availability, with an Internet penetration of just five percent. While the U.S. is planning to authorize the “sale of consumer communications devices, related software, applications, hardware, and services” to Cuba, the Cuban government still “needs to lift its own import bans.”

“Vanity capital” is the new metric for narcissism, and analysts say its value worldwide is greater than Germany’s GDP

The Bank of America Merrill Lynch is trying to quantify vanity.

Specifically, they’ve put out a report putting a price tag “on the amount we spend globally on products and services that enhance our appearance or prestige.” Defining exactly what those products and services are is tricky (jewelry, art, and a private jet are prestigious for sure, but smartphones, wedding rings, and Ivy League educations may seem more like necessities), yet these numbers in aggregate certainly tell interesting tales.

For instance, China tops all countries in growth of both vanity and non-vanity capital from 2009 to 2014, reflecting a growth in per capita income. Changes with regard to gender and spending habits are another factor: women’s economic standing around the world is improving, “which allows them to buy more stuff,” while men in general have become more fashion conscious, purchasing “man-bags” and the like.

Other factors include social media, which “makes narcissism and envy ubiquitous,” and e-commerce, which increases shopping options.

[Photo via Flickr: “Smaranda & Arpagic,” CC BY 2.0 by Cristian Bortes]

Data Tells a Story: humblebragging; Nepal; healthy food for healthy brains


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: the ineffectiveness of humblebragging; data over nepotism to help Nepal; and healthy food for a healthy brain.

Harvard study: Humblebragging doesn’t work

A recent paper from the Harvard Business School proves something we might already know: humblebragging is the worst.

The series of studies showed that people generally dislike bragging disguised as self-deprecation or complaints, and even prefer straightforward bragging or complaining over humblebragging.

In one study, researchers had raters evaluate a dataset of over 700 tweets that had been categorized as humblebrags. The results showed that humblebragging was negatively correlated with likability, perceived sincerity, and perceived competence.

In another study, 122 college students were asked a typical job interview question, “What is your biggest weakness?” Those who answered truthfully (“Sometimes I overreact to situations”) were seen as more hireable than those who humblebragged (“I’m too demanding when it comes to fairness”).

Humblebraggers were also seen as less sincere, and while true braggers and complainers weren’t well-liked, they were still seen as more sincere than humblebraggers.

How Big Data Could Shape the American Workforce

Places like the New York City Labor Market Information Service, or LMIS, and the Grace Institute, a not-for-profit that provides free job training for women, are “taking a data-driven approach to workforce development.”

Using publicly available data, LMIS seeks patterns in job listings. For example, of 2,000 nursing positions, LMIS found “that half of them require a bachelors’ degree, showing the steady march of ‘credential creep,’ requiring higher and higher levels of education than were previously needed for a given job.”

The Grace Institute performed a six-month deep dive into labor statistics, looking for “a ‘sweet spot’ that would combine Grace’s success in training for entry-level administrative jobs with a growing sector that offered room for people in those entry-level jobs to grow too.” They found such a field — the healthcare sector — and used LMIS data to design a more targeted training program.

Use Data, Not Nepotism, to Deliver Aid in Nepal

According to Ravi Kumar, a co-founder of Code for Nepal, Nepal has had a long history of nepotism, in which “people with connections and power have access to most of the resources, especially during times of crisis.” Kumar argues that the delivery of aid, especially now in the aftermath of a devastating earthquake, should instead “be driven by the evidence on the ground and socio-economic data.”

For example, the data shows “that villages outside the Kathmandu Valley need aid the most,” but are probably not getting enough because the damage in such remote areas is probably being underreported. For instance, according to Nepal’s 2011 census, the district of Nuwakot has about 59,000 households. About 45,000 houses have been reported as destroyed or damaged (which leaves only about one in four houses intact), and yet only 1,300 injuries have been reported. As a result, Nuwakot has probably received far less help than it actually needs.
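The one-in-four figure can be checked from the two rounded counts quoted above. Here is a minimal sketch in Python; it assumes one house per household, which the census figures do not strictly guarantee, and the variable names are ours for illustration.

```python
# Rounded figures quoted above for the Nuwakot district.
households = 59_000            # Nepal's 2011 census
damaged_or_destroyed = 45_000  # houses reported destroyed or damaged

# Assumption: one house per household, so the remainder is "intact".
intact = households - damaged_or_destroyed
intact_share = intact / households

print(f"Intact houses: {intact:,} ({intact_share:.0%} of households)")
```

Under that assumption, roughly 14,000 houses, or about 24 percent, were left intact, which matches the "one in four" description.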

Code for Nepal offers “an interactive map of the effects of Nepal’s earthquake,” using “district-level data to show injury tolls, death counts, and houses damaged to determine where aid is needed the most.”

New study shows that people stop listening to new music at 33

A new study using data from music streaming site Spotify and music data site Echo Nest shows that people generally stop listening to new music at the ripe old age of 33.

The study found that teens listen “almost exclusively to top Billboard hits”; that those in their 20s begin exploring indie music; and that “tastes level off” once people are in their early to mid-30s, also known as “taste freeze.”

The data also surfaced gender differences. Women of all ages “are more likely to be streaming popular artists than men are,” perhaps because the most popular songs tend to be from female solo vocalists, and “the decline in popular music streaming is much steeper for men.” For instance, women “show a slow and steady decline in pop music listening from 13-49, while men drop precipitously starting from their teens until their early 30s.”

Brain food is real: Study shows how diet affects memory as we age

Over the course of five years, researchers followed almost 28,000 people “aged 55 or older from 40 different countries,” and found that those “who consumed the most nutritious food had a nearly 25 percent reduction in the risk of mental decline compared those with the least healthy diets.”

Participants self-reported the kinds of foods they ate and were tested on memory and thinking ability in the beginning of the study, after two years, and at the end. Data points included 10 different aspects of cognition; the ability to remember and recall a list of objects; arithmetic abilities; and attention span.

Over the five years of the study, almost 4,700 people “suffered a decline in thinking and memory,” and “those consuming the most nutritious diets were 24 percent less likely to have cognitive declines compared to people consuming the least healthy foods.”

The link between healthy diet and healthy brain activity may have to do with nutritious food’s positive effects on cardiovascular risk factors and cardiovascular disease, which “is an important mechanism for reducing the risk of cognitive decline,” says the study’s lead author. Another theory is that a healthy diet may reduce cellular inflammation, and that “If you’re eating well, odds are your brain is less stressed.” And as we all know, a less stressed brain is a happier one.

[Photo via Flickr: “hy-vee,” CC BY 2.0 by Dean Hochman]

Data Tells a Story: the dangers of air pollution, too much TV, and sitting for too long

She has since stopped watching TV

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: the dangers of air pollution, too much TV, and sitting for too long.

China To Use Big Data To Rate Citizens In New ‘Social Credit System’

Think of it as a credit score on steroids.

China hopes to use data such as financial standing, criminal record, and social media behavior to give a kind of “social credit score” to every citizen, which would hold everyone “accountable for financial decisions as well as moral choices.”

The goals of this numeric rating system would be to build “a harmonious Socialist society”; strengthen “the sincerity consciousness of the members of society”; and promote “‘socialist core values’ such as patriotism, respecting the elderly, working hard and avoiding extravagant consumption.”

Alibaba is already doing something similar with their users’ shopping data, although that focuses on loan, rather than moral, worthiness.

Olympics study links Chinese pollution to lower birth weights

A recent study sampling data from 83,672 births in Beijing found that those babies who spent their eighth month in the womb during the 2008 Olympics (early August through late September) were “23 grams larger at birth, compared to the same period in 2007 and 2009.”

This is because, “as a condition of hosting the Olympics,” Beijing reduced pollution levels between 18% and 59% by closing factories and power plants, seeding clouds, and restricting traffic. “Air pollution can result in lower birth weight,” according to one of the researchers, and while that doesn’t necessarily mean a baby is less healthy, low birth weight is related to some diseases later in life and higher risk during the baby’s first month.

The significant link was seen only in the eighth month of pregnancy when infants “enjoy a growth spurt.”

Air Pollution Tied to Brain Aging

A separate study with 943 men and women over 60 showed a link between air pollution and premature aging of the brain.

Researchers examined data from the participants’ MRI exams, how close they lived to major highways, and satellite data that measured a type of pollution “that easily enters the lungs and bloodstream.”

Researchers found that those exposed to the highest amount of this pollution “had a 46 percent increased risk for covert brain infarcts, the brain damage commonly called ‘silent strokes,’” and “that each additional two micrograms per cubic meter increase in [the pollutant] was linked to a decrease in cerebral brain volume equivalent to about one year of natural aging.”

Study makes surprising link between TV time and childhood obesity

A study run by the U.S. Department of Education found that of over 10,000 kindergarteners around the country, those who watched more than an hour of television a day were 52% more likely to be overweight than those who watched less.

The data points researchers took into account before and after the year-long study were BMI, amount of TV time, and amount of computer time. They found that kids who watched an hour or more of TV per day were “39% more likely to become overweight between kindergarten and first grade,” and “86% more likely to become obese during that time.” Interestingly, they found no correlation between computer time and BMI.

The American Academy of Pediatrics currently recommends less than two hours of total screen time (that’s computers and TV) per day, which, at least as per this study, seems like too much.

Study finds we think better on our feet, literally

Standing desks aren’t just for grown-ups anymore.

While previous studies have shown that standing desks can help reduce childhood obesity (those standing burned 15 percent more calories than those sitting), researchers have also found that being on their feet helped kids have at least 12 percent greater “on-task engagement” in classrooms than those who stayed on their rears.

The study involved 300 kids in second through fourth grade. The researchers observed the students over the course of a school year, looking for data points that fell into “on-task engagement” such as answering a question, raising their hand, and participating in discussion, as well as “off-task” behavior like talking when they weren’t supposed to.

This increased percentage of on-task engagement equals “an extra seven minutes per hour of engaged instruction time.”
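The seven-minute figure is consistent with simple arithmetic on the 12 percent gain. Here is a rough check in Python; it assumes the gain applies uniformly across a full 60-minute hour of instruction, which is our simplification rather than something the study states.

```python
MINUTES_PER_HOUR = 60
engagement_gain = 0.12  # "at least 12 percent" greater on-task engagement

# Assumption: the gain applies uniformly to a full hour of instruction.
extra_minutes = engagement_gain * MINUTES_PER_HOUR

print(f"{extra_minutes:.1f} extra engaged minutes per hour")
```

That works out to a little over seven minutes per hour, in line with the quoted estimate.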

[Photo via Flickr: “She has since stopped watching TV,” CC BY 2.0 by Nathan Walker]

Data Tells a Story: when college is worth it; creating new foods; protecting our forests

Forest near Vřesina

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: when college is worth it; creating new foods; and protecting our forests.

College is worth it if you have these six experiences

A study from the Gallup-Purdue Index has shown that “six elements of emotional support and experiential learning in college” are correlated with “long-term career and life success.”

The study involved 30,000 US college graduates and examined “the degree to which graduates were engaged in their work and thriving in their purpose, social, financial, community, and physical well-being,” all predictors of outcomes such as worker productivity, absenteeism, healthcare cost burden, and many more.

The Gallup research found that 25% of US college graduates “fail to thrive in their overall careers and lives,” and that same percentage miss out on all six of the elements correlated with success.

The six elements are a professor who made them “excited about learning”; professors who cared about them as people; a supportive mentor; the chance to work on a long-term project; an internship; and high involvement in extra-curricular activities.

Can We Find Meaning In Our Wearable Data? Exist Thinks So

Sure, our wearables tell us how many steps we’ve taken, how many calories burned, how much sleep we’ve gotten (or how little), but what do we do with all this data?

One platform uses data from wearables, apps, and services to find patterns and trends that users can act upon. Self-reported data such as mood can be measured against other data like number of steps, amount of sleep, or what music you’re listening to, which might show, for example, a correlation between mood and physical activity, quality of sleep, or type of music.
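To make "measuring mood against steps" concrete, here is a minimal sketch of a Pearson correlation between two daily series. The numbers are made up for illustration, and nothing here is taken from the platform itself, which may well use a different method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical week of data: steps from a wearable, mood self-reported on a 1-5 scale.
daily_steps = [3200, 8100, 5400, 10200, 2500, 7600, 9100]
mood_scores = [2, 4, 3, 5, 2, 4, 4]

r = pearson(daily_steps, mood_scores)
print(f"correlation between steps and mood: {r:.2f}")
```

A value near 1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means no linear relationship. Correlation alone does not say which one drives the other.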

The platform also recognizes unusual activity — a change in bedtime is one example — that the user might not have noticed.

From Big Data to Big Bets on Food Science

While Uber uses big data to optimize transportation, Airbnb to streamline lodging, and big pharma to discover new drugs, other startups are using big data to create new foods.

One small team of data scientists is building “a massive database of all known plant proteins,” which could amount to as many as 18 billion, with the idea of creating “new food sources for an expanding global population—sources that are cheaper, safer, and healthier than what we have today.”

With this database, the researchers can target their efforts by predicting how proteins will interact; identifying “combinations likely to produce enjoyable foods”; and pinpointing “what will produce the right tastes, textures, and colors.”

Predicting Tropical Deforestation With Big Data

Data scientists and rainforest conservationists are teaming up to use big data to predict instances of illegal logging, which endangers wildlife and makes climate change worse.

The team is using deep learning — a type of artificial intelligence “that involves processing tremendous amounts of data to solve problems in a way that roughly mimics the human brain” — to analyze “satellite images of tens of millions of acres of forest…to discover patterns that identify indicators of deforestation risk,” such as “road-building in previously undisturbed areas.”

With such predictive data, authorities can be alerted even before the tree cutting begins.

3 Cities Using Open Data in Creative Ways to Solve Problems

Three cities are taking advantage of open data by using it to make life better for their residents.

In New Orleans, both city officials and citizens “can evaluate the city’s progress in confronting urban blight” through the BlightSTAT program. The mayor’s office also uses the BlightSTAT data to make decisions about the fight against blight, which has decreased by 30 percent since the program began in 2010.

San Francisco has partnered with Yelp to make restaurants’ health inspection data more readily available (as we discussed in an earlier post), which not only lets city residents know about food hazards but also has the potential to “shame repeat-offender restaurants into complying with health standards.”

Finally, Louisville has partnered with a medical service provider to plant GPS trackers in inhalers. The trackers “measure when and where in the city people use them most,” and match these “hotspots of inhaler-user with air quality data” so that public officials can better target their interventions.

[Photo via Flickr: “Forest near Vřesina,” CC BY 2.0 by Jiri Brozovsky]

Data Tells a Story: a healthy world; cyberattacks; where dogs come from

Labs & a Portie

Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: trying to make the world healthier; factors in cyberattacks; and where dogs come from.

Turning to Big, Big Data to See What Ails the World

The Global Burden of Disease study seeks to compile and analyze a huge amount of data to try to “understand what sickens us and kills us in every country in the world.”

Getting the data hasn’t been easy. Researchers have had to comb through birth and death records, hospital files, and household surveys, pulling together data points about everything from acne to liver cancer to non-poisonous animal bites.

The study was innovative because it measured not just causes of death but causes of illness and years lost to disability. Different types of health losses were given a value. For instance, severe depression was weighted at .655, or six and a half years of life lost for each decade, and researchers were surprised to find that in 2010, major depression caused more total health loss than TB. They also found that osteoarthritis caused more years of life lost than natural disasters, and neck pain caused more than cancer.
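The weighting arithmetic is simple multiplication: a weight of .655 applied to ten years lived with the condition gives about six and a half years of healthy life lost. A minimal sketch of that calculation follows; the function name is ours, and the actual study's methodology involves far more than this.

```python
def healthy_years_lost(disability_weight: float, years_lived: float) -> float:
    """Years of healthy life lost: weight (0 = full health, 1 = death) times years lived with the condition."""
    return disability_weight * years_lived


lost = healthy_years_lost(0.655, 10)
print(f"{lost:.2f} healthy years lost per decade")
```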

Countries have been able to use this kind of data to change their spending in health resources and other areas. Iran, for one, found that “traffic injury was its leading preventable cause of health loss in 2003, and put money into building new roads and retraining police.”

Gestational Diabetes Is Linked to Autism Risk

Researchers at Kaiser Permanente Southern California have found that “children born to mothers who developed gestational diabetes before 26 weeks of pregnancy were at a 63 percent increased risk of being diagnosed with autism spectrum disorder,” although that risk dropped to 42 percent when controlling for such factors as maternal age, household income, and the mother’s pre-existing conditions.

The study looked at more than 320,000 babies between 1995 and 2009, and the results have two applications: first, the importance of early prenatal care, since the link between gestational diabetes and autism could mean “a fetus’ early exposure to uncontrolled high blood sugar may somehow affect brain development”; and second, extra vigilance during the infant’s developmental milestones when the mother was diagnosed with gestational diabetes before 26 weeks.

Ted Cruz Campaign Using Advanced Data-Mining to Target Voters

Be slightly worried: Ted Cruz has a data team, and they’re using data “not just to find out what issues matter to you, but why you care about them.”

However, this isn’t really new. Political candidates have been gathering the data dust of your digital footprint since Bill Clinton’s campaign in 1992. By 1996, says one campaign-data scientist, most campaigns were online, and by 2000, politicians and their teams were using data analytics to tailor their campaigns.

Cruz’s team will be gathering data via surveys and personality profiles to predict how voters might respond to their messaging.

User mistakes aid most cyber attacks, Verizon and Symantec studies show

Two studies have shown that “the vast majority of hacking attacks are successful because employees click on links in tainted emails, companies fail to apply available patches to known software flaws, or technicians do not configure systems properly.”

One finding showed that more than two-thirds of “the 290 electronic espionage cases it learned about in 2014 involved phishing,” or trick emails. As a result of so many people clicking on tainted links or attachments, hackers only need to send phishing emails to 10 employees to get inside “corporate gates” 90 percent of the time.
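Those two numbers imply a per-email click rate. The sketch below assumes each recipient clicks independently with the same probability, which is our simplification and not something the studies state; under that assumption, roughly one employee in five would have to click for 10 emails to succeed 90 percent of the time.

```python
def p_at_least_one_click(p_click: float, n_emails: int) -> float:
    """Chance that at least one of n recipients clicks, assuming independent recipients."""
    return 1 - (1 - p_click) ** n_emails


def implied_click_rate(p_success: float, n_emails: int) -> float:
    """Per-email click rate that yields p_success across n emails (inverse of the above)."""
    return 1 - (1 - p_success) ** (1 / n_emails)


rate = implied_click_rate(0.90, 10)
print(f"implied per-email click rate: {rate:.1%}")
print(f"check, success with 10 emails: {p_at_least_one_click(rate, 10):.0%}")
```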

Solving the Mystery of Dog Domestication

Archaeologists have long been trying to figure out where domesticated dogs come from.

A 1997 study that analyzed DNA from more than 300 modern dogs and wolves determined that “dogs may have been domesticated as early as 135,000 years ago,” while later studies “argued for a more recent origin—less than 30,000 years ago—but the exact time and location remained unclear.”

More recently, a researcher saw a pattern in his own dog data: breeds from East Asia were more “genetically diverse,” an indicator of “more ancient origins.” After a genetic analysis of more than 1,500 dogs, he concluded that “the animals had likely arisen in a region south of China’s Yangtze River less than 16,300 years ago—a time when humans were transitioning from hunting and gathering to rice farming.”

However, another researcher argued that “looking at modern DNA is a mistake,” and conducted his own study focusing on ancient DNA. After comparing the DNA of “18 dog- and wolflike bones from Eurasia and America to that of modern dogs and wolves from around the world,” he found that “the DNA of today’s dogs most closely matched that of ancient wolves.”

Now a new study brings in new technology: a computer that can take thousands of measurements “beyond mere length and width to determine actual shapes” such as eye circlets and “the jut and jag of every tooth.” While ancient DNA can tell you where the animal came from, one of the researchers says, only such “morphometric data can show you domestication in progress.”

[Photo via Flickr: “Labs & a Portie,” CC BY 2.0 by OakleyOriginals]

Data Tells a Story: resumes and gender; fighting cancer with data; the science behind happiness


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: gender differences in resumes; fighting cancer with data; and some science behind happiness.

The resume gap: Are different gender styles contributing to tech’s dismal diversity?

A machine learning expert used big data to try to understand the differences between men’s and women’s tech resumes, and perhaps to explain why there’s such a large gender gap in the tech industry.

After analyzing 1,100 resumes — 512 from men and 588 from women — the biggest difference she found was that while women’s resumes were longer, they had fewer concrete details.

Only 19% of women’s resumes fit on one page, as opposed to 61% of men’s. However, men provided more specific content: 91% of men included “bulleted verb statements that describe their achievements on the job, but only 36% of women do.”

Decoding and Defeating Cancer with Data Science

A group of UC Santa Cruz researchers are fighting cancer not with medicine but with data.

The data scientists are analyzing the molecular data of tumors to try to understand why treatment isn’t working. For example, tumors have traditionally been classified and treated “based on where in the body they originated.” The researchers, however, are using molecular data to “reveal similarities between tumors in different parts of the body that were not apparent before.”

In a study of 12 tumor types, the researchers’ data suggests that “as many as one in five may need re-classification.” Moreover, the misclassified tumors “were far more likely to be unresponsive to treatment.”

In addition, the researchers “are developing a social media-inspired platform” that links patients, doctors, researchers, and data from biopsy samples in real time. This will allow computer scientists and physicians to share their discoveries immediately, “instead of waiting months or years for the results to appear in peer-reviewed journals.”

Out With The Caraway, In With The Ginger: 50 Years Of American Spice Consumption

Armed with data found on the Economic Research Service of the U.S. Department of Agriculture’s website, FiveThirtyEight reporter Anna Maria Barry-Jester wanted to understand some spicy stories.

The data details “the availability of various spices every year from 1966 to 2012,” which can serve “as a proxy for consumption.” Over the past 50 years, spice consumption overall has nearly tripled, but the popularity of individual spices has been more varied.

For instance, the 1980s and ’90s saw a 2,000 to 3,000 percent increase in production of chile peppers. A spice shop owner saw the same popularity in her business and “couldn’t figure out the obsession until one day a customer walked in with a Rick Bayless cookbook,” which focuses on Mexican cuisine.

Another example focuses on turmeric. Between 1966 and 2004, turmeric consumption went up and down, but after 2011 it jumped nearly 70%. Multiple recent studies have shown that turmeric helps prevent diabetes, is as effective as ibuprofen, and has “shown potential for slowing the damage from neurological diseases.”

Using Biometric Data to Make Simple Objects Come to Life

A project on display in Dublin shows how objects can come “alive” with a human touch: a sugar bowl rises and falls with your breath rate; a tea bag “bobs up and down to the beat of your heart”; and a record player speeds up or slows down “based on skin conductance.”

The project centers on the idea of “using the electricity naturally surging through our bodies to transfer electronic signals to inanimate objects,” which could be extended from sugar bowls and tea bags to your car settings changing “depending on who’s gripping the wheel, or transferring files from your computer to your phone with your finger-tip.”

The Science Of Why You Should Spend Your Money On Experiences, Not Things

You might think a physical object that lasts a long time would bring more happiness than a fleeting experience — and you would be wrong.

A study from Cornell University showed that something called adaptation makes us less happy about objects. People’s self-reported data showed that while happiness for material and experiential purchases started out the same, over time their “satisfaction with the things they bought went down, whereas their satisfaction with experiences they spent money on went up.”

This is because people became adapted to the material purchases. In other words, that new TV fades into the background and “becomes part of the new normal,” while an experience like rock climbing or exploring a foreign city becomes part of who we are.

Another study showed that even some negative experiences bring happiness later, specifically after people “have the chance to talk about it.” Something that may have once been stressful becomes “a funny story to tell at a party or [is] looked back on as an invaluable character-building experience.”

Yet another study showed that people are less likely to negatively compare experiences than material purchases — it’s easier to compare and feel competitive about carats in a ring or shoe brands than it is about personal experiences.

[Photo via Flickr: “Happiness,” CC BY 2.0 by Moyan Brenn]

Data Tells a Story: Don Draper-bot; the importance of family time; the truth about apples


Welcome to another installment of Data Tells a Story, in which we round up our latest favorite data stories. This week: a Don Draper-bot; the importance of family time; and the truth about apples.

Computers can now get you to buy more stuff by tapping into your emotions

One company wants to use data to take the human writer out of copywriter.

It took eight years for the company to build up a database of 500,000 phrases commonly used in advertisements. They then categorized the phrases by emotional content (positive, negative, and neutral) and subdivided those into more specific emotions such as gratitude and excitement.

To get a phrase, the customer uses a program to enter product information, message requirements, and a particular feeling from a “wheel of emotions” (although the company suggests letting the program pick it).

The program also tests its messages against human-written ones, and, according to the company, “the messages chosen by the computer have a 95% probability for performing better than something written by a human.”

Study Has ‘Unexpected’ Findings on How Family Time Affects Kids

A study from the University of Toronto and Bowling Green State University has shown that “spending just six hours a week with their moms reduced the likelihood of negative behavior in teens.”

The researchers analyzed data from the Panel Study of Income Dynamics Child Development Supplement, specifically the amount of time mothers spent with their children in “the 3-year-old to 11-year-old age bracket and the 12-year-old to 18-year-old age bracket.”

They found that while the amount of maternal time didn’t matter for “offspring behaviors, emotions or academics” in either childhood or adolescence, it did matter for the number of delinquent behaviors among adolescents. Teens who spent as little as six hours a week with their mothers, or less than an hour a day, had fewer delinquent behaviors.

Why do private school students do better? It’s not their education, study finds

A Statistics Canada study of students from six Canadian provinces found that private school students did better academically not because of school resources and practices, but because of “socio-economic factors and peers who tend to have university-educated parents.”

Looking at about 7,000 students across almost 1,180 schools, the researchers found that, compared with public school students, higher percentages of private school students lived in two-parent families with both biological parents; their total parental income was higher; and they tended to live in homes with more books and computers.

New study investigates the link between family income and brain development

In a study of 1,099 participants, researchers at The Saban Research Institute of Children’s Hospital Los Angeles and Columbia University Medical Center have found that “among children from the lowest-income families, small differences in income were associated with relatively large differences” in brain surface area for regions “associated with skills important for academic success.”

The researchers found that “improved performance in cognitive skills was also associated with higher income.” They also point out that “family income is linked to nutrition, health care, schools, play areas and even air quality – all factors that can contribute to brain development,” and suggest that “wider access to resources likely afforded by the more affluent may lead to differences in a child’s brain structure.”

Daily Apple Not Associated With Reduced Doctor Use

A University of Michigan study discovered the unthinkable: an apple a day does not keep the doctor away.

Researchers reviewed data from 8,728 people and found that 9% were daily apple eaters. These daily apple eaters were also less likely to smoke, had higher average education levels, were more likely “to be from racial and ethnic minorities,” and were less likely to use prescription medications. However, they weren’t “any less likely to have seen a doctor more than once during the past year.”

[Photo via Flickr: “Camera Test Apple,” CC BY 2.0 by Kirinohana]