Datawatch: The BD&A-Files – How Big Data and Analytics can Solve the World’s Most Enduring Mysteries

From UFOs to world-famous cryptids, our latest Datawatch blog explores how analytics is helping to unravel and explain some of our most enduring mysteries.

Neil Armstrong once said that “mystery creates wonder, and wonder is the basis of man’s desire to understand.” If this is true, then it should come as no surprise that we remain fascinated by the things we find hard to explain in this world. In fact, as a species we’ve been asking many of the same questions for centuries – ruminating over the enduring mysteries that have yet to be solved by scientific advances. 

Over the last few months, our Datawatch blogs have explored a variety of ways analytics is used in day-to-day life, from creating Hollywood blockbusters to driving music trends and helping to find lost hikers. In this edition, we’re going to take a step into the unknown and look at how analytics can help us gain new insights into the big-ticket mysteries that have captured our collective imaginations for generations.  

Is it a bird? Is it a plane?

It’s hard to talk about Unidentified Flying Objects without a certain amount of stigma being attached to the subject; one that tends to involve little green men and stolen cattle. However, this summer a Pentagon report confirmed that UFOs – now rebranded as Unidentified Aerial Phenomena (UAPs) – exist, and are almost certainly physical objects. 

What exactly they are, however, is still largely unknown. In fact, we know just as little today about these sightings as John Winthrop did back in 1639 when he recorded one of the world’s first documented UFO encounters.

One of the problems is that eye-witness reports vary hugely, meaning it’s hard to gain any meaningful insights into what occupies our skies. But modern data science and analytics techniques may soon change this. 

A PhD student at Tsing Hua University recently conducted analytics research into 80,000 UFO reports, using lollipop charts, network diagrams and other visualisations to explore commonalities. The result is a clear picture of the most common physical characteristics of UFOs, where they are most likely to be spotted, and how sightings have changed over time. (As a tip, if this kind of thing gives you the heebie-jeebies, you might consider moving to Atlanta, GA, where sightings in the US are at their lowest.)

One of the most interesting findings of the analysis is that a great number of sightings in the US take place on both the 4th of July and New Year’s Eve – coinciding with large firework displays. It’s also worth noting that UFOs are most commonly spotted in coastal areas during the months of June and July – perhaps the result of atmospheric phenomena?
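For the curious, here is a minimal sketch of how a calendar-day pattern like this could be surfaced from a list of sighting dates. It is not the researcher's actual method; the function name `busiest_days` and the sample dates are made up purely for illustration.

```python
from collections import Counter
from datetime import date


def busiest_days(sighting_dates, top_n=3):
    """Return the (month, day) pairs with the most sightings, ignoring the year."""
    counts = Counter((d.month, d.day) for d in sighting_dates)
    return counts.most_common(top_n)


# Hypothetical sample data, not taken from any real sighting database.
sample = [
    date(2015, 7, 4), date(2016, 7, 4), date(2018, 7, 4),
    date(2016, 12, 31), date(2019, 12, 31),
    date(2017, 3, 12),
]
print(busiest_days(sample, top_n=2))
```

Grouping by month and day rather than by full date is what lets recurring events, such as annual firework displays, stand out across many years of reports.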

The analysis also revealed some insights into how the public perception of UFOs has changed with time. The traditional ‘flying saucer’, once a staple of early sci-fi films, is no longer en vogue. Sightings of this type of craft peaked in the 50s and have tailed off dramatically ever since.   

What lurks beneath the waves?

Similar studies have been used to help provide insight into other mysterious sightings, from Bigfoot in the Pacific Northwest to those closer to our own shores here in the UK.

Every year, hundreds of thousands of people flock to Loch Ness for the beautiful scenery and, of course, for the chance of catching a glimpse of the world-famous Loch Ness Monster. 

First reported in 565 AD, ‘Nessie’ has been sighted over a thousand times to date. But still, no one really knows what Nessie is, if it really exists, or how it lived to be 1456 years old, for that matter.

Today, big data analytics is playing a key role in trying to answer these age-old questions. As far back as 2013, an ecologist named Dr. Charles Paxton conducted statistical analysis of over 853 Nessie sightings looking for clusters of patterns. For instance, do sightings align with unusual wave activity? And in what month, at what time of day, and from what distance do sightings usually occur?

Interestingly, one of the report’s key findings was Paxton’s conclusion that anecdotal evidence can prove to be a vital data source and, despite its inherent inaccuracies, still reveal findings of scientific significance.

More recently, a professor from the University of Otago has analysed 500 million DNA sequences taken from Loch Ness water samples in an attempt to ascertain what kind of creature Nessie might be. 

After a year-long investigation, the professor’s analytics work found that there was no shark DNA, catfish DNA, or sturgeon DNA present in the loch. There was, however, a great deal of eel DNA.

The tentative conclusion is that Nessie is most likely a giant eel. Although the jury is still out as to whether that is any less unsettling than a bona fide sea monster. 

Is anywhere safe? 

If the thought of the paranormal is enough to keep you awake at night, then you may want to move somewhere with a little less spook-factor. If that is the case, there’s good news for you, too. 

As a bit of Halloween fun last year, towardsdatascience.com used analytics techniques to create a map of the least spooky places to live in the United States, based on ‘the density of cemeteries and haunted places in each metro area, as well as the per capita UFO sightings and Bigfoot encounters.’

The experiment was surprisingly in-depth, including data on things that may affect the perception of an area’s paranormal activity, like severe weather phenomena, locals’ likelihood to believe in the supernatural, population density, and the age of the residents. 
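One common way to combine several indicators like these into a single ranking is to standardise each one and average the results. The sketch below shows that idea only; it is an assumption about how such an index could be built, not the method the original analysis used. The function name, metric names and numbers are all hypothetical, and every metro is assumed to have a value for every metric.

```python
from statistics import mean, pstdev


def spookiness_scores(metros):
    """Average the z-scores of each metric to give one score per metro area.

    metros maps a metro name to a dict of metric name -> per-capita value.
    """
    metrics = sorted({m for values in metros.values() for m in values})
    scores = {name: 0.0 for name in metros}
    for metric in metrics:
        column = [metros[name][metric] for name in metros]
        mu, sigma = mean(column), pstdev(column)
        for name in metros:
            # A metric with no spread tells us nothing, so it contributes zero.
            z = 0.0 if sigma == 0 else (metros[name][metric] - mu) / sigma
            scores[name] += z / len(metrics)
    return scores


# Hypothetical figures for three made-up metro areas.
example = {
    "A": {"ufo_sightings": 9.0, "cemeteries": 4.0},
    "B": {"ufo_sightings": 3.0, "cemeteries": 2.0},
    "C": {"ufo_sightings": 3.0, "cemeteries": 3.0},
}
scores = spookiness_scores(example)
print(max(scores, key=scores.get))
```

Standardising first stops a metric with large raw numbers from drowning out the others; a real index might also weight the metrics differently.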

The end result saw the Worcester MA-CT metropolitan area named as the spookiest in the US. 

The Great Maple Syrup Mystery

And finally, perhaps the greatest mystery of them all – The Great Maple Syrup Smell of 2005.

Okay, you might not have heard of this one. But in 2005, New York news stations were rife with stories of a mysterious, pleasant, maple syrup odour appearing in certain parts of the city before quickly disappearing.

With 9/11 still in recent memory, many residents feared some sort of chemical attack and called the city’s non-emergency 311 information line to report the smell, but for four years its origins remained a mystery. Then, analytics broke the case.

Using a combination of the location data from the 311 calls, along with data related to temperature, humidity, wind direction and wind speed at the time of those calls, data scientists were able to trace the odour to an industrial plant in New Jersey that was processing fenugreek seeds, an ingredient often used as a flavouring in maple syrup.
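To give a flavour of how location and wind data can point back to a source, here is a heavily simplified sketch. It is not the method the city's data scientists used: it ignores temperature, humidity and wind speed, and simply draws a line upwind from each report and finds the point closest to all of those lines in a least-squares sense. The function names, the flat kilometre grid and the sample reports are all assumptions made for illustration.

```python
import math


def upwind_unit_vector(wind_from_deg):
    """Unit vector (east, north) pointing toward where the wind is coming from.

    Uses the meteorological convention: 0 = wind from the north, 90 = from the east.
    """
    rad = math.radians(wind_from_deg)
    return (math.sin(rad), math.cos(rad))


def estimate_source(reports):
    """Least-squares point closest to the upwind line through each report.

    reports is a list of (x_km, y_km, wind_from_deg) tuples on a flat local grid.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for x, y, wind_from_deg in reports:
        dx, dy = upwind_unit_vector(wind_from_deg)
        # I - d d^T measures the component of a point perpendicular to the line.
        m11, m12, m22 = 1 - dx * dx, -dx * dy, 1 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * x + m12 * y
        b2 += m12 * x + m22 * y
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-9:
        raise ValueError("wind directions are too similar to locate a source")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# Hypothetical reports: (x_km, y_km, direction the wind was blowing from).
reports = [(0.0, 2.0, 270.0), (-10.0, -5.0, 0.0), (-5.0, -3.0, 315.0)]
source_x, source_y = estimate_source(reports)
print(round(source_x, 3), round(source_y, 3))
```

The approach only works when reports come from days with different wind directions; if every line runs parallel, there is no single point where they converge.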

Interestingly, this kind of technique is nothing new. Way back in 1854, the same method was used to trace an outbreak of cholera that killed over 600 London residents in a week. Thanks to some handy analytics work monitoring the locations of the deceased and some rudimentary visualisation techniques, the outbreak was traced to a nearby water pump. This dispelled the myth that cholera was transmitted through the air, and also helped stop the outbreak in its tracks.

Analytics at The Smart Cube

Here at The Smart Cube, we offer bespoke, end-to-end analytics capabilities, from data engineering through to reporting and visualisation, and advanced analytics. 

To read about some of the ways we’re helping our clients, or to learn how we can help you achieve your own business goals, visit here