Kaggle: a data scientist’s playground
How do the world’s best data scientists stay sharp? Discover Kaggle – the healthy competition bringing the world’s leading analytics experts together.
The world of data science never stands still. New techniques and technologies are brought into the fold on a regular basis, and our analytics experts face fresh challenges almost every day.
For those working out in the field, keeping up to date with the latest methodologies and solutions is a vital part of the job – but far from an easy one.
We recently interviewed some of our data scientists here at The Smart Cube to understand how they approach their work and what makes them tick. One of the questions we asked was how they manage to stay on top of the latest trends.
Several of our interviewees answered with a single word: Kaggle.
Sorry, what’s a Kaggle?
That was our first question, too.
Kaggle is an online community of data scientists and machine learners, owned by Google.
As well as providing a customizable Jupyter Notebooks environment for creating and sharing code, free GPUs, and a huge repository of data and code, Kaggle runs regular competitions to help data scientists test their mettle. Some of these competitions have prize pools upwards of $1,000,000, and attract thousands of competitors.
Kaggle competitions are a sort of gym for the data science mind. They were initially inspired by the Netflix Prize, which in 2009 awarded $1 million to the team that improved the accuracy of the company’s recommendation engine by 10% (although, oddly, the winning algorithm was never used).
The big benefit of Kaggle competitions is that they present data scientists with an almost endless stream of varied and interesting tasks. Some of the most recent competitions challenged participants to classify defects in steel; predict the number of yards NFL players will gain after a handoff; advance the capabilities of 3D object detection in cars; and even predict which passengers would have survived the Titanic shipwreck.
To win these competitions, participants must devise original solutions. But this isn’t just a few minutes’ work during a lunch break. Kaggle competitions require extensive research, bespoke algorithms, and lateral thinking. The top teams in each contest will boast decades of combined experience, and compete not just for substantial prize money but also for global rankings and bragging rights.
Essentially, Kaggle is a playground for the best in the business to test their skills, improve their knowledge, and compete against their peers.
Why our experts take part
Kaggle competitions are a great illustration of the almost infinite potential of analytics and data science. They allow our scientists to pit themselves against a wide variety of problems – and there are always plenty of learning opportunities along the way.
Each competition has its own discussion board and debriefs with the winner, so participants can gain insights into the thought process of some of the world’s most experienced and talented data science professionals.
They can also explore ‘Kaggle Kernels’ – short scripts related to specific concepts – to get new ideas, and join forums where other data scientists freely share work, offer advice, and answer questions.
Crucially, the sheer variety of competitions available means our scientists can always find something that relates to an area they want to improve on, or something that’s applicable to the work they’re currently doing at The Smart Cube. At the time of writing, participants can choose from 14 competitions, and even specialise in a single programming language to help improve their knowledge of Python or R.
There’s plenty right with a little competition
Kaggle isn’t just good for data scientists looking to sharpen their skills. Much like open source software, this kind of collaborative engagement is good for the overall advancement of the industry, pushing boundaries and fuelling discoveries that can benefit our wider society.
For instance, in 2012, The Heritage Provider Network (HPN), a physicians’ group in California, offered a $3 million pot to the Kaggle team that could develop an algorithm to predict which patients would be hospitalized in the next year. And that’s just the start.
So far, the participants of Kaggle competitions have provided insights into the winners of the World Cup, the progression of HIV in patients, the likelihood of insurance pay-outs, and the location of dark matter in the universe.
With competitions like this uniting the best minds in data science, society can’t fail to benefit.
Want to learn more about how our data scientists stay sharp, or how analytics can improve our society? Check out our Meet the Data Scientists series.
Abhishek is passionate about developing and implementing analytical solutions for Fortune 500 companies, helping them understand customers and make better business decisions. He specialises in predictive analytics and visual storytelling around consumers and operations across the Retail, CPG and BFSI domains, focusing on data science and stakeholder management.