This article was contributed by Dmytro Spilka
When we hear the term ‘artificial intelligence,’ it’s natural to think of big data and the task of sifting through volumes of information in order to achieve qualitative insights. Many AI breakthroughs in the past few years have been heavily dependent on big data. For instance, image classification advanced rapidly over the last decade owing to ImageNet – a data set built upon millions of images that were manually sorted into thousands of categories. However, it’s important for businesses to appreciate the power of small data, too. This often-forgotten part of data collection is set to blossom in a decade dominated by GDPR and privacy control.
We can see plenty of examples of small data working in recent years too, with transfer learning emerging as a successful application of the approach. Also known as ‘fine-tuning’, transfer learning works by training a model on a large data set before retraining it using far smaller data sets.
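The pre-train-then-retrain idea can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the "model" is a tiny logistic regression, both data sets are synthetic, and every name here (`train`, `make_data`, `large_source`, `small_target`) is hypothetical rather than taken from any real project.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(features, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(data, weights, bias, lr, epochs):
    """Stochastic gradient descent for logistic regression.

    Training starts from the weights passed in, so the same function
    serves for pre-training (from zeros) and fine-tuning (from a
    previously trained model).
    """
    weights = list(weights)
    for _ in range(epochs):
        for features, label in data:
            error = predict(features, weights, bias) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def accuracy(data, weights, bias):
    hits = sum((predict(f, weights, bias) >= 0.5) == (y == 1) for f, y in data)
    return hits / len(data)


def make_data(n, threshold, rng):
    # Synthetic points: the label is 1 when the two features sum past `threshold`.
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [((a, b), 1 if a + b > threshold else 0) for a, b in points]


rng = random.Random(0)
large_source = make_data(2000, 0.0, rng)  # stand-in for the large data set
small_target = make_data(20, 0.3, rng)    # stand-in for the small, related data set
test_target = make_data(500, 0.3, rng)    # held-out data for the small task

# Step 1: pre-train on the large data set, starting from zeros.
pre_w, pre_b = train(large_source, [0.0, 0.0], 0.0, lr=0.1, epochs=5)

# Step 2: fine-tune on the small data set, starting from the pre-trained weights.
tuned_w, tuned_b = train(small_target, pre_w, pre_b, lr=0.05, epochs=50)

print("pre-trained only:", accuracy(test_target, pre_w, pre_b))
print("after fine-tuning:", accuracy(test_target, tuned_w, tuned_b))
```

In practice the pre-trained model would be a deep network rather than two weights, but the structure is the same: the small data set only has to adjust a model that already knows most of what it needs.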
When Christian Nielsen and Morten Lund of the University of California conducted a case study on how Sokkelund, a Copenhagen restaurant, grew its turnover from $1.1 million to $6.1 million within two years whilst depending on small data insights, they observed how the traditionally non-digital business streamlined its data flows and eliminated inefficient processes using the wealth of insight it obtained.
In digitizing its business, Sokkelund opted to rely on the smaller, more manageable data the restaurant produced. This concerned the following areas:
- Customer data, such as booking information, meals bought, turnover per seat, and seasonal variations in customer flow – all of which is easily accessible
- Supply chain information, which was streamlined to become more manageable
- Energy and water consumption
- The digitization of staff planning
- The emergence of social media and a digital presence
By tracking the data listed above, all of which is easily accessible, manageable, and actionable without the need for large-scale servers and costly AI algorithms, Sokkelund was able to make progressive decisions regarding its growth and act on them in a timely manner.
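To show how little machinery a metric such as turnover per seat needs, here is a minimal sketch in Python. The figures and the record layout are invented for illustration and are not Sokkelund's data; `turnover_per_seat_by_month` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical daily records: (month, turnover in dollars, seats available).
daily_records = [
    ("2021-06", 4200.0, 60),
    ("2021-06", 3900.0, 60),
    ("2021-12", 6600.0, 60),
    ("2021-12", 7200.0, 60),
]


def turnover_per_seat_by_month(records):
    """Return {month: total turnover / total seat-days} for each month."""
    turnover = defaultdict(float)
    seat_days = defaultdict(int)
    for month, amount, seats in records:
        turnover[month] += amount
        seat_days[month] += seats
    return {month: turnover[month] / seat_days[month] for month in turnover}


per_seat = turnover_per_seat_by_month(daily_records)
for month, value in sorted(per_seat.items()):
    print(f"{month}: ${value:.2f} per seat per day")
```

Comparing the monthly values side by side is enough to expose seasonal variation, and the whole calculation fits comfortably in a spreadsheet or a short script.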
But this isn’t to say that small data can’t be used more intelligently, and organizations have the potential to use complex algorithms as a means of making small data go further. For instance, researchers in India took a classifier trained on the big data of ImageNet and retrained it to locate kidneys in ultrasound images using just 45 training examples.
Small data can be more practical for small businesses to gather due to its cost-effectiveness, whilst still remaining sufficient for analysis. In the age of GDPR and heightened awareness of consumer privacy, big data can be far more difficult to access for businesses, but small data insights may yet steer companies to a qualitative decision-led future.
With GDPR forcing businesses to seek permission before collecting consumer data, we’re set to see more gaps in the information we can collect, with data models becoming considerably lighter than before. With this in mind, more businesses should consider how small data can work for them.
What is small data?
While big data focuses on the huge volumes of information that individuals and consumers produce for businesses to look at and AI programs to sift through, small data is made up of far more accessible bite-sized chunks of information that humans can interpret to gain actionable insights.
While big data can be a hindrance to small businesses due to its unstructured nature, masses of required storage space, and oftentimes the necessity of being held in SQL servers, small data holds plenty of appeal in that it can arrive ready to sort with no need for merging tables. It can also be stored on a local PC or database for ease of access.
However, as it is generally stored within a company, it’s essential that businesses utilize the appropriate levels of cybersecurity to protect the privacy of their customers and to keep their confidential data safe. Maxim Manturov, head of investment research at Freedom Finance Europe, has identified Palo Alto as a leading firm for businesses looking to protect their small data centrally. “Its security ecosystem includes the Prisma cloud security platform and the Cortex artificial intelligence AI-based threat detection platform,” Manturov notes.
Small data also poses some challenges to businesses. Cybersecurity represents one area of concern: centrally stored datasets may be more liable to be stolen by hackers, whilst big data is likely to be stored on external servers. And while small data can be a cost-effective way of gathering actionable insight, there’s also more danger of misinterpretation and biases emerging due to the smaller volumes of data available.
Because of the scale of the data you’re collecting, it’s possible to look at small data to answer specific questions or address emerging problems within your company. This data can include anything from sales data and website visits to inventory reports, weather forecasts, usage alerts, and just about anything else that’s accessible and easy for a human to fetch.
The challenges of small data
According to Gartner analysts, as many as 70% of businesses will shift their focus from big data to small and wide data by 2025. Like small data, wide data relies on businesses tying together the data they produce across a range of different sources – like website traffic, store visits, social media engagements, and telephone inquiries. This is a seismic shift that points to more organizations opting to act on more cost-effective but powerful data insights in the coming years.
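As a rough illustration of what tying sources together can look like, the sketch below merges daily counts from four sources into one row per date. The source names, dates, and numbers are all made up, and `build_wide_table` is a hypothetical helper, not part of any analytics product.

```python
# Hypothetical daily counts from four separate sources, keyed by ISO date.
website_visits = {"2022-01-10": 340, "2022-01-11": 410}
store_visits = {"2022-01-10": 52, "2022-01-11": 61}
social_engagements = {"2022-01-10": 120}
phone_inquiries = {"2022-01-11": 9}


def build_wide_table(**sources):
    """Merge per-source daily counts into one row per date.

    A source with no reading for a date is recorded as None instead of
    a guessed value, so gaps in the data stay visible.
    """
    dates = sorted(set().union(*(counts.keys() for counts in sources.values())))
    return [
        {"date": day, **{name: counts.get(day) for name, counts in sources.items()}}
        for day in dates
    ]


wide = build_wide_table(
    website_visits=website_visits,
    store_visits=store_visits,
    social_engagements=social_engagements,
    phone_inquiries=phone_inquiries,
)
for row in wide:
    print(row)
```

Each row is small, but it is wide: a single date now carries a reading from every channel, which is what lets a person spot that, say, a spike in web traffic coincided with more store visits.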
There are a number of challenges that come with working with small data, particularly when it comes to managing data imbalances and optimizing models on fewer data sets. However, there are also a number of approaches to data collection that can help small businesses to make the most of the information they can access.
While it can be difficult for businesses to understand the volume of data they need for a project, there are plenty of non-technical solutions that can be explored. With this in mind, it’s worth decision-makers spending more time looking at the volume of data that they can collect from customers before embracing more intricate machine learning algorithms to sift through it.
While humans are often capable of learning from a single example and possess the ability to distinguish new objects with high accuracy, the same qualities are far harder for machines to master.
Deep neural networks require large volumes of data to train and generalize their results. This can be a drawback when it comes to businesses that aren’t blessed with huge volumes of data to draw on. However, one-shot learning has been developed as a way of training neural networks with extremely small data sets.
This means that a model first learns from one big data set and then applies what it has learned to significantly smaller, or even singular, examples. This can certainly be useful for small businesses that don’t have the levels of customer flows to call on AI to generate actionable insights. Simply put, one-shot learning requires just one big data set to apply its processes to subsequent small datasets that otherwise would be too scant to understand.
We’ve seen plenty of examples of one-shot learning emerge in recent years, with the most common arriving in the form of passport control scanners, which are tasked with matching your face to your passport image, a picture they have never come into contact with before.
This technology can be trained to learn from extremely small samples of customer data, like past purchases (not in the case of biometrics, of course).
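One common way to picture the passport example is as a similarity check against a single stored reference. The sketch below assumes that some network, already trained on a large data set, has turned each image into a short list of numbers (an embedding); the embeddings, labels, and the 0.8 threshold are invented for illustration, and `identify` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(query, references, threshold=0.8):
    """Return the label of the most similar reference embedding.

    Each label has exactly ONE reference example. None is returned when
    no reference reaches the similarity threshold.
    """
    best_label, best_score = None, threshold
    for label, embedding in references.items():
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical embeddings; a real system would get these from a
# network pre-trained on a large data set.
references = {
    "traveller_a": [0.9, 0.1, 0.3],   # from the single passport photo
    "traveller_b": [0.1, 0.8, 0.5],
}
live_scan = [0.85, 0.15, 0.35]        # new picture of traveller_a
stranger = [-0.5, 0.2, -0.7]          # matches nobody on file

print(identify(live_scan, references))
print(identify(stranger, references))
```

The heavy lifting sits in the embedding step, which is learned once from big data. After that, recognizing a new person takes one example and a comparison, with no retraining.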
Utilizing analytical tools for small data insights
Small data means that businesses can tap into more manageable data sources like Google Analytics and Hotjar – with both platforms offering comprehensive insights into how users interact with host websites.
As the name suggests, analytical tools can generate a healthy level of insight into the performance of a company’s website. This is significant for developing small datasets and accessing information that can help to corroborate emerging data trends.
Google Analytics, for instance, has the ability to collect valuable information surrounding the interactions websites receive whilst interpreting the numbers via a digestible visualization, from basic info like unique visits and time-on-site to more advanced data sets like scrolls and goal conversions.
This example of small data in practice can help businesses to act on high bounce rates across landing pages, for instance, or drops in returning visitors.
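Acting on bounce rates can be as simple as the following sketch, which works on a hypothetical export of session rows rather than on any real analytics API. It assumes a simplified definition in which a bounce is a session that viewed exactly one page; analytics platforms define the metric in their own ways, so treat the numbers as illustrative.

```python
from collections import Counter

# Hypothetical exported sessions: (landing page, pages viewed in the session).
sessions = [
    ("/pricing", 1), ("/pricing", 1), ("/pricing", 3), ("/pricing", 1),
    ("/blog", 2), ("/blog", 1), ("/blog", 4), ("/blog", 5),
]


def bounce_rates(session_rows):
    """Return {landing page: share of sessions that viewed only one page}."""
    totals, bounces = Counter(), Counter()
    for landing_page, pages_viewed in session_rows:
        totals[landing_page] += 1
        if pages_viewed == 1:
            bounces[landing_page] += 1
    return {page: bounces[page] / totals[page] for page in totals}


def pages_to_review(session_rows, threshold=0.6):
    """List landing pages whose bounce rate exceeds the threshold."""
    rates = bounce_rates(session_rows)
    return sorted(page for page, rate in rates.items() if rate > threshold)


rates = bounce_rates(sessions)
flagged = pages_to_review(sessions)
print(rates)
print("Review these landing pages:", flagged)
```

The 0.6 threshold is an arbitrary assumption. A business would pick its own cut-off based on what is normal for its pages.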
For small businesses, the small data insights that analytics tools can deliver are capable of driving far greater levels of engagement and more strategic marketing campaigns.
Learning from causal AI
Small data calls for more tailored AI systems, too. Causal AI represents the next frontier of artificial intelligence. This technology has been developed to reason about the world in a similar way to humans: just as we can learn from extremely small datasets, causal AI has been developed to do the same.
Technically speaking, causal AI models can learn from minuscule data points owing to data discovery algorithms, which are a novel class of algorithms designed to identify important information through very limited observations – just like humans. Causal AI can also enable humans to share their own insights and pre-existing knowledge with the algorithms, which can be an innovative way of generating circumstantial data when it doesn’t formally exist.
In business terms, this means that causal AI algorithms can be fed small data across a range of different sources to identify recurring themes that typical AI systems would be unable to address. As the technology continues to emerge, we’re likely to see causal AI identify more consumer insights for marketers through the wealth of information businesses generate across a range of touchpoints. This can breathe new life into small data models and equip businesses with a more manageable approach to organizing their data in a future that may offer fewer insights into the behavior of consumers.
While big data is the word on everyone’s lips, small data may emerge as an essential part of a future dominated by GDPR and a greater emphasis on privacy.
Dmytro Spilka is a writer based in London and the founder of Solvid, a creative content creation agency based in London, UK. His work has been published in The Next Web, Nasdaq, Entrepreneur, Kiplinger, Financial Express and Zapier.