With insights from Kevin Denver, Principal Software Engineering Consultant, Kevin.Denver@zuhlke.com

Walk into any high street store, look up, and you’ll probably see a video camera staring right back at you. But despite CCTV serving as the bedrock of retail security since the 1980s, video footage is a fairly blunt instrument – a reactive resource, rather than a proactive tool. Unless, that is, that age-old video capture hardware is paired with AI-enabled computer vision (CV) software.

With the help of machine learning and modular, interconnected data ecosystems, in-store cameras look set to unlock a seismic shift in the way retailers think about security, optimisation, and personalisation.

But that’s only if businesses can navigate a complex web of ethical concerns, outdated systems, and siloed data, all of which stand between today’s cameras and tomorrow’s smart stores.

The possibilities of computer vision in retail

Let’s fast-forward. Imagine a retail environment where computer vision is used to analyse camera footage across a range of verticals, and against a wide array of data sets.

In this store, footfall heatmaps highlight under-used areas alongside common customer journeys, and that data is used to recommend more efficient layouts. Popular aisles are flagged and cross-referenced with weather and public holiday data to suggest hyper-timely promotions. Self-checkouts avoid weigh-in and age verification disputes with AI backup from overhead cameras. Would-be thieves are flagged in the act, rather than at the door. And queues are monitored in real time to optimise both in-the-moment and predictive staffing rotas.

This is the promise of AI-enabled computer vision in retail: taking the cameras that already exist in millions of stores and turning them into inputs for AI-powered insight.
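
As a rough illustration of the footfall heatmap idea described above, the sketch below bins floor positions into a grid and lists the cells that fall below a visit threshold. It assumes a hypothetical upstream tracker has already turned camera footage into (x, y) floor coordinates in metres; the function names, cell size, and threshold are illustrative assumptions rather than part of any particular product.

```python
from collections import Counter


def footfall_grid(positions, cell_size, width, height):
    """Count observed positions per grid cell.

    positions: iterable of (x, y) floor coordinates in metres
               (assumed output of a hypothetical tracker).
    Returns a list of rows, where grid[row][col] is the number of
    observations that fell inside that cell.
    """
    cols = int(width // cell_size)
    rows = int(height // cell_size)
    counts = Counter()
    for x, y in positions:
        col = int(x // cell_size)
        row = int(y // cell_size)
        # Ignore observations that fall outside the mapped floor area.
        if 0 <= col < cols and 0 <= row < rows:
            counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(cols)] for r in range(rows)]


def under_used_cells(grid, threshold):
    """Return (row, col) for every cell with fewer than `threshold` observations."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, n in enumerate(row)
        if n < threshold
    ]


# Illustrative data only: five observations on a 2 m x 2 m floor.
positions = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (0.2, 1.8), (0.9, 0.9)]
grid = footfall_grid(positions, cell_size=1.0, width=2.0, height=2.0)
quiet = under_used_cells(grid, threshold=1)
```

In a real deployment the grid would be only one input: as the article goes on to argue, the counts become more useful once they are combined with other data such as layouts, weather, or sales.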

These use cases all rest on the same principle: a ‘digital twin’ of each retail environment analyses different data points – of which video is just one – to surface actionable suggestions. The potential outputs cover a few primary areas:

Theft detection
AI can watch for theft in real time, using pattern analysis to spot when someone is near a high-value product, and whether they make movements typical of theft, such as concealing an item.

Safety and security enforcement
This might include checking that fire exits are clear, gating age-restricted areas, detecting spills or hazards, and even using pattern analysis to predict violent behaviour before it occurs.

Store optimisation
Optimisation can include everything from AI-powered self-checkout technology to smart inventory management, automatic age checks, and intelligent store layout insights based on customer behaviour.

What’s key here is that computer vision really comes into its own when partnered with other data types. That means combining AI-enabled video with things like inventory databases, staff schedules, or customer purchase histories to find new insights. However, much of the necessary data remains hidden in legacy systems, fragmented across departments, or confined within outdated infrastructure. Extracting its value demands robust data analysis, thoughtful modelling, and seamless integration to truly prepare for AI at scale.

But even with the right data foundations, there are still a few more hurdles to overcome.

The fine print of progress: what’s holding AI-powered computer vision back

While the promise of AI-enabled computer vision in retail is compelling, the path to implementation is far from frictionless. Before the benefits of smarter stores can be realised, some uncomfortable questions and fundamental upgrades must be addressed.

Orwellian obstacles

Today’s retail businesses may have the hardware in place to bring our computer vision use cases to life, but how many of them also have the infrastructural and moral boxes ticked?

First up: ethics. Using video footage to generate AI insights naturally raises a handful of very important concerns. For decades we’ve all acknowledged (and largely ignored) signs advising that CCTV is in operation, but what happens when those same systems are actively feeding generative AI models? Is a sign on the wall enough of a mutual agreement in that instance?

The European Data Protection Board has highlighted the sensitivity around surveillance in public spaces, noting the difficulty of obtaining meaningful consent in environments where individuals can’t easily opt out. Similarly, the UK’s Information Commissioner’s Office concedes that “in practice, it is often difficult to obtain genuine consent from individuals for processing their personal data in public spaces.” Meanwhile, point 6 in the Surveillance Camera Code of Practice states that “no more images and information should be stored than that which is strictly required for the stated purpose of a surveillance camera system.”

That presents obvious regulatory challenges in ‘test and learn’ scenarios, where the final use case for the technology isn’t yet defined, and therefore can’t be clearly communicated or consented to. But even beyond the raw legalities, there are deeper ethical factors at play.

Staff and customers alike need to feel confident that any active surveillance system doesn’t exploit their privacy, and that there’s transparency around the kind of data being collected. Compliance with data collection regulations like GDPR can be tricky enough even before AI enters the mix – a space where the EU requires numerous ethical checks and balances.
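
One possible reading of the storage principle quoted above is to keep only aggregate figures rather than images or per-person records. The sketch below is a minimal illustration under that assumption: the `Detection` record, its fields, and the `minimise` function are hypothetical, and whether aggregation of this kind is sufficient for a given purpose remains a question for each retailer’s own legal and governance review.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One hypothetical per-frame detection from a vision model."""
    track_id: int  # short-lived tracker ID, used only for de-duplication
    zone: str      # named store zone, e.g. "queue"
    hour: int      # hour of day, 0-23


def minimise(detections):
    """Reduce detections to distinct-visitor counts per (zone, hour).

    Track IDs are used to avoid counting the same person twice within
    a zone and hour, then dropped: the result contains no identifiers.
    """
    seen = {(d.zone, d.hour, d.track_id) for d in detections}
    return dict(Counter((zone, hour) for zone, hour, _ in seen))


# Illustrative data only: track 1 appears in two frames of the same zone.
detections = [
    Detection(track_id=1, zone="queue", hour=9),
    Detection(track_id=1, zone="queue", hour=9),
    Detection(track_id=2, zone="queue", hour=9),
    Detection(track_id=3, zone="entrance", hour=10),
]
summary = minimise(detections)
```

The design choice here is that only `summary` would be written to storage, while the raw detections stay in memory and are discarded once they have been counted.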

Tech maturity gap

Wrapped up in all this are requirements for fairness, explainability, and data security – the last of which goes hand-in-hand with our other major challenge: technological maturity.

Despite the vast majority of retail environments having cameras installed onsite, the data architecture and back-end systems often lag behind what’s needed to unlock true computer vision capabilities. Orchestrating data pipelines and applying effective governance are further, intertwined necessities that, for many stores, might seem a bridge too far.

So, whilst this technology has the potential to transform how retail organisations operate, it comes at the cost of significant investment in a range of key areas:

- Analysis of ethical, legal, and security concerns
- An end-to-end service design approach
- User research and experience design
- Trials of new technologies
- Design and build of products
- Customisation and licensing of the best third-party products

Wind back, scale up

Time to rewind to the here and now. How can retailers overcome the technological and ethical challenges outlined above today, in order to unlock tomorrow’s AI-powered efficiency gains?

The answer lies in finding focus. Computer vision in retail can build a new generation of smart stores, but that overwhelming wealth of possibilities can get in the way of practical deployment. Retailers therefore need to row things back to a ‘minimum viable product’ approach – a clear use case with a specific business value.
Focused outputs (with equally focused inputs) can help crystallise both the necessary tech stack upgrades and an ironclad data governance approach.

In simple terms, that means:

- Choosing a single, business-led use case for computer vision in retail
- Outlining the goal and the data points needed to achieve it
- Drawing up data governance guidelines to match
- Relaying these guidelines to customers and staff
- Implementing a human layer to intervene on important decisions
- Adopting a modular approach that enables future innovation

This systemic mindset mirrors our wider guidelines on navigating the transition from legacy retail systems, as outlined in our Retail CTO Playbook:

- Prioritise composable architecture
- Focus on data unification
- Invest in scalable infrastructure
- Adopt AI strategically
- Integrate sustainability

To learn more – and to understand the role emergent retail technology will play in the next decade – download the Zühlke Retail CTO Playbook.