Memories.ai Closes $8M Seed to Power Persistent Video Memory for AI

The team is building what it calls a "Large Visual Memory Model"

Large AI models can summarize a TikTok clip or generate a short film script, but long video context remains a blind spot. Most models struggle beyond an hour or two of footage, a limitation that makes AI brittle in sectors like security and marketing, where context is built up over weeks or months. That's the problem Memories.ai, a San Francisco-based startup, is trying to solve.

Founded by two former Meta Reality Labs researchers, Shawn Shen and Ben Zhou, the company recently raised an $8 million seed round led by Susa Ventures, with participation from Samsung Next, Crane Venture Partners, Fusion Fund, Seedcamp, and Creator Ventures. The round, originally targeted at $4 million, was oversubscribed, an indication of investor appetite for AI systems with reliable long-term recall.

A Memory Layer for Machines

Memories.ai’s pitch is centered on infrastructure. Specifically, the team is building what it calls a Large Visual Memory Model (LVMM), a system designed to enable machines to “see, understand, and recall” visual information persistently. The platform indexes video data across long timeframes and makes it searchable through natural language queries. Its use cases are broad, but current traction is concentrated in two areas: surveillance and marketing.

Security firms use the technology to query months of footage for specific actions or objects. Marketing teams apply it to track brand visibility and sentiment across social video platforms. The model powers both a web-based chatbot interface and an API, letting developers integrate long-term video memory into applications like robotic agents and AR interfaces.
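For a sense of what such an integration could look like, here is a minimal, hypothetical sketch of a client running a natural-language query against months of indexed footage. The endpoint, client code, field names, and response shape are illustrative assumptions, not Memories.ai's published API.

```python
import requests

API_BASE = "https://api.example-video-memory.com/v1"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"

def query_footage(query: str, start: str, end: str) -> list[dict]:
    """Run a natural-language query against indexed footage.

    All names and fields here are illustrative; Memories.ai's real API may differ.
    """
    resp = requests.post(
        f"{API_BASE}/search",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"query": query, "start_time": start, "end_time": end},
        timeout=30,
    )
    resp.raise_for_status()
    # Assumed response shape: [{"video_id": ..., "timestamp": ..., "score": ...}]
    return resp.json()["matches"]

# A security-style use case: find a specific action across months of footage
hits = query_footage(
    "person leaving a package at the loading dock",
    start="2025-01-01T00:00:00Z",
    end="2025-06-30T23:59:59Z",
)
for hit in hits:
    print(hit["video_id"], hit["timestamp"], hit["score"])
```

The same pattern would apply to the marketing use case, with the query describing a brand appearance rather than a security event.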

Traditional video analysis tools load an entire clip into memory. Memories.ai instead applies a multi-layered process that compresses footage, strips away irrelevant data, indexes useful frames, and aggregates insights. According to technical documentation provided by the company, this structure allows for faster queries without losing context.
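As a rough illustration of those four stages, consider the sketch below. The data structures, sampling rate, and salience filter are stand-ins chosen for clarity, not the company's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float
    embedding: list[float]   # stand-in for a visual feature vector
    salience: float          # how informative the frame is, in [0, 1]

def build_memory_index(frames: list[Frame], salience_threshold: float = 0.5):
    """Illustrative pipeline mirroring the stages the company describes:
    compress, filter, index, aggregate. The logic is a stand-in only."""
    # 1. Compress: keep roughly one frame per second instead of every frame
    compressed = frames[::30]  # assuming a ~30 fps source
    # 2. Filter: strip frames below a salience threshold as irrelevant
    salient = [f for f in compressed if f.salience >= salience_threshold]
    # 3. Index: map timestamps to embeddings for later retrieval
    index = {f.timestamp: f.embedding for f in salient}
    # 4. Aggregate: record summary statistics rather than the raw stream
    summary = {"frames_indexed": len(index), "frames_sampled": len(compressed)}
    return index, summary
```

The design point is that later queries hit a compact index rather than raw video, which is what would make recall over months of footage tractable.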

“Many top models, whether from OpenAI, Google, or Meta, start to break down when they deal with more than an hour or two of video,” Shen told TechCrunch. “We were inspired by how human memory works: it’s not just about retention, but connection and recall.”

The platform has reportedly been tested on video datasets exceeding 10 million hours and has demonstrated performance improvements across standard benchmarks for video classification, retrieval, and question answering. Shen claims the system avoids common AI failure points like hallucinations and context drift, thanks to its memory-centric architecture.

Investors See a Gap in Long-Context Video AI

Memories.ai is entering a space already occupied by the likes of Twelve Labs and Google DeepMind, which are also exploring long-context video models. A growing group of “memory layer” startups, such as mem0 and Letta, are attempting similar feats, though with limited support for video and less developed infrastructure.

Investors view Memories.ai’s edge as both technical and architectural. “There’s a gap in the market for long-context visual intelligence,” said Susa Ventures partner Misha Gordon-Rowe. “Shen is pushing the boundaries of video understanding in a way we haven’t seen before.”

Samsung Next’s interest points to a potential consumer application. “On-device computing means users don’t need to store video data in the cloud, which can unlock better security applications,” said partner Sam Campbell, referencing privacy concerns that could be addressed by keeping memory local.

With a 15-person team and new capital in hand, Memories.ai plans to scale operations and enhance search capabilities, while building partnerships across industries. Future product directions include syncing content directly from customer drives and enabling more advanced assistant-like queries, such as summarizing a week’s worth of interviews or identifying recurring visual themes in media archives.

The company’s founders believe the technology could eventually serve as a foundational layer for a broader range of AI systems, including assistants capable of real-time, contextual recall and humanoid robots that learn over time through visual experience. But for now, the focus remains on high-volume video analytics and integration with customer libraries.


Mukundan Sivaraj
Mukundan is a writer and editor covering the AI startup ecosystem at AIM Media House. Reach out to him at mukundan.sivaraj@aimmediahouse.com.