Google’s Project Ellman: Merging photo and search data to create digital twin chatbot

Google is reportedly toying with the idea of using its latest Gemini AI models to analyze images from Google Photos and text from Search to put together a life story for users.

The technology is currently being explored under “Project Ellman”, and would be powered by Google’s new multimodal large language model Gemini, announced this week. The idea is to ingest different types of data from multiple sources, like photographs stored on Google Photos or public information pulled from the internet, to create a more personalized chatbot.

Staff working across Google Photos and Gemini presented Project Ellman, describing the potential product as: "Imagine opening ChatGPT but it already knows everything about your life. What would you ask it?" according to CNBC. The project is reportedly named after the literary critic and biographer Richard David Ellmann, who specialized in writing about Irish writers like James Joyce, Oscar Wilde, and William Butler Yeats.

Project Ellman would use AI to create a biography of users from their personal data. “We can’t answer tough questions or tell good stories without a bird’s-eye view of your life,” Google said in its presentation slides. “We trawl through your photos, looking at their tags and locations to identify a meaningful moment. When we step back and understand your life in its entirety, your overarching story becomes clear.”

Presumably amnesiac users could ask Ellman Chat whether they ever had a pet, and it would check whether there are pictures of animals in their data, and look for other photos in which family members appear next to, say, a dog or cat, to figure out the answer.

A spokesperson from Google declined to answer The Register’s questions about what kind of access a user would have to give the model for it to collect their personal data. Would it have to inspect information stored on their smartphones or laptops, for example?

“Google Photos has always used AI to help people search their photos and videos, and we’re excited about the potential of LLMs to unlock even more helpful experiences,” the representative told us.

“This is a brainstorming concept a team is at the early stages of exploring. As always, we’ll take the time needed to ensure we do it responsibly, protecting users’ privacy as our top priority.”

Gemini would be able to identify key milestones and important moments across a person’s life by looking at things like graduation or vacation photos. It could, in theory, piece together information about what university they attended or places they went to by analyzing information on Google Search. Google described the process of adding personal data to build a more detailed view of someone’s life.

"One of the reasons that an LLM is so powerful for this bird's-eye approach, is that it's able to take unstructured context from all different elevations across this tree, and use it to improve how it understands other regions of the tree," according to the presentation. "This LLM can use knowledge from higher in the tree to infer that this is Jack's birth, and that he's James and Gemma's first and only child," Google said in an example.

By analyzing users in depth, Project Ellman could also be used to predict what products people might be interested in buying, or where they might want to travel, by looking at screenshots of images they have saved. It could also determine the top websites and apps they visit most, which is all grist to the advertising money mill. ®

