Meta Releases New AI-Based Photo Segmentation Tool to Everybody

Meta has some big AI ambitions, even as it seems like it’s long been playing catch-up to OpenAI, Microsoft, and even Google. To make a bit of a splash, on Wednesday the company showed off its new AI-based Segment Anything Model (SAM), which is surprisingly capable of identifying and separating specific objects in images and video. Here’s the kicker: Meta is releasing it to anybody by making its new software open source.


There are quite a few good apps for erasing unwanted objects from images, and all of them already employ AI models to find and replace objects in photos. In my own tests of the Segment Anything demo, Meta has gone a step further with its own offering. The demo system works like Photoshop’s ‘Magic Wand’ tool on steroids. I tried it out using a few crowded images, such as a photo of Lego’s massive Rivendell set. Not only did it correctly guess that I was trying to select specific minifigs out of the background, but when it picked up a few wayward pixels, I was quickly able to tell it to delete anything that wasn’t a Lord of the Rings character with just a single click.

After processing a new image, the system does a solid job highlighting different objects in a photo. In an image of myself sitting in an extremely confining massage chair, it was able to identify me, the chair, and even my beard individually. Of course, Meta isn’t alone in creating machine learning algorithms to identify aspects of images. Apple has talked about its AI image segmentation technology since 2021.

But what might set Meta apart is both function and usability. In my own tests, I found SAM is even better at selecting small objects from crowded photos than Google’s Magic Eraser or the free online tool Inpaint, though there’s no function for removing aspects of a photo and replacing its background.


Meta said SAM is capable of outputting multiple masks even when there’s “ambiguity” about the object. Even so, the company described this as just a “foundation model” useful for image segmentation, both interactive and automatic. The system is described as “promptable,” meaning it can take input such as a user’s gaze in a VR headset, clicks, or even text.

Perhaps most surprising is that Meta is releasing SAM under an open license and is also providing full details on its 1-billion-mask dataset, which the company claimed was “the largest ever segmentation dataset.” SA-1B is a semantic segmentation dataset that classifies every pixel in an image, making it easier to stylize or remove objects from photos. According to Meta, the system itself was trained on 11 million images with an average of 100 masks per image.
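
For a quick sanity check on those numbers, here is a minimal Python sketch. It uses only the two figures quoted above, the variable names are my own, and it doesn't touch Meta's actual code or data.

```python
# Figures quoted in the article, attributed to Meta.
images = 11_000_000          # images in the dataset
avg_masks_per_image = 100    # approximate average, as reported

# Implied total number of masks across the whole dataset.
total_masks = images * avg_masks_per_image

print(f"{total_masks:,} masks (~{total_masks / 1e9:.1f} billion)")
```

That works out to roughly 1.1 billion masks, which lines up with the “1-billion mask” label.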

According to Meta’s research paper on SAM, the dataset used images “from a provider that works directly with photographers,” though it did not specify which provider. Some of the images the system was trained on did include faces and license plates, though the paper says Meta blurred those out when it released the dataset.

It’s great to see Meta willing to open source one of its models and the data behind it, though we shouldn’t expect much more stuff for free. Meta has recently made a hard pivot to AI, so much so that Andrew Bosworth, the head of the company’s metaverse division, and other execs are talking up how the company plans to use generative AI for creating ads alongside other commercial-side products. The company is still working on a public release of its ChatGPT competitor, LLaMA, even though it has already leaked online.

Sure, SAM could be used in either AR or VR to identify objects by a user’s gaze, something that’s pretty important for Meta’s ambitions for its AR headsets and glasses. But there’s still plenty of room for abuse. Google’s DeepMind AI detection system has proved effective in identifying cancer cells, but similar systems have been used for facial recognition. The ACLU recently revealed the FBI had tested facial recognition software on U.S. citizens for years. As this technology gets more sophisticated, the U.S. desperately needs a federal ban on facial and biometric recognition, or at the very least more regulation.

