13 Comments
cebert
The Seamless Translation demo is fantastic. The translated voice is a passable match for my own native voice. It will be incredible when we can achieve this in real time.
rob-olmos
Is this purposely spelled "Aidemos" somewhere, as the HN title has it, instead of "AI Demos"?
npalli
“Our site is not available in your region at this time.”
brap
What is Meta’s angle with AI? They seem to be doing a lot of research, but what is the end goal? Google and MSFT I understand; Meta, not so much.
meltyness
It's a toolbox of demos with the following:
Segment Anything 2: Create video cutouts and other fun visual effects with a few clicks.
Seamless Translation: Hear what you sound like in another language.
Animated Drawings: Bring hand-drawn sketches to life with animations.
Audiobox: Create an audio story with AI-generated voices and sounds.
kylecazar
Seamless Translation is… pretty incredible.
I speak English and Spanish, so I recorded some English sentences and listened to the Spanish output it generated. It came damn close to my own Spanish (although I have more Castilianisms in mine, which of course I wouldn't expect it to know).
ewuhic
Where are all the links to models?
lelag
It's not exhaustive. For example, it's missing the Meta Motivo demo at https://metamotivo.metademolab.com/ (a humanoid control model).
lvl155
These are all half-baked at best. They are spending so much money on undergraduate-level work. But to be fair, who in their right mind would work for Meta in 2025 if they have the talent?
nabaraz
I expected a lot more.
xyst
> Our site is not available in your region at this time.
What the shit is this?
tsumnia
Neat, but I wish Meta would just say what this really is: "please give us some in-the-wild data to further train our models on".
I used the same technique years ago for estimating ages: a person uploads an image, helps align 10% of our facial landmark points, and we run the estimator. If we were wrong, we'd ask for a correction and refine.
It's still cool and all, but meh based on my prior experience.
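For the curious, that correction loop looked roughly like the sketch below. Every hook here (the detector, the user-adjustment UI, the estimator, the refiner) is a hypothetical placeholder passed in by the caller, not any real library's API:

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]

def annotate_and_refine(
    image: object,
    detect_landmarks: Callable[[object], List[Point]],           # auto landmark detector
    user_adjusts: Callable[[List[Point]], List[Point]],          # user realigns a subset
    estimate_age: Callable[[object, List[Point]], float],        # the age estimator
    refine_model: Callable[[object, List[Point], float], None],  # update on correction
    true_age: Optional[float] = None,
) -> float:
    """One round of upload -> align ~10% of points -> estimate -> correct."""
    landmarks = detect_landmarks(image)
    # Ask the user to fix every 10th point; trust the detector for the rest.
    landmarks[::10] = user_adjusts(landmarks[::10])
    prediction = estimate_age(image, landmarks)
    # If the user tells us we were wrong, their correction becomes training data.
    if true_age is not None and prediction != true_age:
        refine_model(image, landmarks, true_age)
    return prediction
```

The point being: the "demo" doubles as an annotation pipeline, and every correction is a labeled example you didn't have to pay an annotator for.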
rocauc
Meta deeply understands the lesson of GPT-3 vs. ChatGPT: the model is a starting point, and the UX built around it is what showcases the intelligence. This is especially pronounced in visual models. Telling me SAM2 can "see anything" is neat. Clicking the soccer ball and watching the model track it seamlessly across the video, even when occluded, is incredible.
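That click-to-track flow maps closely onto the video predictor in Meta's open-source sam2 package. A minimal sketch, loosely following the repo's README; the checkpoint/config paths, the frames directory, and the click coordinates are assumptions you'd swap for your own:

```python
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Assumed paths: use whichever SAM 2 checkpoint/config you actually downloaded.
checkpoint = "./checkpoints/sam2.1_hiera_large.pt"
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"
predictor = build_sam2_video_predictor(model_cfg, checkpoint)

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Directory of JPEG frames extracted from the clip (an assumption).
    state = predictor.init_state("soccer_clip_frames/")

    # One positive click on the ball in frame 0 (label 1 = foreground).
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(
        inference_state=state,
        frame_idx=0,
        obj_id=1,
        points=np.array([[410, 230]], dtype=np.float32),  # (x, y) of the click
        labels=np.array([1], dtype=np.int32),
    )

    # Propagate that single prompt through the whole video; the model keeps
    # tracking the object across frames, including through occlusions.
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        pass  # render or save the per-frame masks here
```

One click in, a full-video masklet out: that's the UX gap between "a capable model" and "a demo people immediately get".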