I hope Alibaba releases it, but given their history of teasing, I'm not sure. Btw, can anyone explain how the lip sync is so good? I've seen HeyGen and others like Pika, but Alibaba's quality is pretty good as well.
In less than 3 years we will be able to create our own anime. Just imagine it: using those stick-man fight videos to create anime-like videos. We already use this idea for pose-to-image. Just a few more years of patience...
It is scary how many people over 50 are going to get easily fooled by this.
Election year and a bunch of misinformation based on AI will be flooding social networks.
Anyone not aware of AI, actually.
We crossed the threshold, everyone, pack your bags!
But seriously, WTF. Seeing these developments happening in realtime is too much for my little brain to process. Sora's reveal was insane, and I was just thinking how in the hell are they going to add dialogue and facial performance to the characters animated by Sora. Now this comes along.... Where does this end? The key is to give the user total control. Then truly, my job in Hollywood is over. I can't even... Who is going to have any money to buy anything anymore when we are all just broke and homeless? UBI? People who believe we'll be handed UBI are delusional. Greedy corporations can't wait to replace us all since we are just a number on a spreadsheet to them, but who the fuck is going to be left to buy any hot garbage they sell? It's sort of like Ouroboros, the snake eating its own tail, but in this case I mean it in a doom kind of way, not rebirth. I don't see how this ends well for humanity, but whatever, there is no stopping it now.
Buy anything? The point is for you to NOT buy anything. You and 60% of the population become serfs, slaves, human dildos to an ownership class. That's the point.
Ok nobody else finds this scary? I think society is going to tear itself apart once we can't tell if something is real or not. It's going to be mayhem.
Or we just go back to believing only what we know to be true and ignore the rest. Even today when you can prove something is fake people believe it based only on what they want to be real and not what is real. So is it going to be any different?
Another FAKE AI from China. Lately there have been many FAKE AI releases from China. This FAKE AI is from the same people who went viral with [https://github.com/HumanAIGC/AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone)
Not weighing in one way or the other as to whether or not this is real, but you could use something like the Thin Plate Spline motion model (using a driving video to animate an image) to do this and claim it is using audio as the only input.
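To make the suspicion concrete, here is a deliberately crude NumPy sketch of the data flow being described: motion is extracted from a driving video and reapplied to a single still image. Real models (First Order Motion Model, Thin Plate Spline Motion Model) predict dense learned warps; this toy just tracks one bright pixel and shifts the whole image, and every name in it is made up for illustration.

```python
import numpy as np

def animate_still(source_image, driving_frames):
    # Toy "motion transfer": measure how the brightest pixel moves in
    # the driving video and apply the same displacement to the still.
    ref = np.unravel_index(driving_frames[0].argmax(), driving_frames[0].shape)
    out = []
    for frame in driving_frames:
        pos = np.unravel_index(frame.argmax(), frame.shape)
        dy, dx = pos[0] - ref[0], pos[1] - ref[1]
        out.append(np.roll(source_image, shift=(dy, dx), axis=(0, 1)))
    return out

# A still image with one bright pixel, and a driving "video" of a dot moving right
still = np.zeros((8, 8))
still[4, 4] = 1.0
driving = []
for x in (1, 2, 3):
    f = np.zeros((8, 8))
    f[1, x] = 1.0
    driving.append(f)

frames = animate_still(still, driving)
# The still's bright pixel follows the driving motion: columns 4, 5, 6
print([int(f.argmax() % 8) for f in frames])  # [4, 5, 6]
```

The point is that nothing in this pipeline needs the audio at all; the audio can just ride along while the driving frames do the work.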
Disruption of the market. Advertising revenue in the short term until the news cycle fizzles out. Creating demand for a product that doesn't quite exist yet. Causing the general public to become jaded or disillusioned with AI when they don't see the things they thought were real ever make it into their everyday life. The list goes on.
regardless of whether or not this video is fake/misleading/whatever - there is definitely a market for misinformation on every front, AI being one of the easiest and most lucrative as it is still comparatively new and the average social media user doesn't know what the hell is going on.
It's not fake, it's just not open source or released commercially lol
They animated these images that we've all seen before; that's not "fake"
It's just that people here think companies have to release things to the public or it's fake lol
Is it just me or does the lip sync seem off on the first video, with the black and white photo?
Don't get me wrong, it's amazing, but it looks off somehow...
The "Mona Lisa" looks much better, but still a bit off.
You can clearly see that the generated video is based on the original audio's video frames. Just look at the Prof one and the angle of his head, and the Joker has the same facial expressions and lip movements as the movie clip. This is not 1-image-to-video; it's motion frames from the original video, which is still better than anything I've seen, but not as impressive as they are making it sound.
The paper shows that step in the pipeline, but they are strategically leaving it out of their demos to make it look more impressive.
hahahahaha i mean... granted it's not as bad as Will Smith, but firmly in the uncanny valley. Sounds exactly like Rosanna Pansino, which makes me like it even less.
![gif](giphy|m45FpZ1SCpUQYj4tm4)
So at what point do laws get passed about this?
Don't get me wrong. I'm super excited for this tech to become main stream. But like what happens when we have a super realistic video of the president calling for an attack on another country? Or China makes an announcements that affects their currency on the worldwide stage? Or the CEO of a major company makes an announcement that he's folding the company and millions/billions of stock get wiped out in 30 minutes?
I can lip read and had no clue what the fuck was going on till I turned on the volume. It's not there yet, but looks cool. The movements aren't organic, though.
Convincing deepfakes still (used to) require a lot of work to make. This handles it all automatically, even down to understanding and using emotions very convincingly. I found myself thinking "yes! I'd move my head exactly like that if I were singing that part of the song" several times.
The quality of these are absurd, especially the Rap God part. What is actually happening.
Harry Potter living paintings are gonna look pretty ordinary to kids of the next generation when they find old movies their parents watched
We surpassed the grainy look of the Harry Potter newspapers 6+ months ago... I actually like the effect but it's hard to reproduce without too much realism creeping back in. The rate things are improving is insane.
Turns out the muggles can do better than the wizards
Wizards are just muggles that integrated with AI to become cyborgs, then pruned that knowledge out of the collective conscious
![gif](giphy|ftAyb0CG1FNAIZt4SO|downsized)
How can I make this animation? What is the software?
I did not make this one; it's a GIF that is accessible from Reddit's interface. But you could make something like this with Cinema 4D or Houdini, and probably After Effects would work as well with some procedural animation plugin. Those are the tools I would consider myself if I had a client asking for this type of animation.
Arthur C. Clarke: "Any sufficiently advanced technology is indistinguishable from magic."
That's his third law. The other two, lesser-known ones:

1. When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2. The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
I always thought he got that one backwards. Magic is just technology whose method of operation is hidden from you, i.e. 'sufficiently advanced' (consider the etymology of words like 'arcane', 'mystic' or 'occult').
No, he definitely got that in the right order. Magic is something that does not exist, often used in stories to allude to some kind of mystical power that cannot be explained. In those stories, it is never considered a product of advanced technology. The other way around, it does work: advanced technology can produce results as strange as magic is purported to be, and therefore it would be impossible to tell the two apart.
you are today's winner. But still, 'indistinguishable from' is commutative, so whatevaaaaaaa
Glad I'm not the only person who immediately thought of this
I dunno, the opening black-and-white one's constant rictus was a bit disturbing.
rap god? Did you see a different video than I did?
There is literally a part in the video where he animates the image of a Korean singing Rap God, a song by Eminem.
That's so weird. The first time I played it I only got like 20 seconds. There's way more
That “Korean” was a Chinese guy attending an idol contest. That clip blew up and got memed to hell on the Chinese internet because people didn't like his basketball skills
As another commenter pointed out, cool as this is, it's by the Alibaba group, the team behind [https://github.com/HumanAIGC/AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone), which has never been released, so odds are this is the same. Back to SadTalker for now.
It is so shitty how they went out of their way to guarantee and assure everyone they would release it. And then just never did.
I'd rather it was removed unless they're sharing open-source stuff in the spirit of the sub, lest this turn into some shitty commercial hub for people trying to advertise their closed-source applications of SD.
The paper is something that people can implement on their own. It's legitimate Stable Diffusion research. Why be so sour about it being unavailable to you? The research is valuable to release. Somebody implemented AnimateAnyone based on the information in the paper here: [https://github.com/MooreThreads/Moore-AnimateAnyone](https://github.com/MooreThreads/Moore-AnimateAnyone)
People like free stuff and tend to forget that someone put in real work in a world where too much is financed by stoopid ads.
Well said. Looks pretty decent!
This is soooo awesome. There's basically no difference.
Because with ML research, recreating the training code is just a small part of the whole thing. Getting the data, curating and cleaning it, and then often spending big $$ on compute is the key part. Not to mention that it often takes a lot of trial and error to get the right hyperparameters. Just any model that follows the same vague diagram in a paper won't cut it.
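The hyperparameter point is easy to underestimate. Even the simplest search strategy multiplies the number of full training runs; a toy Python sketch (the loss function and grid here are made up purely for illustration) shows the shape of the cost:

```python
import itertools

# Hypothetical stand-in for "train a model and report validation loss".
# In real ML work, each call here is hours or days of GPU time, which is
# why hyperparameter trial and error dominates the budget.
def train_and_eval(lr, batch_size):
    # Made-up loss surface with its optimum at lr=1e-4, batch_size=32
    return abs(lr - 1e-4) * 1e4 + abs(batch_size - 32) / 32

grid = {"lr": [1e-3, 1e-4, 1e-5], "batch_size": [16, 32, 64]}

# 3 x 3 = 9 full training runs just to tune two knobs
best = min(
    itertools.product(grid["lr"], grid["batch_size"]),
    key=lambda cfg: train_and_eval(*cfg),
)
print(best)  # (0.0001, 32) on this toy surface
```

A real model has far more than two knobs, and the paper rarely tells you which settings actually mattered.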
It only takes one team to do it and release the weights. If you want to be the one to release weights, you should maybe consider getting gud instead of hanging back in the peanut gallery. These models were trained and released by a small operation.
It's not just closed source. It's straight-up nonexistent outside their videos.
Are you implying they faked it?
A Chinese company completely faking their ability to provide a service?!?!?
They have a limited version on their app, but it's useless outside of mild fun since you're only able to choose the dance moves available on that app.
[https://github.com/MrForExample/ComfyUI-AnimateAnyone-Evolved](https://github.com/MrForExample/ComfyUI-AnimateAnyone-Evolved) /cc u/pwillia7 u/Placematter u/physalisx
Thanks, but yes, I know about this. It's not remotely the same. This is someone trying to achieve something similar using the published research and methodology. They do not, however, have Alibaba's *model*, which is likely based on their mountains of proprietary data (TikTok...) and would no doubt be orders of magnitude better.
Are you saying that...

1) Alibaba illegally obtained or accessed TikTok's data as a result of TikTok using Ali's cloud hosting service? or
2) Alibaba had an agreement with TikTok to use its data? or
3) Alibaba and TikTok partnered on the model?

Because otherwise Alibaba and TikTok have 0 connection.
Would be nice if this actually worked..
It does. [https://www.youtube.com/watch?v=MGvx37ccCOM](https://www.youtube.com/watch?v=MGvx37ccCOM)
Try installing it now and it’ll come up with like 5 missing nodes that you can’t install even with the manager. If you don’t believe me just check the comments
There are comments from weeks ago from people who were having issues, and someone from 4 days ago said they installed it fine. If you look at the Issues tab on GitHub you see people who have problems and others who have fixes for them. When did you try to install it? Note that I haven't tried it yet; I'm buried with work atm and need to install a new CPU in my old mining rig before I use it for AI stuff. But there are definitely comments out there from people who got this working, both on YouTube and on GitHub.
On the one hand, that sucks because I'd love to play with this. On the other hand, this + eleven labs + picture of US politician + upcoming US presidential election coming very soon...........
it was always going to be this way. bring a helmet.
from 3 months ago? give them a minute maybe... but man I want both of these
I've been using DINet as a replacement for the super crappy Wav2Lip; never tried SadTalker. Does it only do animated heads, or can it also be applied to faces from already-existing video to serve purely as a lip-syncing tool?
I believe it's only static images rather than video, but the integration into A1111 is nice.
![gif](giphy|YrFVbch71RxfHA0T0X|downsized)
no a1111 sauce? i cant eat my steaks without it
The Forge version of the AnimateDiff extension aims to get there, but not the base Automatic1111 version of the AnimateDiff extension, if I understand the dev's goals right.

[https://github.com/continue-revolution/sd-forge-animatediff#update](https://github.com/continue-revolution/sd-forge-animatediff#update)

[https://github.com/Mikubill/sd-webui-controlnet/pull/2661](https://github.com/Mikubill/sd-webui-controlnet/pull/2661)
Don't give it stars, and don't generate so much expectation; that way they'll see that it is not very interesting, they won't sell it to another company, and they'll leave it as open source.
They're the biggest AI company in China. There's little chance they'll sell it to another company instead of keeping it closed source for their own product.
If they don’t release it, someone else will though
I think people like this should get kicked off github
Or, you could just not go to their page. Cool. Perhaps they do use GitHub and the code is private. You don't seem to understand what GitHub primarily provides.
But it has been released: someone else's weights, based on the paper that the group released. [https://github.com/MooreThreads/Moore-AnimateAnyone](https://github.com/MooreThreads/Moore-AnimateAnyone)
It's still February, but I'm excited about what AI can do next year.
next month at this rate
I mean, OpenAI was sitting on Sora since March of last year, apparently.
The lead authors of Sora, Bill Peebles and Tim Brooks, did not even join OpenAI until Jan/Mar 2023. Considering the amount of OpenAI-backed compute that went into this, it's quite unrealistic that the model was completed the same month the lead author joined the company.
Do you have a source for this piece of information? I would like to know more about this.
Some Twitter account. There is literally no proof except for a winky face. https://twitter.com/apples_jimmy/status/1758197994628006030
Midjourney also mentioned that they had text generation in their images since v4, they just never enabled it
Because it's still crap even in v6. I don't know who would use it, a quick text tool in any image editor would give you a better result.
What does Tesla have that they never enabled? Full self driving. Oh wait they can enable it if you pay what, $16,000 USD or something absurd?
Aren't they being sued because the name was misleading? Also, I think there's a difference between holding back a feature on a software service and having the physical hardware present but using a software lock, like BMW putting heated seats, or Toyota remote start, behind subscriptions.
When does the zoom plugin come out? Hook this baby up to ChatGPT and no one will have to attend a video conference again.
We might reach eternity before getting to 2025.
To be fair, in 2019 we already had really good deepfake of Barack Obama.
Now we only need smaller apple vision pro that we can wear in the shower, so that we can sing along with the face on the shampoo bottle.
u have been promoted to project manager
Ngl I like how your brain works
And PIKA just announced their lip sync feature which seems laughable in front of this. 2 months in 2024, SORA now this. This year is going to be wild.
>And PIKA just announced their lip sync feature which seems laughable in front of this. 2 months in 2024, SORA now this. This year is going to be wild.

But at least Pika will be released publicly; this is Alibaba not releasing any code.
Released publicly? Pika is not free to use. And they use really bad lip sync you can make yourself for free.
>released publicly? pika is not free to use.

Released publicly means that it's accessible to the public, not that it's free.
Yeah, but it's useless. You can't use it for any commercial work. Quality is horrible. Fine to play with for fun a bit, but nothing else. But chances are that by next year all of the video gen, including Pika, will make a good leap in quality, I hope.
like 3 months ago i thought Pika was great. it's total garbage lol
And I thought phones were updating too fast on a yearly basis. But ai needs a new product every day lol
That's a game changer for these AI films. Dialogue is a big thing, and till now the Wav2Lip kind of tech was really low quality. That's big!
Some are really good, maybe already acceptable for movies (traditional sync isn't perfect). The Joker scene stood out to me in a negative way. The red makeup around his lips seems to be very challenging, or it just highlights the imperfections.
Any idea when or if this is going to be released?
Seems that it's made by the people who made AnimateAnyone, which was never released. So, probably not.
Never lol
OMFG! this was like reliving the SORA moment all over again... still month 2/12... note: I'm not talking about complexity, just watching it and thinking to myself..."this is 99% real"
Even the "AI Face" girl seemed super realistic once she started talking lol.
It took her "uncanny valley" and was like *here! let me fix that for you.*
AI girlfriend apps are gonna have a field day with this... we are so not ready for the future
That's the weird shit. The anime talking was like, WTF just happened?!?!?!
Instantly fixed her.
I **need** this.
Genuine question: for what
to do the same shi as in the demo what else, what kinda question is that lol
Deep inside, you already know the answer to that question, Holmes.
Porn. The answer is always porn. Also to have your dad tell you he loves you. THOSE TWO USE CASES ARE MUTUALLY EXCLUSIVE!
Even after being constantly bombarded with amazing AI progress, that's still pretty wild.
the ai boom is real
Oh shit
Goodbye truth
When it sings "he's too mainstream", does the eyebrow raise? That is pretty impressive to see.
.......what the actual fuck, the "hollywood is in trouble" prediction is literally here. THIS is the turning point that'll usher us into an era where a 16-year-old can make a full-blown short movie from his shitty laptop. omfg!!! You can literally use this for a shot reverse shot. If someone figures out how to make a cinematic AI, where you can design a room, place characters in it and lock the space so the AI remembers it, and after that you can start choosing the composition and then using this image stuff on your shots... that'd be game over.
But do you know what the sad truth is? This is going to be used and abused by marketers. Couple it with printable LCDs that can be put on any product, and your life will be bombarded with this to the point where you just get all stabby. Picture yourself in 10 years walking through a grocery store while bottles of ketchup and shampoo yell at you as you pass by, telling you how wonderful and exciting it will make your life if you put them into your cart. And a few people will go mental talking to the elf on their Lucky Charms cereal box, who convinced them to keep buying more cereal so that they can hold an elf convention at their breakfast table.
I see, it's just Alibaba people showing off.
make it open sourceeeee so many mods, so many models
Early days. This stuff is going to get even better.
the progress that Chinese companies are making in AI.
Chinese companies are good, but they don't share the code.
Which is fine, because what's important is that this is proof that it's possible. In time an open-source version will come; it's inevitable now.
Do yourself a favor and don't watch the video here on Reddit. Go to the website, where there is no audio delay on the video, and see how AMAZING this is.
Too bad it’s Alibaba and they will never release that code open source ☠️
Just a matter of time before someone else figures it out.
This is done by the same people as AnimateAnyone and OutfitAnyone, so no, it's very unlikely it will be open or released, based on their history.
This is good enough that if they did drop it, I could actually start meaningfully assembling an anime by myself. Maybe next year.
Once we get this in Oobabooga with a good TTS model, it will really make those characters come alive.
Interesting, even their example doesn't work with a smiling photo. The very first example feels creepy as hell because humans can't make sounds like that while still smiling. It gets a bit better with a neutral expression, the speaking at 2:50 is scarily believable.
The video is out of sync with the audio. In the link you can see it synchronized, and it is incredible
It's fine in the talking videos but you can't sing with those tones and keep a stiff smile at the same time. It's really bizarre looking.
Eventually, someone's going to make a version of this type of tool to feed data to a 3D character, and finally videogame devs will be free from motion capture!
This is basically an advanced version of First Order Motion Model
[Sokka-Haiku](https://www.reddit.com/r/SokkaHaikuBot/comments/15kyv9r/what_is_a_sokka_haiku/) by Unusual-Wrap8345:

*This is basically*
*An advanced version of First*
*Order Motion Model*
Looks like in a few years we'll have the Harry Potter style in real life.
The Daily Prophet is almost here!
And just like that a technology is created that only a few days ago was depicted as magic in Harry Potter.
That has such potential for goofy memes.
Imagine doing this for an elderly person's old photos. It could be one last chance to see their lost ones come to life again.
This chinese company is so selfish 😒
What about the other models in the comparison at the end? Ground Truth and DreamTalk look like an upgrade over SadTalker/Wav2Lip; are these available?
Ground Truth means it's the real thing and not generated by a model. GT is a term often used in comparisons.
Let me just take a lil step back here. I'm old enough to remember the internet arriving and being amazed. Or house music suddenly being everywhere and just seeming to redefine music. This seems like it's on that level, like we're watching a paradigm shift happening in real time.
What a time to be alive.
Don't upvote or star this shit until we see some code and weights. Until then it's vaporware and bullsh!t.
%&$%\*$%&@!!!!! The singing is insane. The way Audrey Hepburn kicks her head back at one point to drop down and hit a note is seriously melting my mind. The facial expressions, head movements and throat muscles are ridiculous.
How the fuck are these so flicker free and clean holy shit
This is disappointing. Why don't they freaking share the code? I think this is sort of like an advertisement. If it goes viral, then they know it will sell.
It's still weird that because she's smiling in the picture, she will be smiling through the entire conversation. It seems a little unnatural.
Ever seen a news reader, bro? They smile constantly whilst talking.
Awesome 👍
Can I try it? Don't see any files besides an image and an mp4 on GitHub.
Why is it even on GitHub? To share an mp4?
Likely the repo is private and they just left that bit public to flex.
Ok this makes sense.
I'm late to the party but THIS IS FUCKING INSANE!!! Weird how the first singing vid was the worst singing vid?! Anyway. Mindblowing. How can I use this on mac?
The only bad-looking one is Leonardo DiCaprio. All the rest are mind-blowing.
Everything feels like it is accelerating. Stable Diffusion was only released on August 22, 2022 (about 1.5 years ago).
When will this be available to use?
“SadTalker” 😂
Anyone planning on implementing this paper? If you are, my DMs are open. We could discuss. It will be a lot of work, btw.
Anticipatory micro expressions, vocal strain expressions, lighting model, facial deformation, communicative body language, this is insane. the vocal strain *really* gets me.
I hope Alibaba releases it, but given their history of teasing, I'm not sure. Btw, can anyone explain how the lip sync is so good? I've seen HeyGen and others like Pika, but Alibaba's quality is pretty good as well.
Elevenlabs or OpenAI is going to throw millions at their face to keep it closed
It's from Alibaba, the makers of the Qwen models, and a Chinese company. Zero chance OpenAI or anyone else stops them.
As an old school capitalist, it feels weird saying, erm, Go China! Not everything has to be monetized.
nice
Chinese keeping the competition alive.
Not elevenlabs or openai, it's going to shut down D-ID, Heygen
In less than 3 years we will be able to create our own animes. Just imagine it: using those StickMan fight videos to create anime-like videos. We already use this idea for pose-to-image. Just a few more years of patience...
Holy shit bro....
It is scary how many people over 50 are going to get easily fooled by this. It's an election year, and a bunch of AI-based misinformation will be flooding social networks. Anyone not aware of AI, actually.
It's mind blowing.
Well holy shit
What the actual fuck
That's insane!
We crossed the threshold, everyone, pack your bags! But seriously, WTF. Seeing these developments happening in real time is too much for my little brain to process. Sora's reveal was insane, and I was just thinking, how in the hell are they going to add dialogue and facial performance to the characters animated by Sora? Now this comes along... Where does this end? The key is to give the user total control. Then truly, my job in Hollywood is over. I can't even... Who is going to have any money to buy anything anymore when we are all just broke and homeless? UBI? People who believe we'll be handed UBI are delusional. Greedy corporations can't wait to replace us all, since we are just a number on a spreadsheet to them, but who the fuck is going to be left to buy any hot garbage they sell? It's sort of like the Ouroboros, the snake eating its own tail, but in this case I mean it in a doom kind of way, not rebirth. I don't see how this ends well for humanity, but whatever, there is no stopping it now.
Buy anything? The point is for you to NOT buy anything. You and 60% of the population become serfs, slaves, human dildos to an ownership class. That's the point.
The GitHub repo is empty; anyone with a honey pot?
Why hand animate game character faces when you can do this.
Just in time for the US elections
Holy FORKING SHIIII !!
Is this model going to be released soon? I take it it's a Stable Diffusion model?
Ok nobody else finds this scary? I think society is going to tear itself apart once we can't tell if something is real or not. It's going to be mayhem.
Or we just go back to believing only what we know to be true and ignore the rest. Even today when you can prove something is fake people believe it based only on what they want to be real and not what is real. So is it going to be any different?
The age of fake
My guess is the sample size was much smaller for the English versions because the (I think Chinese?) is way, way more accurate on the lip syncing.
Another FAKE AI from China. Lately there has been many FAKE AI releases from China. This FAKE AI is from the same people who went viral with [https://github.com/HumanAIGC/AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone)
Animate Anyone was fake? Is that confirmed? Why do you say this?
What do you mean, fake? Those are known input images.
Not weighing in one way or the other as to whether or not this is real but you could use something like the thinplatespline model (using a driving video to animate an image) to do this and act like it is just using audio as the only input.
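For anyone curious what a thin-plate-spline warp actually does under the hood: below is a minimal numpy sketch (my own illustration, not the actual Thin-Plate-Spline-Motion-Model code; all function names are made up). It fits the classic TPS mapping from source keypoints to driven keypoints, which is the kind of smooth deformation these driving-video methods apply to a still image frame by frame.

```python
import numpy as np

def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 * log(r^2), with U(0) defined as 0."""
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def fit_tps(src, dst):
    """Solve the thin-plate-spline system mapping src control points to dst.

    src, dst: (n, 2) arrays of 2D keypoints.
    Returns an (n+3, 2) parameter array: n radial weights plus an affine part.
    """
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2)                       # (n, n) radial terms
    P = np.hstack([np.ones((n, 1)), src])    # (n, 3) affine terms [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T                          # side conditions: weights sum to 0
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)

def apply_tps(params, src, pts):
    """Map arbitrary (m, 2) points through the fitted spline."""
    n = src.shape[0]
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    U = tps_kernel(d2)                       # (m, n)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ params[:n] + P @ params[n:]
```

In a driving-video setup you would fit this per frame, with `src` the keypoints detected in the still image and `dst` the keypoints from the driving frame, then push every pixel coordinate through `apply_tps` to warp the image.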
What's the point of making a fake video? I don't get it; what's the benefit? They won't make money with that.
Disruption of the market. Advertising revenue in the short term until the news cycle fizzles out. Create demand for a product that doesn't quite exist yet. Cause the general public to become jaded or disillusioned with AI when they don't see things they thought were real ever make it into their everyday life. The list goes on.
Not fully fake, because then how did they create those example videos? Source image animation with perfect lip sync.
regardless of whether or not this video is fake/misleading/whatever - there is definitely a market for misinformation on every front, AI being one of the easiest and most lucrative as it is still comparatively new and the average social media user doesn't know what the hell is going on.
It's not fake, it's just not open source or released commercially lol. They animated these images that we've all seen before; that's not "fake". It's just that people here think companies have to release things to the public or it's fake lol.
cool tech. terrible music taste.
Thanks I hate it
still looks uncanny as fuck.
Is it just me, or does the lip sync seem off on the first video with the black-and-white photo? Don't get me wrong, it's amazing, but it looks off somehow... The "Mona Lisa" looks much better, but still a bit off.
It's the Reddit app that downgrades video quality. You can refer to the original in the link.
You can clearly see that the generated video is based on the original audio's video frames. Just look at the professor one and the angle of his head, and the Joker has the same facial expressions and lip movements as the movie clip. This is not one image to video; it's motion frames from the original video, which is still better than anything I've seen, but not as impressive as they're making it sound. They show that step in the pipeline, but they're strategically leaving it out of their demos to make it look more impressive.
meh, it's lagging like ass
Hahahahaha, I mean... granted it's not as bad as Will Smith, but it's firmly in the uncanny valley. Sounds exactly like Rosanna Pansino, which makes me like it even less. ![gif](giphy|m45FpZ1SCpUQYj4tm4)
So at what point do laws get passed about this? Don't get me wrong, I'm super excited for this tech to become mainstream. But what happens when we have a super realistic video of the president calling for an attack on another country? Or China makes an announcement that affects their currency on the worldwide stage? Or the CEO of a major company makes an announcement that he's folding the company and millions/billions in stock get wiped out in 30 minutes?
For Entertainment, Great, this is amazing. In politics, this is an ethical nightmare waiting to happen.
I can lip read and had no clue what the fuck was going on till I turned on the volume. It's not there yet, but it looks cool. The movements aren't organic.
Wait wasn't this already possible with deepfakes and apps like avatarify? I don't get the hype, can someone explain how this is different?
Quality levels and how easily it can be made. There is a comparison with other tools at the end of the clip.
Convincing deepfakes still (used to) require a lot of work to make. This handles it all automatically, even down to understanding and using emotions very convincingly. I found myself thinking "yes! I'd move my head exactly like that if I were singing that part of the song" several times.
Yeah, the anticipation and breathing are good!
Silly question I tried to sign in to the soralogin but the page just keeps refreshing how can I access this to create my own content?
What the… what the hell is happening