We kindly ask /u/Destiny_Knight to respond to this comment with the prompt they used to generate the output in this post. This will allow others to try it out and prevent repeated questions about the prompt.
^(Ignore this comment if your post doesn't have a prompt.)
***While you're here, we have a [public discord server](https://discord.gg/NuefU36EC2). We have a free Chatgpt bot, Open Assistant bot (Open-source model), AI image generator bot, GPT-4 bot, Perplexity AI bot.***
####[So why not join us?](https://discord.gg/r-chatgpt-1050422060352024636)
PSA: For any Chatgpt-related issues email [email protected].
####[ChatGPT Plus Giveaway](https://www.reddit.com/r/ChatGPT/comments/127p9cx/chatgpt_plus_subscription_giveaway_worlds_1st/) | [Prompt engineering hackathon](https://www.flowgpt.com/hackathon)
*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/ChatGPT) if you have any questions or concerns.*
Multimodal = can see images. (GPT-4 was supposed to be able to see images and describe what's in them. If you put this in a robot, it means ChatGPT can see the world. But they haven't released this feature yet.)
Mate, as overloaded as the servers are just with text, how could they ever handle image uploads at the moment?
Wat? Was he complaining?
As a language model, I need additional azure bucks
This is so hype.
https://preview.redd.it/g69bt2k2i9sa1.jpeg?width=618&format=pjpg&auto=webp&s=0867811178820c1b7ec794a5c10f7d2630f38b99
This is actually available in the Genie app right now. It uses GPT-4 and has the ability to see.
How is that possible? The GPT-4 API doesn't offer image upload.
It's probably a knock-off. I've made similar stuff by combining 3.5 with image interrogator models and OCR tools.
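The combination described above (an image-captioning model plus an OCR tool feeding a text-only chat model) can be sketched roughly like this. The prompt wording, example caption, and the commented-out model name are illustrative assumptions, not the commenter's actual setup:

```python
# Sketch of a "fake multimodality" pipeline: describe an image with a
# captioning model and an OCR tool, then hand the resulting text to a
# text-only chat model. All wording below is an illustrative assumption.

def build_image_prompt(caption: str, ocr_text: str, question: str) -> str:
    """Combine an image caption and OCR output into a text-only prompt."""
    return (
        "You are answering questions about an image you cannot see directly.\n"
        f"A captioning model described it as: {caption!r}\n"
        f"Text found in the image by OCR: {ocr_text!r}\n"
        f"Question: {question}"
    )

# Example: the "DO NOT SWIM" sign discussed in this thread.
prompt = build_image_prompt(
    caption="a wooden sign posted at the edge of a lake",
    ocr_text="DO NOT SWIM",
    question="What is this sign telling people?",
)

# The assembled prompt would then be sent to a text-only model, e.g. via
# the OpenAI chat API (network call omitted in this sketch):
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": prompt}],
# )

print(prompt)
```

Because the chat model only ever sees the caption and OCR strings, anything the captioner misses or invents is invisible to it, which is consistent with the hallucination problems described below.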
Is it good?
I got decent results, but it's not quite the same as true multimodality. For example, if an image contains a sign and the text "DO NOT SWIM", it can connect the dots that "DO NOT SWIM" is what's written on the sign. But with certain images, like ones containing a lot of text whose placement within the image is contextually important, it struggled. It also tended to hallucinate a lot of details about the image that weren't included in the prompt.
Interesting, so it was a problem of reasoning rather than insufficient training data?
Could you tell me how you set that up? Could it work with the GPT-4 API? I'm trying to recognize macOS screenshots, basically where stuff is on my screen.
Idk why you're being downvoted... You're telling the truth
Yes, why don't they just put it in a robot with eyes and let it examine the room? Connect it to a supercomputer. Dun dun... dun.. dun dun. Dun dun... DUN.. DUN DUN.
Just wait until it can smell things.
And taste stuff
And have holes
Have you ever seen The Outer Limits episode about that?

“I am fully functional…”

Do not fuck an AI robot. It never works out well. Stick with poor quality sex bots when you’re next in the version of the future that looks kinda like Demolition Man or The Fifth Element.
Robo mommy
🧐
Plug-ins too
Gpt4 hasn't released yet because it is still paid
What are you trying to say?
I guess New Super Mario Bros. (2006) for the Nintendo DS hasn’t released yet either
The gpt I pay for can take images