OpenAI, hot on the heels of Gemini 2.0, has announced that video and screen sharing are coming to Advanced Voice Mode for Plus and Enterprise users, with the rollout starting this week. Europe, unfortunately, will not be included for now.
https://mashable.com/article/openai-brings-video-to-chatgpt-advanced-voice-mode
Video sharing is the magic we've been waiting for: the ability to point our phone at items and have a real-time conversation with ChatGPT about what it is seeing. Andy posted an excellent video back in May showing what it was capable of:
This is an example of Be My Eyes integration. Maybe our new benevolent overlords can comment on when this new power will be native to the Be My Eyes app?
All very exciting though. It hasn't arrived here in the UK yet, but I'll be looking over the next few days.
Updated to include external article.
Comments
only for advanced users
I just canceled my GPT membership for advanced mode! Any idea if this'll be rolled out to regular members as well? Like people who use the free version?
Hoping it'll be available in the UK
On OpenAI's help page, it says EU, not Europe, so I'm hoping that we'll see it in the UK in the next week. Here is the exact wording:
"Video, screen share, and image upload capabilities will be available to all Team users and most Plus and Pro users, except for those in the European Union, Switzerland, Iceland, Norway, and Liechtenstein. We expect this rollout to be completed over the next week. Usage of video and screen share capabilities is limited for all eligible plans on a daily basis. Usage of image uploads counts towards your plan's usage limits."
The full article can be found here:
https://help.openai.com/en/articles/10271060-12-days-of-openai-release-updates
And here I go
Obsessively checking the app every five minutes
Not here yet
Saw the live stream; neither the live video nor Santa is available for me yet, and no app update either so far. Hopefully it'll get here in a few days' time. And yeah, it'd be interesting to know more about any possible BeMyEyes integration.
So, will the third bus be Llama? What do you all think? Meta is usually a little late to the party but... Will we have live video capabilities in a few months' time?
I have it!
And we are live!!!!!!!!! Omg and all it took was me upgrading to pro and now ChatGPT can see.
pro subscribers must be getting it first
Ah, so pro subscribers must be first to get it, which I guess makes sense?
I don't have that kind of money though lol.
Hopefully it isn't too long a wait for the rest of us?
I've got it
Well, I actually have it now!
Just decided to check before going to sleep.
Will do more testing tomorrow.
I also have it
I'm going to test it out soon.
Got it but
I got access, I turned the camera on and everything, but it says it can't access the camera and that it can't see a thing. Do I need to go into the settings and give the app permissions or something?
Realtime monitoring
Got it to work, and here's an interesting observation: the native ChatGPT app doesn't do realtime monitoring. It's not that it cannot; rather, it won't, as this seems to be a limitation placed on it.
So basically, you can't hail a taxi with the help of ChatGPT, unless the OpenAI-BME partnership is coming up with a blind-specific version without this limitation inside BME.
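For anyone curious, the kind of continuous monitoring the app refuses to do can be roughly approximated outside the app by polling still frames through the OpenAI vision API. This is only a minimal sketch, not anything official: it assumes the `openai` Python SDK, an `OPENAI_API_KEY` in the environment, and a vision-capable model name; the `get_frame` callback, prompt, and polling interval are all hypothetical placeholders.

```python
# Sketch: approximate "realtime monitoring" by repeatedly sending
# single camera frames to the OpenAI vision API, since the ChatGPT
# app itself won't monitor continuously.
import base64
import time


def encode_frame(jpeg_bytes: bytes) -> str:
    """Base64-encode one JPEG camera frame as a data URL for an image_url payload."""
    return "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode("ascii")


def watch(get_frame, prompt="Is a person visible? Answer yes or no.", interval=2.0):
    """Poll frames from `get_frame()` (a hypothetical callback returning JPEG bytes)
    and print an alert whenever the model answers yes."""
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    while True:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed vision-capable model
            messages=[{"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": encode_frame(get_frame())}},
            ]}],
        )
        answer = resp.choices[0].message.content
        if answer and "yes" in answer.lower():
            print("Someone is in frame!")
        time.sleep(interval)
```

Each poll is an independent request, so this is slow and costly compared to a true live video feed; it is only meant to show why people are hoping video lands in the realtime API.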
How do you know if you have it?
I have the subscription, but not the $200 monthly pro. I think some of you have the same subscription as me, at about $20 monthly. My app had an update yesterday, but I don't seem to have this.
I tried gemini and it was pretty cool, but it would be nice if the GPT app that I pay for would do it.
If you're subscribed to plus
You should get it. just click switch to voice mode, and it should ask you something basically amounting to do you want the video thing. Even if it doesn't, just look for a video camera button once the voice mode turns on.
re: Realtime monitoring
Same observations here.
When I ask, "Tell me when you see someone in the camera frame," it confirms it will do this. But when I stand in front of the camera, it doesn't notify me.
Hope this will be possible in the future. But for now: this is freaky cool already.
Two years ago I never thought this would be possible!
apple shortcut to start vision mode
Hello,
Does anyone know if there is a way to make a shortcut to start talking with Advanced Voice Mode with vision turned on?
Yeah, it would be so convenient to use such a shortcut.
I hope OpenAI releases such a feature/shortcut in the first place. So it's not a hassle to use the live audio/video stream with ChatGPT.
To be honest
This is just the beginning of what's going to be possible.
I now have it, and just wow!
This is pretty cool, and I bet will get even better.
Thank goodness I can tell it to stop talking like Santa though. I'm just not that cheerful. LOL
It's not super accurate when just describing a scene yet. It said that the hallway outside my office was the sky. But I know it's just going to get better and better.
Not calibrated for blindness
The lack of accuracy in scene descriptions and so on is mostly because of this lack of calibration; I bet with a bit of specific fine-tuning and training on specific data, it'd be super wonderful. In the imminent agentic future, what do you all think of an AI agent for the visually impaired?
@ Gokul
It is not giving me inaccurate or bad scene descriptions at all… I have actually found, however, that some voices work better with the vision than others.
this shouldn't need to be said, but I'm going to say this anyway
Ok guys, if you are having issues, this may be a good place to talk about them, but nothing will get done about them here; the best thing to do is report them through the app, especially if ChatGPT is not continuously looking out when you tell it to look for something. As they have just released the feature, they're going to be paying extra attention to any complaints that come in about it. If only one of us reports it, obviously they're not going to take it too seriously, as it's only one out of many. But also, for the love of God, please be respectful when doing it…
@Stephen
The quality of the descriptions/answers (to be more specific) differs with the voice one is using? That's interesting. In which case, which voice gives better (subjective, of course) descriptions?
Send it through the contact form… I got a reply within hours.
I got an email back from OpenAI within hours of going to Settings, then Help Center, and filling out the contact form:
Jay
Hello,
Thank you for reaching out to OpenAI Support.
Thank you for sharing your feedback with us! We're thrilled to hear that you're enjoying the new video mode for advanced voice; your appreciation truly means a lot. At the same time, we're sorry to hear about the challenges you're facing.
Regarding the issue where ChatGPT acknowledges the request but doesn't notify you when something appears in the frame, we understand how this can be challenging, especially when the model doesn't follow through as expected.
Here are a few factors that could be contributing to the issue and steps you can try:
Model Limitations: While the advanced voice and video capabilities are powerful, they may not always perfectly detect or notify you about objects in real time. The model might require clearer prompts or frames for improved accuracy.
Prompt Specificity: Try providing more specific instructions or breaking down the task into smaller steps. For example, instead of asking ChatGPT to notify you when it sees your dog, you could first ask it to describe what it sees in the frame, followed by a request to notify you.
Lighting and Clarity: Ensure the video feed is clear and well-lit. Poor lighting, blurry visuals, or fast camera movements may make it difficult for the model to identify objects reliably.
If the problem continues, please provide additional details or examples of the prompts you're using. This information will help us further investigate and work toward a solution.
We appreciate your patience and understanding in this matter.
Have a nice day!
Best,
Jay
OpenAI Support
@ Gokul
I have found Arbor to be the most reliable.
Based on their reply,
It seems to be a live video feed…
Generalized
I do agree with Ollie here. This seems like a very generalized response without, how do I put it, 'much sincerity' to it? I mean, GPT itself says that it is not permitted to do live monitoring, whichever way I prompt it.
@Stephen did you mention specifically in your message that this is important to you since you're visually impaired?
feedback
I too did send a message about this.
Hopefully this can be improved upon in the future but even now I've found it to be quite useful for identifying and reading things.
I look forward to seeing what happens in the future, but I am surprised we haven't heard so much as a word from Be My Eyes at this point, since the CEO himself had mentioned he'd been using it for months.
Perhaps they need to wait until video is part of the real time API?
@ Gokul
I mean, that isn't very generalized, as they are answering a specific question, but we shall see. No, I didn't tell them I'm blind, and there is a reason for that: I've noticed with companies that the blind card always has the opposite effect, with responses like "we are sorry, but this isn't meant to help blind people navigate", etc. Honestly, not telling a company I'm blind also got me a job at a company I've been working at for 3 years now, lol. That's a fun story. I just don't see why being blind matters; it's not doing what I want it to do, and that is what matters. I mean, it is pretty damn good for what it is, but that one fix would make it perfect. Gokul, you may want to try using a different voice, as some voices work better with vision than others. Please see my comment above.
If you have access,
Also, switch GPTs over to o1 pro.