OpenAI says ChatGPT can now see, hear, and speak: Check how to use

ChatGPT can now see, hear, and speak: The generative artificial intelligence (AI) evolved to the advanced level of integration GPT-4V, a vision-capable model, and multimodal conversational modes for its ChatGPT system. OpenAI while announcing this feature said that it will bring a new dimension to users’ experience where you can explore infinite possibilities of knowledge with a voice control system and image recognition interface. The upgraded ChatGPT systems will use natural language processing (NLP) and machine learning algorithms to convert spoken words into text and then process that text to perform specific actions or tasks.

“We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about” says openai.com.

Also Read: Artificial Intelligence (AI) Glossary and Terminologies – Complete Cheat Sheet List

What is GPT-4 with vision (GPT-4V) in ChatGPT that can see, hear, and speak: The latest announcement

Speak with ChatGPT and have it talk back: You can now use voice to engage in a back-and-forth conversation with ChatGPT AI assistant. With this advanced functionality, you can speak with it on the go, request a bedtime story for your family, or settle a dinner table debate.

ChatGPT offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about. The voice and image commands give you more ways to use ChatGPT in your life.

Talk about things with pictures: In the ChatGPT latest update, you can snap a picture of a landmark while travelling and have a live conversation about what’s interesting about the place and you will get results that will amaze you with facts and figures. Or when you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow-up questions for a step-by-step recipe). And, after dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.

ChatGPT Voice-based search for Plus and Enterprise: We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms.

Check Here Webstory on – Why Is Amazon Investing Up To 4 Billion In AI Startup Anthropic

Use your voice to engage in a back-and-forth conversation with the ChatGPT AI assistant.

To enable ChatGPTvoice go to your Settings → New Features on the mobile app and opt into voice conversations. Then, tap the headphone button located in the top-right corner of the home screen and choose your preferred voice out of five different voices.

The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech. Also, ChatGPT has collaborated with professional voice actors to create each of the voices. And, among all the interesting feature ChatGPT use Whisper open-source speech recognition system, to transcribe your spoken words into text.

What is ChatGPT-4 Images-based search query:

A GPT-4 image-based search query is a query that includes both an image and text. This allows GPT-4 to use its multimodal capabilities to understand the image and generate more relevant and informative results.

To get started, tap the photo button to capture or choose an image. If you’re on iOS or Android, tap the plus button first. You can also discuss multiple images or use our drawing tool to guide your assistant. The ChatGPT Image command functionality understanding is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.

For example, show ChatGPT one or more images to troubleshoot your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data. To focus on a specific part of the image, you can use the drawing tool in our mobile app.

Also Read: OpenAI ChatGPT to launch DALL-E 3 in October to create realistic generative images with less prompts

This post was last modified on September 26, 2023 8:20 am

Françoise

Francoise Hardy, A digital content creator and tech integration specialist with over 10 years of experience, is known for his deep knowledge in AI, ML, Data Science, Robotics, and Neural Networks. He began his career with a passion for emerging technologies, leading to innovative solutions and digital transformation in various businesses. Francoise's expertise extends to the ethical aspects of technology, advocating for responsible usage. Recognized by his peers, he is a sought-after speaker and writer in the tech industry. His commitment to advancing technology for societal benefit defines his career.