AI

OpenAI Opens Its Speech AI Engine To Developers

At its DevDay event today, OpenAI announced that it is giving third-party developers access to its speech-to-speech engine that powers ChatGPT's advanced voice mode. "The move paves the way for a wave of AI apps that offer conversational voice interfaces," reports Axios. From the report: Early testers of the feature include nutrition and fitness app Healthify and Speak, a language learning app. Other new features being made available to developers include the ability to fine tune models based on pictures. In a demo for reporters, OpenAI executives showed an example of the new audio capabilities combined with Twilio's API to allow an AI assistant to call a fictional candy shop and place an order for 400 chocolate covered strawberries.

Developers will only be able to use the voices provided by OpenAI -- the same ones that are options within ChatGPT. While the voice won't be watermarked in any way and developers won't have to make the AI system identify itself, OpenAI says it's against the company's terms of service to use its systems to spam or mislead people.

  • I remember being amazed in the '90s at trade shows where a presenter could convert speech to text like magic.
    I even bought one of the packages, but ultimately never used it. Why?
    - I typically think at the speed of typing.
    - My coworkers have no interest in hearing me talk to my computer.
    - Do you know anyone who really uses Siri, for example?
    But the main one is security. I really don't want anyone in earshot knowing what I'm doing.

    Use cases where it will work: places where speech is already the primary method used.
    -

    • Do you know anyone who really uses Siri, for example?

      Tons of 'normal' people use voice assistants. Hell, tons of people prefer voice in general.

      I'm sure a lot of Slashdotters (including me) love the efficiency, structure, and all the other advantages of text-based communication. Other people, though, apparently much prefer watching videos, listening to podcasts, leaving minutes-long voice messages, talking around the water cooler, and talking loudly with their phone on speaker in crowded spaces. They will have no issue with and will even welcome voice as

  • Here: https://openai.com/index/intro... [openai.com] The useless site in OP's submission doesn't even link to it.
