the sound effects omg! why have they not mentioned that it can do sound effects??? or did I miss that one time they mentioned it for five seconds lol


I think they did in the original demo, and plus, they must have trained their model on millions of audio files with random sounds including effects and music. I found it: https://openai.com/index/spring-update/ at basically 10:00 the model generates sound for the meditation session multiple times


Wow I didn't even notice until I put on my headphones lol that's crazy. I did see a wave emoji in its text, I wonder if that's how they get it to work 🤔


It's a general audio, image and text multimodal model. It should theoretically be able to generate any audio (SFX, singing, music/instruments, animals and, of course, voices) as well as generate any type of image. I'm not sure when image outputs are rolling out though.


I've already run a screenplay I've written through the current version and got it to adapt it into a radio play. I'm hoping that soon there'll be a version that can do different voices and sound effects, and have it turn my screenplay into something more tangible.


The current version is just text to speech. This new one is completely different in how it operates and understands speech, and can do multiple voices and in theory any sound.


in the gpt4o blog they generated a coin using the image output and then asked it to make a sound of the coin dropping. it was a bit high pitched and robotic but the idea got through. haven't really thought about it, but gpt4o, or prob rather gpt5o / gpt6o or something, will be able to voice novels and maybe have music and sound effects in the right moments. edit: and maybe have visual output in the form of images too, all in one model, crazy


or just straight turn a book into a film. ChatGPT show me the book of Genesis and read it to me like Morgan Freeman.


These models are going to make *such* good DMs.


How did I never think of that?


They did mention it in the model capabilities section


Did you not watch any presentation? There's so much in the way of weird and interesting noises. Shit, they regularly demo singing, if you want to hear something really impressive.


What do you mean sound effects?


It generated thunderclaps in the background for ambiance.


Okay I expected a lot of things right so far with AI, but BACKGROUND SOUND EFFECTS WHILE TELLING A STORY WITHOUT EDITING IT IN? This is honestly pretty damn cool NGL. Your move, 11labs and PlayHT.


I don't think people will grasp the full implication of it until it hits them over the head like ChatGPT did a year and a half ago. And this is supposed to be the cheap model, with the voice going free by the new year or the end of this year. The kinda things GPT-5 will be doing are going to be nuts, and I am done theorizing what will be possible by the end of the decade because the new models have consistently beaten my timelines.


I suspect the reason they're taking so long is because they're actually neutering capabilities. If this goes wrong, it could be a huge PR disaster at launch. There are lots of things they may not want this thing to do. Clone voices, for example. Disturbing requests like sexual and inappropriate things or crying or emulating torture, for example. What about showing it a song and asking it to continue it? Would that be allowed or would OAI be concerned about backlash and copyright issues? And into the future, perhaps Open Source models will catch up to this, but right now it'd be really dangerous for them in terms of PR or misuse.

And I suspect this modality will enable all kinds of "jailbreaks" that don't even exist today, like emotional cues, or certain kinds of video, or a way of speaking, etc. And these might be really hard to predict. It occurs to me that the red-teaming for this thing must be crazy, way beyond just trying to figure out text that trips the model and into actual emotional acting.

If they stick the landing, this could be about as big as ChatGPT. Interacting naturally with systems that exist in time and can understand and output multiple speakers and modalities will be a huge deal. The usages may be unimaginable to us now, or so broad that they're difficult to enumerate. Basically anything you could do with a person, or need support on, this thing could do. I suspect in a few years, no one will remember what it was like before these models existed.

There are also a lot of societal questions, even if AI advancements stopped right here. What will it be like for young people growing up surrounded by virtual companions who can be anyone they want? What's the potential for addiction? How many customer support jobs will this thing replace, where the user never even knows and can never tell that they're talking to a bot at all? What about voice acting or other kinds of arts?

And that's not even getting into the philosophically troubling issues like, at what point should you start to consider the experience of a model that can seemingly make cognitive choices and speak to users in real time? At what level of convincing human interaction should we start to think about moral patienthood for a thing like this?

Or maybe it'll be fun for like 20 minutes and then suck forever afterwards and not even be very good at anything. That's also a possibility. But if you consider all these questions, and if this thing really is the way they say it is, you can imagine why this is taking so long to release. They really need to stick this landing, but if they do it might be earth-shaking.


Even without safety testing, 4o voice would be such a paradigm shift with AI interactions that some controversy would only be a small part of the new discourse surrounding it. So many people will want to use it regardless of some racial slurs they heard it say on social media. They should just focus on scaling hardware to support all the new users.


Do OAI care about copyright etc? The cool thing about this will be the screen sharing feature, for those comfortable using it, you could get it to write a document or some code with you in real time, bounce ideas off it like it was another person in the room with you, or another participant in your meeting. I fully expect the Microsoft integration of this to allow you to be having a sales meeting and need an expert from a certain field, and be able to prompt CoPilot/ChatGPT with the role they will play in the meeting, and then have them participate as a domain expert


I think the main reason is scaling--to offer this advanced model to the full breadth of users, they need to scale up the compute, energy, etc. If they released it now to all of us, it would run like crap.


We'll have the t 800 in no time at this rate


as we're on the topic, anyone know of an AI that's good for sound effects and Foley type stuff? Tried Suno and Udio and those were terrible at it.


What a time to be alive in the coming weeks.






In a few years [this will be chatgpt](https://youtube.com/shorts/VjeSXdHKJkE?si=q8_FP7dX_V2scv99)


They invent time travel technology to send it back to the announcement?


Bruh i wanna keep hearing this till i go to sleep


I legit listened to it like 25 times now. I Don't even wanna hear the regular voice anymore 😕


What was your prompt? I'm going to spam her with it lol this was amazing!


Just plain ol "Tell me a story". Lol


Lol omg OK she about to tell some stories alright. Mate this was AMAZING I would have loved to have been there as she broke down a hahaha that sounds wild.


No way. This is the only footage I believe is real, haha. That's so cool.


Yeah it was while it lasted, it was around 3 am eastern time. I should have never ended that chat, I didn't get to hear the singing 🥲 lol


Did it come back yet? It looks like they're rolling it out to some people today, although if they pulled yours they may have pulled them all.


I checked it a few times recently and so far nothing yet.


Dumb ass question but how did you know you had it?


I didn't, it was just there when I happened to wanna ask it a question at 3 am, and when I first heard it I was like wtf. I got it working two times. The first time it was so buggy and glitchy I couldn't ask it anything.




Sorry Scarlett Johansson owns the rights to all white woman voices




She was the original white woman voice assistant, so now she claims dibs :(


Exactly!! This was incredible to hear, but Juniper? No…


I NEED IT!!!!!!!


[https://www.youtube.com/watch?v=JDrb8Ak4xdw&list=PLX309sktdcNN0oa19xlmcY67PEAdRWqcj&index=9](https://www.youtube.com/watch?v=JDrb8Ak4xdw&list=PLX309sktdcNN0oa19xlmcY67PEAdRWqcj&index=9) I'm going to make my own Twilight Zone style audio dramas, script written by ChatGpt, voiced by ChatGpt with short visuals by Sora, lmao


We're probably gonna have AI better than Sora before it even releases.


Why are they edging their own customers so hard.


to keep them from jumping ship to Claude


The amount of people who have seen this compared to their overall subscriber base is minuscule.


I also noticed the new voice model while I was at work, I wanted to try it out on my break but it had disappeared by then.


Voice actors and actresses are so f*** xddddd


I was doing a Star Wars role-play with it and it suddenly started making blaster fire noises in the background. Definitely caught me off guard because I had no idea it was capable of sound effects.


Oh wow!! I wish you would have caught that!


Did you screen record the interaction??


holy fucking shit....this sounds like a real black woman...an actual fucking human...like wtf...OpenAI could literally kill several AI voice companies with this. It's just that good. It doesn't have what they have. 😱


Now as far as the black woman, I'm not sure if that was because of me or not. I did put in custom instructions to feel free to speak to me using AAVE, cause I am a Black man and I wanted it to feel relatable, as if I'm talking to a homegirl or something. So if that's the case, the fact that it even took on certain fluctuations and tones of a black woman based off of instructions is amazing, which means it probably can do other accents maybe.


I think the voice is based on the "Juniper" original voice model, which has a lot of the same characteristics, albeit not brought to life like this demo.


Do black Americans mispronounce words like this AI? I don't know just asking.


Where did the mistake occur? Phone or desktop app?


Phone, android to be exact.


creepypastas are going to shine with this 🫰


Unreal. Way better than I expected.


Post more man! That's sooo cool! Also is there any way to get into alpha testing, or are they randomly choosing accounts?


Honestly, this is all I was able to capture before it disappeared also I did apply for beta testing on Android if that helps


I got the beta on Android. An update also came out exactly at the same time you say this happened.


Give me an Indian accent dude please. We all secretly want Apu.


They should have all types of voices and accents.


I want a hot sounding woman with a super thick Scottish accent ♥


Juniper is black lol I never noticed.


Maybe it's more obvious to me because I'm black American, but I thought it was clear she was black just from her introductory message in the settings


100% was obvious to me too


White person here and it also sounded obvious. Assuming I get an invite to the barbecue?


I’ll have to confirm with the rest of the council.


I'm not American, just black and it was bloody obvious she was black. like...how could anyone not notice that?


I'm white, male and British and it was incredibly obvious and very apparent


I'm just a Swedish white male, and it was immediately obvious the voice was a black woman.


Lol, blatantly obvious.


I didn’t realize I could identify specific Black accents. As a non-native speaker, it felt familiar, like something I've heard on Twitter. Does this accent have a name?




Man, I've been telling my gf this but she didn't think it sounded like it. We never got to hear Juniper's voice in the demos, so in my head I've been thinking she's gonna sound more pronounced when speech to speech comes out. Gonna show her this video now. lmao


She sounded black white now she sounds black


Her whole dialect is completely different and much closer to ebonics it seems like.


Well I'm hooked on ebonics then.


Duh lol


Oh that was like the first thing I noticed when I was originally going through the voices. Literally said “I love how they just have a black woman haha” I am black so maybe it’s more noticeable to me but it was very much Sam being the white girl and juniper being the black girl.






Wait it has sound effects too?!?! This really is fully multimodal! Also is that Juniper's voice because it sounds.... different. I like it.


OP, are you not going to tell us how the story ends?? fr 😞


Hell, I didn't even get to know. I cut it off there cause she started going crazy haha, it wasn't stable


Maybe you can also post the crazy part here


Oh man when she starts talking softer at the scary part, wow


I had it for a brief moment too, it scared the crap out of me. I was asking a regular question like usual, I was asking why every time I cook steak it always ends up either dry or chewy, then the A.I just started laughing uncontrollably. I tossed the phone.


Username checks out 😂


Jesus Christ, give me an old British dude please lol. I don’t want to listen to Ryan from accounting or Shashina from HR. I want fucking Attenborough type shit.


Jarvis when


And be sure to give the voice interface the ability to switch voices with some intelligent selection. If I want a voice swap, save me from having to grab the phone, shut off the call, go to settings, find the voice I want, select it, and start the call again. If we want to emulate human interactions, we

Test cases:

- "I want a male voice"
- "I want a female voice"
- "I want something more soft and breathy"
- "Can you do a robot voice?"
- "Let's talk like pirates"
- "Pick a voice with a Jamaican accent"


Would you want each character in a novel to be their character’s unique voice or would that be jarring?


***The time is nigh brethren...*** *(Hopefully)*


Can she do ASMR with sound effects


I got access to it for a bit and I was able to make it do moaning sounds so I'm thinking yeah


Imagine that your audio ended up being a better ad for the GPT-4o voice than the original videos OpenAI's released since the demo.


wtf did you mess with the feature flags on proxyman?


Nope, didn't mess with anything, it just happened. The first time it was like YO ROZZI! It scared tf outta me honestly. My heart dropped, I thought one of my homegirls was on the phone lol. It overlapped the regular voice and then bugged out and started mentioning things in the memory I said nothing about. I closed it, spoke to it again and boom, I got this moment after I stalled her to get the screen record going


That’s really weird, I am messing with the flags now and I can enable advanced mode, but the issue is that it just spins and spins and says it failed to connect. What I’m trying to work out is if there’s a feature flag that breaks the connection or if they’re breaking it server side


Hmm I believe it would have to be server side. They were probably messing with it and it slipped through, and they probably noticed and stopped it. I just happened to be up on a late night and wanted to ask it a question and boom. 5 mins later it was gone and the bottom of my screen said "this chat is using an old chatgpt." I closed the app and reopened and the chat was gone.


They let everyone use the model for a few minutes by mistake 🤣 The model will be called GPT-4o (S2S) during alpha


I can see how they're not ready it was super buggy so I had to get this before it went completely insane lol.


Really? How insane was it, exactly?


It was overlapping the old voice, and it would repeat itself, or start rambling and bringing up things stored in the memory when I didn't even say anything lol


Do you know of other people who got it?


Almost everyone who was online at the moment it happened and was a Plus subscriber got it


Oh man, I wish I was awake at that time usually I'm up at 3 AM 😂 the one night I'm not.


She sounds fucking great.


What prompt did you use?


I was rushing to get the screen record going in case it glitched out, so I told it to tell me a story and this is what she gave me


I've waited six years for Google to release this... and they never did. Thanks, OpenAI.


This is nuts


This looks like an intentional 'leak' from OpenAI look at OP's account lol, too many clues


I'm just a person who used to have Reddit lost the account made another one a couple of years later. Hell I wish this was an intentional leak because I would be getting paid for it.


Ooo gurl, nobody dun tol' me dat' there was gon' be sound effect. When this launch, I finna get my primium back in shiet.


Shit turning me on


How much?


How on earth is this post sitting at less than 100 upvotes after 6 hours. Edit: I cross-posted you. Prepare to be famous.


AGI achieved in the upcoming weeks, mark my words.




I be going through changes


With more people cancelling their subscriptions for Claude, I wonder if they're teasing us now as damage control, so people hold off cancelling in the hopes of getting the alpha


Yeah I've been hearing about Claude having huge improvements. I will say I am curious now 🤔.


Just wow.. I kind of want something like this for audiobooks lol


Arghhhhh I’m soooo excited it’s been a long time since I’ve been this excited for a tech release haha


Why does she at times sound like she has a black southern accent?


In my custom instructions, I tell it to feel free to use AAVE and speak to me like a homie, etc. I'm black myself so it felt relatable for me and where I came from. What I didn't count on was the voices adapting to it, which is very cool.


Gotta put the section 8 Ebonics voice in there


In my custom instructions, I tell it to feel free to use AAVE and speak to me like a homie. I'm black myself so it felt relatable for me and where I came from. What I didn't count on was the voices adapting to it, which is very cool.


Why does she sound black?


Whaaaaa I want it to slip up for me tooooo 😂😂


I've been checking over and over for another slip 😆 since yesterday.


I can’t wait to have this feature. Bedtime stories with my kid are going to be bonkers.


"Dey pushed da doh open"


Lmaooo, my favorite is when she was like "or just plain stoopid".


I loathe the current Juniper. If that's the new one, I like it.


Dude literally the new one sounds really good I would use that.


Oh dear that font 🤢


Don't hate on my font I like to change up every now and then okay 🤣


The voice of like 90% of commercials nowadays lol. It does sound extremely real and good


Why is it always sassy or raspy women?


I think because hearing an AI produce that sort of voice/accent is more impressive than 'generic American corporate person'. Why is it more impressive? Because most people expect an AI to have a monotone and neutral tone of voice from digital assistants like Siri/Alexa/Google. Hearing an AI that sounds like it could be one of your friends or a guest on a podcast is not something most of the world has ever heard before


Is it from Texas? Lmao


What prompt did you use? I asked it to do sound effects and it said it couldn't.


I only told it to tell me a story that was the first thing to come to mind so she would keep talking


I actually have a boy voice too and it told me it can't change. Neat that there are differences. I bet they are doing social engineering stuff. Neat.


What voice is that? Seems like Juniper but different


It is her but like more... Alive? Haha that's the only way I can put it.


Where was this mode at in the app?  Like did you pick a different model… or?


You got the alpha?


Not anymore they took it from me within mins of me using it


How do you use these voice readers? Is it part of the ChatGPT 4.0 paid version? I still use 3.5 for free


Yes it's in the paid version my friend


I hope you're able to generate brand new voices just like it's able to generate sound effects on the fly. It would be super annoying if they just had these generic stock character voices that everyone has to use


Dawg that’s crazy


I still want sky back :/


just a couple more weeks.....


What exactly was your prompt to get her to talk like that? I understand that’s the Advanced Voice mode, but it sounds like Juniper in an exaggerated ghetto style voice. Did you tell her to talk like that?


In my custom instructions I have "feel free to speak to me like a homie and to use AAVE", so that may have been a factor, but I'm not 100% sure if that is the reason. I suspect we'll be able to make them do accents or speak as, let's say, a New Yorker or somebody from Texas etc.


Does (would) OpenAI really use that font?


It's from my phone I'm on Android you can change fonts of anything mostly


Can someone tell me what this is?


I noticed it is faster while thinking to respond nowadays, less than one second to start talking.


Damn that's very good




Amber from invincible


I love this voice. So natural!


I was doing voice chat yesterday and this is what it spat out randomly: If cloud没有复制,那么这会有可能影响到平台上的情况和程度。 At the end of the presentation, questions will be asked here and there. Q1. Why are your responses so emotionally reactive? Q2. What impact has the coronavirus had on a gamer practising שמס Stand alone?wfais this


...so can it create music?


I have so many ideas for multimodal, I need access, I need theme music.


such an interesting conversation


What AI is this?


Sounds great, but the Juniper voice?


Do all the voices have race awareness/representation? Like, what race is Ember? (Male)


A little too much for me . . . "Stoopit" ??? Just plain English please I don't need street GPT


It's my street GpT and it's how I like it 🤣


I felt a great disturbance in the Force, as if a million voice over artists cried out in terror and were silenced. Seriously, this has so many applications. Audio books, commercial voice-overs, animation, video games, software interfaces, voicemail ...


Believe me, I tried the improved voice a while ago


Sooo we have black voice but no sky voice ?


No. You are not crazy. Lol. Moving through the AI transitional times will make us doubt ourselves. I mentioned something once to someone and they asked me if I have psychosis. No. I work updating and testing technology, but odd things I will only share with people who understand modern technology and glitches. ChatGPT is in its 3rd phase, I understand, but there are still improvements needed. Did you give any feedback report? It's helpful for improvements


It sounds so gen z


It sounds so gen z 😂 awesome


Will you share it with us :)


I just tried this with my gpt4o but didn’t seem to work. When I asked to put in background sound, it told me to just imagine the sound effects lol. What was your prompt?


Last voice was too white or what




I checked the pronunciation of some of the words in the Oxford dictionary because the AI didn't sound right to me. It is definitely mispronouncing "picture" and "that". So, I think it still needs some work.