

I'm reminded of the quote "everything has its pleasure and its price". I think it's a bit of column A and a bit of column B.

> As impressive as a lot of these models are, I can't help but feel like they're going to end up making an incredible amount of sterile, soulless content that makes everyone's lives worse. That exists to a certain extent already, but I don't see how this stuff won't make it way easier, way more effective, and way more widespread. If I want people to buy more Triscuits next year, what's stopping me from writing a bunch of prompts to insert subtle marketing cues to buy Triscuits, with entire fake ecosystems of users, fan art, radio call-ins, user stories, etc. in like every niche community in existence, and flooding them with soulless fake interaction? Are all online forums going to end up being drowned out by cynical, pumped-out, super-cheap-to-produce simulacrums of creative content now too? We're already drowning in ad-dominated, cynical, soulless, computer-generated search results. I can't tell if I'm starting to get that old person "new things are scary" instinct or if my gut level of fear about the implications of these things is warranted.

We are planning to open up Beta later this month. Our goal is to let you convert any written content into high-quality, compelling audio.

To address a few questions that frequently came up:

- Latency for our streaming TTS is <1s at the quality of the results above; latency is the usual problem with existing good TTS models (like tortoise-tts).
- We can clone voices instantly, based on just 5s of speech, with no training required.
- We are working on adding SSML-like support for better control; speed controls will be coming as part of that too.
- API is directly available as part of Beta; we are preparing the infrastructure to scale easily for the release!

We are hiring researchers, frontend and full-stack developers! If you are interested, send over your GitHub account and a short message to founders@elevenlabs.io.

Thank you so much for the constructive and positive feedback - we’re taking it on board! We’re currently focused on researching and deploying a different approach to speech synthesis that can generate nuanced intonation and emotions by understanding text and taking context into account. Anyone will be able to generate that level of quality just with a copy-paste. Additionally, we provide creators with a way to clone their own voice based on very short samples. With the published blog post, we are now deploying a way to help them design entirely new voices!
