Revision as of 10:10, 7 December 2024 editRocketKnightX (talk | contribs)Extended confirmed users1,790 edits Undid revision 1261647607 by BrocadeRiverPoems (talk) ENOUGH VANDALIZING.Tags: Undo Mobile edit Mobile web edit Advanced mobile edit← Previous edit | Latest revision as of 17:49, 23 December 2024 edit undoAlalch E. (talk | contribs)Extended confirmed users, New page reviewers, Rollbackers29,914 editsm Arranged references with script | ||
(216 intermediate revisions by 18 users not shown) | |||
{{Short description|Real-time text-to-speech AI tool}}
<!-- Please do not remove or change this AfD message until the discussion has been closed. -->
{{AfDM|page=15.ai (3rd nomination)|year=2024|month=December|day=20|substed=yes|origtag=afdx|help=off}}
<!-- End of AfD message, feel free to edit beyond this point -->
{{Use mdy dates|date=December 2024}}
{{Short description|Real-time text-to-speech tool using artificial intelligence}}
{{pp|small=yes}}
{{Use mdy dates|date=July 2022}}
{{Infobox website
| name = 15.ai
| commercial = No
| registration = None
| launch_date = {{start date and age|2020|03}}
| type = ], ]
| website = {{URL|https://15.ai}}
| language = English
| current_status = Inactive
}}
{{Artificial intelligence}} | |||
'''15.ai''' was a ] ] ] that generated ] voices from fictional characters from various media sources.<ref name="kotaku">{{cite web | |||
|url= https://kotaku.com/this-website-lets-you-make-glados-say-whatever-you-want-1846062835 | |||
|title= Website Lets You Make GLaDOS Say Whatever You Want | |||
|last= Zwiezen | |||
|first= Zack | |||
|date= 2021-01-18 | |||
|website= ] | |||
|access-date= 2021-01-18 | |||
|quote= | |||
|archive-date= 2021-01-17 | |||
|archive-url= https://web.archive.org/web/20210117164748/https://kotaku.com/this-website-lets-you-make-glados-say-whatever-you-want-1846062835 | |||
|url-status= live | |||
}}</ref><ref name="gameinformer">{{cite magazine | |||
|url= https://www.gameinformer.com/gamer-culture/2021/01/18/make-portals-glados-and-other-beloved-characters-say-the-weirdest-things | |||
|title= Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App | |||
|last= Ruppert | |||
|first= Liana | |||
|date= 2021-01-18 | |||
|magazine= ] | |||
|access-date= 2021-01-18 | |||
|quote= | |||
|archive-date= 2021-01-18 | |||
|archive-url= https://web.archive.org/web/20210118175543/https://www.gameinformer.com/gamer-culture/2021/01/18/make-portals-glados-and-other-beloved-characters-say-the-weirdest-things | |||
|url-status= dead | |||
}}</ref><ref name="pcgamer">{{cite web | |||
|url= https://www.pcgamer.com/make-the-cast-of-tf2-recite-old-memes-with-this-ai-text-to-speech-tool | |||
|title= Make the cast of TF2 recite old memes with this AI text-to-speech tool | |||
|last= Clayton | |||
|first= Natalie | |||
|date= 2021-01-19 | |||
|website= ] | |||
|access-date= 2021-01-19 | |||
|quote= | |||
|archive-date= 2021-01-19 | |||
|archive-url= https://web.archive.org/web/20210119133726/https://www.pcgamer.com/make-the-cast-of-tf2-recite-old-memes-with-this-ai-text-to-speech-tool/ | |||
|url-status= live | |||
}}</ref><ref name="rockpapershotgun">{{cite web | |||
|url= https://www.rockpapershotgun.com/2021/01/18/put-words-in-game-characters-mouths-with-this-fascinating-text-to-speech-tool/ | |||
|title= Put words in game characters' mouths with this fascinating text to speech tool | |||
|last= Morton | |||
|first= Lauren | |||
|date= 2021-01-18 | |||
|website= ] | |||
|access-date= 2021-01-18 | |||
|quote= | |||
|archive-date= 2021-01-18 | |||
|archive-url= https://web.archive.org/web/20210118213308/https://www.rockpapershotgun.com/2021/01/18/put-words-in-game-characters-mouths-with-this-fascinating-text-to-speech-tool/ | |||
|url-status= live | |||
}}</ref> Created by a ]ous developer under the alias '''15''',<ref name="automaton"/><ref name="elevenlabs">{{cite web | |||
|url=https://elevenlabs.io/blog/15-ai | |||
|title=15.AI: Everything You Need to Know & Best Alternatives | |||
|website=] | |||
|date=2024-02-07 | |||
|access-date=2024-11-18 | |||
|url-status=live | |||
|archive-date=July 15, 2024 | |||
|archive-url=https://web.archive.org/web/20240715151316/https://elevenlabs.io/blog/15-ai | |||
}}</ref><ref name="resemble">{{cite web | |||
|url=https://www.resemble.ai/free-15ai-character-voice-cloning-alternatives/ | |||
|title=Free 15.ai Character Voice Cloning and Alternatives | |||
|website=Resemble.ai | |||
|date= October 17, 2024 | |||
|access-date= 2024-11-18 | |||
}}</ref><ref name="play.ht">{{cite web | |||
|url=https://play.ht/blog/15-ai/ | |||
|title=Everything You Need to Know About 15.ai: The AI Voice Generator | |||
|website=Play.ht | |||
|date=2024-09-12 | |||
|access-date=2024-11-18 | |||
}}</ref> the project used a combination of ] algorithms, ] ], and ] models to generate emotive character voices faster than real-time.{{efn|The term ''"faster than real-time"'' in speech synthesis means that the system can generate audio more quickly than the actual duration of the speech – for example, generating 10 seconds of speech in less than 10 seconds would be considered faster than real-time.}}<ref name="hashdork">{{cite web | |||
|url=https://hashdork.com/15-ai/ | |||
|title=15.ai – Natural and Emotional Text-to-Speech Using Neural Networks | |||
|website=Hashdork | |||
|date=2024-05-15 | |||
|access-date=2024-11-18 | |||
|url-status=live | |||
|archive-date=July 4, 2024 | |||
|archive-url=https://web.archive.org/web/20240704144415/https://hashdork.com/15-ai/ | |||
}}</ref><ref name="thelinuxcode">{{cite web | |||
|url=https://thelinuxcode.com/what-15ai-and-how-does-work/ | |||
|title=Demystifying 15.ai: How AI Generates Ultra-Realistic Text-to-Speech Voices | |||
|website=TheLinuxCode | |||
|date=2023-12-27 | |||
|access-date=2024-11-18 | |||
|url-status=live | |||
|archive-date=December 27, 2023 | |||
|archive-url=https://web.archive.org/web/20231227222306/https://thelinuxcode.com/what-15ai-and-how-does-work/ | |||
}}</ref> | |||
'''15.ai''' was a free non-commercial ] that used ] to generate ] voices of fictional characters from ].<ref name="UDN-2021">{{cite web |last=遊戲 |first=遊戲角落 |date=January 20, 2021 |title=這個AI語音可以模仿《傳送門》GLaDOS講出任何對白!連《Undertale》都可以學 |trans-title=This AI Voice Can Imitate Portal's GLaDOS Saying Any Dialog! It Can Even Learn Undertale |url=https://game.udn.com/game/story/10453/5189551 |url-status=live |access-date=December 18, 2024 |website=] |language=zh-tw |quote= |trans-quote= |archive-date=December 19, 2024 |archive-url=https://web.archive.org/web/20241219214330/https://game.udn.com/game/story/10453/5189551}}</ref><ref name="Yoshiyuki-2021">{{cite web |last=Yoshiyuki |first=Furushima |date=January 18, 2021 |title=『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に |trans-title=Portal's GLaDOS and UNDERTALE's Sans Will Read Text for You. "15.ai" Service Aims to Reproduce Even the Emotions in Text, Becomes Topic of Discussion |url=https://news.denfaminicogamer.jp/news/210118f |url-status=live |archive-url=https://web.archive.org/web/20210118051321/https://news.denfaminicogamer.jp/news/210118f |archive-date=January 18, 2021 |access-date=December 18, 2024 |website=] |language=ja |quote=日本語入力には対応していないが、ローマ字入力でもなんとなくそれっぽい発音になる。; 15.aiはテキスト読み上げサービスだが、特筆すべきはそのなめらかな発音と、ゲームに登場するキャラクター音声を再現している点だ。 |trans-quote=It does not support Japanese input, but even if you input using romaji, it will somehow give you a similar pronunciation.; 15.ai is a text-to-speech service, but what makes it particularly noteworthy is its smooth pronunciation and the fact that it reproduces the voices of characters that appear in games.}}</ref> Conceived by an artificial intelligence researcher known as ''"15"'' during their time at the ] and developed following their successful exit from a ] venture, the application allowed users to make characters from various media speak custom text with emotional inflections faster than real-time.{{efn|The term ''"faster than 
real-time"'' in speech synthesis means that the system can generate audio more quickly than the actual duration of the speech – for example, generating 10 seconds of speech in less than 10 seconds would be considered faster than real-time.}}<ref name="Kurosawa-2021">{{cite web |last=Kurosawa |first=Yuki |date=January 19, 2021 |title=ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる |trans-title=Game Character Voice Reading Software "15.ai" Now Available. Get Characters from Undertale and Portal to Say Your Desired Lines |url=https://automaton-media.com/articles/newsjp/20210119-149494/ |url-status=live |archive-url=https://web.archive.org/web/20210119103031/https://automaton-media.com/articles/newsjp/20210119-149494/ |archive-date=January 19, 2021 |access-date=December 18, 2024 |website=] |language=ja |quote=英語版ボイスのみなので注意。;もうひとつ15.aiの大きな特徴として挙げられるのが、豊かな感情表現だ。 |trans-quote=Please note that only English voices are available.;Another major feature of 15.ai is its rich emotional expression.}}</ref><ref name="Ruppert-2021">{{cite magazine |last=Ruppert |first=Liana |date=January 18, 2021 |title=Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App |url=https://www.gameinformer.com/gamer-culture/2021/01/18/make-portals-glados-and-other-beloved-characters-say-the-weirdest-things |url-status=dead |archive-url=https://web.archive.org/web/20210118175543/https://www.gameinformer.com/gamer-culture/2021/01/18/make-portals-glados-and-other-beloved-characters-say-the-weirdest-things |archive-date=January 18, 2021 |access-date=December 18, 2024 |magazine=] |quote=}}</ref><ref name="Clayton-2021">{{cite web |last=Clayton |first=Natalie |date=January 19, 2021 |title=Make the cast of TF2 recite old memes with this AI text-to-speech tool |url=https://www.pcgamer.com/make-the-cast-of-tf2-recite-old-memes-with-this-ai-text-to-speech-tool |url-status=live 
|archive-url=https://web.archive.org/web/20210119133726/https://www.pcgamer.com/make-the-cast-of-tf2-recite-old-memes-with-this-ai-text-to-speech-tool/ |archive-date=January 19, 2021 |access-date=December 18, 2024 |website=] |quote=}}</ref><ref name="Morton-2021">{{cite web |last=Morton |first=Lauren |date=January 18, 2021 |title=Put words in game characters' mouths with this fascinating text to speech tool |url=https://www.rockpapershotgun.com/2021/01/18/put-words-in-game-characters-mouths-with-this-fascinating-text-to-speech-tool/ |url-status=live |archive-url=https://web.archive.org/web/20210118213308/https://www.rockpapershotgun.com/2021/01/18/put-words-in-game-characters-mouths-with-this-fascinating-text-to-speech-tool/ |archive-date=January 18, 2021 |access-date=December 18, 2024 |website=] |quote=}}</ref> | |||
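The "faster than real-time" criterion described in the note above can be expressed as a ratio, often called the real-time factor: synthesis time divided by the duration of the audio produced. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and are not taken from 15.ai.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return the time spent generating audio divided by the audio's duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def is_faster_than_real_time(synthesis_seconds: float, audio_seconds: float) -> bool:
    """True when generation takes less time than the generated audio lasts."""
    return real_time_factor(synthesis_seconds, audio_seconds) < 1.0


# Example from the note: 10 seconds of speech generated in under 10 seconds.
print(is_faster_than_real_time(synthesis_seconds=4.0, audio_seconds=10.0))
```

A real-time factor below 1 corresponds to the note's example of producing 10 seconds of speech in less than 10 seconds.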
In early 2020, 15.ai appeared online as a ] of the ] of ] and ].<ref name="play.ht"/><ref name="thebatch"> | |||
{{cite web |last=Ng |first=Andrew |date=2020-04-01 |title=Voice Cloning for the Masses |url=https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters |url-status=dead |archive-url=https://web.archive.org/web/20200807111844/https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters |archive-date=2020-08-07 |access-date=2020-04-05 |website=DeepLearning.AI |quote=}} | |||
</ref> Its gratis nature, ease of use without ], and improvements over existing text-to-speech implementations made it popular.<ref name="gameinformer"/><ref name="kotaku" /><ref name="pcgamer" /> Some critics and ]s questioned the ] and ] of making such technology so readily accessible.<ref name="wccftech"/> | |||
Launched in March 2020,<ref name="Ng-2020">{{cite web |last=Ng |first=Andrew |date=April 1, 2020 |title=Voice Cloning for the Masses |url=https://www.deeplearning.ai/the-batch/voice-cloning-for-the-masses/|access-date=December 22, 2024 |website=] |quote=}}</ref> the service gained widespread attention in early 2021 when it went ] on social media platforms like ] and ], and quickly became popular among Internet fandoms, including the '']'', '']'', and '']'' fandoms.<ref name="Zwiezen-2021">{{cite web |last=Zwiezen |first=Zack |date=January 18, 2021 |title=Website Lets You Make GLaDOS Say Whatever You Want |url=https://kotaku.com/this-website-lets-you-make-glados-say-whatever-you-want-1846062835 |url-status=live |archive-url=https://web.archive.org/web/20210117164748/https://kotaku.com/this-website-lets-you-make-glados-say-whatever-you-want-1846062835 |archive-date=January 17, 2021 |access-date=December 18, 2024 |website=] |quote=}}</ref><ref name="Chandraseta-2021" /><ref name="GamerSky-2021">{{cite web |date=January 18, 2021 |title=这个网站可用AI生成语音 让ACG角色"说"出你输入的文本 |trans-title=This Website Can Use AI to Generate Voice, Making ACG Characters "Say" the Text You Input |url=https://www.gamersky.com/news/202101/1355887.shtml |url-status=live |access-date=December 18, 2024 |website=] |language=zh |quote= |trans-quote= |archive-date=December 11, 2024 |archive-url=https://web.archive.org/web/20241211221628/https://www.gamersky.com/news/202101/1355887.shtml}}</ref> The website had a role in the emergence of AI voice cloning (]) ].
The site was credited as the impetus behind the popularization of AI ] (also known as '']'') in ].<ref name="play.ht"/> It was embraced by Internet ]s such as ], '']'', and '']''.<ref name="automaton"/><ref name="Denfaminicogamer"/><ref name="play.ht"/> | |||
In January 2022, Voiceverse NFT sparked controversy when it was discovered that the company, which had partnered with voice actor ], had misappropriated 15.ai's work for their own platform. 15.ai was ultimately taken offline in September 2022. Its shutdown led to the emergence of various commercial alternatives in subsequent years.
Several commercial alternatives appeared in the following years.<ref name="elevenlabs"/><ref name="resemble"/> In January 2022, the company Voiceverse NFT ] 15.ai's work as part of their platform.<ref name="nme">{{cite web | |||
|url= https://www.nme.com/news/gaming-news/voiceverse-nft-admits-to-taking-voice-lines-from-non-commercial-service-3140663 | |||
|title= Voiceverse NFT admits to taking voice lines from non-commercial service | |||
|last= Williams | |||
|first= Demi | |||
|date= 2022-01-18 | |||
|website= ] | |||
|access-date= 2022-01-18 | |||
|quote= | |||
|archive-date= 2022-01-18 | |||
|archive-url= https://web.archive.org/web/20220118162845/https://www.nme.com/news/gaming-news/voiceverse-nft-admits-to-taking-voice-lines-from-non-commercial-service-3140663 | |||
|url-status= live | |||
}}</ref><ref name="stevivor">{{cite web | |||
|url= https://stevivor.com/news/troy-baker-nft-voiceverse-15-ai/ | |||
|title= Troy Baker-backed NFT company admits to using content without permission | |||
|last= Wright | |||
|first= Steve | |||
|date= 2022-01-17 | |||
|website= Stevivor | |||
|access-date= 2022-01-17 | |||
|quote= | |||
|archive-date= 2022-01-17 | |||
|archive-url= https://web.archive.org/web/20220117231918/https://stevivor.com/news/troy-baker-nft-voiceverse-15-ai/ | |||
|url-status= live | |||
}}</ref> | |||
== History == | |||
In September 2022, a year after its last stable release, 15.ai was taken offline.<ref name="elevenlabs"/> As of November 2024, the website was still offline, with the creator's most recent post being dated February 2023.<ref>{{Cite tweet |number=1628834708653068290 |user=fifteenai |title=If all goes well, the next update should be the culmination of a year and a half of nonstop work put into a huge number of fixes and major improvements to the algorithm. Just give me a bit more time – it should be worth it.}}</ref> | |||
15.ai was conceived in 2016 as a research project in ] by a developer known as ''"15"'' during their undergraduate studies at the ] (MIT).<ref name="Chandraseta-2021">{{cite web |last=Chandraseta |first=Rionaldi |date=January 21, 2021 |title=Generate Your Favourite Characters' Voice Lines using Machine Learning |url=https://towardsdatascience.com/generate-your-favourite-characters-voice-lines-using-machine-learning-c0939270c0c6 |url-status=live |access-date=December 18, 2024 |website=Towards Data Science |archive-date=January 21, 2021 |archive-url=https://web.archive.org/web/20210121132456/https://towardsdatascience.com/generate-your-favourite-characters-voice-lines-using-machine-learning-c0939270c0c6}}</ref> The developer was inspired by ]'s ] paper, with development continuing through their studies as ] released Tacotron the following year.<ref name="Twitter">{{cite web |title=The past and future of 15.ai |url=https://x.com/fifteenai/status/1865439846744871044 |website=] |access-date=December 19, 2024 |archive-date=December 8, 2024 |archive-url=https://web.archive.org/web/20241208035548/https://x.com/fifteenai/status/1865439846744871044 |url-status=live}}</ref> The name ''15'' is a reference to the creator's claim that a voice can be cloned with as little as 15 seconds of data.<ref name="Chandraseta-2021" /><ref>{{cite web |last=Button |first=Chris |date=January 19, 2021 |title=Make GLaDOS, SpongeBob and other friends say what you want with this AI text-to-speech tool |url=https://www.byteside.com/2021/01/15-ai-deepmoji-glados-spongebob-characters-ai-text-to-speech/ |url-status=live |access-date=December 18, 2024 |website=Byteside |quote= |archive-date=June 25, 2024 |archive-url=https://web.archive.org/web/20240625180514/https://www.byteside.com/2021/01/15-ai-deepmoji-glados-spongebob-characters-ai-text-to-speech/}}</ref> 15.ai was released in March 2020.<ref>{{multiref|{{cite web |title=About |url=https://fifteen.ai/about |website=fifteen.ai 
|access-date=December 23, 2024 |archive-url=https://archive.is/oaJPz |archive-date=February 23, 2020 |date=February 19, 2020 |type=Official website |quote=2020-02-19: The web app isn't fully ready just yet}}|{{cite web |title=About |url=https://fifteen.ai/about |website=fifteen.ai |access-date=December 23, 2024 |archive-url=https://archive.is/aXhTU |archive-date=March 3, 2020 |date=March 2, 2020 |type=Official website}}<!-- multiref end-->}}</ref><!--In April 2020, British-American computer scientist ]'s wrote about 15.ai in his newsletter ''The Batch''; he described it as a proof of concept of voice cloning for practical use cases.<ref name="Ng-2020" />--> More voices were added to the website in the following months.<ref>{{cite web |last=Scotellaro |first=Shaun |date=March 31, 2020 |title=Rainbow Dash Voice Added to 15.ai |url=https://www.equestriadaily.com/2020/03/rainbow-dash-voice-added-to-15ai.html |url-status=live |access-date=December 18, 2024 |website=] |quote= |archive-date=December 1, 2024 |archive-url=https://web.archive.org/web/20241201163118/https://www.equestriadaily.com/2020/03/rainbow-dash-voice-added-to-15ai.html}}</ref><ref>{{cite web |last=Scotellaro |first=Shaun |date=October 5, 2020|title=15.ai Adds Tons of New Pony Voices|url=https://www.equestriadaily.com/2020/10/15ai-adds-tons-of-new-pony-voices.html|access-date=December 21, 2024|website=]}}</ref> | |||
In early 2021, the application went viral on ] and ], with people generating skits, ], and fan content using voices from popular games and shows.<ref name="Zwiezen-2021" /><ref name="Clayton-2021" /><ref name="Ruppert-2021" /><ref name="Yoshiyuki-2021" /> 15.ai use also resulted in memes and ]s. These included recreations of the popular ] video '']'',<ref name="UDN-2021" /> ''The RED Bread Bank'',<ref name="Kurosawa-2021" /> and ''] Struggles'',<ref name="Morton-2021" /> which have amassed millions of views on social media. Content creators, ], and ] have also used 15.ai as part of their videos as ].<ref name="Play.ht-2024" /> According to the developer, at its peak, the platform incurred operational costs of {{Currency|12000|United States}} per month from ] infrastructure needed to handle millions of daily voice generations. They funded the website through their previous startup earnings.<ref name="Twitter" /> | |||
== Features == | |||
], known for his sinister robotic voice, was one of the available characters on 15.ai.<ref name="kotaku"/>]] | |||
The platform required no ] or ] to generate voices.<ref name="LaPS4"/><ref name="yahoofin"/><ref name="resemble"/><ref name="play.ht"/> Users could generate speech by entering text and selecting a character voice (optionally specifying an emotional contextualizer and/or phonetic transcriptions), with the system producing three variations of the audio with different emotional deliveries.<ref name="hashdork"/> The platform operated completely ], though the developer reported spending thousands of dollars monthly to maintain the service.<ref name="play.ht"/> | |||
On January 14, 2022, a controversy ensued after it was discovered that Voiceverse NFT, a company with which video game and ] ] voice actor ] had partnered, had misappropriated voice lines generated from 15.ai as part of their marketing campaign.<ref>{{cite web |last1=Lawrence |first1=Briana |title=Shonen Jump Scare Leads to Company Reassuring Fans That They Aren't Getting Into NFTs |url=https://www.themarysue.com/shonen-jump-not-doing-nfts/ |website=] |access-date=23 December 2024 |date=19 January 2022}}</ref><ref name="Williams-2022">{{cite web |last=Williams |first=Demi |date=January 18, 2022 |title=Voiceverse NFT admits to taking voice lines from non-commercial service |url=https://www.nme.com/news/gaming-news/voiceverse-nft-admits-to-taking-voice-lines-from-non-commercial-service-3140663 |url-status=live |archive-url=https://web.archive.org/web/20220118162845/https://www.nme.com/news/gaming-news/voiceverse-nft-admits-to-taking-voice-lines-from-non-commercial-service-3140663 |archive-date=January 18, 2022 |access-date=December 18, 2024 |website=] |quote=}}</ref><ref name="Wright-2022">{{cite web |last=Wright |first=Steve |date=January 17, 2022 |title=Troy Baker-backed NFT company admits to using content without permission |url=https://stevivor.com/news/troy-baker-nft-voiceverse-15-ai/ |url-status=live |archive-url=https://web.archive.org/web/20220117231918/https://stevivor.com/news/troy-baker-nft-voiceverse-15-ai/ |archive-date=January 17, 2022 |access-date=December 18, 2024 |website=Stevivor |quote=}}</ref> ] showed that Voiceverse had generated audio of characters from '']'' using 15.ai and pitched it up to make it sound unrecognizable from the original voices in order to market their own platform, in violation of 15.ai's terms of service.<ref name="Phillips-2022" /><ref>{{cite web |last=Lopez |first=Ule |date=January 16, 2022 |title=Voiceverse NFT Service Reportedly Uses Stolen Technology from 15ai |url=https://wccftech.com/voiceverse-nft-service-uses-stolen-technology-from-15ai/ 
|url-status=live |archive-url=https://web.archive.org/web/20220116194519/https://wccftech.com/voiceverse-nft-service-uses-stolen-technology-from-15ai/ |archive-date=January 16, 2022 |access-date=June 7, 2022 |website=Wccftech}}</ref> Voiceverse claimed that someone in their marketing team used the voice without properly crediting 15.ai; in response, 15 tweeted "Go fuck yourself."<ref name="Wright-2022" /><ref name="Phillips-2022" /><ref>{{Cite tweet |number=1482088782765576192 |user=fifteenai |title=Go fuck yourself. |date=January 14, 2022}}</ref> | |||
Available characters included ] and ] from '']'', characters from '']'', ] and other ] from '']'', ], ] and ] from '']'', the ], ] from '']'', the Narrator from '']'', ] from '']'', ], Dan from '']'', and ] from '']''.<ref name="Denfaminicogamer">{{cite web | |||
|url= https://news.denfaminicogamer.jp/news/210118f | |||
|title= 『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に | |||
|last= Yoshiyuki | |||
|first= Furushima | |||
|date= 2021-01-18 | |||
|website= Denfaminicogamer | |||
|access-date= 2021-01-18 | |||
|quote= | |||
|archive-date= 2021-01-18 | |||
|archive-url= https://web.archive.org/web/20210118051321/https://news.denfaminicogamer.jp/news/210118f | |||
|url-status= live | |||
}}</ref><ref name="automaton">{{cite web | |||
|url= https://automaton-media.com/articles/newsjp/20210119-149494/ | |||
|title= ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる | |||
|last= Kurosawa | |||
|first= Yuki | |||
|date= 2021-01-19 | |||
|website= ] | |||
|access-date= 2021-01-19 | |||
|quote= | |||
|archive-date= 2021-01-19 | |||
|archive-url= https://web.archive.org/web/20210119103031/https://automaton-media.com/articles/newsjp/20210119-149494/ | |||
|url-status= live | |||
}}</ref><ref name="LaPS4">{{cite web | |||
|url= https://www.laps4.com/noticias/descubre-15-ai-un-sitio-web-en-el-que-podras-hacer-que-glados-diga-lo-que-quieras/ | |||
|title= Descubre 15.AI, un sitio web en el que podrás hacer que GlaDOS diga lo que quieras | |||
|last= Villalobos | |||
|first= José | |||
|date= 2021-01-18 | |||
|website= ] | |||
|access-date= 2021-01-18 | |||
|quote= | |||
|archive-date= 2021-01-18 | |||
|archive-url= https://web.archive.org/web/20210118172043/https://www.laps4.com/noticias/descubre-15-ai-un-sitio-web-en-el-que-podras-hacer-que-glados-diga-lo-que-quieras/ | |||
|url-status= live | |||
}}</ref><ref name="yahoofin">{{cite web | |||
|url= https://es-us.finanzas.yahoo.com/noticias/15-ai-sitio-te-permite-152000712.html | |||
|title= 15.ai, el sitio que te permite usar voces de personajes populares para que digan lo que quieras | |||
|last= Moto | |||
|first= Eugenio | |||
|date= 2021-01-20 | |||
|website= ] | |||
|access-date= 2021-01-20 | |||
|quote= | |||
|archive-date= 2022-03-08 | |||
|archive-url= https://web.archive.org/web/20220308230836/https://es-us.finanzas.yahoo.com/noticias/15-ai-sitio-te-permite-152000712.html | |||
|url-status= live | |||
}}</ref> | |||
In September 2022, 15.ai was taken offline.<ref name="Play.ht-2024" /><ref name="ElevenLabs-2024" /> The developer claimed that this was due to legal issues surrounding ].<ref name="Twitter" /> | |||
The ] nature of the ] model ensured that each generation would have slightly different intonations, similar to multiple takes from a ].<ref name="hashdork"/><ref name="automaton"/> The application supported manually altering the ] of a generated line using ''emotional contextualizers'' (a term coined by this project), a sentence or phrase conveying the emotion of the take that serves as a guide for the model during inference.<ref name="automaton"/><ref name="Denfaminicogamer"/> | |||
Emotional contextualizers were representations of the emotional content of a sentence deduced via ] ] ] using ], a deep neural network ] algorithm developed by the ] in 2017.<ref>{{cite book |last=Felbo |first=Bjarke |arxiv=1708.00524 |title=Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing|chapter=Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm |date=2017 |pages=1615–1625 |doi=10.18653/v1/D17-1169 |s2cid=2493033 }}</ref><ref>{{cite web | |||
|url= https://www.theregister.com/2017/08/07/sarcasm_detector_bot_mit/ | |||
|title= A sarcasm detector bot? That sounds absolutely brilliant. Definitely | |||
|last= Corfield | |||
|first= Gareth | |||
|date= 2017-08-07 | |||
|website= ] | |||
|access-date= 2022-06-02 | |||
|archive-date= 2022-06-02 | |||
|archive-url= https://web.archive.org/web/20220602215737/https://www.theregister.com/2017/08/07/sarcasm_detector_bot_mit/ | |||
|url-status= live | |||
}}</ref> DeepMoji was trained on 1.2 billion emoji occurrences in ] data from 2013 to 2017, and outperformed human subjects in correctly identifying sarcasm in Tweets and other online modes of communication.<ref>{{cite web | |||
|url= https://www.technologyreview.com/2017/08/03/105566/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/ | |||
|title= An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter | |||
|last= | |||
|first= | |||
|date= 2017-08-03 | |||
|website= ] | |||
|access-date= 2022-06-02 | |||
|archive-date= 2022-06-02 | |||
|archive-url= https://web.archive.org/web/20220602215737/https://www.technologyreview.com/2017/08/03/105566/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/ | |||
|url-status= live | |||
}}</ref><ref>{{cite web | |||
|url= https://www.bbc.com/news/technology-40850171 | |||
|title= Emojis help software spot emotion and sarcasm | |||
|last= | |||
|first= | |||
|date= 2017-08-07 | |||
|website= ] | |||
|access-date= 2022-06-02 | |||
|archive-date= 2022-06-02 | |||
|archive-url= https://web.archive.org/web/20220602215735/https://www.bbc.com/news/technology-40850171 | |||
|url-status= live | |||
}}</ref><ref>{{cite web | |||
|url= https://www.newsweek.com/emoji-computer-sarcasm-emotion-training-hate-speech-647474 | |||
|title= Emoji-Filled Mean Tweets Help Scientists Create Sarcasm-Detecting Bot That Could Uncover Hate Speech | |||
|last= Lowe | |||
|first= Josh | |||
|date= 2017-08-07 | |||
|website= ] | |||
|access-date= 2022-06-02 | |||
|archive-date= 2022-06-02 | |||
|archive-url= https://web.archive.org/web/20220602215735/https://www.newsweek.com/emoji-computer-sarcasm-emotion-training-hate-speech-647474 | |||
|url-status= live | |||
}}</ref> | |||
== Features == | |||
15.ai used a ''multi-speaker model''—hundreds of voices were trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to that context.<ref name="arxivmello">{{cite arXiv |last=Valle |first=Rafael |eprint=1910.11997 |title=Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens |class=eess |date=2020 }}</ref> Consequently, the characters in the application were powered by a single trained model, as opposed to multiple single-speaker models.<ref name="arxivmulti">{{cite arXiv |last=Cooper |first=Erica |eprint=1910.10838 |title=Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings |class=eess |date=2020 }}</ref> The ] used by 15.ai was scraped from a variety of Internet sources, including ], ], the ], ], ], and ]. Pronunciations of unfamiliar words were automatically deduced using ]s learned by the deep learning model.<ref name="automaton"/> | |||
The platform was non-commercial,<ref name="Williams-2022" /> and operated without requiring user registration or accounts.<ref name="Phillips-2022">{{cite web |last=Phillips |first=Tom |date=January 17, 2022 |title=Troy Baker-backed NFT firm admits using voice lines taken from another service without permission |url=https://www.eurogamer.net/articles/2022-01-17-troy-baker-backed-nft-firm-admits-using-voice-lines-taken-from-another-service-without-permission |url-status=live |archive-url=https://web.archive.org/web/20220117164033/https://www.eurogamer.net/articles/2022-01-17-troy-baker-backed-nft-firm-admits-using-voice-lines-taken-from-another-service-without-permission |archive-date=January 17, 2022 |access-date=December 18, 2024 |website=] |quote=}}</ref> Users generated speech by inputting text and selecting a character voice, with optional parameters for emotional contextualizers and phonetic transcriptions. Each request produced three audio variations with distinct emotional deliveries.<ref name="Chandraseta-2021" /> Characters available included multiple characters from '']'' and '']''; ] and ] from the '']'' series; ]; ] from '']''; ] and ] from ]; ] from '']''; ] from '']''; ] from '']''; the ]; the Narrator from '']''; and ] from '']''.<ref name="Zwiezen-2021" /><ref name="Clayton-2021" /><ref name="Morton-2021" /><ref name="Ruppert-2021" /> Certain "silent" characters like ] and ] were able to be selected as a joke, and would emit silent audio files when any text was submitted.<ref name="Morton-2021" /><ref name="UDN-2021" /> | |||
The deep learning model's nondeterministic properties produced variations in speech output, creating different intonations with each generation, similar to how ] produce different takes.<ref name="Yoshiyuki-2021" /> 15.ai introduced the concept of ''"emotional contextualizers,"'' which allowed users to specify the emotional tone of generated speech through guiding phrases.<ref name="Kurosawa-2021" /><ref name="Chandraseta-2021" /> The emotional contextualizer functionality utilized DeepMoji, a sentiment analysis neural network developed at the ].<ref name="Kurosawa-2021" /><ref name="Chandraseta-2021" /> Introduced in 2017, DeepMoji processed ] embeddings from 1.2 billion Twitter posts (2013-2017) to analyze emotional content. Testing showed the system could identify emotional elements, including sarcasm, more accurately than human evaluators.<ref>{{cite web |last= |first= |date=August 3, 2017 |title=An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter |url=https://www.technologyreview.com/2017/08/03/105566/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/ |url-status=live |archive-url=https://web.archive.org/web/20220602215737/https://www.technologyreview.com/2017/08/03/105566/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/ |archive-date=June 2, 2022 |access-date=December 18, 2024 |website=]}}</ref> | |||
The application supported a simplified phonetic transcription known as ], to correct mispronunciations and account for ]—words that are spelled the same but are pronounced differently (such as the word ''read'', which can be pronounced as either {{IPAc-en|ˈ|r|ɛ|d}} or {{IPAc-en|ˈ|r|iː|d}} depending on its ]). It followed the ]'s ARPABET conventions.<ref name="automaton" /> | |||
{{clear}} | |||
The application provided support for a simplified version of ], a set of English phonetic transcriptions originally developed by the ] in the 1970s. This feature allowed users to correct mispronunciations or specify the desired pronunciation between ] – words that have the same spelling but have different pronunciations. Users could invoke ARPABET transcriptions by enclosing the phoneme string in curly braces within the input box (for example, "{AA1 R P AH0 B EH2 T}" to specify the pronunciation of the word "ARPABET" ({{IPAc-en|ˈ|ɑːr|p|ə|ˌ|b|ɛ|t}} {{respell|AR|pə|beht}}).<ref name="equestriacn" /><ref name="Kurosawa-2021" /> The interface displayed parsed words with color-coding to indicate pronunciation certainty: green for words found in the existing pronunciation lookup table, blue for manually entered ARPAbet pronunciations, and red for words where the pronunciation had to be algorithmically predicted.<ref name="equestriacn" /> | |||
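The curly-brace syntax and the three-color scheme described above can be illustrated with a short Python sketch. It is not 15.ai's code: the lookup table, the function name <code>classify</code> and the tokenizing regular expression are all assumptions made for illustration, and the sketch only labels a word as needing prediction rather than predicting its pronunciation.

```python
import re

# Hypothetical stand-in for the site's pronunciation lookup table.
LOOKUP = {"hello": "HH AH0 L OW1", "world": "W ER1 L D"}

# Either a {curly-braced ARPABET string} or a plain word.
TOKEN = re.compile(r"\{([^{}]+)\}|([A-Za-z']+)")


def classify(text, lookup=LOOKUP):
    """Return (token, phonemes_or_None, color) for each word in text."""
    result = []
    for match in TOKEN.finditer(text):
        manual, word = match.group(1), match.group(2)
        if manual is not None:
            # Curly braces mark a manually entered ARPABET string.
            result.append(("{" + manual + "}", manual.strip(), "blue"))
        elif word.lower() in lookup:
            # Word found in the existing lookup table.
            result.append((word, lookup[word.lower()], "green"))
        else:
            # Pronunciation would have to be predicted algorithmically.
            result.append((word, None, "red"))
    return result


for token, phonemes, color in classify("Hello {AA1 R P AH0 B EH2 T} zyzzyva"):
    print(color, token, phonemes)
```

In this sketch a manual transcription always takes precedence over the lookup table, which matches the stated purpose of the feature (overriding a default pronunciation), although the source does not describe the actual order of precedence.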
== Background == | |||
=== Artificial intelligence in speech synthesis === | |||
{{Main|Deep learning speech synthesis}} | |||
{{See also|Audio deepfake}} | |||
]'s ].<ref name="deepmind" />]] | |||
In 2016, with the proposal of ]'s ], deep-learning-based models for speech synthesis began to gain popularity as a method of modeling waveforms and generating high-fidelity human-like speech.<ref name="arxiv1">{{cite arXiv |last=Hsu |first=Wei-Ning |eprint=1810.07217 |title=Hierarchical Generative Modeling for Controllable Speech Synthesis |class=cs.CL |date=2018 }}</ref><ref name="arxiv2">{{cite arXiv |last=Habib |first=Raza |eprint=1910.01709 |title=Semi-Supervised Generative Modeling for Controllable Speech Synthesis |class=cs.CL |date=2019 }}</ref><ref name="deepmind">{{cite web|url=https://www.deepmind.com/blog/high-fidelity-speech-synthesis-with-wavenet|title=High-fidelity speech synthesis with WaveNet|last1=van den Oord|first1=Aäron|last2=Li|first2=Yazhe|last3=Babuschkin|first3=Igor|date=2017-11-12|website=]|access-date=2022-06-05|archive-date=2022-06-18|archive-url=https://web.archive.org/web/20220618205838/https://www.deepmind.com/blog/high-fidelity-speech-synthesis-with-wavenet|url-status=live}}</ref> Tacotron2, a neural network architecture for speech synthesis developed by ], was published in 2018 and required tens of hours of audio data to produce intelligible speech; when trained on 2 hours of speech, the model was able to produce intelligible speech with mediocre quality, and when trained on 36 minutes of speech, the model was unable to produce intelligible speech.<ref name="tacotron">{{cite web|url=https://google.github.io/tacotron/publications/semisupervised/index.html|title=Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"|date=2018-08-30|access-date=2022-06-05|archive-date=2020-11-11|archive-url=https://web.archive.org/web/20201111222714/https://google.github.io/tacotron/publications/semisupervised/index.html|url-status=live}}</ref><ref name="arxiv3">{{cite arXiv |eprint=1712.05884 |title=Natural TTS Synthesis by Conditioning WaveNet on Mel-Spectrogram Predictions |class=cs.CL |date=2018 
|last1=Shen |first1=Jonathan |last2=Pang |first2=Ruoming |last3=Weiss |first3=Ron J. |last4=Schuster |first4=Mike |last5=Jaitly |first5=Navdeep |last6=Yang |first6=Zongheng |last7=Chen |first7=Zhifeng |last8=Zhang |first8=Yu |last9=Wang |first9=Yuxuan |last10=Skerry-Ryan |first10=RJ |last11=Saurous |first11=Rif A. |last12=Agiomyrgiannakis |first12=Yannis |last13=Wu |first13=Yonghui }}</ref> | |||
Later versions of 15.ai introduced multi-speaker capabilities. Rather than training separate models for each voice, 15.ai used a unified model that learned multiple voices simultaneously through speaker ]–learned numerical representations that captured each character's unique vocal characteristics.<ref name="Twitter" /> Along with the emotional context conferred by DeepMoji, this neural network architecture enabled the model to learn shared patterns across different characters' emotional expressions and speaking styles, even when individual characters lacked examples of certain emotional contexts in their training data.<ref name="Kurosawa-2021" /> | |||
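The idea of one shared model conditioned on per-speaker embeddings can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the architecture used by 15.ai: the embedding size, the function names and the choice to concatenate the embedding onto each feature vector are all hypothetical, and the embeddings here are random rather than learned.

```python
import random

EMBED_DIM = 4  # illustrative size, not taken from the source


def make_speaker_table(names, dim=EMBED_DIM, seed=0):
    """Build one embedding vector per speaker.

    In a real system these vectors would be learned during training;
    here they are randomly initialized placeholders.
    """
    rng = random.Random(seed)
    return {name: [rng.uniform(-1, 1) for _ in range(dim)] for name in names}


def condition(text_features, speaker, table):
    """Append the speaker's embedding to every per-token feature vector.

    The same downstream network can then serve all speakers, because
    the speaker identity travels with the input.
    """
    embedding = table[speaker]
    return [frame + embedding for frame in text_features]


table = make_speaker_table(["speaker_a", "speaker_b"])
features = [[0.1, 0.2], [0.3, 0.4]]  # stand-in for encoded text
conditioned = condition(features, "speaker_a", table)
print(len(conditioned), len(conditioned[0]))
```

Switching the voice only changes which embedding is looked up; the text features and the shared model stay the same, which is the property the paragraph above attributes to a multi-speaker design.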
For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of scientific researchers in the field of deep learning speech synthesis.<ref>{{cite arXiv |last=Chung |first=Yu-An |eprint=1808.10128 |title=Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis |class=cs.CL |date=2018 }}</ref><ref>{{cite arXiv |last=Ren |first=Yi |eprint=1905.06791 |title=Almost Unsupervised Text to Speech and Automatic Speech Recognition |class=cs.CL |date=2019 }}</ref> The developer of 15.ai claims that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.<ref name="eurogamer"/> | |||
The interface included technical metrics and graphs,<ref name="equestriacn">{{cite web|date=October 1, 2021|access-date=December 22, 2024|url=https://www.equestriacn.com/2021/10/15-ai-is-back-online-updated-to-v23.html|title=15.ai已经重新上线,版本更新至v23|trans-title=15.ai has been re-launched, version updated to v23|language=zh}}</ref> which, according to the developer, served to highlight the research aspect of the website.<ref name="Twitter" /> As of version v23, released in September 2021, the interface displayed comprehensive model analysis information, including word parsing results and emotional analysis data. The ] and ] (GAN) hybrid denoising function, introduced in an earlier version, was streamlined to remove manual parameter inputs.<ref name="equestriacn" /> | |||
=== Copyrighted material in deep learning === | |||
{{Main|Artificial intelligence and copyright}} | |||
] between ] and the ] in 2013 ruled that ]—a service that searches the full text of printed copyrighted books—was ], thus meeting all requirements for fair use.<ref>- F.2d – (2d Cir, 2015). (temporary cites: 2015 U.S. App. LEXIS 17988; | |||
{{Dead link|date=September 2024 |bot=InternetArchiveBot |fix-attempted=yes }} (October 16, 2015))</ref> This case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a ] or a ''non-commercial'' ] was deemed legal. The legality of ''commercial'' generative models trained using copyrighted material is still under debate; due to the black-box nature of machine learning models, any allegations of copyright infringement via direct competition would be difficult to prove.<ref>{{cite journal |last1=Li |first1=Y. |last2=Li |first2=J. |title=Does Black-Box Machine Learning Shift the US Fair Use Doctrine? |year=2021 |journal=SSRN |url=https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3998805 |doi=10.2139/ssrn.3998805 |doi-broken-date=November 29, 2024 |ssrn=3998805 |access-date=November 18, 2024 |archive-date=January 25, 2022 |archive-url=https://web.archive.org/web/20220125142022/https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3998805 |url-status=live }}</ref> | |||
== Reception and legacy ==
Critics described 15.ai as easy to use and generally able to convincingly replicate character voices, with occasional mixed results.<ref name="Clayton-2021" /><ref>{{cite web |last1=Moto |first1=Eugenio |title=15.ai, el sitio que te permite usar voces de personajes populares para que digan lo que quieras |url=https://www.qore.com/noticias/78756/15ai-el-sitio-que-te-permite-usar-voces-de-personajes-populares-para-que-digan-lo-que-quieras/pagina/1/1000 |website=Qore |access-date=21 December 2024 |language=es |date=20 January 2021 |quote=Si bien los resultados ya son excepcionales, sin duda pueden mejorar más}}</ref><ref>{{cite web |last=Scotellaro |first=Shaun |date=March 4, 2020 |title=Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices |url=https://www.equestriadaily.com/2020/03/neat-pony-preservation-project-using.html |url-status=live |access-date=December 18, 2024 |website=] |archive-date=June 23, 2021 |archive-url=https://web.archive.org/web/20210623210048/https://www.equestriadaily.com/2020/03/neat-pony-preservation-project-using.html}}</ref><ref name="Ruppert-2021" /><ref>{{cite web |last=Villalobos |first=José |date=January 18, 2021 |title=Descubre 15.AI, un sitio web en el que podrás hacer que GlaDOS diga lo que quieras |trans-title=Discover 15.AI, a Website Where You Can Make GlaDOS Say What You Want |url=https://www.laps4.com/noticias/descubre-15-ai-un-sitio-web-en-el-que-podras-hacer-que-glados-diga-lo-que-quieras/ |url-status=live |archive-url=https://web.archive.org/web/20210118172043/https://www.laps4.com/noticias/descubre-15-ai-un-sitio-web-en-el-que-podras-hacer-que-glados-diga-lo-que-quieras/ |archive-date=January 18, 2021 |access-date=January 18, 2021 |website=LaPS4 |language=es |quote=La dirección es 15.AI y funciona tan fácil como parece. 
|trans-quote=The address is 15.AI and it works as easy as it looks.}}</ref><ref name="do Prado-2024">{{cite web|url=https://arkade.com.br/faca-glados-bob-esponja-e-outros-personagens-falarem-textos-escritos-por-voce/|trans-title=Make GLaDOS, SpongeBob and other characters speak texts written by you!|last=do Prado|first=Renan|website=Arkade|access-date=December 22, 2024|date=January 19, 2021|language=pt-br|title=Faça GLaDOS, Bob Esponja e outros personagens falarem textos escritos por você!}}</ref> Natalie Clayton of '']'' wrote that ]' voice was replicated well, but noted challenges in mimicking '']''{{'}}s narrator: "the algorithm simply can't capture ]'s whimsically droll intonation."<ref name="Clayton-2021" /> Zack Zwiezen of '']'' reported that " girlfriend was convinced it was a new voice line from ]' voice actor, ]".<ref name="Zwiezen-2021" /> Taiwanese newspaper '']'' also highlighted 15.ai's ability to recreate GLaDOS's mechanical voice, alongside its diverse range of character voice options.<ref name="UDN-2021" /> ''] Taiwan'' reported that "GLaDOS in ''Portal'' can pronounce lines nearly perfectly", but also criticized that "there are still many imperfections, such as word limit and tone control, which are still a little weird in some words."<ref name="anything">{{cite web| url=https://tw.news.yahoo.com/15-ai-044220764.html|date=January 19, 2021|access-date=December 22, 2024|title=讓你喜愛的ACG角色說出任何話! AI生成技術幫助你實現夢想|trans-title=Let your favorite ACG characters say anything! AI generation technology helps you realize your dreams|language=zh |quote=大家是否都曾經想像過,假如能讓自己喜歡的遊戲或是動畫角色說出自己想聽的話,不論是名字、惡搞或是經典名言,都是不少人的夢想吧。不過來到 2021 年,現在這種夢想不再是想想而已,因為有一個網站通過 AI 生成的技術,讓大家可以讓不少遊戲或是動畫角色,說出任何你想要他們講出的東西,而且相似度與音調都有相當高的準確度 |trans-quote=Have you ever imagined what it would be like if your favorite game or anime characters could say exactly what you want to hear? Whether it's names, parodies, or classic quotes, this is a dream for many. 
However, as we enter 2021, this dream is no longer just a fantasy, because there is a website that uses AI-generated technology, allowing users to make various game and anime characters say anything they want with impressive accuracy in both similarity and tone.}}</ref> | |||
15.ai was designed and created by an anonymous research scientist known by the alias ''15''.<ref name="automaton"/><ref name="elevenlabs"/><ref name="resemble"/> In his blog '']'', economist ] cited the developer of 15.ai as an example of underrated talent in AI.<ref name="marginalrevolution">{{cite web | |||
|url= https://marginalrevolution.com/marginalrevolution/2022/05/the-most-underrated-talent-in-ai.html | |||
|title= The most underrated talent in AI? | |||
|last= Cowen | |||
|first= Tyler | |||
|date= 2022-05-12 | |||
|website= ] | |||
|access-date= 2024-11-27 | |||
|url-status= live | |||
|archive-date= 2022-06-19 | |||
|archive-url= https://web.archive.org/web/20220619203626/https://marginalrevolution.com/marginalrevolution/2022/05/the-most-underrated-talent-in-ai.html | |||
}}</ref> | |||
Multiple other critics also found the character limit and the prosody options not entirely satisfactory.<ref name="GamerSky-2021" /><ref name="anything" /> Peter Paltridge of ] and ] news outlet ''Anime Superhero'' opined that "voice synthesis has evolved to the point where the more expensive efforts are nearly indistinguishable from actual human speech," but also noted that "In some ways, ] is still more advanced than this. It was possible to affect SAM’s inflections by using special characters, as well as change his pitch at will. With 15.ai, you’re at the mercy of whatever random inflections you get."<ref>{{cite web|last=Paltridge|first=Peter|url=https://animesuperhero.com/this-website-will-say-whatever-you-type-in-spongebobs-voice/|title=This Website Will Say Whatever You Type In Spongebob's Voice|access-date=December 22, 2024|date=January 18, 2021}}</ref> Conversely, Lauren Morton of '']'' praised the depth of pronunciation control, "if you're willing to get into the nitty gritty of it".<ref name="Morton-2021" /> Takayuki Furushima of '']'' highlighted the "smooth pronunciations", and Yuki Kurosawa of '']'' noted its "rich emotional expression" as a major feature; both Japanese authors noted the lack of Japanese-language support.<ref name="Yoshiyuki-2021" /><ref name="Kurosawa-2021" /> Renan do Prado of the Brazilian gaming news outlet ''Arkade'' pointed out that users could create amusing results in ], although generation primarily performed well in English.<ref name="do Prado-2024" />
Developing and running 15.ai cost several thousands of dollars per month, initially funded by the developer's personal finances after a successful ] exit.<ref name="play.ht"/> The algorithm used by the project was dubbed '''DeepThroat.'''<ref name="play.ht"/><ref name="15aiabout">{{cite web |last= |first= |date=2022-02-20 |title=15.ai – About |url=https://15.ai/about |url-status=dead |archive-url=https://archive.today/20211006074716/https://15.ai/about |archive-date=2021-10-06 |access-date=2022-02-20 |website=15.ai |publisher= |quote=}}</ref> The project and algorithm were conceived as part of MIT's ], and had been in development since 2018.<ref name="thebatch"/><ref name="automaton"/><ref>{{cite web | |||
|url=https://www.byteside.com/2021/01/15-ai-deepmoji-glados-spongebob-characters-ai-text-to-speech/ | |||
|title=Make GLaDOS, SpongeBob and other friends say what you want with this AI text-to-speech tool | |||
|last=Button | |||
|first=Chris | |||
|date=2021-01-19 | |||
|website=Byteside | |||
|access-date=2024-11-18 | |||
|url-status=live | |||
|archive-date=June 25, 2024 | |||
|archive-url=https://web.archive.org/web/20240625180514/https://www.byteside.com/2021/01/15-ai-deepmoji-glados-spongebob-characters-ai-text-to-speech/ | |||
}}</ref> The model used by 15.ai was inspired by a 2019 paper that introduced ] to text-to-speech models.<ref name="thebatch"/><ref>{{cite book |last=Jia |first=Ye |arxiv=1806.04558 |title=1806.04558|date=2019 }}</ref> | |||
South Korean video game outlet ''Zuntata'' wrote that "the surprising thing about 15.ai is that , there's only about 30 seconds of data, but it achieves pronunciation accuracy close to 100%".<ref>{{cite web |date=January 20, 2021 |title=게임 캐릭터 음성으로 영어를 읽어주는 소프트 15.ai 공개. |trans-title=Software 15.ai Released That Reads English in Game Character Voices |url=https://zuntata.tistory.com/7283 |access-date=December 18, 2024 |website=] |language=ko |quote= |trans-quote=}}</ref> Machine learning professor Yongqiang Li wrote in his blog that he was surprised to see that the application was free.<ref>{{cite web |last=Li |first=Yongqiang |title=语音开源项目优选:免费配音网站15.ai |trans-title=Voice Open Source Project Selection: Free Voice Acting Website 15.ai |url=https://zhuanlan.zhihu.com/p/346417192 |access-date=December 18, 2024 |website=] |language=zh |quote= |trans-quote=}}</ref> | |||
]'s /mlp/ board has been integral to the development of 15.ai.<ref name="gwern">{{cite journal |last=Branwen |first=Gwern |date=2020-03-06 |title="15.ai", 15, Pony Preservation Project |url=https://www.gwern.net/docs/ai/music/index#15-project-2020-section |url-status=live |publisher=Gwern |archive-url=https://web.archive.org/web/20220318160737/https://www.gwern.net/docs/ai/music/index#15-project-2020-section |archive-date=2022-03-18 |access-date=2022-06-17 |website=Gwern.net}}</ref>]] | |||
The developer also worked closely with the Pony Preservation Project from /mlp/, the '']'' ] of ].<ref name="play.ht"/> This project was a "collaborative effort by /mlp/ to build and curate pony datasets" with the aim of creating applications in artificial intelligence.<ref>{{cite web | |||
|url= https://www.equestriadaily.com/2020/03/neat-pony-preservation-project-using.html | |||
|title= Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices | |||
|last= Scotellaro | |||
|first= Shaun | |||
|date= 2020-03-14 | |||
|website= ] | |||
|access-date= 2022-06-11 | |||
|archive-date= 2021-06-23 | |||
|url-status= live | |||
|archive-url= https://web.archive.org/web/20210623210048/https://www.equestriadaily.com/2020/03/neat-pony-preservation-project-using.html | |||
}}</ref><ref name="ppp"> | |||
{{cite web | |||
|url= https://desuarchive.org/mlp/thread/38204261/ | |||
|title= Pony Preservation Project (Thread 108) | |||
|last= | |||
|first= | |||
|date= 2022-02-20 | |||
|website= ] | |||
|publisher= Desuarchive | |||
|access-date= 2022-02-20 | |||
|quote= }}</ref> The ''Friendship Is Magic'' voices on 15.ai were trained on a large dataset ]d by the project: audio and dialogue from the show and related media<ref name="play.ht"/>—including ], ], ], ], and various other content voiced by the same voice actors—were ], ], and ] to remove background noise. | |||
15.ai was an early pioneer of audio deepfakes, leading to the emergence of AI speech synthesis-based memes.<ref name="anything" /><ref>{{cite web |last=VK |first=Anirudh |date=March 18, 2023 |title=Deepfakes Are Elevating Meme Culture, But At What Cost? |url=https://analyticsindiamag.com/ai-origins-evolution/deepfakes-are-elevating-meme-culture-but-at-what-cost/ |access-date=December 18, 2024 |website=Analytics India Magazine |quote="While AI voice memes have been around in some form since '15.ai' launched in 2020, "}}</ref> Its influence has been noted in the years after it became defunct,<ref>{{cite web |last=Wright |first=Steven |date=March 21, 2023 |title=Why Biden, Trump, and Obama Arguing Over Video Games Is YouTube's New Obsession |url=https://www.inverse.com/gaming/youtube-ai-presidential-gaming-debates |url-status=live |access-date=December 18, 2024 |website=] |quote="AI voice tools used to create "audio deepfakes" have existed for years in one form or another, with 15.ai being a notable example." |archive-date=December 20, 2024 |archive-url=https://web.archive.org/web/20241220012854/https://www.inverse.com/gaming/youtube-ai-presidential-gaming-debates}}</ref> and since then, several commercial alternatives emerged, such as ]{{efn|which uses "11.ai" as a legal byname for its web domain<ref>{{cite web |title=Can I publish the content I generate on the platform? 
|url=https://help.elevenlabs.io/hc/en-us/articles/13313564601361-Can-I-publish-the-content-I-generate-on-the-platform |website=ElevenLabs |access-date=23 December 2024 |date=8 May 2024 |type=Official website}}</ref>}} and ].<ref name="ElevenLabs-2024">{{cite web |date=February 7, 2024 |title=15.AI: Everything You Need to Know & Best Alternatives |url=https://elevenlabs.io/blog/15-ai |url-status=live |access-date=December 18, 2024 |website=] |quote= |archive-date=July 15, 2024 |archive-url=https://web.archive.org/web/20240715151316/https://elevenlabs.io/blog/15-ai}}</ref><ref name="Play.ht-2024">{{cite web |date=September 12, 2024 |title=Everything You Need to Know About 15.ai: The AI Voice Generator |url=https://play.ht/blog/15-ai/ |access-date=December 18, 2024 |website=Play.ht |quote=}}</ref> The original claim that only 15 seconds of data is required to clone a human's voice was corroborated by ] in 2024.<ref>{{cite web |last= |first= |title=Navigating the Challenges and Opportunities of Synthetic Voices |url=https://openai.com/index/navigating-the-challenges-and-opportunities-of-synthetic-voices/ |website=] |url-status=live |date=March 9, 2024 |access-date=December 18, 2024 |archive-date=November 25, 2024 |archive-url=https://web.archive.org/web/20241125181327/https://openai.com/index/navigating-the-challenges-and-opportunities-of-synthetic-voices/}}</ref> | |||
The first public release of 15.ai was unveiled in March 2020, with the service experiencing intermittent availability as the developer conducted ongoing ] work.{{citation needed|date=December 2024}} The tool gained heavy attention in ] in early 2021, with multiple gaming news outlets covering its capabilities.<ref name="pcgamer"/><ref name="kotaku"/><ref name="gameinformer"/> 15.ai saw further attention in 2022 when it was discovered that a company that voice actor ] had partnered with plagiarized outputs from the tool.<ref name="nme"/> {{See below|Troy Baker / Voiceverse NFT plagiarism scandal}} | |||
In late 2022, 15.ai was taken offline. {{As of|November 2024}}, the website is still inactive. | |||
== Reception == | |||
15.ai was met with a largely positive reception from users and ]. Liana Ruppert of '']'' described it as "simplistically brilliant"<ref name="gameinformer"/> and José Villalobos of '']'' wrote that it "works as easy as it looks."<ref name="LaPS4"/>{{efn|Translated from original quote written in Spanish: ''"La dirección es 15.AI y funciona tan fácil como parece."''<ref name="LaPS4"/>}} Lauren Morton of '']'' called the tool "fascinating,"<ref name="rockpapershotgun"/> and Yuki Kurosawa of '']'' deemed it "revolutionary."<ref name="automaton"/>{{efn|Translated from original quote written in Japanese: ''"しかし15.aiが画期的なのは「データが30秒しかない文字でも、ほぼ100%の発音精度を達成できること」そして「ごくわずかなデータのみを使って、自然な感情のこもった音声を数百以上生成できること」だという。"''<ref name="automaton"/>}} Users praised the ability to easily create audio of popular characters that sound believable to those unaware they had been synthesized. Zack Zwiezen of '']'' reported that " girlfriend was convinced it was a new voice line from ]' voice actor, ]".<ref name="kotaku"/> Natalie Clayton of '']'' wrote that "]' shrill, nasally voice works shockingly well". | |||
The website's impact extended beyond English-speaking media. Furushima Yoshiyuki of '']'' wrote that "it's amazing that are all synthetically generated", and Eugenio Moto of '']'' reported that "while the results are already exceptional, they can certainly get better." | |||
== In popular culture == | |||
=== Fandom content creation === | |||
<!-- Deleted image removed: ] --> | |||
15.ai was frequently used for ] in various ]s, including the ], the '']'' fandom, the '']'' fandom, and the '']'' fandom, with numerous videos and projects containing speech from 15.ai having gone ].<ref name="kotaku" /><ref name="gameinformer" /> The platform is credited as the impetus behind the popularization of AI voice cloning in content creation, demonstrating the potential for accessible, high-quality voice synthesis technology.<ref name="play.ht"/> | |||
The ''My Little Pony: Friendship Is Magic'' fandom saw a resurgence in video and musical content creation as a result, inspiring a new genre of fan-created content assisted by artificial intelligence. Some ]s were adapted into fully voiced "episodes": ''The Tax Breaks'' is a 17-minute animated video rendition of a fan-written story published in 2014 that uses voices generated from 15.ai with ] and ], emulating the episodic style of the early seasons of ''Friendship Is Magic''.<ref name="taxbreaks">{{cite web
|url= https://www.equestriadaily.com/2022/05/full-simple-animated-episode-tax-breaks.html | |||
|title= Full Simple Animated Episode – The Tax Breaks (Twilight) | |||
|last= Scotellaro | |||
|first= Shaun | |||
|date= 2022-05-15 | |||
|website= ] | |||
|access-date= 2022-05-28 | |||
|quote= | |||
|archive-date= 2022-05-21 | |||
|url-status= live | |||
|archive-url= https://web.archive.org/web/20220521132423/https://www.equestriadaily.com/2022/05/full-simple-animated-episode-tax-breaks.html | |||
}}</ref><ref>{{Cite web |date=27 April 2014 |title=The Terribly Taxing Tribulations of Twilight Sparkle |url=https://www.fimfiction.net/story/185725 |url-status=live |archive-url=https://web.archive.org/web/20220630170105/https://www.fimfiction.net/story/185725 |archive-date=30 June 2022 |access-date=28 April 2022 |website=Fimfiction.net}}</ref> | |||
Viral videos from the ''Team Fortress 2'' fandom featuring voices from 15.ai include ''Spy is a ]'' (which gained over 3 million views on YouTube across multiple videos<ref group="yt">{{cite web|url=https://www.youtube.com/watch?v=TAmhr6Was3E|title=SPY IS A FURRY|work=]|date=January 17, 2021 |access-date=June 14, 2022|archive-date=June 13, 2022|archive-url=https://web.archive.org/web/20220613094918/https://www.youtube.com/watch?v=TAmhr6Was3E|url-status=live}}</ref><ref group="yt">{{cite web|url=https://www.youtube.com/watch?v=lwQn7ISVV_8|title=Spy is a Furry Animated|work=]|access-date=June 14, 2022|archive-date=June 14, 2022|archive-url=https://web.archive.org/web/20220614203255/https://www.youtube.com/watch?v=lwQn7ISVV_8|url-status=live}}</ref><ref group="yt">{{cite web|url=https://www.youtube.com/watch?v=r0FLyW86owo|title= – Spy's Confession – |work=]|date=January 15, 2021 |access-date=June 14, 2022|archive-date=June 30, 2022|archive-url=https://web.archive.org/web/20220630170113/https://www.youtube.com/watch?v=r0FLyW86owo|url-status=live}}</ref>) and ''The RED Bread Bank'', both of which inspired ] animated video renditions.<ref name="automaton"/> Other fandoms used voices from 15.ai to produce viral videos. 
{{As of|July 2022}}, the viral video ''] Struggles'' (with voices from ''Friendship Is Magic'') had over 5.5 million views on YouTube;<ref group="yt">{{cite web|url=https://www.youtube.com/watch?v=UPE3vnLY3TE|title=Among Us Struggles|work=]|date=September 21, 2020 |access-date=July 15, 2022}}</ref> ], ], and ] streamers also used 15.ai for their videos, such as FitMC's video on the history of ]—one of the oldest running '']'' servers—and datpon3's TikTok video featuring the main characters of ''Friendship Is Magic'', which have 1.4 million and 510 thousand views, respectively.<ref group="yt">{{cite web|url=https://www.youtube.com/watch?v=1V1O2gTdqHw|title=The UPDATED 2b2t Timeline (2010–2020)|work=]|date=March 14, 2020 |access-date=June 14, 2022|archive-date=June 1, 2022|archive-url=https://web.archive.org/web/20220601085855/https://www.youtube.com/watch?v=1V1O2gTdqHw|url-status=live}}</ref><ref group="tt">{{cite web|url=https://www.tiktok.com/@datpon3/video/6813618431217241350|title=She said " 👹 "|work=]|access-date=July 15, 2022|archive-date=February 21, 2022|archive-url=https://web.archive.org/web/20220221225053/https://www.tiktok.com/@datpon3/video/6813618431217241350|url-status=live}}</ref> | |||
Some users created AI ]s using 15.ai and external voice control software. One user on Twitter created a personal desktop assistant inspired by ] using 15.ai-generated dialogue in tandem with voice control system VoiceAttack.<ref name="automaton"/><ref name="Denfaminicogamer"/> | |||
=== Troy Baker / Voiceverse NFT plagiarism scandal === | |||
{{Main|Troy Baker#Partnership scandal}} | |||
On January 14, 2022, it was discovered that Voiceverse NFT, a company that video game and ] ] ] ] announced his partnership with, had plagiarized voice lines generated from 15.ai as part of its marketing campaign.<ref name="nme"/><ref name="stevivor"/> ] showed that Voiceverse had generated audio of characters from '']'' using 15.ai and pitched it up to make the voices unrecognizable, then used the result to market its own platform, in violation of 15.ai's terms of service.<ref name="eurogamer">{{cite web
|url= https://www.eurogamer.net/articles/2022-01-17-troy-baker-backed-nft-firm-admits-using-voice-lines-taken-from-another-service-without-permission | |||
|title= Troy Baker-backed NFT firm admits using voice lines taken from another service without permission | |||
|last= Phillips | |||
|first= Tom | |||
|date= 2022-01-17 | |||
|website= ] | |||
|access-date= 2022-01-17 | |||
|quote= | |||
|archive-date= 2022-01-17 | |||
|archive-url= https://web.archive.org/web/20220117164033/https://www.eurogamer.net/articles/2022-01-17-troy-baker-backed-nft-firm-admits-using-voice-lines-taken-from-another-service-without-permission | |||
|url-status= live | |||
}}</ref><ref name="wccftech">{{cite web | |||
|url= https://wccftech.com/voiceverse-nft-service-uses-stolen-technology-from-15ai/ | |||
|title= Troy Baker-backed NFT firm admits using voice lines taken from another service without permission | |||
|last= Lopez | |||
|first= Ule | |||
|date= 2022-01-16 | |||
|website= Wccftech | |||
|access-date= 2022-06-07 | |||
|url-status= live | |||
|archive-date= 2022-01-16 | |||
|archive-url= https://web.archive.org/web/20220116194519/https://wccftech.com/voiceverse-nft-service-uses-stolen-technology-from-15ai/ | |||
}}</ref> Voiceverse claimed that someone in their marketing team used the voice without properly crediting 15.ai, and in response, 15 tweeted "Go fuck yourself."<ref name="nme" /><ref name="stevivor"/><ref name="eurogamer"/><ref group="tweet">{{Cite tweet |user=fifteenai |number=1482088782765576192|date = January 14, 2022 |title=Go fuck yourself.}}</ref> | |||
== Legacy ==
===Impact on voice cloning technology===
15.ai introduced several technical innovations in ].<ref name="automaton"/> While traditional text-to-speech systems like Google's Tacotron2 required tens of hours of audio data to produce intelligible speech in 2017,<ref name="tacotron"/><ref name="arxiv3"/> 15.ai claimed to achieve high-quality voice cloning with as little as 15 seconds of training data.<ref name="eurogamer"/><ref name="play.ht"/> This reduction in required training data represented a breakthrough in the field of speech synthesis.<ref name="hashdork"/><ref name="play.ht"/>
The project also introduced the concept of "emotional contextualizers" for controlling speech emotion through ].<ref name="automaton"/><ref name="Denfaminicogamer"/><ref name="hashdork"/>
===Reactions from voice actors and pundits===
Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about ], ], ], unauthorized use of an actor's voice in ] or ], and the potential of ].<ref name="elevenlabs"/><ref name="wccftech"/><ref name="play.ht"/><ref name="hashdork"/>
In his 2020 assessment of 15.ai in the DeepLearning.AI newsletter ''The Batch'', computer scientist Andrew Ng wrote:
{{Quote|"Voice cloning could be enormously productive. In ], it could revolutionize the use of virtual actors. In cartoons and audiobooks, it could enable voice actors to participate in many more productions. In online education, kids might pay more attention to lessons delivered by the voices of favorite personalities. And how many YouTube how-to video producers would love to have a synthetic ] narrate their scripts?<ref name="thebatch"/>}}
However, he also wrote:
{{Quote|"...but synthesizing a human actor's voice without consent is arguably unethical and possibly illegal. And this technology will be catnip for deepfakers, who could scrape recordings from ]s to impersonate private individuals."<ref name="thebatch"/>}}
== See also ==
== Notes ==
{{notelist}}
== References ==
;Notes
{{reflist}}
;Tweets
{{reflist|group=tweet|35em}}
;YouTube (referenced for view counts and usage of 15.ai only)
{{reflist|group=yt|35em}}
;TikTok
{{reflist|group=tt|35em}}
==External links==
* {{Official website|15.ai}}
* {{Twitter | id= fifteenai | name= 15 }}
{{Differentiable computing}}
{{Speech synthesis}}
{{My Little Pony: Friendship Is Magic}}
Latest revision as of 17:49, 23 December 2024
Real-time text-to-speech AI tool
Type of site | Artificial intelligence, speech synthesis |
---|---|
Available in | English |
Founder(s) | 15 |
URL | 15.ai |
Commercial | No |
Registration | None |
Launched | March 2020; 4 years ago (2020-03) |
Current status | Inactive |
15.ai was a free non-commercial web application that used artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Conceived by an artificial intelligence researcher known as "15" during their time at the Massachusetts Institute of Technology and developed following their successful exit from a startup venture, the application allowed users to make characters from various media speak custom text with emotional inflections faster than real-time.
Launched in March 2020, the service gained widespread attention in early 2021 when it went viral on social media platforms like YouTube and Twitter, and quickly became popular among Internet fandoms, including the My Little Pony: Friendship Is Magic, Team Fortress 2, and SpongeBob SquarePants fandoms. The website had a role in the emergence of AI voice cloning (audio deepfake) memes.
In January 2022, Voiceverse NFT sparked controversy when it was discovered that the company, which had partnered with voice actor Troy Baker, had misappropriated 15.ai's work for their own platform. The service was ultimately taken offline in September 2022. Its shutdown led to the emergence of various commercial alternatives in subsequent years.
History
15.ai was conceived in 2016 as a research project in deep learning speech synthesis by a developer known as "15" during their undergraduate studies at the Massachusetts Institute of Technology (MIT). The developer was inspired by DeepMind's WaveNet paper, with development continuing through their studies as Google AI released Tacotron the following year. The name 15 is a reference to the creator's claim that a voice can be cloned with as little as 15 seconds of data. 15.ai was released in March 2020. More voices were added to the website in the following months.
In early 2021, the application went viral on Twitter and YouTube, with people generating skits, memes, and fan content using voices from popular games and shows. Viral videos made with 15.ai included recreations of the popular Source Filmmaker video Heavy is Dead, The RED Bread Bank, and Among Us Struggles, which have amassed millions of views on social media. Content creators, YouTubers, and TikTokers have also used 15.ai for voiceovers in their videos. According to the developer, at its peak, the platform incurred operational costs of US$12,000 per month from AWS infrastructure needed to handle millions of daily voice generations. They funded the website through their previous startup earnings.
On January 14, 2022, a controversy ensued after it was discovered that Voiceverse NFT, a company with which video game and anime dub voice actor Troy Baker had announced a partnership, had misappropriated voice lines generated from 15.ai as part of its marketing campaign. Log files showed that Voiceverse had generated audio of characters from My Little Pony: Friendship Is Magic using 15.ai and pitched it up so that it was unrecognizable as the original voices, then used it to market their own platform, in violation of 15.ai's terms of service. Voiceverse claimed that someone in their marketing team used the voice without properly crediting 15.ai; in response, 15 tweeted "Go fuck yourself."
In September 2022, 15.ai was taken offline. The developer claimed that this was due to legal issues surrounding artificial intelligence and copyright.
Features
The platform was non-commercial and operated without requiring user registration or accounts. Users generated speech by inputting text and selecting a character voice, with optional parameters for emotional contextualizers and phonetic transcriptions. Each request produced three audio variations with distinct emotional deliveries. Characters available included multiple characters from Team Fortress 2 and My Little Pony: Friendship Is Magic; GLaDOS and Wheatley from the Portal series; SpongeBob SquarePants; Rise Kujikawa from Persona 4; Daria Morgendorffer and Jane Lane from Daria; Carl Brutananadilewski from Aqua Teen Hunger Force; Steven Universe from Steven Universe; Sans from Undertale; the Tenth Doctor from Doctor Who; the Narrator from The Stanley Parable; and HAL 9000 from 2001: A Space Odyssey. Certain "silent" characters like Chell and Gordon Freeman could be selected as a joke, and would emit silent audio files when any text was submitted.
The deep learning model's nondeterministic properties produced variations in speech output, creating different intonations with each generation, similar to how voice actors produce different takes. 15.ai introduced the concept of "emotional contextualizers," which allowed users to specify the emotional tone of generated speech through guiding phrases. The emotional contextualizer functionality utilized DeepMoji, a sentiment analysis neural network developed at the MIT Media Lab. Introduced in 2017, DeepMoji processed emoji embeddings from 1.2 billion Twitter posts (2013-2017) to analyze emotional content. Testing showed the system could identify emotional elements, including sarcasm, more accurately than human evaluators.
The application provided support for a simplified version of ARPABET, a set of English phonetic transcriptions originally developed by the Advanced Research Projects Agency in the 1970s. This feature allowed users to correct mispronunciations or specify the desired pronunciation between heteronyms – words that have the same spelling but have different pronunciations. Users could invoke ARPABET transcriptions by enclosing the phoneme string in curly braces within the input box (for example, "{AA1 R P AH0 B EH2 T}" to specify the pronunciation of the word "ARPABET", /ˈɑːrpəˌbɛt/ AR-pə-beht). The interface displayed parsed words with color-coding to indicate pronunciation certainty: green for words found in the existing pronunciation lookup table, blue for manually entered ARPABET pronunciations, and red for words where the pronunciation had to be algorithmically predicted.
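The curly-brace syntax and color-coding described above can be illustrated with a minimal sketch. This is not 15.ai's code: the function name `classify`, the regular expression, and the two-entry `LOOKUP` table are hypothetical stand-ins chosen only to show how input text might be split into looked-up words, manual ARPABET blocks, and words left for prediction.

```python
import re

# Hypothetical miniature pronunciation table (assumption for illustration only;
# the contents of the real service's lookup table are not described in the sources).
LOOKUP = {
    "hello": "HH AH0 L OW1",
    "world": "W ER1 L D",
}

# Matches either a curly-brace phoneme block or a run of letters/apostrophes.
TOKEN_RE = re.compile(r"\{([^{}]*)\}|([A-Za-z']+)")


def classify(text):
    """Return (token, color) pairs mirroring the color-coding in the text:
    'green' = word found in the lookup table,
    'blue'  = manually entered ARPABET block,
    'red'   = word whose pronunciation would have to be predicted."""
    result = []
    for match in TOKEN_RE.finditer(text):
        phonemes, word = match.group(1), match.group(2)
        if phonemes is not None:
            # Normalize whitespace inside the braces.
            result.append((" ".join(phonemes.split()), "blue"))
        elif word.lower() in LOOKUP:
            result.append((word, "green"))
        else:
            result.append((word, "red"))
    return result


if __name__ == "__main__":
    for token, color in classify("Hello {AA1 R P AH0 B EH2 T} frobnicate"):
        print(f"{color:5s} {token}")
```

In this sketch, any word outside the table is simply flagged as red; how the actual service predicted such pronunciations is not covered here.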
Later versions of 15.ai introduced multi-speaker capabilities. Rather than training separate models for each voice, 15.ai used a unified model that learned multiple voices simultaneously through speaker embeddings, learned numerical representations that captured each character's unique vocal characteristics. Along with the emotional context conferred by DeepMoji, this neural network architecture enabled the model to learn shared patterns across different characters' emotional expressions and speaking styles, even when individual characters lacked examples of certain emotional contexts in their training data.
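As a rough illustration of the speaker-embedding idea, the sketch below conditions a shared set of text features on a per-character vector by concatenation. Everything here is an assumption for demonstration: the names `make_speaker_table` and `condition`, the embedding size, and the use of random values in place of vectors that a real model would learn during training. It does not reflect 15.ai's actual architecture.

```python
import random

EMBED_DIM = 4  # illustrative size only; real models use far larger vectors


def make_speaker_table(speakers, seed=0):
    """Create one vector per speaker. Random values stand in for embeddings
    that a real model would learn jointly with the rest of the network."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
            for name in speakers}


def condition(text_features, speaker, table):
    """Append the chosen speaker's vector to every frame of text features,
    so a single shared model can be steered toward a particular voice."""
    embedding = table[speaker]
    return [frame + embedding for frame in text_features]


if __name__ == "__main__":
    table = make_speaker_table(["GLaDOS", "Wheatley"])
    features = [[0.1, 0.2], [0.3, 0.4]]  # two frames of made-up text features
    conditioned = condition(features, "GLaDOS", table)
    print(len(conditioned), len(conditioned[0]))
```

Because the same text features are reused for every character and only the appended vector changes, patterns learned from one voice can in principle carry over to others, which is the sharing effect the paragraph above describes.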
The interface included technical metrics and graphs, which, according to the developer, served to highlight the research aspect of the website. As of version v23, released in September 2021, the interface displayed comprehensive model analysis information, including word parsing results and emotional analysis data. The flow and generative adversarial network (GAN) hybrid denoising function, introduced in an earlier version, was streamlined to remove manual parameter inputs.
Reception and legacy
Critics described 15.ai as easy to use and generally able to convincingly replicate character voices, with occasional mixed results. Natalie Clayton of PC Gamer wrote that SpongeBob SquarePants' voice was replicated well, but noted challenges in mimicking The Stanley Parable's narrator: "the algorithm simply can't capture Kevan Brighting's whimsically droll intonation." Zack Zwiezen of Kotaku reported that "[his] girlfriend was convinced it was a new voice line from GLaDOS' voice actor, Ellen McLain". Taiwanese newspaper United Daily News also highlighted 15.ai's ability to recreate GLaDOS's mechanical voice, alongside its diverse range of character voice options. Yahoo! News Taiwan reported that "GLaDOS in Portal can pronounce lines nearly perfectly", but also criticized that "there are still many imperfections, such as word limit and tone control, which are still a little weird in some words."
Multiple other critics also found the character limit and the prosody options not entirely satisfactory. Peter Paltridge of anime and superhero news outlet Anime Superhero opined that "voice synthesis has evolved to the point where the more expensive efforts are nearly indistinguishable from actual human speech," but also noted that "In some ways, SAM is still more advanced than this. It was possible to affect SAM’s inflections by using special characters, as well as change his pitch at will. With 15.ai, you’re at the mercy of whatever random inflections you get." Conversely, Lauren Morton of Rock, Paper, Shotgun praised the depth of pronunciation control available "if you're willing to get into the nitty gritty of it". Takayuki Furushima of Den Fami Nico Gamer highlighted the "smooth pronunciations", and Yuki Kurosawa of AUTOMATON noted its "rich emotional expression" as a major feature; both Japanese authors noted the lack of Japanese-language support. Renan do Prado of the Brazilian gaming news outlet Arkade pointed out that users could create amusing results in Portuguese, although generation primarily performed well in English.
South Korean video game outlet Zuntata wrote that "the surprising thing about 15.ai is that , there's only about 30 seconds of data, but it achieves pronunciation accuracy close to 100%". Machine learning professor Yongqiang Li wrote in his blog that he was surprised to see that the application was free.
15.ai was an early pioneer of audio deepfakes, leading to the emergence of AI speech synthesis-based memes. Its influence has been noted in the years after it became defunct, and since then, several commercial alternatives emerged, such as ElevenLabs and Speechify. The original claim that only 15 seconds of data is required to clone a human's voice was corroborated by OpenAI in 2024.
See also
Notes
- The term "faster than real-time" in speech synthesis means that the system can generate audio more quickly than the actual duration of the speech – for example, generating 10 seconds of speech in less than 10 seconds would be considered faster than real-time.
- which uses "11.ai" as a legal byname for its web domain
References
- ^ 遊戲, 遊戲角落 (January 20, 2021). "這個AI語音可以模仿《傳送門》GLaDOS講出任何對白!連《Undertale》都可以學" [This AI Voice Can Imitate Portal's GLaDOS Saying Any Dialog! It Can Even Learn Undertale]. United Daily News (in Chinese (Taiwan)). Archived from the original on December 19, 2024. Retrieved December 18, 2024.
- ^ Yoshiyuki, Furushima (January 18, 2021). "『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に" [Portal's GLaDOS and UNDERTALE's Sans Will Read Text for You. "15.ai" Service Aims to Reproduce Even the Emotions in Text, Becomes Topic of Discussion]. Den Fami Nico Gamer (in Japanese). Archived from the original on January 18, 2021. Retrieved December 18, 2024.
日本語入力には対応していないが、ローマ字入力でもなんとなくそれっぽい発音になる。; 15.aiはテキスト読み上げサービスだが、特筆すべきはそのなめらかな発音と、ゲームに登場するキャラクター音声を再現している点だ。
[It does not support Japanese input, but even if you input using romaji, it will somehow give you a similar pronunciation.; 15.ai is a text-to-speech service, but what makes it particularly noteworthy is its smooth pronunciation and the fact that it reproduces the voices of characters that appear in games.]
- ^ Kurosawa, Yuki (January 19, 2021). "ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる" [Game Character Voice Reading Software "15.ai" Now Available. Get Characters from Undertale and Portal to Say Your Desired Lines]. AUTOMATON (in Japanese). Archived from the original on January 19, 2021. Retrieved December 18, 2024.
英語版ボイスのみなので注意。;もうひとつ15.aiの大きな特徴として挙げられるのが、豊かな感情表現だ。
[Please note that only English voices are available.;Another major feature of 15.ai is its rich emotional expression.]
- ^ Ruppert, Liana (January 18, 2021). "Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App". Game Informer. Archived from the original on January 18, 2021. Retrieved December 18, 2024.
- ^ Clayton, Natalie (January 19, 2021). "Make the cast of TF2 recite old memes with this AI text-to-speech tool". PC Gamer. Archived from the original on January 19, 2021. Retrieved December 18, 2024.
- ^ Morton, Lauren (January 18, 2021). "Put words in game characters' mouths with this fascinating text to speech tool". Rock, Paper, Shotgun. Archived from the original on January 18, 2021. Retrieved December 18, 2024.
- Ng, Andrew (April 1, 2020). "Voice Cloning for the Masses". DeepLearning.AI. Retrieved December 22, 2024.
- ^ Zwiezen, Zack (January 18, 2021). "Website Lets You Make GLaDOS Say Whatever You Want". Kotaku. Archived from the original on January 17, 2021. Retrieved December 18, 2024.
- ^ Chandraseta, Rionaldi (January 21, 2021). "Generate Your Favourite Characters' Voice Lines using Machine Learning". Towards Data Science. Archived from the original on January 21, 2021. Retrieved December 18, 2024.
- ^ "这个网站可用AI生成语音 让ACG角色"说"出你输入的文本" [This Website Can Use AI to Generate Voice, Making ACG Characters "Say" the Text You Input]. GamerSky (in Chinese). January 18, 2021. Archived from the original on December 11, 2024. Retrieved December 18, 2024.
- ^ "The past and future of 15.ai". Twitter. Archived from the original on December 8, 2024. Retrieved December 19, 2024.
- Button, Chris (January 19, 2021). "Make GLaDOS, SpongeBob and other friends say what you want with this AI text-to-speech tool". Byteside. Archived from the original on June 25, 2024. Retrieved December 18, 2024.
- "About". fifteen.ai (Official website). February 19, 2020. Archived from the original on February 23, 2020. Retrieved December 23, 2024.
2020-02-19: The web app isn't fully ready just yet
- "About". fifteen.ai (Official website). March 2, 2020. Archived from the original on March 3, 2020. Retrieved December 23, 2024.
- Scotellaro, Shaun (March 31, 2020). "Rainbow Dash Voice Added to 15.ai". Equestria Daily. Archived from the original on December 1, 2024. Retrieved December 18, 2024.
- Scotellaro, Shaun (October 5, 2020). "15.ai Adds Tons of New Pony Voices". Equestria Daily. Retrieved December 21, 2024.
- ^ "Everything You Need to Know About 15.ai: The AI Voice Generator". Play.ht. September 12, 2024. Retrieved December 18, 2024.
- Lawrence, Briana (January 19, 2022). "Shonen Jump Scare Leads to Company Reassuring Fans That They Aren't Getting Into NFTs". The Mary Sue. Retrieved December 23, 2024.
- ^ Williams, Demi (January 18, 2022). "Voiceverse NFT admits to taking voice lines from non-commercial service". NME. Archived from the original on January 18, 2022. Retrieved December 18, 2024.
- ^ Wright, Steve (January 17, 2022). "Troy Baker-backed NFT company admits to using content without permission". Stevivor. Archived from the original on January 17, 2022. Retrieved December 18, 2024.
- ^ Phillips, Tom (January 17, 2022). "Troy Baker-backed NFT firm admits using voice lines taken from another service without permission". Eurogamer. Archived from the original on January 17, 2022. Retrieved December 18, 2024.
- Lopez, Ule (January 16, 2022). "Voiceverse NFT Service Reportedly Uses Stolen Technology from 15ai [UPDATE]". Wccftech. Archived from the original on January 16, 2022. Retrieved June 7, 2022.
- @fifteenai (January 14, 2022). "Go fuck yourself" (Tweet) – via Twitter.
- ^ "15.AI: Everything You Need to Know & Best Alternatives". ElevenLabs. February 7, 2024. Archived from the original on July 15, 2024. Retrieved December 18, 2024.
- "An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter". MIT Technology Review. August 3, 2017. Archived from the original on June 2, 2022. Retrieved December 18, 2024.
- ^ "15.ai已经重新上线,版本更新至v23" [15.ai has been re-launched, version updated to v23] (in Chinese). October 1, 2021. Retrieved December 22, 2024.
- Moto, Eugenio (January 20, 2021). "15.ai, el sitio que te permite usar voces de personajes populares para que digan lo que quieras". Qore (in Spanish). Retrieved December 21, 2024.
Si bien los resultados ya son excepcionales, sin duda pueden mejorar más
- Scotellaro, Shaun (March 4, 2020). "Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices". Equestria Daily. Archived from the original on June 23, 2021. Retrieved December 18, 2024.
- Villalobos, José (January 18, 2021). "Descubre 15.AI, un sitio web en el que podrás hacer que GlaDOS diga lo que quieras" [Discover 15.AI, a Website Where You Can Make GlaDOS Say What You Want]. LaPS4 (in Spanish). Archived from the original on January 18, 2021. Retrieved January 18, 2021.
La dirección es 15.AI y funciona tan fácil como parece.
[The address is 15.AI and it works as easy as it looks.]
- ^ do Prado, Renan (January 19, 2021). "Faça GLaDOS, Bob Esponja e outros personagens falarem textos escritos por você!" [Make GLaDOS, SpongeBob and other characters speak texts written by you!]. Arkade (in Brazilian Portuguese). Retrieved December 22, 2024.
- ^ "讓你喜愛的ACG角色說出任何話! AI生成技術幫助你實現夢想" [Let your favorite ACG characters say anything! AI generation technology helps you realize your dreams] (in Chinese). January 19, 2021. Retrieved December 22, 2024.
大家是否都曾經想像過,假如能讓自己喜歡的遊戲或是動畫角色說出自己想聽的話,不論是名字、惡搞或是經典名言,都是不少人的夢想吧。不過來到 2021 年,現在這種夢想不再是想想而已,因為有一個網站通過 AI 生成的技術,讓大家可以讓不少遊戲或是動畫角色,說出任何你想要他們講出的東西,而且相似度與音調都有相當高的準確度
[Have you ever imagined what it would be like if your favorite game or anime characters could say exactly what you want to hear? Whether it's names, parodies, or classic quotes, this is a dream for many. However, as we enter 2021, this dream is no longer just a fantasy, because there is a website that uses AI-generated technology, allowing users to make various game and anime characters say anything they want with impressive accuracy in both similarity and tone.]
- Paltridge, Peter (January 18, 2021). "This Website Will Say Whatever You Type In Spongebob's Voice". Retrieved December 22, 2024.
- "게임 캐릭터 음성으로 영어를 읽어주는 소프트 15.ai 공개" [Software 15.ai Released That Reads English in Game Character Voices]. Tistory (in Korean). January 20, 2021. Retrieved December 18, 2024.
- Li, Yongqiang. "语音开源项目优选:免费配音网站15.ai" [Voice Open Source Project Selection: Free Voice Acting Website 15.ai]. Zhihu (in Chinese). Retrieved December 18, 2024.
- VK, Anirudh (March 18, 2023). "Deepfakes Are Elevating Meme Culture, But At What Cost?". Analytics India Magazine. Retrieved December 18, 2024.
While AI voice memes have been around in some form since '15.ai' launched in 2020,
- Wright, Steven (March 21, 2023). "Why Biden, Trump, and Obama Arguing Over Video Games Is YouTube's New Obsession". Inverse. Archived from the original on December 20, 2024. Retrieved December 18, 2024.
AI voice tools used to create "audio deepfakes" have existed for years in one form or another, with 15.ai being a notable example.
- "Can I publish the content I generate on the platform?". ElevenLabs (Official website). May 8, 2024. Retrieved December 23, 2024.
- "Navigating the Challenges and Opportunities of Synthetic Voices". OpenAI. March 9, 2024. Archived from the original on November 25, 2024. Retrieved December 18, 2024.