Revision as of 12:04, 11 March 2022
Real-time text-to-speech tool using artificial intelligence

Developer(s) | 15 |
---|---|
Initial release | March 2020; 4 years ago (2020-03) |
Stable release | v24.2.1 / September 2021; 3 years ago (2021-09) |
Written in | Vue.js, Python, Julia |
Available in | English |
Type | Artificial intelligence, speech synthesis, machine learning, deep learning |
Website | 15 |
15.ai is a non-commercial freeware artificial intelligence web application that generates natural, emotive, high-fidelity text-to-speech voices for an assortment of fictional characters from a variety of media sources. Developed by an anonymous MIT researcher under the eponymous pseudonym 15, the project uses a combination of audio synthesis algorithms, speech synthesis deep neural networks, and sentiment analysis models to generate and serve emotive character voices faster than real time, even for characters with very little training data.
Features
Available characters include GLaDOS and Wheatley from Portal, characters from Team Fortress 2, Twilight Sparkle and a number of main, secondary, and supporting characters from My Little Pony: Friendship is Magic, SpongeBob from SpongeBob SquarePants, Daria Morgendorffer and Jane Lane from Daria, the Tenth Doctor from Doctor Who, HAL 9000 from 2001: A Space Odyssey, the Narrator from The Stanley Parable, the Wii U/3DS/Switch Super Smash Bros. Announcer (formerly), Sans from Undertale, and Carl Brutananadilewski from Aqua Teen Hunger Force.
The deep learning model used by the application is nondeterministic: each time that speech is generated from the same text string, the intonation of the speech will be slightly different. The application supports English phonetic transcriptions (such as ARPABET) to correct mispronunciations or to account for heteronyms, words that are spelled the same but are pronounced differently (such as the word read, which can be pronounced as either /ˈrɛd/ or /ˈriːd/ depending on its tense). The app also supports altering the emotion of a generated line using emotional contextualizers (a term coined by this project): sentences or phrases that convey the emotion of the take and serve as a guide for the model during inference.
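As a rough illustration of how a phonetic transcription can resolve a heteronym such as read, the following Python sketch swaps the word for an inline ARPABET string before the text would be sent for synthesis. The two pronunciations are the ones listed in the CMU Pronouncing Dictionary; the function name, the sense labels, and the curly-brace wrapper are assumptions made for this example and are not taken from 15.ai's documentation.

```python
import re

# ARPABET pronunciations of the heteronym "read", as listed in the
# CMU Pronouncing Dictionary (the digits mark vowel stress).
HETERONYMS = {
    "read": {
        "past": "R EH1 D",     # rhymes with "red"
        "present": "R IY1 D",  # rhymes with "reed"
    },
}


def disambiguate(text: str, word: str, sense: str) -> str:
    """Replace each whole-word occurrence of `word` with an inline
    ARPABET transcription for the chosen sense.

    The curly-brace wrapper is an assumed notation for this sketch.
    """
    phonemes = HETERONYMS[word.lower()][sense]
    pattern = re.compile(rf"\b{re.escape(word)}\b", flags=re.IGNORECASE)
    return pattern.sub("{" + phonemes + "}", text)


print(disambiguate("I read that book yesterday.", "read", "past"))
# prints: I {R EH1 D} that book yesterday.
```

The sense has to be chosen by the user here; the sketch does not attempt to infer tense from context.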
The lexicon used by 15.ai was scraped from a variety of Internet sources, including Oxford Dictionaries, Wiktionary, the CMU Pronouncing Dictionary, 4chan, Reddit, and Twitter. Pronunciations of unfamiliar words are automatically deduced using phonological rules learned by the deep learning model.
Background
Speech synthesis
Main article: Deep learning speech synthesis
In 2016, with the proposal of DeepMind's WaveNet, deep-learning-based models for speech synthesis began to gain popularity as a method of modeling waveforms and generating human-like speech. Tacotron2, a neural network architecture for speech synthesis developed by Google AI, was published in 2018 and required tens of hours of audio data to train. For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of scientific researchers in the field of deep learning speech synthesis.
The developer of 15.ai claims that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.
Copyrighted material in deep learning
Main article: Authors Guild, Inc. v. Google, Inc.
A landmark case between Google and the Authors Guild in 2013 ruled that Google Books, a service that searches the full text of printed copyrighted books, met all requirements for fair use. This case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a discriminative model or a non-commercial generative model was deemed legal.
Development
15.ai was designed and created by an anonymous research scientist affiliated with the Massachusetts Institute of Technology known by the alias 15, ostensibly in reference to the minimum amount of data required to convincingly clone a voice. The project began development while the developer was an undergraduate. Although the application costs several thousand dollars a month to maintain, the developer has stated that they are able to pay the cost of running the site out of pocket.
The developer has also worked closely with the Pony Preservation Project from /mlp/, the My Little Pony board of 4chan. The Pony Preservation Project is a "collaborative effort by /mlp/ to build and curate pony datasets" with the aim of creating applications in artificial intelligence. According to the developer, the collective efforts and constructive criticism from the Pony Preservation Project have been integral to the development of 15.ai. During months-long hiatuses of the public release of the site in late 2020 and early 2021, a number of test sites were put up for exclusive use by /mlp/ and the Pony Preservation Project for the purposes of testing and generating content.
Reception
15 @fifteenai I've been informed that the aforementioned NFT vocal synthesis is actively attempting to appropriate my work for their own benefit. After digging through the log files, I have evidence that some of the voices that they are taking credit for were indeed generated from my own site.
January 14, 2022
Voiceverse Origins @VoiceverseNFT Hey @fifteenai we are extremely sorry about this. The voice was indeed taken from your platform, which our marketing team used without giving proper credit. Chubbiverse team has no knowledge of this. We will make sure this never happens again.
January 14, 2022
15 @fifteenai Go fuck yourself.
January 14, 2022
Over 3 million lines were generated within the first two weeks of the September 2021 release of the application. As of February 2022, 15.ai's Patreon was raising over $1,100 per month.
Fandom content creation
15.ai has been frequently used for content creation in various fandoms, including the My Little Pony: Friendship Is Magic fandom, the Team Fortress 2 fandom, the Portal fandom, and the SpongeBob SquarePants fandom. Numerous videos and projects featuring 15.ai-synthesized speech have gone viral. So have numerous videos and projects that use speech from other synthesis tools, some of them without properly crediting the source of their AI-synthesized voice clips. As a consequence, many works made with other speech synthesis software have been mistaken for 15.ai output, and vice versa. Because of this misattribution and lack of proper credit, 15.ai's terms of service forbid combining 15.ai-synthesized and non-15.ai-synthesized speech in the same video or project.
The My Little Pony: Friendship Is Magic fandom has seen a resurgence in video and musical content creation as a direct result of 15.ai; moreover, the project has been used as a creative tool in pornography. For instance, the Pony Zone videos are a series of erotic musical videos that heavily sample 15.ai output for their vocals; the creators of such videos make frequent use of salacious emotional contextualizers and punctuation/ARPABET tricks to induce the models to grunt, sigh, and moan convincingly.
Troy Baker / VoiceVerse NFT scandal
On January 14, 2022, it was discovered that Voiceverse NFT, a company with which video game voice actor Troy Baker had announced a partnership, had stolen voice lines generated from 15.ai and used them in its marketing campaign without permission. Log files showed that Voiceverse had generated audio of Twilight Sparkle and Rainbow Dash from the show My Little Pony: Friendship Is Magic using 15.ai, pitched the audio up so that it no longer resembled the original voices, and appropriated it without proper credit to falsely market its own platform, a violation of 15.ai's terms of service.
The initial partnership between Troy Baker and Voiceverse was met with severe backlash and universally negative reception. Critics highlighted the potential environmental impact of NFT sales and the potential for exit scams associated with them. Two weeks later, on January 31, Baker announced that he would discontinue his partnership with Voiceverse.
Resistance from voice actors
Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about impersonation and fraud, unauthorized use of an actor's voice in pornography, and the potential of AI being used to make voice actors obsolete.
See also
- Speech synthesis
- Deep learning speech synthesis
- Virtual actor
- 4chan
- My Little Pony: Friendship Is Magic fandom