
15.ai

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.

This is an old revision of this page, as edited by Alalch E. (talk | contribs) at 16:06, 25 December 2024 (migrate quote from cs1). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Real-time text-to-speech AI tool
An editor has nominated this article for deletion.
You are welcome to participate in the deletion discussion, which will decide whether or not to retain it. Feel free to improve the article, but do not remove this notice before the discussion is closed. For more information, see the guide to deletion.

15.ai
Type of site: Artificial intelligence, speech synthesis
Available in: English
Founder(s): 15
URL: 15.ai
Commercial: No
Registration: None
Launched: March 2020
Current status: Inactive

15.ai was a free non-commercial web application that used artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Conceived by an artificial intelligence researcher known as "15" during their time at the Massachusetts Institute of Technology and developed following their successful exit from a startup venture, the application allowed users to make characters from various media speak custom text with emotional inflections faster than real-time.

Launched in March 2020, the service gained widespread attention in early 2021 when it went viral on social media platforms like YouTube and Twitter, and quickly became popular among Internet fandoms, including the My Little Pony: Friendship Is Magic, Team Fortress 2, and SpongeBob SquarePants fandoms. The website had a role in the emergence of AI voice cloning (audio deepfake) memes.

In January 2022, Voiceverse NFT sparked controversy when it was discovered that the company, which had partnered with voice actor Troy Baker, had misappropriated 15.ai's work for their own platform. The service was ultimately taken offline in September 2022. Its shutdown led to the emergence of various commercial alternatives in subsequent years.

History

15.ai was conceived in 2016 as a research project in deep learning speech synthesis by a developer known as "15" during their undergraduate studies at the Massachusetts Institute of Technology (MIT). The developer was inspired by DeepMind's WaveNet paper, with development continuing through their studies as Google AI released Tacotron the following year. The name 15 is a reference to the creator's claim that a voice can be cloned with as little as 15 seconds of data. 15.ai was released in March 2020. More voices were added to the website in the following months.

In early 2021, the application went viral on Twitter and YouTube, with people generating skits, memes, and fan content using voices from popular games and shows. The resulting memes and viral videos included recreations of the popular Source Filmmaker video Heavy is Dead, The RED Bread Bank, and Among Us Struggles, which have amassed millions of views on social media. Content creators, YouTubers, and TikTokers have also used 15.ai for voiceovers in their videos. According to the developer, at its peak, the platform incurred operational costs of US$12,000 per month from the AWS infrastructure needed to handle millions of daily voice generations. They funded the website through their previous startup earnings.

On January 14, 2022, a controversy ensued after it was discovered that Voiceverse NFT, a company with which video game and anime dub voice actor Troy Baker had announced a partnership, had misappropriated voice lines generated from 15.ai as part of its marketing campaign. Log files showed that Voiceverse had generated audio of characters from My Little Pony: Friendship Is Magic using 15.ai, then pitched the clips up to make them unrecognizable from the original voices and used them to market its own platform, in violation of 15.ai's terms of service. Voiceverse claimed that someone in their marketing team used the voice without properly crediting 15.ai; in response, 15 tweeted "Go fuck yourself."

In September 2022, 15.ai was taken offline. The developer claimed that this was due to legal issues surrounding artificial intelligence and copyright.

Features

The platform was non-commercial and operated without requiring user registration or accounts. Users generated speech by inputting text and selecting a character voice, with optional parameters for emotional contextualizers and phonetic transcriptions. Each request produced three audio variations with distinct emotional deliveries. Characters available included multiple characters from Team Fortress 2 and My Little Pony: Friendship Is Magic; GLaDOS and Wheatley from the Portal series; SpongeBob SquarePants; Rise Kujikawa from Persona 4; Daria Morgendorffer and Jane Lane from Daria; Carl Brutananadilewski from Aqua Teen Hunger Force; Steven Universe from Steven Universe; Sans from Undertale; the Tenth Doctor from Doctor Who; the Narrator from The Stanley Parable; and HAL 9000 from 2001: A Space Odyssey. Certain "silent" characters like Chell and Gordon Freeman could be selected as a joke, and would emit silent audio files when any text was submitted.

The deep learning model's nondeterministic properties produced variations in speech output, creating different intonations with each generation, similar to how voice actors produce different takes. 15.ai introduced the concept of "emotional contextualizers," which allowed users to specify the emotional tone of generated speech through guiding phrases. The emotional contextualizer functionality utilized DeepMoji, a sentiment analysis neural network developed at the MIT Media Lab. Introduced in 2017, DeepMoji processed emoji embeddings from 1.2 billion Twitter posts (2013-2017) to analyze emotional content. Testing showed the system could identify emotional elements, including sarcasm, more accurately than human evaluators.

The application provided support for a simplified version of ARPABET, a set of English phonetic transcriptions originally developed by the Advanced Research Projects Agency in the 1970s. This feature allowed users to correct mispronunciations or specify the desired pronunciation between heteronyms – words that have the same spelling but have different pronunciations. Users could invoke ARPABET transcriptions by enclosing the phoneme string in curly braces within the input box (for example, "{AA1 R P AH0 B EH2 T}" to specify the pronunciation of the word "ARPABET" (/ˈɑːrpəˌbɛt/ AR-pə-beht)). The interface displayed parsed words with color-coding to indicate pronunciation certainty: green for words found in the existing pronunciation lookup table, blue for manually entered ARPABET pronunciations, and red for words where the pronunciation had to be algorithmically predicted.
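The curly-brace input and the three-color scheme described above can be illustrated with a minimal Python sketch. This is not 15.ai's code: the `LOOKUP` table, the `TOKEN` pattern, and the `classify` function are hypothetical names introduced here, and the two dictionary entries are stand-ins for whatever pronunciation table the site actually used.

```python
import re

# Hypothetical miniature lookup table; the article does not describe
# the contents of the site's real pronunciation dictionary.
LOOKUP = {
    "hello": "HH AH0 L OW1",
    "world": "W ER1 L D",
}

# Matches either a {curly-brace ARPABET block} or a run of letters/apostrophes.
TOKEN = re.compile(r"\{([^{}]*)\}|([A-Za-z']+)")


def classify(text):
    """Return (token, color) pairs using the color scheme described above."""
    result = []
    for match in TOKEN.finditer(text):
        arpabet, word = match.group(1), match.group(2)
        if arpabet is not None:
            # Manually entered phonemes inside curly braces.
            result.append((arpabet.strip(), "blue"))
        elif word.lower() in LOOKUP:
            # Word found in the pronunciation lookup table.
            result.append((word, "green"))
        else:
            # Pronunciation would have to be predicted algorithmically.
            result.append((word, "red"))
    return result


print(classify("Hello {AA1 R P AH0 B EH2 T} wurld"))
# [('Hello', 'green'), ('AA1 R P AH0 B EH2 T', 'blue'), ('wurld', 'red')]
```

In this sketch a misspelled word such as "wurld" falls through to the "red" case, which is where a user might add a curly-brace transcription to force the intended pronunciation.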

Later versions of 15.ai introduced multi-speaker capabilities. Rather than training separate models for each voice, 15.ai used a unified model that learned multiple voices simultaneously through speaker embeddings (learned numerical representations that captured each character's unique vocal characteristics). Along with the emotional context conferred by DeepMoji, this neural network architecture enabled the model to learn shared patterns across different characters' emotional expressions and speaking styles, even when individual characters lacked examples of certain emotional contexts in their training data.
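As a rough illustration of the speaker-embedding idea, the Python sketch below keeps one vector per character and attaches it to the same text features, so that a single shared model could in principle be steered toward different voices. The article does not describe 15.ai's actual architecture, so everything here is an assumption: the `SpeakerTable` class, the `condition` function, the embedding size, the use of concatenation, and the random (rather than learned) vectors.

```python
import random

EMBED_DIM = 4  # hypothetical embedding size, chosen only for readability


class SpeakerTable:
    """One vector per speaker. In a trained model these would be learned;
    here they are random placeholders."""

    def __init__(self, speakers, dim=EMBED_DIM, seed=0):
        rng = random.Random(seed)
        self.vectors = {
            name: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for name in speakers
        }

    def __getitem__(self, name):
        return self.vectors[name]


def condition(text_features, speaker_vector):
    """Append the speaker vector to every per-token feature vector."""
    return [frame + speaker_vector for frame in text_features]


table = SpeakerTable(["GLaDOS", "Wheatley"])
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # stand-in for encoded text

glados_input = condition(features, table["GLaDOS"])
wheatley_input = condition(features, table["Wheatley"])

# Same text features, different speaker vectors appended.
print(len(glados_input), len(glados_input[0]))  # 3 6
```

The point of the sketch is only that the text-dependent part of the input is shared across characters while the speaker-dependent part differs, which is what lets one model reuse what it learns from one voice when producing another.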

The interface included technical metrics and graphs, which, according to the developer, served to highlight the research aspect of the website. As of version v23, released in September 2021, the interface displayed comprehensive model analysis information, including word parsing results and emotional analysis data. The flow and generative adversarial network (GAN) hybrid denoising function, introduced in an earlier version, was streamlined to remove manual parameter inputs.

Reception and legacy

Critics described 15.ai as easy to use and generally able to convincingly replicate character voices, with occasional mixed results. Natalie Clayton of PC Gamer wrote that SpongeBob SquarePants' voice was replicated well, but noted challenges in mimicking the Narrator from The Stanley Parable: "the algorithm simply can't capture Kevan Brighting's whimsically droll intonation." Zack Zwiezen of Kotaku reported that his girlfriend "was convinced it was a new voice line from GLaDOS' voice actor, Ellen McLain". Taiwanese newspaper United Daily News also highlighted 15.ai's ability to recreate GLaDOS's mechanical voice, alongside its diverse range of character voice options. Yahoo! News Taiwan reported that "GLaDOS in Portal can pronounce lines nearly perfectly", but also criticized that "there are still many imperfections, such as word limit and tone control, which are still a little weird in some words."

Multiple other critics also found the character limit and the prosody options not entirely satisfactory. Peter Paltridge of anime and superhero news outlet Anime Superhero opined that "voice synthesis has evolved to the point where the more expensive efforts are nearly indistinguishable from actual human speech," but also noted that "In some ways, SAM is still more advanced than this. It was possible to affect SAM’s inflections by using special characters, as well as change his pitch at will. With 15.ai, you’re at the mercy of whatever random inflections you get." Conversely, Lauren Morton of Rock, Paper, Shotgun praised the depth of pronunciation control available "if you're willing to get into the nitty gritty of it". Takayuki Furushima of Den Fami Nico Gamer highlighted the "smooth pronunciations", and Yuki Kurosawa of AUTOMATON noted its "rich emotional expression" as a major feature; both Japanese authors noted the lack of Japanese-language support. Renan do Prado of the Brazilian gaming news outlet Arkade pointed out that users could create amusing results in Portuguese, although generation primarily performed well in English.

South Korean video game outlet Zuntata wrote that "the surprising thing about 15.ai is that , there's only about 30 seconds of data, but it achieves pronunciation accuracy close to 100%". Machine learning professor Yongqiang Li wrote in his blog that he was surprised to see that the application was free.

15.ai was an early pioneer of audio deepfakes, leading to the emergence of AI speech synthesis-based memes. Its influence has been noted in the years since it became defunct, during which several commercial alternatives have emerged, such as ElevenLabs and Speechify. The original claim that only 15 seconds of data is required to clone a human's voice was corroborated by OpenAI in 2024.

Explanatory footnotes

  1. The term "faster than real-time" in speech synthesis means that the system can generate audio more quickly than the actual duration of the speech – for example, generating 10 seconds of speech in less than 10 seconds would be considered faster than real-time.
  2. ElevenLabs, which uses "11.ai" as a legal byname for its web domain
