15.ai: Difference between revisions


Revision as of 17:32, 14 November 2024 by HackerKnownAs (478 edits). Edit summary: Does not satisfy the requirements of WP:TWITTER. Tag: Reverted.		Revision as of 00:21, 15 November 2024 by BrocadeRiverPoems (extended confirmed user, 1,074 edits). Edit summary: Reverting to Talk Page Consensus. This subversion of consensus is clear WP:OWNERSHIPBEHAVIOR. This reversion made re-implements several unreliable sources, misrepresented sources, and subverts consensus based edits for no policy reason. Tags: Manual revert, Reverted.
Line 1:		Line 1:
	{{Short description\|Real-time text-to-speech tool using artificial intelligence}}		{{Short description\|Real-time text-to-speech tool using artificial intelligence}}
			{{pp\|small=yes}}
	{{Good article}}		{{Good article}}
			{{Multiple issues\|section=\|
	{{pp-protected\|small=yes}}
			{{COI\|date=October 2024}}
			{{POV\|date=October 2024}}
			{{cite check\|date=October 2024}}
			}}
	{{Use mdy dates\|date=July 2022}}		{{Use mdy dates\|date=July 2022}}
	{{Infobox website		{{Infobox website
	\| name = 15.ai		\| name = 15.ai
			\| logo_caption = {{Deletable file-caption\|Thursday, 24 October 2024\|F7}}
	\| logo =
	\| screenshot =		\| screenshot =
	\| caption =		\| caption =
Line 11:		Line 16:
	\| commercial = No		\| commercial = No
	\| registration = None		\| registration = None
	\| launch_date = '''Initial release''': {{start date and age\|2020\|03\|12}}<br/>'''Stable release''': v24.2.1 ~~/ {{start date and age\|2021\|09}}~~		\| launch_date = '''Initial release''': {{start date and age\|2020\|03\|12}}<br/>'''Stable release''': v24.2.1
	\| current_status = Under maintenance		\| current_status = Under maintenance
	\| type = ], ], ], ]		\| type = ], ], ], ]
Line 18:		Line 23:
	}}		}}
	{{Artificial intelligence}}		{{Artificial intelligence}}
			'''15.ai''' was a ] ] ], launched in 2020, that generated ] voices from fictional characters from various media sources.<ref name="kotaku">{{cite web
	'''15.ai''' is a ] ] ] ] that generates natural emotive high-fidelity{{efn\|The phrase "high-fidelity" in TTS research is often used to describe ]s that are able to reconstruct waveforms with very little distortion, and is not simply synonymous with "high quality." See the papers for HiFi-GAN,<ref>{{cite arXiv \|last=Kong \|first=Jungil\|eprint=2010.05646v2 \|title=HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis\|class=cs \|date=2020 }}</ref> GAN-TTS,<ref>{{cite arXiv \|last=Binkowski \|first=Mikołaj\|eprint=1909.11646v2 \|title=High Fidelity Speech Synthesis with Adversarial Networks\|class=cs \|date=2019 }}</ref> and parallel ]<ref name="deepmind"/> for unbiased examples of this usage of terminology.}} ] voices from an assortment of fictional characters from a variety of media sources.<ref name="kotaku">{{cite web
	\|url= https://kotaku.com/this-website-lets-you-make-glados-say-whatever-you-want-1846062835		\|url= https://kotaku.com/this-website-lets-you-make-glados-say-whatever-you-want-1846062835
	\|title= Website Lets You Make GLaDOS Say Whatever You Want		\|title= Website Lets You Make GLaDOS Say Whatever You Want
Line 66:		Line 71:
	\|archive-url= https://web.archive.org/web/20210118213308/https://www.rockpapershotgun.com/2021/01/18/put-words-in-game-characters-mouths-with-this-fascinating-text-to-speech-tool/		\|archive-url= https://web.archive.org/web/20210118213308/https://www.rockpapershotgun.com/2021/01/18/put-words-in-game-characters-mouths-with-this-fascinating-text-to-speech-tool/
	\|url-status= live		\|url-status= live
	}}</ref> ~~Developed~~ by a ] ~~] researcher~~ under the ~~name~~ '''15''', the project ~~uses~~ a combination of ] algorithms, ] ], and ] models to generate ~~and serve~~ emotive character voices faster than real-time~~, particularly those with a very small amount of ] data~~.		}}</ref> Created by a ]ous developer under the alias '''15''', the project used a combination of ] algorithms, ] ], and ] models to generate emotive character voices faster than real-time.

	~~Launched in~~ early 2020, 15.ai ~~began~~ as a ] of the ] of voice acting and dubbing ~~using technology~~.<ref name="thebatch">		In early 2020, 15.ai appeared online as a ] of the ] of voice acting and dubbing.<ref name="thebatch">
	{{cite web \|last=Ng \|first=Andrew \|date=2020-04-01 \|title=Voice Cloning for the Masses \|url=https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters \|url-status=dead \|archive-url=https://web.archive.org/web/20200807111844/https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters \|archive-date=2020-08-07 \|access-date=2020-04-05 \|website=The Batch \|quote=}}		{{cite web \|last=Ng \|first=Andrew \|date=2020-04-01 \|title=Voice Cloning for the Masses \|url=https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters \|url-status=dead \|archive-url=https://web.archive.org/web/20200807111844/https://blog.deeplearning.ai/blog/the-batch-ai-against-coronavirus-datasets-voice-cloning-for-the-masses-finding-unexploded-bombs-seeing-see-through-objects-optimizing-training-parameters \|archive-date=2020-08-07 \|access-date=2020-04-05 \|website=The Batch \|quote=}}
	</ref> Its gratis ~~and non-commercial~~ nature ~~(with the only stipulation being that the project be properly credited when used)~~, ease of use, no ] ~~registration requirement~~, and ~~substantial~~ improvements to ~~current~~ text-to-speech implementations ~~have~~ ~~been~~ ~~lauded by users;~~<ref name="gameinformer"/><ref name="kotaku" /><ref name="pcgamer" /> ~~some~~ critics and ]s ~~have~~ questioned the ] and ] of ~~leaving~~ such technology ~~publicly available and~~ readily accessible.~~<ref name="thebatch"/><ref name="batch"/>~~<ref name="wccftech"/>		</ref> Its gratis nature, ease of use without ], and improvements over existing text-to-speech implementations made it popular.<ref name="gameinformer"/><ref name="kotaku" /><ref name="pcgamer" /> Some critics and ]s questioned the ] and ] of making such technology so readily accessible.<ref name="wccftech"/>

	~~Credited~~ as the impetus behind the popularization of AI ] (also known as '']'') in ] ~~and as the first publicly available AI vocal synthesis project to involve the use of existing popular fictional characters, 15~~.~~ai has had a~~ ~~significant~~ ~~impact~~ on ~~multiple~~ Internet ]s, ~~most~~ ~~notably the~~ ], '']'', and '']'' ~~fandoms. Furthermore, 15.ai has inspired the use of ]'s '''Pony Preservation Project''' in other ] projects~~.<ref name="automaton"/><ref name="Denfaminicogamer"/>		The site was credited as the impetus behind the popularization of AI ] (also known as '']'') in ]. It was embraced by Internet ]s such as ], '']'', and '']''.<ref name="automaton"/><ref name="Denfaminicogamer"/>

	Several commercial alternatives ~~have~~ ~~spawned with~~ the ~~rising~~ ~~popularity of 15.ai, leading to cases of misattribution and theft~~. In January 2022, it ~~was~~ ~~discovered that '''~~Voiceverse NFT~~''', a company that voice actor ] announced his partnership with, had~~ ] 15.ai's work as part of their platform.<ref name="nme">{{cite web		Several commercial alternatives appeared in the following years. In January 2022, the company Voiceverse NFT ] 15.ai's work as part of their platform.<ref name="nme">{{cite web
	\|url= https://www.nme.com/news/gaming-news/voiceverse-nft-admits-to-taking-voice-lines-from-non-commercial-service-3140663		\|url= https://www.nme.com/news/gaming-news/voiceverse-nft-admits-to-taking-voice-lines-from-non-commercial-service-3140663
	\|title= Voiceverse NFT admits to taking voice lines from non-commercial service		\|title= Voiceverse NFT admits to taking voice lines from non-commercial service
Line 111:		Line 116:
	\|url-status= live		\|url-status= live
	}}</ref>		}}</ref>

			In September 2022, a year after its last stable release, 15.ai was taken offline. As of November 2024, the website was still offline, with the creator's most recent post being dated February 2023.<ref>{{Cite tweet \|number=1628834708653068290 \|user=fifteenai \|title=If all goes well, the next update should be the culmination of a year and a half of nonstop work put into a huge number of fixes and major improvements to the algorithm. Just give me a bit more time – it should be worth it.}}</ref>

	== Features ==		== Features ==
	], known for his sinister robotic voice, is one of the available characters on 15.ai.<ref name="kotaku"/>]]		], known for his sinister robotic voice, is one of the available characters on 15.ai.<ref name="kotaku"/>]]
	Available characters ~~include~~ ] and ] from '']'', characters from '']'', ] and ~~a number of~~ ] from '']'', ] ~~from '']''~~, ] and ] from '']'', the ] ~~from '']'', ] from '']'', the Narrator from '']''~~, the ]/] ] Announcer (formerly)~~, ] from '']'', ] ~~from '']''~~, Dan from '']'', and ] from '']''.<ref name="Denfaminicogamer">{{cite web~~		Available characters included ] and ] from '']'', characters from '']'', ] and other ] from '']'', ], ] and ] from '']'', the ], ] from '']'', the Narrator from '']'', ] from '']'', ], Dan from '']'', and ] from '']''.<ref name="Denfaminicogamer">{{cite web
	\|url= https://news.denfaminicogamer.jp/news/210118f		\|url= https://news.denfaminicogamer.jp/news/210118f
	\|title= 『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に		\|title= 『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に
Line 164:		Line 171:
	}}</ref>		}}</ref>

	The ] model used by the application is ]: each time ~~that~~ speech is generated from the same string of text, the intonation ~~of the speech will be~~ slightly ~~different~~. The application ~~also supports~~ manually altering the ] of a generated line using ''emotional contextualizers'' (a term coined by this project), a sentence or phrase ~~that conveys~~ the emotion of the take that serves as a guide for the model during inference.<ref name="automaton"/><ref name="Denfaminicogamer"/>		The ] model used by the application was ]: each time speech was generated from the same string of text, the intonation changed slightly. The application supported manually altering the ] of a generated line using ''emotional contextualizers'' (a term coined by this project), a sentence or phrase conveying the emotion of the take that serves as a guide for the model during inference.<ref name="automaton"/><ref name="Denfaminicogamer"/>
	Emotional contextualizers ~~are~~ representations of the emotional content of a sentence deduced via ] ] ] using DeepMoji, a deep neural network ] algorithm developed by the ] in 2017.<ref>{{cite book \|last=Felbo \|first=Bjarke \|arxiv=1708.00524 \|title=Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing\|chapter=Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm \|date=2017 \|pages=1615–1625 \|doi=10.18653/v1/D17-1169 \|s2cid=2493033 }}</ref><ref>{{cite web		Emotional contextualizers were representations of the emotional content of a sentence deduced via ] ] ] using ], a deep neural network ] algorithm developed by the ] in 2017.<ref>{{cite book \|last=Felbo \|first=Bjarke \|arxiv=1708.00524 \|title=Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing\|chapter=Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm \|date=2017 \|pages=1615–1625 \|doi=10.18653/v1/D17-1169 \|s2cid=2493033 }}</ref><ref>{{cite web
	\|url= https://www.theregister.com/2017/08/07/sarcasm_detector_bot_mit/		\|url= https://www.theregister.com/2017/08/07/sarcasm_detector_bot_mit/
	\|title= A sarcasm detector bot? That sounds absolutely brilliant. Definitely		\|title= A sarcasm detector bot? That sounds absolutely brilliant. Definitely
Line 176:		Line 183:
	\|archive-url= https://web.archive.org/web/20220602215737/https://www.theregister.com/2017/08/07/sarcasm_detector_bot_mit/		\|archive-url= https://web.archive.org/web/20220602215737/https://www.theregister.com/2017/08/07/sarcasm_detector_bot_mit/
	\|url-status= live		\|url-status= live
	}}</ref> DeepMoji was trained on 1.2 billion emoji occurrences in ] data from 2013 to 2017, and ~~has been found to outperform~~ human subjects in correctly identifying sarcasm in Tweets and other online modes of communication.<ref>{{cite web		}}</ref> DeepMoji was trained on 1.2 billion emoji occurrences in ] data from 2013 to 2017, and outperformed human subjects in correctly identifying sarcasm in Tweets and other online modes of communication.<ref>{{cite web
	\|url= https://www.technologyreview.com/2017/08/03/105566/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/		\|url= https://www.technologyreview.com/2017/08/03/105566/an-algorithm-trained-on-emoji-knows-when-youre-being-sarcastic-on-twitter/
	\|title= An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter		\|title= An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter
Line 211:		Line 218:
	}}</ref>		}}</ref>

	15.ai ~~uses~~ a ''multi-speaker model''—hundreds of voices ~~are~~ trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to ~~such emotional~~ context.<ref name="arxivmello">{{cite arXiv \|last=Valle \|first=Rafael \|eprint=1910.11997 \|title=Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens \|class=eess \|date=2020 }}</ref> Consequently, the ~~entire lineup of~~ characters in the application is powered by a single trained model, as opposed to multiple single-speaker models ~~trained on different datasets~~.<ref name="arxivmulti">{{cite arXiv \|last=Cooper \|first=Erica \|eprint=1910.10838 \|title=Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings \|class=eess \|date=2020 }}</ref> The ] used by 15.ai ~~has been~~ scraped from a variety of Internet sources, including ], ], the ], ], ], and ]. 
Pronunciations of unfamiliar words ~~are~~ automatically deduced using ]s learned by the deep learning model.<ref name="automaton"/>		15.ai used a ''multi-speaker model''—hundreds of voices were trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to that context.<ref name="arxivmello">{{cite arXiv \|last=Valle \|first=Rafael \|eprint=1910.11997 \|title=Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens \|class=eess \|date=2020 }}</ref> Consequently, the characters in the application were powered by a single trained model, as opposed to multiple single-speaker models.<ref name="arxivmulti">{{cite arXiv \|last=Cooper \|first=Erica \|eprint=1910.10838 \|title=Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings \|class=eess \|date=2020 }}</ref> The ] used by 15.ai was scraped from a variety of Internet sources, including ], ], the ], ], ], and ]. Pronunciations of unfamiliar words were automatically deduced using ]s learned by the deep learning model.<ref name="automaton"/>

	The application supports a simplified version of a set of English phonetic transcriptions known as ] to correct mispronunciations or to account for ]—words that are spelled the same but are pronounced differently (such as the word ''read'', which can be pronounced as either {{IPAc-en\|ˈ\|r\|ɛ\|d}} or {{IPAc-en\|ˈ\|r\|iː\|d}} depending on its ]). While the original ARPABET codes developed in the 1970s by the ] supports 50 unique symbols to designate and differentiate between English phonemes,<ref name="klautau">{{cite web\|last=Klautau\|first=Aldebaro\|year=2001\|url=http://www.laps.ufpa.br/aldebaro/papers/ak_arpabet01.pdf\|title=ARPABET and the TIMIT alphabet\|access-date=September 8, 2017\|archive-url=https://web.archive.org/web/20160603180727/http://www.laps.ufpa.br/aldebaro/papers/ak_arpabet01.pdf\|archive-date=June 3, 2016}}</ref> the ]'s ARPABET convention (the set of transcription codes followed by 15.ai<ref name="automaton" />) reduces the symbol set to 39 phonemes by combining ] phonetic realizations into a single standard (e.g. <code>]</code>; <code>]/]</code>) and using multiple common symbols together to replace ] (e.g. <code>EN/AH0 N</code>).<ref name="columbia">{{cite web
	\|url= http://www.cs.columbia.edu/~julia/courses/CS6998-2019/%5B07%5D%20Phonetics.pdf
	\|title= Phonetics
	\|last=
	\|first=
	\|date= 2017
	\|website= ]
	\|access-date= 2022-06-11
	\|url-status= live
	\|archive-date= 2022-06-19
	\|archive-url= https://web.archive.org/web/20220619180213/http://www.cs.columbia.edu/~julia/courses/CS6998-2019/%5B07%5D%20Phonetics.pdf
	}}</ref><ref name="prondicts">{{cite thesis
	\|type=MSc
	\|last=Loots
	\|first=Linsen
	\|date=March 2010
	\|title=Data-Driven Augmentation of Pronunciation Dictionaries
	\|publisher=Stellenbosch University, Department of Electrical & Electronic Engineering
	\|citeseerx=10.1.1.832.2872
	\|url=http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.832.2872&rep=rep1&type=pdf
	\|access-date=2022-06-11
	\|url-status=live
	\|quote=Table 3.2
	\|archive-date=2022-06-11
	\|archive-url=https://web.archive.org/web/20220611175904/http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.832.2872&rep=rep1&type=pdf
	}}</ref> ARPABET strings can be invoked in the application by wrapping the string of phonemes in ] within the input box (e.g. <code>{AA1 R P AH0 B EH2 T}</code> to denote {{IPAc-en\|ˈ\|ɑːr\|p\|ə\|ˌ\|b\|ɛ\|t}}, the pronunciation of the word ''ARPABET'').<ref name="automaton" />
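
The curly-brace convention described above can be illustrated with a short Python sketch. It is not 15.ai's actual parser, whose implementation is not public. The function names <code>split_input</code> and <code>parse_phoneme</code> are hypothetical, and the rule that only vowels carry a stress digit (0, 1, or 2) is assumed from the CMU Pronouncing Dictionary convention.

```python
import re

# The 39 CMU-style phoneme symbols (15 vowels, 24 consonants).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"}
CONSONANTS = {"B", "CH", "D", "DH", "F", "G", "HH", "JH", "K", "L", "M", "N",
              "NG", "P", "R", "S", "SH", "T", "TH", "V", "W", "Y", "Z", "ZH"}

# A brace-delimited span such as {AA1 R P AH0 B EH2 T}.
BRACED = re.compile(r"\{([^{}]*)\}")


def parse_phoneme(token):
    """Split a token like 'AH0' into ('AH', 0); consonants yield (symbol, None)."""
    match = re.fullmatch(r"([A-Z]+)([012])?", token)
    if match is None:
        raise ValueError(f"not an ARPABET token: {token!r}")
    symbol, stress = match.group(1), match.group(2)
    if symbol in VOWELS:
        return symbol, (int(stress) if stress is not None else None)
    if symbol in CONSONANTS and stress is None:
        return symbol, None
    raise ValueError(f"unknown or mis-stressed phoneme: {token!r}")


def split_input(text):
    """Split input into ('text', str) and ('phonemes', [(symbol, stress), ...]) segments."""
    segments = []
    pos = 0
    for match in BRACED.finditer(text):
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        phonemes = [parse_phoneme(tok) for tok in match.group(1).split()]
        segments.append(("phonemes", phonemes))
        pos = match.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments
```

Under these assumptions, <code>split_input("I {R EH1 D} it")</code> separates the plain text from the phoneme span, so that the homograph ''read'' can be pinned to one pronunciation.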

	The following is a table of phonemes used by 15.ai and the CMU Pronouncing Dictionary:<ref name="cmudict">
	{{cite web
	\|url= http://www.speech.cs.cmu.edu/cgi-bin/cmudict
	\|title= The CMU Pronouncing Dictionary
	\|last=
	\|first=
	\|date= 2015-07-16
	\|website= ]
	\|access-date= 2022-06-04
	\|archive-date= 2022-06-03
	\|archive-url= https://web.archive.org/web/20220603181334/http://www.speech.cs.cmu.edu/cgi-bin/cmudict
	\|url-status= live}}</ref>

	<div style="text-align: center;">
	{\| class="wikitable" style="display:inline-table; text-align: center;"
	\|+ Vowels
	! colspan="1" \| ARPABET
	! colspan="1" \| ]
	! rowspan="1" \| ]
	! rowspan="1" \| Example
	\|-
	\| <code>AA</code>
	\| ''ah''
	\| {{IPA link\|ä\|ɑ}}
	\| style="text-align:left" \| '''o'''dd
	\|-
	\| <code>AE</code>
	\| ''a''
	\| {{IPA link\|æ}}
	\| style="text-align:left" \| '''a'''t
	\|-
	\| <code>AH0</code>
	\| ''ə''
	\| {{IPA link\|ə}}
	\| style="text-align:left" \| '''a'''bout
	\|-
	\| <code>AH</code>
	\| ''u, uh''
	\| {{IPA link\|ʌ}}
	\| style="text-align:left" \| h'''u'''t
	\|-
	\| <code>AO</code>
	\| ''aw''
	\| {{IPA link\|ɔ}}
	\| style="text-align:left" \| '''ou'''ght
	\|-
	\| <code>AW</code>
	\| ''ow''
	\| {{IPA\|aʊ}}
	\| style="text-align:left" \| c'''ow'''
	\|-
	\| <code>AY</code>
	\| ''eye''
	\| {{IPA\|aɪ}}
	\| style="text-align:left" \| h'''i'''de
	\|-
	\| <code>EH</code>
	\| ''e, eh''
	\| {{IPA link\|ɛ}}
	\| style="text-align:left" \| '''E'''d
	\|-
	\|}

	{\| class="wikitable" style="display:inline-table; text-align: center; margin-right: 2em;"
	\|+ Vowels
	! colspan="1" \| ARPABET
	! colspan="1" \| ]
	! rowspan="1" \| ]
	! rowspan="1" \| Example
	\|-
	\| <code>ER</code>
	\| ''ur'', ''ər''
	\| {{IPA link\|ɝ}}, {{IPA link\|ɚ}}
	\| style="text-align:left" \| h'''ur'''t
	\|-
	\| <code>EY</code>
	\| ''ay''
	\| {{IPA\|eɪ}}
	\| style="text-align:left" \| '''a'''te
	\|-
	\| <code>IH</code>
	\| ''i'', ''ih''
	\| {{IPA link\|ɪ}}
	\| style="text-align:left" \| '''i'''t
	\|-
	\| <code>IY</code>
	\| ''ee''
	\| {{IPA link\|i}}
	\| style="text-align:left" \| '''ea'''t
	\|-
	\| <code>OW</code>
	\| ''oh''
	\| {{IPA\|oʊ}}
	\| style="text-align:left" \| '''oa'''t
	\|-
	\| <code>OY</code>
	\| ''oy''
	\| {{IPA\|ɔɪ}}
	\| style="text-align:left" \| t'''oy'''
	\|-
	\| <code>UH</code>
	\| ''uu''
	\| {{IPA link\|ʊ}}
	\| style="text-align:left" \| h'''oo'''d
	\|-
	\| <code>UW</code>
	\| ''oo''
	\| {{IPA link\|u}}
	\| style="text-align:left" \| t'''wo'''
	\|}

	{\| class="wikitable" style="display:inline-table; text-align: center; margin-right: 2em;"
	\|+ Stress
	! AB
	! Description
	\|-
	\| style="text-align:center" \| 0
	\| No stress
	\|-
	\| style="text-align:center" \| 1
	\| ]
	\|-
	\| style="text-align:center" \| 2
	\| ]
	\|}

	{\| class="wikitable" style="display:inline-table; text-align: center;"
	\|+ Consonants
	! colspan="1" \| ARPABET
	! colspan="1" \| ]
	! rowspan="1" \| ]
	! rowspan="1" \| Example
	\|-
	\| <code>B</code>
	\| ''b''
	\| {{IPA link\|b}}
	\| style="text-align:left" \| '''b'''e
	\|-
	\| <code>CH</code>
	\| ''ch'', ''tch''
	\| {{IPA link\|tʃ}}
	\| style="text-align:left" \| '''ch'''eese
	\|-
	\| <code>D</code>
	\| ''d''
	\| {{IPA link\|d}}
	\| style="text-align:left" \| '''d'''ee
	\|-
	\| <code>DH</code>
	\| ''dh''
	\| {{IPA link\|ð}}
	\| style="text-align:left" \| '''th'''ee
	\|-
	\| <code>F</code>
	\| ''f''
	\| {{IPA link\|f}}
	\| style="text-align:left" \| '''f'''ee
	\|-
	\| <code>G</code>
	\| ''g''
	\| {{IPA link\|ɡ}}
	\| style="text-align:left" \| '''g'''reen
	\|-
	\| <code>HH</code>
	\| ''h''
	\| {{IPA link\|h}}
	\| style="text-align:left" \| '''h'''e
	\|-
	\| <code>JH</code>
	\| ''j''
	\| {{IPA link\|dʒ}}
	\| style="text-align:left" \| '''g'''ee
	\|-
	\|}

	{\| class="wikitable" style="display:inline-table; text-align: center;"
	\|+ Consonants
	! colspan="1" \| ARPABET
	! colspan="1" \| ]
	! rowspan="1" \| ]
	! rowspan="1" \| Example
	\|-
	\| <code>K</code>
	\| ''k''
	\| {{IPA link\|k}}
	\| style="text-align:left" \| '''k'''ey
	\|-
	\| <code>L</code>
	\| ''l''
	\| {{IPA link\|l}}
	\| style="text-align:left" \| '''l'''ee
	\|-
	\| <code>M</code>
	\| ''m''
	\| {{IPA link\|m}}
	\| style="text-align:left" \| '''m'''e
	\|-
	\| <code>N</code>
	\| ''n''
	\| {{IPA link\|n}}
	\| style="text-align:left" \| '''kn'''ee
	\|-
	\| <code>NG</code>
	\| ''ng''
	\| {{IPA link\|ŋ}}
	\| style="text-align:left" \| pi'''ng'''
	\|-
	\| <code>P</code>
	\| ''p''
	\| {{IPA link\|p}}
	\| style="text-align:left" \| '''p'''ee
	\|-
	\| <code>R</code>
	\| ''r''
	\| {{IPA link\|r}}
	\| style="text-align:left" \| '''r'''ead
	\|-
	\| <code>S</code>
	\| ''s'', ''ss''
	\| {{IPA link\|s}}
	\| style="text-align:left" \| '''s'''ea
	\|}

	{\| class="wikitable" style="display:inline-table; text-align: center;"
	\|+ Consonants
	! colspan="1" \| ARPABET
	! colspan="1" \| ]
	! rowspan="1" \| ]
	! rowspan="1" \| Example
	\|-
	\| <code>SH</code>
	\| ''sh''
	\| {{IPA link\|ʃ}}
	\| style="text-align:left" \| '''sh'''e
	\|-
	\| <code>T</code>
	\| ''t''
	\| {{IPA link\|t}}
	\| style="text-align:left" \| '''t'''ea
	\|-
	\| <code>TH</code>
	\| ''th''
	\| {{IPA link\|θ}}
	\| style="text-align:left" \| '''th'''eta
	\|-
	\| <code>V</code>
	\| ''v''
	\| {{IPA link\|v}}
	\| style="text-align:left" \| '''v'''ee
	\|-
	\| <code>W</code>
	\| ''w'', ''wh''
	\| {{IPA link\|w}}
	\| style="text-align:left" \| '''w'''e
	\|-
	\| <code>Y</code>
	\| ''y''
	\| {{IPA link\|j}}
	\| style="text-align:left" \| '''y'''ield
	\|-
	\| <code>Z</code>
	\| ''z''
	\| {{IPA link\|z}}
	\| style="text-align:left" \| '''z'''ee
	\|-
	\| <code>ZH</code>
	\| ''zh''
	\| {{IPA link\|ʒ}}
	\| style="text-align:left" \| sei'''z'''ure
	\|}
	</div>
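
As a rough illustration of how the tables above can be used, the following Python sketch maps a space-separated ARPABET string to the IPA symbols listed in them. It is a simplification under stated assumptions: stress marks and vowel length are omitted from the output, the choice of ɚ for unstressed <code>ER</code> is an assumption (the table lists both ɝ and ɚ without tying them to stress), and the name <code>arpabet_to_ipa</code> is hypothetical.

```python
# Symbol-to-IPA mapping copied from the tables above.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "r", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}

# Vowels whose unstressed (stress digit 0) form uses a different symbol.
# AH0 -> ə comes from the table; ER0 -> ɚ is an assumption.
UNSTRESSED = {"AH": "ə", "ER": "ɚ"}


def arpabet_to_ipa(phonemes):
    """Convert e.g. 'AA1 R P AH0 B EH2 T' to an IPA string without stress marks."""
    out = []
    for token in phonemes.split():
        # Peel off a trailing stress digit, if present.
        stress = token[-1] if token[-1] in "012" else ""
        symbol = token[:-1] if stress else token
        if stress == "0" and symbol in UNSTRESSED:
            out.append(UNSTRESSED[symbol])
        else:
            out.append(ARPABET_TO_IPA[symbol])  # KeyError on unknown symbols
    return "".join(out)
```

Placing IPA stress marks correctly would require syllabification, which this sketch deliberately leaves out.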

			The application supported a simplified phonetic transcription known as ], to correct mispronunciations and account for ]—words that are spelled the same but are pronounced differently (such as the word ''read'', which can be pronounced as either {{IPAc-en\|ˈ\|r\|ɛ\|d}} or {{IPAc-en\|ˈ\|r\|iː\|d}} depending on its ]). It followed the ]'s ARPABET conventions.<ref name="automaton" />
	{{clear}}		{{clear}}

Line 519:		Line 228:
	{{See also\|Audio deepfake}}		{{See also\|Audio deepfake}}
	]'s ].<ref name="deepmind" />]]		]'s ].<ref name="deepmind" />]]
	In 2016, with the proposal of ]'s ], deep-learning-based models for speech synthesis began to gain popularity as a method of modeling waveforms and generating human-like speech.<ref name="arxiv1">{{cite arXiv \|last=Hsu \|first=Wei-Ning \|eprint=1810.07217 \|title=Hierarchical Generative Modeling for Controllable Speech Synthesis \|class=cs.CL \|date=2018 }}</ref><ref name="arxiv2">{{cite arXiv \|last=Habib \|first=Raza \|eprint=1910.01709 \|title=Semi-Supervised Generative Modeling for Controllable Speech Synthesis \|class=cs.CL \|date=2019 }}</ref><ref name="deepmind">{{cite web\|url=https://www.deepmind.com/blog/high-fidelity-speech-synthesis-with-wavenet\|title=High-fidelity speech synthesis with WaveNet\|last1=van den Oord\|first1=Aäron\|last2=Li\|first2=Yazhe\|last3=Babuschkin\|first3=Igor\|date=2017-11-12\|website=]\|access-date=2022-06-05\|archive-date=2022-06-18\|archive-url=https://web.archive.org/web/20220618205838/https://www.deepmind.com/blog/high-fidelity-speech-synthesis-with-wavenet\|url-status=live}}</ref~~><ref name="thebatch"/~~> Tacotron2, a neural network architecture for speech synthesis developed by ], was published in 2018 and required tens of hours of audio data to produce intelligible speech; when trained on 2 hours of speech, the model was able to produce intelligible speech with mediocre quality, and when trained on 36 minutes of speech, the model was unable to produce intelligible speech.<ref name="tacotron">{{cite web\|url=https://google.github.io/tacotron/publications/semisupervised/index.html\|title=Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"\|date=2018-08-30\|access-date=2022-06-05\|archive-date=2020-11-11\|archive-url=https://web.archive.org/web/20201111222714/https://google.github.io/tacotron/publications/semisupervised/index.html\|url-status=live}}</ref><ref name="arxiv3">{{cite arXiv \|eprint=1712.05884 \|title=Natural TTS Synthesis by Conditioning WaveNet on 
Mel-Spectrogram Predictions \|class=cs.CL \|date=2018 \|last1=Shen \|first1=Jonathan \|last2=Pang \|first2=Ruoming \|last3=Weiss \|first3=Ron J. \|last4=Schuster \|first4=Mike \|last5=Jaitly \|first5=Navdeep \|last6=Yang \|first6=Zongheng \|last7=Chen \|first7=Zhifeng \|last8=Zhang \|first8=Yu \|last9=Wang \|first9=Yuxuan \|last10=Skerry-Ryan \|first10=RJ \|last11=Saurous \|first11=Rif A. \|last12=Agiomyrgiannakis \|first12=Yannis \|last13=Wu \|first13=Yonghui }}</ref>		In 2016, with the proposal of ]'s ], deep-learning-based models for speech synthesis began to gain popularity as a method of modeling waveforms and generating high-fidelity human-like speech.<ref name="arxiv1">{{cite arXiv \|last=Hsu \|first=Wei-Ning \|eprint=1810.07217 \|title=Hierarchical Generative Modeling for Controllable Speech Synthesis \|class=cs.CL \|date=2018 }}</ref><ref name="arxiv2">{{cite arXiv \|last=Habib \|first=Raza \|eprint=1910.01709 \|title=Semi-Supervised Generative Modeling for Controllable Speech Synthesis \|class=cs.CL \|date=2019 }}</ref><ref name="deepmind">{{cite web\|url=https://www.deepmind.com/blog/high-fidelity-speech-synthesis-with-wavenet\|title=High-fidelity speech synthesis with WaveNet\|last1=van den Oord\|first1=Aäron\|last2=Li\|first2=Yazhe\|last3=Babuschkin\|first3=Igor\|date=2017-11-12\|website=]\|access-date=2022-06-05\|archive-date=2022-06-18\|archive-url=https://web.archive.org/web/20220618205838/https://www.deepmind.com/blog/high-fidelity-speech-synthesis-with-wavenet\|url-status=live}}</ref> Tacotron2, a neural network architecture for speech synthesis developed by ], was published in 2018 and required tens of hours of audio data to produce intelligible speech; when trained on 2 hours of speech, the model was able to produce intelligible speech with mediocre quality, and when trained on 36 minutes of speech, the model was unable to produce intelligible speech.<ref name="tacotron">{{cite 
web\|url=https://google.github.io/tacotron/publications/semisupervised/index.html\|title=Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"\|date=2018-08-30\|access-date=2022-06-05\|archive-date=2020-11-11\|archive-url=https://web.archive.org/web/20201111222714/https://google.github.io/tacotron/publications/semisupervised/index.html\|url-status=live}}</ref><ref name="arxiv3">{{cite arXiv \|eprint=1712.05884 \|title=Natural TTS Synthesis by Conditioning WaveNet on Mel-Spectrogram Predictions \|class=cs.CL \|date=2018 \|last1=Shen \|first1=Jonathan \|last2=Pang \|first2=Ruoming \|last3=Weiss \|first3=Ron J. \|last4=Schuster \|first4=Mike \|last5=Jaitly \|first5=Navdeep \|last6=Yang \|first6=Zongheng \|last7=Chen \|first7=Zhifeng \|last8=Zhang \|first8=Yu \|last9=Wang \|first9=Yuxuan \|last10=Skerry-Ryan \|first10=RJ \|last11=Saurous \|first11=Rif A. \|last12=Agiomyrgiannakis \|first12=Yannis \|last13=Wu \|first13=Yonghui }}</ref>

	For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of scientific researchers in the field of deep learning speech synthesis.<ref>{{cite arXiv \|last=Chung \|first=Yu-An \|eprint=1808.10128 \|title=Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis \|class=cs.CL \|date=2018 }}</ref><ref>{{cite arXiv \|last=Ren \|first=Yi \|eprint=1905.06791 \|title=Almost Unsupervised Text to Speech and Automatic Speech Recognition \|class=cs.CL \|date=2019 }}</ref> The developer of 15.ai claims that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.<ref name="eurogamer"/>		For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of scientific researchers in the field of deep learning speech synthesis.<ref>{{cite arXiv \|last=Chung \|first=Yu-An \|eprint=1808.10128 \|title=Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis \|class=cs.CL \|date=2018 }}</ref><ref>{{cite arXiv \|last=Ren \|first=Yi \|eprint=1905.06791 \|title=Almost Unsupervised Text to Speech and Automatic Speech Recognition \|class=cs.CL \|date=2019 }}</ref> The developer of 15.ai claims that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.<ref name="eurogamer"/>
Line 526:		Line 235:
	{{Main\|Authors Guild, Inc. v. Google, Inc.}}		{{Main\|Authors Guild, Inc. v. Google, Inc.}}
	A landmark case between ] and the ] in 2013 ruled that ]—a service that searches the full text of printed copyrighted books—was ], thus meeting all requirements for fair use.<ref>- F.2d – (2d Cir, 2015). (temporary cites: 2015 U.S. App. LEXIS 17988;		A landmark case between ] and the ] in 2013 ruled that ]—a service that searches the full text of printed copyrighted books—was ], thus meeting all requirements for fair use.<ref>- F.2d – (2d Cir, 2015). (temporary cites: 2015 U.S. App. LEXIS 17988;
	{{Dead link\|date=September 2024 \|bot=InternetArchiveBot \|fix-attempted=yes }} (October 16, 2015))</ref> This case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a ] or a ''non-commercial'' ] was deemed legal. The legality of ''commercial'' generative models trained using copyrighted material is still under debate; due to the black-box nature of machine learning models, any allegations of copyright infringement via direct competition would be difficult to prove.{{cn\|date=June 2024}}		{{Dead link\|date=September 2024 \|bot=InternetArchiveBot \|fix-attempted=yes }} (October 16, 2015))</ref> This case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a ] or a ''non-commercial'' ] was deemed legal. The legality of ''commercial'' generative models trained using copyrighted material is still under debate; due to the black-box nature of machine learning models, any allegations of copyright infringement via direct competition would be difficult to prove.{{citation needed\|date=June 2024}}

	== Development ==		== Development ==
	15.ai was designed and created by an anonymous research scientist ~~affiliated with the ]~~ known by the alias ''15''.~~<ref~~ ~~name~~=~~"twitter">~~		15.ai was designed and created by an anonymous research scientist known by the alias ''15''.{{Citation needed\|date=October 2024}}
	{{cite web
	\|url= https://twitter.com/fifteenai
	\|title= 15
	\|last=
	\|first=
	\|date= 2022-06-09
	\|website= ]
	\|publisher=
	\|access-date= 2022-06-09
	\|quote= }}
	</ref>

			The algorithm used by the project was dubbed '''DeepThroat.'''<ref name="15aiabout">{{cite web \|last= \|first= \|date=2022-02-20 \|title=15.ai – About \|url=https://15.ai/about \|url-status=dead \|archive-url=https://archive.today/20211006074716/https://15.ai/about \|archive-date=2021-10-06 \|access-date=2022-02-20 \|website=15.ai \|publisher= \|quote=}}</ref> The developer said the project and algorithm were conceived as part of MIT's ], and had been in development for years before the first release of the application.<ref name="automaton"/>
	According to posts made by its developer on ], 15.ai costs several thousands of dollars per month to operate; they are able to support the project due to a successful startup ].<ref name="hn">{{cite web
	\|url= https://news.ycombinator.com/item?id=31711118
	\|title= 15.ai
	\|last=
	\|first=
	\|date= 2022-06-12
	\|website= ]
	\|publisher=
	\|access-date= 2022-06-13
	\|quote=
	\|archive-date= 2022-06-13
	\|archive-url= https://web.archive.org/web/20220613000443/http://news.ycombinator.com/item?id=31711118
	\|url-status= live
	}}</ref> The developer has stated that during their undergraduate years at MIT, they were paid the ] to work on a related project (approximately $14 an hour in ]<ref>{{cite web
	\|url= https://urop.mit.edu/guidelines/participation-considerations/pay-credit-volunteer/
	\|title= Pay, Credit & Volunteer
	\|last=
	\|first=
	\|date=
	\|website= ] ]
	\|publisher=
	\|access-date= 2022-06-13
	\|quote=
	\|archive-date= 2022-06-19
	\|archive-url= https://web.archive.org/web/20220619234437/https://urop.mit.edu/guidelines/participation-considerations/pay-credit-volunteer/
	\|url-status= live
	}}</ref>) that eventually evolved into 15.ai. They also stated that the democratization of voice cloning technology is not the only function of the website; in response to a user asking whether the research could be conducted without a public website, the developer wrote:
	{{Blockquote
	\|text= The website has multiple purposes. It serves as a ] of a platform that allows anyone to create ], even if they can't hire someone to voice their projects.

			]'s /mlp/ board has been integral to the development of 15.ai.<ref name="gwern">{{cite journal \|last=Branwen \|first=Gwern \|date=2020-03-06 \|title="15.ai"⁠, 15, Pony Preservation Project \|url=https://www.gwern.net/docs/ai/music/index#15-project-2020-section \|url-status=live \|publisher=Gwern \|archive-url=https://web.archive.org/web/20220318160737/https://www.gwern.net/docs/ai/music/index#15-project-2020-section \|archive-date=2022-03-18 \|access-date=2022-06-17 \|website=Gwern.net}}</ref>]]
	It also demonstrates the progress of my research in a far more engaging manner—by being able to use the actual model, you can discover things about it that even I wasn't aware of (such as getting characters to make gasping noises or moans by placing commas in between certain phonemes).
			The developer also worked closely with the Pony Preservation Project from /mlp/, the '']'' ] of ]. This project was a "collaborative effort by /mlp/ to build and curate pony datasets" with the aim of creating applications in artificial intelligence.<ref>{{cite web

	It also doesn't let me get away with ] and ] (which I believe is a big problem endemic in ] today—it's disingenuous and misleading). Being able to interact with the model with no filter allows the user to judge exactly how good the current work is at face value.
	\|author=15ai, ''Hacker News''<ref name="hn"/>
	}}

	The algorithm used by the project to facilitate the cloning of voices with minimal viable data has been dubbed '''DeepThroat'''<ref name="15aiabout">{{cite web \|last= \|first= \|date=2022-02-20 \|title=15.ai – About \|url=https://15.ai/about \|url-status=dead \|archive-url=https://archive.today/20211006074716/https://15.ai/about \|archive-date=2021-10-06 \|access-date=2022-02-20 \|website=15.ai \|publisher= \|quote=}}</ref> (a ] in reference to ] using ] and the sexual act of ]). The project and algorithm—initially conceived as part of MIT's ]—had been in development for years before the first release of the application.<ref name="automaton"/>

	]'s /mlp/ board has been integral to the development of 15.ai.<ref name="gwern"/>]]
	The developer has also worked closely with the Pony Preservation Project from /mlp/, the '']'' ] of ]. The '''Pony Preservation Project''', which began in 2019, is a "collaborative effort by /mlp/ to build and curate pony datasets" with the aim of creating applications in artificial intelligence.<ref name="gwern">{{cite journal
	\|url= https://www.gwern.net/docs/ai/music/index#15-project-2020-section
	\|title= "15.ai"⁠, 15, Pony Preservation Project
	\|last= Branwen
	\|first= Gwern
	\|date= 2020-03-06
	\|website= Gwern.net
	\|publisher= Gwern
	\|access-date= 2022-06-17
	\|url-status= live
	\|archive-date= 2022-03-18
	\|archive-url= https://web.archive.org/web/20220318160737/https://www.gwern.net/docs/ai/music/index#15-project-2020-section
	}}</ref><ref>{{cite web
	\|url= https://www.equestriadaily.com/2020/03/neat-pony-preservation-project-using.html		\|url= https://www.equestriadaily.com/2020/03/neat-pony-preservation-project-using.html
	\|title= Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices		\|title= Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices
Line 614:		Line 264:
	\|publisher= Desuarchive		\|publisher= Desuarchive
	\|access-date= 2022-02-20		\|access-date= 2022-02-20
	\|quote= }}</ref> The ''Friendship Is Magic'' voices on 15.ai were trained on a large dataset ]d by the ~~Pony Preservation Project~~: audio and dialogue from the show and related media—including ], ], ], ], and various other content voiced by the same voice actors—were ], ], and ] to remove background noise. ~~According to the developer, the collective efforts and constructive criticism from the Pony Preservation Project have been integral to the development of 15.ai.<ref name="gwern"/>~~		\|quote= }}</ref> The ''Friendship Is Magic'' voices on 15.ai were trained on a large dataset ]d by the project: audio and dialogue from the show and related media—including ], ], ], ], and various other content voiced by the same voice actors—were ], ], and ] to remove background noise.

	In addition, the developer has stated that the logo of 15.ai, which features a robotic ], is an homage to the fact that her voice (as originally portrayed by ]) was indispensable to the implementation of emotional contextualizers.<ref name="hn"/>

	== Reception ==		== Reception ==
			15.ai met with a largely positive reception. Liana Ruppert of '']'' described it as "simplistically brilliant" and José Villalobos of '']'' wrote that it "works as easy as it looks."<ref name="LaPS4"/>{{efn\|Translated from original quote written in Spanish: ''"La dirección es 15.AI y funciona tan fácil como parece."''<ref name="LaPS4"/>}} Users praised the ability to easily create audio of popular characters that sound believable to those unaware they had been synthesized. Zack Zwiezen of '']'' reported that " girlfriend was convinced it was a new voice line from ]' voice actor, ]".<ref name="kotaku"/>
	] wrote that the technology behind 15.ai could potentially open up to cases of ].]]
	15.ai has been met with largely positive reception. Liana Ruppert of '']'' described 15.ai as "simplistically brilliant."<ref name="gameinformer"/> Lauren Morton of '']'' and Natalie Clayton of '']'' called it "fascinating,"<ref name="rockpapershotgun"/><ref name="pcgamer"/> and José Villalobos of '']'' wrote that it "works as easy as it looks."<ref name="LaPS4"/>{{efn\|Translated from original quote written in Spanish: ''"La dirección es 15.AI y funciona tan fácil como parece."''<ref name="LaPS4"/>}} Users praised the ability to easily create audio of popular characters that sound believable to those unaware that the voices had been synthesized by artificial intelligence: Zack Zwiezen of '']'' reported that " girlfriend was convinced it was a new voice line from ]' voice actor, ],"<ref name="kotaku"/> while Rionaldi Chandraseta of ''Towards Data Science'' wrote that, upon watching a ] video featuring popular character voices generated by 15.ai, " first thought was the video creator used ] to pay for new dialogues from the original voice actors" and stated that "the quality of voices done by 15.ai is miles ahead of ."

	Reception overseas has also been largely positive, especially in ]. Takayuki Furushima of ''Den Fami Nico Gamer'' described 15.ai as "like magic," and Yuki Kurosawa of ''Automaton Media'' called it "revolutionary."<ref name="Denfaminicogamer"/><ref name="automaton"/>

	Computer scientist and technology entrepreneur ] commented in his newsletter ''The Batch'' that the technology behind 15.ai could be "enormously productive" and could "revolutionize the use of ]s"; he also noted that "synthesizing a human actor's voice without consent is arguably unethical and possibly illegal" and could potentially open up to cases of ].<ref name="thebatch"/><ref name="batch"/> In his blog '']'', ] ] deemed 15 one of the "most underrated talents in AI and machine learning."<ref>{{cite web
	\|url= https://marginalrevolution.com/marginalrevolution/2022/05/the-most-underrated-talent-in-ai.html
	\|title= The most underrated talent in AI?
	\|last= Cowen
	\|first= Tyler
	\|date= 2022-05-12
	\|website= ]
	\|access-date= 2022-06-16
	\|url-status= live
	\|archive-date= 2022-06-19
	\|archive-url= https://web.archive.org/web/20220619203626/https://marginalrevolution.com/marginalrevolution/2022/05/the-most-underrated-talent-in-ai.html
	}}</ref>

	== Impact ==		== Impact ==
	=== Fandom content creation ===		=== Fandom content creation ===
	<!-- Deleted image removed: ] -->		<!-- Deleted image removed: ] -->
	15.ai ~~has been~~ frequently used for ] in various ]s, including the ], the '']'' fandom, the '']'' fandom, and the '']'' fandom, with numerous videos and projects containing speech from 15.ai having gone ].<ref name="kotaku" /><ref name="gameinformer" />		15.ai was frequently used for ] in various ]s, including the ], the '']'' fandom, the '']'' fandom, and the '']'' fandom, with numerous videos and projects containing speech from 15.ai having gone ].<ref name="kotaku" /><ref name="gameinformer" />

	The ''My Little Pony: Friendship Is Magic'' fandom ~~has seen~~ a resurgence in video and musical content creation as a ~~direct~~ result, inspiring a new genre of fan-created content assisted by artificial intelligence. Some ] ~~have been~~ adapted into fully voiced "episodes": ''The Tax Breaks'' is a 17-minute long animated video rendition of a fan-written story published in 2014 that uses voices generated from 15.ai with ] and ], emulating the episodic style of the early seasons of ''Friendship Is Magic''.<ref name="taxbreaks">{{cite web		The ''My Little Pony: Friendship Is Magic'' fandom saw a resurgence in video and musical content creation as a result, inspiring a new genre of fan-created content assisted by artificial intelligence. Some ]s were adapted into fully voiced "episodes": ''The Tax Breaks'' is a 17-minute long animated video rendition of a fan-written story published in 2014 that uses voices generated from 15.ai with ] and ], emulating the episodic style of the early seasons of ''Friendship Is Magic''.<ref name="taxbreaks">{{cite web
	\|url= https://www.equestriadaily.com/2022/05/full-simple-animated-episode-tax-breaks.html		\|url= https://www.equestriadaily.com/2022/05/full-simple-animated-episode-tax-breaks.html
	\|title= Full Simple Animated Episode – The Tax Breaks (Twilight)		\|title= Full Simple Animated Episode – The Tax Breaks (Twilight)
Line 656:		Line 288:
	}}</ref><ref>{{Cite book \|date=27 April 2014 \|title=The Terribly Taxing Tribulations of Twilight Sparkle \|url=https://www.fimfiction.net/story/185725 \|url-status=live \|archive-url=https://web.archive.org/web/20220630170105/https://www.fimfiction.net/story/185725 \|archive-date=30 June 2022 \|access-date=28 April 2022 \|website=Fimfiction.net}}</ref>		}}</ref><ref>{{Cite book \|date=27 April 2014 \|title=The Terribly Taxing Tribulations of Twilight Sparkle \|url=https://www.fimfiction.net/story/185725 \|url-status=live \|archive-url=https://web.archive.org/web/20220630170105/https://www.fimfiction.net/story/185725 \|archive-date=30 June 2022 \|access-date=28 April 2022 \|website=Fimfiction.net}}</ref>

	Viral videos from the ''Team Fortress 2'' fandom ~~that feature~~ voices from 15.ai include ''Spy is a ]'' (which ~~has~~ gained over 3 million views on YouTube ~~total~~ across multiple videos<ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=TAmhr6Was3E\|title=SPY IS A FURRY\|work=]\|date=January 17, 2021 \|access-date=June 14, 2022\|archive-date=June 13, 2022\|archive-url=https://web.archive.org/web/20220613094918/https://www.youtube.com/watch?v=TAmhr6Was3E\|url-status=live}}</ref><ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=lwQn7ISVV_8\|title=Spy is a Furry Animated\|work=]\|access-date=June 14, 2022\|archive-date=June 14, 2022\|archive-url=https://web.archive.org/web/20220614203255/https://www.youtube.com/watch?v=lwQn7ISVV_8\|url-status=live}}</ref><ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=r0FLyW86owo\|title= – Spy's Confession – \|work=]\|date=January 15, 2021 \|access-date=June 14, 2022\|archive-date=June 30, 2022\|archive-url=https://web.archive.org/web/20220630170113/https://www.youtube.com/watch?v=r0FLyW86owo\|url-status=live}}</ref>) and ''The RED Bread Bank'', both of which ~~have~~ inspired ] animated video renditions.<ref name="automaton"/> Other fandoms ~~have~~ used voices from 15.ai to produce viral videos. 
{{As of\|July 2022}}, the viral video ''] Struggles'' (~~which uses~~ voices from ''Friendship Is Magic'') ~~has~~ over 5.5 million views on YouTube;<ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=UPE3vnLY3TE\|title=Among Us Struggles\|work=]\|date=September 21, 2020 \|access-date=July 15, 2022}}</ref> ], ], and ] streamers ~~have~~ also used 15.ai for their videos, such as FitMC's video on the history of ]—one of the oldest running '']'' servers—and datpon3's TikTok video featuring the main characters of ''Friendship Is Magic'', which have 1.4 million and 510 thousand views, respectively.<ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=1V1O2gTdqHw\|title=The UPDATED 2b2t Timeline (2010–2020)\|work=]\|date=March 14, 2020 \|access-date=June 14, 2022\|archive-date=June 1, 2022\|archive-url=https://web.archive.org/web/20220601085855/https://www.youtube.com/watch?v=1V1O2gTdqHw\|url-status=live}}</ref><ref group="tt">{{cite web\|url=https://www.tiktok.com/@datpon3/video/6813618431217241350\|title=She said " 👹 " \|work=]\|access-date=July 15, 2022}}</ref>		Viral videos from the ''Team Fortress 2'' fandom featuring voices from 15.ai include ''Spy is a ]'' (which gained over 3 million views on YouTube across multiple videos<ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=TAmhr6Was3E\|title=SPY IS A FURRY\|work=]\|date=January 17, 2021 \|access-date=June 14, 2022\|archive-date=June 13, 2022\|archive-url=https://web.archive.org/web/20220613094918/https://www.youtube.com/watch?v=TAmhr6Was3E\|url-status=live}}</ref><ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=lwQn7ISVV_8\|title=Spy is a Furry Animated\|work=]\|access-date=June 14, 2022\|archive-date=June 14, 2022\|archive-url=https://web.archive.org/web/20220614203255/https://www.youtube.com/watch?v=lwQn7ISVV_8\|url-status=live}}</ref><ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=r0FLyW86owo\|title= – Spy's Confession – \|work=]\|date=January 
15, 2021 \|access-date=June 14, 2022\|archive-date=June 30, 2022\|archive-url=https://web.archive.org/web/20220630170113/https://www.youtube.com/watch?v=r0FLyW86owo\|url-status=live}}</ref>) and ''The RED Bread Bank'', both of which inspired ] animated video renditions.<ref name="automaton"/> Other fandoms used voices from 15.ai to produce viral videos. {{As of\|July 2022}}, the viral video ''] Struggles'' (with voices from ''Friendship Is Magic'') had over 5.5 million views on YouTube;<ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=UPE3vnLY3TE\|title=Among Us Struggles\|work=]\|date=September 21, 2020 \|access-date=July 15, 2022}}</ref> ], ], and ] streamers also used 15.ai for their videos, such as FitMC's video on the history of ]—one of the oldest running '']'' servers—and datpon3's TikTok video featuring the main characters of ''Friendship Is Magic'', which have 1.4 million and 510 thousand views, respectively.<ref group="yt">{{cite web\|url=https://www.youtube.com/watch?v=1V1O2gTdqHw\|title=The UPDATED 2b2t Timeline (2010–2020)\|work=]\|date=March 14, 2020 \|access-date=June 14, 2022\|archive-date=June 1, 2022\|archive-url=https://web.archive.org/web/20220601085855/https://www.youtube.com/watch?v=1V1O2gTdqHw\|url-status=live}}</ref><ref group="tt">{{cite web\|url=https://www.tiktok.com/@datpon3/video/6813618431217241350\|title=She said " 👹 " \|work=]\|access-date=July 15, 2022}}</ref>

	Some users ~~have~~ created AI ]s using 15.ai and external voice control software. One user on Twitter created a personal desktop assistant inspired by ] using 15.ai-generated dialogue in tandem with voice control system VoiceAttack~~, with the program being able to boot up applications, utter corresponding random dialogues, and thank the user in response to actions~~.<ref name="automaton"/><ref name="Denfaminicogamer"/>		Some users created AI ]s using 15.ai and external voice control software. One user on Twitter created a personal desktop assistant inspired by ] using 15.ai-generated dialogue in tandem with voice control system VoiceAttack.<ref name="automaton"/><ref name="Denfaminicogamer"/>

	=== Troy Baker / Voiceverse NFT plagiarism scandal ===		=== Troy Baker / Voiceverse NFT plagiarism scandal ===
Line 678:		Line 310:
	In December 2021, the developer of 15.ai posted on ] that they had no interest in incorporating ] (NFTs) into their work.<ref name="wccftech"/><ref name="stevivor"/><ref group="tweet">{{Cite tweet \|user=fifteenai \|number=1470190153188749313\|date = December 12, 2021 \|title=I have no interest in incorporating NFTs into any aspect of my work. Please stop asking.}}</ref>		In December 2021, the developer of 15.ai posted on ] that they had no interest in incorporating ] (NFTs) into their work.<ref name="wccftech"/><ref name="stevivor"/><ref group="tweet">{{Cite tweet \|user=fifteenai \|number=1470190153188749313\|date = December 12, 2021 \|title=I have no interest in incorporating NFTs into any aspect of my work. Please stop asking.}}</ref>

	On January 14, 2022, it was discovered that Voiceverse NFT, a company that video game and ] ] ] ] announced his partnership with, had plagiarized voice lines generated from 15.ai as part of their marketing campaign.<ref name="nme"/><ref name="stevivor"/><ref name="techtimes"/> ] showed that Voiceverse had generated audio of ] and ] from the show '']'' using 15.ai, pitched them up to make them sound unrecognizable from the original voices, and appropriated them without ~~proper~~ credit to falsely market their own ~~platform—a~~ violation of 15.ai's terms of service.<ref name="eurogamer">{{cite web		On January 14, 2022, it was discovered that Voiceverse NFT, a company that video game and ] ] ] ] announced his partnership with, had plagiarized voice lines generated from 15.ai as part of their marketing campaign.<ref name="nme"/><ref name="stevivor"/><ref name="techtimes"/> ] showed that Voiceverse had generated audio of ] and ] from the show '']'' using 15.ai, pitched them up to make them sound unrecognizable from the original voices, and appropriated them without credit, to falsely market their own platform—in violation of 15.ai's terms of service.<ref name="eurogamer">{{cite web
	\|url= https://www.eurogamer.net/articles/2022-01-17-troy-baker-backed-nft-firm-admits-using-voice-lines-taken-from-another-service-without-permission		\|url= https://www.eurogamer.net/articles/2022-01-17-troy-baker-backed-nft-firm-admits-using-voice-lines-taken-from-another-service-without-permission
	\|title= Troy Baker-backed NFT firm admits using voice lines taken from another service without permission		\|title= Troy Baker-backed NFT firm admits using voice lines taken from another service without permission
Line 746:		Line 378:
	\|archive-url= https://web.archive.org/web/20220114104215/https://www.eurogamer.net/articles/2022-01-14-video-game-voice-actor-troy-baker-is-now-promoting-nfts		\|archive-url= https://web.archive.org/web/20220114104215/https://www.eurogamer.net/articles/2022-01-14-video-game-voice-actor-troy-baker-is-now-promoting-nfts
	\|url-status= live		\|url-status= live
	}}</ref> Commentators also pointed out the irony in Baker's initial Tweet announcing the partnership, which ended with "You can hate. Or you can create. What'll it be?", hours before the public revelation that the company in question had resorted to theft instead of creating their own product. Baker responded that he appreciated people sharing their thoughts and their responses were "giving a lot to think about."<ref>{{Cite web\|last=McWhertor\|first=Michael\|date=2022-01-14\|title=The Last of Us voice actor wants to sell 'voice NFTs,' drawing ire\|url=https://www.polygon.com/22883752/troy-baker-nfts-voice-last-of-us-bioshock\|access-date=2022-01-14\|website=Polygon\|language=en-US\|archive-date=2022-01-14\|archive-url=https://web.archive.org/web/20220114174747/https://www.polygon.com/22883752/troy-baker-nfts-voice-last-of-us-bioshock\|url-status=live}}</ref><ref>{{Cite web\|title=Last Of Us Voice Actor Pisses Everyone Off With NFT Push\|url=https://kotaku.com/last-of-us-voice-actor-pisses-everyone-off-with-nft-pus-1848360093\|access-date=2022-01-14\|website=Kotaku\|date=January 14, 2022\|language=en-us\|archive-date=2022-01-14\|archive-url=https://web.archive.org/web/20220114154523/https://kotaku.com/last-of-us-voice-actor-pisses-everyone-off-with-nft-pus-1848360093\|url-status=live}}</ref> ~~He also acknowledged that the "hate/create" part in his initial Tweet might have been "a bit antagonistic,"~~ and asked fans on social media to forgive him.<ref name="stevivor"/><ref>{{Cite web\|last=Purslow\|first=Matt\|date=2022-01-14\|title=Troy Baker Is Working With NFTs, but Fans Are Unimpressed\|url=https://www.ign.com/articles/troy-baker-nft-voiceverse\|access-date=2022-01-14\|website=IGN\|language=en\|archive-date=2022-01-14\|archive-url=https://web.archive.org/web/20220114130245/https://www.ign.com/articles/troy-baker-nft-voiceverse\|url-status=live}}</ref> Two weeks later~~, on January 31~~, Baker ~~announced that he would discontinue~~ his partnership with 
Voiceverse.<ref name="tweaktown">{{cite web		}}</ref> Commentators also pointed out the irony in Baker's initial Tweet announcing the partnership, which ended with "You can hate. Or you can create. What'll it be?", hours before the public revelation that the company in question had resorted to theft instead of creating their own product. Baker responded that he appreciated people sharing their thoughts and their responses were "giving a lot to think about,"<ref>{{Cite web\|last=McWhertor\|first=Michael\|date=2022-01-14\|title=The Last of Us voice actor wants to sell 'voice NFTs,' drawing ire\|url=https://www.polygon.com/22883752/troy-baker-nfts-voice-last-of-us-bioshock\|access-date=2022-01-14\|website=Polygon\|language=en-US\|archive-date=2022-01-14\|archive-url=https://web.archive.org/web/20220114174747/https://www.polygon.com/22883752/troy-baker-nfts-voice-last-of-us-bioshock\|url-status=live}}</ref><ref>{{Cite web\|title=Last Of Us Voice Actor Pisses Everyone Off With NFT Push\|url=https://kotaku.com/last-of-us-voice-actor-pisses-everyone-off-with-nft-pus-1848360093\|access-date=2022-01-14\|website=Kotaku\|date=January 14, 2022\|language=en-us\|archive-date=2022-01-14\|archive-url=https://web.archive.org/web/20220114154523/https://kotaku.com/last-of-us-voice-actor-pisses-everyone-off-with-nft-pus-1848360093\|url-status=live}}</ref> and asked fans on social media to forgive him.<ref name="stevivor"/><ref>{{Cite web\|last=Purslow\|first=Matt\|date=2022-01-14\|title=Troy Baker Is Working With NFTs, but Fans Are Unimpressed\|url=https://www.ign.com/articles/troy-baker-nft-voiceverse\|access-date=2022-01-14\|website=IGN\|language=en\|archive-date=2022-01-14\|archive-url=https://web.archive.org/web/20220114130245/https://www.ign.com/articles/troy-baker-nft-voiceverse\|url-status=live}}</ref> Two weeks later, Baker discontinued his partnership with Voiceverse.<ref name="tweaktown">{{cite web
	\|url= https://www.tweaktown.com/news/84299/last-of-us-actor-troy-baker-heeds-fans-abandons-nft-plans/index.html		\|url= https://www.tweaktown.com/news/84299/last-of-us-actor-troy-baker-heeds-fans-abandons-nft-plans/index.html
	\|title= Last of Us actor Troy Baker heeds fans, abandons NFT plans		\|title= Last of Us actor Troy Baker heeds fans, abandons NFT plans
Line 757:		Line 389:
	\|archive-date= 2022-01-31		\|archive-date= 2022-01-31
	\|archive-url= https://web.archive.org/web/20220131172752/https://www.tweaktown.com/news/84299/last-of-us-actor-troy-baker-heeds-fans-abandons-nft-plans/index.html		\|archive-url= https://web.archive.org/web/20220131172752/https://www.tweaktown.com/news/84299/last-of-us-actor-troy-baker-heeds-fans-abandons-nft-plans/index.html
	\|url-status= live
	}}</ref><ref name="wgtc">{{cite web
	\|url= https://wegotthiscovered.com/gaming/the-last-of-us-actor-troy-baker-reverses-course-on-nfts-amid-fan-backlash/
	\|title= 'The Last of Us' actor Troy Baker reverses course on NFTs amid fan backlash
	\|last= Peterson
	\|first= Danny
	\|date= 2022-01-31
	\|website= We Got This Covered
	\|access-date= 2022-02-14
	\|quote=
	\|archive-date= 2022-02-14
	\|archive-url= https://web.archive.org/web/20220214191046/https://wegotthiscovered.com/gaming/the-last-of-us-actor-troy-baker-reverses-course-on-nfts-amid-fan-backlash/
	\|url-status= live		\|url-status= live
	}}</ref><ref>{{Cite web\|last=Peters\|first=Jay\|date=2022-01-31\|title=The voice of Joel from The Last of Us steps away from NFT project after outcry\|url=https://www.theverge.com/2022/1/31/22910633/troy-baker-voiceverse-nft-voice-actor-project-the-last-of-us\|access-date=2022-02-04\|website=The Verge\|language=en\|archive-date=2022-02-04\|archive-url=https://web.archive.org/web/20220204042246/https://www.theverge.com/2022/1/31/22910633/troy-baker-voiceverse-nft-voice-actor-project-the-last-of-us\|url-status=live}}</ref>		}}</ref><ref>{{Cite web\|last=Peters\|first=Jay\|date=2022-01-31\|title=The voice of Joel from The Last of Us steps away from NFT project after outcry\|url=https://www.theverge.com/2022/1/31/22910633/troy-baker-voiceverse-nft-voice-actor-project-the-last-of-us\|access-date=2022-02-04\|website=The Verge\|language=en\|archive-date=2022-02-04\|archive-url=https://web.archive.org/web/20220204042246/https://www.theverge.com/2022/1/31/22910633/troy-baker-voiceverse-nft-voice-actor-project-the-last-of-us\|url-status=live}}</ref>

	===Reactions from voice actors===		===Reactions from voice actors===
	Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about ], unauthorized use of an actor's voice in ], and the potential of ].<ref name="~~thebatch~~"/>~~<ref name="batch">{{cite web~~		Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about ], unauthorized use of an actor's voice in ], and the potential of ].<ref name="wccftech"/>
	\|url= https://read.deeplearning.ai/the-batch/issue-83/
	\|title= Weekly Newsletter Issue 83
	\|last= Ng
	\|first= Andrew
	\|date= 2021-03-07
	\|website= The Batch
	\|access-date= 2021-03-07
	\|quote=
	\|archive-date= 2022-02-26
	\|archive-url= https://web.archive.org/web/20220226175907/https://read.deeplearning.ai/the-batch/issue-83/
	\|url-status= live
	}}</ref><ref name="wccftech"/>

	== See also ==		== See also ==
Line 812:		Line 420:

	==External links==		==External links==
			*
	* {{Official website\|15.ai}}		* {{Official website\|15.ai}}
	* {{Twitter \| id= fifteenai \| name= 15 }}		* {{Twitter \| id= fifteenai \| name= 15 }}

Revision as of 00:21, 15 November 2024

Real-time text-to-speech tool using artificial intelligence

This article has multiple issues. Please help improve it or discuss these issues on the talk page.

A major contributor to this article appears to have a close connection with its subject. It may require cleanup to comply with Wikipedia's content policies, particularly neutral point of view. Please discuss further on the talk page. (October 2024)

The neutrality of this article is disputed. Relevant discussion may be found on the talk page. Please do not remove this message until conditions to do so are met. (October 2024)

This article may contain citations that do not verify the text. Please check for citation inaccuracies. (October 2024)


15.ai
Type of site	Artificial intelligence, speech synthesis, machine learning, deep learning
Available in	English
Founder(s)	15
URL	15.ai
Commercial	No
Registration	None
Launched	Initial release: March 12, 2020; 4 years ago. Stable release: v24.2.1
Current status	Under maintenance

15.ai was a freeware artificial intelligence web application, launched in 2020, that generated text-to-speech voices of fictional characters from various media sources. Created by a pseudonymous developer under the alias 15, the project used a combination of audio synthesis algorithms, speech synthesis deep neural networks, and sentiment analysis models to generate emotive character voices faster than real-time.

In early 2020, 15.ai appeared online as a proof of concept of the democratization of voice acting and dubbing. Its gratis nature, ease of use without user accounts, and improvements over existing text-to-speech implementations made it popular. Some critics and voice actors questioned the legality and ethicality of making such technology so readily accessible.

The site was credited as the impetus behind the popularization of AI voice cloning (also known as audio deepfakes) in content creation. It was embraced by Internet fandoms such as My Little Pony, Team Fortress 2, and SpongeBob SquarePants.

Several commercial alternatives appeared in the following years. In January 2022, the company Voiceverse NFT plagiarized 15.ai's work as part of their platform.

In September 2022, a year after its last stable release, 15.ai was taken offline. As of November 2024, the website was still offline, with the creator's most recent post being dated February 2023.

Features

HAL 9000, known for his sinister robotic voice, is one of the available characters on 15.ai.

Available characters included GLaDOS and Wheatley from Portal, characters from Team Fortress 2, Twilight Sparkle and other characters from My Little Pony: Friendship Is Magic, SpongeBob, Daria Morgendorffer and Jane Lane from Daria, the Tenth Doctor from Doctor Who, HAL 9000 from 2001: A Space Odyssey, the Narrator from The Stanley Parable, Carl Brutananadilewski from Aqua Teen Hunger Force, Steven Universe, Dan from Dan Vs., and Sans from Undertale.

The deep learning model used by the application was nondeterministic: each time speech was generated from the same string of text, the intonation changed slightly. The application supported manually altering the emotion of a generated line using emotional contextualizers (a term coined by this project): sentences or phrases that conveyed the emotion of the take and served as a guide for the model during inference. Emotional contextualizers were representations of the emotional content of a sentence, deduced via transfer-learned emoji embeddings using DeepMoji, a deep neural network sentiment analysis algorithm developed by the MIT Media Lab in 2017. DeepMoji was trained on 1.2 billion emoji occurrences in Twitter data from 2013 to 2017, and outperformed human subjects in correctly identifying sarcasm in tweets and other online modes of communication.

15.ai used a multi-speaker model—hundreds of voices were trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to that context. Consequently, the characters in the application were powered by a single trained model, as opposed to multiple single-speaker models. The lexicon used by 15.ai was scraped from a variety of Internet sources, including Oxford Dictionaries, Wiktionary, the CMU Pronouncing Dictionary, 4chan, Reddit, and Twitter. Pronunciations of unfamiliar words were automatically deduced using phonological rules learned by the deep learning model.
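The difference between one multi-speaker model and many single-speaker models can be illustrated with a toy sketch. This is not 15.ai's architecture, which is not described here in detail; it assumes a common design in which shared weights are conditioned on a small per-speaker embedding. The class name, `EMBED_DIM`, and the arithmetic are all hypothetical and chosen only for illustration.

```python
import random

EMBED_DIM = 4  # hypothetical embedding size, chosen only for illustration


class ToyMultiSpeakerModel:
    """Toy stand-in for a multi-speaker text-to-speech model.

    One set of shared weights serves every voice; each speaker only adds a
    small embedding vector that conditions the shared computation.
    """

    def __init__(self, speakers, seed=0):
        rng = random.Random(seed)
        # Shared parameters: stored once and reused by all speakers.
        self.shared_weights = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
        # Per-speaker parameters: one short vector per voice.
        self.speaker_embeddings = {
            name: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
            for name in speakers
        }

    def synthesize(self, text, speaker):
        """Return one toy 'acoustic feature' per character of the text."""
        embedding = self.speaker_embeddings[speaker]
        # The speaker embedding shifts the output of the shared weights.
        speaker_offset = sum(w * e for w, e in zip(self.shared_weights, embedding))
        return [ord(ch) / 128.0 + speaker_offset for ch in text]

    def parameter_count(self):
        """Count shared parameters once, plus one embedding per speaker."""
        per_speaker = sum(len(v) for v in self.speaker_embeddings.values())
        return len(self.shared_weights) + per_speaker


model = ToyMultiSpeakerModel(["GLaDOS", "Twilight Sparkle", "HAL 9000"])
glados = model.synthesize("hello", "GLaDOS")
hal = model.synthesize("hello", "HAL 9000")
```

In this sketch, adding a voice adds only one embedding vector, while everything in `shared_weights` is reused. That mirrors the idea that what the model learns for one speaker can carry over to the others.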

The application supported a simplified phonetic transcription known as ARPABET, to correct mispronunciations and account for heteronyms—words that are spelled the same but are pronounced differently (such as the word read, which can be pronounced as either /ˈrɛd/ or /ˈriːd/ depending on its tense). It followed the CMU Pronouncing Dictionary's ARPABET conventions.
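As a minimal sketch of how ARPABET separates the two readings of read, the following lookup uses a small hand-written table in the style of the CMU Pronouncing Dictionary. The table, the sense labels, and the function name are assumptions made for illustration and do not reflect 15.ai's actual input syntax.

```python
# Hand-written excerpt in the style of the CMU Pronouncing Dictionary.
# Digits on vowels mark lexical stress (1 = primary stress).
# The sense labels ("present", "past") are hypothetical keys for this sketch.
HETERONYMS = {
    "read": {
        "present": ["R", "IY1", "D"],  # /ˈriːd/
        "past": ["R", "EH1", "D"],     # /ˈrɛd/
    },
}


def arpabet_for(word, sense):
    """Return the ARPABET transcription for one sense of a heteronym."""
    phonemes = HETERONYMS[word.lower()][sense]
    return " ".join(phonemes)


print(arpabet_for("read", "present"))  # R IY1 D
print(arpabet_for("Read", "past"))     # R EH1 D
```

Spelling out the phonemes removes the ambiguity that the written word alone cannot resolve, which is why a phonetic override is useful for heteronyms.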

Background

Speech synthesis

Main article: Deep learning speech synthesis See also: Audio deepfake

In 2016, with the proposal of DeepMind's WaveNet, deep-learning-based models for speech synthesis began to gain popularity as a method of modeling waveforms and generating high-fidelity human-like speech. Tacotron 2, a neural network architecture for speech synthesis developed by Google AI, was published in 2018 and typically required tens of hours of audio data to produce high-quality speech: when trained on 2 hours of speech, the model produced intelligible speech of mediocre quality, and when trained on 36 minutes of speech, it was unable to produce intelligible speech.

For years, reducing the amount of data required to train a realistic high-quality text-to-speech model has been a primary goal of researchers in deep learning speech synthesis. The developer of 15.ai claimed that as little as 15 seconds of data is sufficient to clone a voice up to human standards, a significant reduction in the amount of data required.

Copyrighted material in deep learning

Main article: Authors Guild, Inc. v. Google, Inc.

In a landmark 2013 case between Google and the Authors Guild, the court ruled that Google Books, a service that searches the full text of printed copyrighted books, was transformative, thus meeting all requirements for fair use. This case set an important legal precedent for the field of deep learning and artificial intelligence: using copyrighted material to train a discriminative model or a non-commercial generative model was deemed legal. The legality of commercial generative models trained using copyrighted material is still under debate; due to the black-box nature of machine learning models, any allegations of copyright infringement via direct competition would be difficult to prove.

Development

15.ai was designed and created by an anonymous research scientist known by the alias 15.

The algorithm used by the project was dubbed DeepThroat. The developer said the project and algorithm were conceived as part of MIT's Undergraduate Research Opportunities Program, and had been in development for years before the first release of the application.

The developer also worked closely with the Pony Preservation Project from /mlp/, the My Little Pony board of 4chan. This project was a "collaborative effort by /mlp/ to build and curate pony datasets" with the aim of creating applications in artificial intelligence. The Friendship Is Magic voices on 15.ai were trained on a large dataset crowdsourced by the project: audio and dialogue from the show and related media—including all nine seasons of Friendship Is Magic, the 2017 movie, spinoffs, leaks, and various other content voiced by the same voice actors—were parsed, hand-transcribed, and processed to remove background noise.

Reception

15.ai met with a largely positive reception. Liana Ruppert of Game Informer described it as "simplistically brilliant" and José Villalobos of LaPS4 wrote that it "works as easy as it looks." Users praised the ability to easily create audio of popular characters that sounded believable to those unaware it had been synthesized. Zack Zwiezen of Kotaku reported that "[his] girlfriend was convinced it was a new voice line from GLaDOS' voice actor, Ellen McLain".

Impact

Fandom content creation

15.ai was frequently used for content creation in various fandoms, including the My Little Pony: Friendship Is Magic fandom, the Team Fortress 2 fandom, the Portal fandom, and the SpongeBob SquarePants fandom, with numerous videos and projects containing speech from 15.ai having gone viral.

The My Little Pony: Friendship Is Magic fandom saw a resurgence in video and musical content creation as a result, inspiring a new genre of fan-created content assisted by artificial intelligence. Some fanfictions were adapted into fully voiced "episodes": The Tax Breaks is a 17-minute long animated video rendition of a fan-written story published in 2014 that uses voices generated from 15.ai with sound effects and audio editing, emulating the episodic style of the early seasons of Friendship Is Magic.

Viral videos from the Team Fortress 2 fandom featuring voices from 15.ai include Spy is a Furry (which gained over 3 million views on YouTube across multiple videos) and The RED Bread Bank, both of which inspired Source Filmmaker animated video renditions. Other fandoms used voices from 15.ai to produce viral videos. As of July 2022, the viral video Among Us Struggles (with voices from Friendship Is Magic) had over 5.5 million views on YouTube; YouTubers, TikTokers, and Twitch streamers also used 15.ai for their videos, such as FitMC's video on the history of 2b2t—one of the oldest running Minecraft servers—and datpon3's TikTok video featuring the main characters of Friendship Is Magic, which have 1.4 million and 510 thousand views, respectively.

Some users created AI virtual assistants using 15.ai and external voice control software. One user on Twitter created a personal desktop assistant inspired by GLaDOS using 15.ai-generated dialogue in tandem with voice control system VoiceAttack.

Troy Baker / Voiceverse NFT plagiarism scandal

Troy Baker @TroyBakerVA

I’m partnering with @VoiceverseNFT to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP’s they create. We all have a story to tell. You can hate. Or you can create. What'll it be?
January 14, 2022

In December 2021, the developer of 15.ai posted on Twitter that they had no interest in incorporating non-fungible tokens (NFTs) into their work.

On January 14, 2022, it was discovered that Voiceverse NFT, a company with which video game and anime dub voice actor Troy Baker had announced a partnership, had plagiarized voice lines generated from 15.ai as part of its marketing campaign. Log files showed that Voiceverse had generated audio of Twilight Sparkle and Rainbow Dash from the show My Little Pony: Friendship Is Magic using 15.ai, pitched them up to make them sound unrecognizable from the original voices, and appropriated them without credit to falsely market its own platform, in violation of 15.ai's terms of service.

15 @fifteenai

I've been informed that the aforementioned NFT vocal synthesis is actively attempting to appropriate my work for their own benefit. After digging through the log files, I have evidence that some of the voices that they are taking credit for were indeed generated from my own site.
January 14, 2022

Voiceverse Origins @VoiceverseNFT

Hey @fifteenai we are extremely sorry about this. The voice was indeed taken from your platform, which our marketing team used without giving proper credit. Chubbiverse team has no knowledge of this. We will make sure this never happens again.
January 14, 2022

15 @fifteenai

Go fuck yourself.
January 14, 2022

A week prior to the announcement of the partnership with Baker, Voiceverse made a (now-deleted) Twitter post directly responding to a (now-deleted) video posted by Chubbiverse, an NFT platform with which Voiceverse had partnered. The video showcased an AI-generated voice, and Voiceverse claimed that it was generated using Voiceverse's platform, remarking "I wonder who created the voice for this? ;)" A few hours after news of the partnership broke, the developer of 15.ai was alerted by another Twitter user asking for his opinion on the partnership, to which he speculated that it "sounds like a scam". He then posted screenshots of log files that proved that a user of the website (with their IP address redacted) had submitted inputs of the exact words spoken by the AI voice in the video posted by Chubbiverse, and subsequently responded to Voiceverse's claim directly, tweeting "Certainly not you :)".

Following the tweet, Voiceverse admitted to passing off voices from 15.ai as the output of their own platform, claiming that their marketing team had used the project without giving proper credit and that the "Chubbiverse team [had] no knowledge of this." In response to the admission, 15 tweeted "Go fuck yourself." The final tweet went viral, accruing over 75,000 total likes and 13,000 total retweets across multiple reposts.

The initial partnership between Baker and Voiceverse was met with severe backlash and universally negative reception. Critics highlighted the environmental impact of and potential for exit scams associated with NFT sales. Commentators also pointed out the irony in Baker's initial tweet announcing the partnership, which ended with "You can hate. Or you can create. What'll it be?", hours before the public revelation that the company in question had resorted to theft instead of creating their own product. Baker responded that he appreciated people sharing their thoughts and their responses were "giving [him] a lot to think about," and asked fans on social media to forgive him. Two weeks later, Baker discontinued his partnership with Voiceverse.

Reactions from voice actors

Some voice actors have publicly decried the use of voice cloning technology. Cited reasons include concerns about impersonation and fraud, unauthorized use of an actor's voice in pornography, and the potential of AI being used to make voice actors obsolete.

Notes

Translated from original quote written in Spanish: "La dirección es 15.AI y funciona tan fácil como parece."

References

Notes

Zwiezen, Zack (January 18, 2021). "Website Lets You Make GLaDOS Say Whatever You Want". Kotaku. Archived from the original on January 17, 2021. Retrieved January 18, 2021.
Ruppert, Liana (January 18, 2021). "Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App". Game Informer. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
Clayton, Natalie (January 19, 2021). "Make the cast of TF2 recite old memes with this AI text-to-speech tool". PC Gamer. Archived from the original on January 19, 2021. Retrieved January 19, 2021.
Morton, Lauren (January 18, 2021). "Put words in game characters' mouths with this fascinating text to speech tool". Rock, Paper, Shotgun. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
Ng, Andrew (April 1, 2020). "Voice Cloning for the Masses". The Batch. Archived from the original on August 7, 2020. Retrieved April 5, 2020.
Lopez, Ule (January 16, 2022). "Troy Baker-backed NFT firm admits using voice lines taken from another service without permission". Wccftech. Archived from the original on January 16, 2022. Retrieved June 7, 2022.
Kurosawa, Yuki (January 19, 2021). "ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる". AUTOMATON. Archived from the original on January 19, 2021. Retrieved January 19, 2021.
Yoshiyuki, Furushima (January 18, 2021). "『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に". Denfaminicogamer. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
Williams, Demi (January 18, 2022). "Voiceverse NFT admits to taking voice lines from non-commercial service". NME. Archived from the original on January 18, 2022. Retrieved January 18, 2022.
Wright, Steve (January 17, 2022). "Troy Baker-backed NFT company admits to using content without permission". Stevivor. Archived from the original on January 17, 2022. Retrieved January 17, 2022.
Henry, Joseph (January 18, 2022). "Troy Baker's Partner NFT Company Voiceverse Reportedly Steals Voice Lines From 15.ai". Tech Times. Archived from the original on January 26, 2022. Retrieved February 14, 2022.
@fifteenai (February 23, 2023). "If all goes well, the next update should be the culmination of a year and a half of nonstop work put into a huge number of fixes and major improvements to the algorithm. Just give me a bit more time – it should be worth it" (Tweet) – via Twitter.
Villalobos, José (January 18, 2021). "Descubre 15.AI, un sitio web en el que podrás hacer que GlaDOS diga lo que quieras". LaPS4. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
Moto, Eugenio (January 20, 2021). "15.ai, el sitio que te permite usar voces de personajes populares para que digan lo que quieras". Yahoo! Finance. Archived from the original on March 8, 2022. Retrieved January 20, 2021.
Felbo, Bjarke (2017). "Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm". Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. pp. 1615–1625. arXiv:1708.00524. doi:10.18653/v1/D17-1169. S2CID 2493033.
Corfield, Gareth (August 7, 2017). "A sarcasm detector bot? That sounds absolutely brilliant. Definitely". The Register. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
"An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter". MIT Technology Review. August 3, 2017. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
"Emojis help software spot emotion and sarcasm". BBC. August 7, 2017. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
Lowe, Josh (August 7, 2017). "Emoji-Filled Mean Tweets Help Scientists Create Sarcasm-Detecting Bot That Could Uncover Hate Speech". Newsweek. Archived from the original on June 2, 2022. Retrieved June 2, 2022.
Valle, Rafael (2020). "Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens". arXiv:1910.11997.
Cooper, Erica (2020). "Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings". arXiv:1910.10838.
van den Oord, Aäron; Li, Yazhe; Babuschkin, Igor (November 12, 2017). "High-fidelity speech synthesis with WaveNet". DeepMind. Archived from the original on June 18, 2022. Retrieved June 5, 2022.
Hsu, Wei-Ning (2018). "Hierarchical Generative Modeling for Controllable Speech Synthesis". arXiv:1810.07217.
Habib, Raza (2019). "Semi-Supervised Generative Modeling for Controllable Speech Synthesis". arXiv:1910.01709.
"Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"". August 30, 2018. Archived from the original on November 11, 2020. Retrieved June 5, 2022.
Shen, Jonathan; Pang, Ruoming; Weiss, Ron J.; Schuster, Mike; Jaitly, Navdeep; Yang, Zongheng; Chen, Zhifeng; Zhang, Yu; Wang, Yuxuan; Skerry-Ryan, RJ; Saurous, Rif A.; Agiomyrgiannakis, Yannis; Wu, Yonghui (2018). "Natural TTS Synthesis by Conditioning WaveNet on Mel-Spectrogram Predictions". arXiv:1712.05884.
Chung, Yu-An (2018). "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis". arXiv:1808.10128.
Ren, Yi (2019). "Almost Unsupervised Text to Speech and Automatic Speech Recognition". arXiv:1905.06791.
Phillips, Tom (January 17, 2022). "Troy Baker-backed NFT firm admits using voice lines taken from another service without permission". Eurogamer. Archived from the original on January 17, 2022. Retrieved January 17, 2022.
- F.2d – (2d Cir, 2015). (temporary cites: 2015 U.S. App. LEXIS 17988; Slip opinion (October 16, 2015))
"15.ai – About". 15.ai. February 20, 2022. Archived from the original on October 6, 2021. Retrieved February 20, 2022.
Branwen, Gwern (March 6, 2020). ""15.ai"⁠, 15, Pony Preservation Project". Gwern.net. Gwern. Archived from the original on March 18, 2022. Retrieved June 17, 2022.
Scotellaro, Shaun (March 14, 2020). "Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices". Equestria Daily. Archived from the original on June 23, 2021. Retrieved June 11, 2022.
"Pony Preservation Project (Thread 108)". 4chan. Desuarchive. February 20, 2022. Retrieved February 20, 2022.
Scotellaro, Shaun (May 15, 2022). "Full Simple Animated Episode – The Tax Breaks (Twilight)". Equestria Daily. Archived from the original on May 21, 2022. Retrieved May 28, 2022.
The Terribly Taxing Tribulations of Twilight Sparkle. April 27, 2014. Archived from the original on June 30, 2022. Retrieved April 28, 2022.
Phillips, Tom (January 14, 2022). "Video game voice actor Troy Baker is now promoting NFTs". Eurogamer. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
McWhertor, Michael (January 14, 2022). "The Last of Us voice actor wants to sell 'voice NFTs,' drawing ire". Polygon. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
"Last Of Us Voice Actor Pisses Everyone Off With NFT Push". Kotaku. January 14, 2022. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
Purslow, Matt (January 14, 2022). "Troy Baker Is Working With NFTs, but Fans Are Unimpressed". IGN. Archived from the original on January 14, 2022. Retrieved January 14, 2022.
Strickland, Derek (January 31, 2022). "Last of Us actor Troy Baker heeds fans, abandons NFT plans". Tweaktown. Archived from the original on January 31, 2022. Retrieved January 31, 2022.
Peters, Jay (January 31, 2022). "The voice of Joel from The Last of Us steps away from NFT project after outcry". The Verge. Archived from the original on February 4, 2022. Retrieved February 4, 2022.

Tweets

@TroyBakerVA (January 14, 2022). "I'm partnering with @VoiceverseNFT to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create. We all have a story to tell. You can hate. Or you can create. What'll it be?" (Tweet) – via Twitter.
@fifteenai (December 12, 2021). "I have no interest in incorporating NFTs into any aspect of my work. Please stop asking" (Tweet) – via Twitter.
@fifteenai (January 14, 2022). "I've been informed that the aforementioned NFT vocal synthesis is actively attempting to appropriate my work for their own benefit. After digging through the log files, I have evidence that some of the voices that they are taking credit for were indeed generated from my own site" (Tweet) – via Twitter.
@VoiceverseNFT (January 14, 2022). "Hey @fifteenai we are extremely sorry about this. The voice was indeed taken from your platform, which our marketing team used without giving proper credit. Chubbiverse team has no knowledge of this. We will make sure this never happens again" (Tweet) – via Twitter.
@fifteenai (January 14, 2022). "Go fuck yourself" (Tweet) – via Twitter.
@VoiceverseNFT (January 7, 2022). "I wonder who created the voice for this? ;)" (Tweet). Archived from the original on January 7, 2022 – via Twitter.
@fifteenai (January 14, 2022). "Sounds like a scam" (Tweet) – via Twitter.
@fifteenai (January 14, 2022). "Give proper credit or remove this post" (Tweet) – via Twitter.
@fifteenai (January 14, 2022). "Certainly not you :)" (Tweet) – via Twitter.
@fifteenai (January 14, 2022). "Go fuck yourself" (Tweet) – via Twitter.
@yongyea (January 14, 2022). "The NFT scheme that Troy Baker is promoting is already finding itself in trouble after stealing and profiting off of somebody else's work. Who could've seen this coming" (Tweet) – via Twitter.
@BronyStruggle (January 15, 2022). "actual" (Tweet) – via Twitter.

YouTube (referenced for view counts and usage of 15.ai only)

"SPY IS A FURRY". YouTube. January 17, 2021. Archived from the original on June 13, 2022. Retrieved June 14, 2022.
"Spy is a Furry Animated". YouTube. Archived from the original on June 14, 2022. Retrieved June 14, 2022.
"[SFM] – Spy's Confession – [TF2 15.ai]". YouTube. January 15, 2021. Archived from the original on June 30, 2022. Retrieved June 14, 2022.
"Among Us Struggles". YouTube. September 21, 2020. Retrieved July 15, 2022.
"The UPDATED 2b2t Timeline (2010–2020)". YouTube. March 14, 2020. Archived from the original on June 1, 2022. Retrieved June 14, 2022.

TikTok

"She said " 👹 "". TikTok. Retrieved July 15, 2022.
