MediaWiki:Titleblacklist: Difference between revisions

Revision as of 19:28, 3 December 2008 by NawlinWiki (edit summary: add 2)
Revision as of 15:46, 7 December 2008 by FT2 (minor edit; edit summary: missing ".")
(One intermediate revision by the same user not shown)
Line 136: one line added:
.*Everett.* # Used for harassment username and page creation - remove end Dec 2008

Revision as of 15:46, 7 December 2008

# This is a title blacklist; every title that matches a regex here is forbidden from being created.
# Options exist to stop editing, account creation, and moves as well.  See mw:Extension:Title Blacklist for documentation
# See the talk page for more information.
# Please comment any additions made to the blacklist.
# Note: Internally, the pattern delimiter is '/', so be sure to escape all '/'s.
# UTF-8 mode is enabled. Do not use literal non-breaking spaces in regexes as some browsers cannot handle them.
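How a single entry is read can be sketched in Python. This is a minimal illustration, not the extension's actual code: the names `parse_line` and `is_blocked` are hypothetical, the whole-title anchoring and default case-insensitivity are assumptions inferred from the `<casesensitive>` option, and only `casesensitive`, `moveonly` and `newaccountonly` are modelled. Python's `re` module also lacks PCRE's `\p{...}` classes, so the sketch is only suitable for plain ASCII patterns.

```python
import re

def parse_line(line):
    """Split one blacklist line into (pattern, options); None for blank or comment lines."""
    body = line.split("#", 1)[0].strip()   # everything after '#' is treated as a comment
    if not body:
        return None
    # An optional trailing <...> block carries the options, separated by '|'.
    m = re.match(r"^(.*?)(?:\s*<([^<>]*)>)?$", body)
    pattern = m.group(1).strip()
    options = {}
    for part in (m.group(2) or "").split("|"):
        key, _, value = part.strip().partition("=")
        if key:
            options[key.strip().lower()] = value.strip() or True
    return pattern, options

def is_blocked(entry, title, action="create"):
    """Assumed semantics: match the whole title, ignoring case unless told otherwise."""
    pattern, options = entry
    if "moveonly" in options and action != "move":
        return False
    if "newaccountonly" in options and action != "new-account":
        return False
    flags = 0 if "casesensitive" in options else re.IGNORECASE
    return re.fullmatch(pattern, title, flags) is not None

entry = parse_line(".*on rails.* <moveonly>")
print(is_blocked(entry, "Example on rails", action="move"))    # a move is checked
print(is_blocked(entry, "Example on rails", action="create"))  # creation is not
```

Keeping the options in a dictionary means that `errmsg=...` values are preserved alongside the flag-style options, even though this sketch does not act on them.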
# OBSCURE ASCII CHARACTER LOOKALIKES
.*.* <casesensitive> # Select Unicode Letterlike Symbols (excluding Kelvin, Angstrom and Ohm signs, see talk)
.*.* <casesensitive> # Circled and parenthesized Latin letters
.*.* <casesensitive | errmsg=titleblacklist-custom-fullwidth> # Fullwidth Latin letters
.*.* <casesensitive | moveonly> # Question mark lookalikes, used for page move vandalism
.*.* <casesensitive> # Phonetic extensions, almost never used in valid titles
.*.* <casesensitive | moveonly> # IPA extensions, somewhat more common, so blocking only moves for now
.*.* <casesensitive | moveonly> # Select mathematical operators (excluding "−", "∞" and some other common ones)
.*.* <casesensitive | moveonly> # Misc./supplemental mathematical symbols
.*.* <casesensitive | moveonly> # Letter lookalikes; none of these are currently used in any mainspace title
# OTHER UNDESIRABLE CHARACTERS
.*.* <casesensitive | errmsg=titleblacklist-custom-nbsp> # Non-breaking and other unusual spaces, with custom error message
.*.* <casesensitive> # BiDi overrides
.*.* <casesensitive> # "Other punctuation", with some exceptions (may need more, this is a huge character class); note that single-character titles are permitted by the title whitelist
.*\p{Cc}.* <casesensitive> # Control characters
.*\x{FEFF}.* <casesensitive> # Byte order mark
.*.* <casesensitive> # Swastikas, hammer-and-sickle
.*\x{00AD}.* <casesensitive> # Soft-hyphen
.*.* <casesensitive> # Very few characters outside the Basic Multilingual Plane are useful in titles
.*.* <casesensitive> # Graphic pictures for control codes
# EXCESSIVE PUNCTUATION OR REPETITION
.*{3}(?<!!!!).*
.*{2}(?<!!!!).* <moveonly>
.*\s+.*
.*‽‽.* <moveonly> 
.*¿¿.* <moveonly>
.*{2}.* # Disallows two adjacent "separator" characters (mostly funky spaces)
.*{5}.* # Disallows five consecutive characters that are not letters (in any script), numbers, or spaces
.*()\1{4}.* <moveonly> # Disallows four or more of the same character from page moves
.*(.)\1{10}.* <newaccountonly> # Disallows eleven or more of the same character repeated in usernames
.*\p{Lu}(\P{L}*\p{Lu}){9}.* <casesensitive | moveonly>  # Disallows moves with more than nine consecutive capital letters
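The two repetition rules that close this section can be tried out in Python. This is a sketch under stated assumptions: the constant and function names are made up for illustration, and because Python's `re` has no `\p{Lu}` or `\P{L}`, the capital-letter rule is approximated with ASCII classes (`[A-Z]`, `[^A-Za-z]`), so it ignores the non-ASCII letters that the original rule covers.

```python
import re

# Copied from the list: one character followed by ten more copies of itself.
REPEATED_CHAR = re.compile(r".*(.)\1{10}.*", re.IGNORECASE)

# ASCII-only approximation of .*\p{Lu}(\P{L}*\p{Lu}){9}.* (case-sensitive):
# ten capital letters with nothing but non-letters between them.
CAPS_RUN = re.compile(r".*[A-Z]([^A-Za-z]*[A-Z]){9}.*")

def has_long_repeat(name):
    """True if some character occurs eleven or more times in a row."""
    return REPEATED_CHAR.fullmatch(name) is not None

def has_caps_run(title):
    """True if ten capitals occur with only non-letters separating them."""
    return CAPS_RUN.fullmatch(title) is not None

print(has_long_repeat("o" * 11), has_long_repeat("o" * 10))
print(has_caps_run("A.B.C.D.E.F.G.H.I.J"), has_caps_run("NATO and UNESCO"))
```

The backreference `\1` is what ties the repetition to the same character; a plain `.{11}` would match any eleven characters.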
# INVERTED QUESTION MARK WITH NON-LATIN TEXT
.*¿.*.*
.*.*¿.*
# BLP TARGETS
.*Seth.*Patinkin.*
.*Jan.*Szatkowski.*
.*(Bill|William).*Beggs.*
# ATTACK TITLES AND/OR PAGE MOVE VANDALISM TARGETS
.*JEWS DID .* <casesensitive>
.*on?whee+ls.* <moveonly> # Disallows moves with "on wheels" with 2 or more Es
.*on wh33ls.*
.*on whiels.*
.*\bwith wh?iels\b.* <moveonly>
.*on rails.* <moveonly>
.*on treads.* <moveonly>
.*BITCH.* <casesensitive>
.*COCK.* <casesensitive>
.*(c|ċ)(c|ċ)k.*
.*CUM.* <casesensitive | moveonly>
.*DICK.* <casesensitive>
.*giiant.*
.*smaller.than.average.* <moveonly>
.*have sex.* <moveonly>
.*(?:suck|his|your|my) penis.* <moveonly>
.*(?:http|https|ftp|mailto|torrent|ed2k)\:\/\/+\.+.*
.*\bis\s+an?\s+(?:dick|cunt|fag|bitch|shit|fuck|loser|ass|gay|ghey|moron|retard|stupid|slut|pa?edo).* <autoconfirmed>
.*\bis\s+an?\s+(?:dick|cunt|fag|bitch|shit|fuck|loser|ass|gay|ghey|moron|retard|stupid|slut|pa?edo).* <moveonly>
.*.*.*
.*\bnimp\.org.*
.*JIHAD, BITCHES.* <casesensitive>
.*Vandalism is Terrorism.*
.*WANT TO HA.* <casesensitive | moveonly>
.*waant to h.* <moveonly>
.*Brian.*Peppers.*
.*suck my.* <moveonly>
.*GE ORGAS.* <casesensitive | moveonly>
.*ge orrg.* <moveonly>
.*RM, STICKY.* <casesensitive>
.*rm sticky.* <moveonly>
.*TAIN OUT OF.* <casesensitive | moveonly>
.*nigger.*nigger.*
.*sk8r.* <moveonly>
.*loves the.* <moveonly>
.*cking fail.*
.*Epic fail.*
.*Ll.* <moveonly>
.*WHUT.* <casesensitive | moveonly>
.*What what.* <moveonly>
.*Grp.* <moveonly>
.*rwp.*
.*GGER.* <casesensitive>
.*HE.* <casesensitive>
.*HR.* <casesensitive>
.*AR.* <casesensitive>
.*RMY.* <casesensitive | moveonly>
.*ERM.* <casesensitive>
.*ERMI.* <casesensitive>
.*RMIE.* <casesensitive>
.*R.M.I.E.* <casesensitive | moveonly>
.*R..M..I..E.* <casesensitive | moveonly>
.*RMEY.* <casesensitive>
.*Rapes babies.*
.*instead f.* <moveonly>
.*rplcng.* <moveonly>
.*h s.* <moveonly>
.*ǃǃ.* <moveonly>
.*Ɩ\P{L}Ɩ.* <moveonly>
.*has.been.moved.* <moveonly>
.*NEGRO.* <casesensitive | moveonly>
.*COON SPIC.* <casesensitive | moveonly>
.*is stretched by.* <newaccountonly>
.*coċk.* <newaccountonly>
.*Brit(ph|f)ag.* #Britfag/phag
.*\b(moral)?fag\b.* <moveonly>
.*EconomicsGuy.* <newaccountonly>
.*\bN(a|o)wlins?(Wiki)?\b.* <moveonly>
.*\bL(o|w+|w)l\b.* <moveonly>
.*\b\W+\W+.* <moveonly>
.*\b\W*\W*.* <moveonly|casesensitive>
.*Wikipedo.*
.*An hero.* <moveonly>
.*whilst.* <moveonly>
.*\.\.\.H.* <moveonly>
.*\.\.\.\.H.* <moveonly>
.*\bfapped.* <moveonly>
.*Krimpet.* <moveonly>
.*,,+.* <moveonly>
.*;;+.* <moveonly>
.*(\pP{2,}\PP){4}.* <moveonly|errmsg=titleblacklist-custom-pagemove> #Antigrawp, works by blocking titles with overused punctuation (eg H..A..G..G..E..R)
.*\b.* <moveonly|errmsg=titleblacklist-custom-pagemove>
.*vvp.* <moveonly>
.*Hewgyr.* <moveonly>
.*Hewgyor.* <moveonly>
.*Everett.* # Used for harassment username and page creation - remove end Dec 2008
# DISALLOW CREATION OF USER OR USER TALK PAGES FOR A SPECIFIC IP RANGE BY NON-AUTOCONFIRMED USERS
User( talk)?:71\.107\.(1(2|\d)|2(\d|5))\.(?\d\d?|2(5|\d)) <autoconfirmed>
User( talk)?:75\.47\.(1(2|\d)|2(\d|5))\.(?\d\d?|2(5|\d)) <autoconfirmed>
# PAGE MOVE TARGETS
(.*\W)?(|\\W\)+(\W|\W.*\W)?((\W|\W.*\W)?)*((\W|\W.*\W)?)+((\W|\W.*\W)?)++(\W.*)?  <moveonly> # HERMY
(.*\W)?+(\W|\W.*\W)?((\W|\W.*\W)?)+((\W|\W.*\W)?)+((\W|\W.*\W)?)*(|\\W\)+(\W.*)? <moveonly> # YMREH
.*((\W|\W.*\W)?(\W|\W.*\W)?)+((\W|\W.*\W)?)+((\W|\W.*\W)?)+.* <moveonly>
.*I\W*B\W*H\W*H\W*F\W*S.* <moveonly>
.*I\W*F\W*S\W*N\W*Z.* <moveonly>
Wikipedia( talk)?:(*(?-i:).*|(.*\W)?+(\W|\W.*\W)?(((\W|\W.*\W)?)+((\W|\W.*\W)?)+((\W|\W.*\W)?)++|((\W|\W.*\W)?)+((\W|\W.*\W)?)+((\W|\W.*\W)?)+Y+)(\W.*)?) <moveonly> # No haggery in project space, please. (Only ASCII/Latin1 characters needed in this regexp.)
(Help|Portal)( talk)?:(.*(?-i:).*|(.*\W)?+(\W|\W.*\W)?(((\W|\W.*\W)?)+((\W|\W.*\W)?)+((\W|\W.*\W)?)++|((\W|\W.*\W)?)+((\W|\W.*\W)?)+((\W|\W.*\W)?)+Y+)(\W.*)?) <moveonly> # ..nor in help or portal spaces either. (Only ASCII/Latin1 characters needed in this regexp.)
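Patterns such as `.*I\W*B\W*H\W*H\W*F\W*S.*` in this section catch a letter sequence even when non-word characters are inserted between the letters. A hedged sketch of generating such a pattern follows; the helpers `spaced_pattern` and `matches_spaced` and the sample word are hypothetical and not part of the blacklist, and case-insensitive whole-title matching is assumed.

```python
import re

def spaced_pattern(word):
    """Build a pattern that tolerates any run of non-word characters between letters."""
    letters = [re.escape(ch) for ch in word]
    return ".*" + r"\W*".join(letters) + ".*"

def matches_spaced(word, title):
    """Check a title against the generated pattern, ignoring case."""
    return re.fullmatch(spaced_pattern(word), title, re.IGNORECASE) is not None

print(spaced_pattern("IBHHFS"))
print(matches_spaced("SPAM", "Buy s.p-a m now"))
```

Because `\W*` also matches the empty string, the same pattern covers the unbroken spelling as well as the spaced-out variants.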
# POTENTIALLY CONFUSING MIXED-SCRIPT TITLES
# Cyrillic/Greek + Latin intentionally skipped due to false positives
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)*.*\p{Cyrillic}.* # Cyrillic + Non-ASCII Latin
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{Cyrillic}*\p{Cyrillic}.*.* # Cyrillic + Non-ASCII Latin
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)*.*\p{Greek}.* # Greek + Non-ASCII Latin
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{Greek}*\p{Greek}.*.* # Greek + Non-ASCII Latin
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{Cyrillic}*\p{Cyrillic}.*\p{Greek}.* # Cyrillic + Greek
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{Greek}*\p{Greek}.*\p{Cyrillic}.* # Cyrillic + Greek
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Armenian}.*.* # Armenian + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Armenian}.* # Armenian + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Bengali}.*.* # Bengali + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Bengali}.* # Bengali + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Cherokee}.*.* # Cherokee + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Cherokee}.* # Cherokee + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Ethiopic}.*.* # Ethiopic + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Ethiopic}.* # Ethiopic + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Georgian}.*.* # Georgian + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Georgian}.* # Georgian + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Gujarati}.*.* # Gujarati + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Gujarati}.* # Gujarati + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Gurmukhi}.*.* # Gurmukhi + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Gurmukhi}.* # Gurmukhi + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Kannada}.*.* # Kannada + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Kannada}.* # Kannada + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Khmer}.*.* # Khmer + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Khmer}.* # Khmer + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Lao}.*.* # Lao + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Lao}.* # Lao + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Malayalam}.*.* # Malayalam + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Malayalam}.* # Malayalam + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Myanmar}.*.* # Myanmar + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Myanmar}.* # Myanmar + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Oriya}.*.* # Oriya + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Oriya}.* # Oriya + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Runic}.*.* # Runic + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Runic}.* # Runic + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Sinhala}.*.* # Sinhala + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Sinhala}.* # Sinhala + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Syriac}.*.* # Syriac + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Syriac}.* # Syriac + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Tamil}.*.* # Tamil + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Tamil}.* # Tamil + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Telugu}.*.* # Telugu + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Telugu}.* # Telugu + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Thaana}.*.* # Thaana + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Thaana}.* # Thaana + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Thai}.*.* # Thai + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Thai}.* # Thai + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*\p{Tibetan}.*.* # Tibetan + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:)\P{L}*.*\p{Tibetan}.* # Tibetan + anything else
(?!(User|Wikipedia|Image|File)( talk)?:|Talk:).*.* # Unused obscure scripts
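The mixed-script idea behind these rules can be approximated outside PCRE. The sketch below is an illustration under heavy assumptions, not how the blacklist itself works: Python's `re` has no `\p{Cyrillic}`-style script properties, so it takes the first word of each letter's Unicode character name (`LATIN`, `CYRILLIC`, `GREEK`, and so on) as a rough stand-in for its script. The function names are hypothetical, and the namespace exemptions in the rules above are not modelled.

```python
import unicodedata

def scripts_in(title):
    """Collect a rough script label for every letter in the title."""
    scripts = set()
    for ch in title:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # e.g. "CYRILLIC SMALL LETTER IE" -> "CYRILLIC"
                scripts.add(name.split()[0])
    return scripts

def is_mixed_script(title):
    """True if letters from more than one (approximate) script appear."""
    return len(scripts_in(title)) > 1

# "Wikip\u0435dia" uses the Cyrillic letter U+0435 in place of the Latin "e".
print(scripts_in("Wikip\u0435dia"))
print(is_mixed_script("Athens 2004"))
```

Digits, spaces and punctuation are skipped, which mirrors the way the rules above use `\P{L}*` to step over non-letters.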
# DISALLOW PAGE MOVES TO MIXED-SCRIPT TITLES
# Intentionally move-only due to false positives
(?!(User|Wikipedia)( talk)?:|Talk:)\P{L}*\p{Latin}.*.* <moveonly> # Latin + non-Latin
(?!(User|Wikipedia)( talk)?:|Talk:)\P{L}*.*\p{Latin}.* <moveonly> # Latin + non-Latin
(?!(User|Wikipedia)( talk)?:|Talk:)\P{L}*\p{Greek}.*.* <moveonly> # Greek + non-Greek
(?!(User|Wikipedia)( talk)?:|Talk:)\P{L}*.*\p{Greek}.* <moveonly> # Greek + non-Greek
(?!(User|Wikipedia)( talk)?:|Talk:)\P{L}*\p{Cyrillic}.*.* <moveonly> # Cyrillic + non-Cyrillic
(?!(User|Wikipedia)( talk)?:|Talk:)\P{L}*.*\p{Cyrillic}.* <moveonly> # Cyrillic + non-Cyrillic
# Slightly different regexp for user/project/talk pages, to allow e.g. Latin subpages of Cyrillic usernames:
((User|Wikipedia)( talk)?:|Talk:)(.*\/)?\P{L}*\p{Latin}*.* <moveonly> # Latin + non-Latin
((User|Wikipedia)( talk)?:|Talk:)(.*\/)?\P{L}**\p{Latin}.* <moveonly> # Latin + non-Latin
((User|Wikipedia)( talk)?:|Talk:)(.*\/)?\P{L}*\p{Greek}*.* <moveonly> # Greek + non-Greek
((User|Wikipedia)( talk)?:|Talk:)(.*\/)?\P{L}**\p{Greek}.* <moveonly> # Greek + non-Greek
((User|Wikipedia)( talk)?:|Talk:)(.*\/)?\P{L}*\p{Cyrillic}*.* <moveonly> # Cyrillic + non-Cyrillic
((User|Wikipedia)( talk)?:|Talk:)(.*\/)?\P{L}**\p{Cyrillic}.* <moveonly> # Cyrillic + non-Cyrillic
.*(\P{L}*){4}.* <casesensitive | moveonly> # Non-Latin all caps
# Block a particular bot
AOL user message bot .* <newaccountonly>
# GENERIC IMAGE FILE NAMES (with custom error message)
# at most three letters of potentially meaningful text:
(Image|File):\P{L}*((Ima?ge?|Pict?(ure)?|Media|Photo)\P{L}+)?(\p{L}\P{L}*){0,3}((orig|copy|thumb|small)\P{L}*)?\.+  <reupload | errmsg=titleblacklist-custom-imagename>
# no more than two contiguous letters (raising to three would be tempting, but needs more testing):
(Image|File):\P{L}*((Ima?ge?|Pict?(ure)?|Media|Photo)\P{L}+)?(\p{L}{1,2}\P{L}+)*((\p{L}{1,2}|orig|copy|thumb|small)\P{L}*)?\.+  <reupload | errmsg=titleblacklist-custom-imagename>
# month name followed by no more than two contiguous letters, JPEG suffix (be careful if you edit this, easy to trigger false positives):
(Image|File):\P{L}*(January|Jan|February|Febr?|March|Mar|April|Apr|May|June?|July?|August|Aug|September|Sept?|October|Oct|November|Nov|December|Dec)(\P{L}+\p{L}{1,2})*\P{L}*\.JPE?G  <reupload | errmsg=titleblacklist-custom-imagename>
# Common digital camera file names, based on list at http://diddly.com/random/about.html
# See also MediaWiki:Filename-prefix-blacklist, used to generate a warning on the upload form
(Image|File):DCP\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Kodak
(Image|File):DSC.\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Design rule for Camera File system (Nikon, Fuji, Polaroid)
(Image|File):MVC-?\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Sony Mavica
(Image|File):P\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Olympus, Kodak
(Image|File):I?MG?\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Canon, Pentax
(Image|File):1\d+-\d+(_IMG)?\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Canon
(Image|File):(IM|EX)\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # HP Photosmart
(Image|File):DC\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Kodak
(Image|File):PIC?\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Minolta
(Image|File):PANA\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Panasonic
(Image|File):DUW\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # some mobile phones
(Image|File):CIMG\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Casio
(Image|File):JD\d+\.JPG  <reupload | errmsg=titleblacklist-custom-imagename>  # Jenoptik
# Other common patterns
(Image|File):\d{9}{6}_{2}\P{L}*\.\w+  <reupload | errmsg=titleblacklist-custom-imagename>  # some image hosting site?
(Image|File):\d{8,}_{10}(_)?\P{L}*\.\w+  <reupload | errmsg=titleblacklist-custom-imagename>  # another image hosting site?
# (Image|File):(\d{9,10})+?\.\w+  <reupload | errmsg=titleblacklist-custom-imagename>  # yet another image hosting site? (redundant to "no more than two contiguous letters")
(Image|File):({8}-)?{4}-{4}-{4}-?{12}.*  <reupload | errmsg=titleblacklist-custom-imagename>  # UUID (with some variations included)
(Image|File):(|\d+)_{10,}(-\d+-|_?(\w\w?|full))?\.+  <reupload | errmsg=titleblacklist-custom-imagename>  # L_9173c67eae58edc35ba7f2df08a7d5c6.jpg, 2421601587_abaf4e3e81.jpg, 1_bf38bcd9c5512a5ab99ca2219a4b1e2f_full.gif, etc.
(Image|File):\P{L}*No\P{L}*name\P{L}*\.+  <reupload | errmsg=titleblacklist-custom-imagename>  # Noname2.jpg
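A few of the camera file-name rules can be checked directly, since they use only syntax that Python's `re` shares with PCRE. This is a sketch under assumptions: the rule subset is copied from the list above, the helper name `is_generic_camera_name` is made up, matching is assumed to cover the whole title and to ignore case (none of these rules is marked `<casesensitive>`), and the `reupload` option is not modelled.

```python
import re

# A subset of the camera rules from the list, copied unchanged.
CAMERA_RULES = [
    r"(Image|File):DCP\d+\.JPG",    # Kodak
    r"(Image|File):DSC.\d+\.JPG",   # Design rule for Camera File system
    r"(Image|File):MVC-?\d+\.JPG",  # Sony Mavica
    r"(Image|File):I?MG?\d+\.JPG",  # Canon, Pentax
    r"(Image|File):CIMG\d+\.JPG",   # Casio
]

def is_generic_camera_name(title):
    """True if the full title matches any of the camera file-name rules."""
    return any(re.fullmatch(rule, title, re.IGNORECASE) for rule in CAMERA_RULES)

print(is_generic_camera_name("File:DSC_0042.JPG"))
print(is_generic_camera_name("File:Sunset over Lake Geneva.jpg"))
```

Anchoring on the whole title matters here: a descriptive name that merely contains `DSC` somewhere in the middle is not caught by these rules.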