Wikipedia:Bots/Requests for approval/BG19bot 10

Source: Wikipedia, the free encyclopedia.

Operator: Bgwhite

Time filed: 06:11, Tuesday, January 31, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): WPCleaner is coded in Java

Source code available: WPCleaner

Function overview: Fix </br> tags and other invalid tags.

Links to relevant discussions (where appropriate): mw:Parsing/Replacing Tidy, Wikipedia talk:WikiProject Check Wikipedia#Line break tags

Edit period(s): Should be one time.

Estimated number of pages affected: 75,000-100,000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Tidy will shortly be replaced by a new parser called Html5Depurate. The new parser will no longer "correct" some bad tags such as </br>, <small />, <b />, <div />, etc. These will have to be fixed before the new parser is turned on. CheckWiki has been catching these for several months, and they are being fixed both with WPCleaner and manually. Template space has recently been cleared manually by Jonesey95; the other namespaces still need to be cleared. Wikipedia, File, MediaWiki, Help, Category, Portal and their associated talk pages have ~66,000 pages. User space has ~9,000. The bot will also fix other bad br tags such as <br /d>, <a br /> and <br / >. This is CheckWiki error 2, and CheckWiki will be used to find these cases. After the bot runs, anything left will have to be fixed manually, and Jonesey95 gets to have that fun. WhatamIdoing (WMF) has been the one prodding us.
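To make the br cases above concrete, here is a minimal sketch of how the four malformed variants listed (</br>, <br /d>, <a br /> and <br / >) could be normalized with a regular expression. This is not WPCleaner's actual code. The class name `BrTagFixer`, the method `fix`, and the choice of `<br />` as the replacement are all assumptions made for illustration. The sketch also ignores contexts a real bot would likely have to skip, such as nowiki sections and comments, and it does not handle the self-closed tags like <small /> or <div />.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not taken from WPCleaner.
public class BrTagFixer {

    // One alternative per malformed variant named in the request.
    private static final Pattern BAD_BR = Pattern.compile(
            "</\\s*br\\s*>"                // </br>    (closing tag on a void element)
            + "|<br\\s*/\\s*[a-z]\\s*>"    // <br /d>  (stray letter after the slash)
            + "|<[a-z]\\s+br\\s*/\\s*>"    // <a br /> (stray letter before "br")
            + "|<br\\s*/\\s+>",            // <br / >  (whitespace after the slash)
            Pattern.CASE_INSENSITIVE);

    // Replaces every malformed variant with "<br />" (assumed target form).
    // Well-formed <br>, <br/> and <br /> do not match and are left untouched.
    public static String fix(String wikitext) {
        return BAD_BR.matcher(wikitext).replaceAll("<br />");
    }

    public static void main(String[] args) {
        String sample = "one</br>two<br /d>three<a br />four<br / >five<br />six";
        System.out.println(fix(sample));
    }
}
```

Listing the bad variants explicitly, rather than matching anything that contains "br", keeps the sketch from rewriting tags that are already valid.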

Discussion