Wikipedia:Bots/Requests for approval/Blippy1998Bot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Approved.
Operator: Blippy1998
Time filed: 07:50, Tuesday, August 19, 2025 (UTC)
Function overview: Periodically scrape information about the U.S. Congress from appropriate .gov websites and update relevant Wikipedia pages with it (e.g. templates such as Template:HouseRepublicanTally)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python (probably)
Source code available: not yet written
Links to relevant discussions (where appropriate): While there have been questions about whether the templates to be updated by the bot are appropriate in every potential use case or appropriate in their current form (see here for a short discussion), this task simply automates the updating of the templates, which has been done manually for approximately a year without complaints from the community, and there exist no discussions regarding that.
Edit period(s): Approximately daily
Estimated number of pages affected: in the future, probably fewer than a dozen templates (currently 3), which together would probably appear on fewer than 100 pages (currently 9)
Namespace(s): just Templates for now
Exclusion compliant (Yes/No): Yes
Function details: For now, scrape the official U.S. House of Representatives summary webpage of members' party affiliation daily and fill Template:HouseDemocraticTally and Template:HouseRepublicanTally with the updated numbers. This would also affect Template:HouseVacantTally. I may also add a Template:HouseIndependentTally page for completeness and have it do the same thing.
Discussion
Bot updating a template transcluded on fewer than 10 articles does not look great, but I have no problem with it. I have checked the website's robots.txt, and there are no restrictions on scraping the mentioned webpage, but I doubt that there will be any changes in the number of house members in the near future, so a trial would not be effective. I can approve it for these four mentioned pages if you can share the source code for this task so that I can be confident that it will do exactly what is described in the task description. I also suggest using a proper user agent with valid contact details when making requests to the site. – DreamRimmer ■ 16:44, 27 August 2025 (UTC)
- I'm replying to let you know I saw your reply, but I still don't have code written, so you don't need to bother with this yet. Hopefully I'll have it done well before September 9.
- In the meantime, you can imagine a script making a call to the website, feeding it into some typical parser (e.g. BeautifulSoup), looking for the right tags, doing a sanity check, and editing each page if and only if the number has changed. Then, that just gets called regularly (~daily I guess) by a cron job.
- There are a few special elections coming up this fall, so each one should serve as a test of whether it's working. That's mainly what it's for, anyway: to account for minor changes in membership throughout the Congress so that less-frequently-updated sections of Wikipedia that have these tallies stay up to date. To be clear, while this would affect Template:HouseVacantTally, it would not edit it directly, as that template uses the expr module to calculate its contents. Also, Template:HouseIndependentTally does not yet exist and I don't plan to create it in the near future, so I think my bot would only actually need to edit the first two for now. If you want to approve it for more, that's fine with me, but I just wanted to make sure you have the right picture. Blippy1998 (talk) 19:01, 1 September 2025 (UTC)
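Editorial sketch: the workflow outlined in the comment above (parse the page, sanity-check the numbers, edit only when a number has changed) might look roughly like the following. This is not the operator's code. It uses Python's standard-library `html.parser` in place of BeautifulSoup so that it runs without dependencies, and it leaves out the network fetch and the wiki edit. The party labels, the label-then-number page layout, the assumption that each template holds a bare number, and the helper names `parse_tallies`, `is_sane`, and `plan_edits` are all hypothetical.

```python
import re
from html.parser import HTMLParser

HOUSE_SEATS = 435  # voting seats in the U.S. House of Representatives

# Template titles come from the request; the party labels are assumed page text.
TEMPLATES = {
    "Republicans": "Template:HouseRepublicanTally",
    "Democrats": "Template:HouseDemocraticTally",
}


class TextExtractor(HTMLParser):
    """Collect text nodes (stdlib stand-in for BeautifulSoup's get_text())."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_tallies(html):
    """Read one count per party, assuming each label is followed by its number."""
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(" ".join(extractor.chunks).split())
    tallies = {}
    for party in TEMPLATES:
        match = re.search(re.escape(party) + r"\s*:?\s*(\d+)", text)
        if match is None:
            raise ValueError(f"no count found for {party}")
        tallies[party] = int(match.group(1))
    return tallies


def is_sane(tallies):
    """Reject counts that cannot describe a real House."""
    counts = list(tallies.values())
    return all(0 < n <= HOUSE_SEATS for n in counts) and sum(counts) <= HOUSE_SEATS


def plan_edits(tallies, current_wikitext):
    """Return {title: new_text} only for templates whose number changed."""
    if not is_sane(tallies):
        return {}
    edits = {}
    for party, title in TEMPLATES.items():
        new_text = str(tallies[party])
        # Assumes the template's wikitext is just the bare number.
        if current_wikitext.get(title, "").strip() != new_text:
            edits[title] = new_text
    return edits


# Made-up sample markup and numbers, for illustration only.
sample = "<table><tr><td>Republicans</td><td>219</td></tr><tr><td>Democrats</td><td>213</td></tr></table>"
current = {"Template:HouseRepublicanTally": "219", "Template:HouseDemocraticTally": "212"}
print(plan_edits(parse_tallies(sample), current))
```

Returning an empty plan when the sanity check fails means a changed or broken source page results in no edit at all, rather than a wrong number being written to the templates.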
- https://pastebin.com/4nWvUBPY
- It's done! Blippy1998 (talk) 17:25, 2 September 2025 (UTC)
- Oops! Please ignore that one. This one should actually work Blippy1998 (talk) 18:50, 2 September 2025 (UTC)
- Per what DreamRimmer said above, this line:
page = requests.get(URL, headers={'User-Agent': 'Mozilla/5.0'})
- Should have a much more descriptive User-Agent string, including your username and the fact the scraping is being done for Wikipedia. That way, if the admins have any issues they can easily contact you :) MrAureliusRYell at me! 02:55, 9 September 2025 (UTC)
- Thank you! Here is an updated version. Blippy1998 (talk) 15:10, 9 September 2025 (UTC)
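Editorial sketch: a descriptive User-Agent of the kind requested above could be built as shown below. This is an illustration, not the operator's updated code. The version number, the user-page URL, and the helper name `build_request` are assumptions, and `urllib.request` from the standard library is used here instead of `requests`.

```python
import urllib.request

# Hypothetical identification string: bot name, a contact page, and the purpose.
USER_AGENT = (
    "Blippy1998Bot/0.1 "
    "(https://en.wikipedia.org/wiki/User:Blippy1998Bot; "
    "operated by User:Blippy1998 for English Wikipedia)"
)


def build_request(url):
    """Attach the descriptive User-Agent to an outgoing request object."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


# With the requests library, the equivalent call would be:
#   requests.get(url, headers={"User-Agent": USER_AGENT})
```

Naming the bot, its operator, and the project in the header gives the site's administrators a way to reach the operator instead of blocking an anonymous browser-like client.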
Speedily Approved. The code looks fine. Approval is limited to Template:HouseDemocraticTally and Template:HouseRepublicanTally, and it should not be run more than once per day. Any extension will require filing a new request. – DreamRimmer ■ 16:39, 12 September 2025 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.