Wikipedia:Bots/Requests for approval/AppstudiosBot
Operator: Appstudiobot (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 22:55, Wednesday, November 5, 2025 (UTC)
Function overview: A read-only, one-time batch job to fetch the lead summaries of approximately 1.2 million articles. The summaries will populate an external database, which will reduce future API load on Wikipedia.
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: Yes
Links to relevant discussions (where appropriate): (This is a technical request for apihighlimits for a read-only task; no prior consensus discussion exists.)
Edit period(s): One-time run
Estimated number of pages affected: 0 (zero). This bot performs no edits.
Namespace(s): None. This bot is read-only.
Exclusion compliant (Yes/No): Yes (Bot is read-only and makes no edits).
Function details: This is a one-time batch job to fetch the lead paragraph (summary) of approximately 1.2 million articles in the "Coordinates on Wikidata" category. The data will populate a database for an external application, so that the application no longer needs to place a continuous load on Wikipedia's API.
API: MediaWiki Action API (action=query, prop=extracts)
Mode: Read-only. The bot will make no edits.
Frequency: This is a one-time run. It is not a continuous or recurring task.
Speed: The script will run at a polite rate (e.g., one request per second) and will respect the maxlag=5 parameter; a sketch of the fetch loop is shown below.
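A minimal sketch of the fetch loop, assuming the plain Python requests library rather than a bot framework. The endpoint, parameters (action=query, prop=extracts, maxlag=5), polite pacing, and 500-title batch size are taken from this request; the helper name fetch_batch, the User-Agent string, and the error-handling details are illustrative assumptions, not the final script.

```python
import time
import requests

API_URL = "https://en.wikipedia.org/w/api.php"
HEADERS = {"User-Agent": "AppstudiosBot/1.0 (snippet fetch; contact: operator talk page)"}
BATCH_SIZE = 500  # 50 without apihighlimits


def fetch_batch(titles):
    """Fetch intro extracts for one batch of titles, honouring maxlag and continuation."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "extracts",
        "exintro": 1,        # lead section only
        "explaintext": 1,    # plain text, no HTML
        "exlimit": "max",
        "titles": "|".join(titles),
        "maxlag": 5,         # back off when replication lag is high
    }
    extracts = {}
    while True:
        resp = requests.get(API_URL, params=params, headers=HEADERS, timeout=60)
        data = resp.json()
        if "error" in data and data["error"].get("code") == "maxlag":
            time.sleep(5)    # servers are lagged; wait and retry
            continue
        for page in data.get("query", {}).get("pages", []):
            if "extract" in page:
                extracts[page["title"]] = page["extract"]
        if "continue" not in data:
            return extracts
        params.update(data["continue"])  # extracts arrive in sub-batches
        time.sleep(1)        # polite rate: ~1 request per second


# Example: fetch one small batch. The full run would iterate over all
# ~1.2 million titles in chunks of BATCH_SIZE, pausing ~1 s between requests.
if __name__ == "__main__":
    sample = ["Eiffel Tower", "Mount Everest", "Lake Baikal"]
    for title, extract in fetch_batch(sample).items():
        print(title, "->", extract[:80])
```

Note that prop=extracts may return extracts for only a subset of the requested titles in a single response, so the loop follows the API's continuation tokens until the whole batch is covered.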
Rationale: At the standard API limit of 50 titles per request, this one-time task would require ~24,500 requests and take an estimated ~3 days to complete. With the apihighlimits right (a 500-title batch size), the bot can complete the same task in ~2,500 requests and finish in under 24 hours. This is a 10x reduction in request count and places a significantly lower load on the API servers for this large, one-off read.
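For reference, a back-of-envelope check of the request counts above, assuming the ~1.2 million article figure from the function overview (the rationale's figures are rounded from a slightly larger article count):

```python
# Request-count arithmetic for the one-time fetch (pure arithmetic, no API calls).
ARTICLES = 1_200_000        # approximate size of the target category

DEFAULT_BATCH = 50          # titles per query without apihighlimits
HIGHLIMIT_BATCH = 500       # titles per query with apihighlimits

default_requests = -(-ARTICLES // DEFAULT_BATCH)        # ceiling division -> 24,000
highlimit_requests = -(-ARTICLES // HIGHLIMIT_BATCH)    # ceiling division -> 2,400

print(f"Without apihighlimits: ~{default_requests:,} requests")
print(f"With apihighlimits:    ~{highlimit_requests:,} requests "
      f"({default_requests // highlimit_requests}x fewer)")
```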