Wikipedia talk:Reliable sources/Perennial sources/Index
(Redirected from Wikipedia talk:RSPINDEX)
Parts
There are starting to be a lot of pieces of the puzzle to keep track of to support this demo, so I thought I would list them here.
List of pages created for the demo
- Main
- Landing Pages
- Landing page redirects
- Templates
- Categories
- Requests and Bugs
- Other
Thanks, Mathglot (talk) 08:43, 23 September 2025 (UTC) updated 23:26, 26 October 2025 (UTC) by Mathglot (talk)
Implementation notes
- Page locations – presumably the Index page would live where the existing table is at WP:Reliable sources/Perennial sources, and the legacy content would be moved somewhere as historical, perhaps to WP:Reliable sources/Perennial sources/Table (although there is more on the page than just a table). The landing pages are currently under WP:Reliable sources/Perennial sources/all (so they are not directly under its parent), or we could do something like: WP:Reliable sources/Perennial/sources.
- Tools and templates – for conversion and for rendering landing pages (some are listed in section § Parts above)
- RSPLAST automation – WT:RSP section "#Semi-automating template RSPLAST" was archived and is now in Archive 12.
- Index page format – in considering the List of subpages approach in the RfC, almost all the attention has been on the landing pages, and little or none on the format of the Index page. First, as noted at WP:RSPDEMO#Q3, multiple index pages are possible. Second, there could be other ways to style the index page; one alternative is at WP:Reliable sources/Perennial sources/Index/alt.
- Index to subpage mapping – the basic question is: do links map one-to-one with landing pages, many-to-one, or something else? I believe the assumption is many-to-one, as is the case in the mockup at WP:RSPINDEX (such as with BuzzFeed and Dotdash Meredith), but this is a basic question that should be discussed and decided.
- Landing page format – We should encourage users to add mockups of landing pages of their own design to the demo, and then keep a list of unique styles somewhere for easy reference. Ideally, the unique styles should be named for easy reference in discussion.
- Landing page to source mapping – a subset of that is the question of whether one landing page deals strictly with one source or may contain several related sources. For example: should we have one landing page covering both BuzzFeed and BuzzFeed News, or two? Likewise with Dotdash Meredith. See Note (b) about one landing page and Note (c) about one source at the Demo FAQ.
- 'See also' naming and usage – maybe call it "Related sources" so WP:NOTSEEALSO is not a temptation? See here.
- Nutshell – pros and cons
- Infobox – yes or no, and what style if yes (at least two are modeled in WP:RSPINDEX landing pages)
- Talk pages – a Talk page for each landing page, or one centralized page for all? (centralized, per this comment)
- Conversion strategy – there are various approaches to conversion of the existing WP:RSP table to landing page style:
- PCRE regex – a regex was used to create most of the content in the demo landing pages linked by WP:RSPINDEX. Workable in small numbers (dozens of pages, but not hundreds) by someone knowledgeable about PCRE, but fairly fragile and not accessible to most editors.
- Edit template/preload conversion – a semi-automated conversion strategy was mentioned involving the {{Edit}} template with a preload page. Further automation of this approach was discussed involving a dynamic preload page (instead of a fixed one) using a module with some powerful replacement features that I will have to locate again. (Note: here.) updated by Mathglot (talk) 02:47, 7 November 2025 (UTC)
- Bot conversion – The trivially easy part is splitting the table into rows and writing each row to its own page named after the easily parsed id attribute (or some other attribute) in the row. The hard part is parsing the row content and formatting a particular landing page style based on it. This might involve the use of a JSON-generating parser like this Python demo, regexes, a combination, or something else.
Mathglot (talk) 22:08, 29 October 2025 (UTC)
- Bot conversion If you assume that the Python parser is complete, or can be extended to be complete, then additional Python code could be written that takes the data for each row, puts it in a "template" (I mean this in the generic programming sense, not a wiki template), and uploads the new page to an appropriate place on Wikipedia.
- For example, if we had an agreed upon format for the subpage for CNN, we could create a candidate page under User:RSPMigrationBot/sources/CNN. Then editors could review that candidate page and, when satisfied that it met all of the requirements, move it into the proper WP:Reliable_sources/<your url scheme here>/CNN page.
- We could also have a "workpage" that lists the candidate pages for every row in the current RSPS page, with status markers for "draft, ready for review"/"reviewed, ready for move"/"moved", to keep track of what is left to be done.
- This wouldn't even have to be a proper "bot" in the sense of a program that performs mass edits on Wikipedia. It would not need tokens, or to go through a review process. Since it's a one (or a few...) time migration, I could use my own editing credentials and the REST API and create the candidate pages under my own userspace or wherever I have editing rights (and yes, I'm volunteering to do this!). audiodude (talk) 04:48, 11 November 2025 (UTC)
- I like the general principle, but the page move part would mean 500 page moves. Could we not simply develop the pages in the right directory from the outset, and just leave them there? They would all remain incognito, so to speak, until the Index page pointing to them was released. Alternatively, if we wanted to enable gradual release as candidates became ready, that could be accomplished entirely on the Index page: a landing page/candidate page that is ready for prime time would have its link on the Index page updated to point to it; before that, it would point to the row in the legacy table. Mathglot (talk) 06:05, 11 November 2025 (UTC)
- Yes we could develop the pages "in place". I like the approach of building the final index piece by piece as the subpages are reviewed/approved. audiodude (talk) 04:22, 15 November 2025 (UTC)
- Or even, in situ, but that's only because I'm mad I didn't get to take Latin 3 and 4. Mathglot (talk) 04:45, 15 November 2025 (UTC)
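The template-filling step described in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: the row keys (`name`, `status`, `summary`) and the page layout are hypothetical stand-ins for whatever the parser actually emits and whatever landing page format is agreed on, and nothing here uploads anything. The candidate path is the one from the CNN example above.

```python
from string import Template

# Hypothetical landing-page layout; the real format is still under discussion.
PAGE_TEMPLATE = Template(
    "== $name ==\n"
    "Status: $status\n\n"
    "$summary\n"
)

# Candidate location taken from the CNN example in the thread.
CANDIDATE_PREFIX = "User:RSPMigrationBot/sources/"


def render_candidate(row: dict) -> tuple[str, str]:
    """Return (page title, page text) for one parsed table row."""
    title = CANDIDATE_PREFIX + row["name"]
    text = PAGE_TEMPLATE.substitute(
        name=row["name"], status=row["status"], summary=row["summary"]
    )
    return title, text


def render_workpage(rows: list[dict]) -> str:
    """List every candidate page with its initial tracking status."""
    return "\n".join(
        f"* [[{CANDIDATE_PREFIX}{row['name']}]]: draft, ready for review"
        for row in rows
    )


if __name__ == "__main__":
    # Placeholder values only, not real RSP content.
    rows = [{"name": "CNN", "status": "placeholder status", "summary": "Placeholder summary."}]
    print(render_candidate(rows[0])[1])
    print(render_workpage(rows))
```

Keeping rendering separate from uploading means the same rows can be re-rendered in a different format, or to a different path, without touching the parser.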
- Landing Page Format: Assuming we go with Bot Conversion, as I described above, it would be exceedingly easy to experiment with different landing page/subpage/reliable source page formats across the entire set of sources, once the initial "temporary" bot has been implemented. We could have User:RSPMigrationBot/format1/source/CNN and User:RSPMigrationBot/format1/source/BBC, and then when we come up with a new format, it's User:RSPMigrationBot/format2/CNN and User:RSPMigrationBot/format2/source/BBC. The same is true of "index" pages, aka the main table page or the page that gets crowned as the new Wikipedia:Reliable_sources/Perennial_sources. In fact, @User:Mathglot mentioned that there are already two possible such formats. I could write the bot tonight and generate all the pages in both formats very easily. I'm almost ready to just do it as a proof of concept/demo. And now that I think of it, with regard to the discussion above about "moving" pages versus generating them in their final location, we could just change the path in the bot's source code so that if formatFoo wins out, the bot writes that format to Wikipedia:Reliable_sources/Perennial_sources/all/BBC (or whatever the agreed upon path is). Ping @User:WhatamIdoing because I heard you worked on a format? audiodude (talk) 05:34, 15 November 2025 (UTC)
- WaId's format can be seen at Deutsche Welle. On the one hand, I love your "why wait, let's get it done" approach and on the other, I'm guessing that it will take a while for the RSP contributors to really get wind that this is really about to happen (though they have been informed). As long as, as you say, it can just be re-run at any time without much extra work on your part, then by all means go for it! I half suspect that seeing the whole thing played out in two formats might induce more people to comment; I just hope they don't think it is a fait accompli or an either-or, and that they are still welcome to offer suggestions including completely different formats of their own design. Afaic, I see no bar to running the bot to build the pages in the two formats (and I like the anonymous path segments in there that you chose). Tomorrow I'll look into building out the index. Mathglot (talk) 06:01, 15 November 2025 (UTC)
- Has anyone figured out how to re-point the shortcuts like WP:RSYOUTUBE to the new subpages yet? WhatamIdoing (talk) 08:55, 15 November 2025 (UTC)
- I think when you click the link, you click at the top where it says "Redirected from WP:RSYOUTUBE". That takes you to the redirect page and you edit source and edit the #REDIRECT audiodude (talk) 15:12, 15 November 2025 (UTC)
- But that could be automated as well, because the parser captures the shortcut audiodude (talk) 15:13, 15 November 2025 (UTC)
- Before thinking about 'how', we should think about 'when', and 'what' (which set), and under what release/cutover scheme. This needs to be thought out ahead of time, and as a function of release method chosen, i.e. whether we are cutting over to the new system using an all-at-once release style, or a gradual style as mentioned above. We have two complete sets of links to take care of in a cutover: the already existing shortcut links, which point into the table, and the Index page currently under construction, which has one link to each source page.
- Under the all-at-once cutover style (which I think of as the "big bang" method), one day we have RSP page=table, shortcut links all point to rows, Index page links point to source pages, and the next day we have RSP page=Index page (table page moved away), shortcut links all point to source pages (and so do the unchanged Index page links). Before cutover, viewers see just the table; after cutover, they see just the Index page and RSP source pages. Not clear how page review works under this scheme, but if we review, then we cut over after all 500 pages have been reviewed.
- The gradual cutover style has various flavors. Under one hybrid flavor, we create all the index links A–Z pointing to table rows, and go live with it; meanwhile, bot-built source pages are under review, and when ready have their two links (one shortcut, one index page) updated to point to the source page. As "reviewing" a page is a human activity, it would take place at human scale, and the two links would be updated by hand. During the gradual cutover, RSP users using Index page links would sometimes be directed to a source page if it is ready, and sometimes to a table row if it isn't. To mitigate 'surprise' and provide 100% transparency, I support the use of icons paired with Index links so the user knows exactly where the link takes them before clicking. (This was previously described, and working on the Index page demo for a while, but I will link it or summarize below later.) One undecided issue under this scheme is what change, if any, there should be to a table row when a source page is "ready" and the two links now point to it. What should the table row look like during the interregnum: should it be blanked, left intact but in faded font (my preference), or replaced with a link to the source page?
- One implementation note of this flavor is that changes to the index page (altering links and icons and where they point) are a lot simpler than one might imagine under a human review + manual update system, and would amount to changing one (or maybe two) class name(s), with all the magic happening in the styles.css page using a combination of the ::after pseudo-element and display:none to create icons pointing into the legacy table before review and blanking them after review. Once all 500 were reviewed/live under this system, the styles.css page could be simplified by removing the no-longer needed classes and styles, but that would not cause any change to anything visible to the user. So the "reviewing" process would be a user editing the index page, and changing one link from class="pending" to class="reviewed" with everything else happening under the hood. (This is not the only flavor of gradual release.)
- Pros & cons:
- big-bang/no-review – easiest and fastest. Goes live as soon as the bot run creates the source pages, the Index links point to them, and shortcuts are updated. No guarantee that source pages faithfully represent the original table row; might have missing content. Could be mitigated by including a copy of the original table row on the source page, and letting users review organically post-cutover (or not); instructions would recommend removing the table row copy as a proxy for "marked reviewed", but that likely wouldn't happen much. New sources: users instructed to add them directly in RSP source page format (table locked or marked legacy).
- big-bang with review – slowest; similar to previous, but doesn't go live until all rows are reviewed. Would likely delay cutover for quite some time; like the previous, users would see either the table, or the Index+source pages, never both. New sources: add as table row to the overflow table (Wikipedia:Reliable sources/Perennial sources/X); these would get converted before release.
- gradual release – easy and fast to start out with, but would not cover everything, and would not be complete for some time. Users sometimes directed to table rows via the Index, sometimes to source pages (or both via icons), until all were done. New sources: add directly as an RSP source page, update Index with new link.
- That's all I have for now. Mathglot (talk) 19:38, 16 November 2025 (UTC)
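The gradual-release flavor described above, where a shortcut and an Index link both follow a source's review status, might be modeled along these lines. This is a rough sketch, not an agreed design: the `Source` fields, the `row_id` anchor, and the `/all/` path are assumptions for illustration, while the `pending`/`reviewed` class names and the `#REDIRECT` line follow the comments in this thread.

```python
from dataclasses import dataclass

# Current table page, and a candidate root for source pages (path not final).
LEGACY_TABLE = "Wikipedia:Reliable sources/Perennial sources"
SOURCE_ROOT = "Wikipedia:Reliable sources/Perennial sources/all/"


@dataclass
class Source:
    name: str       # e.g. "YouTube"
    row_id: str     # anchor of the row in the legacy table (hypothetical field)
    shortcut: str   # e.g. "WP:RSYOUTUBE"
    reviewed: bool  # True once a human has approved the source page


def link_target(src: Source) -> str:
    """Where both the Index link and the shortcut should point right now."""
    if src.reviewed:
        return SOURCE_ROOT + src.name
    return f"{LEGACY_TABLE}#{src.row_id}"


def index_link(src: Source) -> str:
    """Index entry; the class name is what styles.css would key the icon on."""
    css_class = "reviewed" if src.reviewed else "pending"
    return f'<span class="{css_class}">[[{link_target(src)}|{src.name}]]</span>'


def redirect_page(src: Source) -> tuple[str, str]:
    """Return (shortcut page title, redirect wikitext) for the shortcut."""
    return src.shortcut, f"#REDIRECT [[{link_target(src)}]]"
```

Under this model, marking a page as reviewed is a single boolean flip, and both links are derived from it, so they cannot drift apart. The same functions also cover the big-bang style by treating every source as reviewed on cutover day.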
- I finished modelling the two formats (format1 and format2) and did a few tests, then kicked off the process of generating all of the pages. I almost immediately ran into a spam/blacklist issue. Apparently many of the sources are blacklisted through technical means, and I can't create pages for them, because they contain links to themselves. Not sure if this restriction is lifted in the Wikipedia: namespace (I was creating my pages in a scratch space in User:). audiodude (talk) 15:18, 17 November 2025 (UTC)
- You should not link them. Only the plain text without "https://" would be fine, from my experience on Meta-Wiki (I haven't tested on enwiki though). SuperGrey (talk) 16:21, 17 November 2025 (UTC)
- Thanks, that fixed it! I decided to try to put the link in, and fall back to the bare domain if I get an error. audiodude (talk) 20:36, 17 November 2025 (UTC)
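The "try the link, fall back to the bare domain" approach could look something like the sketch below. It is not the bot's actual code: `BlacklistError`, the `%URL%` placeholder, and the injected `save` callable are all hypothetical, standing in for however the real client reports a rejected edit.

```python
from typing import Callable
from urllib.parse import urlsplit


class BlacklistError(Exception):
    """Hypothetical error raised by the save function when a link is rejected."""


def bare_domain(url: str) -> str:
    """Strip the scheme and path so only the plain-text domain remains."""
    return urlsplit(url).netloc or url


def save_with_fallback(title: str, body: str, url: str,
                       save: Callable[[str, str], None]) -> str:
    """Try the page with the full URL first; on rejection retry with the bare domain.

    `body` contains the hypothetical placeholder %URL% where the link goes.
    Returns the text that was actually saved.
    """
    linked = body.replace("%URL%", url)
    try:
        save(title, linked)
        return linked
    except BlacklistError:
        plain = body.replace("%URL%", bare_domain(url))
        save(title, plain)
        return plain
```

Passing the save function in as a parameter keeps the fallback logic testable without touching the wiki, and a second rejection on the plain-text attempt still surfaces as an error instead of being silently swallowed.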